fasta | University of Tübingen

fasta -- this module contains classes for reading and writing FASTA files, and a record class for sequences.

Module Functions (do not require an instance)

to_SeqRecord(_fRec, _alphabet=Bio.Alphabet.IUPAC.ambiguous_dna)

Converts a FastaRecord instance into a SeqRecord instance. The _fRec object is not destroyed by this function. Since this is a module-level function, you do not need a class instance to use it.

convert_dict_to_fasta(_dict={})

This will convert a dictionary into FASTA records.

convert_fasta_to_dict(_fastaList=[])

This will convert a list of FastaRecords into a dictionary and return it.
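As a rough illustration of the two conversions above, the sketch below uses a hypothetical stand-in record (a namedtuple called Rec with shortname, longname and seq fields) instead of the module's FastaRecord. The helper names dict_to_records and records_to_dict are also assumptions made for this example and are not part of the module.

```python
from collections import namedtuple

# Hypothetical stand-in for FastaRecord; the field names are assumptions.
Rec = namedtuple("Rec", ["shortname", "longname", "seq"])

def dict_to_records(seq_by_name):
    """Build one record per dictionary entry (name -> sequence)."""
    # Only one name is available per entry, so it is used for both name fields.
    return [Rec(name, name, seq) for name, seq in seq_by_name.items()]

def records_to_dict(records):
    """Map each record's short name to its sequence."""
    return {rec.shortname: rec.seq for rec in records}

records = dict_to_records({"seq1": "ACGT", "seq2": "GGCC"})
round_trip = records_to_dict(records)
```

In this sketch a round trip through both helpers gives back the original dictionary, provided the short names are unique.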

write_to_single_file(_fRecList, _filehandle)

This will write a list of FastaRecord objects to a SINGLE text file. The file handle must already have been opened by the user.

write_to_files(_fRecList, _filehandle)

This will write a list of FastaRecord objects to a text file. The FASTA header is the longname. The file handle must already have been opened by the user.
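The caller-opened file handle pattern described for the two writer functions could look roughly like the sketch below. It is not the module's implementation: Rec is a hypothetical stand-in for FastaRecord, write_records is an invented helper name, and wrapping the sequence at 60 characters per line is an assumption.

```python
import io
from collections import namedtuple

# Hypothetical stand-in for FastaRecord; the field names are assumptions.
Rec = namedtuple("Rec", ["longname", "seq"])

def write_records(records, handle, width=60):
    """Write records to an already opened handle, long name as the header."""
    for rec in records:
        handle.write(">" + rec.longname + "\n")
        # Wrap the sequence at `width` characters per line (assumed layout).
        for start in range(0, len(rec.seq), width):
            handle.write(rec.seq[start:start + width] + "\n")

# An in-memory handle stands in for a file the user opened beforehand.
buffer = io.StringIO()
write_records([Rec("seq1 first example", "ACGT"), Rec("seq2", "GG")], buffer)
fasta_text = buffer.getvalue()
```

Because the handle is passed in, opening and closing the file stays the caller's responsibility, which matches the requirement stated above.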

FASTA_names(_filepath)

Returns a list of names from a FASTA file, using the path as input. If you already have a list of FastaRecords or a FASTA_dict, you already have the names. Return values: a list of names on success; an empty list if the file contains no FASTA names; -1 if the file could not be read.
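The three-way return convention above (list, empty list, -1) can be sketched as follows. This is an illustration, not the module's code: fasta_names is a hypothetical re-implementation, and taking the name as the text between > and the first whitespace is an assumption borrowed from the short-name convention.

```python
import os
import tempfile

def fasta_names(filepath):
    """Return header names, [] if none are found, or -1 if unreadable."""
    try:
        with open(filepath) as handle:
            # Assumed rule: name = text after '>' up to the first whitespace.
            return [line[1:].split()[0] for line in handle
                    if line.startswith(">") and line[1:].split()]
    except OSError:
        return -1

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.fasta")
    with open(path, "w") as handle:
        handle.write(">seq1 first example\nACGT\n>seq2\nGG\n")
    names = fasta_names(path)
    missing = fasta_names(os.path.join(tmp, "absent.fasta"))
```

Since -1 is not a list, a caller following this convention has to compare the result with -1 before iterating over it.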

Class FastaRecord (_short='',_long='',_seq='',_alphabet=Alphabet.IUPAC.ambiguous_dna)

A short unique name. Uniqueness is not enforced, but the characters after > and before the first whitespace are used when calling Loader or Iterator.

holds the full name in the FASTA header

DNA sequence

sequence

returns sequence length

Class Loader (_filepath)

    self.filepath = _filepath    # path to FASTA file
    self.AllText = ''            # variable for holding the FASTA file
    self.flist = []              # list of FastaRecords captured from FASTA file
    self.fdict = {}              # same set of FastaRecords as a dictionary, the shortnames are used as keys

load(_list, _string, pos = 0)

After making a Loader instance, load() is called; it returns 1 (success), 0 (not FASTA) or -1 (not readable). Files are opened and closed by load(). When initializing, SplitFastaString is called automatically. If you change the _filepath later, you can call all of these manually.

split_fasta_string(alphabet=Alphabet.IUPAC.ambiguous_dna)

This will split the string and fill the self.flist object with FastaRecords. This requires that load() was already called to fill self.AllText.
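Splitting the loaded text into records might work along the lines of the sketch below. It is a minimal illustration under stated assumptions: split_fasta_text is an invented name, plain dictionaries stand in for FastaRecords, and the keys shortname, longname and seq are chosen for this example only.

```python
def split_fasta_text(all_text):
    """Split FASTA text into a list of record dictionaries (stand-ins)."""
    records = []
    for line in all_text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            # Header line: full text is the long name, first word the short name.
            longname = line[1:].strip()
            shortname = longname.split()[0] if longname else ""
            records.append({"shortname": shortname,
                            "longname": longname,
                            "seq": ""})
        elif line and records:
            # Sequence line: append it to the most recent record.
            records[-1]["seq"] += line
    return records

flist = split_fasta_text(">seq1 first example\nAC\nGT\n>seq2\nGG\n")
```

Text before the first header line is ignored in this sketch; whether the module does the same is not stated above.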

makedict(_alphabet=Alphabet.IUPAC.ambiguous_dna)

This converts the flist list into the fdict dictionary, where the shortname is the key and the sequence is the value. The longname is discarded. If several sequences in the FASTA file share the same name, the later sequence names have a running number appended to distinguish the entries from each other. The user is responsible for making sure there are no duplicate names.
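The running-number handling of duplicate names could be sketched like this. The exact suffix format used by the module is not given above, so the underscore-plus-counter form is an assumption, as is the helper name make_dict and the use of (name, sequence) tuples as input.

```python
def make_dict(pairs):
    """Build name -> sequence, renaming repeated names with a counter."""
    result = {}
    seen = {}  # how many times each original name has occurred so far
    for name, seq in pairs:
        count = seen.get(name, 0)
        seen[name] = count + 1
        # First occurrence keeps its name; later ones get an assumed suffix.
        key = name if count == 0 else f"{name}_{count}"
        result[key] = seq
    return result

fdict = make_dict([("a", "AC"), ("b", "GG"), ("a", "TT")])
```

A generated key can still collide with a name that already exists in the file (for example a record literally named a_1), which is one reason to keep names unique in the input.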

Class Iterator (filepath, alphabet=Alphabet.IUPAC.ambiguous_dna)

This class provides a method for iterating through a FASTA file. No 'proper' format check for FASTA conformity is made. The class uses the readline() function; therefore, if a different line divider is used, you need to convert the file beforehand and save it as a separate file. To use, make an instance and call the next() function.
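A readline()-based iterator of the kind described above might be sketched as follows. This is not the module's Iterator class: FastaIterSketch is a hypothetical name, it takes an open handle instead of a file path, it yields (long name, sequence) tuples instead of FastaRecords, and it follows the Python 3 iterator protocol (the built-in next() calling __next__).

```python
import io

class FastaIterSketch:
    """Yield (longname, sequence) tuples from a handle, one record at a time."""

    def __init__(self, handle):
        self._handle = handle
        self._header = None  # header line read ahead of the current record

    def __iter__(self):
        return self

    def __next__(self):
        # Skip forward to the next header if none has been read ahead.
        while self._header is None:
            line = self._handle.readline()
            if not line:
                raise StopIteration
            if line.startswith(">"):
                self._header = line[1:].strip()
        longname, seq_parts = self._header, []
        self._header = None
        # Collect sequence lines until the next header or end of file.
        while True:
            line = self._handle.readline()
            if not line:
                break
            if line.startswith(">"):
                self._header = line[1:].strip()
                break
            seq_parts.append(line.strip())
        return longname, "".join(seq_parts)

demo = io.StringIO(">seq1 first example\nAC\nGT\n>seq2\nGG\n")
records = list(FastaIterSketch(demo))
```

Only one record is held in memory at a time, which is the usual reason to iterate instead of loading the whole file.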