fasta -- this module contains classes for the reading and writing of FASTA files, and a record class for sequences
Module Functions (do not require an instance)
to_SeqRecord(_fRec, _alphabet=Bio.Alphabet.IUPAC.ambiguous_dna)
- Converts a FastaRecord instance into a SeqRecord instance. The _fRec object is not destroyed by this function. Because this is a module-level function, no class instance is needed to call it.
convert_dict_to_fasta(_dict={})
- Converts a dictionary into FASTA records.
convert_fasta_to_dict(_fastaList=[])
- Converts a list of FastaRecords into a dictionary and returns it.
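The two conversions above can be sketched roughly as follows. This is a minimal illustration, not the module's own code: `Record`, `dict_to_records` and `records_to_dict` are hypothetical stand-ins, and the dictionary layout (short name as key, sequence as value) is assumed to match the one described for `fdict`.

```python
from collections import namedtuple

# Hypothetical stand-in for FastaRecord; only the fields needed here are modelled.
Record = namedtuple("Record", ["shortname", "longname", "seq"])

def dict_to_records(seq_dict):
    """Build one record per (name, sequence) pair.

    Assumption: the dictionary key is used for both the short and the long name.
    """
    return [Record(name, name, seq) for name, seq in seq_dict.items()]

def records_to_dict(records):
    """Map each record's short name to its sequence."""
    return {rec.shortname: rec.seq for rec in records}

records = dict_to_records({"seq1": "ACGT", "seq2": "GGCC"})
round_trip = records_to_dict(records)
```

Note that in this sketch the long name does not survive the conversion to a dictionary.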
write_to_single_file(_fRecList, _filehandle)
- Writes a list of FastaRecord objects to a SINGLE text file. The file handle must already have been opened by the caller.
write_to_files(_fRecList, _filehandle)
- Writes a list of FastaRecord objects to a text file. The longname is used as the FASTA header. The file handle must already have been opened by the caller.
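Writing records to a handle that the caller has already opened might look like the sketch below. `Record` and `write_records` are hypothetical names, and the 60-character line width is an assumption; an in-memory `io.StringIO` stands in for the open file.

```python
import io
from collections import namedtuple

# Hypothetical stand-in for FastaRecord with just a long name and a sequence.
Record = namedtuple("Record", ["longname", "seq"])

def write_records(records, handle, width=60):
    """Write each record as a '>' header line followed by wrapped sequence lines.

    The handle is neither opened nor closed here; that is left to the caller.
    """
    for rec in records:
        handle.write(">" + rec.longname + "\n")
        for start in range(0, len(rec.seq), width):
            handle.write(rec.seq[start:start + width] + "\n")

buffer = io.StringIO()  # plays the role of a file opened by the caller
write_records([Record("seq1 example", "ACGT"), Record("seq2", "GG")], buffer)
fasta_text = buffer.getvalue()
```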
FASTA_names(_filepath)
- Returns a list of names from a FASTA file, given its path. If you already have a list of FastaRecords or a FASTA_dict, you already have the names. Return values: a list on success; an empty list if the file contains no FASTA names; -1 if the file cannot be read.
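The return convention described above (list, empty list, or -1) can be illustrated with a small stand-alone function. `fasta_names` here is a hypothetical re-implementation, and it assumes that a "name" is the text after `>` up to the first whitespace.

```python
import os
import tempfile

def fasta_names(filepath):
    """Return the header names in a FASTA file, [] if there are none, -1 if unreadable."""
    try:
        with open(filepath) as handle:
            names = []
            for line in handle:
                if line.startswith(">"):
                    fields = line[1:].split()
                    if fields:  # skip headers that carry no name at all
                        names.append(fields[0])
            return names
    except OSError:
        return -1

# Small demonstration with a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
    tmp.write(">seq1 first example\nACGT\n>seq2\nGGCC\n")
names = fasta_names(tmp.name)
os.remove(tmp.name)
missing = fasta_names(tmp.name)  # the file no longer exists
```

Because the failure value is an integer rather than a list, callers of a function with this convention need to check for -1 before iterating over the result.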
Class FastaRecord (_short='',_long='',_seq='',_alphabet=Alphabet.IUPAC.ambiguous_dna)
- shortname: a short unique name. This is not enforced, but Loader and Iterator use the characters after > and before the first whitespace as the short name
- longname: holds the full name from the FASTA header
- the DNA sequence
- comparison: compares sequences. You can compare FastaRecord objects with other FastaRecord objects, SeqRecord.SeqRecord objects, or strings using ==; the comparison is case-insensitive
- returns the sequence length
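A record class with the comparison and length behaviour listed above could be sketched like this. `SimpleRecord` and its attribute `seq` are hypothetical; the attribute name used by FastaRecord for the sequence is not given here, and comparison with SeqRecord objects is left out to keep the sketch free of external dependencies.

```python
class SimpleRecord:
    """Hypothetical stand-in for FastaRecord (short name, long name, sequence)."""

    def __init__(self, short_name="", long_name="", seq=""):
        self.shortname = short_name
        self.longname = long_name
        self.seq = seq  # assumed attribute name

    def __eq__(self, other):
        # Compare sequences only, ignoring case; names play no part.
        other_seq = other.seq if isinstance(other, SimpleRecord) else other
        if not isinstance(other_seq, str):
            return NotImplemented
        return self.seq.upper() == other_seq.upper()

    def __len__(self):
        return len(self.seq)

rec = SimpleRecord("seq1", "seq1 first example", "acgt")
same_as_string = rec == "ACGT"
same_as_record = rec == SimpleRecord("other", "other name", "AcGt")
length = len(rec)
```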
Class Loader (_filepath)
- After an instance is created, load() is called and returns 1 (success), 0 (not FASTA), or -1 (not readable). load() opens and closes the file itself. During initialization, split_fasta_string() is called automatically. If you change _filepath later, you can call these methods manually. You can also set self.filepath directly after initialization, but you must then call load() again; otherwise the contents of the previous file remain loaded.
self.filepath = _filepath  # path to FASTA file
self.AllText = ''          # variable for holding the FASTA file
self.flist = []            # list of FastaRecords captured from FASTA file
self.fdict = {}            # same set of FastaRecords as a dictionary, the shortnames are used as keys
load(_list, _string, pos = 0)
- Called after a Loader instance is created. Returns 1 (success), 0 (not FASTA), or -1 (not readable). The file is opened and closed within load(). During initialization, split_fasta_string() is called automatically. If you change _filepath later, you can call these methods manually.
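The three status codes can be illustrated with a stand-alone function. `load_text` is a hypothetical name, and the "not FASTA" test used here (the first non-blank character is not `>`) is an assumption about what such a check might look like.

```python
import os
import tempfile

def load_text(filepath):
    """Return (status, text): 1 = success, 0 = not FASTA, -1 = not readable."""
    try:
        with open(filepath) as handle:  # the file is opened and closed here
            text = handle.read()
    except OSError:
        return -1, ""
    if not text.lstrip().startswith(">"):  # assumed FASTA check
        return 0, ""
    return 1, text

# Demonstration with one FASTA file, one plain text file, and one missing file.
tmp_dir = tempfile.mkdtemp()
good = os.path.join(tmp_dir, "good.fasta")
bad = os.path.join(tmp_dir, "notes.txt")
with open(good, "w") as handle:
    handle.write(">seq1\nACGT\n")
with open(bad, "w") as handle:
    handle.write("plain text\n")
statuses = [
    load_text(good)[0],
    load_text(bad)[0],
    load_text(os.path.join(tmp_dir, "missing.fasta"))[0],
]
```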
split_fasta_string(alphabet=Alphabet.IUPAC.ambiguous_dna)
- Splits the string and fills self.flist with FastaRecords. load() must already have been called to fill self.AllText.
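Splitting the loaded text into records might be done along the following lines. This is only a sketch: `split_fasta_text` is a hypothetical function that returns plain tuples of (short name, long name, sequence) instead of FastaRecords, and it takes the short name to be the text after `>` up to the first whitespace.

```python
def split_fasta_text(text):
    """Split FASTA-formatted text into (shortname, longname, sequence) tuples."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            header = line[1:].strip()
            shortname = header.split()[0] if header else ""
            records.append([shortname, header, ""])
        elif line and records:
            # Sequence lines are joined onto the most recent header.
            records[-1][2] += line
    return [tuple(rec) for rec in records]

parsed = split_fasta_text(">seq1 first example\nAC\nGT\n>seq2\nGG\n")
```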
makedict(_alphabet=Alphabet.IUPAC.ambiguous_dna)
- Converts the flist list into the fdict dictionary, where the shortname is the key and the sequence is the value. The longname is discarded. If the FASTA file contains sequences with the same name, a running number is appended to the later names to distinguish the entries from each other. The user is still responsible for making sure there are no duplicate names.
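The handling of repeated names described above can be sketched as follows. `make_dict` is a hypothetical function working on (short name, sequence) pairs, and the `_1`, `_2` suffix format is an assumption; the exact form of the running number is not specified here.

```python
def make_dict(records):
    """Build {shortname: sequence}; later duplicates get a running number appended."""
    result = {}
    for shortname, seq in records:
        key, counter = shortname, 0
        while key in result:  # keep counting until the key is unused
            counter += 1
            key = "{}_{}".format(shortname, counter)
        result[key] = seq
    return result

fdict = make_dict([("a", "AC"), ("a", "GT"), ("b", "GG"), ("a", "TT")])
```

In this sketch the first occurrence keeps its original name, so a lookup by short name always returns the first sequence with that name.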
Class Iterator (filepath, alphabet=Alphabet.IUPAC.ambiguous_dna)
- This class provides a way to iterate through a FASTA file. No 'proper' check for FASTA conformity is made. The class uses the readline() function, so if the file uses a different line separator you need to convert it beforehand and save it as a separate file. To use it, make an instance and call the next() function.
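A readline()-based iterator of the kind described above could look roughly like this. `ReadlineFastaIterator` is a hypothetical class: unlike Iterator it takes an already open handle rather than a file path, and it yields plain tuples instead of FastaRecords. As in the description, no check for FASTA conformity is made.

```python
import io

class ReadlineFastaIterator:
    """Yield (shortname, longname, sequence) tuples, one record per call."""

    def __init__(self, handle):
        self._handle = handle
        self._pending = None  # header line already read while finishing a record

    def __iter__(self):
        return self

    def __next__(self):
        header = self._pending
        self._pending = None
        while header is None:  # skip anything before the first header
            line = self._handle.readline()
            if not line:
                raise StopIteration
            if line.startswith(">"):
                header = line
        parts = []
        while True:
            line = self._handle.readline()
            if not line:  # end of file
                break
            if line.startswith(">"):  # start of the next record
                self._pending = line
                break
            parts.append(line.strip())
        longname = header[1:].strip()
        shortname = longname.split()[0] if longname else ""
        return shortname, longname, "".join(parts)

    next = __next__  # allows the older next() method-call style as well

handle = io.StringIO(">seq1 first example\nAC\nGT\n>seq2\nGG\n")
all_records = list(ReadlineFastaIterator(handle))
```

Only one record is held in memory at a time, which is the main reason to prefer an iterator over loading the whole file for large inputs.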