aGBSQL | Universität Tübingen

aGBSQL -- this module reads GenBank files from disk and extracts upstream regulatory sequences. These sequences are stored internally as FastaRecord objects (see fasta) and can be written to a file in FASTA format.

Class gbSQL(_files = [], _start = -1000, _end = 0, _root = 0, _mRNAannot = 0, _overlap = 1)

this will hold the list of GB files

this will hold the list of FASTA seqs as FastaRecord

start position (5')

end position (3')

whether overlap with upstream genes is allowed (1 = yes, 0 = no)

which annotation is used to locate the 5' end of the transcript: 0 = gene annotation (present in many more files), 1 = mRNA annotation (not available for all files). The default is the gene annotation (0).
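The start and end positions above can be read as offsets relative to the 5' end of the gene. The following is a minimal sketch of how such a window might be turned into genomic coordinates. The function upstream_window, its arguments, the 0-based half-open coordinates and the clamping to the sequence bounds are all assumptions made for illustration; they are not taken from the module.

```python
def upstream_window(gene_start, gene_end, strand, start=-1000, end=0, seq_len=None):
    """Return (lo, hi), a 0-based half-open interval on the forward strand.

    Hypothetical helper, not part of aGBSQL.
    gene_start, gene_end: 0-based half-open gene coordinates.
    strand: +1 or -1.
    start, end: offsets relative to the gene's 5' end (negative = upstream).
    """
    if start >= end:
        raise ValueError("start must be smaller than end")
    if strand == 1:
        # The 5' end is gene_start; upstream lies at smaller coordinates.
        lo, hi = gene_start + start, gene_start + end
    elif strand == -1:
        # The 5' end is gene_end; upstream lies at larger coordinates.
        lo, hi = gene_end - end, gene_end - start
    else:
        raise ValueError("strand must be +1 or -1")
    # Clamp the window to the sequence bounds (assumed behaviour).
    lo = max(lo, 0)
    if seq_len is not None:
        hi = min(hi, seq_len)
    hi = max(hi, lo)
    return lo, hi


plus_window = upstream_window(5000, 6000, 1)
minus_window = upstream_window(5000, 6000, -1)
```

For a gene on the minus strand the extracted slice would still need to be reverse-complemented before it reads 5' to 3'.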

checkvalues(self)

this is called to check whether the values are appropriate before running. If there is an error, an error message is issued and the script is terminated.
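The kind of checks such a method might run can be sketched as follows. The function check_values and every rule in it are assumptions for illustration only; the source does not state which conditions checkvalues actually tests.

```python
def check_values(files, start, end, overlap, mrna_annot):
    """Return a list of error messages; an empty list means the values look usable.

    Hypothetical helper, not part of aGBSQL. All rules below are assumed.
    """
    errors = []
    if not files:
        errors.append("no GenBank files given")
    if start >= end:
        errors.append("start must be smaller than end")
    if overlap not in (0, 1):
        errors.append("overlap must be 0 or 1")
    if mrna_annot not in (0, 1):
        errors.append("mRNA annotation flag must be 0 or 1")
    return errors


ok = check_values(["genome.gb"], -1000, 0, 1, 0)
bad = check_values([], 0, -1000, 2, 5)
```

A caller that wants the terminate-on-error behaviour described above could print the messages and call sys.exit(1) when the list is not empty.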

clear(self)

clears data in self.extracted and self.dict

getsubset(locustags = [])

this will return the subset of genes whose FastaRecord.shortname matches one of the given locus tags
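A minimal sketch of this kind of filtering, assuming exact string matches on shortname. Record is a stand-in for FastaRecord with only two assumed fields, and get_subset is a hypothetical free function rather than the method itself.

```python
from collections import namedtuple

# Stand-in for FastaRecord; the real class may carry more fields.
Record = namedtuple("Record", ["shortname", "sequence"])


def get_subset(records, locustags):
    """Return the records whose shortname is in locustags, in their original order."""
    wanted = set(locustags)
    return [rec for rec in records if rec.shortname in wanted]


records = [
    Record("b0001", "ATGC"),
    Record("b0002", "GGCC"),
    Record("b0003", "TTAA"),
]
subset = get_subset(records, ["b0003", "b0001"])
```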

savefastatofile(_filehandle)

this will save a fasta list to a text file. It works by calling Fasta.write_to_file() on a Fasta instance.
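Writing records to an open file handle in FASTA format can be sketched as below. This is not the code of Fasta.write_to_file(); write_fasta, the (name, sequence) tuples and the 60-character line width are assumptions for illustration.

```python
import io


def write_fasta(records, filehandle, width=60):
    """Write (name, sequence) pairs to filehandle in FASTA format.

    Hypothetical helper, not part of aGBSQL.
    """
    for name, seq in records:
        filehandle.write(">%s\n" % name)
        # Wrap the sequence into lines of at most `width` characters.
        for i in range(0, len(seq), width):
            filehandle.write(seq[i:i + width] + "\n")


# An in-memory buffer stands in for a real file handle here.
buf = io.StringIO()
write_fasta([("b0001", "ATGC" * 20)], buf, width=60)
fasta_text = buf.getvalue()
```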

savedicttofile(_filehandle)

this will save the dict entries to a text file.

getsubset(_locustags = [])

this will return a subset of genes from the dictionary object (only)

getpromoters_gene(self)

this is the main function to call for extracting promoters. Use it when only the gene tag is present in the GenBank file. Global variables are used for toggle parameters.

getpromoters_locustag(self)

this is the main function to call for extracting promoters. Use it when the locus tag is present in the GenBank file. Global variables are used for toggle parameters.