Motif Mapper for Python v1.4
Download Current Version: v1.4
Requirements*:
- Python 2.5* or better, BioPython*, Numpy (not needed on install)
License:
GNU GPL v.2, Motif Mapper Creative License. Any restrictions for non-academics: as described in GPL v.2
Description
Python versions of some Motif Mapper scripts are available. All modules are independent, which allows more Python-like use than the VB versions. For example, you can retain motif objects, counts, or FASTA sequences in memory instead of saving them directly to disk.
- GenBank parser for extracting promoter sequences
- Simple counting of motifs in FASTA files
- Promoter PointMaps (new) and Curves (see Berendzen et al., 2006)
- Small motif generators, etc.
- Automatic folder reading and building for large mapping projects
Documentation:
aGBSQL provides a class for extracting upstream regulatory sequence from genomic DNA
folders provides a class for storage and output management
fasta provides classes for reading, writing and loading sequences in FASTA
mapping provides a class for implementing mapping projects
motifs provides classes for the generation, processing and conversion of DNA motifs
History
Python 1.1
- Core scripts.
Python 1.2
- Addition of PointCurve algorithms.
- Improvement of the GenBank parser to use the Gene or LocusTag annotations.
Python 1.3
- motifs.smartmotifs
Added Smart Motifs to the motifs module, with optional calls in the mapping module. Smart Motifs counts sense and antisense occurrences automatically. For composite motifs / modules (more than one motif separated by a fixed or variable spacing), it checks all Watson and Crick word versions of each motif while maintaining their positions relative to each other. For example, take A and P, where A is a non-palindrome (so A and B words are possible) and P is a palindrome: Smart Motifs looks for A-P and B-P. With two non-palindromes (A/B and C/D), it looks for A-C, A-D, B-C and B-D and returns one value.
- mapping.MapFasta()
Integrated smartmotifs as a property option of the MapFasta class, but not yet for point maps / curves. Added a property ".globalallfiles" that returns the global counts for all sequences in all files, instead of the previous counts for all sequences in one file. This can be toggled on or off.
- aGBSQL
Some GenBank files do not have an xdb_ref property, which caused the script to crash. This is now avoided using a try block.
Python 1.4
- Variable assignment error discovered in aGBSQL, which caused incorrect promoter extraction in previous versions when gene overlap was not allowed. Corrected.
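The Smart Motifs behaviour described under Python 1.3 can be illustrated with a small standalone sketch. This is not the code in motifs.smartmotifs; it is written for Python 3, the function names are made up for the illustration, and it assumes the spacing between words means "any bases, between min_gap and max_gap of them". It builds every Watson/Crick combination while keeping each word in its position, then sums the matches into one value.

```python
import itertools
import re

# Translation table for complementing DNA, lower and upper case.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(word):
    """Return the reverse complement (Crick version) of a DNA word."""
    return word.translate(_COMPLEMENT)[::-1]


def strand_versions(word):
    """Return the distinct strand versions of a word (only one if palindromic)."""
    rc = reverse_complement(word)
    return [word] if rc == word else [word, rc]


def composite_versions(words):
    """Return all version combinations, keeping every word in its position."""
    return list(itertools.product(*(strand_versions(w) for w in words)))


def count_composite(sequence, words, min_gap=0, max_gap=0):
    """Sum the match start positions of every version combination.

    A lookahead is used so that overlapping matches are counted too.
    """
    spacer = ".{%d,%d}" % (min_gap, max_gap)
    total = 0
    for combo in composite_versions(words):
        body = spacer.join(re.escape(w) for w in combo)
        total += len(re.findall("(?=%s)" % body, sequence, flags=re.IGNORECASE))
    return total


# A non-palindrome followed by a palindrome gives the A-P and B-P versions.
print(composite_versions(["tgacg", "cacgtg"]))
```

Two non-palindromes give four combinations (A-C, A-D, B-C, B-D), and a single palindrome such as cacgtg gives only one, so it is not counted twice.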
Example Scripts
Extracting Promoter Sequences
- Here is a script that loads GenBank flat files from disk, extracts upstream regulatory regions, and saves them to a file:
from MotifMapper import aGBSQL #import the aGBSQL module
g = aGBSQL.gbSQL() #create an instance of gbSQL
g.gbfiles = ['File1.gbk', 'File2.gbk'] #set files; these are in the current working directory
g.getpromoters_gene() #execute promoter extraction
g.savetofile(open('output.txt', 'w')) #save extracted promoters in FASTA to file
To see what the different toggle states are, print the class instance, which calls __str__. In this case the instance is g:
print g
set self.gbfiles as a list of paths; then run getpromoters()
names are also variables
start: -1000
end: 0
root: 0
overlap: 1
option mRNAannot: 0
number GBfiles files loaded (gbfiles) 0
number fasta seqs loaded (extracted) 0
number dictionary entries (dict) 0
>>>
Here is a hint for saving extracted promoter sequences to the default Motif Mapper output folder:
from MotifMapper import folders #load the folders module
f = folders.Folders() #make an instance of the Folders class
g.savetofile(open(f.rootfolder + '\\output_name.txt', 'w')) #save the promoters to file in FASTA
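To make the start and end settings more concrete, here is a rough standalone sketch (Python 3) of cutting an upstream window out of a genome string. It is not how aGBSQL works internally. The function names, the 0-based coordinates with an exclusive gene end, and the strand values 1 / -1 are all assumptions made for the illustration, and the overlap and root options are not modelled.

```python
# Translation table for complementing DNA, lower and upper case.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_window(genome, gene_start, gene_end, strand, start=-1000, end=0):
    """Return the region from start to end relative to the first base of a gene.

    The gene occupies genome[gene_start:gene_end]. For strand 1 the window
    lies before gene_start. For strand -1 it lies after gene_end and is
    reverse-complemented so that it reads towards the gene. Windows that run
    off either end of the genome are clipped.
    """
    if strand == 1:
        lo = max(0, gene_start + start)
        hi = max(0, gene_start + end)
        return genome[lo:hi]
    lo = min(len(genome), gene_end - end)
    hi = min(len(genome), gene_end - start)
    return reverse_complement(genome[lo:hi])


def to_fasta(name, seq, width=60):
    """Format one sequence as a FASTA record with wrapped lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">%s\n%s\n" % (name, "\n".join(lines))


# Toy genome: 5 bases, a 5-base upstream region, a 6-base gene, 5 more bases.
genome = "ttttt" + "ggcca" + "atgaaa" + "ccgta"
print(to_fasta("plus_gene", upstream_window(genome, 10, 16, 1, start=-5)))
print(to_fasta("minus_gene", upstream_window(genome, 10, 16, -1, start=-5)))
```

With the default start of -1000 the toy window is simply clipped at the beginning of the genome.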
Mapping CREs in FASTA sequences
- Here is a script that loads files with sequences in FASTA, maps CREs to them and saves the results to disk:
from MotifMapper import mapping #import the mapping module
m = mapping.MapFasta() #create an instance of MapFasta
m.loadMotifs(man=1,ls=['cacgtg{0,30}cacgtg','cacgtg']) #load motifs manually, sets the list ls with members
m.loadFiles(man=1) #load files manually
>>>myFastaFile1.txt #entered file path of a file with sequences in FASTA
>>>myFastaFile2.txt #entered file path of a file with sequences in FASTA
>>>q #stop entering file paths
m.mapFiles() #map entered motifs onto entered files; the results are automatically deposited
# into a folder named the same as the input file, under the default master folder; see below.
To see what the different toggle states are, print the class instance, which calls __str__. In this case the instance is m:
print m
Your root path is: E:\Documents and Settings\user\Motif-Mapper #default folder, Windows version
this is under self.Fld.settings['ROOTFOLDER'] #tells where you can change the RootFolder
To use: loadMotifs() and loadFiles(), then run mapFiles() #hints for simple use
Sequences in Fdict (mapObjs): 0 #tells number of dictionary sequences loaded
fileAppend: #string for appending to output folder names (is currently '')
outputMode: 0 #tells current output mode
fileScope: 0 #tells file scope
motifs loaded: 0 #tells number of motifs loaded
files queued: 0 #tells number of files loaded
>>>
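The difference between counts for one file and global counts over all files can be sketched as follows. This is a standalone Python 3 illustration, not the MapFasta implementation. The function names and the nested dictionary layout (file name, then sequence name, then sequence) are assumptions, and only simple single-word motifs on one strand are counted.

```python
import re
from collections import Counter


def count_overlapping(sequence, motif):
    """Count every start position of motif in sequence, overlaps included."""
    pattern = "(?=%s)" % re.escape(motif)
    return len(re.findall(pattern, sequence, flags=re.IGNORECASE))


def map_files(files, motifs):
    """Return (per_file, global_counts) for the given motifs.

    files maps a file name to a dictionary of {sequence_name: sequence}.
    per_file holds one Counter per file; global_counts sums over all files.
    """
    per_file = {}
    global_counts = Counter()
    for file_name, records in files.items():
        counts = Counter()
        for sequence in records.values():
            for motif in motifs:
                counts[motif] += count_overlapping(sequence, motif)
        per_file[file_name] = counts
        global_counts.update(counts)
    return per_file, global_counts


# Sequences held in memory, standing in for two FASTA files.
files = {
    "myFastaFile1.txt": {"seq1": "cacgtgcacgtg", "seq2": "aaaa"},
    "myFastaFile2.txt": {"seq3": "ttcacgtgtt"},
}
per_file, global_counts = map_files(files, ["cacgtg", "aaa"])
print(per_file)
print(global_counts)
```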
Changing output location
- You can change the default folder for mapping output either by changing the value of the dictionary key 'ROOTFOLDER' or by editing the path directly in the text file in the permanent settings folder.
from MotifMapper import folders #load the folders module
f = folders.Folders() #make an instance of the Folders class
f.settings['ROOTFOLDER'] = 'newpath' #change the default save path by changing the value of the key ROOTFOLDER
f.saveDefaults() #save changes to the Motif Mapper settings file
Reading Fasta using the Iterator
- Here is a script that reads a file with sequences in FASTA one record at a time using the Iterator class:
from MotifMapper import fasta #load the fasta module
i = fasta.Iterator('fasta_file.txt') #make an instance of the Iterator class
try:
    while i.next(): #loop through the file
        print i.fRec #here you would process the FastaRecord
except StopIteration: #the end of the iteration raises an exception that has to be caught
    pass #do nothing
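For comparison, the same record-by-record pattern can be written as a plain Python generator, which a for loop consumes without an explicit try block. This is a minimal Python 3 sketch and not the fasta.Iterator class; the name iter_fasta and the (header, sequence) tuples it yields are assumptions made for the illustration.

```python
import io


def iter_fasta(handle):
    """Yield (header, sequence) tuples from an open FASTA file handle.

    The header is returned without the leading '>'. Sequence lines are
    joined into one string. Blank lines are skipped.
    """
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# An in-memory handle stands in for open('fasta_file.txt').
example = io.StringIO(">seq1 first record\nacgt\nacgt\n\n>seq2\nttt\n")
for name, sequence in iter_fasta(example):
    print(name, len(sequence))
```

Only one record is held in memory at a time, which is the main reason to iterate instead of loading a whole file.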