Zentrum für Molekularbiologie der Pflanzen (ZMBP)

Motif Mapper for Python v1.4

Download Current Version: v1.4

Requirements*:

    Python 2.5* or better, BioPython*, Numpy (not needed on install)

License:

GNU GPL v.2, Motif Mapper Creative License. Any restrictions for non-academics: as described in GPL v.2

Description

    Python versions of some Motif Mapper scripts have been written. All modules are independent, which allows more Python-like use than the VB versions. For example, you can retain motif objects and counts or FASTA sequences in memory instead of saving directly to disk.

      GenBank parser for extracting promoter sequences
      simple counting of motifs in FASTA files
      Promoter PointMaps (new) and Curves (see Berendzen et al., 2006)
      small motif generators, etc.
      automatic folder reading and building for large mapping projects

      Documentation:

      aGBSQL provides a class for extracting upstream regulatory sequence from genomic DNA
      folders provides a class for storage and output management
      fasta provides classes for reading, writing and loading sequences in FASTA
      mapping provides a class for implementing mapping projects
      motifs provides classes for the generation, processing and conversion of DNA motifs
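
As an illustration of what motif generation and conversion can look like, here is a minimal sketch in plain Python. It does not use the motifs module, and the names all_words and iupac_to_regex are hypothetical, not part of Motif Mapper. It generates every DNA word of a given length and converts an IUPAC motif into a regular expression.

```python
import itertools

# Standard IUPAC nucleotide codes mapped to regular expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def all_words(k, alphabet="ACGT"):
    """Return every word of length k over the alphabet (hypothetical helper)."""
    return ["".join(letters) for letters in itertools.product(alphabet, repeat=k)]


def iupac_to_regex(motif):
    """Convert an IUPAC motif such as 'CANNTG' into a regular expression."""
    return "".join(IUPAC[base] for base in motif.upper())


hexamers = all_words(6)            # 4**6 = 4096 words
ebox = iupac_to_regex("CANNTG")    # 'CA[ACGT][ACGT]TG'
```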

History

Python 1.1

    Core scripts.

Python 1.2

    Addition of PointCurve algorithms.
    Improved the GenBank parser to use the Gene or LocusTag annotations.

Python 1.3

    motifs.smartmotifs
    Added Smart Motifs to the motifs module, with optional calls in the mapping module. Smart Motifs counts sense and antisense occurrences automatically. For composite motifs / modules (more than one motif separated by a fixed or variable spacing), it checks all Watson and Crick word versions of each motif while maintaining their positions relative to each other. For example, take A and P, where A is a non-palindrome (so A and B words are possible) and P is a palindrome; Smart Motifs then looks for A-P and B-P. With two non-palindromes (A/B and C/D), it looks for A-C, A-D, B-C and B-D and returns one value.
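
The combination logic described above can be sketched in plain Python. This is an illustration under stated assumptions, not the smartmotifs implementation: the function names are hypothetical, the spacer is written as an ordinary regular expression (for example '.{0,30}'), which may differ from Motif Mapper's own motif notation, and matches are counted without overlap.

```python
import itertools
import re

COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a"}


def reverse_complement(word):
    """Return the reverse complement (Crick version) of a DNA word."""
    return "".join(COMPLEMENT[base] for base in reversed(word.lower()))


def word_versions(word):
    """Return [word] for a palindrome, else [word, reverse complement]."""
    word = word.lower()
    crick = reverse_complement(word)
    return [word] if crick == word else [word, crick]


def composite_patterns(words, spacer=""):
    """Build every Watson/Crick combination, keeping the motif order fixed."""
    versions = [word_versions(word) for word in words]
    return [spacer.join(combo) for combo in itertools.product(*versions)]


def smart_count(sequence, words, spacer=""):
    """Sum the non-overlapping matches of all combinations into one value."""
    sequence = sequence.lower()
    return sum(len(re.findall(pattern, sequence))
               for pattern in composite_patterns(words, spacer))


# A non-palindrome (tgacg) followed by a palindrome (cacgtg): A-P and B-P.
patterns = composite_patterns(["tgacg", "cacgtg"], ".{0,5}")
total = smart_count("tgacgaacacgtgttcgtcacacgtg", ["tgacg", "cacgtg"], ".{0,5}")
```

With two non-palindromes, composite_patterns returns four patterns, matching the A-C, A-D, B-C, B-D case in the text.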

    mapping.MapFasta()
    Integrated smartmotifs as a property option of the MapFasta class, but not yet for point maps / curves. Added a property ".globalallfiles" that returns the global counts for all sequences in all files, instead of the previous counts for all sequences in one file. This can be toggled on or off.

    aGBSQL
    Some GenBank files do not have an xdb_ref property, which caused the script to crash. This is now avoided using a try block.

Python 1.4

    A variable assignment error was discovered in aGBSQL that caused incorrect promoter extraction in previous versions when gene overlap was not allowed. Corrected.

 

Example Scripts

Extracting Promoter Sequences

    Here is a script that loads a GenBank flat-file from disk and extracts upstream regulatory regions and saves them to a file:
    from MotifMapper import aGBSQL  #import the aGBSQL module
    g = aGBSQL.gbSQL()  #create an instance of gbSQL
    g.gbfiles = ['File1.gbk', 'File2.gbk']  #set the input files; these are in the current working directory
    g.getpromoters_gene()  #execute promoter extraction
    g.savetofile(open('output.txt', 'w'))  #save the extracted promoters to a file in FASTA format
    To see the current toggle states, print the class instance, which calls its __str__ method. In this case the instance is g:
    print g

    set self.gbfiles as a list of paths; then run getpromoters()
    names are also variables
    start: -1000
    end: 0
    root: 0
    overlap: 1
    option mRNAannot: 0

    number GBfiles files loaded (gbfiles) 0
    number fasta seqs loaded (extracted) 0
    number dictionary entries (dict) 0
    >>>
    Here is a hint for saving extracted promoter sequences to the default Motif Mapper output folder:
    from MotifMapper import folders  #load the folders module
    f = folders.Folders()  #make an instance of the Folders class
    g.savetofile(open(f.rootfolder + '\\output_name.txt', 'w'))  #save the promoters to a file in FASTA format
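
To make the start, end and overlap settings more concrete, here is a minimal sketch of how an upstream window could be cut from a genomic sequence. It is not the aGBSQL code. The function name, the 0-based half-open coordinates and the way the neighbouring gene is passed in are all assumptions made for illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of an upper- or lower-case DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def upstream_region(genome, gene_start, gene_end, strand,
                    start=-1000, end=0, neighbour=None, overlap=True):
    """Cut the window [start, end) relative to the gene's 5' end.

    Assumptions: coordinates are 0-based and half-open; strand is 1 or -1;
    neighbour is the boundary of the adjacent upstream gene (its end on the
    plus strand, its start on the minus strand). With overlap=False the
    window is clipped so that it does not run into that gene.
    """
    if strand == 1:
        lo, hi = gene_start + start, gene_start + end
        if not overlap and neighbour is not None:
            lo = max(lo, neighbour)
    else:
        lo, hi = gene_end - end, gene_end - start
        if not overlap and neighbour is not None:
            hi = min(hi, neighbour)
    lo, hi = max(lo, 0), min(hi, len(genome))
    if lo >= hi:
        return ""
    region = genome[lo:hi]
    # Minus-strand promoters are reported in the gene's own orientation.
    return region if strand == 1 else reverse_complement(region)


genome = "AAAACCCCGGGGTTTT"
plus = upstream_region(genome, 8, 12, 1, start=-4)    # 'CCCC'
minus = upstream_region(genome, 8, 12, -1, start=-4)  # 'AAAA'
```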

Mapping CREs in FASTA sequences

    Here is a script that loads files with sequences in FASTA, maps CREs to them and saves the results to disk:
    from MotifMapper import mapping				#imports the mapping module
    m = mapping.MapFasta() #create an instance of MapFasta
    m.loadMotifs(man=1,ls=['cacgtg{0,30}cacgtg','cacgtg']) #load motifs manually; ls is the list of motifs to load
    m.loadFiles(man=1) #load files manually
    >>>myFastaFile1.txt #entered file path of a file with sequences in FASTA
    >>>myFastaFile2.txt #entered file path of a file with sequences in FASTA
    >>>q #terminate entering file paths
    m.mapFiles() #map entered motifs onto entered files, the results are automatically deposited
    # into a folder named the same as the input file, under the default master folder; see below.
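
The counting step behind a script like the one above can be sketched without Motif Mapper. The following is an assumption-laden illustration, not the MapFasta implementation: the names count_motif and map_motifs are hypothetical, motifs are treated as ordinary regular expressions, and overlapping occurrences are counted, which may differ from what Motif Mapper does.

```python
import re


def count_motif(sequence, motif):
    """Count start positions where the motif matches, overlaps included."""
    # A zero-width lookahead lets matches that share bases be counted too.
    pattern = re.compile("(?=(%s))" % motif, re.IGNORECASE)
    return sum(1 for _ in pattern.finditer(sequence))


def map_motifs(sequences, motifs):
    """Return {motif: {sequence name: count}} for sequences held in memory."""
    return dict(
        (motif, dict((name, count_motif(seq, motif))
                     for name, seq in sequences.items()))
        for motif in motifs
    )


promoters = {"p1": "ttcacgtgaa", "p2": "tttt"}
counts = map_motifs(promoters, ["cacgtg"])
```

Because the result stays in a dictionary, it can be inspected or combined with other counts before anything is written to disk.
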
    To see the current toggle states, print the class instance, which calls its __str__ method. In this case the instance is m:
    print m
    Your root path is: E:\Documents and Settings\user\Motif-Mapper #default folder, Windows version
    this is under self.Fld.settings['ROOTFOLDER'] #tells where you can change the RootFolder
    To use: loadMotifs() and loadFiles(), then run mapFiles() #hints for simple use
    Sequences in Fdict (mapObjs): 0 #tells number of dictionary sequences loaded

    fileAppend: #string for appending to output folder names (is currently '')
    outputMode: 0 #tells current output mode
    fileScope: 0 #tells file scope
    motifs loaded: 0 #tells number of motifs loaded
    files queued: 0 #tells number of files loaded
    >>>

Changing output location

    You can change the default folder for mapping output either by changing the value of the dictionary key 'ROOTFOLDER' or by editing the path directly in the settings text file in the permanent settings folder.
    from MotifMapper import folders  #load the folders module
    f = folders.Folders() #make an instance of the Folders class
    f.settings['ROOTFOLDER'] = 'newpath' #change the default save path by changing the value of the key ROOTFOLDER
    f.saveDefaults() #save changes to the Motif Mapper settings file
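
If you build output paths yourself, os.path.join avoids hard-coding the Windows path separator. The sketch below is only an illustration: output_folder is a hypothetical helper, and both dropping the file extension and appending a suffix (compare fileAppend) are assumptions about how the per-file folder is named.

```python
import os


def output_folder(root_folder, input_path, file_append=""):
    """Build a per-input-file folder path under the root folder.

    Assumption: the folder is named after the input file without its
    extension, followed by an optional suffix.
    """
    base = os.path.splitext(os.path.basename(input_path))[0]
    return os.path.join(root_folder, base + file_append)


folder = output_folder("newpath", "myFastaFile1.txt")
tagged = output_folder("newpath", "myFastaFile1.txt", file_append="_run2")
```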

Reading Fasta using the Iterator

    Here is a script that steps through a file with sequences in FASTA one record at a time using the Iterator class:
    from MotifMapper import fasta  #load the fasta module
    i = fasta.Iterator('fasta_file.txt')  #make an instance of the Iterator class
    try:
        while i.next():  #loop through the file
            print i.fRec  #here you would process the FastaRecord
    except StopIteration:  #the end of the iteration raises an exception that has to be caught
        pass  #do nothing
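
For comparison, the same record-by-record pattern can be sketched with a plain Python generator. This is not the fasta.Iterator class; iter_fasta is a hypothetical name, and the sketch assumes a simple FASTA layout in which each header line starts with '>'. A for loop or list() handles StopIteration for you, so no try block is needed.

```python
import io


def iter_fasta(handle):
    """Yield (header, sequence) tuples from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines may be wrapped
    if header is not None:
        yield header, "".join(chunks)


# An in-memory file stands in for open('fasta_file.txt').
example = io.StringIO(u">seq1 first\nACGT\nACGT\n>seq2\nTTGA\n")
records = list(iter_fasta(example))
```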