Algorithmen der Bioinformatik

Master Theses

Here is a listing of possible topics for projects. Further suggestions are welcome. Additional Diploma and Master theses in bioinformatics can be found here.


  • Sexes in the rockpools: molecular basis and evolution of male versus female differentiation in brown algae - AVAILABLE - co-advisor: Susana Coelho, PhD, Director MPI for Developmental Biology
  • Use assembly graphs in contig binning - develop methods and implement in Java  - Janus Bieler 
  • Use of GTDB and AnnoTree for protein-alignment-based microbiome analysis - Hendrik Felderhoff 
  • Using profile-HMMs to identify protein families/domains on raw error-prone long-reads - AVAILABLE 
    Profile-HMMs are used to find proteins, protein families, or shared protein domains in sequence data. However, they are designed to work with error-free data and perform poorly on error-prone long reads or long-read assemblies. The aim of this thesis is to develop HMM-based methods that can specifically handle the insertion and deletion errors in error-prone long reads.
  • Targeted functional annotation of bacterial genomes using DIAMOND and MEGAN in a phylogenomics approach  - AVAILABLE 
    This phylogenomics project aims to analyze the taxonomic distribution of plant growth-promoting traits (PGPT) and their association with host-beneficial and host-pathogenic bacteria by applying our adapted DIAMOND-MEGAN pipeline for genes and proteins and the novel PGPT ontology to all complete and incomplete bacterial genomes.
  • Identifying the antibiotic resistance potential of healthy human hosts  - AVAILABLE 
    Different approaches have been used to understand the antibiotic resistance potential of microbes associated with different disease states, but very few studies have focused on understanding it in the healthy human host. We will compare the available tools to find the best-suited approach for the task and then explore the resistance potential of the microbiota residing in healthy humans.
  • qiime2megan - develop a set of tools that allows one to import qiime2 data into MEGAN and, vice versa, export MEGAN analyses into qiime2 (Python or Java) - Luis Lederhaas - co-advisor: Prof. Ruth Ley, Director MPI for Developmental Biology
  • Metagenome-wide association studies (MWAS) of plant growth-promoting traits (PGPT)  - AVAILABLE
    This project investigates the distribution of PGPTs (genes/proteins) in various microbiomes. Taxonomic and functional diversity will be studied with respect to significant differences from plant-associated metagenomes and to taxonomic affiliations, using e.g. DIAMOND and MEGAN.
  • Composition of paralogous ribosomal protein genes across multiple Arabidopsis thaliana genomes - AVAILABLE - advisor: Prof. Detlef Weigel, Director MPI for Developmental Biology
  • In silico analysis of the evolution and function of ABC transporters in bacterial secondary metabolite gene clusters  - AVAILABLE - Co-advisor: Prof. Nadine Ziemert (Biology Tübingen)
  • Prediction of genes in genomes with frequent (~10%) translational frameshifting (i.e. Euplotes, with a fragmented genome like Oxytricha) - AVAILABLE - Co-advisor: Dr. Estienne Swart, MPI for Developmental Biology
  • Prediction of genes in genomes with ambiguous genetic codes (where “stop” codons can be sense or stop, depending on the context) - David-Emanuel Vetter - Co-advisor: Dr. Estienne Swart, MPI for Developmental Biology


  • Visualization and analysis of autocatalytic networks (implementation using JavaFX) - Marietta Hamberger - co-advisor: Prof. Mike Steel (University of Canterbury, NZ)
  • Protein k-mer methods for microbiome analysis - Joachim Fritscher, co-advisor: Prof. Ruth Ley, MPI for Developmental Biology
  • Identifying the Role of ALPs in Methanobacteriaceae - Patrizia Ricca - co-advisor: Prof. Ruth Ley
  • Inclusion of Environmental Data in Machine Learning Models for Genomic Prediction in Rice - Mirjam Fröschl
  • Database analysis of function - Sylvia Siegel - Co-advisor: Prof. Lars Angenent, Dept. of Biological and Environmental Engineering
  • Computational analysis of metagenome data from caprylate-producing bioreactors - Timo Lucas, Co-advisor: Prof. Lars Angenent
  • Interpretability of Machine Learning Models for Genomic Selection in Maize - Lisa Marie Dussling
  • Effect of reference genome choice on variant calling - Sebastian Felix Vorbrugg, co-advisor: Prof. Detlef Weigel
  • Assessment of assembly strategies for bioreactor metagenomics - Steffen Kienzler, Co-advisor: Prof. Lars Angenent
  • Improved metagenomic contig binning using haplotagging data - Ersoy Kocak - co-advisor: Dr. Frank Chan


  • Analysis of twin study microbiome samples - done
  • Performance of DIAMOND+MEGAN on CAMI data - done
  • Machine learning algorithms applied to protein sequences - done


  • JavaFX implementation of haplotype networks  (SplitsTree5 - part II) - done
  • Evolution and horizontal gene transfer of regulatory elements in bacterial secondary metabolite gene clusters  - done
  • Exploring the use of TPR "Trough to Peak Ratio" analysis to determine which bacteria are growing and which are stagnant in the human gut during a course of antibiotics - done
  • Design and implementation of a full-featured Time Series Analysis tool - done



  • JavaFX implementation of Phylogenetic network drawing and GUI (SplitsTree5 - part I) - done
  • Pathogen identification - done
  • Correspondence between KEGG and InterPro in metagenome analysis - done


  • Assembly of ancient mtDNA genomes - done
  • Real-time monitoring of resistance evolution - done
  • SamSifter - A toolbox for metagenomic analysis - done
  • Fingerprinting of microbial genomes - done


  • Haplotype profile sharing in Arabidopsis thaliana - done
  • Visualization of very large numbers of metagenome samples - done



  • Functional analysis of trinucleotide repeats in plants - done
  • Sequencing and assembly strategies for a new plant genome - done



  • Annotation of bacterial genomes - done
  • Reference-guided protein assembly - done
  • Naive Bayesian classifier for metagenomics - done


  • Pathway evaluation in (meta) transcriptomics - done
  • Analysis of 16S data - done
  • Finding confidence intervals for multiple metagenome comparison networks - done
  • Assembly and annotation of the Guppy transcriptome - done
  • Correlating taxonomy and gene function with environmental parameters - done
  • Short-Read aligners in Metagenomics - done



  • Faster BLAST analysis of metagenomic data - done
  • New methods for the comparison of phylogenetic trees and networks - done
  • Hybridization networks - done
  • TE Discovery by Next Gen Sequencing - done

  • Simulation of 3rd generation sequencing technologies - done

  • Analysis of human gut data - done
  • Analysis of 16S rRNA - done
  • Database-supported analysis of metagenomic data - done
  • Functional and pathway analysis of metagenomic data - done

  • Efficient data mining techniques for two-locus association mapping - done
  • Finding Patterns in Intervals - done