Algorithmen der Bioinformatik

Master Theses

Here is a listing of possible topics for projects. Further suggestions are welcome.


  • Phylogenetic analysis of phosphate-accumulation genes for short-read and long-read assembled genomes for waste-water sludge - AVAILABLE - co-advisor: Dr. Rohan Williams (NUS)
  • Using profile-HMMs to identify protein families/domains on raw error-prone long-reads - AVAILABLE
    • Profile-HMMs are used to find proteins, protein families, or shared protein domains in sequence data. However, they are designed for accurate sequence data and perform poorly on error-prone long reads or long-read assemblies. The aim of this thesis is to develop HMM-based methods that can specifically handle insertion and deletion errors in error-prone long reads.
  • Identifying the antibiotic resistance potential of healthy human hosts - AVAILABLE
    • Different approaches have been used to study the antibiotic resistance potential of microbes associated with different disease states, but very few studies have focused on the healthy human host. We aim to compare the available tools to find the best-suited approach for the task, and then to explore the resistance potential of the microbiota residing in healthy humans.
  • Metagenome-wide association studies (MWAS) of plant growth-promoting traits (PGPT)  - AVAILABLE
    • This project investigates the distribution of PGPTs (genes/proteins) across various microbiomes. Their taxonomic and functional diversity will be compared against plant-associated metagenomes to identify significant differences and their taxonomic affiliations, using e.g. DIAMOND and MEGAN.
  • Sexes in the rock pools: molecular basis and evolution of male versus female differentiation in brown algae - AVAILABLE - Co-advisor: Dr. Susana Coelho, Director MPI for Developmental Biology
    • The brown algae are a eukaryotic supergroup that has been evolving independently of animals and land plants for more than a billion years. During that time, they acquired multicellularity to become the third most developmentally complex lineage on the planet, rivalling land plants in terms of body size and complexity. The Coelho lab has recently identified several major developmental regulators and dissected the chromosomal basis of sex determination in this group (e.g. Ahmed et al., 2014; Cock et al., 2010; Coelho et al., 2018), providing a solid foundation for the future development of brown algal developmental biology and comparative molecular biology. Research currently focuses on the origin, evolution and regulation of sexual system diversity and on the molecular and evolutionary mechanisms that underlie the complex developmental patterns and reproductive features in the brown algae. RNAseq datasets exist for males and females across several species of brown algae; the master project involves producing more data from cultivated algae and analyzing and interpreting the datasets using bioinformatic and molecular evolution approaches.
  • Composition of paralogous ribosomal protein genes across multiple Arabidopsis thaliana genomes - AVAILABLE - advisor: Prof. Detlef Weigel, Director MPI for Developmental Biology
  • In silico analysis of the evolution and function of ABC transporters in bacterial secondary metabolite gene clusters  - AVAILABLE - Co-advisor: Prof. Nadine Ziemert (Biology Tübingen)
  • RNAseq and Iso-seq data analysis - Paul Epperlein - Co-advisor: Dr. Susana Coelho, Director MPI for Developmental Biology
  • Prediction of genes in genomes with frequent (~10%) translational frameshifting (i.e. Euplotes, with a fragmented genome like Oxytricha) - AVAILABLE - Co-advisor: Dr. Estienne Swart, MPI for Developmental Biology


  • Targeted functional annotation of bacterial genomes using DIAMOND and MEGAN in a phylogenomics approach  - Lakshmipriya-sivaprakash Thrukonda - co-advisor: Prof. Kay Nieselt
  • Use assembly graphs in contig binning - develop methods and implement in Java  - Janus Bieler 
  • Use of GTDB and AnnoTree for protein-alignment-based microbiome analysis - Hendrik Felderhoff 
  • qiime2megan - develop a set of tools that allows one to import qiime2 data into MEGAN and, vice versa, export MEGAN analyses into qiime2 (Python or Java) - Luis Lederhaas - co-advisor: Prof. Ruth Ley, Director MPI for Developmental Biology
  • Prediction of genes in genomes with ambiguous genetic codes (where “stop” codons can be sense or stop, depending on the context) - David-Emanuel Vetter - Co-advisor: Dr. Estienne Swart, MPI for Developmental Biology


  • Visualization and analysis of autocatalytic networks (implementation using JavaFX) - done
  • Protein k-mer methods for microbiome analysis - done
  • Identifying the Role of ALPs in Methanobacteriaceae - done
  • Inclusion of Environmental Data in Machine Learning Models for Genomic Prediction in Rice - done
  • Database analysis of function - done
  • Computational analysis of metagenome data from caprylate-producing bioreactors - done
  • Interpretability of Machine Learning Models for Genomic Selection in Maize - done
  • Effect of reference genome choice on variant calling - done
  • Assessment of assembly strategies for bioreactor metagenomics - done
  • Improved metagenomic contig binning using haplotagging data - done


  • Analysis of twin study microbiome samples - done
  • Performance of DIAMOND+MEGAN on CAMI data - done
  • Machine learning algorithms applied to protein sequences - done


  • JavaFX implementation of haplotype networks  (SplitsTree5 - part II) - done
  • Evolution and horizontal gene transfer of regulatory elements in bacterial secondary metabolite gene clusters  - done
  • Exploring the use of TPR "Trough to Peak Ratio" analysis to determine which bacteria are growing and which are stagnant in the human gut during a course of antibiotics - done
  • Design and implementation of a full-featured Time Series Analysis tool - done


  • JavaFX implementation of Phylogenetic network drawing and GUI (SplitsTree5 - part I) - done
  • Pathogen identification - done
  • Correspondence between KEGG and InterPro in metagenome analysis - done


  • Assembly of ancient mtDNA genomes - done
  • Real-time monitoring of resistance evolution - done
  • SamSifter - A toolbox for metagenomic analysis - done
  • Fingerprinting of microbial genomes - done


  • Haplotype profile sharing in Arabidopsis thaliana - done
  • Visualization of very large numbers of metagenome samples - done



  • Functional analysis of trinucleotide repeats in plants - done
  • Sequencing and assembly strategies for a new plant genome - done



  • Annotation of bacterial genomes - done
  • Reference-guided protein assembly - done
  • Naive Bayesian classifier for metagenomics - done


  • Pathway evaluation in (meta)transcriptomics - done
  • Analysis of 16S data - done
  • Finding confidence intervals for multiple metagenome comparison networks - done
  • Assembly and annotation of the Guppy transcriptome - done
  • Correlating taxonomy and gene function with environmental parameters - done
  • Short-Read aligners in Metagenomics - done



  • Faster BLAST analysis of metagenomic data - done
  • New methods for the comparison of phylogenetic trees and networks - done
  • Hybridization networks - done
  • TE Discovery by Next Gen Sequencing - done

  • Simulation of 3rd generation sequencing technologies - done

  • Analysis of human gut data - done
  • Analysis of 16S rRNA - done
  • Database-supported analysis of metagenomics data - done
  • Functional and pathway analysis of metagenomic data - done

  • Efficient data mining techniques for two-locus association mapping - done
  • Finding Patterns in Intervals - done