Pattern Identification and Clustering of Single Cell RNA-sequencing Data

Single-cell RNA sequencing (scRNA-seq) enables the acquisition of gene expression profiles at massive scale. The measured single-cell transcription profiles can be used to identify cell types, cell subtypes, and continuous gene expression gradients, e.g., during development or disease progression. However, a key challenge is the high level of sampling noise and missing data, which can obscure transcriptional measures of cell type similarity. Moreover, these deficiencies in the data make it difficult to identify co-regulated groups of genes. Furthermore, it is currently not possible to systematically determine the origin of cell types in complex organisms. This project focuses on developing and adopting new analytical approaches that use single-cell data efficiently to solve the following problems:

  • Finding informative genes that allow clustering of cells and identification of cell types
  • Analysis of co-regulated gene modules
  • Integration of other data types including lineage barcodes that allow tracing cell origins

To address these challenges, we propose to adapt methods from computational network science and data mining to identify cell types and to reconstruct gene regulatory networks and cell lineage trees. These methods include statistical network inference, network completion, and network extrapolation, as well as tools such as dimensionality reduction, clustering, and anomaly detection.
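
As a rough illustration of two of the tools named above (selecting informative genes and clustering cells), the following minimal sketch runs on a made-up toy count matrix. It is not the project's method: the function names, the toy data, the scale factor, and the choice of variance ranking and k-means are all assumptions made purely for illustration.

```python
import math
import random

# Hypothetical toy data: 6 cells (rows) x 5 genes (columns) of raw counts.
# Cells 0-2 express gene 0, cells 3-5 express gene 1, genes 2-3 are roughly
# constant, gene 4 is never detected, and cell 5 has a dropout in gene 2.
counts = [
    [30, 0, 10, 5, 0],
    [25, 0, 12, 4, 0],
    [28, 0, 9, 6, 0],
    [0, 27, 11, 5, 0],
    [0, 32, 10, 4, 0],
    [0, 29, 0, 5, 0],
]

def log_normalize(matrix, scale=1e4):
    """Scale each cell to a common total count, then apply log(1 + x)."""
    result = []
    for cell in matrix:
        total = sum(cell) or 1  # guard against empty cells
        result.append([math.log1p(scale * c / total) for c in cell])
    return result

def top_variable_genes(matrix, n_genes):
    """Return the column indices of the n_genes highest-variance genes."""
    n_cells = len(matrix)
    variances = []
    for g in range(len(matrix[0])):
        column = [row[g] for row in matrix]
        mean = sum(column) / n_cells
        variances.append(sum((x - mean) ** 2 for x in column) / n_cells)
    ranked = sorted(range(len(variances)), key=lambda g: variances[g], reverse=True)
    return sorted(ranked[:n_genes])

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

normalized = log_normalize(counts)
informative = top_variable_genes(normalized, n_genes=2)
reduced = [[row[g] for g in informative] for row in normalized]
labels = kmeans(reduced, k=2)

print("informative genes:", informative)
print("cluster labels:", labels)
```

Note that the single dropout in gene 2 already inflates that gene's variance, which hints at why sampling noise and missing data make the choice of informative genes a nontrivial problem on real data.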

This project is part of the HEIBRIDS Graduate School of Data Science. It is carried out in cooperation with Prof. Uwe Ohler at the Max Delbrück Center for Molecular Medicine in the Helmholtz Association. The project runs from 2018 to 2022. The responsible researcher for this project is Ms. Anna Vlot.