Algorithms in Bioinformatics

MuLan-Methyl

Multiple transformer-based language models for accurate DNA methylation prediction

A MuLan-Methyl web server is provided here.

MuLan-Methyl is a new deep learning framework for predicting DNA methylation sites, based on five popular transformer-based language models. The framework identifies methylation sites for three types of DNA methylation: N6-adenine (6mA), N4-cytosine (4mC), and 5-hydroxymethylcytosine (5hmC). Each of the five language models is adapted to the task using the "pre-train and fine-tune" paradigm. Pre-training is performed with self-supervised learning on a custom corpus consisting of DNA fragments and taxonomy lineages. Fine-tuning then trains each model to predict the DNA methylation status for each methylation type. The five fine-tuned models are used collectively to predict the DNA methylation status. MuLan-Methyl performs very well on a benchmark dataset. Moreover, the model captures characteristic differences between species that are relevant for methylation. This work demonstrates that language models can be successfully adapted to this domain of application and that the joint use of different language models improves model performance.
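The joint prediction step can be illustrated with a minimal sketch. It assumes that each fine-tuned model outputs a probability that a site is methylated and that the five probabilities are combined by simple averaging. The text above only says the models predict "collectively", so the averaging rule, the 0.5 threshold, the function names, and the model labels are all assumptions for illustration, not the published implementation.

```python
from statistics import mean


def ensemble_probability(model_probs):
    """Average the per-model probabilities that a site is methylated.

    model_probs maps a (hypothetical) model label to that model's
    predicted probability for one DNA fragment.
    """
    if not model_probs:
        raise ValueError("need at least one model probability")
    for name, p in model_probs.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability from {name} is out of range: {p}")
    return mean(model_probs.values())


def predict_methylated(model_probs, threshold=0.5):
    """Call a site methylated if the averaged probability reaches the threshold.

    The threshold of 0.5 is an assumed default, not taken from the paper.
    """
    return ensemble_probability(model_probs) >= threshold


# Made-up probabilities from five hypothetical fine-tuned models for one site.
example = {
    "model_1": 0.9,
    "model_2": 0.8,
    "model_3": 0.7,
    "model_4": 0.6,
    "model_5": 0.5,
}
joint = ensemble_probability(example)
is_methylated = predict_methylated(example)
```

Averaging probabilities is only one way to combine models. Majority voting or a learned weighting would fit the same description of joint prediction.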

Wenhuan Zeng, Anupam Gautam, Daniel H. Huson, MuLan-Methyl - Multiple Transformer-based Language Models for Accurate DNA Methylation Prediction, GigaScience, Volume 12, 2023, giad054

