Dr. Hajk-Georg Drost
Group Leader – Computational Biology
I am humbled by the vast variation of natural forms available on our planet. Today, I am among the lucky few that can devote time and resources to explore the molecular mechanisms generating this wondrous diversity as a hedge against natural selection.
My academic path was enabled by studying bioinformatics with a strong focus on statistical learning, machine learning, and predictive modeling with applications in comparative genomics and evolutionary transcriptomics. I obtained my BSc and MSc in Bioinformatics, and a PhD in Computer Science at the Institute of Computer Science – Martin-Luther University Halle, Germany, where I studied natural variation through the lens of the developmental hourglass model with Ivo Grosse and Marcel Quint. New questions concerning the dynamics and epigenetic control of genomes led me to pursue a postdoc in the lab of Jerzy Paszkowski at the Sainsbury Laboratory and Genetics Department of the University of Cambridge, where I studied the epigenetic control of transposable elements and how these elements can generate natural variation in genomic landscapes. As a next step, I wanted to integrate my insights from evo-devo and (epi)-genomics research and joined the second lab of Elliot Meyerowitz at the Sainsbury Laboratory in Cambridge. With Elliot, I sought to understand how regulatory changes during organ evolution can generate natural variants in organ morphology across diverse plant and animal species. Here, we were especially interested in the roles of lncRNAs and circRNAs in regulating these developmental processes.
After studying the mechanisms generating natural variation on different levels of organismal complexity, here in Detlef Weigel’s Department of Molecular Biology, I now seek to understand how gene regulatory instructions are retained over evolutionary time to either conserve biological functions or diversify functions to enable the emergence of adaptive traits. To this end, I bring together a range of experts from computer science, bioinformatics, and molecular biology backgrounds to develop the tools and methodologies to approach this question.
PhD Student – Causal Inference and Deep Learning
PhD Student – Algorithmic Bioinformatics
I studied computer science and mathematics at the University of Tübingen, Germany. Since 2013, I have been developing the protein sequence aligner Diamond, a project spurred by the challenges of handling the growing amounts of NGS data. In metagenomics studies, DNA is sequenced at terabase-scale from the entire microbial biosphere and needs to be computationally analyzed for remote evolutionary relationships. While traditional methods like NCBI BLAST required a supercomputer to be viable for such data, Diamond was the first tool to achieve a 20,000x speedup over BLAST for short read alignment in protein space and make this analysis accessible to any scientist. The tool has evolved over the years and is now widely used in metagenomics and phylogenomics applications.
In 2013, I won the U.S. Defense Threat Reduction Agency’s $1 Million Algorithm Challenge. This open competition involving over 3,000 researchers around the world awarded the best algorithm solution for rapidly and accurately characterising a complex clinical sample based on raw DNA sequence, an effort that contributed to the capabilities of the U.S. armed forces to diagnose and treat biothreats.
I specialize in high-performance and low-level programming in C++. My research is focused on enhancing Diamond, scaling its computational power and broadening its range of applications to address problems in phylogenomics.
Master Student – Algorithmic Bioinformatics
I am a master's student in Bioinformatics at the University of Tübingen with a background in biology. At the MPI (Department of Molecular Biology, DrostLab), I am currently broadening the range of Diamond’s applications by developing a fast and sensitive DNA alignment option.
Time period: March 2022 – March 2023
Master Student – Statistical Learning, Deep Learning and Causal Inference
I am currently studying Artificial Intelligence at Charles University in Prague. Before diving into the world of AI, I studied Bioinformatics at the same university. My main interests lie in complex networks and their applications in biology. In my Bachelor’s thesis, I explored the usage of community detection algorithms for predicting protein function. At the Drost lab, I am working on causal inference and on simulating gene regulatory systems.
Master Student – Machine Learning and Predictive Analytics in Bioinformatics
I’m currently doing my master's degree in Computer Science at the University of Tübingen.
In my bachelor's and master's studies, I focused on Machine Learning and Data Science. I have now started writing my master's thesis, “Machine Learning based Homology detection”, at the MPI (Department of Molecular Biology, DrostLab), with the aim of finding methods that combine existing sequence alignment approaches with the latest machine learning techniques to improve homology detection.
Time period: Nov 2022 – May 2023
Erasmus Plus Research Intern – Molecular Evolution of Transposable Elements
- Role: Erasmus Plus Student (6 Months) and Research Intern (3 Months)
- Time period: Nov 2020 – August 2021
- Project: Simulating the evolution of retrotransposon recombination and de novo transposon annotation
- Current Role: PhD Student with Susana Coelho at the Max Planck Institute for Developmental Biology
- Role: Bachelor Student (Bachelor Thesis Project)
- Time period: Nov 2020 – April 2021
- Project: Protein Sequence Clustering with DIAMOND
- Current Role: Erasmus Student in Spain
- Role: Erasmus Student (6 months), Erasmus Plus Student (6 months), and Research Intern (2 months)
- Time period: March 2020 – May 2021
- Project: Gene Regulatory Network Evolution
- Current Role: PhD Student (Microbiome and Applied Bioinformatics) with Prof. Florian Fricke at the University of Hohenheim, Germany.
- Publication from DrostLab:
- noisyR: Enhancing biological signal in sequencing datasets by characterising random technical noise. Nucleic Acids Research, in press (2021).