Speeding up Single-Cell
Tumor Mutation Tree Inference

Deep Dive into Computational Statistics in Cancer Genomics

Feb 2023 - Oct 2023

For my second Master's thesis, in Biotechnology at ETH Zürich, I wanted to learn more about computational and Bayesian statistics applied to genomics. This is the project I am currently working on:

Description

Cancer progression is an evolutionary process in which cells with different characteristics compete with each other. Understanding this cellular heterogeneity is important for prognosis and for choosing the right treatment. Single-cell DNA sequencing allows one to measure mutations in selected cells at a specific time point. The tumor phylogeny problem aims to reconstruct the whole evolutionary history of the tumor from this single snapshot, in the form of a mutation tree.
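
To make the idea of a mutation tree a bit more concrete, here is a minimal Python sketch. It is a toy illustration only, not code from SCITE or PYggdrasil: the tree, the mutation names A to D, the function names, and the error rates alpha (false positive) and beta (false negative) are all assumptions made up for this example. A cell attached to a node carries that node's mutation plus all mutations on the path back to the root, and a noisy observed profile can be scored against each possible attachment.

```python
import math

# Hypothetical toy mutation tree: each mutation maps to its parent mutation.
# "root" stands for the unmutated (healthy) ancestor.
PARENT = {"A": "root", "B": "A", "C": "A", "D": "B"}
MUTATIONS = sorted(PARENT)  # fixed column order: A, B, C, D


def genotype(attachment):
    """Return the 0/1 mutation profile of a cell attached at the given node.

    The cell carries the mutation at its attachment node and all ancestors.
    """
    present = set()
    node = attachment
    while node != "root":
        present.add(node)
        node = PARENT[node]
    return [1 if m in present else 0 for m in MUTATIONS]


def log_likelihood(observed, attachment, alpha=0.01, beta=0.2):
    """Log-probability of a noisy observed profile given an attachment.

    alpha is an assumed false-positive rate, beta an assumed false-negative
    rate; both values are illustrative, not taken from any dataset.
    """
    total = 0.0
    for obs, true in zip(observed, genotype(attachment)):
        if true == 1:
            p = 1 - beta if obs == 1 else beta
        else:
            p = alpha if obs == 1 else 1 - alpha
        total += math.log(p)
    return total


def best_attachment(observed):
    """Pick the node whose implied genotype best explains the observation."""
    nodes = ["root"] + MUTATIONS
    return max(nodes, key=lambda n: log_likelihood(observed, n))
```

For instance, a cell observed with mutations A and D but not B is still best explained by attaching it at D, with the missing B treated as a false negative. Inferring the tree itself means searching over many candidate trees with a score of this kind, which is where the computational cost comes from.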

Although principled methods exist, such as SCITE [1], which traverse the space of all possible mutation histories to suggest the most likely ones, they can be too slow and computationally intensive to run. Hence, there is growing interest in fast heuristic methods that approximate a single most likely tree.

[1] Tree inference for single-cell data, Genome Biology, Jahn et al. 2016

Why this?

Learn advanced computational statistics.

Working on single-cell genomic data sounds cool.

Throw in some high-performance computing.

Escape the burden of feeding your cells in the wet lab.

Code

The code for the project, to which I will be a contributor, will be hosted here:
GitHub: cbg-ethz/PYggdrasil