Figure 1 | Genome Biology

Figure 1

From: THetA: inferring intra-tumor heterogeneity from high-throughput DNA sequencing data


Algorithm overview. A mixture of three subpopulations with two distinct genomes: a normal genome (represented here with one copy of each interval for simplicity), and an aneuploid genome with a duplication of one interval (red). If reads are distributed uniformly over the aggregate DNA in the sample, then the observed distribution of reads over the blue, red and yellow intervals will follow a multinomial distribution with parameter $\widehat{C\mu}$. Here C is the interval count matrix giving the integral number of copies of each interval in each genome in the mixture, and μ is the genome mixing vector giving the proportion of each subpopulation in the mixture. We find the pair (C, μ) that maximizes the likelihood of the observed read depth vector r.
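
The likelihood described in the caption can be illustrated with a small sketch. This is not the THetA implementation. It assumes that the hat denotes rescaling the product Cμ so that its entries sum to 1, that C has one column per distinct genome (normal, aneuploid), and that a coarse grid search over μ is an acceptable stand-in for the paper's optimization. The function names and the read counts are hypothetical, chosen only for illustration.

```python
import numpy as np
from math import lgamma


def multinomial_log_likelihood(C, mu, r):
    """Log-likelihood of read counts r under a multinomial whose
    parameter is C @ mu rescaled to sum to 1 (assumed meaning of the hat)."""
    C = np.asarray(C, dtype=float)
    mu = np.asarray(mu, dtype=float)
    r = np.asarray(r, dtype=float)
    expected = C @ mu                  # expected copies of each interval
    p = expected / expected.sum()      # multinomial parameter
    log_coeff = lgamma(r.sum() + 1) - sum(lgamma(x + 1) for x in r)
    return log_coeff + float(np.sum(r * np.log(p)))


def best_mu_on_grid(C, r, steps=101):
    """Hypothetical helper: scan the aneuploid fraction on a uniform grid
    and keep the mixing vector with the highest likelihood."""
    best_mu, best_ll = None, -np.inf
    for fraction in np.linspace(0.0, 1.0, steps):
        mu = [1.0 - fraction, fraction]
        ll = multinomial_log_likelihood(C, mu, r)
        if ll > best_ll:
            best_mu, best_ll = mu, ll
    return best_mu, best_ll


# Toy interval count matrix: rows are the blue, red and yellow intervals,
# columns are the normal genome and the aneuploid genome (red duplicated).
C = [[1, 1],
     [1, 2],
     [1, 1]]

# Made-up read depth vector, not data from the paper.
r = [2000, 3000, 2000]

mu_hat, ll_hat = best_mu_on_grid(C, r)
print("estimated mixing vector:", mu_hat)
```

In the full problem C is also unknown, so the same likelihood would be evaluated over candidate count matrices as well as mixing vectors. The sketch fixes C to keep the example short.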
