Comments on the model parameters in “SiFit: inferring tumor trees from single-cell sequencing data under finite-sites models”

*Correspondence: KChen3@mdanderson.org; nakhleh@rice.edu
²Department of Bioinformatics and Computational Biology, the University of Texas M.D. Anderson Cancer Center, Houston, Texas, USA
¹Department of Computer Science, Rice University, Houston, Texas, USA
Full list of author information is available at the end of the article

We introduce two rate parameters:
• λ_d: the rate of recurrent point mutation at a site.
• λ_l: the combined rate of deletion and loss of heterozygosity (LOH).
Using these five states and rate parameters, we obtain the instantaneous rate matrix Q shown in Table 1. We now abstract the genotypes as follows:
• Genotype 0 corresponds to states 0/- and 0/0.
• Genotype 1 corresponds to state 0/1.
• Genotype 2 corresponds to states 1/- and 1/1.
Under this abstraction and the assumptions detailed in the caption of Table 1, we obtain the matrix Q given in Eq. (5) in the main text.
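The abstraction from the five allelic states to the three genotypes can be sketched as a simple lookup. This is only an illustrative sketch, not code from SiFit: the names STATE_TO_GENOTYPE, genotype_of, and states_of are hypothetical, and the states are written as plain strings for convenience.

```python
# Illustrative sketch only; names are hypothetical and not part of SiFit.
# Maps each of the five allelic states to its abstract genotype (0, 1, or 2).
STATE_TO_GENOTYPE = {
    "0/-": 0,  # genotype 0: states 0/- and 0/0
    "0/0": 0,
    "0/1": 1,  # genotype 1: state 0/1
    "1/-": 2,  # genotype 2: states 1/- and 1/1
    "1/1": 2,
}


def genotype_of(state: str) -> int:
    """Return the abstract genotype for one of the five allelic states."""
    try:
        return STATE_TO_GENOTYPE[state]
    except KeyError:
        raise ValueError(f"unknown state: {state!r}") from None


def states_of(genotype: int) -> list[str]:
    """Return the allelic states that collapse to the given genotype."""
    return sorted(s for s, g in STATE_TO_GENOTYPE.items() if g == genotype)
```

For example, genotype_of("1/-") and genotype_of("1/1") both give genotype 2, which is why the five-state matrix collapses to a three-state one.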
The following clarifications also apply to the main text:
• Given the explanation above, the following statement should be removed: "LOH events can result in the genotype transitions 1 → 0 and 1 → 2 whereas deletions can result in the genotype transitions 1 → 0, 1 → 2 or 2 → 1. To compute the infinitesimal rates for these transitions, we introduce two parameters λ_d and λ_l that account for the effects of deletion and LOH respectively."
• The statement "It is important to note that out of the three different types of events that could hint at a deviation from the infinite-sites assumption, SiFit currently models events (deletions, LOH, etc.) that affect the same genomic site more than once and the FP and FN errors in SCS data." should be replaced by "It is important to note that out of the three different types of events that could hint at a deviation from the infinite-sites assumption, SiFit currently models events (recurrent point mutations, deletions, LOH, etc.) that affect the same genomic site more than once and the FP and FN errors in SCS data."
• The statement "These parameters being relative quantities (they denote the rates of deletion and LOH, respectively, relative to the rate of point mutations), we choose a beta distribution as their prior." should be replaced by "These parameters being relative quantities (they denote the rates of recurrent point mutation and deletion/LOH, respectively, relative to the rate of point mutations), we choose a beta distribution as their prior."

Table 1 Expanded Q matrix for ternary data. (i, j) denotes the transition from state i to state j. Transitions for which the entry is 'NA' are not allowed. In particular, we do not allow the transitions 0/- → 1/- or 1/- → 0/-, reflecting the simplifying assumption that a recurrent point mutation and a deletion/LOH occurring at the same site is a very rare event. Furthermore, we do not model copy number gain.

© The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.