Fig. 3 | Genome Biology

Fig. 3

From: DiMSum: an error model and pipeline for analyzing deep mutational scanning data and diagnosing common experimental pathologies

DiMSum error model performance. Leave-one-out cross-validation was used to test error model performance. In turn, error models are trained on all but one replicate of a dataset, and z-scores of the differences in fitness scores between the training set \( {\overline{f}}^{\mathrm{train}} \) and the remaining test replicate \( {f}^{\mathrm{test}} \) are calculated (i.e., fitness score differences normalized by the estimated error in the training set \( {\sigma}^{\mathrm{train}} \) and test replicate \( {\sigma}^{\mathrm{test}} \); importantly, \( {\sigma}^{\mathrm{test}} \) is estimated from error model parameters fit only on the training set replicates). Because fitness scores from replicate experiments should differ only by random chance, if the error models estimate the error magnitude correctly, z-scores should be normally distributed, and the corresponding P values from a z test should be uniformly distributed. The tested error models are described in the “Results and discussion” and “Methods” sections. a, c Quantile-quantile plots of z-scores in the TDP-43 290-331 library (a) and FOS library (c) compared to the expected normal distribution. b, d Quantile-quantile plots of P values from a two-sided z test in the TDP-43 290-331 library (b) and FOS library (d) compared to the expected uniform distribution. e Estimated error magnitude relative to the differences observed between replicate fitness scores in twelve DMS datasets in leave-one-out cross-validation (see the “Methods” section). Relative error magnitude = 1 means the estimated magnitude of errors fits the data. Relative error magnitude < 1 means the estimated errors are too small. Boxplots indicate the median and 1st and 3rd quartiles (box), and whiskers extend to 1.5× the interquartile range.
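
The leave-one-out check described in the caption can be illustrated with a minimal Python sketch on simulated data. This is not the DiMSum implementation: the function names (`z_score`, `two_sided_p`, `simulate_z_scores`), the simulated fitness values, and the choice to combine the training and test errors in quadrature are all assumptions made for illustration. When the assumed error equals the true noise, the z-scores should have a standard deviation close to 1 and the P values should look uniform; when the assumed error is too small, the z-scores spread out.

```python
import math
import random
import statistics


def z_score(f_train_mean, sigma_train, f_test, sigma_test):
    # Assumption: the two error estimates are combined in quadrature.
    return (f_train_mean - f_test) / math.sqrt(sigma_train**2 + sigma_test**2)


def two_sided_p(z):
    # Two-sided P value of a z test under the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_z_scores(true_sigma, assumed_sigma, n_variants=5000, n_reps=3, seed=0):
    """Hold out the last replicate of each simulated variant and score it.

    true_sigma is the noise actually added to each replicate; assumed_sigma
    stands in for the per-replicate error an error model would report.
    """
    rng = random.Random(seed)
    zs = []
    for _ in range(n_variants):
        true_fitness = rng.uniform(-1.0, 0.0)  # hypothetical fitness range
        reps = [rng.gauss(true_fitness, true_sigma) for _ in range(n_reps)]
        train, test = reps[:-1], reps[-1]
        f_train_mean = statistics.fmean(train)
        # Error of the training mean shrinks with the number of replicates.
        sigma_train = assumed_sigma / math.sqrt(len(train))
        zs.append(z_score(f_train_mean, sigma_train, test, assumed_sigma))
    return zs


# Correctly estimated errors: z-scores ~ N(0, 1), P values ~ uniform.
z_ok = simulate_z_scores(true_sigma=0.1, assumed_sigma=0.1)
p_ok = [two_sided_p(z) for z in z_ok]
sd_ok = statistics.pstdev(z_ok)
mean_p_ok = statistics.fmean(p_ok)

# Errors underestimated by half: z-scores are roughly twice as wide.
z_small = simulate_z_scores(true_sigma=0.1, assumed_sigma=0.05)
sd_small = statistics.pstdev(z_small)

print(f"sd(z), correct errors:        {sd_ok:.2f}")
print(f"mean P, correct errors:       {mean_p_ok:.2f}")
print(f"sd(z), underestimated errors: {sd_small:.2f}")
```

In this toy setting, a z-score spread well above 1 corresponds to the "estimated errors are too small" case in panel e; the exact definition of relative error magnitude is given in the article's “Methods” section and is not reproduced here.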
