Table 1 Genotyping accuracy on a subset of the HapMap CEU population

From: cnvHiTSeq: integrative models for high-resolution copy number variation detection and genotyping using population sequencing data

| Location | Predicted location | Predicted length (bp) | cnvHap r² | SCIMM r² | cnvHiTSeq* r² | cnvHiTSeq* accuracy | cnvHiTSeq* missing rate | cnvHiTSeq† r² | cnvHiTSeq† accuracy | cnvHiTSeq† missing rate |
|---|---|---|---|---|---|---|---|---|---|---|
| chr1:35098051-35115368 | chr1:35100671-35112111 | 11,440 | 1.00 | 1.00 | 0.77 | 0.88 | 0.23 | 1.00 | 1.00 | 0.00 |
| chr1:152759872-152770356 | chr1:152760173-152770753 | 10,580 | 0.90 | 0.94 | 0.85 | 0.82 | 0.23 | 1.00 | 1.00 | 0.00 |
| chr3:151625213-151657165 | N/A | N/A | 0.00 | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| chr7:97395305-97402641 | chr7:97395365-97402646 | 7,281 | 1.00 | 1.00 | 0.66 | 0.81 | 0.05 | 1.00 | 1.00 | 0.00 |
| chr7:115930472-115941073 | chr7:115931453-115941632 | 10,179 | 1.00 | 1.00 | N/A | 0.95 | 0.00 | 1.00 | 1.00 | 0.00 |
| chr8:51030941-51038331 | chr8:51031082-51038282 | 7,200 | 0.92 | 0.93 | 0.63 | 0.94 | 0.11 | 1.00 | 1.00 | 0.00 |
| chr8:144700485-144714694 | chr8:144700505-144714606 | 14,101 | 1.00 | 0.97 | 0.54 | 0.67 | 0.00 | 1.00 | 1.00 | 0.00 |
| chr10:71280989-71291079 | chr10:71280949-71291070 | 10,121 | 0.90 | 0.82 | 0.58 | 0.86 | 0.05 | 0.89 | 0.95 | 0.00 |
| chr11:5783630-5809284 | chr11:5784450-5809211 | 24,761 | 1.00 | 1.00 | 0.61 | 0.83 | 0.00 | 0.94 | 0.94 | 0.00 |
| chr11:107238222-107244154 | chr11:107238422-107244103 | 5,681 | 0.94 | 0.97 | 0.83 | 0.90 | 0.00 | 1.00 | 1.00 | 0.00 |
| chr15:34694542-34817215 | chr15:34701483-34817043 | 115,560 | 0.80 | 1.00 | 0.69 | 1.00 | 0.05 | 1.00 | 1.00 | 0.00 |
| chr15:76884597-76907042 | chr15:76884597-76896918 | 12,321 | 0.56 | N/A | 0.96 | 0.95 | 0.00 | 1.00 | 1.00 | 0.00 |
| chr19:35851153-35861684 | chr19:35851134-35863213 | 12,079 | 1.00 | 1.00 | 0.81 | 0.90 | 0.05 | 1.00 | 1.00 | 0.00 |
| chr19:52132525-52148984 | chr19:52132606-52149186 | 16,580 | 0.96 | 0.96 | 0.88 | 0.86 | 0.00 | 1.00 | 1.00 | 0.00 |
| chr20:1558407-1585809 | chr20:1561187-1585928 | 24,741 | 0.90 | N/A | 0.95 | 0.94 | 0.00 | 1.00 | 1.00 | 0.00 |
| chr22:23154417-23243496 | chr22:23186037-23241798 | 55,761 | 0.22 | N/A | 0.61 | 0.73 | 0.00 | 0.77 | 0.80 | 0.09 |
| chr22:24323894-24418396 | chr22:24343395-24397295 | 53,900 | 0.00 | N/A | 0.91 | 0.86 | 0.00 | 1.00 | 1.00 | 0.00 |
| chr22:39366812-39386139 | chr22:39358773-39383652 | 24,879 | 1.00 | 0.96 | 0.46 | 0.85 | 0.00 | 1.00 | 1.00 | 0.00 |

Genotyping accuracy is measured by the concordance between copy number estimates on 22 HapMap CEU samples from the low-coverage pilot of the 1000 Genomes Project and reference copy number estimates obtained using PCR. Concordance is quantified using two metrics: the correlation coefficient r² between the reference and the predicted genotypes, and the fraction of calls with the correct genotype for both alleles. r² measurements for SCIMM (SNP-Conditional Mixture Modeling) were obtained from the supplementary material of [16], and those for cnvHap from the supplementary material of [9]. Two versions of cnvHiTSeq were used: cnvHiTSeq*, a single-sample version of the algorithm that does not take advantage of the population modeling capabilities, and cnvHiTSeq†, which trains the parameters of the model using the entire low-coverage HapMap CEU population from the 1000 Genomes Project (currently consisting of 94 samples). The genotyping accuracy was calculated using 22 of the 94 samples, since these were the only samples for which PCR copy number estimates were available. When all the samples are predicted to be copy neutral for a given location, the accuracy and r² are undefined and denoted by N/A. cnvHiTSeq calls with posterior probabilities lower than 80% were excluded and declared missing.
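
The three per-locus quantities described in the note (r², accuracy, and missing rate) can be illustrated with a minimal Python sketch. This is not the cnvHiTSeq implementation. The function names, the representation of a genotype as a pair of per-allele copy numbers, the use of total copy number when computing r², the restriction of r² and accuracy to non-missing calls, and the definition of copy neutral as a total of two copies are all assumptions made for illustration.

```python
def squared_pearson(x, y):
    """Squared Pearson correlation; None when either input has zero variance."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return None
    return sxy ** 2 / (sxx * syy)


def concordance(reference, predicted, posteriors, min_posterior=0.80):
    """Hypothetical helper: per-locus r2, accuracy, and missing rate.

    reference, predicted: lists of (allele1_cn, allele2_cn) tuples, one per sample
    posteriors: posterior probability of each predicted call
    """
    if not (len(reference) == len(predicted) == len(posteriors)):
        raise ValueError("inputs must have one entry per sample")

    # Calls below the posterior threshold are excluded and counted as missing.
    kept = [
        (ref, pred)
        for ref, pred, p in zip(reference, predicted, posteriors)
        if p >= min_posterior
    ]
    missing_rate = 1 - len(kept) / len(reference)
    result = {"r2": None, "accuracy": None, "missing_rate": missing_rate}
    if not kept:
        return result

    ref_totals = [sum(ref) for ref, _ in kept]
    pred_totals = [sum(pred) for _, pred in kept]

    # Assumption: copy neutral means a total copy number of 2.
    # If every retained call is copy neutral, report N/A (None).
    if all(total == 2 for total in pred_totals):
        return result

    # A call is correct only if both allele copy numbers match (order ignored).
    correct = sum(sorted(ref) == sorted(pred) for ref, pred in kept)
    result["accuracy"] = correct / len(kept)
    result["r2"] = squared_pearson(ref_totals, pred_totals)
    return result
```

Under these assumptions, calling `concordance` with the PCR-derived genotypes, the predicted genotypes, and the call posteriors for one locus returns a dictionary whose `None` entries correspond to the N/A cells in the table.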