Table 2 Genomic landscape and structural variants in the Venter genome*

From: Towards a comprehensive structural variation map of an individual human genome

| Genomic feature (number of entries)^a | Total non-redundant gains^b: number (%) of genomic features | Total non-redundant gains^b: number (%) of structural variants | Total non-redundant gains^b: P-value | Total non-redundant losses^c: number (%) of genomic features | Total non-redundant losses^c: number (%) of structural variants | Total non-redundant losses^c: P-value |
| --- | --- | --- | --- | --- | --- | --- |
| RefSeq gene loci^d (20,174) | 14,268 (70.72%) | 159,250 (38.17%) | 0.000 | 13,951 (69.15%) | 149,568 (38.26%) | 0.000 |
| RefSeq gene entire transcript loci^e (20,174) | 101 (0.50%) | 41 (0.01%) | 0.000 | 91 (0.45%) | 47 (0.01%) | 0.000 |
| RefSeq gene exons^f (20,174) | 3,126 (15.50%) | 3,890 (0.93%) | 0.999 | 3,025 (14.99%) | 3,723 (0.95%) | 0.999 |
| Enhancer elements (837) | 80 (9.56%) | 85 (0.02%) | 0.999 | 84 (10.04%) | 93 (0.02%) | 0.999 |
| Promoters (20,174) | 2,007 (9.95%) | 2,071 (0.50%) | 0.999 | 1,812 (8.98%) | 1,922 (0.49%) | 0.999 |
| Stop codons^g (30,885) | 225 (0.73%) | 99 (0.02%) | 0.000 | 272 (0.88%) | 134 (0.03%) | 0.563 |
| OMIM disease gene loci (3,737) | 1,658 (44.37%) | 20,589 (4.93%) | 0.000 | 1,664 (44.53%) | 19,396 (4.96%) | 0.000 |
| OMIM disease gene exons (3,737) | 367 (9.82%) | 458 (0.11%) | 0.999 | 383 (10.25%) | 492 (0.13%) | 0.999 |
| Autosomal dominant gene loci (316) | 247 (78.16%) | 2,773 (0.66%) | 0.023 | 245 (77.53%) | 2,593 (0.66%) | 0.031 |
| Autosomal dominant gene exons (316) | 60 (18.99%) | 70 (0.02%) | 0.999 | 64 (20.25%) | 78 (0.02%) | 0.999 |
| Autosomal recessive gene loci (472) | 386 (81.78%) | 3,931 (0.94%) | 0.065 | 402 (85.17%) | 3,749 (0.96%) | 0.009 |
| Autosomal recessive gene exons (472) | 58 (12.29%) | 78 (0.02%) | 0.999 | 86 (18.22%) | 109 (0.03%) | 0.999 |
| Cancer disease gene loci (363) | 301 (82.92%) | 4,202 (1.01%) | 0.651 | 307 (84.57%) | 3,899 (1.00%) | 0.821 |
| Cancer disease gene exons (363) | 66 (18.18%) | 85 (0.02%) | 0.999 | 71 (19.56%) | 98 (0.03%) | 0.999 |
| Dosage sensitive gene loci (145) | 120 (82.76%) | 2,995 (0.72%) | 0.604 | 125 (86.21%) | 2,794 (0.71%) | 0.728 |
| Dosage sensitive gene exons (145) | 39 (26.90%) | 51 (0.01%) | 0.999 | 41 (28.28%) | 58 (0.01%) | 0.999 |
| Genomic disorders (52) | 50 (96.15%) | 14,178 (3.40%) | 0.999 | 51 (98.08%) | 13,373 (3.42%) | 0.996 |
| Pharmacogenetic gene loci (186) | 97 (52.15%) | 853 (0.20%) | 0.517 | 96 (51.61%) | 838 (0.21%) | 0.105 |
| Pharmacogenetic gene exons (186) | 21 (11.29%) | 27 (0.01%) | 0.998 | 23 (12.37%) | 29 (0.01%) | 0.984 |
| Imprinted gene loci (59) | 39 (66.10%) | 405 (0.10%) | 0.989 | 37 (62.71%) | 378 (0.10%) | 0.982 |
| Imprinted gene exons (59) | 13 (22.03%) | 15 (0.00%) | 0.998 | 11 (18.64%) | 13 (0.00%) | 0.999 |
| MicroRNAs (685) | 8 (1.17%) | 9 (0.00%) | 0.785 | 11 (1.61%) | 9 (0.00%) | 0.836 |
| GWAS loci (419) | 415 (99.05%) | 9,413 (2.26%) | 0.000 | 416 (99.28%) | 8,852 (2.26%) | 0.000 |
| GWAS SNPs (419) | 1 (0.24%) | 1 (0.00%) | 0.786 | 2 (0.48%) | 2 (0.00%) | 0.810 |
| CpG islands (14,867) | 287 (1.93%) | 1,516 (0.36%) | 0.999 | 299 (2.01%) | 1,508 (0.39%) | 0.999 |
| DNAseI hypersensitivity sites (95,709) | 6,524 (6.82%) | 7,165 (1.72%) | 0.999 | 6,392 (6.68%) | 6,914 (1.77%) | 0.999 |
| Recombination hotspots (32,996) | 16,839 (51.03%) | 30,315 (7.27%) | 0.000 | 16,211 (49.13%) | 28,407 (7.27%) | 0.000 |
| Segmental duplications (51,809) | 17,172 (33.14%) | 13,864 (3.32%) | 0.999 | 16,518 (31.88%) | 13,177 (3.37%) | 0.999 |
| Ultra-conserved elements (481) | 2 (0.42%) | 2 (0.00%) | 0.999 | 2 (0.42%) | 2 (0.00%) | 0.999 |
| Affy 6.0 SNPs^h (907,691) | 1,556 (0.17%) | 389 (0.09%) | 0.999 | 3,022 (0.33%) | 934 (0.24%) | 0.999 |
| Illumina 1 M SNPs^i (1,048,762) | 2,318 (0.22%) | 601 (0.14%) | 0.999 | 4,789 (0.46%) | 1,536 (0.39%) | 0.999 |

*This table shows how structural variation affects different functional annotations and sequence characteristics in the Venter genome. The leftmost column shows the name and total number of each genomic feature. The rest of the table is divided between gains and losses. Within the gain category, the first column shows the number (and percentage of total) of genomic features impacted, the second column shows the corresponding number (and percentage of total) of gain variants, and the last column shows the significance of the overlap as determined by simulations. An identical format is used for the losses.

^a See Additional file 17 for a list of data sources.
^b Based on a non-redundant list of 417,206 gains and insertions detected in this and the Levy et al. [1] study of the Venter genome.
^c Based on a non-redundant list of 390,973 deletions detected in this and the Levy et al. [1] study of the Venter genome.
^d Genes where a structural variant resides anywhere within the transcript (exonic and intronic).
^e Genes from the RefSeq data set where the entire transcript locus is encompassed by the structural variant.
^f Genes from the RefSeq data set where exonic sequence is impacted by the structural variant. The non-redundant number of genes altered in some way by duplications and deletions is 4,867.
^g Structural variants that overlap/impact a stop codon from the RefSeq gene set.
^h Probes on the Affymetrix 6.0 Commercial array.
^i Probes on the Illumina 1 M array.

GWAS, genome-wide association studies; OMIM, Online Mendelian Inheritance in Man.
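
As a reading aid, the two percentage columns can be sketched as simple ratios. The sketch below assumes that the feature percentage is the number of impacted features divided by the total number of entries for that feature, and that the variant percentage is the number of overlapping variants divided by the non-redundant totals quoted in the footnotes (417,206 gains and 390,973 losses). The function name `percent_of_total` is hypothetical and is not taken from the paper; the counts are copied from the RefSeq gene loci row.

```python
# Minimal sketch of how the "(%)" values in Table 2 can be recomputed.
# Assumption: each percentage is count / total * 100, rounded to two decimals.

TOTAL_GAINS = 417_206   # non-redundant gains and insertions (table footnote)
TOTAL_LOSSES = 390_973  # non-redundant deletions (table footnote)


def percent_of_total(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * count / total, 2)


# RefSeq gene loci row: 20,174 entries in total.
refseq_entries = 20_174

gain_feature_pct = percent_of_total(14_268, refseq_entries)   # features hit by gains
gain_variant_pct = percent_of_total(159_250, TOTAL_GAINS)     # gains overlapping loci
loss_feature_pct = percent_of_total(13_951, refseq_entries)   # features hit by losses
loss_variant_pct = percent_of_total(149_568, TOTAL_LOSSES)    # losses overlapping loci

print(gain_feature_pct, gain_variant_pct, loss_feature_pct, loss_variant_pct)
```

Under these assumptions the four values agree with the percentages printed in the RefSeq gene loci row (70.72, 38.17, 69.15 and 38.26). The P-values come from the authors' simulations and cannot be recomputed from the table alone.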