
Table 2 Genomic landscape and structural variants in the Venter genome*

From: Towards a comprehensive structural variation map of an individual human genome

 

| Genomic feature (number of entries)^a | Gains^b: number (%) of genomic features | Gains^b: number (%) of structural variants | Gains^b: P-value | Losses^c: number (%) of genomic features | Losses^c: number (%) of structural variants | Losses^c: P-value |
|---|---|---|---|---|---|---|
| RefSeq gene loci^d (20,174) | 14,268 (70.72%) | 159,250 (38.17%) | 0.000 | 13,951 (69.15%) | 149,568 (38.26%) | 0.000 |
| RefSeq gene entire transcript loci^e (20,174) | 101 (0.50%) | 41 (0.01%) | 0.000 | 91 (0.45%) | 47 (0.01%) | 0.000 |
| RefSeq gene exons^f (20,174) | 3,126 (15.50%) | 3,890 (0.93%) | 0.999 | 3,025 (14.99%) | 3,723 (0.95%) | 0.999 |
| Enhancer elements (837) | 80 (9.56%) | 85 (0.02%) | 0.999 | 84 (10.04%) | 93 (0.02%) | 0.999 |
| Promoters (20,174) | 2,007 (9.95%) | 2,071 (0.50%) | 0.999 | 1,812 (8.98%) | 1,922 (0.49%) | 0.999 |
| Stop codons^g (30,885) | 225 (0.73%) | 99 (0.02%) | 0.000 | 272 (0.88%) | 134 (0.03%) | 0.563 |
| OMIM disease gene loci (3,737) | 1,658 (44.37%) | 20,589 (4.93%) | 0.000 | 1,664 (44.53%) | 19,396 (4.96%) | 0.000 |
| OMIM disease gene exons (3,737) | 367 (9.82%) | 458 (0.11%) | 0.999 | 383 (10.25%) | 492 (0.13%) | 0.999 |
| Autosomal dominant gene loci (316) | 247 (78.16%) | 2,773 (0.66%) | 0.023 | 245 (77.53%) | 2,593 (0.66%) | 0.031 |
| Autosomal dominant gene exons (316) | 60 (18.99%) | 70 (0.02%) | 0.999 | 64 (20.25%) | 78 (0.02%) | 0.999 |
| Autosomal recessive gene loci (472) | 386 (81.78%) | 3,931 (0.94%) | 0.065 | 402 (85.17%) | 3,749 (0.96%) | 0.009 |
| Autosomal recessive gene exons (472) | 58 (12.29%) | 78 (0.02%) | 0.999 | 86 (18.22%) | 109 (0.03%) | 0.999 |
| Cancer disease gene loci (363) | 301 (82.92%) | 4,202 (1.01%) | 0.651 | 307 (84.57%) | 3,899 (1.00%) | 0.821 |
| Cancer disease gene exons (363) | 66 (18.18%) | 85 (0.02%) | 0.999 | 71 (19.56%) | 98 (0.03%) | 0.999 |
| Dosage sensitive gene loci (145) | 120 (82.76%) | 2,995 (0.72%) | 0.604 | 125 (86.21%) | 2,794 (0.71%) | 0.728 |
| Dosage sensitive gene exons (145) | 39 (26.90%) | 51 (0.01%) | 0.999 | 41 (28.28%) | 58 (0.01%) | 0.999 |
| Genomic disorders (52) | 50 (96.15%) | 14,178 (3.40%) | 0.999 | 51 (98.08%) | 13,373 (3.42%) | 0.996 |
| Pharmacogenetic gene loci (186) | 97 (52.15%) | 853 (0.20%) | 0.517 | 96 (51.61%) | 838 (0.21%) | 0.105 |
| Pharmacogenetic gene exons (186) | 21 (11.29%) | 27 (0.01%) | 0.998 | 23 (12.37%) | 29 (0.01%) | 0.984 |
| Imprinted gene loci (59) | 39 (66.10%) | 405 (0.10%) | 0.989 | 37 (62.71%) | 378 (0.10%) | 0.982 |
| Imprinted gene exons (59) | 13 (22.03%) | 15 (0.00%) | 0.998 | 11 (18.64%) | 13 (0.00%) | 0.999 |
| MicroRNAs (685) | 8 (1.17%) | 9 (0.00%) | 0.785 | 11 (1.61%) | 9 (0.00%) | 0.836 |
| GWAS loci (419) | 415 (99.05%) | 9,413 (2.26%) | 0.000 | 416 (99.28%) | 8,852 (2.26%) | 0.000 |
| GWAS SNPs (419) | 1 (0.24%) | 1 (0.00%) | 0.786 | 2 (0.48%) | 2 (0.00%) | 0.810 |
| CpG islands (14,867) | 287 (1.93%) | 1,516 (0.36%) | 0.999 | 299 (2.01%) | 1,508 (0.39%) | 0.999 |
| DNAseI hypersensitivity sites (95,709) | 6,524 (6.82%) | 7,165 (1.72%) | 0.999 | 6,392 (6.68%) | 6,914 (1.77%) | 0.999 |
| Recombination hotspots (32,996) | 16,839 (51.03%) | 30,315 (7.27%) | 0.000 | 16,211 (49.13%) | 28,407 (7.27%) | 0.000 |
| Segmental duplications (51,809) | 17,172 (33.14%) | 13,864 (3.32%) | 0.999 | 16,518 (31.88%) | 13,177 (3.37%) | 0.999 |
| Ultra-conserved elements (481) | 2 (0.42%) | 2 (0.00%) | 0.999 | 2 (0.42%) | 2 (0.00%) | 0.999 |
| Affy 6.0 SNPs^h (907,691) | 1,556 (0.17%) | 389 (0.09%) | 0.999 | 3,022 (0.33%) | 934 (0.24%) | 0.999 |
| Illumina 1M SNPs^i (1,048,762) | 2,318 (0.22%) | 601 (0.14%) | 0.999 | 4,789 (0.46%) | 1,536 (0.39%) | 0.999 |

*This table shows how structural variation affects different functional annotations and sequence characteristics in the Venter genome. The leftmost column gives the name and total number of entries for each genomic feature. The rest of the table is divided between gains and losses. Within the gain category, the first column shows the number (and percentage of the total) of genomic features impacted, the second column shows the corresponding number (and percentage of the total) of gain variants, and the last column shows the significance of the overlap as determined by simulations. An identical format is used for the losses.

^a See Additional file 17 for a list of data sources.
^b Based on a non-redundant list of 417,206 gains and insertions detected in this study and in the Levy et al. [1] study of the Venter genome.
^c Based on a non-redundant list of 390,973 deletions detected in this study and in the Levy et al. [1] study of the Venter genome.
^d Genes where a structural variant resides anywhere within the transcript (exonic or intronic).
^e Genes from the RefSeq data set where the entire transcript locus is encompassed by the structural variant.
^f Genes from the RefSeq data set where exonic sequence is impacted by the structural variant. The non-redundant number of genes altered in some way by duplications and deletions is 4,867.
^g Structural variants that overlap or impact a stop codon from the RefSeq gene set.
^h Probes on the Affymetrix 6.0 commercial array.
^i Probes on the Illumina 1M array.

GWAS, genome-wide association studies; OMIM, Online Mendelian Inheritance in Man.
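The two percentage columns use different denominators, which is easy to miss when reading a row. The following minimal sketch recomputes the RefSeq gene loci row from the counts in the table, assuming (as the footnotes suggest) that feature percentages are taken relative to the number of entries for that feature and that variant percentages are taken relative to the non-redundant totals of 417,206 gains and 390,973 losses. The helper name `pct` and the constant names are illustrative and do not come from the paper.

```python
# Denominators for the structural-variant percentages, taken from the
# table footnotes (non-redundant gains/insertions and deletions).
TOTAL_GAINS = 417_206
TOTAL_LOSSES = 390_973

# Number of entries in the RefSeq gene set, from the leftmost column.
REFSEQ_ENTRIES = 20_174


def pct(count: int, total: int) -> str:
    """Return count as a percentage of total, formatted to two decimals."""
    return f"{100 * count / total:.2f}%"


# RefSeq gene loci row, gains side.
print(pct(14_268, REFSEQ_ENTRIES))   # features impacted by gains  -> 70.72%
print(pct(159_250, TOTAL_GAINS))     # gain variants involved      -> 38.17%

# RefSeq gene loci row, losses side.
print(pct(13_951, REFSEQ_ENTRIES))   # features impacted by losses -> 69.15%
print(pct(149_568, TOTAL_LOSSES))    # loss variants involved      -> 38.26%
```

The recomputed values match the reported row, which supports reading the variant percentages as fractions of all gains or all losses rather than fractions of the feature set.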