Table 1. Dark and camouflaged regions vary by genome build. We identified dark and camouflaged regions throughout the genome for three genome builds (GRCh37, GRCh38, and GRCh38+alt) across five sequencing technologies (or read lengths for Illumina). Specifically, we measured dark regions for Illumina with 100-nucleotide reads (il100), Illumina with 250-nucleotide reads (il250), 10x Genomics, PacBio, and Oxford Nanopore Technologies (ONT). Here, the counts for dark and camouflaged regions are combined. The number of dark regions and nucleotides, both within gene bodies (GB in the table) and outside gene bodies, varies dramatically by build and technology. Overall, each technology has its respective strengths. GRCh38 including alternate contigs has more than 3x more dark nucleotides than GRCh37, and more than 2x more dark regions. Results presented throughout the manuscript are based on GRCh38 (in gray).

From: Systematic analysis of dark and camouflaged genes reveals disease-relevant genes hiding in plain sight

| Dark regions | GRCh37 il100 | GRCh37 il250 | GRCh37 10x | GRCh37 PacBio | GRCh37 ONT | GRCh38 il100 | GRCh38 il250 | GRCh38 10x | GRCh38 PacBio | GRCh38 ONT | GRCh38+alt il100 | GRCh38+alt PacBio | GRCh38+alt ONT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Non-GB nucs. | 22.4M | 15.7M | 5.4M | 11.1M | 6.7M | 68.7M | 42.5M | 57.0M | 56.8M | 52.1M | 88.4M | 69.5M | 59.1M |
| Non-GB regs. | 38,931 | 16,247 | 17,481 | 10,615 | 13,441 | 84,174 | 54,418 | 20,650 | 20,276 | 23,613 | 91,263 | 35,136 | 25,682 |
| GB nucs. | 16.3M | 11.4M | 4.2M | 6.7M | 3.7M | 15.1M | 12.2M | 4.3M | 6.4M | 3.3M | 41.6M | 26.9M | 16.2M |
| GBs | 5857 | 4424 | 3828 | 2095 | 4454 | 6054 | 4227 | 3993 | 2170 | 4465 | 7396 | 3332 | 4465 |
| - Protein-coding | 3792 | 2814 | 2845 | 1251 | 3464 | 3804 | 2437 | 2875 | 1275 | 3406 | 4291 | 1741 | 3041 |
| - Pseudogenes | 1134 | 955 | 454 | 483 | 417 | 1232 | 1080 | 518 | 474 | 425 | 1701 | 876 | 668 |
| - lincRNAs | 732 | 492 | 398 | 254 | 476 | 753 | 513 | 459 | 284 | 546 | 920 | 417 | 529 |
| - Others | 199 | 163 | 131 | 107 | 97 | 265 | 197 | 141 | 137 | 88 | 484 | 298 | 227 |
| GB regions | 37,874 | 20,030 | 15,076 | 9729 | 9757 | 36,794 | 21,052 | 14,878 | 8999 | 8701 | 59,703 | 29,302 | 20,657 |
| - Intronic | 28,751 | 13,971 | 11,700 | 6632 | 8000 | 27,982 | 14,405 | 11,322 | 6126 | 7371 | 41,219 | 18,842 | 14,029 |
| - ncRNA exons | 4188 | 2799 | 1052 | 1734 | 959 | 4351 | 3396 | 1216 | 1738 | 878 | 6589 | 3573 | 2117 |
| - CDS | 2657 | 1836 | 1313 | 731 | 416 | 2855 | 2221 | 1452 | 766 | 222 | 7885 | 4754 | 2952 |
| - 5′UTR | 1106 | 613 | 617 | 258 | 132 | 908 | 518 | 580 | 191 | 90 | 2238 | 1221 | 861 |
| - 3′UTR | 1135 | 785 | 381 | 369 | 233 | 698 | 512 | 307 | 178 | 140 | 1769 | 910 | 695 |
| - Other UTR | 37 | 26 | 13 | 5 | 6 | 0 | 0 | 1 | 0 | 0 | 3 | 2 | 3 |
| Total nucs. | 38.7M | 27.1M | 9.6M | 17.8M | 10.4M | 83.8M | 54.7M | 61.3M | 63.2M | 55.4M | 130.0M | 96.4M | 75.3M |
| Total regs. | 76,805 | 36,277 | 32,557 | 20,344 | 23,198 | 120,968 | 75,470 | 35,528 | 29,275 | 32,314 | 150,966 | 64,438 | 46,339 |
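
The fold changes between builds mentioned in the caption can be checked directly from the "Total nucs." and "Total regs." rows. The following is a minimal sketch, not part of the original analysis: the `TOTALS` dictionary and the `fold_change` helper are illustrative names, and the values are copied from Table 1 for the three technologies measured on both GRCh37 and GRCh38+alt.

```python
# Total dark nucleotides (in millions) and total dark regions, copied from the
# "Total nucs." and "Total regs." rows of Table 1.
# TOTALS and fold_change are hypothetical names used only for this illustration.
TOTALS = {
    "il100": {"GRCh37": (38.7, 76_805), "GRCh38+alt": (130.0, 150_966)},
    "PacBio": {"GRCh37": (17.8, 20_344), "GRCh38+alt": (96.4, 64_438)},
    "ONT": {"GRCh37": (10.4, 23_198), "GRCh38+alt": (75.3, 46_339)},
}


def fold_change(tech, baseline="GRCh37", target="GRCh38+alt"):
    """Return (nucleotide ratio, region ratio) of target build over baseline."""
    base_nucs, base_regs = TOTALS[tech][baseline]
    target_nucs, target_regs = TOTALS[tech][target]
    return target_nucs / base_nucs, target_regs / base_regs


for tech in TOTALS:
    nuc_ratio, reg_ratio = fold_change(tech)
    print(f"{tech}: {nuc_ratio:.2f}x nucleotides, {reg_ratio:.2f}x regions")
```

With the Illumina 100-nucleotide totals, the nucleotide ratio comes out near 3.4x and the region ratio near 2.0x; PacBio and ONT give larger nucleotide ratios, so the exact fold change depends on which technology is compared.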