
Table 2 Indel density for annotation features (across all 44 ENCODE regions)

From: Functional constraint and small insertions and deletions in the ENCODE regions of the human genome

 

| Feature | Indels (n) | Indels (bp) | Rate (number per 100 kb) | 99% CI (number) | Rate (bp per 100 kb) | 99% CI (bp) | Feature length (kb) |
|---|---|---|---|---|---|---|---|
| Manual | 2,186 | 6,504 | 14.6 | 11.7 to 18.2 | 43.4 | 34.4 to 54.7 | 14,998 |
| Random | 2,300 | 6,506 | 15.3 | 13.6 to 17.3 | 43.4 | 37.5 to 50.2 | 15,000 |
| Overall | 4,486 | 13,010 | 15.0 | 13.4 to 16.7 | 43.4 | 38.3 to 49.1 | 29,998 |
| RNA transcription | | | | | | | |
| CDS | 5 | 5 | 0.7 | 0.1 to 8.6 | 0.7 | 0.1 to 8.6 | 675 |
| TSS | 2 | 2 | 3.3 | | 3.3 | | 61 |
| RACEfrags | 9 | 28 | 2.1 | 0.8 to 5.4 | 6.6 | 1.3 to 33.9 | 425 |
| TARs/transfrags | 37 | 78 | 5.8 | 3.5 to 9.6 | 12.3 | 6.8 to 22.3 | 634 |
| Pseudo-exons | 9 | 26 | 6.6 | 2.6 to 16.6 | 19.1 | 5.8 to 63.3 | 136 |
| 3' UTR | 48 | 103 | 11.0 | 7.2 to 16.7 | 23.6 | 13.5 to 41.3 | 436 |
| 5' UTR | 7 | 32 | 6.0 | 1.6 to 22.3 | 27.4 | 3.8 to 198.7 | 117 |
| TUF | 53 | 160 | 12.2 | 7.8 to 19.2 | 36.9 | 20.2 to 67.6 | 433 |
| Open chromatin | | | | | | | |
| FAIRE-sites | 106 | 327 | 7.7 | 5.6 to 10.6 | 23.8 | 15.5 to 36.7 | 1,372 |
| DHS (NHGRI) | 19 | 61 | 6.1 | 3.3 to 11.3 | 19.7 | 8.3 to 46.9 | 310 |
| DHS (Regulome) | 43 | 135 | 8.6 | 5.3 to 14.0 | 27.0 | 13.4 to 54.4 | 499 |
| DNA-protein interaction/transcript regulation | | | | | | | |
| HisPolTAF | 141 | 348 | 13.1 | 10.0 to 17.2 | 32.4 | 22.5 to 46.5 | 1,076 |
| Seq_specific (all motifs) | 131 | 420 | 11.2 | 8.3 to 15.0 | 35.8 | 23.1 to 55.3 | 1,174 |
| SeqSp (sequence specific factors) | 54 | 225 | 10.2 | 6.2 to 16.7 | 42.5 | 20.1 to 89.5 | 530 |
| Ancestral repeats | 532 | 1,592 | 7.9 | 6.7 to 9.2 | 26.5 | 21.7 to 32.5 | 5,998 |
| Evolutionary constraint | | | | | | | |
| MCS strict | 19 | 31 | 2.5 | 1.3 to 5.1 | 4.1 | 1.6 to 10.4 | 748 |
| MCS moderate | 78 | 170 | 5.1 | 3.5 to 7.6 | 11.2 | 6.8 to 18.5 | 1,515 |
| MCS loose | 356 | 960 | 9.8 | 8.2 to 11.7 | 26.4 | 20.9 to 33.4 | 3,637 |
| Cell cycle | | | | | | | |
| EarlyRepSeg | 1,124 | 2,989 | 16.4 | 13.8 to 19.4 | 43.5 | 33.3 to 56.9 | 6,868 |
| MidRepSeg | 1,190 | 3,352 | 15.4 | 13.5 to 17.5 | 43.2 | 35.3 to 53.0 | 7,751 |
| LateRepSeg | 1,110 | 3,345 | 13.9 | 12.1 to 15.9 | 41.9 | 32.9 to 53.3 | 7,991 |

bp, base pairs; CDS, coding sequence; CI, confidence interval; DHS, DNase hypersensitive sites; ENCODE, Encyclopedia of DNA Elements; FAIRE, formaldehyde assisted isolation of regulatory elements; kb, kilobases; MCS, multi-species conserved sequence; NHGRI, National Human Genome Research Institute; transfrag, transcribed fragment; RACEfrag, rapid amplification of cDNA ends fragment; TAR, transcriptionally active region; TSS, transcription start site; TUF, transcripts of unknown function; UTR, untranslated region.
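
As a reading aid, the two rate columns in Table 2 appear to be the indel count (or indel base pairs) divided by the feature length and scaled to 100 kb. The sketch below checks that reading against three rows of the table. The function name `rate_per_100kb` is hypothetical, and the formula is an inference from the tabulated numbers, not a method stated in this table. The 99% confidence intervals are not derived here.

```python
def rate_per_100kb(count: float, feature_length_kb: float) -> float:
    """Scale a raw count to a density per 100 kb of feature length.

    Assumed relationship (inferred from the table, not stated by it):
        rate = count / feature_length_kb * 100
    """
    return count / feature_length_kb * 100.0


# (indels n, indel bp, feature length in kb), copied from Table 2
rows = {
    "Manual": (2186, 6504, 14998),
    "Random": (2300, 6506, 15000),
    "CDS": (5, 5, 675),
}

for name, (n, bp, length_kb) in rows.items():
    n_rate = round(rate_per_100kb(n, length_kb), 1)
    bp_rate = round(rate_per_100kb(bp, length_kb), 1)
    print(f"{name}: {n_rate} indels per 100 kb, {bp_rate} bp per 100 kb")
```

For these three rows the computed values round to the tabulated point estimates (14.6 and 43.4, 15.3 and 43.4, 0.7 and 0.7). Other rows are not checked here and need not match exactly if the published estimates were computed differently.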