Table 3 Performance of text-based sequence extraction for cis-regulatory annotation

From: Text-mining assisted regulatory annotation

 

|  | dm2 | hg18 | mm8 | ce2 | rn4 | All |
| --- | --- | --- | --- | --- | --- | --- |
| Number of ORegAnno annotations | 2,079 | 589 | 255 | 178 | 107 | 3,208 |
| Number of PMIDs with ORegAnno annotation | 389 | 283 | 113 | 30 | 48 | 850 |
| Number of PMIDs with Ensembl target gene name(s) | 388 | 253 | 107 | 29 | 42 | 819 |
| Number of text hits from PMIDs with ORegAnno annotation | 188 | 128 | 51 | 16 | 32 | 415 |
| Number of text hits that overlap ORegAnno annotation | 149 | 54 | 36 | 13 | 17 | 269 |
| Percent text hits that overlap ORegAnno annotation (PPV) | 79.3% | 42.2% | 70.6% | 81.3% | 53.1% | 64.8% |
| Number of ORegAnno annotations overlapped by a text hit | 681 | 133 | 149 | 22 | 64 | 1,049 |
| Percent ORegAnno annotations overlapped by a text hit (SN) | 32.8% | 22.6% | 58.4% | 12.4% | 59.8% | 32.7% |
| Number of PMIDs with text hits | 124 | 91 | 44 | 12 | 24 | 295 |
| Percent PMIDs with text hits (coverage) | 31.9% | 32.2% | 38.9% | 40.0% | 50.0% | 32.2% |
| Number of PMIDs with text hits to correct species | 123 | 84 | 37 | 12 | 18 | 274 |
| Percent PMIDs with text hits to correct species (PPV) | 99.2% | 92.3% | 84.1% | 100.0% | 75.0% | 92.9% |
| Number of PMIDs with text hits and Ensembl target gene name(s) | 122 | 77 | 33 | 11 | 16 | 259 |
| Number of PMIDs with text hits and perfect match to correct target gene name(s) | 67 | 57 | 24 | 4 | 10 | 162 |
| Number of PMIDs with text hits and partial match to correct target gene name(s) | 16 | 12 | 5 | 3 | 4 | 40 |
| Percent PMIDs with text hits and match to correct target gene name (PPV) | 68.0% | 89.6% | 87.9% | 63.6% | 87.5% | 78.0% |
| Number of PMIDs without ORegAnno annotation with text hits | 76 | 1,291 | 841 | 13 | 459 | 2,680 |
| Number of text hits from PMIDs without ORegAnno annotation | 126 | 2,602 | 2,131 | 14 | 1,002 | 5,875 |
| Number of text hits from PMIDs without ORegAnno annotation that overlap ORegAnno annotation | 59 | 202 | 58 | 1 | 18 | 338 |
| Number of ORegAnno annotations overlapped by text hits from PMIDs without ORegAnno annotation | 200 | 347 | 139 | 3 | 33 | 722 |
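
The PPV and SN rows for sequence overlap can be recomputed from the count rows of Table 3. The Python sketch below is illustrative and is not the authors' code. It assumes, from the row labels, that PPV is the number of text hits overlapping an ORegAnno annotation divided by the number of text hits, and that SN is the number of ORegAnno annotations overlapped by a text hit divided by the number of ORegAnno annotations. The names `percent`, `ppv`, and `sn` are hypothetical. Half-up rounding is used because it reproduces the printed ce2 value (13/16 = 81.25%, shown as 81.3%).

```python
from decimal import Decimal, ROUND_HALF_UP

ASSEMBLIES = ["dm2", "hg18", "mm8", "ce2", "rn4", "All"]

# Counts copied from Table 3 (rows for PMIDs with ORegAnno annotation).
TEXT_HITS = [188, 128, 51, 16, 32, 415]
HITS_OVERLAPPING_ANNOTATION = [149, 54, 36, 13, 17, 269]
ANNOTATIONS = [2079, 589, 255, 178, 107, 3208]
ANNOTATIONS_OVERLAPPED_BY_HIT = [681, 133, 149, 22, 64, 1049]


def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage, rounded half-up to 0.1."""
    value = Decimal(100 * numerator) / Decimal(denominator)
    return float(value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP))


# Assumed definition: PPV = overlapping text hits / all text hits.
ppv = dict(zip(ASSEMBLIES, map(percent, HITS_OVERLAPPING_ANNOTATION, TEXT_HITS)))

# Assumed definition: SN = annotations overlapped by a hit / all annotations.
sn = dict(zip(ASSEMBLIES, map(percent, ANNOTATIONS_OVERLAPPED_BY_HIT, ANNOTATIONS)))

for assembly in ASSEMBLIES:
    print(f"{assembly}: PPV = {ppv[assembly]}%, SN = {sn[assembly]}%")
```

Under these assumed definitions, the computed percentages agree with the PPV and SN rows printed in the table for all five assemblies and for the combined column.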