Table 3 Performance of text-based sequence extraction for cis-regulatory annotation

From: Text-mining assisted regulatory annotation

| Measure | dm2 | hg18 | mm8 | ce2 | rn4 | All |
| --- | --- | --- | --- | --- | --- | --- |
| Number of ORegAnno annotations | 2,079 | 589 | 255 | 178 | 107 | 3,208 |
| Number of PMIDs with ORegAnno annotation | 389 | 283 | 113 | 30 | 48 | 850 |
| Number of PMIDs with Ensembl target gene name(s) | 388 | 253 | 107 | 29 | 42 | 819 |
| Number of text hits from PMIDs with ORegAnno annotation | 188 | 128 | 51 | 16 | 32 | 415 |
| Number of text hits that overlap ORegAnno annotation | 149 | 54 | 36 | 13 | 17 | 269 |
| Percent text hits that overlap ORegAnno annotation (PPV) | 79.3% | 42.2% | 70.6% | 81.3% | 53.1% | 64.8% |
| Number of ORegAnno annotations overlapped by a text hit | 681 | 133 | 149 | 22 | 64 | 1,049 |
| Percent ORegAnno annotations overlapped by a text hit (SN) | 32.8% | 22.6% | 58.4% | 12.4% | 59.8% | 32.7% |
| Number of PMIDs with text hits | 124 | 91 | 44 | 12 | 24 | 295 |
| Percent PMIDs with text hits (coverage) | 31.9% | 32.2% | 38.9% | 40.0% | 50.0% | 32.2% |
| Number of PMIDs with text hits to correct species | 123 | 84 | 37 | 12 | 18 | 274 |
| Percent PMIDs with text hits to correct species (PPV) | 99.2% | 92.3% | 84.1% | 100.0% | 75.0% | 92.9% |
| Number of PMIDs with text hits and Ensembl target gene name(s) | 122 | 77 | 33 | 11 | 16 | 259 |
| Number of PMIDs with text hits and perfect match to correct target gene name(s) | 67 | 57 | 24 | 4 | 10 | 162 |
| Number of PMIDs with text hits and partial match to correct target gene name(s) | 16 | 12 | 5 | 3 | 4 | 40 |
| Percent PMIDs with text hits and match to correct target gene name (PPV) | 68.0% | 89.6% | 87.9% | 63.6% | 87.5% | 78.0% |
| Number of PMIDs without ORegAnno annotation with text hits | 76 | 1,291 | 841 | 13 | 459 | 2,680 |
| Number of text hits from PMIDs without ORegAnno annotation | 126 | 2,602 | 2,131 | 14 | 1,002 | 5,875 |
| Number of text hits from PMIDs without ORegAnno annotation that overlap ORegAnno annotation | 59 | 202 | 58 | 1 | 18 | 338 |
| Number of ORegAnno annotations overlapped by text hits from PMIDs without ORegAnno annotation | 200 | 347 | 139 | 3 | 33 | 722 |
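
The first PPV and SN rows appear to follow directly from the count rows above them. The sketch below assumes, based on the row labels only, that PPV is the number of text hits overlapping an ORegAnno annotation divided by the number of text hits, and that SN is the number of ORegAnno annotations overlapped by a text hit divided by the number of ORegAnno annotations. The variable and function names are illustrative and do not come from the source.

```python
# Counts copied from Table 3; all names here are illustrative assumptions.
GENOMES = ["dm2", "hg18", "mm8", "ce2", "rn4", "All"]
annotations = [2079, 589, 255, 178, 107, 3208]          # ORegAnno annotations
text_hits = [188, 128, 51, 16, 32, 415]                 # hits from annotated PMIDs
hits_overlapping = [149, 54, 36, 13, 17, 269]           # hits overlapping an annotation
annotations_overlapped = [681, 133, 149, 22, 64, 1049]  # annotations overlapped by a hit


def percent_half_up(numerator: int, denominator: int) -> float:
    """Return 100 * numerator / denominator rounded half-up to one decimal.

    Integer arithmetic is used so that exact ties (e.g. 81.25) round up,
    unlike Python's built-in round(), which rounds ties to even.
    """
    tenths = (2000 * numerator + denominator) // (2 * denominator)
    return tenths / 10


# Assumed definition: PPV = overlapping text hits / all text hits.
ppv = {g: percent_half_up(h, t)
       for g, h, t in zip(GENOMES, hits_overlapping, text_hits)}

# Assumed definition: SN = overlapped annotations / all annotations.
sn = {g: percent_half_up(o, a)
      for g, o, a in zip(GENOMES, annotations_overlapped, annotations)}

for g in GENOMES:
    print(f"{g}: PPV={ppv[g]:.1f}%  SN={sn[g]:.1f}%")
```

Under these assumed definitions the computed values agree with the PPV and SN rows of the table for every genome build, which suggests the published percentages were rounded half-up to one decimal place.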