Gene mention normalization and interaction extraction with context models and sentence motifs

Table 1 Results for gene mention normalization

Short description of the submitted run	Precision (%)	Recall (%)	F measure (%)	True positives (n)	False positives (n)	False negatives (n)
Training set	82.1	81.6	81.8	522	114	118
Training set, no filtering, no disambiguation	20.2	92.7	33.1	593	2,348	47
Test set	78.9	83.3	81.0	654	175	131
Test set, no disambiguation	49.6	87.5	63.3	687	699	98
Test set, unextended lexicon	70.7	72.5	71.6	569	236	216
Test set, current performance	90.7	82.4	86.4	647	66	138

Performance of the gene mention normalization component on the BioCreative II gene normalization sets. Unless indicated otherwise, each run uses the extended gene name lexicon, all false-positive filters, and disambiguation. Results on the test set are the official results from the external evaluation; the last row shows current performance, which reflects improvements made after BioCreative II.
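The precision, recall, and F measure columns can be recomputed from the true positive, false positive, and false negative counts in Table 1. The following is a minimal sketch, assuming that the F measure is the balanced harmonic mean of precision and recall (F1), which is consistent with the tabulated values; the function name `prf` is hypothetical and not part of the original system.

```python
def prf(tp: int, fp: int, fn: int) -> tuple:
    """Return (precision, recall, F1) as percentages, rounded to one decimal.

    Assumes the balanced F measure (F1); this is an assumption, since the
    table only says "F measure".
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 written directly in terms of counts: 2TP / (2TP + FP + FN)
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return tuple(round(100 * value, 1) for value in (precision, recall, f1))


# Counts (TP, FP, FN) copied from two rows of Table 1.
rows = {
    "Test set": (654, 175, 131),
    "Test set, current performance": (647, 66, 138),
}

for name, counts in rows.items():
    p, r, f = prf(*counts)
    print(f"{name}: P={p} R={r} F={f}")
```

Applied to the "Test set" row, this gives a precision of 78.9, a recall of 83.3, and an F measure of 81.0, matching the table. Note that TP + FN is 785 for every test set row and 640 for every training set row, as expected when all runs are scored against the same gold standard.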

ISSN: 1474-760X