From: Automating curation using a natural language processing pipeline
File type  Normalization  Total interactions  Correct interactions  % of gold  Estimated recall
PDF        Exact          9,015               495                   24.3       66.5
HTML                      8,394               485                   23.8       65.8
           Fuzzy          15,383              550                   27.0       54.9