From: Automating curation using a natural language processing pipeline
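| File type | Normalization | Total interactions | Correct interactions | % of gold | Estimated recall |
|-----------|---------------|--------------------|----------------------|-----------|------------------|
| PDF       | Exact         | 953                | 349                  | 17.1      | 70.5             |
| HTML      |               | 1,007              | 345                  | 16.9      | 71.1             |
|           | Fuzzy         | 1,186              | 371                  | 18.2      | 67.5             |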