Figure 2 | Genome Biology

Figure 2

From: Evaluation of text-mining systems for biology: overview of the Second BioCreative community challenge

BioCreative II tasks. This figure illustrates the basic processing steps covered by the tasks and subtasks posed in BioCreative II. Note that not all of the data collections were aligned (the gene mention [GM], gene normalization [GN], and protein-protein interaction [PPI] tasks used different document collections). (A) Preprocessing of full-text articles, which were provided in several commonly available formats (HTML, PDF, and automatic plain-text conversions of these formats), was covered by the interaction pair subtask (IPS), interaction method subtask (IMS), and interaction sentences subtask (ISS). The detection and ranking of abstracts relevant to a given biological topic (in this case protein-protein interactions) was part of the interaction article subtask (IAS). (B) Labeling text with bio-entities of interest was part of the GM task, in which participants had to find gene and protein mentions automatically. (C) Linking abstracts and full-text articles directly to database entries, a process often called protein or gene normalization, was part of the GN and IPS tasks, respectively. (D) Extraction of a specific biological relation type (physical protein-protein interactions) was addressed in the IPS, together with the identification of the experimental interaction detection methods used to characterize these interactions. For human interpretation, retrieval of evidence passages that summarize a particular biological association is crucial; this aspect was addressed in the ISS. The participating systems were evaluated and compared on test data collections released by the BioCreative II organizers. To allow integration of different strategies, the BioCreative MetaServer (BCMS) was developed.
