Anatomical ontologies: names and places in biology
© BioMed Central Ltd 2005
Published: 15 March 2005
Ontology has long been the preserve of philosophers and logicians. Recently, ideas from this field have been picked up by computer scientists as a basis for encoding knowledge and with the hope of achieving interoperability and intelligent system behavior. In bioinformatics, ontologies might allow hitherto impossible query and data-mining activities. We review the use of anatomy ontologies to represent space in biological organisms, specifically mouse and human.
Ontologies and biology
Biological science is a knowledge-intensive discipline. To become expert in any field in biology requires an extensive apprenticeship and a long experience in the field. Use of bioinformatic resources often requires similar expertise, and having both together is rare within a research group let alone in an individual. Ontologies are emerging as the key mechanism for encoding structured knowledge, and when used in the context of resources such as bioinformatics databases they open the possibility for more automated use of biological data.
Although some of the conceptualization that is represented by an ontology will be independent of the domain of knowledge that is being considered - as exemplified by the Dublin Core Metadata Initiative, which provides "an open forum engaged in the development of interoperable online metadata standards that support a broad range of purposes and business models" - domain-specific ontologies are needed to support particular areas, such as bioinformatics. In this context, the best-known ontology is the Gene Ontology (GO), developed by the Gene Ontology Consortium, which describes molecular functions, biological processes and cellular components. Various other bio-ontologies, including some for anatomy, can be found on the Open Biological Ontologies (OBO) website. Under the umbrella of the group Standards and Ontologies for Functional Genomics (SOFG), a community effort is under way to integrate human and mouse anatomy ontologies. Our experience is in the development of an anatomical ontology for the mouse, as part of a project to develop a database of mouse anatomy and gene expression, and it is to this example that we return throughout this article.
The representation of these ontologies varies greatly, ranging from fairly simple lists to complex structures expressed in specific ontology languages, such as OWL. Tools have also been created to support the development and management of ontologies; examples include OilEd, OntoEdit and Protégé-2000 (for a brief survey, see ). There are also bioinformatics-specific tools, such as DAG-Edit, COBrA and AmiGO (all described on the GO website). An important goal for any ontology is standardization, at the syntactic as well as the semantic level. For computational systems to interact effectively, everyone concerned must agree on the representation and meaning of the concepts that form part of the computational interaction.
Anatomy: parts and types
The graphical form, illustrated by column (b) in Figure 2, may also have a number of representations, but most importantly may include alternative views of the underlying concepts. This brings to the fore a critical development of the notion of what constitutes an ontology. By definition an ontology should be consistent, but here we try to capture alternative views of the underlying terms, so we need to build in inconsistency. Consistency can of course be rescued by subdividing the concept into separate classes, such as 'hindbrain-expert-1' and 'hindbrain-expert-2' to denote the views of two researchers, but the idea is to capture the current state of knowledge, which will evolve as understanding changes. At this point the ontology is almost a database. The ontology forms part of the theoretical framework for the field, and what was experimental data at one stage will be part of the current model or theory at a later stage.
The graphical representation is an extension of the definition of a concept to a graphical form. This definition may, however, be in terms of a particular individual. For example, in the case of the Mouse Atlas the graphical representation is part or all of a mouse embryo. The representation may be from a single animal or may be synthesized and averaged from a group of individuals. Either way, there is selection of a representative model within which the ontological concepts can be interpreted. The graphical representation of the parts is usually referred to as an atlas. Of course, there could be many such atlases, as indicated by column (c) in Figure 2. An atlas, therefore, consists of at least three parts: an ontology of terms (sometimes implicit, as with the countries in a geographical atlas, which need not be provided as an actual list to serve as one); a representative individual example on which to define the spatial extent and coordinates (which may include time); and a mapping, or interpretation, between the two.
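The three-part view of an atlas can be sketched as a small data structure. This is a minimal illustration only, not the Mouse Atlas implementation: the class name `Atlas`, its fields, and the use of integer voxel indices for the representative specimen are all assumptions made for the example.

```python
from dataclasses import dataclass, field

Voxel = tuple[int, int, int]  # (x, y, z) index into the reference specimen


@dataclass
class Atlas:
    """Hypothetical atlas: terms, a reference specimen, and a mapping between them."""
    terms: set[str]                       # the ontology, reduced here to a set of names
    shape: Voxel                          # spatial extent of the representative specimen
    mapping: dict[str, set[Voxel]] = field(default_factory=dict)

    def paint(self, term: str, voxels: set[Voxel]) -> None:
        """Interpret a term as a spatial domain within the reference specimen."""
        if term not in self.terms:
            raise KeyError(f"unknown term: {term}")
        for v in voxels:
            if not all(0 <= c < n for c, n in zip(v, self.shape)):
                raise ValueError(f"voxel {v} lies outside the specimen")
        self.mapping.setdefault(term, set()).update(voxels)

    def terms_at(self, voxel: Voxel) -> set[str]:
        """Reverse lookup: which terms cover a given location?"""
        return {t for t, domain in self.mapping.items() if voxel in domain}


# Illustrative data only: a 4x4x4 'embryo' containing a two-voxel 'heart'.
atlas = Atlas(terms={"embryo", "heart"}, shape=(4, 4, 4))
atlas.paint("embryo", {(x, y, z) for x in range(4) for y in range(4) for z in range(4)})
atlas.paint("heart", {(1, 1, 1), (1, 2, 1)})
```

Keeping the mapping separate from both the terms and the specimen is what would allow several mappings (or several specimens) to coexist for the same set of terms.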
A simple example of an anatomy ontology is the one we have developed as part of the Edinburgh Mouse Atlas Project (EMAP) [8, 11–13]. This ontology is designed to capture the structural changes that occur during embryonic development and consists of a set of 26 hierarchies, one for each developmental stage, where a stage is characterized by the internal and external morphological features of an embryo recognizable during that period of development (as defined by Theiler). The ontology can be displayed as a set of hierarchical trees, with each term subdivided into its constituent parts. There is no requirement that each anatomical term is divided into non-overlapping structures, or that each component has only one parent, so the ontology can be represented as a directed acyclic graph (DAG). Each node represents the biological concept, such as heart, at that particular time. Many of the terms and structures are repeated at each stage and it is possible to collapse the set of terms onto a single large hierarchy that includes all of the terms from all stages. This large DAG is stage-independent (with a few exceptions) and is referred to as the 'abstract mouse'; terms within the DAG now represent the biological concepts for all stages. Within the EMAP database the abstract mouse and stage terms can be independently referenced via unique identifiers. In addition, EMAP can include a 'derived-from' link as a putative lineage relationship between tissues. These links connect the stage-specific components so that it becomes possible to query the derivation (and destination) of any given tissue.
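The structure described above (stage-specific part-of hierarchies, a collapsed stage-independent graph, and 'derived-from' links) can be sketched as follows. This is a minimal sketch under stated assumptions, not the EMAP schema: the class `StagedAnatomy`, its method names, the stage labels and the example terms are all hypothetical.

```python
from collections import defaultdict


class StagedAnatomy:
    """Hypothetical store for stage-specific part-of DAGs with lineage links."""

    def __init__(self):
        # (stage, term) -> set of parent nodes at the same stage (part-of)
        self.parents = defaultdict(set)
        # (stage, term) -> set of nodes it is putatively derived from
        self.derived_from = defaultdict(set)

    def add_part(self, stage, part, whole):
        """Record that `part` is part of `whole` at `stage`; multiple parents are allowed."""
        self.parents[(stage, part)].add((stage, whole))
        self.parents.setdefault((stage, whole), set())

    def add_lineage(self, node, source):
        """Record a putative 'derived-from' link between stage-specific nodes."""
        self.derived_from[node].add(source)

    def abstract_graph(self):
        """Collapse all stages onto one stage-independent part-of graph."""
        collapsed = defaultdict(set)
        for (_, part), wholes in self.parents.items():
            collapsed[part].update(term for _, term in wholes)
        return dict(collapsed)

    def ancestry(self, node):
        """All nodes that `node` is derived from, following links transitively."""
        seen, stack = set(), [node]
        while stack:
            for source in self.derived_from.get(stack.pop(), ()):
                if source not in seen:
                    seen.add(source)
                    stack.append(source)
        return seen


# Illustrative data only; the stage labels and terms are not taken from EMAP.
mouse = StagedAnatomy()
mouse.add_part("stage_1", "cardiogenic plate", "embryo")
mouse.add_part("stage_2", "heart", "embryo")
mouse.add_part("stage_2", "heart", "cardiovascular system")  # second parent: a DAG, not a tree
mouse.add_lineage(("stage_2", "heart"), ("stage_1", "cardiogenic plate"))
```

Querying the 'destination' of a tissue would be the same traversal over the reversed links.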
An anatomy ontology for the adult mouse that is compatible with the EMAP ontology has been developed for the Mouse Genome Informatics (MGI) databases at the Jackson Laboratory, USA. A similar ontology was designed for human developmental anatomy, building on the work carried out by EMAP. Ontologies for adult human anatomy have been created as part of two projects, the General Architecture for Languages, Encyclopedias and Nomenclatures in Medicine (GALEN) and the Digital Anatomist's Foundational Model (FMA) projects. GALEN provides an ontology aimed at clinical applications, contains more than 10,000 anatomical concepts and uses the description logic language GRAIL (GALEN Representation and Integration Language) for representation. Relationship types between concepts are defined, including, for example, 'part-of', 'branch-of', 'contains' and 'connects'. Unlike the EMAP developmental anatomy, GALEN subdivides 'part-of' into a number of different partonomic relationships. (A review of 10 years of experience developing GALEN has been published.) On the basis of work on the FMA, Rosse and Mejino provide a comprehensive discussion of the ontological issues involved with developing an anatomical nomenclature. The FMA uses a set of well defined principles and structures provided by Protégé-2000, a software tool for the creation of knowledge-based systems, developed by Stanford University. As in the case of GALEN, the FMA not only supports the basic relationships of 'part-of' and 'type-of', but also further subdivides these.
Although GALEN and FMA cover the same domain of knowledge, namely human adult anatomy, attempts to develop methods to align the two ontologies have enabled no more than 7% of FMA's and 17% of GALEN's concepts to be matched. This should not be too surprising, however, considering that the creation of such ontologies not only requires the identification and naming of the concepts involved, but also often includes the identification of a set of attributes and a general definition describing the properties of these concepts. In addition, the relationships between concepts and rules for the propagation of properties need to be determined. Where all these activities are carried out independently by two groups, one should indeed expect to find significant differences - reflecting the purpose and expertise of each group - in the ontologies.
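To make the notion of a matched fraction concrete, the sketch below aligns two sets of concept names by simple lexical normalisation and reports coverage from each side. It is an illustration only and is not the method used in the alignment study cited above; the functions `normalise` and `align` and the example terms are hypothetical.

```python
def normalise(name: str) -> str:
    """Lower-case a concept name and unify separators so spelling variants compare equal."""
    return " ".join(name.lower().replace("-", " ").replace("_", " ").split())


def align(a: set[str], b: set[str]):
    """Match names in `a` to names in `b` and return (matches, coverage of a, coverage of b)."""
    index_b = {normalise(name): name for name in b}
    matches = {name: index_b[normalise(name)] for name in a if normalise(name) in index_b}
    coverage_a = len(matches) / len(a)
    coverage_b = len(set(matches.values())) / len(b)
    return matches, coverage_a, coverage_b


# Illustrative term sets only; they are not drawn from GALEN or the FMA.
large = {"Heart", "Left ventricle", "Aortic valve", "Liver"}
small = {"heart", "left_ventricle"}
matches, cov_large, cov_small = align(large, small)
```

The same set of matches gives a different percentage for each ontology because the denominators differ, which is why a single alignment is reported as two figures. Purely lexical matching of this kind ignores attributes, definitions and relationships, the aspects the text identifies as sources of divergence.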
Whereas FMA and GALEN are text-based, Höhne et al., within their Voxel-Man system of graphical human representation, have pioneered the use of sophisticated three-dimensional graphics and rendering to provide visual and interactive access to an atlas of anatomy including links to microscopic and functional data. (A voxel is the three-dimensional volume equivalent to a two-dimensional pixel.) Schubert and Höhne discuss the specific challenges this has provided in terms of an anatomical partonomic hierarchy. As is the case for GALEN, they determine that certain properties can only be propagated along particular relationships and that this depends both on the nature of the data - they have microscopic, topographical, and functional information - and the type of part-of relationship. They use the six basic types of part-of relationships, developed by Gerstl and Pribbenow, extended to include a notion of topographical relationship, such as containment. Knowledge representation within the Voxel-Man system has similarities to the model presented in Figure 2. Its semantic network corresponds to a symbolic representation (Figure 2, column (a)) in our model view, and its image volume can be seen as an iconic representation (Figure 2, column (b)), whereas other attribute volumes are similar to the mappings discussed earlier. In our model, however, we recognize not only the possibility of multiple mappings but also the existence of multiple symbolic and iconic representations and the additional links across representations that follow from that.
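The idea that a property propagates only along particular relationships can be sketched with a small rule table. This is a minimal sketch, not the Voxel-Man or GALEN rule set: the relation names, the property kinds, the `PROPAGATES` table and the example assertions are all assumptions chosen for illustration.

```python
# (child, relation, parent) triples; relation names are illustrative only.
EDGES = [
    ("left ventricle", "component-of", "heart"),
    ("heart", "component-of", "thorax contents"),
    ("blood", "contained-in", "heart"),
]

# Hypothetical rule table: which relations each kind of property may follow.
PROPAGATES = {
    "topographic": {"component-of", "contained-in"},
    "functional": {"component-of"},
}


def inherited(term, kind, asserted):
    """Collect values of `kind` asserted on `term` or on any whole reachable
    through relations that this kind of property is allowed to follow.
    Assumes the relation graph is acyclic."""
    values = set(asserted.get((term, kind), ()))
    for child, relation, parent in EDGES:
        if child == term and relation in PROPAGATES[kind]:
            values |= inherited(parent, kind, asserted)
    return values


# Illustrative assertions only.
asserted = {
    ("heart", "functional"): {"pumps blood"},
    ("thorax contents", "topographic"): {"inside thorax"},
}
```

Under these example rules the left ventricle inherits the heart's function, while blood, which is merely contained in the heart, inherits its location but not its function.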
An ontology that encompasses both the spatial mapping aspects discussed here (in two dimensions) and the notion of alternative interpretations of the 'same' term is provided by the BrainInfo atlas. Here, the authors have collated anatomical terms from a number of published brain atlases for mammalian brains, principally primate but with reference to rat and mouse; they provide a tool for navigating either via ontological terms or via location on standard views of the brain.
So far we have discussed anatomies that are expressed in the form of an ontology. Of course other sets of anatomical terms exist. The most methodical and complete is the Terminologia Anatomica (formerly Nomina Anatomica) developed over many years by the Federative Committee on Anatomical Terminology (FCAT). This is an unstructured list that is not available in an open electronic form and is not widely used, so for bioinformatics purposes it is useful only as a set of reference terms. More structured and available is the Unified Medical Language System (UMLS), which provides a standardized set of terms, particularly with respect to medical and clinical terminology. As with other anatomies, however, it is not easy to use outside of the tools provided.
The ontologies discussed so far together undoubtedly provide an exhaustive set of terms that will, in principle, cover all bioinformatic requirements for a reference anatomy with a set of relationships to allow reasoning about anatomy and function. But, so far, the terms are not used anywhere except within the domains of application for which they were developed, unlike the Gene Ontology (GO) which has rapidly found widespread use. Why should this be the case? The answer seems to be partly accessibility and partly community. Useful ontologies must be easy to pick up and reuse and must include a sense that anybody with expertise can contribute. In addition, for many applications the complexity is a barrier. An example of an attempt to break down such barriers is the Standard Anatomy Entry List (SAEL) (see ) which is a small, unstructured list of anatomical terms, useful in particular for annotating genomic and proteomic data from gene-expression microarrays and serial analysis of gene expression (SAGE). Each of the terms in the SAEL will be mapped to the corresponding terms in the more detailed anatomy ontologies. Simplicity and accessibility are provided while retaining the links to more complex ontologies that can provide sophisticated reasoning capability.
Towards the next generation of anatomy ontologies
In this article we have discussed anatomy and how emerging ontologies are attempting to capture not only structural knowledge of anatomy but also some of the functional and spatial relationships between tissues. There are, however, some omissions in these attempts to formalize anatomical knowledge. The first is that they are only just beginning to become community enterprises that not only admit submissions from all parts of a scientific community but also allow alternative views of what purport to be the same biological concepts. How do we capture this knowledge? The task is large but no funds are available for bringing together the necessary expertise into a single project. A more plausible model is provided by the open-source software mechanism, which relies on contributions from committed experts in a distributed and altruistic fashion. In many cases the people collaborating will never meet. We need mechanisms to support such virtual organizations.
The second omission is that existing anatomy ontologies are basically about known concepts and are very limited for properties that are poorly expressed in words. A good example of such a property is geometry. The existing ontologies can to some extent encode something of the topological relationships - adjacency, overlap and enclosure - but are not useful for encoding distance, direction and spatial measures. For a proper understanding and modeling of development, as well as the simple capture of data such as phenotype, geometry is critical. To include geometry implies a representation of an 'individual' or standard specimen. This defines a real geometric space and the anatomical concepts can then be mapped into that space. In terms of a framework of understanding, the natural way to think of this is as an extension of the ontology to include geometry. Interestingly, informal feedback from a group of graduate students at the Human Genetics Unit in Edinburgh suggests that they found it perfectly natural to consider the geometric atlas with its associated anatomical domains linked to an anatomical nomenclature to be an ontology. Extending ontologies in a natural way to include more iconic forms of information is required.
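The difference between topological and geometric information can be illustrated once anatomical terms are mapped to voxel domains in a reference specimen. The sketch below is an assumption-laden example, not part of any atlas described here: the functions, the use of centroids as a summary of position, and the sample domains are all hypothetical, and distances are in voxel units.

```python
import math


def centroid(domain):
    """Mean position of a set of (x, y, z) voxels."""
    n = len(domain)
    return tuple(sum(v[i] for v in domain) / n for i in range(3))


def displacement(a, b):
    """Vector from the centroid of domain `a` to the centroid of domain `b`."""
    return tuple(q - p for p, q in zip(centroid(a), centroid(b)))


def distance(a, b):
    """Euclidean distance between domain centroids, in voxel units."""
    return math.hypot(*displacement(a, b))


def adjacent(a, b):
    """Topological test: does any voxel of `a` share a face with a voxel of `b`?"""
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    return any(
        (x + dx, y + dy, z + dz) in b
        for (x, y, z) in a
        for (dx, dy, dz) in steps
    )


# Illustrative domains only.
tissue_a = {(0, 0, 0), (1, 0, 0)}
tissue_b = {(4, 0, 0), (5, 0, 0)}
tissue_c = {(2, 0, 0)}
```

A symbolic ontology could record the result of `adjacent` as a relationship between terms, but `distance` and `displacement` only have meaning relative to a particular specimen, which is the point made above about needing a standard individual.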
A third omission, related to the other forms of information that are discussed above, is the issue of uncertainty. All scientific reasoning is ultimately based on an understanding of uncertainty. We need to manage and reason with uncertainty. It is clear that probability is the right language, but how do we merge this with the current logical approaches to ontologies? Finally, this discussion of anatomy has been founded on the underlying understanding of anatomy in the context of structure visualized by traditional dissection and histology. We now have a much more informative view of an organism's internal organization by looking at genetic activity. Now the 'structure' is also found in the high-dimensional gene-expression space, and the developmental trajectory is not only through the geometric space and time of the embryo but also through this 'gene space'. In spatiotemporal coordinates we know that the cellular trajectory is connected, since every cell has a parent. What do such paths or trajectories look like in gene-space? What can be considered 'close' in the 30,000-dimensional space of gene expression? These are questions to be answered as the structural view evolves to encompass the informational anatomy of gene expression and not just the morphological and functional anatomy derived from standard histology.
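Why 'close' is an open question in gene space can be seen by comparing two common candidate measures on toy expression profiles. The sketch below does not answer the question posed above; the choice of Euclidean and correlation distance, the function names and the four-gene profiles are assumptions made for illustration.

```python
import math


def euclidean(u, v):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def correlation_distance(u, v):
    """1 minus the Pearson correlation; undefined for constant profiles (zero variance)."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    du = [a - mean_u for a in u]
    dv = [b - mean_v for b in v]
    numerator = sum(a * b for a, b in zip(du, dv))
    denominator = math.sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    return 1 - numerator / denominator


# Illustrative four-gene profiles; real profiles would have tens of thousands of dimensions.
tissue_u = [1.0, 2.0, 3.0, 4.0]
tissue_v = [10.0, 20.0, 30.0, 40.0]   # same pattern as u, ten times the level
tissue_w = [4.0, 3.0, 2.0, 1.0]       # similar levels to u, opposite pattern
```

By Euclidean distance `tissue_u` is nearer to `tissue_w`, whereas by correlation distance it is nearer to `tissue_v`, so the neighbours of a tissue in gene space depend on which notion of closeness is adopted.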
We are in need of a new generation of ontologies that go beyond the current preoccupation with predicate logic and expand into other representations of knowledge. This has echoes in many areas of understanding in science and touches on the basic meaning of scientific inference and scientific 'truth', an open philosophical debate that now has practical importance in the issue of encoding our current beliefs, even in such a way as to allow limited reasoning capability within a highly constrained system. The attempt to make computers more useful in a practical sense is forcing to the foreground the basic meaning of biological knowledge and how it can be used computationally.
References
- Berners-Lee T, Hendler J, Lassila O: The semantic web. Sci Am Digital. 2001, 284: 34-43.
- de Roure D, Jennings N, Shadbolt N: The semantic grid: a future e-science infrastructure. Grid Computing - Making the Global Infrastructure a Reality. Edited by: Berman F, Fox G, Hey A. 2003, Hoboken NJ: John Wiley, 437-470.
- Gruber T: A translation approach to portable ontology specifications. Knowledge Acquisition. 1993, 5: 199-220. 10.1006/knac.1993.1008.
- Dublin Core Metadata Initiative. [http://www.dublincore.org]
- Gene Ontology. [http://www.geneontology.org]
- Open Biological Ontologies. [http://obo.sourceforge.net]
- SOFG - Standards and Ontologies for Functional Genomics. [http://www.sofg.org]
- Edinburgh Mouse Atlas Project. [http://genex.hgu.mrc.ac.uk/]
- World Wide Web Consortium (W3C). [http://www.w3.org/]
- Fensel D: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. 2001, Berlin: Springer.
- Davidson D, Baldock R: Bioinformatics beyond sequence: mapping gene function in the embryo. Nat Rev Genet. 2001, 2: 409-418. 10.1038/35076500.
- Baldock R, Bard J, Kaufman M, Davidson D: A real mouse for your computer. BioEssays. 1992, 14: 501-502. 10.1002/bies.950140713.
- Burger A, Davidson D, Baldock R: Formalization of mouse embryo anatomy. Bioinformatics. 2004, 20: 259-267. 10.1093/bioinformatics/btg400.
- Theiler K: The House Mouse. 1989, New York: Springer.
- Mouse Genome Informatics (MGI). [http://www.informatics.jax.org/]
- Hunter A, Kaufman MH, McKay A, Baldock R, Simmen MW, Bard JBL: An ontology of human developmental anatomy. J Anat. 2003, 203: 347-355. 10.1046/j.1469-7580.2003.00224.x.
- OpenGalen. [http://www.opengalen.org]
- Foundational Model of Anatomy. [http://sig.biostr.washington.edu/projects/fm]
- Rogers J, Roberts A, Solomon D, van der Haring E, Wroe C, Zanstra P, Rector A: GALEN ten years on: tasks and supporting tools. MEDINFO. 2001, 10: 256-260.
- Rosse C, Mejino J: A reference ontology for biomedical informatics: the foundational model of anatomy. J Biomed Inform. 2003, 36: 478-500. 10.1016/j.jbi.2003.11.007.
- Protégé. [http://protege.stanford.edu]
- Zhang S, Mork P, Bodenreider O: Lessons learned from aligning two representations of anatomy. Proceedings of First International Workshop on Formal Biomedical Knowledge Representation. Edited by: Hahn U. 2004, Aachen: Technical University of Aachen, 102-108.
- Höhne KH, Pflesser B, Pommert A, Riemer M, Schiemann T, Schubert R, Tiede U: A new representation of knowledge concerning human anatomy and function. Nat Med. 1995, 1: 506-511. 10.1038/nm0695-506.
- Schubert R, Höhne KH: Partonomies for interactive explorable 3D-models of anatomy. A Paradigm Shift in Health Care Information Systems: Clinical Infrastructures for the 21st Century. Proceedings 1998, AMIA Annual Fall Symposium. Edited by: Chute CG. 1998, Orlando FL: American Medical Informatics Association, 433-437.
- Gerstl P, Pribbenow S: Midwinters, end games, and body parts: a classification of part-whole relations. Int J Hum-Comput Stud. 1995, 43: 865-889. 10.1006/ijhc.1995.1079.
- BrainInfo. [http://braininfo.rprc.washington.edu/]
- Federative Committee on Anatomical Terminology: Terminologia Anatomica. 1998, Stuttgart: Thieme.
- Jaynes ET: Probability Theory: The Logic of Science. 2003, Cambridge: Cambridge University Press.