Proteopedia - a scientific 'wiki' bridging the rift between three-dimensional structure and function of biomacromolecules
© Hodis et al.; licensee BioMed Central Ltd. 2008
Received: 14 April 2008
Accepted: 3 August 2008
Published: 03 August 2008
Many scientists lack the background to fully utilize the wealth of solved three-dimensional biomacromolecule structures. Thus, a resource is needed to present structure/function information in a user-friendly manner to a broad scientific audience. Proteopedia (http://www.proteopedia.org) is an interactive, wiki-based web resource whose pages have embedded three-dimensional structures surrounded by descriptive text. The text contains hyperlinks that change the appearance (view, representations, colors, labels) of the adjacent three-dimensional structure to reflect the concept being explained.
Structural biology has played a central role in fueling the massive advances made by the life sciences in the last few decades. More than a dozen Nobel prizes have been awarded for achievements in structural biology since solution of the structure of the DNA double helix in the early 1950s was followed by solution of the first protein structures at the end of the same decade. Beautiful images of three-dimensional structures regularly adorn the covers of Science, Nature and Cell. Indeed, a wealth of protein structures has been solved in recent years, and entries in the Protein Data Bank (PDB) [1, 2] now number over 50,000. But structural information is surprisingly still not in the mainstream of biology for the simple reason that three-dimensional structures are often hard to understand, even for a structural biologist. The widely held impression is that these structures are understood in detail and put to use in research; in fact, the structures are hardly discussed at all, especially by biologists lacking a structural background. While computer graphics software greatly aids in the understanding of these structures by displaying them in three dimensions, the pages of printed scientific journals flatten the structures to a two-dimensional image, with much of the three-dimensional information thus being lost. It should be noted, however, that a number of journals (Nature, Nature Structural and Molecular Biology, ACS Chemical Biology and Molecular Biosystems) have begun to offer links to FirstGlance in Jmol for interactive three-dimensional structure visualization, and two journals (ACS Chemical Biology and Biochemical Journal) occasionally offer interactive three-dimensional figures crafted by Molecules in Motion; but these still lack the simple direct link between the printed information and the three-dimensional structures that is provided by Proteopedia.
Moreover, many biologists have a limited knowledge of chemistry; thus, structural biologists need to make a special effort to develop tools that make macromolecular structures accessible and useful to the life science and clinical communities.
One such tool is molecular animation. Movies are successful at making biomacromolecules and their complexes come to life on the screen, and thus are often able to preserve and convey three-dimensional information far better than static two-dimensional images. Previous efforts to communicate the structural and functional features of a biomacromolecule have largely focused on creation of such movies and on interactive visualizations (for example, Kinemage, MovieMaker, Protein Explorer [7, 8], Protein Movie Generator, and PDB2MGIF [10, 11]). Until recently, the time and technical knowledge required to make such macromolecular animations were daunting. This has been partly rectified with the advent of eMovie, a plug-in for the molecular visualization program PyMOL, and PolyView3D [14, 15], which have both simplified the creation process and lowered the threshold for sharing molecular three-dimensional information via movies. However, although movies are excellent for individual presentations, they are not an adequate solution to the problem that we are attempting to address, because they are fixed once created, and provide neither an interactive environment nor integration with textual information.
What is missing is a common resource that would make three-dimensional structures easier to understand, permit linking of function to structure, and at the same time simplify the sharing of structural information. This should be accomplished not by reducing the amount of information conveyed, but rather by making three-dimensional information intuitive, and thus more accessible to all. Already, valuable attempts have been made to tackle this problem. Perhaps the most notable recent example is iSee, which, like Kinemage, makes three-dimensional structures more intuitive by linking textual information to three-dimensional views of the structure. However, iSee uses both proprietary authoring tools, which must be purchased, and a proprietary viewer that has to be downloaded and installed in order to view both text and three-dimensional structures.
For non-structural biologists, the issue is not understanding a structure as an end in itself, but relating the structural information to biological applications: for example, how do mutations cause disease? Or, to be more specific, what mutation can be performed that will prevent one protein from interacting with another? How can one design a drug that will stabilize a protein destabilized by mutagenesis? Which part of a protein may be useful as an epitope? What happens in an organism in which a given protein domain is missing? In order for structural biology to provide genuine added value for non-structural biologists, we need a resource that will allow the relevant information and its analysis to be entered by the appropriate, knowledgeable scientists - and easily accessed and understood by users without a formal background in structural biology.
Proteopedia is a wiki-based web resource that has been designed to address what is missing from structural biology: a mechanism for making three-dimensional structures easier to understand, a linking of function to interactive three-dimensional structure visualization, and a simplified sharing of structural and functional knowledge (a wiki is a resource or website where users can edit the pages in the website using simple text-editing tools). This resource is a tool for all scientists who need to utilize three-dimensional structural information in their research, as well as for educators requiring a medium for compelling presentation of structure-function relationships. Proteopedia is also meant for structural biology specialists in need of a more effective method of communicating their results. As a website, Proteopedia is freely accessible to all users without the need for downloading and installing any software other than Java. (Most users will find that they already have Java installed on their computers. Should they need to download Java, they will be directed to the Java website for the free and simple download.) Furthermore, adding content to the website is simple: textual content is added in the same way as it is added in Wikipedia, taking advantage of an interface that is familiar to millions. Interactive, customized scenes of three-dimensional structures linked to the text are simple to add via Proteopedia's easy-to-use Scene Authoring Tools. Proteopedia is intended to be the website of first resort for everyone from research scientists to students seeking integrated three-dimensional structural and functional information about a particular protein or molecule.
Proteopedia shows and tells
At first sight, Proteopedia looks a lot like Wikipedia. Indeed, Proteopedia runs on the same open software wiki package used by Wikipedia, MediaWiki. However, a Proteopedia user will soon notice several differences. For one, most pages include at least one instance of the molecular visualization applet Jmol (an applet is a small program embedded in a webpage), displaying a slowly revolving three-dimensional protein structure. Instead of a flattened, two-dimensional image of a protein structure, users are greeted by a three-dimensional structure that may be rotated and explored in real time. The second most obvious difference is the existence of green hyperlinks within the text. Clicking on these hyperlinks changes the three-dimensional molecular scene displayed within the adjacent Jmol applet to one that better illustrates the concept referred to in the relevant text. In some sense this follows the familiar and important English essay-writing adage "Show, don't tell".
For example, a user interested in hemoglobin visits the page of that name in Proteopedia. A slowly rotating three-dimensional crystal structure of hemoglobin is displayed in an interactive Jmol applet. While reading the text, the user clicks on the embedded green hyperlinks to display new molecular scenes illustrating the points in the text (Figure 1). Each of the links, which can be traversed in any order, smoothly transitions from the previous scene to the next one, enhancing the user's spatial comprehension of relative locations on and within the protein. In contrast, two-dimensional images of protein structures often leave the user grappling with the spatial relations of one image to another.
Creating molecular scenes without tears
The key breakthrough in Proteopedia is the ease with which any user can create 'text-to-molecular-scene links' using the Scene Authoring Tools (for example, see the Proteopedia Video Guide for a narrated video tutorial). The Scene Authoring Tools strive for user-friendliness, and they can be accessed from virtually any system, be it Windows, Linux, or Mac, running any of the most popular web browsers (Internet Explorer, Firefox, Safari, and others).
A Proteopedia user who wants to create a scene uses the Scene Authoring Tools to manipulate his or her three-dimensional structure into the desired viewing-perspective and zoom, colors, representations and labels (like a two-dimensional picture). That particular scene of the three-dimensional structure is then saved and married to a green link in the text of the page. Whenever that green link is clicked, the Jmol applet will recall the saved scene, and will automatically transition smoothly to it. Conformational changes (or morphs) can be animated as well. Previously created scenes are easily recalled and edited within the Scene Authoring Tools.
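The relationship between a saved scene and its green link can be pictured with a small data model. The following is a minimal sketch in Python, not Proteopedia's actual implementation: the class names `Scene` and `Page`, all field names, and the example values are hypothetical, chosen only to mirror the properties listed above (viewing perspective, zoom, colors, representations, labels).

```python
from dataclasses import dataclass, field


@dataclass
class Scene:
    """One saved view of a structure (all field names are hypothetical)."""
    name: str
    orientation: tuple  # viewing perspective, e.g. rotation angles (x, y, z)
    zoom: float         # magnification in percent
    colors: dict = field(default_factory=dict)           # selection -> color
    representations: dict = field(default_factory=dict)  # selection -> style
    labels: dict = field(default_factory=dict)           # selection -> label text


class Page:
    """A page whose green links each recall one saved scene."""

    def __init__(self, default_scene):
        self.links = {}               # link text -> Scene
        self.current = default_scene  # scene shown when the page loads

    def add_link(self, link_text, scene):
        # "Marry" a saved scene to a piece of link text.
        self.links[link_text] = scene

    def click(self, link_text):
        # Clicking a link replaces the displayed scene with the saved one.
        self.current = self.links[link_text]
        return self.current
```

In this sketch a link stores a complete scene rather than a change relative to the previous one, which is one way to allow links to be clicked in any order; the smooth transition between scenes is left to the viewer and is not modeled here.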
Content from the user community, wiki-style
Each page in Proteopedia can be modified by the members of the user community, thus permitting addition and editing of content. Modifications become visible and searchable immediately. Adding and editing content is quick, easy, and accessible to the common non-technical user and scientist.
Compared to other three-dimensional structural databases that solely archive, in a rigid format, data from scientists working on a given protein, Proteopedia, because it is a wiki, permits anyone knowledgeable with respect to that particular protein to add information regarding its function and to relate the information directly to the three-dimensional structure. Mistakes and errors are easily corrected by users who have opted to receive e-mail notification whenever the page on which they are expert is changed. Each change made to a page is logged in that page's history, so that pages can easily be reverted to a previous state. When appropriate or necessary, a page may be protected from being edited except by a selected group of stewards who can evaluate proposed changes to the page.
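The logging and reverting behavior described above can be illustrated with a toy model. This is a hedged sketch in Python, not MediaWiki's code or API: the class `WikiPage` and its methods are hypothetical names, and storing a revert as a new revision is a design choice of the sketch that keeps the full history intact.

```python
class WikiPage:
    """Toy page with a full edit history (hypothetical, for illustration)."""

    def __init__(self, title):
        self.title = title
        self.revisions = []  # list of (author, text) tuples, oldest first

    def edit(self, author, text):
        # Every change is appended to the history; nothing is overwritten.
        self.revisions.append((author, text))

    @property
    def text(self):
        # The visible page is always the latest revision.
        return self.revisions[-1][1] if self.revisions else ""

    def contributors(self):
        # Unique author names, in order of first contribution.
        seen = []
        for author, _ in self.revisions:
            if author not in seen:
                seen.append(author)
        return seen

    def revert(self, index, author):
        # Restore the text of an earlier revision by logging it as a new edit.
        self.edit(author, self.revisions[index][1])
```

Because a revert is itself logged, a mistaken revert can be undone in the same way as any other change.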
Adaptation of the wiki concept for the scientific community
In creating a wiki for the scientific community, two chief concerns are to ensure that only knowledgeable users are authoring content, and to ensure that authors receive proper credit for their contributions. Proteopedia addresses these issues in the following manner. While anyone can view Proteopedia pages, only registered users can edit pages and add content. In contrast to Wikipedia, Proteopedia user accounts are exclusive to the scientific community, and only scientists, educators, and students of science are invited to request accounts by clicking on "log in/request account" at the upper right-hand side of the webpage. Approved accounts are created using the users' real names so that the authors both receive appropriate credit for their contributions (each page lists the names of the people who have contributed to the page) and take responsibility for their entries.
Proteopedia for lectures and for supplementing journal articles: protected pages
In a departure from the purist wiki model, Proteopedia provides each user with a section where she or he can create pages that are protected from editing by others. By so doing, Proteopedia encourages educators and lecturers to take advantage of the three-dimensional visualization features of Proteopedia to create interactive three-dimensional 'lecture slides' for projection from the website, without having to worry that the content might be changed by someone else. Students can access this lecture material at any time, anywhere, even after the lecture. Additionally, scientific papers discussing three-dimensional macromolecular structures may also benefit from the three-dimensional visualization features of Proteopedia via protected pages with interactive, three-dimensional material supplementary to the publication.
50,000 pages and growing
But Proteopedia is not a one-to-one mapping of the PDB. The seeded PDB entry pages in Proteopedia provide a base level in a hierarchical organization. A higher level consists of pages that explain and summarize structure/function knowledge about particular molecules or classes of molecules. For example, the hemoglobin and acetylcholinesterase pages provide general overviews of these molecules along with rotatable/zoomable three-dimensional structures and links to all of the related PDB entry pages in Proteopedia.
If you build it, they will come
To have real value to a diverse audience, three-dimensional structures of proteins, RNA, DNA, and other biomacromolecules must be communicated, wherever possible, together with their biochemical and biological functions. While Proteopedia makes this integrated communication possible, and even simple, it is a resource that relies on community annotation, and there is no guarantee that enough knowledgeable users will take to Proteopedia to reach a critical mass. To minimize this risk, Proteopedia attempts to be as enticing as possible to these knowledgeable users, with intuitive visualization features, with user-friendly authoring tools, with attribution of content, with special protected pages for lectures, tutorials, and supplementary information for journal articles, and with a familiar interface (from Wikipedia). In addition, all textual content and scenes added by users to Proteopedia are licensed under the GNU Free Documentation License (as in Wikipedia), thus ensuring that the content is free, and that Proteopedia is solely a vehicle for content creation and dissemination. Proteopedia will also continue to cater to its knowledgeable users by listening to their feedback and actively developing in ways that satisfy their needs and desires. For example, Proteopedia will shortly offer the option to display the amino acids in three-dimensional protein structures color-coded according to their degree of evolutionary conservation (using ConSurf).
How Proteopedia is being used today
Key advantages of Proteopedia
Unique features of Proteopedia in comparison to existing resources with similar purposes
Comparison criteria: purpose; contents (April 2008); contains all entries in the PDB, updated automatically; interactive three-dimensional visualization within site with molecular scenes linked to text; user-friendly three-dimensional authoring tools, freely available.
- Proteopedia. Purpose: a free, collaborative, three-dimensional encyclopedia of proteins and other molecules. Contents (April 2008): one page for every PDB entry with abstract and interactive three-dimensional views, including functional sites and ligands (> 50,000 pages), plus several dozen well-developed higher-level pages (such as hemoglobin).
- iSee. Purpose: to communicate the results of the SGC and ideally of other groups that purchase the software. Contents (April 2008): results of the Structural Genomics Consortium (about 400 datapacks available).
- Kinemage. Purpose: to communicate scientific illustrations as interactive computer displays. Contents (April 2008): estimated to be in the thousands for a wide variety of proteins and biomacromolecules, and created by a diverse group of authors.
- TOPSAN. Purpose: an annotation platform limited to the targets of the Protein Structure Initiative. Contents (April 2008): small subset of structural genomics results (< 2,000 pages).
- PDBWiki. Purpose: a community annotated knowledge base of biological molecular structures. Contents (April 2008): one-to-one mapping of the PDB with additional links and images (> 50,000 pages).
Protein structures are not ends in themselves. Structural information must be placed in the appropriate biological context in order to be useful. To borrow from Greg Petsko, "Structures have value when they are part of a larger effort to understand the biochemical and biological functions of the protein in question... [Structure determination] is not the end in itself, nor should it be, not anymore..." Structures have value to a more diverse audience when three-dimensional structural information is smoothly integrated with biochemical and biological information. For example, it would be ideal if each new deposition in the PDB were accompanied by a well-developed page in Proteopedia by its authors, serving at least as a sort of 'News and Views', and touching on deeper details about the structure as necessary.
Proteopedia enhances the scientific community's ability to communicate complex three-dimensional information. Its integrated text and graphics allow for structural information to be conveyed in a manner that is accessible to a broad repertoire of scientists. Relevance of structure to function can be transmitted in a transparent fashion, and shared via simple tools for contributing to the website. Furthermore, Proteopedia has the capacity to leverage the resources of many diverse experts in varied fields rather than just the curators at a database site - and the ability to do so in an exciting, new medium.
Proteopedia is built upon a customized version of the MediaWiki open software package, and integrates the Jmol open-source Java applet viewer for chemical structures in three dimensions using an adapted version of the Jmol MediaWiki Extension with novel Scene Authoring Tools built specifically for Proteopedia. Kinemages are visualized in Proteopedia using MageJava. PDB entry pages are automatically seeded using a script driven by OCA (the browser/database for protein structure/function), which aggregates information from various resources (listed on the OCA Sources page). SGKB annotation plays a key part in OCA's data collection for seeding the PDB entry pages, and two-dimensional images for these pages are seeded from the RCSB PDB and the Jena Library. Proteopedia is backed up to both local and remote locations at the Weizmann Institute of Science, with incremental backups daily and full backups weekly.
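The idea of seeding an entry page from aggregated metadata can be sketched as follows. This is an illustrative Python example only, not the actual OCA-driven script: the function name `seed_entry_page`, the dictionary keys, and the page layout are all assumptions. The only convention borrowed from MediaWiki is its `== Heading ==` markup.

```python
def seed_entry_page(entry):
    """Return wiki text for one PDB entry page from aggregated metadata.

    `entry` is a hypothetical dict with the keys 'pdb_id' and 'title', and
    optionally 'abstract' (str) and 'ligands' (list of str).
    """
    lines = [f"== {entry['pdb_id']} ==", entry["title"], ""]
    if entry.get("abstract"):
        lines += ["=== Abstract ===", entry["abstract"], ""]
    if entry.get("ligands"):
        lines += ["=== Ligands ===", ", ".join(entry["ligands"]), ""]
    # Drop the trailing blank line and end the page with a single newline.
    return "\n".join(lines).rstrip() + "\n"
```

Sections for which no data were aggregated are simply omitted, so a sparsely annotated entry still yields a valid, if short, page that the community can later extend.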
PDB: Protein Data Bank.
This study was supported by the Divadol Foundation, the Nalvyco Foundation, the Jean and Julia Goldwurm Memorial Foundation, the Benoziyo Center for Neuroscience, the Neuman Foundation, a research grant from Mr. Erwin Pearl, the Kimmelman Center, the European Commission Sixth Framework Research and Technological Development Programme 'SPINE2-COMPLEXES' Project under contract number LSHG-CT-2006-031220 and 'Teach-SG' Project, under contract number ISSG-CT-2007-037198. JLS is the Morton and Gladys Pickman Professor of Structural Biology. EH is grateful to the Karyn Kupcinet Program and the Feinberg Graduate School (Weizmann Institute of Science) for a fellowship. EM's visit to the Weizmann Institute of Science was funded by the Divadol Foundation. The authors are very grateful to the Jmol and MediaWiki development teams for their support and development of their respective software packages. Special thanks go to Bob Hanson, the current lead developer of Jmol, whose timely incorporation of requested features and bug fixes is unparalleled. The authors are further very grateful to all of the resources whose information is aggregated on the Proteopedia seeded pages (PDB code-titled pages) and wish to thank David Lipman for his advice on the proper usage of PubMed abstracts. We also greatly appreciate the useful discussions with Karl Oberholser, Frieda Reichsman, Gideon Schreiber, Yigal Burstein, Harry Greenblatt, Anat Kats, Steven Brenner and David Givol, as well as the generous permission to incorporate content and images developed by Jane and David Richardson [5, 39] and David S Goodsell. We wish to thank, in particular, Nir Ben-Tal and Elana Erez for making ConSurf data available in Proteopedia and Tali Wiesel, of the Weizmann Institute of Science's Graphics Department, for designing Proteopedia's logo.
- Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE: The Protein Data Bank. Nucleic Acids Res. 2000, 28: 235-242. 10.1093/nar/28.1.235.
- Sussman JL, Lin D, Jiang J, Manning NO, Prilusky J, Abola EE: The protein data bank at Brookhaven. International Tables for Crystallography, Volume F Crystallography of Biological Macromolecules. Edited by: Rossmann MG, Arnold E. 2001, Dordrecht: Kluwer Academic Publishers, 649-656. IUCr Tables F.
- FirstGlance in Jmol. [http://firstglance.jmol.org]
- Molecules in Motion. [http://www.moleculesinmotion.com/]
- Richardson DC, Richardson JS: The kinemage: A tool for scientific communication. Protein Sci. 1992, 1: 3-9.
- Maiti R, Van Domselaar GH, Wishart DS: MovieMaker: a web server for rapid rendering of protein motions and interactions. Nucleic Acids Res. 2005, 33: W358-W362. 10.1093/nar/gki485.
- Martz E: Protein Explorer: easy yet powerful macromolecular visualization. Trends Biochem Sci. 2002, 27: 107-109. 10.1016/S0968-0004(01)02008-4.
- Animations in Protein Explorer. [http://proteinexplorer.org/morfdoc.htm]
- Autin L, Tuffery P: PMG: online generation of high-quality molecular pictures and storyboarded animations. Nucleic Acids Res. 2007, 35: W483-W488. 10.1093/nar/gkm277.
- Bohne A: PDB2MultiGIF: a web tool to create animated images of molecules. J Mol Model. 1998, 4: 344-346. 10.1007/s008940050092.
- PDB2multiGIF. [http://www.glycosciences.de/modeling/pdb2mgif/]
- Hodis E, Schreiber G, Rother K, Sussman JL: eMovie: a storyboard-based tool for making molecular movies. Trends Biochem Sci. 2007, 32: 199-204.
- The PyMOL Molecular Graphics System. [http://pymol.sourceforge.net]
- Porollo A, Meller J: Versatile annotation and publication quality visualization of protein complexes using POLYVIEW-3D. BMC Bioinformatics. 2007, 8: 316. 10.1186/1471-2105-8-316.
- Porollo AA, Adamczak R, Meller J: POLYVIEW: a flexible visualization tool for structural and functional annotations of proteins. Bioinformatics. 2004, 20: 2460-2462. 10.1093/bioinformatics/bth248.
- Abagyan R, Lee WH, Raush E, Budagyan L, Totrov M, Sundstrom M, Marsden BD: Disseminating structural genomics data to the public: from a data dump to an animated story. Trends Biochem Sci. 2006, 31: 76-78.
- Wikipedia. [http://www.wikipedia.org]
- Levinthal C: Molecular model-building by computer. Sci Am. 1966, 214: 42-52.
- MediaWiki. [http://www.mediawiki.org]
- Jmol. [http://jmol.sourceforge.net/]
- Proteopedia Video Guide. [http://proteopedia.org/wiki/index.php/Proteopedia:Video_Guide]
- NCBI PubMed. [http://www.pubmed.gov]
- Landau M, Mayrose I, Rosenberg Y, Glaser F, Martz E, Pupko T, Ben-Tal N: ConSurf 2005: the projection of evolutionary conservation scores of residues on protein structures. Nucleic Acids Res. 2005, 33: W299-W302. 10.1093/nar/gki370.
- Proteopedia Recoverin Page. [http://proteopedia.org/wiki/index.php/Recoverin%2C_a_calcium-activated_myristoyl_switch]
- Proteopedia 2rkx Page. [http://proteopedia.org/wiki/index.php/2rkx]
- Röthlisberger D, Khersonsky O, Wollacott AM, Jiang L, DeChancie J, Betker J, Gallaher JL, Althoff EA, Zanghellini A, Dym O, Albeck S, Houk KN, Tawfik DS, Baker D: Novel Kemp elimination catalysts by computational enzyme design. Nature. 2008, 453: 190-195. 10.1038/nature06879.
- Proteopedia 2rd0 Page. [http://proteopedia.org/wiki/index.php/2rd0]
- Huang CH, Mandelker D, Schmidt-Kittler O, Samuels Y, Velculescu VE, Kinzler KW, Vogelstein B, Gabelli SB, Amzel LM: The structure of a human p110alpha/p85alpha complex elucidates the effects of oncogenic PI3Kalpha mutations. Science. 2007, 318: 1744-1748. 10.1126/science.1150799.
- Proteopedia Photosystem II Page. [http://proteopedia.org/wiki/index.php/Photosystem_II]
- Proteopedia Highest Impact Structures Page. [http://proteopedia.org/wiki/index.php/Highest_impact_structures]
- MageJava. [http://kinemage.biochem.duke.edu/software/javamage.php]
- Petsko G: An idea whose time has gone. Genome Biol. 2007, 8: 107. 10.1186/gb-2007-8-6-107.
- Jmol MediaWiki Extension. [http://jmol.svn.sourceforge.net/viewvc/jmol/trunk/Jmol-extensions/wiki/MediaWiki/]
- J. Prilusky, OCA, a Browser-database for Protein Structure/Function. [http://oca.weizmann.ac.il/oca-bin/ocamain]
- OCA Sources. [http://oca.weizmann.ac.il/oca-docs/sources.html]
- PSI Structural Genomics Knowledgebase. [http://kb.psi-structuralgenomics.org/KB/]
- RCSB PDB. [http://www.pdb.org]
- Jena Library of Biological Macromolecules. [http://www.fli-leibniz.de/IMAGE.html]
- Kinemage. [http://kinemage.biochem.duke.edu]
- PDB Molecule of the Month. [http://mgl.scripps.edu/people/goodsell/illustration/pdb]
- Proteopedia Hemoglobin Page. [http://proteopedia.org/wiki/index.php/Hemoglobin]
- Proteopedia 2ac0 Page. [http://proteopedia.org/wiki/index.php/2ac0]
- Kitayner M, Rozenberg H, Kessler N, Rabinovich D, Shaulov L, Haran TE, Shakked Z: Structural basis of DNA recognition by p53 tetramers. Mol Cell. 2006, 22: 741-753. 10.1016/j.molcel.2006.05.015.
- Proteopedia 2bbn Page. [http://proteopedia.org/wiki/index.php/2bbn]
- Ikura M, Clore GM, Gronenborn AM, Zhu G, Klee CB, Bax A: Solution structure of a calmodulin-target peptide complex by multidimensional NMR. Science. 1992, 256: 632-638. 10.1126/science.1585175.
- Karl Oberholser's Proteopedia Ramachandran Plots Page. [http://proteopedia.org/wiki/index.php/User:Karl_Oberholser/Ramachandran_Plots]
- Proteopedia. [http://www.proteopedia.org]
- iSee: interactive Structurally enhanced experience. [http://www.sgc.ox.ac.uk/iSee]
- TOPSAN: The Open Protein Structure Annotation Network. [http://www.topsan.org]
- PDBWiki. [http://www.pdbwiki.org]
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.