- Web report
- Open Access
Finding the needle in the haystack
- Alessandro Guffanti
© BioMed Central Ltd 2002
- Received: 27 November 2001
- Published: 21 January 2002
The Expression Profiler is an ensemble of web-based computational resources (still under development) for clustering gene-expression data with different algorithms and distance measures, obtaining graphical displays of the results (EPCLUST) and linking the clusters to annotation resources (URLMAP). The latter is a URL mapper that uses sequence accession numbers or other identifiers within clusters to create on-the-fly links to external resources such as KEGG, PFAM, SPEXS (a tool for extracting patterns from selected sequence datasets), a SWISS-PROT browser, a Gene Ontology (GO) browser and MEDLINE. It is also possible to match the patterns against sequences stored on the server and compare them graphically with expression profiles (PATMATCH), and to visualize the information content of the patterns using SEQUENCE LOGO, a beautiful and useful sequence logo generator that starts from aligned or unaligned patterns. Sequence logos are an intuitive graphical way of representing consensus sequences or patterns. Clustering is very fast and the number of available options is noticeably larger than in similar PC-based solutions. Another interesting tool is GENOMES, a full genome-sequence and expression-data extractor (limited at present to Saccharomyces cerevisiae open reading frames).
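The "information content" that a sequence logo displays can be made concrete with a small calculation. The sketch below is not taken from SEQUENCE LOGO itself; it assumes already-aligned DNA patterns of equal length, uses the common definition of per-position information as the maximum entropy (2 bits for a four-letter alphabet) minus the observed entropy, and omits the small-sample correction that logo generators often apply. The function name and the example patterns are illustrative only.

```python
import math
from collections import Counter


def information_content(patterns, alphabet_size=4):
    """Return the information content (in bits) of each aligned column.

    Assumes equal-length, already-aligned patterns; no small-sample
    correction is applied (a simplification made for this sketch).
    """
    if not patterns:
        raise ValueError("need at least one pattern")
    length = len(patterns[0])
    if any(len(p) != length for p in patterns):
        raise ValueError("patterns must be aligned to equal length")

    max_bits = math.log2(alphabet_size)
    bits = []
    for column in zip(*patterns):
        counts = Counter(column)
        total = len(column)
        # Shannon entropy of the symbol frequencies in this column
        entropy = -sum((n / total) * math.log2(n / total)
                       for n in counts.values())
        bits.append(max_bits - entropy)
    return bits


# Hypothetical aligned patterns, for illustration only
example_patterns = ["TATAAT", "TATGAT", "TACAAT", "TATACT"]
for position, value in enumerate(information_content(example_patterns), start=1):
    print(f"position {position}: {value:.2f} bits")
```

Under these assumptions a fully conserved column carries 2 bits and a column in which all four bases are equally frequent carries 0 bits; in a logo, the height of the letter stack at each position is proportional to this value.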
The software is under continuous development; the latest version available at the time of writing was dated 1 December 2001.
EPCLUST (the clustering and analysis part of the Expression Profiler) is very fast, rich in options and produces nice GIF images that can be downloaded and used in presentations. The author responds promptly to questions, and a mailing list (ep-users) is available for discussion.
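Why the choice of distance measure matters for clustering expression data can be illustrated with a short sketch. The two measures below (Euclidean distance and one minus the Pearson correlation) are generic examples and are not claimed to be the specific options EPCLUST offers; the gene names and expression values are invented for illustration.

```python
import math


def euclidean(a, b):
    """Straight-line distance: sensitive to absolute expression level."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def correlation_distance(a, b):
    """1 - Pearson correlation: 0 for identical shapes, 2 for opposite ones.

    Undefined for constant profiles (zero variance); not handled here.
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return 1.0 - cov / (norm_a * norm_b)


def nearest(name, profiles, distance):
    """Return the name of the profile closest to `name` under `distance`."""
    others = (k for k in profiles if k != name)
    return min(others, key=lambda k: distance(profiles[name], profiles[k]))


# Hypothetical expression profiles over four conditions
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [10.0, 20.0, 30.0, 40.0],  # same trend as geneA, larger magnitude
    "geneC": [4.0, 3.0, 2.0, 1.0],      # similar magnitude, opposite trend
}

print("Euclidean neighbour of geneA:  ", nearest("geneA", profiles, euclidean))
print("Correlation neighbour of geneA:", nearest("geneA", profiles, correlation_distance))
```

With these made-up values, Euclidean distance pairs geneA with geneC (similar magnitude), whereas correlation distance pairs it with geneB (same trend), so the same data can yield different clusters depending on the measure selected.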
The lack of an extensive, centralized tutorial sometimes makes it hard to follow all the possible paths and to understand all the possibilities of the software. Some options accept invalid input and then fail; this can happen when selecting a subset of genes or when linking to PATMATCH from the clustering results. The EP:PPI (Protein-Protein Interaction) feature is not ready yet and should therefore not be listed on the main page.
The site is functional and useful, but the enormous number of options, each explained only in very condensed form, makes for a steep learning curve. A better page design would also improve usability. I would personally vote for separate menus for basic and advanced users. A step-by-step guided tutorial would be very useful. In its present state, I would not describe the site as suitable for the faint of heart or for the absolute beginner in clustering. With some improvements, however, it will become a very useful and easy-to-use research and teaching tool.