A novel approach to analyzing SNPs
- Chang-Su Lim
© BioMed Central Ltd 2001
Received: 6 November 2001
Published: 21 December 2001
PicSNP is a browseable catalog of nonsynonymous single-nucleotide polymorphisms (nsSNPs - that is, base changes that alter the amino-acid sequence) in the human genome. Out of 1,190,295 SNPs extracted from public databases, 3,793 nsSNPs have been identified among 2,162 genes, and 2,826 of these nsSNPs (distributed among 1,506 genes) could be connected to 1,247 Gene Ontology categories of molecular function, biological process or cellular component. Of the sites and domains annotated in the SwissProt database, 495 were found to contain nsSNPs, among them two nsSNPs in disulfide-binding sites and 38 in transmembrane regions. The website should be useful, for example, in searching for SNPs involved in common diseases.
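The distinction between synonymous and nonsynonymous changes can be illustrated with a minimal Python sketch. It is not PicSNP's own code: the function name `classify_snp` and its arguments are assumptions made for illustration, and the sketch simply translates the reference and altered codons with the standard genetic code and compares the results. A change to or from a stop codon is reported as nonsynonymous here, which is a simplification.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(codon, position, alt_base):
    """Classify a single-base change within one codon (hypothetical helper).

    codon    -- reference codon on the coding strand, e.g. "GAG"
    position -- 0-based index of the changed base within the codon
    alt_base -- the alternative base at that position
    Returns (label, reference_amino_acid, altered_amino_acid).
    """
    codon = codon.upper()
    alt_base = alt_base.upper()
    if len(codon) != 3 or position not in (0, 1, 2):
        raise ValueError("expected a 3-base codon and a position of 0, 1 or 2")
    if alt_base not in BASES or codon[position] == alt_base:
        raise ValueError("alt_base must be a different base from T, C, A or G")

    altered = codon[:position] + alt_base + codon[position + 1:]
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[altered]
    label = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return label, ref_aa, alt_aa


print(classify_snp("GAG", 1, "T"))  # ('nonsynonymous', 'E', 'V')
print(classify_snp("CTT", 2, "C"))  # ('synonymous', 'L', 'L')
```

Only changes of the first kind, where the encoded amino acid differs, would qualify for a catalog of nsSNPs.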
The nsSNPs are classified according to the functions of the affected genes and are searchable under the guidance of the Gene Ontology hierarchical lists of protein functions. Amino-acid changes in known functional domains and sites in proteins are highlighted within individual nsSNP records by showing the affected amino acids in red. Changes to the nucleotide sequence are also easy to view. The hierarchical lists of functions, together with their associated nsSNPs, are implemented as a set of HTML pages that can be explored with any standard web browser.
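How a hierarchy of functional categories with attached nsSNPs might be turned into nested HTML lists can be sketched as follows. This is only an illustration under stated assumptions, not the site's actual implementation: the tree layout, the `render_category` function, the `nssnp` CSS class and the record identifiers such as `nsSNP-0001` are all hypothetical placeholders, and the category names are taken from the examples mentioned in this report.

```python
from html import escape

# Hypothetical category tree: each node may hold child categories
# and a list of placeholder nsSNP record identifiers.
TREE = {
    "children": {
        "behavior": {"nssnps": ["nsSNP-0001"]},
        "cell growth": {"nssnps": ["nsSNP-0002", "nsSNP-0003"]},
        "cell-cell communication": {},
    }
}


def render_category(name, node, indent=0):
    """Return the lines of an HTML list item for one category and its subtree."""
    pad = "  " * indent
    lines = [f"{pad}<li>{escape(name)}"]
    children = node.get("children", {})
    snps = node.get("nssnps", [])
    if children or snps:
        lines.append(f"{pad}  <ul>")
        # nsSNP records attached directly to this category.
        for snp_id in snps:
            lines.append(f'{pad}    <li class="nssnp">{escape(snp_id)}</li>')
        # Child categories are rendered recursively, one level deeper.
        for child_name, child in children.items():
            lines.extend(render_category(child_name, child, indent + 2))
        lines.append(f"{pad}  </ul>")
    lines.append(f"{pad}</li>")
    return lines


page = "<ul>\n" + "\n".join(render_category("biological process", TREE, 1)) + "\n</ul>"
print(page)
```

Because each category becomes an ordinary nested list, the result can be browsed without any server-side logic.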
According to the website manager, all the steps in constructing the catalog, from downloading the public databases through to creating the HTML pages, will soon be fully automated by locally developed computer programs. This will ensure that the data stay fully coordinated with those of the public data sources.
The search output provides a brief summary of useful information on each gene identified as containing an nsSNP, including the biological processes in which the gene is involved (for example, behavior, cell-cell communication, cell growth).
The database is not yet widely known, and its coverage is still limited.
The National Center for Biotechnology Information (NCBI) hosts the general database of single-nucleotide polymorphisms, dbSNP. LocusLink and UniGene are useful for finding information about a particular gene or expressed sequence tag (EST).