A dictionary for genomes

Wells, William

doi:10.1186/gb-spotlight-20000830-03

Research news
Published: 30 August 2000

Genome Biology volume 1, Article number: spotlight-20000830-03 (2000)

With sequence information in hand, the search for regulatory sites in promoters can be done by computer rather than by cloning. But the primary tools for this analysis, multiple-alignment algorithms, can handle only a small amount of sequence data. In the August 29 Proceedings of the National Academy of Sciences, Bussemaker et al. introduce an alternative algorithm that they dub 'MobyDick' (Proc Natl Acad Sci USA 2000, 97: 10096-10100). MobyDick treats DNA sequence as text in which allthewordshavebeenruntogether. It attempts to build a dictionary of 'words' by first finding over-represented pairs of letters. Letter frequencies are used to estimate the probability that a given pair occurs by chance, and this guides how progressively larger fragments are built. Bussemaker et al. test their algorithm on a space-less version of the first ten chapters of the novel Moby Dick, then turn to the full set of upstream regions in the yeast genome. For yeast, approximately 500 dictionary entries fall above a plausible significance level, including 114 of the 443 experimentally confirmed sites, and good matches to approximately half of the motifs found in previous analyses of co-regulated genes, the cell cycle, and sporulation.
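The first step described above, finding letter pairs that occur more often than single-letter frequencies would predict, can be illustrated with a minimal Python sketch. This is not the MobyDick implementation: the function name `overrepresented_pairs`, the binomial z-score, and the `min_z` cutoff are assumptions chosen for illustration, and the sketch stops at pairs rather than growing them into longer words.

```python
from collections import Counter
from math import sqrt


def overrepresented_pairs(text, min_z=3.0):
    """Return adjacent letter pairs seen more often than chance predicts.

    Hypothetical helper for illustration only. The z-score statistic and
    the min_z cutoff are assumptions, not values from Bussemaker et al.
    """
    n = len(text)
    if n < 2:
        return []

    # Single-letter frequencies define the "chance" model:
    # letters are assumed to be drawn independently.
    letter_freq = {c: k / n for c, k in Counter(text).items()}

    # Count every overlapping pair of adjacent letters.
    pair_counts = Counter(text[i:i + 2] for i in range(n - 1))
    n_pairs = n - 1

    results = []
    for pair, observed in pair_counts.items():
        # Probability of this pair at one position under independence.
        p = letter_freq[pair[0]] * letter_freq[pair[1]]
        expected = n_pairs * p
        # Binomial standard deviation of the pair count.
        sd = sqrt(n_pairs * p * (1 - p))
        z = (observed - expected) / sd if sd > 0 else 0.0
        if z >= min_z:
            results.append((pair, observed, expected, z))

    # Most strongly over-represented pairs first.
    return sorted(results, key=lambda r: r[3], reverse=True)
```

Calling `overrepresented_pairs(sequence)` on a string of upstream sequence returns tuples of (pair, observed count, expected count, z-score). In a fuller treatment, the surviving pairs would serve as seeds for longer candidate words, which is the part this sketch leaves out.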

References

Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment.
Proceedings of the National Academy of Sciences, [http://www.pnas.org/]
The Promoter Database of Saccharomyces cerevisiae, [http://cgsigma.cshl.org/jian]
Extracting regulatory sites from the upstream region of yeast genes by computational analysis of oligonucleotide frequencies.
Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization.
The transcriptional program of sporulation in budding yeast.

About this article

Cite this article

Wells, W. A dictionary for genomes. Genome Biol 1, spotlight-20000830-03 (2000). https://doi.org/10.1186/gb-spotlight-20000830-03
