© BioMed Central Ltd 2001
Published: 28 November 2001
Finding where genes begin within genomic sequence is a formidable challenge for projects annotating the human genome. In the Advanced Online Publication of Nature Genetics, Ramana Davuluri and colleagues at Cold Spring Harbor Laboratory in New York describe a bioinformatic strategy for predicting gene promoters and first exons (DOI: 10.1038/ng780). They developed a new program, called FirstEF, that attempts to predict the starts of genes. They collected over two thousand first exons to use as a training dataset and characterized those that were associated with a CpG island. FirstEF is designed to recognize CpG islands, promoter regions and first splice-donor sites. The program predicted 86% of all first exons (92% of CpG-related first exons and 74% of non-CpG-related first exons), with about 17% false positives. FirstEF performed similarly when tested against the finished sequences of human chromosomes 21 and 22.
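To give a sense of what recognizing a CpG island can involve, the sketch below scores a sequence using the widely cited Gardiner-Garden and Frommer criteria (length of at least 200 bp, GC content above 50%, observed/expected CpG ratio above 0.6). This is an illustration only and is not FirstEF's actual method; the function names and default thresholds are assumptions made for the example.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    # "CG" cannot overlap itself, so str.count gives the exact dinucleotide count.
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n
    # Expected CpG count if C and G occurred independently: (#C * #G) / length.
    expected = c * g / n
    obs_exp = cpg / expected if expected > 0 else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Hypothetical helper: apply the three thresholds to a single window."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and obs_exp > min_obs_exp
```

In practice such a test would be applied to sliding windows along a chromosome; a gene-start predictor would combine it with separate promoter and splice-donor signals.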