Clustering round the links

Computational analysis of genomic DNA aids the prediction of gene structure, and predicted gene clusters can provide clues to function. Four computational methods (Rosetta Stone, Phylogenetic Profile, Operon, and Conserved Gene Neighbor) identify linkages and infer the function of genes and proteins in whole-genome analysis, complementing the BLAST approach based on sequence homology. In the December 15 Nucleic Acids Research, Michael Strong and colleagues at the University of California at Los Angeles use the predictions from these methods to generate a genome-wide functional linkage map that portrays linkages in a two-dimensional scatter plot. Applied to the Mycobacterium tuberculosis genome, the approach assigns functions to previously uncharacterized genes, predicts a hypothetical serine/threonine kinase pathway involving cell wall genes, and identifies potential novel targets for drug development (Nucleic Acids Research 2003, 31:7099-7109).

Strong et al. displayed 9766 functional linkages involving 1381 unique genes predicted in the M. tuberculosis genome in a single scatter-plot diagram. The x and y axes were both ordered according to the position of genes along the bacterial chromosome, with each point representing a functional linkage between the two genes at the corresponding chromosomal positions. Clustering on or near the diagonal reflects both bacterial operon organization and the close chromosomal proximity of genes of related function, information that cannot be derived from standard node-and-edge graphs. Protein linkages can be viewed interactively, either at the whole-genome level or by zooming in to reveal small groups of clustered genes. A functional linkage profile for each gene was then derived from the map. Hierarchical clustering of these profiles reordered the map to give a detailed visual portrayal of functional linkage across the whole genome, in which clusters containing previously uncharacterized genes provide clues to their putative function.
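The idea of a linkage map reordered by clustering can be illustrated with a minimal sketch. It is not the authors' implementation: the six-gene toy data, the function names, the Jaccard distance, the average-linkage rule, and the choice to count each gene as a member of its own profile are all assumptions made for illustration.

```python
from itertools import combinations

def linkage_matrix(n_genes, linkages):
    """Symmetric 0/1 matrix; entry [i][j] is 1 if genes i and j are linked.
    Genes are indexed by their order along the chromosome."""
    m = [[0] * n_genes for _ in range(n_genes)]
    for i, j in linkages:
        m[i][j] = m[j][i] = 1
    return m

def jaccard_distance(a, b):
    """1 - (shared partners / all partners); 1.0 if both profiles are empty."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 1.0 - both / either if either else 1.0

def cluster_order(profiles):
    """Naive average-linkage agglomerative clustering.
    Returns gene indices in the order in which clusters were merged."""
    clusters = [[i] for i in range(len(profiles))]

    def dist(c1, c2):
        total = sum(jaccard_distance(profiles[i], profiles[j])
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Merge the two closest clusters.
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0]

def reorder(m, order):
    """Apply the same permutation to the rows and columns of the map."""
    return [[m[i][j] for j in order] for i in order]

# Hypothetical toy data: two functional groups, {0, 1, 4} and {2, 3, 5}.
toy_linkages = [(0, 1), (0, 4), (1, 4), (2, 3), (2, 5), (3, 5)]
matrix = linkage_matrix(6, toy_linkages)

# Each point of the scatter plot is one linked (x, y) pair of gene positions.
points = [(i, j) for i in range(6) for j in range(6) if matrix[i][j]]

# Profile of a gene = its row of the map, with the gene itself included
# (an assumption, so that directly linked genes have similar profiles).
profiles = [row[:i] + [1] + row[i + 1:] for i, row in enumerate(matrix)]

order = cluster_order(profiles)
reordered = reorder(matrix, order)
```

In the reordered toy map, genes from the same group sit next to each other, so their linkages form blocks along the diagonal even when the genes are far apart on the chromosome (gene 4 in the first group, for example).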

"Since our original clustered map contains only linkages inferred by the overlap of two or more [of the four computational methods], examination of linkages established by individual methods may provide additional information and may aid in the identification of protein function involving clusters of previously uncharacterized genes. We envision that these functional linkages may suggest potential functional roles for these proteins and may indicate potential research directions," the authors conclude.
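The distinction the authors draw, between linkages supported by two or more methods and those established by a single method, can be sketched as a simple counting step. The gene identifiers and predicted pairs below are hypothetical placeholders, not data from the paper, and the function name is an assumption.

```python
from collections import Counter

# Hypothetical predictions: each method reports a set of linked gene pairs.
predictions = {
    "rosetta_stone": {("geneA", "geneB"), ("geneC", "geneD")},
    "phylogenetic_profile": {("geneB", "geneA"), ("geneE", "geneF")},
    "operon": {("geneC", "geneD"), ("geneE", "geneF"), ("geneG", "geneH")},
    "conserved_gene_neighbor": {("geneD", "geneC")},
}

def method_support(predictions):
    """Count how many methods report each unordered gene pair."""
    support = Counter()
    for pairs in predictions.values():
        # Normalise within a method so (a, b) and (b, a) count once.
        for pair in {frozenset(p) for p in pairs}:
            support[pair] += 1
    return support

support = method_support(predictions)

# Linkages inferred by the overlap of two or more methods.
consensus = {pair for pair, n in support.items() if n >= 2}

# Linkages established by a single method only.
single_method = {pair for pair, n in support.items() if n == 1}
```

Here three pairs reach the two-method threshold, while the geneG/geneH pair is reported by one method only and would appear only when single-method linkages are examined.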





Holding, C. Clustering round the links. Genome Biol 4, spotlight-20031201-08 (2003).



  • Mycobacterium Tuberculosis
  • Functional Linkage
  • Phylogenetic Profile
  • Nucleic Acid Research
  • Uncharacterized Gene