Figure 1 | Genome Biology

Figure 1

From: A hidden reservoir of integrative elements is the major source of recently acquired foreign genes and ORFans in archaeal and bacterial genomes


Markov model-based strategy. (a) An optimal core gene dataset is determined, and (b) a Markov probability matrix is built from it. (c) For a given genome, each ORF is analyzed using a Markov model that takes into account the Markov probability matrix of the core gene dataset and the composition of the ORF under study. (d) For each ORF, the model calculates an index that represents the likelihood of that ORF having a composition similar to that of the core gene dataset. (e) One million random sequences are generated based on the Markov probability matrix of the core gene dataset, and their Markov indexes are calculated. (f) ORFs with a Markov index below a defined threshold of the distribution of random-sequence indexes are considered atypical.
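Steps (b) through (f) can be illustrated with a minimal sketch. The caption does not state the order of the Markov model, the exact formula for the index, or the threshold value, so everything below is an assumption made for illustration: a first-order nucleotide chain, an index defined as the mean log transition probability, a pseudocount of 1, and a 5% quantile as the threshold. All function names are hypothetical and are not taken from the article.

```python
import math
import random

ALPHABET = "ACGT"


def build_transition_matrix(core_genes, pseudocount=1.0):
    """Step (b): estimate first-order transition probabilities from core genes.

    Assumption: a first-order chain with a pseudocount so no probability is zero.
    """
    counts = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
    for seq in core_genes:
        for a, b in zip(seq, seq[1:]):
            if a in counts and b in counts[a]:  # skip ambiguous bases such as N
                counts[a][b] += 1
    matrix = {}
    for a in ALPHABET:
        total = sum(counts[a].values())
        matrix[a] = {b: counts[a][b] / total for b in ALPHABET}
    return matrix


def markov_index(seq, matrix):
    """Steps (c)-(d): mean log transition probability of a sequence (assumed index)."""
    pairs = [(a, b) for a, b in zip(seq, seq[1:]) if a in matrix and b in matrix[a]]
    if not pairs:
        raise ValueError("sequence has no scorable transitions")
    return sum(math.log(matrix[a][b]) for a, b in pairs) / len(pairs)


def random_sequence(matrix, length, rng):
    """Step (e): draw one random sequence from the transition matrix."""
    seq = [rng.choice(ALPHABET)]
    for _ in range(length - 1):
        row = matrix[seq[-1]]
        seq.append(rng.choices(ALPHABET, weights=[row[b] for b in ALPHABET])[0])
    return "".join(seq)


def atypical_threshold(matrix, n_random, length, quantile, rng):
    """Step (e): index value at the given lower quantile of the random distribution."""
    indexes = sorted(
        markov_index(random_sequence(matrix, length, rng), matrix)
        for _ in range(n_random)
    )
    return indexes[int(quantile * (n_random - 1))]


def find_atypical(orfs, matrix, threshold):
    """Step (f): names of ORFs whose index falls below the threshold."""
    return [name for name, seq in orfs.items() if markov_index(seq, matrix) < threshold]


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data only; real input would be the core genes and ORFs of one genome.
    matrix = build_transition_matrix(["ATGGCTAAAGCTGCTAAAGGTTAA", "ATGAAAGCTGGTGCTAAATAA"])
    # The caption uses one million random sequences; a smaller number keeps the demo fast.
    threshold = atypical_threshold(matrix, n_random=1000, length=200, quantile=0.05, rng=rng)
    print(find_atypical({"orf1": "ATGGCTAAAGGTTAA", "orf2": "CCCCCCCCCCCCCCC"}, matrix, threshold))
```

Averaging the log probability over the number of transitions is one way to make ORFs of different lengths comparable against a single random distribution; whether the article normalizes by length in this way is not stated in the caption.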
