Figure 2 | Genome Biology

Figure 2

From: Identification of cyanobacterial non-coding RNAs by comparative genome analysis

Pipeline for comparative prediction of non-coding RNAs. (a) Intergenic sequences (IGRs) longer than 49 base pairs were gathered from four Prochlorococcus and Synechococcus genomes and locally aligned using BLASTN. An overview of the intergenic sequences is given in Additional data file 2 (Table S4). Because of the initial asymmetric local alignment using BLASTN (see Figure 2b for a summary of significant BLASTN hits between the strains Prochlorococcus MED4 (MED), MIT 9313 (MIT), SS120 (SS) and Synechococcus WH 8102 (WH)), all candidate sequences were reverse-complemented. Redundancy in this data set was reduced by unifying those hits from each genome that showed a reciprocal overlap of 85% or greater. This candidate set was used as both query and subject in another local alignment step (BLASTN considering only the query strand as possible subject strand). Sequences that directly produced a significant BLAST hit (E-value ≤ 10⁻¹⁰), or were connected by a chain of such hits, were gathered into clusters ('single-linkage clustering'). Both genome strands were screened; thus, the pipeline produced 310 pairs of clusters in both forward and reverse complementary orientation. After an additional unification step of overlapping sequences within each cluster, the resulting clusters and their complement clusters were scored using ALIFOLDZ [33]. (b) The number of BLASTN high-scoring segment pairs for each query and subject combination of intergenic regions is given for a BLASTN E-value cut-off of 10⁻⁵ and after import of high-scoring segment pairs with an E-value of 10⁻¹⁰ or lower (in parentheses). MIT, Prochlorococcus strain MIT 9313; SS, Prochlorococcus strain SS120; WH, Synechococcus sp. WH 8102; MED, Prochlorococcus strain MED4.
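The two grouping steps in panel (a), the 85% reciprocal-overlap test and the single-linkage clustering of significant hits, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `(query, subject, evalue)` tuple layout, the inclusive coordinate convention, and the decision to ignore self-hits are all assumptions made for illustration. Reciprocal overlap is interpreted here as the overlap covering at least the given fraction of both intervals.

```python
from collections import defaultdict

E_VALUE_CUTOFF = 1e-10  # significance threshold stated in the caption


def reciprocal_overlap(a, b):
    """Fraction of overlap relative to each interval; returns the smaller one.

    Intervals are (start, end) tuples with inclusive coordinates
    (an assumed convention). Two hits would be unified when this
    value is >= 0.85.
    """
    overlap = min(a[1], b[1]) - max(a[0], b[0]) + 1
    if overlap <= 0:
        return 0.0
    len_a = a[1] - a[0] + 1
    len_b = b[1] - b[0] + 1
    return min(overlap / len_a, overlap / len_b)


def single_linkage_clusters(hits, cutoff=E_VALUE_CUTOFF):
    """Group sequences linked directly or by a chain of significant hits.

    `hits` is an iterable of (query, subject, evalue) tuples
    (hypothetical layout). Self-hits are skipped, so a sequence that
    only matches itself does not form a cluster (an assumption).
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, evalue in hits:
        if evalue > cutoff or query == subject:
            continue
        root_q, root_s = find(query), find(subject)
        if root_q != root_s:
            parent[root_s] = root_q  # merge the two clusters

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    example_hits = [
        ("A", "B", 1e-20),
        ("B", "C", 1e-12),  # links C to A through B
        ("C", "D", 1e-5),   # above the cut-off, so D is not linked
    ]
    print(single_linkage_clusters(example_hits))
    print(reciprocal_overlap((1, 100), (11, 100)))
```

In this sketch a single significant hit is enough to merge two clusters, which is what makes the clustering single-linkage: A and C end up together even though they share no direct hit.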
