Fig. 1 | Genome Biology

Fig. 1

From: Statistically based splicing detection reveals neural enrichment and tissue-specific induction of circular RNA during human fetal development

Computational pipeline to identify circular RNA candidates. a We start with an annotated genome to create a database of junction sequences which is used to create two custom Bowtie2 junction indices: 1) a scrambled junction index containing all possible junction sequences formed either from circularization of a single exon or by each pair of exons in non-canonical order within a 1 Mb sliding window; 2) a linear junction index containing these exon pairs in canonical order. For single-end (SE) data, mismatch rates in all reads aligned to a given junction are used to determine whether it is a true or false positive (see “Materials and methods”). For paired-end RNA-Seq data, read 1 (R1) and read 2 (R2) are aligned independently as single-end reads to these junction indices and a Bowtie2 genome index. Each R1 that aligned to a scrambled junction and did not align to the genome or a linear junction is categorized based on its mate alignment: circular if the mate aligns within the genomic region of the presumed circle defined by the junctional exons or decoy if the mate aligns outside this region. Each R1 that aligned to a linear junction is categorized as linear if the mate aligns concordantly to support a linear transcript. Reads in the linear and decoy categories are used to fit a generalized linear model (GLM). The GLM predicts the probability that each circular read belongs to class 1 (true positive) versus class 2 (false positive). The de novo analysis pipeline is shown in green. All reads that did not align to any of the indices are used to create a Bowtie2 index of de novo junction sequences, and all of the unaligned reads are re-aligned to this index. Each R1 from the de novo alignment is categorized based on its mate alignment just as the reads aligned to annotated exon junctions are. See “Materials and methods” for details. b Sequencing errors can lead to incorrect alignment to a circular junction. This results in either a false positive circular read or decoy read depending on whether R2 aligns inside or outside of the circularized region defined by R1.
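
The paired-end categorization rule described in panel a can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Interval` class, the `categorize` function, the index labels ("scrambled", "linear", "genome"), and the "uncategorized" fallback are all hypothetical names introduced here, and the handling of cases the caption does not describe (for example an unaligned mate) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Interval:
    """Hypothetical genomic interval (coordinates inclusive)."""
    chrom: str
    start: int
    end: int

    def contains(self, other: "Interval") -> bool:
        # True when `other` lies entirely inside this interval.
        return (
            self.chrom == other.chrom
            and self.start <= other.start
            and other.end <= self.end
        )


def categorize(
    r1_hits: Set[str],
    circle: Optional[Interval],
    mate: Optional[Interval],
    mate_concordant_linear: bool = False,
) -> str:
    """Assign an R1 read to 'circular', 'decoy', 'linear' or 'uncategorized'.

    r1_hits: labels of the indices R1 aligned to (assumed labels:
             'scrambled', 'linear', 'genome').
    circle:  region of the presumed circle defined by the junctional exons.
    mate:    alignment position of R2, or None if R2 did not align.
    """
    # R1 supports a scrambled junction only, with no genome or linear hit.
    if "scrambled" in r1_hits and not ({"genome", "linear"} & r1_hits):
        if circle is None or mate is None:
            return "uncategorized"  # assumption: caption does not cover this
        return "circular" if circle.contains(mate) else "decoy"
    # R1 supports a linear junction; the mate must agree with a linear transcript.
    if "linear" in r1_hits:
        return "linear" if mate_concordant_linear else "uncategorized"
    return "uncategorized"


circle = Interval("chr1", 1000, 5000)
inside = categorize({"scrambled"}, circle, Interval("chr1", 1200, 1300))
outside = categorize({"scrambled"}, circle, Interval("chr1", 6000, 6100))
```

Under these assumptions, `inside` is labeled circular and `outside` is labeled decoy, which mirrors the two outcomes shown in panel b when R2 falls inside or outside the region defined by R1.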
