Figure 2 | Genome Biology

Figure 2

From: A multi-split mapping algorithm for circular RNA, splicing, trans-splicing and fusion detection

Recall and precision for short circular, long circular and long collinear transcripts. (A) For this benchmark, we tested segemehl’s performance with sequence reads generated from the RefSeq database. To simulate sequencing errors, we applied an Illumina error model to the short circular reads (100 bp) and a 454 error model to the long circular and collinear transcripts (0.5 to 5 kb). For short circular transcripts, segemehl achieved a recall of more than 85%, outperforming all other tools while maintaining a high precision of 98%. Using RefSeq transcripts of length 0.5 to 5 kb, segemehl achieved a recall of more than 80% for circular and linear transcripts. Among the tools able to handle such long transcripts, segemehl was the only one that detected the circularization. For long collinear transcripts, GSNAP performed slightly better than segemehl (by 6%), at the expense of a nearly twofold increase in runtime (Additional file 1: Table S1). (B) The RefSeq TTC22 transcript is an example of a simulated circularization. The arrow indicates where the transcript has been artificially circularized. SpliceMap, RUM and STAR did not find any circular junctions (not shown). Besides segemehl, STAR and GSNAP were the only tools able to handle long reads. gs, GSNAP; ms, MapSplice; se, segemehl; so, SOAPsplice; st, STAR; to, TopHat2.
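For readers unfamiliar with the two metrics in this figure, the following is a minimal sketch of how recall and precision could be computed for junction detection. It assumes a predicted junction counts as correct only when its coordinates exactly match a simulated one; the caption does not state the matching criterion the authors used, and the function name, tuple layout and toy coordinates are hypothetical.

```python
def recall_precision(true_junctions, predicted_junctions):
    """Return (recall, precision) for a set of predicted junctions.

    Assumption: a prediction is a true positive only if it matches a
    simulated junction exactly. Real benchmarks may allow a tolerance.
    """
    truth = set(true_junctions)
    predicted = set(predicted_junctions)
    true_positives = len(truth & predicted)
    # Recall: fraction of simulated junctions that were recovered.
    recall = true_positives / len(truth) if truth else 0.0
    # Precision: fraction of reported junctions that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    return recall, precision


# Hypothetical toy data: (chromosome, donor position, acceptor position).
simulated = [
    ("chr1", 100, 50),
    ("chr1", 400, 300),
    ("chr2", 900, 700),
    ("chr3", 20, 10),
    ("chr4", 80, 60),
]
reported = [
    ("chr1", 100, 50),
    ("chr1", 400, 300),
    ("chr2", 900, 700),
    ("chr5", 5, 1),  # false positive: not among the simulated junctions
]

recall, precision = recall_precision(simulated, reported)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

In this toy case three of five simulated junctions are recovered and three of four reported junctions are correct, so recall is lower than precision, the same pattern the caption reports for segemehl on short circular transcripts.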
