Table 1 RNA-seq-based fusion transcript predictors evaluated

From: Accuracy assessment of fusion transcript detection via read-mapping and de novo fusion transcript assembly-based methods

Method

Class*

Brief overview of methodology

Arriba [17]

R

Arriba extracts gene fusions from the chimeric alignments reported by STAR [18] by applying a collection of filters that recognize frequent types of artifacts found in RNA-seq data.

ChimeraScan [19]

R

Identifies candidate fusions from discordant Bowtie [20] genome alignments. Unmapped reads are trimmed and realigned. Junction breakpoint reads are resolved by aligning to candidate fused exons. Fusions are filtered based on abundance of fusion-supporting reads.

ChimPipe [21]

R

The GEMtools RNA-seq pipeline [22] and GEM alignment utility [23] are used to capture discordant and chimeric read alignments, and fusion candidates are filtered according to fusion evidence and additional gene-based filters.

deFuse [24]

R

Aligns reads to spliced and unspliced gene sequences using Bowtie [20], resolves split read junctions using a novel dynamic programming algorithm, and uses an AdaBoost classifier to discriminate between likely true vs. false fusions.

EricScript [25]

R

BWA [26] is used to align reads to the genome. Discordant reads are used to identify candidate gene fusions. BLAT [27] is then used in an iterative local alignment step to define precise fusion breakpoints by aligning to customized targets of fused exons. An AdaBoost classifier trained with synthetic data is used to score and rank fusion predictions.

FusionCatcher [28]

R

Leverages several alignment utilities, including Bowtie [20], Bowtie2 [29], BLAT [27], and STAR [18], together with a collection of customized target databases to identify and characterize fusion candidates. Fusion predictions are rigorously filtered according to gene and fusion annotations.

FusionHunter [30]

R

First uses Bowtie to align reads to the genome and identifies candidate fusions based on discordant read pairs. It then creates a “pseudoreference” by positioning candidate fusion genes in canonical order, realigns reads using a custom algorithm, and identifies both split and spanning reads that provide evidence for gene fusions.

InFusion [31]

R

Reads are first aligned to the reference transcriptome using Bowtie2. Unaligned and discordantly aligned reads are further examined in the context of the genome and transcriptome to cluster evidence and define candidate fusions.

JAFFA-Assembly [32]

A

After removing reads that align to intronic or intergenic regions, as defined by Bowtie genome alignments, the remaining reads are assembled using Oases [33], and the assembled contigs are mapped directly to the transcriptome using BLAT. Chimeric BLAT alignments are further assessed as fusion candidates.

JAFFA-Direct [32]

R

After removing reads that align to intronic or intergenic regions, as defined by Bowtie genome alignments, the remaining reads are mapped directly to the transcriptome using BLAT. Chimeric BLAT alignments are further assessed as fusion candidates.

JAFFA-Hybrid [32]

R,A

After removing reads that align to intronic or intergenic regions, as defined by Bowtie genome alignments, the remaining reads are assembled using Oases. Both the assembled transcripts and the original reads that failed to map to the genome are then mapped directly to the transcriptome using BLAT. Chimeric BLAT alignments are further assessed as fusion candidates.

MapSplice [34]

R

An RNA-seq aligner based on Bowtie, similar to TopHat [35], that includes fusion-finding capabilities, although specific algorithmic details are lacking.

nFuse [36]

R

Designed for use with both WGS and RNA-seq data, but can be executed with RNA-seq alone, in which case it leverages its included deFuse with Bowtie2.

Pizzly [37]

R

Uses a k-mer-based strategy to examine reads that do not map consistently to isoforms during kallisto [38] pseudoalignment.

PRADA [39]

R

Reads are aligned to a combined genome and transcriptome reference using BWA. Discordant reads identify fusion candidates, and junction reads are identified by mapping to a database of all possible 5′-3′ chimeric exon junctions.

SOAP-fuse [40]

R

The SOAP2 aligner [41] is used to map reads to genomes and spliced transcripts to identify fusion candidates.

STARChip [42]

R

Uses chimeric reads reported by STAR. Aimed primarily at identifying circular RNAs, but also reports fusion candidates.

STAR-Fusion [43]

R

Uses chimeric read alignments reported by STAR in its Chimeric.out.junction file to identify candidate fusions, followed by extensive filtering of likely artifacts.

STAR-SEQR [44]

R

Uses chimeric reads reported by STAR to find fusions.

TopHat-Fusion [45]

R

A modified execution of the TopHat aligner [35, 46] that examines initially unmapped reads for support of fusion events.

TrinityFusion-C [47]

A

De novo assembles only the chimeric reads defined by STAR using the Trinity assembler [48], and subsequently leverages GMAP [49, 50] for chimera candidate detection.

TrinityFusion-D [47]

A

De novo assembles all input reads using Trinity, and subsequently leverages GMAP for chimera candidate detection.

TrinityFusion-UC [47]

A

De novo assembles both chimeric and unmapped reads defined by STAR using the Trinity assembler, and subsequently leverages GMAP for chimera candidate detection.

*Class of fusion detection method: R, read mapping; A, assembly followed by alignment.
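
Several of the read-mapping (class R) methods listed above combine two kinds of evidence: discordant read pairs whose mates align to different genes ("spanning" reads) and single reads that cross the breakpoint ("split" or junction reads), and then filter candidates by the amount of supporting evidence. The Python sketch below illustrates that general idea on toy data only. It is not the implementation of any tool in the table: the `PairAlignment` structure, the function names, and the `min_split` and `min_total` thresholds are all hypothetical, and gene orientation (5′ versus 3′ partner) is deliberately ignored.

```python
from collections import defaultdict
from typing import NamedTuple


class PairAlignment(NamedTuple):
    """Toy summary of one read pair (hypothetical structure, for illustration).

    Each mate lists the genes its alignment touches, in alignment order.
    A mate touching two different genes is treated as crossing a breakpoint.
    """
    read_id: str
    mate1_genes: tuple
    mate2_genes: tuple


def classify_pair(pair):
    """Return (gene_pair, evidence_type), or None if the pair is uninformative."""
    # Split (junction) evidence: a single mate aligns across two different genes.
    for genes in (pair.mate1_genes, pair.mate2_genes):
        if len(genes) == 2 and genes[0] != genes[1]:
            return tuple(sorted(genes)), "split"
    # Spanning evidence: each mate aligns within one gene, but the genes differ.
    if len(pair.mate1_genes) == 1 and len(pair.mate2_genes) == 1:
        gene_a, gene_b = pair.mate1_genes[0], pair.mate2_genes[0]
        if gene_a != gene_b:
            return tuple(sorted((gene_a, gene_b))), "spanning"
    return None


def count_support(pairs):
    """Tally split and spanning evidence per unordered gene pair."""
    support = defaultdict(lambda: {"split": 0, "spanning": 0})
    for pair in pairs:
        result = classify_pair(pair)
        if result is not None:
            gene_pair, evidence = result
            support[gene_pair][evidence] += 1
    return dict(support)


def filter_candidates(support, min_split=1, min_total=2):
    """Keep gene pairs with enough evidence (thresholds are assumed values)."""
    kept = []
    for gene_pair, counts in support.items():
        total = counts["split"] + counts["spanning"]
        if counts["split"] >= min_split and total >= min_total:
            kept.append(gene_pair)
    return sorted(kept)


if __name__ == "__main__":
    toy_pairs = [
        PairAlignment("r1", ("BCR", "ABL1"), ("ABL1",)),  # split read
        PairAlignment("r2", ("BCR",), ("ABL1",)),         # spanning pair
        PairAlignment("r3", ("BCR",), ("BCR",)),          # concordant pair
    ]
    print(filter_candidates(count_support(toy_pairs)))
```

Requiring at least one split read in addition to a minimum total is just one plausible filtering rule. The methods in the table apply their own, generally more elaborate, filters.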