Fig. 3 | Genome Biology

Fig. 3

From: Differential gene expression analysis tools exhibit substandard performance for long non-coding RNA-sequencing data

Summary of concordance analysis results. Hierarchical clustering of 25 DE pipelines based on standard scores of four concordance metrics: (a) fraction of significantly differentially expressed (SDE) genes detected at 5% FDR, (b) overlap among pipelines in detecting SDE genes at 5% FDR, (c) gene ranking agreement, and (d) similarity of log fold-change (LFC) estimates. Scores are averaged across the six datasets. First, the observed values y_i (i = 1, 2, ..., 25) of each concordance metric (proportions and correlations) for the pipelines in a given dataset are converted to standard scores, z_i = (y_i − ȳ)/s_y, where ȳ and s_y are the mean and standard deviation of the y_i, respectively. The standard scores of each pipeline are then averaged across datasets. A negative value for the fraction of SDE genes, for example, indicates that the pipeline detected fewer SDE genes than the average across all 25 pipelines. Subsequently, Euclidean distances between pipelines are computed from their marginal standardized scores on the four comparison metrics, and agglomerative clustering with complete linkage is applied, resulting in four clusters. The bar plots to the right of the clustering show the individual marginal scores of each DE tool for the four concordance measures. Because the fold-change estimate from SAMSeq is expressed in terms of rank-sum, it was excluded from the comparison of LFC estimates.
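
The procedure described in the caption (standardize per dataset, average across datasets, then cluster with Euclidean distance and complete linkage) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the data are random stand-ins, the function names `standard_scores`, `average_scores`, and `complete_linkage` are hypothetical, the sample standard deviation (n − 1 denominator) is an assumption, and stopping the merges at four clusters is an assumption about how the four clusters were obtained.

```python
import math
import random
import statistics


def standard_scores(values):
    """Convert observed metric values y_i to z_i = (y_i - mean) / sd."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample SD (n - 1 denominator)
    return [(y - mean) / sd for y in values]


def average_scores(datasets):
    """datasets[d][m][p]: value of metric m for pipeline p in dataset d.

    Returns scores[p][m]: the z-score of pipeline p on metric m,
    averaged over all datasets.
    """
    n_metrics = len(datasets[0])
    n_pipelines = len(datasets[0][0])
    scores = [[0.0] * n_metrics for _ in range(n_pipelines)]
    for data in datasets:
        for m, values in enumerate(data):
            for p, z in enumerate(standard_scores(values)):
                scores[p][m] += z / len(datasets)
    return scores


def complete_linkage(points, n_clusters):
    """Agglomerative clustering with Euclidean distance and complete linkage."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the smallest complete-linkage distance.
        pairs = [(linkage(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Synthetic stand-in data (not from the paper): 6 datasets x 4 metrics x 25 pipelines.
rng = random.Random(0)
datasets = [[[rng.random() for _ in range(25)] for _ in range(4)]
            for _ in range(6)]

scores = average_scores(datasets)
clusters = complete_linkage(scores, n_clusters=4)
print([sorted(c) for c in clusters])
```

Because each metric is standardized within a dataset, the averaged scores for any one metric sum to zero across pipelines, which is why a negative score reads directly as "below the average of the 25 pipelines".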
