Fig. 4 | Genome Biology

Fig. 4

From: Accuracy, robustness and scalability of dimensionality reduction methods for single-cell RNA-seq analysis


The computation time (in hours) for different dimensionality reduction methods. We recorded computing time for 18 dimensionality reduction methods on simulated data sets with varying numbers of low-dimensional components and varying sample sizes. Computing time was recorded on a single thread of an Intel Xeon E5-2683 2.00-GHz processor. Compared dimensionality reduction methods include factor analysis (FA; light green), principal component analysis (PCA; light blue), independent component analysis (ICA; blue), Diffusion Map (pink), nonnegative matrix factorization (NMF; green), Poisson NMF (light orange), zero-inflated factor analysis (ZIFA; light pink), zero-inflated negative binomial based wanted variation extraction (ZINB-WaVE; orange), probabilistic count matrix factorization (pCMF; light purple), deep count autoencoder network (DCA; yellow), scScope (purple), generalized linear model principal component analysis (GLMPCA; red), multidimensional scaling (MDS; cyan), locally linear embedding (LLE; blue green), local tangent space alignment (LTSA; teal blue), Isomap (gray), uniform manifold approximation and projection (UMAP; brown), and t-distributed stochastic neighbor embedding (tSNE; dark red). (a) Computation time for the different dimensionality reduction methods (y-axis) as a function of an increasing number of low-dimensional components (x-axis). The number of cells is fixed at 500 and the number of genes is fixed at 10,000 in this set of simulations. Three methods (ZINB-WaVE, pCMF, and ZIFA) become noticeably more computationally expensive than the remaining methods as the number of low-dimensional components increases. (b) Computation time for the different dimensionality reduction methods (y-axis) as a function of an increasing sample size (i.e., the number of cells) in the data. The number of low-dimensional components is fixed at 22 in this set of simulations for most methods, except for tSNE, which used two low-dimensional components because of a limitation of the tSNE software. Note that some methods are implemented with parallelization capability (e.g., ZINB-WaVE and pCMF), though we tested them on a single thread for a fair comparison across methods. Note also that the line for PCA is similar to that for ICA in panel a, and the line for scScope is similar to those of several other efficient methods in panel b; thus, their lines may appear to be missing. Overall, three methods (ZIFA, pCMF, and ZINB-WaVE) become noticeably more computationally expensive than the remaining methods as the number of cells in the data increases.
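
The timing protocol described in the caption (one sweep over the number of low-dimensional components, one sweep over the number of cells, wall-clock time reported in hours) can be sketched roughly as follows. This is a minimal illustration and not the authors' benchmarking code: `simulate_counts`, `random_projection`, `time_in_hours`, `sweep_components`, and `sweep_cells` are hypothetical names, the random-count generator is only a stand-in for the simulated data sets, and the toy reducer is not one of the 18 compared methods. Any callable taking `(data, n_components)` could be timed in its place.

```python
import random
import time
from typing import Callable, Dict, List, Sequence

Matrix = List[List[float]]
Reducer = Callable[[Matrix, int], Matrix]


def simulate_counts(n_cells: int, n_genes: int, seed: int = 0) -> Matrix:
    """Hypothetical stand-in for simulated data: random small counts (cells x genes)."""
    rng = random.Random(seed)
    return [[float(rng.randint(0, 5)) for _ in range(n_genes)] for _ in range(n_cells)]


def random_projection(data: Matrix, n_components: int, seed: int = 0) -> Matrix:
    """Toy reducer (an assumption, not a method from the figure):
    project each cell onto n_components random Gaussian directions."""
    rng = random.Random(seed)
    n_genes = len(data[0])
    directions = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)]
                  for _ in range(n_components)]
    return [[sum(x * w for x, w in zip(row, d)) for d in directions] for row in data]


def time_in_hours(method: Reducer, data: Matrix, n_components: int) -> float:
    """Wall-clock time of a single run, converted from seconds to hours."""
    start = time.perf_counter()
    method(data, n_components)
    return (time.perf_counter() - start) / 3600.0


def sweep_components(method: Reducer, data: Matrix,
                     component_grid: Sequence[int]) -> Dict[int, float]:
    """Panel-a style sweep: data held fixed, number of components varies."""
    return {k: time_in_hours(method, data, k) for k in component_grid}


def sweep_cells(method: Reducer, cell_grid: Sequence[int],
                n_genes: int, n_components: int) -> Dict[int, float]:
    """Panel-b style sweep: number of components held fixed, number of cells varies."""
    return {n: time_in_hours(method, simulate_counts(n, n_genes), n_components)
            for n in cell_grid}
```

For example, `sweep_components(random_projection, simulate_counts(500, 10000), grid)` would mirror the panel a setting of 500 cells and 10,000 genes, and `sweep_cells(random_projection, grid, n_genes, 22)` would mirror the 22 components used for most methods in panel b. The grid values themselves are not given in the caption and would have to be chosen by the user.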
