Fig. 3From: Classification of low quality cells from single-cell RNA-seq dataDeceptive cells appear intact but are low quality. a PCA of first two principal components of 960 mESCs using all features. There is a clear separation between visually intact and visibly damaged cells. However, a noticeable fraction of visibly intact cells clusters with visibly damaged cells, and we term these ‘deceptive’ cells, as they look intact but are most likely damaged inside. b Statistical test from 2A-B. Similarity in GO terms indicate that the deceptive cells are also likely broken. c-e Different types of analysis illustrating the effect of removing low quality cells based purely on visual damaged (left side), and in addition, deceptive cells (right) from the training set. c Microscopy images of two chambers from a Fluidigm C1 chip showing the similarity between a genuine visually intact, high quality cell, and one annotated as such but positioned as an outlier cell in the PCA. d Principal component analysis of the training set (serum/LIF, 2i/LIF, alternative 2i/LIF). e Differential expression between serum/LIF and 2i/LIF cells. Boxplots of protein binding enriched GO categories in the middle, illustrating change in gene expression levels when deceptive cells are excluded. f Coefficient of variation compared against mean expression of each gene. Boxplot in the middle illustrates the change in gene expression levels for two significantly enriched GO categoriesBack to article page