Table 2 Overview of UMI datasets used for analysis

From: Analytic Pearson residuals for normalization of single-cell RNA-seq UMI data

| reference | accession no. | protocol | cells | genes | species | description |
|---|---|---|---|---|---|---|
| 33k PBMC | * | 10X v1 | 33,148 | 16,809 | Human | Peripheral blood mononuclear cells |
| [43] | SRP073767** | 10X v1 | 3,994 | 15,715 | Human | FACS-sorted PBMC cells |
| [52] | E-MTAB-5480 | 10X v2 | 2,000 | 13,025 | Human | Droplets of bulk RNA solution |
| [53] | GSM1599501 | inDrop | 953 | 25,025 | Human | Droplets of bulk RNA solution |
| [54] | GSM2906413 | MicrowellSeq | 9,994 | 15,069 | Mouse | Non-differentiating stem cells |
| [33] | GSE63472 | DropSeq | 24,769 | 17,973 | Mouse | Retinal cells |
| [37] | GSE81904 | DropSeq | 13,987 | 16,520 | Mouse | Retinal bipolar cells |
| [38] | GSE133382 | 10X v2 | 15,750 | 17,685 | Mouse | Retinal ganglion cells |
| [39] | GSE119945 | sci-RNA-seq3 | 2,058,652 | 26,183 | Mouse | Organogenesis of mouse embryo cells |

In the 10X control dataset [52], we used only sample 1. In the MicrowellSeq control dataset [54], we used the E14 dataset. In the three retinal datasets [33, 37, 38], we used only cells from the largest batch. The FACS-sorted PBMC dataset was assembled by the authors of a recent paper [42], based on a benchmarking dataset published earlier [43]. Numbers of genes and cells are given after batch selection (where applicable) and initial gene filtering (see “Methods”). Scripts performing these operations and detailed download instructions for all materials are published in our GitHub repository at http://www.github.com/berenslab/umi-normalization. The accession numbers refer to archived datasets at the Gene Expression Omnibus (NCBI), the Sequence Read Archive (NCBI), or ArrayExpress (EMBL-EBI).

*Data obtained directly from 10X Genomics at https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.1.0/pbmc33k.

**The accession number links to the base dataset that the original authors used to construct the ground truth dataset for their paper [42]. To obtain the dataset used here, use the Bioconductor 3.1.3 R package DuoClustering2018 or visit the authors’ website (http://imlspenticton.uzh.ch/robinson_lab/DuoClustering2018/).
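The batch selection and initial gene filtering mentioned in the table notes can be illustrated with a minimal sketch. This is not the authors' script (those are in the repository linked above). The function names `select_largest_batch` and `filter_genes`, the toy count matrix, and the `min_cells=5` threshold are all assumptions made for illustration; the actual filtering criterion is described in “Methods”.

```python
import numpy as np


def select_largest_batch(counts, batch_labels):
    """Keep only the cells (rows) that belong to the batch with the most cells."""
    batch_labels = np.asarray(batch_labels)
    values, sizes = np.unique(batch_labels, return_counts=True)
    largest = values[np.argmax(sizes)]
    return counts[batch_labels == largest], largest


def filter_genes(counts, min_cells=5):
    """Drop genes (columns) detected in fewer than `min_cells` cells.

    `min_cells=5` is a placeholder threshold, not taken from the table.
    """
    cells_per_gene = (counts > 0).sum(axis=0)
    keep = cells_per_gene >= min_cells
    return counts[:, keep], keep


# Toy UMI count matrix: 200 cells x 50 genes, two hypothetical batches.
rng = np.random.default_rng(0)
counts = rng.poisson(0.3, size=(200, 50))
batches = np.array(["a"] * 120 + ["b"] * 80)

batch_counts, largest = select_largest_batch(counts, batches)
filtered, keep = filter_genes(batch_counts, min_cells=5)
print(largest, batch_counts.shape, filtered.shape)
```

The cell and gene counts reported in the table correspond to the shape of the matrix after both steps, that is, `filtered.shape` in this sketch.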