
Table 1 Dataset summary. See Section 5.4 for details of these datasets. For each simulated dataset, the actual cell type label set is known. For each experimental dataset, the benchmark label set was obtained through a combination of experimental and bioinformatics analyses, including fluorescence-activated cell sorting (FACS), checking of known feature genes, cell screening, and clustering. Thus, although these benchmark label sets are not the actual label sets, they reflect our best knowledge of the cell types.

From: Clustering Deviation Index (CDI): a robust and accurate internal measure for evaluating scRNA-seq data clustering

| | Dataset | #Cells | #Genes | #Benchmark main-types (subtypes) | Protocols | Reference |
|---|---|---|---|---|---|---|
| Experimental datasets | CT26.WT | 9621 | 11,710 | 1(-) | 10X v3 | - |
| | T-CELL | 2989 | 7893 | 5(-) | 10X | [24] |
| | CORTEX | 7390 | 12,887 | 8(33) | inDrop | [25] |
| | RETINA | 26,830 | 13,118 | 6(18) | Drop-seq | [26] |
| | IPF | 114,396 | 20,354 | 5(31) | 10X | [27] |
| | COVID | 1,251,200 | 23,491 | 31(57) | 10X | [28,29,30] |
| Simulated datasets | SD1 | 4000 | 10,000 | 10(-) | - | - |
| | SD2 | 4200 | 10,000 | 4(-) | - | - |
| | SD3 | 2800 | 10,000 | 2(4) | - | - |
| | SD4 | 3000 | 4887 | 5(-) | - | - |