Table 1 Datasets used for evaluation of BERMUDA

From: BERMUDA: a novel deep transfer learning method for single-cell RNA sequencing batch correction reveals hidden high-resolution cellular subtypes

| Dataset | Batch | Protocol | Number of cells | Cell types (number of cells) |
|---|---|---|---|---|
| 2D Gaussian | Batch1 | Simulation | 2000 | Type1 (855), Type2 (237), Type3 (373), Type4 (535) |
| 2D Gaussian | Batch2 | Simulation | 2000 | Type1 (379), Type2 (656), Type3 (102), Type4 (863) |
| Splatter | Batch1 | Splatter | 2000 | Type1 (810), Type2 (616), Type3 (388), Type4 (186) |
| Splatter | Batch2 | Splatter | 1000 | Type1 (439), Type2 (262), Type3 (201), Type4 (98) |
| Pancreas | Muraro | CEL-Seq2 | 2042 | alpha (812), beta (448), gamma (101), delta (193), epsilon (3), acinar (219), ductal (245), endothelial (21) |
| Pancreas | Baron | inDrop | 8012 | alpha (2326), beta (2525), gamma (255), delta (601), epsilon (18), acinar (958), ductal (1077), endothelial (252) |
| Pancreas | Segerstolpe | SMART-seq2 | 2061 | alpha (886), beta (270), gamma (197), delta (114), epsilon (7), acinar (185), ductal (386), endothelial (16) |
| PBMC | PBMC | 10x Chromium | 8381 | |
| PBMC | Pan T cell | 10x Chromium | 3555 | |

  1. For consistency within the paper, we refer to each pancreas and PBMC dataset (e.g., the Muraro dataset) as a batch, and to a group of batches investigating a similar biological system (e.g., pancreas) as a dataset. Type1 to Type4 in the 2D Gaussian and Splatter datasets refer to virtual cell types generated by simulation.