Table 1 Datasets used for evaluation of BERMUDA

From: BERMUDA: a novel deep transfer learning method for single-cell RNA sequencing batch correction reveals hidden high-resolution cellular subtypes

| Dataset | Batch | Protocol | Number of cells | Cell types (number of cells) |
|---|---|---|---|---|
| 2D Gaussian | Batch1 | Simulation | 2000 | Type1 (855), Type2 (237), Type3 (373), Type4 (535) |
| 2D Gaussian | Batch2 | Simulation | 2000 | Type1 (379), Type2 (656), Type3 (102), Type4 (863) |
| Splatter | Batch1 | Splatter | 2000 | Type1 (810), Type2 (616), Type3 (388), Type4 (186) |
| Splatter | Batch2 | Splatter | 1000 | Type1 (439), Type2 (262), Type3 (201), Type4 (98) |
| Pancreas | Muraro | CEL-Seq2 | 2042 | alpha (812), beta (448), gamma (101), delta (193), epsilon (3), acinar (219), ductal (245), endothelial (21) |
| Pancreas | Baron | inDrop | 8012 | alpha (2326), beta (2525), gamma (255), delta (601), epsilon (18), acinar (958), ductal (1077), endothelial (252) |
| Pancreas | Segerstolpe | SMART-seq2 | 2061 | alpha (886), beta (270), gamma (197), delta (114), epsilon (7), acinar (185), ductal (386), endothelial (16) |
| PBMC | PBMC | 10x Chromium | 8381 | |
| PBMC | Pan T cell | 10x Chromium | 3555 | |
  1. For consistency in the paper, we refer to each individual pancreas or PBMC dataset (e.g., the Muraro dataset) as a batch, and to a group of batches investigating a similar biological system (e.g., pancreas) as a dataset. Type1 to Type4 in the 2D Gaussian and Splatter datasets are virtual cell types generated by simulation.