Community-wide hackathons to identify central themes in single-cell multi-omics

Lê Cao, Kim-Anh; Abadi, Al J.; Davis-Marcisak, Emily F.; Hsu, Lauren; Arora, Arshi; Coullomb, Alexis; Deshpande, Atul; Feng, Yuzhou; Jeganathan, Pratheepa; Loth, Melanie; Meng, Chen; Mu, Wancen; Pancaldi, Vera; Sankaran, Kris; Righelli, Dario; Singh, Amrit; Sodicoff, Joshua S.; Stein-O’Brien, Genevieve L.; Subramanian, Ayshwarya; Welch, Joshua D.; You, Yue; Argelaguet, Ricard; Carey, Vincent J.; Dries, Ruben; Greene, Casey S.; Holmes, Susan; Love, Michael I.; Ritchie, Matthew E.; Yuan, Guo-Cheng; Culhane, Aedin C.; Fertig, Elana

doi:10.1186/s13059-021-02433-9

Table 2. Different methods were used in the hackathons and are also available as reproducible vignettes

Common challenges	Tasks	Hackathon 1 (spatial transcriptomics)	Hackathon 2 (spatial proteomics)	Hackathon 3 (scNMT-seq)
Pre-processing	Normalization & data transformation	Data distribution checks (Fig. 2: 1, Fig. 2: 4); Highly variable gene selection (Fig. 2: 5)	Variance Stabilization Normalization [16] (Fig. 2: 8); Arcsinh transformation (Fig. 2: 9); Inverse transformation (Fig. 2: 9); Selection of patients (Fig. 2: 9)	Summaries of DNA measurements (input data provided in hackathon)
Managing differences in scale	Data integration	LIGER [17] (Fig. 2: 2) (sc); ComBat (Fig. 2: 4) (bulk); Projection methods MFA, sGCCA [18] (Fig. 2: 4^a) (bulk); UMAP/tSNE (Fig. 2: 2) (sc)	Multiblock PCA [19]; Weighting matrices based on their similarities: STATIS, MFA (Fig. 2: 8) (bulk^a); Scale MIBI-TOF to the range of CyTOF values (Fig. 2: 9)	LIGER [17] (Fig. 2: 13) (sc); Projection method sGCCA [18] (Fig. 2: 11) (bulk); Multi Omics Supervised Integrative Clustering with weights (Fig. 2: 14) (bulk)
Overlap	Cell overlap (features not matching)			Dimension reduction and projection methods: LIGER [17] (Fig. 2: 13) (sc); sGCCA [18] (Fig. 2: 11) (bulk)
	Partial feature overlap (cells not matching)		Imputation: Direct inversion with latent variables; Optimal transport to predict protein expression (Fig. 2: 10); K-nearest neighbor averaging (Fig. 2: 9). No imputation: Biological Network Interaction^a
	Partial cell overlap (features not matching)		Multiblock PCA [19] (Fig. 2: 8^a)
	No cell overlap (complete feature overlap)		Transfer cell type label with Random Forest (Fig. 2: 7)	LIGER [17] (Fig. 2: 13)
	No cell overlap (partial feature overlap)		Topic modeling to predict cell spatial co-location or spatial expression (Fig. 2: 9, partial feature overlap)
	No overlap		RLQ^a [20]
Generic approaches	Classification & feature selection	Backward selection with SVM (Fig. 2: 1); Self-training ENet (Fig. 2: 4); Balanced error rate (Fig. 2: 1, Fig. 2: 4); Recursive Feature Elimination (Fig. 2: 5) (all bulk)		Multi Omics Supervised Integrative Clustering (Fig. 2: 14) (bulk); Lasso penalization in regression-type models (bulk)
	Cell type prediction	Projection with LIGER [17] (Fig. 2: 2); SVM (Fig. 2: 1, Fig. 2: 5); ssEnet (Fig. 2: 4) (all bulk)
	Spatial analysis	Hidden Markov random field; Voronoi tessellation (Fig. 2: 1) (bulk)	Spatial autocorrelation with Moran’s Index (Fig. 2: 7, Fig. 2: 10); Selection of spatial discriminative features: Moran’s Index, NN correlation, Cell type, interaction composition, L function (Fig. 2: 10) (all bulk)
	Inclusion of additional information		Survival prediction: Cox regression based on spatial features (Fig. 2: 10)	Include annotated hypersensitive sites index to anchor new/unseen data from DNase-seq, (sc)ATAC-seq, scNMT-seq, for de novo peak calling (bulk^a)

^a indicates that the method was not applied to the hackathon data; “bulk” indicates that the method was originally developed for bulk omics; “sc” indicates that the method was developed specifically for single-cell data; all other methods are generic
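
Two of the entries in the table, the arcsinh transformation with its inverse and spatial autocorrelation with Moran's Index, have simple closed forms. The sketch below illustrates both in plain Python and is not the code used in the hackathon vignettes. The function names, the cofactor of 5 (a common choice for mass cytometry data, assumed here) and the toy neighbor matrix are all assumptions made for illustration.

```python
import math


def arcsinh_transform(values, cofactor=5.0):
    """Arcsinh-transform raw intensities. The cofactor is an assumed default."""
    return [math.asinh(v / cofactor) for v in values]


def inverse_arcsinh(values, cofactor=5.0):
    """Map arcsinh-transformed values back to the raw intensity scale."""
    return [cofactor * math.sinh(v) for v in values]


def morans_i(x, weights):
    """Global Moran's I for values x and an n-by-n spatial weight matrix.

    weights[i][j] > 0 marks cells i and j as neighbors; the diagonal is zero.
    """
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, summed over neighboring pairs only.
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


# Hypothetical example: four cells along a line, each adjacent to its neighbors.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = morans_i([1, 1, 5, 5], chain)    # similar values sit together
alternating = morans_i([1, 5, 1, 5], chain)  # neighbors always differ
```

With this toy layout, the clustered pattern gives a positive Moran's I and the alternating pattern a negative one, which is the contrast used to flag spatially structured features.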

ISSN: 1474-760X

Genome Biology