From: Community-wide hackathons to identify central themes in single-cell multi-omics
Common challenges | Tasks | Hackathon 1 (spatial transcriptomics) | Hackathon 2 (spatial proteomics) | Hackathon 3 (scNMT-seq) |
---|---|---|---|---|
Pre-processing | Normalization & data transformation | Data distribution checks (Fig. 2: 1, Fig. 2: 4); Highly variable gene selection (Fig. 2: 5) | Variance Stabilization Normalization [16] (Fig. 2: 8); Arcsinh transformation (Fig. 2: 9); Inverse transformation (Fig. 2: 9); Selection of patients (Fig. 2: 9) | Summaries of DNA measurements (input data provided in hackathon) |
Managing differences in scale | Data integration | ComBat (Fig. 2: 4) (bulk); Projection methods MFA, sGCCA [18] (Fig. 2: 4a) (bulk); UMAP/tSNE (Fig. 2: 2) (sc) | Multiblock PCA [19]; Weighting matrices based on their similarities: STATIS, MFA (Fig. 2: 8) (bulk<sup>a</sup>); Scale MIBI-TOF to the range of CyTOF values (Fig. 2: 9) | Projection method sGCCA [18] (Fig. 2: 11) (bulk); Multi Omics Supervised Integrative Clustering with weights (Fig. 2: 14) (bulk) |
Overlap | Cell overlap (features not matching) | Dimension reduction and projection methods: | ||
Partial feature overlap (cells not matching) | Imputation: Direct inversion with latent variables; Optimal transport to predict protein expression (Fig. 2: 10); K-nearest neighbor averaging (Fig. 2: 9). No imputation: Biological Network Interaction<sup>a</sup> | |||
Partial cell overlap (features not matching) | ||||
No cell overlap (complete feature overlap) | Transfer cell type label with Random Forest (Fig. 2: 7) | |||
No cell overlap (partial feature overlap) | Topic modeling to predict cell spatial co-location or spatial expression (Fig. 2: 9, partial feature overlap) | |||
No overlap | RLQ<sup>a</sup> [20] | |||
Generic approaches | Classification & feature selection | Backward selection with SVM (Fig. 2: 1); Self-training ENet (Fig. 2: 4); Balanced error rate (Fig. 2: 1, Fig. 2: 4); Recursive Feature Elimination (Fig. 2: 5) (all bulk) | Multi Omics Supervised Integrative Clustering (Fig. 2: 14) (bulk); Lasso penalization in regression-type models (bulk) | |
Cell type prediction | Projection with LIGER [17] (Fig. 2: 2); ssEnet (Fig. 2: 4) (all bulk) | |||
Spatial analysis | Hidden Markov random field; Voronoi tessellation (Fig. 2: 1) (bulk) | Spatial autocorrelation with Moran’s Index (Fig. 2: 7, Fig. 2: 10); Selection of spatial discriminative features: Moran’s Index, NN correlation, Cell type, interaction composition, L function (Fig. 2: 10) (all bulk) | ||
Inclusion of additional information | Survival prediction: Cox regression based on spatial features (Fig. 2: 10) | Include annotated hypersensitive sites index to anchor new/unseen data from DNase-seq, (sc)ATAC-seq, scNMT-seq for de novo peak calling (bulk<sup>a</sup>) |
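
The spatial analysis row lists Moran’s Index as a measure of spatial autocorrelation. The table does not say how the hackathon participants computed it, so the following is only a minimal sketch of the textbook statistic. The function name `morans_i`, the `radius` parameter, and the binary distance-band weights are assumptions made for illustration; other weighting schemes (for example k-nearest neighbors) are equally common.

```python
import numpy as np

def morans_i(values, coords, radius):
    """Moran's I for one feature measured at spatial locations.

    Assumption (not from the source): binary weights, w_ij = 1 when two
    distinct locations lie within `radius` of each other, else 0.
    """
    x = np.asarray(values, dtype=float)
    pts = np.asarray(coords, dtype=float)

    # Pairwise Euclidean distances between all locations.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Neighbors within the radius; zero-distance pairs (including self) are excluded.
    w = ((dist <= radius) & (dist > 0)).astype(float)

    z = x - x.mean()              # centered feature values
    total_weight = w.sum()
    variance_term = (z ** 2).sum()
    if total_weight == 0 or variance_term == 0:
        raise ValueError("Moran's I is undefined: no neighbors or constant values")

    # I = (n / W) * sum_ij w_ij z_i z_j / sum_i z_i^2
    return (len(x) / total_weight) * (z @ w @ z) / variance_term

# Four cells on a line, one unit apart.
line = [[0, 0], [1, 0], [2, 0], [3, 0]]
clustered = morans_i([0, 0, 1, 1], line, radius=1.0)    # similar values adjacent
alternating = morans_i([0, 1, 0, 1], line, radius=1.0)  # dissimilar values adjacent
```

In this toy example the clustered pattern gives a positive value and the alternating pattern a negative one, which is the behavior that makes the statistic useful for selecting spatially discriminative features.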