Table 2. Different methods were used in the hackathons and are also available as reproducible vignettes

From: Community-wide hackathons to identify central themes in single-cell multi-omics

Common challenges

Tasks

Hackathon 1 (spatial transcriptomics)

Hackathon 2 (spatial proteomics)

Hackathon 3 (scNMT-seq)

Pre-processing

Normalization & data transformation

Data distribution checks (Fig. 2: 1, Fig. 2: 4)

Highly variable gene selection (Fig. 2: 5)

Variance Stabilization Normalization [16] (Fig. 2: 8)

Arcsinh transformation (Fig. 2: 9)

Inverse transformation (Fig. 2: 9)

Selection of patients (Fig. 2: 9)

Summaries of DNA measurements (input data provided in hackathon)

Managing differences in scale

Data integration

LIGER [17] (Fig. 2: 2) (sc)

ComBat (Fig. 2: 4) (bulk)

Projection methods MFA, sGCCA [18] (Fig. 2: 4ᵃ) (bulk)

UMAP/tSNE (Fig. 2: 2) (sc)

Multiblock PCA [19]

Weighting matrices based on their similarities: STATIS, MFA (Fig. 2: 8) (bulkᵃ)

Scale MIBI-TOF to the range of CyTOF values (Fig. 2: 9)

LIGER [17] (Fig. 2: 13) (sc)

Projection method sGCCA [18] (Fig. 2: 11) (bulk)

Multi Omics Supervised Integrative Clustering with weights (Fig. 2: 14) (bulk)

Overlap

Cell overlap

(features not matching)

  

Dimension reduction and projection methods:

LIGER [17] (Fig. 2: 13) (sc)

sGCCA [18] (Fig. 2: 11) (bulk)

Partial feature overlap

(cells not matching)

 

Imputation:

Direct inversion with latent variables

Optimal transport to predict protein expression (Fig. 2: 10)

K-nearest neighbor averaging (Fig. 2: 9)

No imputation:

Biological Network Interactionᵃ

 

Partial cell overlap

(features not matching)

 

Multiblock PCA [19] (Fig. 2: 8ᵃ)

 

No cell overlap

(complete feature overlap)

 

Transfer cell type label with Random Forest (Fig. 2: 7)

LIGER [17] (Fig. 2: 13)

No cell overlap

(partial feature overlap)

 

Topic modeling to predict cell spatial co-location or spatial expression (Fig. 2: 9, partial feature overlap)

 

No overlap

 

RLQᵃ [20]

 

Generic approaches

Classification & feature selection

Backward selection with SVM (Fig. 2: 1)

Self-training ENet (Fig. 2: 4)

Balanced error rate (Fig. 2: 1, Fig. 2: 4)

Recursive Feature Elimination (Fig. 2: 5)

(all bulk)

 

Multi Omics Supervised Integrative Clustering (Fig. 2: 14) (bulk)

Lasso penalization in regression-type models (bulk)

Cell type prediction

Projection with LIGER [17] (Fig. 2: 2)

SVM (Fig. 2: 1, Fig. 2: 5)

ssEnet (Fig. 2: 4)

(all bulk)

  

Spatial analysis

Hidden Markov random field

Voronoi tessellation (Fig. 2: 1) (bulk)

Spatial autocorrelation with Moran’s Index (Fig. 2: 7, Fig. 2: 10)

Selection of spatial discriminative features:

Moran’s Index, NN correlation, Cell type, interaction composition, L function (Fig. 2: 10)

(all bulk)

 

Inclusion of additional information

 

Survival prediction: Cox regression based on spatial features (Fig. 2: 10)

Include annotated hypersensitive sites index to anchor new/unseen data from DNase-seq, (sc)ATAC-seq, and scNMT-seq for de novo peak calling (bulkᵃ)

ᵃ indicates that the method was not applied to the hackathon data; “bulk” indicates that the method was originally developed for bulk omics; “sc” indicates that the method was developed specifically for single-cell data; all other methods are generic.
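
Among the spatial analysis entries, the table lists spatial autocorrelation with Moran's Index. As a rough illustration only, the statistic can be sketched in plain Python as below. This is not the code from the hackathon vignettes: the function names `morans_i` and `chain_weights`, and the binary chain adjacency used as spatial weights, are assumptions made for this example.

```python
from typing import List, Sequence


def morans_i(values: Sequence[float], weights: Sequence[Sequence[float]]) -> float:
    """Global Moran's I for one feature measured at n locations.

    weights[i][j] is the spatial weight between locations i and j
    (illustrative input; diagonal entries are expected to be 0).
    """
    n = len(values)
    if n == 0 or len(weights) != n or any(len(row) != n for row in weights):
        raise ValueError("weights must be an n x n matrix matching values")

    mean = sum(values) / n
    dev = [v - mean for v in values]

    total_weight = sum(sum(row) for row in weights)
    denom = sum(d * d for d in dev)
    if total_weight == 0 or denom == 0:
        raise ValueError("Moran's I is undefined for zero weights or constant values")

    # Weighted cross-products of deviations over all ordered location pairs.
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / total_weight) * (cross / denom)


def chain_weights(n: int) -> List[List[float]]:
    """Binary adjacency for n cells on a line (assumed toy neighborhood)."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    w = chain_weights(4)
    print(morans_i([1, 1, 0, 0], w))  # clustered pattern, positive value
    print(morans_i([1, 0, 1, 0], w))  # alternating pattern, negative value
```

Positive values suggest that neighboring cells carry similar expression, while negative values suggest alternation. In practice the weights would come from the tissue geometry (for example nearest-neighbor or distance-based weights) rather than a toy chain.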