Table 2 Overview of single-cell Hi-C data analysis methods

From: Normalization and de-noising of single-cell Hi-C data with BandNorm and scVI-3D

| Method | Resolution | Cell filtering | Locus pairs filtering | Sequencing depth normalization\(^*\) | Batch effect removal | Time (Lee2019 1Mb) | Memory (Lee2019 1Mb) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| CellScale | 1Mb | NULL | NULL | Each locus pair is normalized by dividing by the total IFs of each contact matrix and multiplying by 10,000 | NULL | < 15 min (23 cores) | 20G |
| BandScale [10] | 100kb | NULL | Keep <= 2Mb | Each locus pair is normalized by dividing by the average IFs of locus pairs at the same distance | NULL | < 15 min (23 cores) | 20G |
| BandNorm | 1Mb | NULL | NULL | Each locus pair is normalized by dividing by the total IFs at the same genomic distance (i.e., band) per cell and multiplying by the average total interaction frequency across cells at the same distance | Top batch-correlated PCA components other than the 1st and 2nd are removed | < 15 min (1 core) | 20G |
| scHiCluster [16] | 1Mb, 200kb | Keep cells with off-diagonal contacts > 5000 | Keep cells with >= x non-zero locus pairs, where x is chromosome size (Mb) | After neighborhood smoothing and random walk imputation, the top 20% interacting locus pairs are set to 1 and the rest to 0 | NULL | 8 h (23 cores) | 204G |
| scHiC Topics [15] | 500kb | NULL | Keep <= 10Mb | Cell by locus pair matrix is binarized; (optional) the bottom 50% of cells based on total IFs are removed and the rest are downsampled to remove the sequencing depth effect | NULL | 36 h (1 core) | 14.6G |
| Higashi [17] | 1Mb | Keep cells with >= 2,000 interactions that have genomic span greater than 500kb | NULL | Contact maps can be smoothed via linear convolution to reach similar sparsity; quantile normalization can be applied to the matrix | NULL | 24 h (23 cores, CPU); 1 h (23 cores, GPU) | 253G (CPU); 110G (GPU) |
| CellScale+CNN | 1Mb | NULL | Keep cells with >= x/6 non-zero locus pairs, where x is chromosome size (Mb) | Each locus pair is normalized by dividing by the average IFs of locus pairs at the same distance | NULL | 5 h (23 cores) | 92G |
| scVI-3D | 1Mb | NULL | (Optional) Diagonals of the contact matrix with high contact count variation are selected | A variational autoencoder with a Gamma-Poisson mixture at each band with a separate size factor is fitted | The batch factor is incorporated into the autoencoder at each distance | 20 min (23 cores, GPU) | 110G (GPU) |
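
The three closed-form depth normalizations in the table (CellScale, BandScale, BandNorm) can be sketched in a few lines of NumPy. This is only an illustration of the formulas as worded in the table, not the code of any of the listed packages. The function names `cell_scale`, `band_scale`, and `band_norm` are hypothetical. The sketch assumes dense, symmetric contact matrices for a single chromosome stacked as an array of shape (cells, bins, bins), takes the total interaction frequency (IF) over the full symmetric matrix, and leaves bands with zero total unchanged at zero.

```python
import numpy as np


def cell_scale(cells, scale=10_000):
    """CellScale: divide each cell's matrix by its total IF, multiply by `scale`.

    Assumes every cell has a nonzero total.
    """
    cells = np.asarray(cells, dtype=float)
    totals = cells.sum(axis=(1, 2), keepdims=True)
    return cells / totals * scale


def band_scale(cell):
    """BandScale: divide each locus pair by the mean IF of its band (one cell)."""
    cell = np.asarray(cell, dtype=float)
    n = cell.shape[0]
    out = np.zeros_like(cell)
    for d in range(n):  # d = genomic distance in bins
        band = np.diagonal(cell, offset=d)
        mean = band.mean()
        if mean > 0:
            idx = np.arange(n - d)
            out[idx, idx + d] = band / mean
            out[idx + d, idx] = band / mean  # keep the matrix symmetric
    return out


def band_norm(cells):
    """BandNorm: per band, divide by the cell's band total, then multiply by
    the average band total across cells."""
    cells = np.asarray(cells, dtype=float)
    n = cells.shape[1]
    out = np.zeros_like(cells)
    for d in range(n):
        # bands has shape (n_cells, n - d): one band per cell
        bands = np.diagonal(cells, offset=d, axis1=1, axis2=2)
        totals = bands.sum(axis=1, keepdims=True)
        avg_total = totals.mean()
        safe = np.where(totals > 0, totals, 1.0)  # avoid division by zero
        normed = bands / safe * avg_total
        idx = np.arange(n - d)
        out[:, idx, idx + d] = normed
        out[:, idx + d, idx] = normed
    return out
```

Under these assumptions, cells that differ only by a global depth factor become identical after `band_norm`, while the band totals stay on the scale of the original data; `band_scale` instead sets every non-empty band to mean 1, which removes the distance decay along with the depth effect.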