From: Normalization and de-noising of single-cell Hi-C data with BandNorm and scVI-3D
Method | Resolution | Cell filtering | Locus pairs filtering | Sequencing depth normalization\(^*\) | Batch effect removal | Time (Lee2019 1Mb) | Memory (Lee2019 1Mb) |
---|---|---|---|---|---|---|---|
CellScale | 1Mb | NULL | NULL | Each locus pair is normalized by dividing by the total IFs of its contact matrix and multiplying by 10,000 | NULL | < 15 min (23 cores) | 20G |
BandScale [10] | 100kb | NULL | Keep <= 2Mb | Each locus pair is normalized by dividing by the average IFs of locus pairs at the same distance | NULL | < 15 min (23 cores) | 20G |
BandNorm | 1Mb | NULL | NULL | Each locus pair is normalized by dividing by the total IFs at the same genomic distance (i.e., band) per cell and multiplying by the average total interaction frequency across cells at the same distance | Top batch-correlated PCA components other than the 1st and 2nd are removed | < 15 min (1 core) | 20G |
scHiCluster [16] | 1Mb, 200kb | Keep cells with off-diagonal contacts > 5000 | Keep cells with >= x non-zero locus pairs, where x is the chromosome size (Mb) | After neighborhood smoothing and random walk imputation, the top 20% of interacting locus pairs are set to 1 and the rest to 0 | NULL | 8 h (23 cores) | 204G |
scHiC Topics [15] | 500kb | NULL | Keep <= 10Mb | Cell by locus pair matrix is binarized; (optional) the bottom 50% of cells by total IFs are removed and the rest are downsampled to remove the sequencing depth effect | NULL | 36 h (1 core) | 14.6G |
Higashi [17] | 1Mb | Keep cells with >= 2,000 interactions that have a genomic span greater than 500kb | NULL | Contact maps can be smoothed via linear convolution to reach similar sparsity; quantile normalization can be applied to the matrix | NULL | 24 h (23 cores, CPU); 1 h (23 cores, GPU) | 253G (CPU); 110G (GPU) |
CellScale+CNN | 1Mb | NULL | Keep cells with >= x/6 non-zero locus pairs, where x is the chromosome size (Mb) | Each locus pair is normalized by dividing by the average IFs of locus pairs at the same distance | NULL | 5 h (23 cores) | 92G |
scVI-3D | 1Mb | NULL | (Optional) Diagonals of the contact matrix with high contact count variation are selected | A variational autoencoder with a Gamma–Poisson mixture at each band with a separate size factor is fitted | The batch factor is incorporated into the autoencoder at each distance | 20 min (23 cores, GPU) | 110G (GPU) |
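
The two simplest depth normalizations in the table, CellScale and BandNorm, can be illustrated with a minimal sketch. It assumes each cell is stored as a sparse upper-triangular contact map, a dict mapping `(bin_i, bin_j)` with `bin_i <= bin_j` to an interaction frequency (IF). The function names `cell_scale`, `band_totals`, and `band_norm` and this data layout are hypothetical choices for illustration, not the API of the BandNorm package, and the batch-correction step (removal of batch-correlated PCA components) is not shown.

```python
from collections import defaultdict

# Assumed layout: one cell = {(bin_i, bin_j): IF}, with bin_i <= bin_j.


def cell_scale(cell, scale=10_000):
    """CellScale row: divide by the cell's total IFs, multiply by `scale`."""
    total = sum(cell.values())
    if total == 0:
        return dict(cell)  # nothing to scale in an empty cell
    return {pair: count / total * scale for pair, count in cell.items()}


def band_totals(cell):
    """Sum the IFs of one cell within each band (genomic distance j - i)."""
    totals = defaultdict(float)
    for (i, j), count in cell.items():
        totals[j - i] += count
    return totals


def band_norm(cells):
    """BandNorm row: divide by the per-cell band total, then multiply by
    the average band total across all cells at the same distance."""
    per_cell = [band_totals(cell) for cell in cells]
    bands = set().union(*per_cell)
    mean_total = {
        d: sum(t.get(d, 0.0) for t in per_cell) / len(cells) for d in bands
    }
    normalized = []
    for cell, totals in zip(cells, per_cell):
        out = {}
        for (i, j), count in cell.items():
            d = j - i
            # A band whose total is zero contains only zeros; keep them as 0.
            out[(i, j)] = count / totals[d] * mean_total[d] if totals[d] else 0.0
        normalized.append(out)
    return normalized


# Two toy cells that differ only in sequencing depth on the diagonal band.
cells = [
    {(0, 0): 4, (1, 1): 4, (0, 1): 2},
    {(0, 0): 1, (1, 1): 1, (0, 1): 2},
]
normalized = band_norm(cells)
```

In this toy input the two cells differ only in depth on band 0, so after `band_norm` every band of every cell sums to the across-cell average for that band. `cell_scale` instead fixes one global total per cell, so it leaves the distance-dependent profile of each cell untouched.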