Fig. 1 | Genome Biology

Fig. 1

From: scIBD: a self-supervised iterative-optimizing model for boosting the detection of heterotypic doublets in single-cell chromatin accessibility data

Overview of scIBD. a The formation of doublets in droplet-based scCAS. The input of scIBD is the cell-by-bin or cell-by-peak matrix, which supports customized quality control and peak calling. b The scheme of scIBD. We present a pseudo-droplet simulation strategy in which clustering is performed first and a set of artificial doublets is then simulated; each artificial profile is the union of droplet profiles sampled with cluster-based weights. A reference vector over the raw droplets is initialized with all values set to zero, indicating that no droplet initially contributes to doublet detection. In each iteration, scIBD computes doublet scores for all raw droplets based on their similarity to their nearest neighbors (KNN graph) and their previous scores (reference vector). Droplets with high doublet scores are detected as doublets and no longer participate in clustering in subsequent iterations. The artificial doublets are re-created in every iteration from the new clustering results. The reference vector is updated with the normalized doublet scores and thereby influences doublet detection in subsequent iterations through the aggregation of doublet scores. c The performance evaluations of scIBD. We comprehensively benchmarked scIBD on three categories of datasets (fully synthetic, real, and semi-synthetic) derived from various scCAS data. Downstream biological analyses were conducted to further demonstrate the efficacy of scIBD.
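
The loop described in panel b can be illustrated with a minimal NumPy sketch. This is not the scIBD implementation. The function names, the tiny k-means stand-in for clustering, the cluster-size sampling weights, the Euclidean KNN, the additive score aggregation, and the quantile cutoff are all assumptions made here for illustration; the caption does not specify any of them.

```python
import numpy as np

def kmeans_labels(X, n_clusters, rng, n_steps=10):
    """Tiny k-means, a stand-in for the clustering step (assumption)."""
    centers = X[rng.choice(len(X), n_clusters, replace=False)].astype(float)
    for _ in range(n_steps):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def simulate_doublets(X, labels, n_doublets, rng):
    """Artificial doublets: element-wise union of two droplets from different clusters."""
    clusters, counts = np.unique(labels, return_counts=True)
    if len(clusters) < 2:
        raise ValueError("need at least two non-empty clusters")
    weights = counts / counts.sum()  # assumed weighting: cluster size
    out = np.empty((n_doublets, X.shape[1]), dtype=X.dtype)
    for i in range(n_doublets):
        c1, c2 = rng.choice(clusters, size=2, replace=False, p=weights)
        a = rng.choice(np.flatnonzero(labels == c1))
        b = rng.choice(np.flatnonzero(labels == c2))
        out[i] = np.maximum(X[a], X[b])  # union of two binary accessibility profiles
    return out

def knn_doublet_fraction(X, D, k):
    """Fraction of each raw droplet's k nearest neighbours that are artificial doublets."""
    Xf = X.astype(float)
    pool = np.vstack([Xf, D.astype(float)])
    sq = (pool ** 2).sum(axis=1)
    dist = sq[: len(X), None] + sq[None, :] - 2.0 * Xf @ pool.T  # squared Euclidean
    idx = np.arange(len(X))
    dist[idx, idx] = np.inf  # a droplet is not its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]
    return (nearest >= len(X)).mean(axis=1)

def detect_doublets(X, n_iter=3, k=10, n_clusters=2, rate=0.1, seed=0):
    """Iterative scoring loop; `rate` is a hypothetical expected doublet rate."""
    rng = np.random.default_rng(seed)
    n = len(X)
    reference = np.zeros(n)            # all zeros at the start, as in the caption
    flagged = np.zeros(n, dtype=bool)
    scores = np.zeros(n)
    for _ in range(n_iter):
        kept = np.flatnonzero(~flagged)  # detected doublets are left out of clustering
        labels = kmeans_labels(X[kept], n_clusters, rng)
        D = simulate_doublets(X[kept], labels, n_doublets=n, rng=rng)
        raw = knn_doublet_fraction(X, D, k) + reference  # assumed aggregation: sum
        span = raw.max() - raw.min()
        scores = (raw - raw.min()) / span if span > 0 else np.zeros(n)
        reference = scores               # normalized scores feed the next iteration
        flagged = (scores > 0) & (scores >= np.quantile(scores, 1 - rate))
    return scores, flagged
```

Calling `detect_doublets(X)` on a binary droplet-by-peak array returns one score per raw droplet and a boolean mask of the droplets flagged in the last iteration. The sketch does not guard against degenerate inputs, such as a clustering that leaves only one non-empty cluster.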
