Fig. 1 | Genome Biology

Fig. 1

From: deMULTIplex2: robust sample demultiplexing for scRNA-seq

An overview of the deMULTIplex2 algorithm. A Illustration of how deMULTIplex2 models tag cross-contamination in a simple two-sample multiplexing experiment. Each sample is labeled with tag A or B. Pooling the samples for single-cell capture allows floating tags to bind to cells or to be captured by droplets. These contaminating tag counts are modeled by fitting two GLM-NB models in two separate spaces, and the sample identity is inferred using the EM algorithm. The y = x line is shown in black for this panel and for panels C, D, and F. B Scatter plot illustrating the association between total tag UMI count and number of detected genes in the four-cell-line dataset from Stoeckius et al. [3]. C Plot of the relationship between log(N + 𝐶) and log(N) across different values of 𝐶. For a typical experiment, the total tag count of cells is concentrated within a range of one to two orders of magnitude (such as the range of 100 to 1000 highlighted by the dashed line). D Simulated tag count distribution for scenarios with zero ambient contamination or zero cell-bound contamination. E Cosine similarity with the canonical vector plotted against the tag count for a given tag (tag B) using simulated data. Several existing methods look for a bimodal distribution in the count dimension, while deMULTIplex2 takes advantage of the separation between positive and negative cells in the cosine dimension to initialize EM. F (left) The original UMI count of a given tag plotted against total tag count for simulated data. F (right) RQRs computed by deMULTIplex2 for the same simulated data plotted against total tag count. N: negative cells, P: positive cells, P(d): doublets positive for tag B. The RQRs are plotted against the normal quantiles for true negative cells and predicted negative cells.
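The cosine dimension described for panel E can be illustrated with a minimal sketch. This is not the deMULTIplex2 implementation: the function name `cosine_to_canonical` and the toy counts are hypothetical, and the canonical vector is assumed here to be the one-hot vector for the tag of interest, so the cosine similarity reduces to that tag's count divided by the Euclidean norm of the cell's tag-count vector.

```python
import math

def cosine_to_canonical(tag_counts, tag_index):
    """Cosine similarity between a cell's tag-count vector and the
    one-hot (canonical) vector for the tag at `tag_index`.

    Assumption: the canonical vector is one-hot, so the dot product is
    simply the count of that tag and the canonical vector has norm 1.
    """
    norm = math.sqrt(sum(c * c for c in tag_counts))
    if norm == 0:
        return 0.0  # no tag counts at all: treat as dissimilar to every tag
    return tag_counts[tag_index] / norm

# Hypothetical toy cells, each given as [tag A count, tag B count].
cells = {
    "low_count_positive_for_B": [1, 30],
    "contaminated_negative_for_B": [300, 40],
    "doublet": [250, 260],
}

# Similarity of each cell to the canonical vector of tag B (index 1).
cos_b = {name: cosine_to_canonical(v, 1) for name, v in cells.items()}

for name, value in cos_b.items():
    print(f"{name}: tag B count = {cells[name][1]}, cosine = {value:.3f}")
```

In this toy example the contaminated negative cell carries more tag B counts (40) than the low-count positive cell (30), so a threshold on raw counts would mislabel one of them, whereas their cosine similarities are far apart. The doublet falls in between.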
