Fig. 1 | Genome Biology

Fig. 1

From: CoCoA-diff: counterfactual inference for single-cell gene expression analysis


Counterfactual confounder adjustment for single-cell differential gene expression analysis (CoCoA-diff). a Hierarchical (nested) structure of single-cell gene expression data. A case-control study includes tens of individuals. Each individual (i) contains a heterogeneous mixture of multiple cell types. Single-cell technology measures a thousand genes in each cell (j). b This work addresses a specific causal inference problem in genomics research. We seek to prioritize genes causally modulated by disease status, not the genes affecting the predisposition to and risk of disease development. c Overview of the CoCoA-diff approach (see Methods for details). Y: gene expression matrix. Y(0): counterfactual data with disease W = 0. Y(1): counterfactual data with disease W = 1. β: Poisson regression coefficient. δ: residual effect. ρ: sequencing depth. μ: shared confounding effect. d Data generation scheme for simulation studies. We simulate 50 causal and 9,950 non-causal genes, with or without disease-causing mechanisms (an edge between W and λ). Wi: disease label assignment for an individual i. Xi: confounding effects for an individual i. λgi: unobserved gene expression for a gene g of an individual i as a function of X and W. Ygj: realization of cell-level gene expression of a gene g with a cell j-specific sequencing depth ρj (stochastically sampled from a Gamma distribution). Here, we simulated five different X variables. e CoCoA-diff accurately estimates shared confounder variables (μg), showing a significantly higher level of correlation with true confounding effects on non-causal genes than a pseudo-bulk analysis. f CoCoA-diff accurately estimates disease-causing effects (δg), showing high correlation with true differential effects on causal genes. g Illustration of the CoCoA-diff approach, using APOE in microglia as an example. HC, healthy control. AD, Alzheimer’s disease. μ, shared confounding effect; δ, residual differential effect. For clear visualization, we omitted samples (individuals) with zero read counts observed for the APOE gene in the microglial cell type.
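
The data generation scheme in panel d can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' simulation code: the caption specifies the 50 causal and 9,950 non-causal genes, the five X variables, the dependence of λ on X and W, and the Gamma-sampled sequencing depth ρj, but the log-linear form of λ, the logistic link from X to W, the Poisson sampling of Y, and every numeric parameter below (number of individuals, cells per individual, effect sizes, Gamma shape and scale) are assumptions introduced here. The function name `simulate` is hypothetical.

```python
import numpy as np

def simulate(n_ind=40, n_genes=10_000, n_causal=50, n_conf=5,
             cells_per_ind=20, seed=0):
    """Hypothetical sketch of the panel d scheme; parameters are assumed."""
    rng = np.random.default_rng(seed)

    # X_i: confounding variables for each individual (five, as in the caption).
    X = rng.normal(size=(n_ind, n_conf))

    # W_i: disease label, influenced by the confounders.
    # The logistic link is an assumption, not taken from the caption.
    prob = 1.0 / (1.0 + np.exp(-(X @ rng.normal(size=n_conf))))
    W = rng.binomial(1, prob)

    # Confounder effects act on every gene (assumed effect scale).
    alpha = rng.normal(scale=0.3, size=(n_conf, n_genes))

    # Disease effects: the edge W -> lambda exists only for causal genes.
    delta = np.zeros(n_genes)
    delta[:n_causal] = rng.normal(scale=1.0, size=n_causal)

    # lambda_gi: unobserved expression as a function of X and W
    # (log-linear form assumed). Shape: (n_ind, n_genes).
    lam = np.exp(X @ alpha + np.outer(W, delta))

    # rho_j: cell-specific sequencing depth sampled from a Gamma distribution.
    ind_of_cell = np.repeat(np.arange(n_ind), cells_per_ind)
    rho = rng.gamma(shape=2.0, scale=0.5, size=ind_of_cell.size)

    # Y_gj: cell-level counts (Poisson sampling assumed).
    # Shape: (n_cells, n_genes).
    Y = rng.poisson(rho[:, None] * lam[ind_of_cell])

    return {"Y": Y, "W": W, "X": X, "lam": lam, "rho": rho,
            "delta": delta, "ind_of_cell": ind_of_cell}
```

Calling `simulate(n_ind=10, n_genes=200, cells_per_ind=5)` returns a small count matrix with one row per cell, together with the ground-truth `delta` vector against which estimated disease effects could be compared.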
