From: Statistics or biology: the zero-inflation controversy about scRNA-seq data
Key concepts | Definition | Nature |
---|---|---|
RNA polymerase | An enzyme that transcribes a DNA sequence into an RNA sequence | Biology |
mRNA degradation | The process of an mRNA sequence being destroyed | Biology |
Biological zero | Absence of mRNA of a gene in a cell | Biology |
GC-rich | Majority of the bases in a sequence are either cytosine (C) or guanine (G) | Biology |
Reverse transcription | Enzyme-mediated synthesis of a DNA molecule from an RNA template; a step to enable DNA sequencing | Sequencing technology |
cDNA | Complementary DNA (synthesized from reverse transcription) | Sequencing technology |
PCR | Polymerase chain reaction; a step to amplify cDNA copy number | Sequencing technology |
IVT | In vitro transcription amplification; a step to amplify cDNA copy number | Sequencing technology |
Sequence read | A short sequence read out by a sequencing machine | Sequencing technology |
UMI | Unique molecular identifier, which is used to correct amplification bias | Sequencing technology |
Non-biological zero | Absence of reads or UMIs of a gene in a cell in scRNA-seq data when the gene in fact has mRNAs in the cell | Sequencing technology |
Technical zero | Absence of reads or UMIs of a gene in a cell due to the library-preparation steps (e.g., cDNA synthesis) before cDNA amplification | Sequencing technology |
Sampling zero | Absence of reads or UMIs of a gene in a cell due to inefficient amplification and/or limited sequencing depth | Sequencing technology |
Dropouts | Various meanings in the literature | Ambiguous |
Excess zeros | Various meanings in the literature | Ambiguous |
Two-state gene expression model | A model that describes a gene’s switching between active and inactive states during transcription | Modeling |
Zero inflation | A statistical concept that depends on a specified statistical model | Modeling |
Poisson | A statistical model for counts; it requires the count variance to be equal to the count mean | Modeling |
Zero-inflated Poisson (ZIP) | A statistical model for counts; it allows for a larger proportion of zeros than Poisson does | Modeling |
Negative binomial (NB) | A statistical model for counts; it requires the count variance to be larger than the count mean | Modeling |
Zero-inflated negative binomial (ZINB) | A statistical model for counts; it allows for a larger proportion of zeros than NB does | Modeling |
Masking scheme | A way to mask a proportion of non-zero counts in a matrix to zeros | Modeling |
Differentially expressed (DE) gene | A gene that has a statistically significant difference in expression between two conditions (e.g., cell groups) | Modeling |
Impute | To change the zero counts in a matrix to non-zero counts | Modeling |
Binarize | To change the non-zero counts in a matrix to ones | Modeling |
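
The modeling entries above can be made concrete with a small sketch. It computes the probability of a zero count under the Poisson, NB, ZIP, and ZINB models, and illustrates the "binarize" and "masking scheme" operations on a toy count matrix. All function names are hypothetical. The NB parameterization (mean `mu` and dispersion `size`, so that variance = `mu + mu**2 / size`) and the uniform random choice of entries to mask are assumptions made for illustration; the table does not prescribe either.

```python
import math
import random


def poisson_zero_prob(mu):
    """P(count = 0) under a Poisson model with mean mu."""
    return math.exp(-mu)


def nb_zero_prob(mu, size):
    """P(count = 0) under an NB model with mean mu and dispersion `size`
    (assumed parameterization: variance = mu + mu**2 / size)."""
    return (size / (size + mu)) ** size


def zero_inflated_prob(base_zero_prob, pi):
    """P(count = 0) when a point mass at zero with weight pi is mixed with
    a base model (Poisson gives ZIP, NB gives ZINB)."""
    return pi + (1.0 - pi) * base_zero_prob


def binarize(counts):
    """Change every non-zero count to one; zeros stay zero."""
    return [[1 if c > 0 else 0 for c in row] for row in counts]


def mask(counts, proportion, seed=0):
    """Set a given proportion of the non-zero counts to zero.
    Entries are chosen uniformly at random (an illustrative assumption)."""
    rng = random.Random(seed)
    nonzero = [(i, j) for i, row in enumerate(counts)
               for j, c in enumerate(row) if c > 0]
    n_mask = round(proportion * len(nonzero))
    masked = [list(row) for row in counts]  # copy; the input is left unchanged
    for i, j in rng.sample(nonzero, n_mask):
        masked[i][j] = 0
    return masked


if __name__ == "__main__":
    mu, size, pi = 2.0, 1.0, 0.2
    p_pois = poisson_zero_prob(mu)
    p_nb = nb_zero_prob(mu, size)
    print("Poisson:", p_pois)
    print("NB:     ", p_nb)
    print("ZIP:    ", zero_inflated_prob(p_pois, pi))
    print("ZINB:   ", zero_inflated_prob(p_nb, pi))

    toy = [[0, 3, 1], [5, 0, 2]]
    print("binarized:", binarize(toy))
    print("masked:   ", mask(toy, proportion=0.5))
```

For the same mean, the NB model already assigns more probability to zero than the Poisson model does. This is one way to see why "zero inflation" depends on the specified model: a proportion of zeros that looks excessive relative to Poisson may be unremarkable relative to NB.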