From: A practical guide to methods controlling false discoveries in computational biology
Case study | Covariates found to be independent and informative |
---|---|
Microbiome | Ubiquity: the proportion of samples in which the feature is present. In microbiome data, it is common for many features to go undetected in many samples. |
 | Mean nonzero abundance: the average abundance of a feature among those samples in which it was detected. We note that this did not seem as informative as ubiquity in our case studies. |
GWAS | Minor allele frequency: the proportion of the population that exhibits the less common allele (ranges from 0 to 0.5). It represents the rarity of a particular variant. |
 | Sample size (for meta-analyses): the number of samples for which the particular variant was measured. |
Gene set analyses | Gene set size: the number of genes included in the particular set. Note, however, that this covariate is not independent under the null for over-representation tests (see Additional file 1: Supplementary Results). |
Bulk RNA-seq | Mean gene expression: the average expression level (calculated from normalized read counts) for a particular gene. |
Single-cell RNA-seq | Mean nonzero gene expression: the average expression level (calculated from normalized read counts) for a particular gene, excluding zero counts. |
 | Detection rate: the proportion of samples in which the gene is detected. In single-cell RNA-seq it is common for many genes to go undetected in many samples. |
ChIP-seq | Mean read depth: the average coverage (calculated from normalized read counts) for the region. |
 | Window size: the length of the region. |
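
Several of the covariates in the table are simple per-feature summaries, so a minimal sketch of how they could be computed may help. The function names below are hypothetical and do not come from the article. The sketch assumes that a feature counts as "detected" in a sample when its value is greater than zero, and that genotypes for a biallelic variant are coded as the number of alternate alleles per individual (0, 1, or 2), so that allele frequency is taken over 2n chromosomes.

```python
from typing import Optional, Sequence


def detection_rate(values: Sequence[float]) -> float:
    """Proportion of samples in which the feature is present (value > 0).

    Corresponds to "ubiquity" (microbiome) and "detection rate"
    (single-cell RNA-seq) in the table. The > 0 rule is an assumption.
    """
    if not values:
        raise ValueError("need at least one sample")
    return sum(1 for v in values if v > 0) / len(values)


def mean_nonzero(values: Sequence[float]) -> Optional[float]:
    """Average value among samples in which the feature was detected.

    Returns None when the feature was never detected, because the
    mean over an empty set of samples is undefined.
    """
    detected = [v for v in values if v > 0]
    return sum(detected) / len(detected) if detected else None


def minor_allele_frequency(alt_allele_counts: Sequence[int]) -> float:
    """Frequency of the less common allele of a biallelic variant.

    Assumes each entry is the number of alternate alleles (0, 1, or 2)
    carried by one diploid individual. The result lies in [0, 0.5].
    """
    if not alt_allele_counts:
        raise ValueError("need at least one individual")
    alt_freq = sum(alt_allele_counts) / (2 * len(alt_allele_counts))
    return min(alt_freq, 1 - alt_freq)


if __name__ == "__main__":
    abundances = [0, 4, 0, 6]  # one feature across four samples (toy data)
    print(detection_rate(abundances))            # 0.5
    print(mean_nonzero(abundances))              # 5.0
    print(minor_allele_frequency([0, 1, 2, 2]))  # 0.375
```

Each function takes the values of a single feature across samples. In practice the same summaries would be computed for every feature, and the resulting vector supplied as the covariate alongside the p-values.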