Table 1 Independent and informative covariates used in case studies

From: A practical guide to methods controlling false discoveries in computational biology

| Case study | Covariates found to be independent and informative |
| --- | --- |
| Microbiome | Ubiquity: the proportion of samples in which the feature is present. In microbiome data, it is common for many features to go undetected in many samples. |
| | Mean nonzero abundance: the average abundance of a feature among those samples in which it was detected. We note that this did not seem as informative as ubiquity in our case studies. |
| GWAS | Minor allele frequency: the proportion of the population that exhibits the less common allele (ranges from 0 to 0.5); it represents the rarity of a particular variant. |
| | Sample size (for meta-analyses): the number of samples for which the particular variant was measured. |
| Gene set analyses | Gene set size: the number of genes included in the particular set. Note, however, that this covariate is not independent under the null for over-representation tests (see Additional file 1: Supplementary Results). |
| Bulk RNA-seq | Mean gene expression: the average expression level (calculated from normalized read counts) for a particular gene. |
| Single-cell RNA-seq | Mean nonzero gene expression: the average expression level (calculated from normalized read counts) for a particular gene, excluding zero counts. |
| | Detection rate: the proportion of samples in which the gene is detected. In single-cell RNA-seq, it is common for many genes to go undetected in many samples. |
| ChIP-seq | Mean read depth: the average coverage (calculated from normalized read counts) for the region. |
| | Window size: the length of the region. |
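
Several of the covariates in Table 1 are simple per-feature summaries, and a short sketch may make the definitions concrete. The Python below is an illustration only, not the code used in the case studies: the function names, the toy inputs, and the convention that a count greater than zero means "present" or "detected" are all assumptions. The minor allele frequency function additionally assumes diploid genotypes coded as the number of copies of one allele (0, 1, or 2) and computes the frequency at the allele level.

```python
from typing import List, Optional


def ubiquity(counts: List[float]) -> float:
    """Proportion of samples in which the feature is present (count > 0).

    The same calculation gives the detection rate for a gene in
    single-cell RNA-seq. The "> 0" threshold is an assumption.
    """
    if not counts:
        raise ValueError("counts must contain at least one sample")
    return sum(1 for c in counts if c > 0) / len(counts)


def mean_nonzero(counts: List[float]) -> Optional[float]:
    """Average value among the samples in which the feature was detected.

    Returns None when the feature was not detected in any sample.
    """
    detected = [c for c in counts if c > 0]
    if not detected:
        return None
    return sum(detected) / len(detected)


def minor_allele_frequency(genotypes: List[int]) -> float:
    """Frequency of the less common allele, between 0 and 0.5.

    Assumes diploid individuals, each coded as the number of copies
    (0, 1, or 2) of one of the two alleles.
    """
    if not genotypes:
        raise ValueError("genotypes must contain at least one individual")
    allele_freq = sum(genotypes) / (2 * len(genotypes))
    return min(allele_freq, 1 - allele_freq)


# Toy data (hypothetical): one feature measured in eight samples.
feature_counts = [0, 4, 0, 6, 2, 0, 0, 8]
print(ubiquity(feature_counts))      # 0.5
print(mean_nonzero(feature_counts))  # 5.0

# Toy genotypes (hypothetical): eight diploid individuals.
genotypes = [0, 1, 0, 2, 0, 1, 0, 0]
print(minor_allele_frequency(genotypes))  # 0.25
```

In practice these summaries would be computed for every feature (row) of a count or genotype matrix, giving one covariate value per hypothesis test.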