
Table 1 Independent and informative covariates used in case studies

From: A practical guide to methods controlling false discoveries in computational biology

| Case study | Covariates found to be independent and informative |
| --- | --- |
| Microbiome | Ubiquity: the proportion of samples in which the feature is present. In microbiome data, it is common for many features to go undetected in many samples. |
| | Mean nonzero abundance: the average abundance of a feature among those samples in which it was detected. We note that this did not seem as informative as ubiquity in our case studies. |
| GWAS | Minor allele frequency: the proportion of the population that exhibits the less common allele (ranges from 0 to 0.5); it represents the rarity of a particular variant. |
| | Sample size (for meta-analyses): the number of samples for which the particular variant was measured. |
| Gene set analyses | Gene set size: the number of genes included in the particular set. Note, however, that this covariate is not independent under the null for over-representation tests (see Additional file 1: Supplementary Results). |
| Bulk RNA-seq | Mean gene expression: the average expression level (calculated from normalized read counts) for a particular gene. |
| Single-cell RNA-seq | Mean nonzero gene expression: the average expression level (calculated from normalized read counts) for a particular gene, excluding zero counts. |
| | Detection rate: the proportion of samples in which the gene is detected. In single-cell RNA-seq, it is common for many genes to go undetected in many samples. |
| ChIP-seq | Mean read depth: the average coverage (calculated from normalized read counts) for the region. |
| | Window size: the length of the region. |
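
Several of the covariates above are simple summaries of a per-feature vector of measurements. The following is a minimal sketch of how three of them could be computed, not the code used in the case studies. The function names and toy data are hypothetical. The minor allele frequency function assumes a biallelic site with diploid genotypes coded as allele counts (0, 1, or 2), which is a common convention rather than something stated in the table.

```python
from statistics import mean


def ubiquity(values):
    """Proportion of samples in which the feature is detected (value > 0).

    The same calculation gives the detection rate for single-cell RNA-seq.
    """
    return sum(1 for v in values if v > 0) / len(values)


def mean_nonzero(values):
    """Average over samples with a nonzero value; None if never detected."""
    nonzero = [v for v in values if v > 0]
    return mean(nonzero) if nonzero else None


def minor_allele_frequency(allele_counts):
    """Frequency of the less common allele at a biallelic site.

    Assumes diploid genotypes coded as counts (0, 1, or 2) of one allele,
    so each individual contributes two alleles to the denominator.
    """
    freq = sum(allele_counts) / (2 * len(allele_counts))
    return min(freq, 1 - freq)  # always falls between 0 and 0.5


# Hypothetical toy data: one feature measured in five samples.
abundances = [0, 4, 0, 6, 2]
print(ubiquity(abundances))      # 0.6
print(mean_nonzero(abundances))  # 4

# Hypothetical genotypes for six individuals at one variant.
genotypes = [0, 1, 2, 2, 2, 1]
print(minor_allele_frequency(genotypes))
```

Returning `None` for a feature that is never detected keeps an undefined mean from being confused with a true value of zero; in practice such features are often filtered out before testing.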