Fig. 3 | Genome Biology

Fig. 3

From: Rapid scoring of genes in microbial pan-genome-wide association studies with Scoary

Pairwise comparisons examples. a Fisher’s exact test for this sample would be highly significant (p = 2.8E-6); however, inspection of the tree makes it clear that there are lineage-specific interdependencies, which violate the randomness model implicit in Fisher’s test. The top samples, which display 1–1, are more closely related to each other than to the bottom samples, which display 0–0, and vice versa. The most parsimonious scenario is a single introduction (or loss) of the gene and the trait on the root branch. This is illustrated by the pairwise comparisons algorithm, which can find a maximum of 1 contrasting pair (0–0|1–1). b Contrast this with (a). This tree has a maximum of ten contrasting pairs, all 0–0|1–1, which indicates a minimum of ten transitions between 0–0 and 1–1 in the evolutionary history of the sample. In this situation, we should be more convinced that there is a true association between this gene and the trait. The associated p value of the binomial test (the statistical test in the pairwise comparisons algorithm) would be 0.0019. Note that the gene-trait matrix is identical to the one in (a), only shuffled to correspond to the tree leaves. c Tree with a maximum of 7 non-intersecting, contrasting pairs. In this picking, all pairs are 1–1|0–0, giving a binomial test p value of 0.015; this is a “best” picking of pairs. d Another picking of 7 contrasting pairs from the tree in (c), but this set includes a 1–0|0–1 pair, corresponding to a p value of 0.125. This represents a “worst” picking of pairs from the tree. Thus, the full range of pairwise comparison p values for the gene-trait-phylogeny combination in (c) and (d) would be 0.015–0.125.
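The binomial p values quoted in panels (b) to (d) can be checked with a few lines of Python. This is a minimal sketch, not Scoary's own code: it assumes a two-sided exact binomial test with a null probability of 0.5 that a contrasting pair supports the association, and the function name `binomial_two_sided_p` is hypothetical. Under that assumption the results match the caption's values up to rounding.

```python
from math import comb


def binomial_two_sided_p(n_pairs: int, n_supporting: int) -> float:
    """Two-sided exact binomial p value with null probability 0.5.

    n_pairs:      number of contrasting pairs picked from the tree
    n_supporting: how many of those pairs point in the same direction
                  (e.g. 0-0|1-1); the rest point the opposite way

    Assumption: the null is symmetric (p = 0.5), so the two-sided
    p value is twice the tail at least as extreme as the observation.
    """
    # Use the larger side so the upper tail is always the extreme one.
    k = max(n_supporting, n_pairs - n_supporting)
    tail = sum(comb(n_pairs, i) for i in range(k, n_pairs + 1)) / 2 ** n_pairs
    return min(1.0, 2 * tail)


# Panel (b): 10 contrasting pairs, all in the same direction.
p_b = binomial_two_sided_p(10, 10)
# Panel (c): "best" picking, 7 of 7 pairs in the same direction.
p_c = binomial_two_sided_p(7, 7)
# Panel (d): "worst" picking, 6 of 7 pairs in the same direction.
p_d = binomial_two_sided_p(7, 6)

print(f"b: {p_b:.4f}  c: {p_c:.4f}  d: {p_d:.4f}")
```

The spread between the values for (c) and (d) shows why a range is reported: the same tree and gene-trait matrix admit several maximal sets of contrasting pairs, and the p value depends on which set is picked.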
