Fig. 7 | Genome Biology

Fig. 7

From: PCA outperforms popular hidden variable inference methods for molecular QTL mapping

PCA provides insight into the choice of K. Recall from Section 2.3 that PEER factors are almost identical to PCs in GTEx eQTL data [10]. Therefore, for each tissue type, we compare the number of PEER factors selected by GTEx to (1) the number of PCs chosen via an automatic elbow detection method (Additional file 1: Algorithm S2) and (2) the number of PCs chosen via the BE algorithm (Additional file 1: Algorithm S3; the default parameters are used). (a) Example scree plots. (b) This scatter plot contains 49 dots of each color, corresponding to the 49 tissue types with GTEx eQTL analyses. The number of PEER factors selected by GTEx far exceeds the number of PCs chosen via BE for many tissue types with sample size above 350 (dashed line), suggesting that the number of PEER factors selected by GTEx may be too large. (c) For the eight tissue types with the largest absolute differences between the number of PEER factors chosen by GTEx and the number of PCs chosen via BE (all eight tissue types have sample size above 350), we replace the PEER factors with smaller numbers of PCs in GTEx’s FastQTL pipeline [10, 13]. We find that we can reduce the number of inferred covariates to between 20% (\(12/60=20\%\), Colon - Transverse) and roughly 40% (\(22/60\approx 36.67\%\), Esophagus - Mucosa) of the number of PEER factors selected by GTEx without significantly reducing the number of discovered cis-eGenes.
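
To make the idea of automatic elbow detection and the percentages in panel c concrete, here is a minimal Python sketch. It is not Algorithm S2 from Additional file 1, which is not reproduced on this page. It assumes one common heuristic instead: pick the point on the scree curve farthest from the straight line joining its first and last points. The function names `elbow_num_pcs` and `covariate_fraction` and the example scree values are hypothetical, chosen only for illustration.

```python
import math


def elbow_num_pcs(variance_explained):
    """Return a 1-based number of PCs at the elbow of a scree curve.

    Assumed heuristic (not the paper's Algorithm S2): choose the point with
    the largest perpendicular distance to the chord joining the first and
    last points of the curve.
    """
    y = [float(v) for v in variance_explained]
    n = len(y)
    if n < 3:
        # Too few points to define an elbow; keep all of them.
        return n
    # End points of the chord, using PC indices 1..n as x-coordinates.
    x1, y1, x2, y2 = 1.0, y[0], float(n), y[-1]
    chord_length = math.hypot(y2 - y1, x2 - x1)  # > 0 because x2 > x1
    best_k, best_dist = 1, -1.0
    for k, yk in enumerate(y, start=1):
        # Perpendicular distance from point (k, yk) to the chord.
        dist = abs((y2 - y1) * k - (x2 - x1) * yk + x2 * y1 - y2 * x1) / chord_length
        if dist > best_dist:
            best_k, best_dist = k, dist
    return best_k


def covariate_fraction(num_pcs, num_peer_factors):
    """Fraction of the original PEER factor count retained when using PCs."""
    return num_pcs / num_peer_factors


# Hypothetical scree values (percent variance explained per PC).
scree = [50, 20, 5, 4, 3, 2, 1]
print(elbow_num_pcs(scree))

# The two ratios quoted in the caption.
print(f"{covariate_fraction(12, 60):.2%}")  # Colon - Transverse
print(f"{covariate_fraction(22, 60):.2%}")  # Esophagus - Mucosa
```

Because this heuristic measures distance in the raw plot coordinates, the selected elbow can shift if the variance axis is rescaled, so it is best read as an illustration of the idea and not as a substitute for the published algorithm.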
