Fig. 1 | Genome Biology

Fig. 1

From: PWAS: proteome-wide association study—linking genes and phenotypes by functional variation in proteins

The PWAS framework. a The causal model that PWAS attempts to capture: genetic variants (within a coding region) affect the function of a protein, whose altered function influences a phenotype. PWAS identifies protein-coding genes whose overall genetic functional alterations are associated with the studied phenotype by explicitly modeling and quantifying those functional alterations. In contrast, GWAS seeks direct associations between individual variants and the phenotype. b Overview of the PWAS framework. PWAS takes the same inputs as GWAS: (i) called genotypes of m variants across n individuals, (ii) a vector of n phenotype values (either binary or continuous), and (iii) a covariate matrix for the n individuals (e.g., sex, age, principal components, batch). By exploiting a rich proteomic knowledgebase, a pre-trained machine learning model estimates, for each of the n individuals, the extent of damage caused to each of the k proteins in the human proteome by the m observed variants (typically k ≪ m). These estimates are stored as protein function effect score matrices. PWAS generates two such matrices, reflecting either a dominant or a recessive effect on phenotypes. PWAS then identifies significant associations between the phenotype values and the effect score values in the columns of the matrices (where each column represents a distinct protein-coding gene), while taking into account the provided covariates. Each gene can be tested by the dominant model, the recessive model, or a generalized model that uses both the dominant and recessive values.
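
The per-gene association step described in panel b can be illustrated with a small sketch. This is not the PWAS implementation or its API: it assumes a continuous phenotype, ordinary least squares, and an F-test comparing a covariates-only model against a "generalized" model that adds one gene's dominant and recessive effect scores. All function names and the toy data are hypothetical.

```python
# Minimal sketch (assumption: continuous phenotype, plain OLS, nested-model F-test).
# Names such as gene_f_statistic are illustrative and not part of any PWAS package.

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on the columns of X."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    beta = solve_linear(XtX, Xty)
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

def gene_f_statistic(y, covariates, dominant, recessive):
    """F statistic for adding one gene's dominant and recessive scores to a
    covariates-only model (intercept included in both models)."""
    n = len(y)
    null_X = [[1.0] + list(c) for c in covariates]
    full_X = [row + [d, r] for row, d, r in zip(null_X, dominant, recessive)]
    rss_null, rss_full = rss(null_X, y), rss(full_X, y)
    df1 = 2                      # two added predictors: dominant, recessive
    df2 = n - len(full_X[0])     # residual degrees of freedom of the full model
    return ((rss_null - rss_full) / df1) / (rss_full / df2)

# Toy data for n = 8 individuals: one covariate (age) and one gene's two score columns.
age = [30, 45, 52, 61, 38, 47, 55, 66]
dominant = [0.0, 0.1, 0.8, 0.9, 0.05, 0.7, 0.2, 0.95]
recessive = [0.0, 0.0, 0.3, 0.6, 0.0, 0.2, 0.0, 0.7]
noise = [0.1, -0.2, 0.15, -0.05, 0.0, 0.2, -0.1, -0.1]
covariates = [[a] for a in age]

# One phenotype that depends on the dominant score, and one that does not.
y_assoc = [0.02 * a + 3.0 * d + e for a, d, e in zip(age, dominant, noise)]
y_null = [0.02 * a + e for a, e in zip(age, noise)]

f_assoc = gene_f_statistic(y_assoc, covariates, dominant, recessive)
f_null = gene_f_statistic(y_null, covariates, dominant, recessive)
print("F (associated phenotype):", f_assoc)
print("F (unrelated phenotype):", f_null)
```

Under these assumptions, a p-value would be obtained by comparing the statistic with an F distribution with 2 and n − p degrees of freedom, where p is the number of columns in the full design matrix. A binary phenotype would call for a different model (for example, logistic regression), and testing only the dominant or only the recessive model corresponds to adding a single score column instead of two.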
