
An Atlas of Variant Effects to understand the genome at nucleotide resolution

Abstract

Sequencing has revealed hundreds of millions of human genetic variants, and continued efforts will only add to this variant avalanche. Insufficient information exists to interpret the effects of most variants, limiting opportunities for precision medicine and comprehension of genome function. A solution lies in experimental assessment of the functional effect of variants, which can reveal their biological and clinical impact. However, variant effect assays have generally been undertaken reactively for individual variants only after, and in most cases long after, their first observation. Now, multiplexed assays of variant effect can characterise massive numbers of variants simultaneously, yielding variant effect maps that reveal the function of every possible single nucleotide change in a gene or regulatory element. Generating maps for every protein-encoding gene and regulatory element in the human genome would create an ‘Atlas’ of variant effect maps, transforming our understanding of genetics and ushering in a new era of nucleotide-resolution functional knowledge of the genome. An Atlas would reveal the fundamental biology of the human genome, inform human evolution, empower the development and use of therapeutics and maximise the utility of genomics for diagnosing and treating disease. The Atlas of Variant Effects Alliance is an international collaborative group comprising hundreds of researchers, technologists and clinicians dedicated to realising an Atlas of Variant Effects to help deliver on the promise of genomics.

Introduction

Two decades after the first human genome was sequenced, millions of human exomes and genomes have been sequenced. Interpreting the effects of the hundreds of millions of variants thus discovered has become a central challenge for genomics. The genomes of the 8 billion people alive today collectively contain nearly all ~9 billion possible single nucleotide genetic variants compatible with life, as well as numerous insertions, deletions and other types of variants [1, 2]. Moreover, within the trillions of cells of each individual, every possible single nucleotide genetic variant will have arisen through somatic mutation. The functional impact of genetic variants has primarily been determined by asking if the variant co-occurs with a disease, disorder or other trait, an approach which has collectively characterised the functional impact of less than 1% of genetic variation. In addition, our knowledge of variant effects is focused on the best-understood 1–2% of our DNA: the genes that encode proteins. For non-coding variation, the situation is even less certain, because the location of most known non-coding functional elements has only recently been identified [3]. Non-coding elements are also less highly conserved, and their functions are often specific to cell type and developmental stage [4].
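The figure of ~9 billion possible single nucleotide variants is consistent with a back-of-envelope count: each position can change to one of three alternative bases. The sketch below assumes a haploid genome length of roughly 3 billion base pairs, which is an illustrative assumption rather than a number stated in the text, and the function names are hypothetical.

```python
# Back-of-envelope count of possible single nucleotide variants (SNVs).
# ASSUMPTION: a haploid genome of ~3 billion base pairs (illustrative value).
GENOME_LENGTH_BP = 3_000_000_000
ALT_BASES_PER_POSITION = 3  # each reference base can change to 3 other bases


def count_possible_snvs(length_bp: int) -> int:
    """Number of distinct single nucleotide changes in a sequence."""
    return length_bp * ALT_BASES_PER_POSITION


def count_possible_substitutions(n_codons: int) -> dict:
    """SNV and amino acid substitution counts for a coding sequence.

    Each codon spans 3 bases (9 possible SNVs), and each residue can be
    replaced by any of the 19 other standard amino acids.
    """
    return {
        "snvs": count_possible_snvs(3 * n_codons),
        "amino_acid_substitutions": 19 * n_codons,
    }


genome_total = count_possible_snvs(GENOME_LENGTH_BP)
print(f"Possible genome-wide SNVs: {genome_total:,}")
print(count_possible_substitutions(300))  # hypothetical 300-residue protein
```

The same arithmetic gives a sense of scale for a single map: under these assumptions, a 300-residue protein has 2,700 possible coding SNVs and 5,700 possible amino acid substitutions.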

Our lack of information about the effect of variation found through genetic testing or genome sequencing is the major barrier to the use of sequence information for diagnosing genetic disease. This lack of information limits the effectiveness of genetic precision medicine and hinders our ability to understand genome function. Even when a variant in a well-annotated functional element is known to increase disease risk, the mechanism by which it does so is often unknown. A solution lies in our ability to assess the functional effect of variants using in vitro or cell-based assays, which can provide strong evidence to interpret their biological and clinical impact and can, in principle, be applied to any variant. However, owing to the resource- and time-intensive nature of traditional variant effect assays, they have generally been undertaken reactively for individual variants only after, and in most cases long after, the first observation of the variant. Now, multiplexed assays of variant effect (MAVEs) enable the generation of ‘variant effect maps’ characterising aspects of the function of every possible single nucleotide change in a gene or functional element of interest. Because variant effect maps are comprehensive, they profile all previously observed variants, as well as those that might be found in the future. Generating variant effect maps for every protein-encoding gene and regulatory element in the human genome would create an ‘Atlas’ of variant effect maps that would transform our understanding of genetics by ushering in a new era of nucleotide-resolution functional knowledge of the genome.

The generation of an Atlas of Variant Effects (AVE) would have a major impact across multiple areas of basic and translational research and, importantly, on clinical care. Any effort to determine whether a variant alters function would be transformed by having an Atlas, including in the following high-impact areas (Fig. 1):

  • Precision genomic medicine. Variant effect maps of functional elements known to harbour disease-causing variation can drive more accurate, rapid and inexpensive genetic diagnostic testing. Variant effect maps can also enhance our understanding of penetrance and variable expressivity and potentially even reveal compensatory genetic perturbations. For a wide variety of genetically driven disorders, knowledge of disease risk variants allows screening within families or even populations for early detection and thus early intervention [5].

  • Disease association studies. Just as targeted variant functional assays have assisted discovery and validation of associations between specific rare genetic variants and disease risk, variant effect maps can enable this approach broadly, at scale [6, 7].

  • Therapeutic development and pharmacogenetics. Variant effect maps can shed light on disease mechanisms and may identify novel potential targets for drugs or other therapeutics [8], help predict the safety and efficacy of modulating specific targets, reveal routes of resistance and identify patients likely to respond favourably in clinical trials. Variant effect maps of pharmacogenes, where genetic variation can influence the activity or metabolism of drugs, could reveal the optimal dose for an individual or identify predispositions to adverse reactions. Variant effect maps could also enable the systematic study of genetic dose–response curves through functional and clinical correlations.

  • Sequence/structure/function relationships. The relationship between sequence and function is fundamental to biology [9] and remains difficult to predict. Variant effect maps can illuminate this relationship, for example by improving or benchmarking computational variant effect prediction; revealing protein function, allostery or structure; and discerning the composition and mechanisms of regulatory elements [10,11,12,13,14,15,16,17,18,19].

  • Evolutionary genetics. Differences in the biology of species, including those of commercial interest, are genomically encoded. Variant effect maps can highlight the subset of genetic differences between species that have functional consequences, probe inferred ancestral sequences [20] and improve phylogenetic inference [21,22,23].

  • Pathogen biology. Genetic variation in pathogen genomes influences key characteristics of pathogen biology, including virulence, transmission, immune evasion and drug resistance. Variant effect maps can inform the surveillance of pathogen evolution [24] and provide opportunities to respond more rapidly, as well as revealing drug resistance and immune evasion variants [25].

Fig. 1

Schematic representation of areas of high impact resulting from an Atlas of Variant Effects

By comprehensively capturing the impact of variants in functional elements throughout the genome, an Atlas of Variant Effects would accelerate and empower biological research, drug discovery and clinical practice. Systematic variant analysis, unbiased by allele frequency in any population, would empower equitable interpretation and reduce healthcare disparities [26]. Building and implementing a coherent Atlas of Variant Effects will necessarily be a collective endeavour, drawing together diverse expertise from different communities, including patients, patient advocates, researchers, clinicians, diagnostics companies and drug developers.

MAVEs can measure the effect of genetic variants at the scale necessary to compile an Atlas of Variant Effects

MAVEs are a rapidly growing family of methods that involve mutagenesis of a DNA-encoded protein or regulatory element followed by a multiplexed assay for some aspect of function [9, 27,28,29]. High-throughput DNA sequencing is used to read out each variant’s effect in the assay (Fig. 2A). MAVEs encompass both assays of protein function, often called deep mutational scans, and of regulatory elements, often called massively parallel reporter assays. Early MAVEs were applied to small protein domains and short regulatory elements [14, 15, 30] generally querying single ‘sub-functions’ of an element such as promoter activity [14, 15], protein–ligand interactions [30,31,32] or stability [33, 34]. Other early efforts focused on the ability of an element to perform its overall cellular function in a cell-based growth assay [35]. Subsequently, MAVEs have been developed for a variety of functions and have been used to generate multiple variant effect maps examining different functions for the same element [9, 28, 36]. Now, MAVEs have been scaled up and optimised to enable routine application to entire genes, measuring the relative functional impact of tens of thousands of variants in a single controlled experiment.
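The sequencing read-out described above can be illustrated with a minimal sketch. It assumes a selection-based assay in which each variant is counted by sequencing before and after selection, and it scores a variant as the log2 ratio of its post- to pre-selection counts, normalised to wild type. The function name, the pseudocount and the example counts are hypothetical; this is one simple style of scoring, not the method of any particular study cited here.

```python
import math


def enrichment_scores(pre_counts, post_counts, wild_type, pseudocount=0.5):
    """Log2 enrichment of each variant relative to wild type.

    pre_counts / post_counts: dicts mapping variant name -> read count
    before and after selection. A pseudocount avoids division by zero
    for variants that drop out entirely (ASSUMED value, illustrative only).
    """
    wt_ratio = (post_counts.get(wild_type, 0) + pseudocount) / (
        pre_counts[wild_type] + pseudocount
    )
    scores = {}
    for variant, pre in pre_counts.items():
        ratio = (post_counts.get(variant, 0) + pseudocount) / (pre + pseudocount)
        # Normalising to wild type cancels differences in sequencing depth.
        scores[variant] = math.log2(ratio / wt_ratio)
    return scores


# Hypothetical read counts for three variants of one element.
pre = {"WT": 1000, "A2V": 500, "G5*": 500}
post = {"WT": 2000, "A2V": 1000, "G5*": 10}

example_scores = enrichment_scores(pre, post, wild_type="WT")
for name, score in example_scores.items():
    print(f"{name}: {score:+.2f}")
```

In this toy example the wild type scores zero by construction, a variant that behaves like wild type scores close to zero, and a strongly depleted variant receives a large negative score. Real analyses typically also combine replicates and estimate measurement error, which this sketch omits.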

Fig. 2

A (top panel) MAVEs can measure a wide variety of protein and regulatory DNA functions, and they produce comprehensive variant effect maps representing the effects of nearly all possible nucleotide or amino acid variants in the scanned functional element. A variant effect map is shown for a small region of a protein-coding gene; each column in the map is a position in a gene and each row is an amino acid substitution. Tiles are coloured based on the measured effect of the variant. B (bottom panel) MAVEs have been applied to hundreds of functional elements and, collectively, ~11 million variant effect measurements have been made with MAVEs. Data are available at https://doi.org/10.5281/zenodo.7662580
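The map layout described in the caption (one column per position, one row per amino acid substitution) can be sketched as a simple grid. The data structure, function names and scores below are hypothetical and chosen only for illustration; they do not reflect the schema of any particular repository.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def build_effect_map(wild_type_seq, scores):
    """Arrange scores as rows (substituted amino acid) x columns (position).

    scores: dict mapping (1-based position, substituted amino acid) -> score.
    Unmeasured tiles are left as None.
    """
    grid = {aa: [None] * len(wild_type_seq) for aa in AMINO_ACIDS}
    for (position, amino_acid), score in scores.items():
        grid[amino_acid][position - 1] = score
    return grid


def coverage(wild_type_seq, grid):
    """Fraction of the 19 possible substitutions per position with a score."""
    possible = 19 * len(wild_type_seq)
    measured = sum(
        1
        for aa, row in grid.items()
        for idx, value in enumerate(row)
        if value is not None and aa != wild_type_seq[idx]
    )
    return measured / possible


# Hypothetical three-residue region with three measured substitutions.
wt = "MKT"
toy_scores = {(1, "A"): -1.2, (2, "R"): 0.1, (3, "S"): -0.3}
toy_map = build_effect_map(wt, toy_scores)
print(toy_map["A"])                      # row for substitutions to alanine
print(f"{coverage(wt, toy_map):.3f}")    # fraction of possible tiles measured
```

A coverage value near 1 would correspond to a map that captures nearly all possible substitutions in the scanned region.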

To date, variant effect maps have been generated for hundreds of functional elements encompassing over 11 million total variants (Fig. 2B). However, existing variant effect maps cover <1% of the known clinically relevant human genome and are largely focused on single nucleotide variants, as these are the type of variants most often encountered in current human genome sequencing and clinical testing. Furthermore, no functional element has been mapped in a diverse panel of cell types or across developmental stages. Nevertheless, even at this very early stage in the development of a comprehensive Atlas of Variant Effects, multiplexed variant functional data are proving to be powerful. In particular, variant effect maps are beginning to reshape how human variants found in clinical genetic testing are interpreted and also to redefine our understanding of the mapping between DNA sequence and molecular, cellular and organismal phenotype.

The value of functional evidence for informing clinical variant interpretation is already well appreciated and has been incorporated within current professional guidelines for genetic diagnosis that are used internationally [37, 38]. MAVE-derived variant functional data have numerous advantages compared with functional data derived from traditional, low-throughput assays. Unlike testing variants in small batches using different methods in different labs, MAVEs can determine the effects of thousands of variants simultaneously, not only improving reproducibility but also allowing assessment of variants in the context of the functional effects of all of the variants in that gene, including the effects of known pathogenic and benign variants. Thus, MAVE-derived functional data can be used to resolve many, if not most, of the clinically observed variants of uncertain significance in monogenic disease genes, demonstrating the power of functional data to help deliver more definitive genetic test results to patients and clinicians [39,40,41].

Multiplexed variant functional data can also transform our understanding of how variants encode molecular and cellular function and how sequence dictates biological structure. For example, multiplexed measurements of variant abundance and ligand binding in SH3 and PDZ domains, combined with a model, enabled a comprehensive accounting of allostery within each domain [16]. Multiplexed variant functional data can be used to validate proposed protein structures [17, 42] or, where variant combinations are assayed, even infer them de novo [12, 13]. Knowledge of the precise mechanism of variant effects opens the door for variant-guided therapies designed to ameliorate protein misfolding or aggregation, aberrant splicing and more.

Existing variant effect maps for human genes have been generated by a range of different technologies, from yeast complementation assays to CRISPR-based saturation genome editing in human cells. Each technology has specific advantages and disadvantages. For example, yeast complementation assays are only applicable to a minority of human genes [43] and would not be appropriate for identifying some variant effects, such as those that affect functions beyond those needed for complementation or those that disrupt splicing. CRISPR-based saturation genome editing of an endogenous locus is costly and practical only for growth-based assays. Thus, no single technology can currently be used to generate maps of variant effects for all functional elements. Indeed, even within a single gene, multiple assays may be required to assess different pathophysiological mechanisms. Current MAVEs require appreciable effort, and the time and cost needed to develop new assays can be considerable. Moreover, some variant effects may only be well-modelled in terminally differentiated cell types or in multicellular systems or by assaying variant effects on complex phenotypes like cell morphology or transcriptional state. Thus, the existing portfolio of MAVE technologies can be applied to a substantial fraction of the genome, but more technology development is required to achieve comprehensive coverage of genomic functional elements and to identify the mechanism by which most variants act.

The AVE Alliance provides international coordination to create, disseminate and implement an Atlas of Variant Effects

Compiling a complete Atlas of Variant Effects for all 20,000 human genes, not to mention potentially hundreds of thousands of noncoding regulatory elements, will require an international collaborative effort involving thousands of researchers, clinicians and technologists. Comparing this initiative to some of the landmark genomic collaborative achievements of the past 30 years highlights some of the key challenges to be addressed. The Human Genome Project (HGP) required a small number of centres generating data at unprecedented scales, in a highly coordinated and centralised fashion. By contrast, the Protein Data Bank (PDB) contains structures for thousands of human proteins, generated by thousands of researchers, in a largely uncoordinated and decentralised fashion [44]. Despite their differences, both HGP and PDB succeeded in generating an enduring and sustainable knowledge base and depended, crucially, on robust data standards, community-agreed quality metrics and centralised data deposition and dissemination. Moreover, a strong community ethos was essential for the development and adoption of these core standards and infrastructure. Some of the critical informatics infrastructure needed to support the AVE has already been developed, for example the MaveDB repository [45, 46], initial standards [47] for MAVE datasets and a MAVE project registry [48].

We envisage that the AVE will sit between the extremes exemplified by HGP and PDB, with a combination of a small number of centres generating variant effect maps at scale using generalisable assays and a large number of laboratories generating small numbers of maps, using bespoke assays, leveraging their expertise in investigating particular genes and biological pathways. Integration of variant effect data for the same gene, generated using different MAVEs, will in some cases be required to achieve accurate and comprehensive characterisation of different functional effects [39, 49,50,51,52]. The computational prediction of variant effect maps using AI/ML methods will continue to improve and will leverage growing numbers of experimentally determined variant effect maps, analogous to the advances in computational prediction of protein structures based on thousands of experimentally determined protein structures (Fig. 3). With these expectations in mind, we can identify some of the key challenges that realising the AVE vision will face and some of the likely solutions on the critical path to success:

  • Diverse expertise. Developing new experimental technologies that reflect the complexity of biology and disease, scaling existing technologies, processing and managing complex data, and translating knowledge into clinical benefits require a broad range of expertise, interests and competencies, working collaboratively. No one centre or community will be able to create the AVE in isolation. Technology developers, geneticists, cell biologists, protein scientists, data scientists, software engineers and clinicians will need to work together, aligned around a common vision, language and values.

  • Technology development and scaling. Generating variant effect maps for all 20,000 genes will require both the scaling of existing technologies that can be applied to many genes, and the development of new technologies that will extend coverage of MAVE-compatible assays to all functional elements. Moreover, new approaches will be needed to assess variant effects in more complex contexts, such as specific cell types or in development, and for more complex phenotypes, such as cell morphology and behaviour.

  • Democratisation of technology. Completing the AVE will require a major expansion in the numbers of researchers and organisations actively performing MAVEs. Readily accessible training materials, protocols, experimental resources (e.g. cell lines, libraries) and easy-to-use and flexible software will all be crucial, as will advocacy and support to help researchers with expertise in informative assays adopt MAVE technologies.

  • Data standards and coordination. Data standards, community-agreed quality standards, centralised data deposition, open dissemination and a FAIR ethos [53] are all necessary but not sufficient for compiling the Atlas of Variant Effects. The existing informatics infrastructure needs to evolve, become integrated into the wider clinical and biological data ecosystem and be actively sustained for long-term impact. Moreover, community-wide adoption of best practices with regard to data and metadata deposition is critical for data integration.

  • Ensuring trustworthy clinical adoption. The potential clinical impact of the Atlas of Variant Effects can only be achieved through rigorous and clinician-trusted integration into diagnostic workflows. Co-development of quality standards and guidelines with clinical communities will help to build trust, as will starting conservatively. Integration with existing clinical decision support software (e.g. DECIPHER [54]) and data resources (e.g. ClinVar [55]), as opposed to requiring diagnosticians to use new systems, will facilitate rapid adoption.

Fig. 3

Stages of Atlas of Variant Effects completion

To achieve the AVE vision and tackle these challenges, an international group of diverse researchers, clinicians and diagnosticians established the Atlas of Variant Effects Alliance (www.varianteffect.org). The AVE Alliance currently has over 400 members from over 100 institutions, located in 30 countries, united by the mission to bring the AVE into reality. The AVE Alliance is committed to Open Science and places diversity and inclusion at the heart of its activities. The AVE Alliance organises an annual meeting, the Mutational Scanning Symposium, and a monthly seminar series, the Variant Effect Seminar Series. To tackle the challenges identified above, AVE has established workstreams to:

  • Develop, standardise and democratise experimental and computational technologies,

  • Develop the infrastructure necessary to ingest, store and disseminate high quality FAIR data,

  • Ensure that clinical benefits are realised,

  • Expand, coordinate and sustain a diverse and motivated community.

The AVE Alliance provides a ‘front door’ for other organisations and initiatives to work with the diverse AVE community, including complementary large-scale national initiatives such as the NIH-funded Impact of Genomic Variants on Function (IGVF), as well as research funders and commercial organisations that are keen to engage with the community as a whole. We welcome any and all readers who are interested in building and learning from the Atlas of Variant Effects to join the Alliance and get involved [56, 57].

Availability of data and materials

Not applicable.

References

  1. Shirts BH, Pritchard CC, Walsh T. Family-specific variants and the limits of human genetics. Trends Mol Med. 2016;22:925–34.


  2. Kruglyak L, Nickerson DA. Variation is the spice of life. Nat Genet. 2001;27:234–6.


  3. Roadmap Epigenomics Consortium, Kundaje A, Meuleman W, Ernst J, Bilenky M, Yen A, et al. Integrative analysis of 111 reference human epigenomes. Nature. 2015;518:317–30.


  4. ENCODE Project Consortium, Moore JE, Purcaro MJ, Pratt HE, Epstein CB, Shoresh N, et al. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature. 2020;583:699–710.


  5. Green RC, Berg JS, Grody WW, Kalia SS, Korf BR, Martin CL, et al. ACMG recommendations for reporting of incidental findings in clinical exome and genome sequencing. Genet Med. 2013;15:565–74.


  6. Schiabor Barrett KM, Masnick M, Hatchell KE, Savatt JM, Banet N, Buchanan A, et al. Clinical validation of genomic functional screen data: analysis of observed BRCA1 variants in an unselected population cohort. HGG Adv. 2022;3:100086.


  7. Dorling L, Carvalho S, Allen J, Parsons MT, Fortuno C, González-Neira A, et al. Breast cancer risks associated with missense variants in breast cancer susceptibility genes. Genome Med. 2022;14:51.


  8. Nelson MR, Tipney H, Painter JL, Shen J, Nicoletti P, Shen Y, et al. The support of human genetic evidence for approved drug indications. Nat Genet. 2015;47:856–60.


  9. Kinney JB, McCandlish DM. Massively parallel assays and quantitative sequence-function relationships. Annu Rev Genomics Hum Genet. 2019;20:99–127.


  10. Gray VE, Hause RJ, Luebeck J, Shendure J, Fowler DM. Quantitative missense variant effect prediction using large-scale mutagenesis data. Cell Syst. 2018;6:116-24.e3.


  11. Riesselman AJ, Ingraham JB, Marks DS. Deep generative models of genetic variation capture the effects of mutations. Nat Methods. 2018;15:816–22.


  12. Rollins NJ, Brock KP, Poelwijk FJ, Stiffler MA, Gauthier NP, Sander C, et al. Inferring protein 3D structure from deep mutation scans. Nat Genet. 2019;51:1170–6.


  13. Schmiedel JM, Lehner B. Determining protein structures using deep mutagenesis. Nat Genet. 2019;51:1177–86.


  14. Kinney JB, Murugan A, Callan CG Jr, Cox EC. Using deep sequencing to characterize the biophysical mechanism of a transcriptional regulatory sequence. Proc Natl Acad Sci U S A. 2010;107:9158–63.


  15. Patwardhan RP, Lee C, Litvin O, Young DL, Pe’er D, Shendure J. High-resolution analysis of DNA regulatory elements by synthetic saturation mutagenesis. Nat Biotechnol. 2009;27:1173–5.


  16. Faure AJ, Domingo J, Schmiedel JM, Hidalgo-Carcedo C, Diss G, Lehner B. Mapping the energetic and allosteric landscapes of protein binding domains. Nature. 2022;604:175–83.


  17. Chiasson MA, Rollins NJ, Stephany JJ, Sitko KA, Matreyek KA, Verby M, et al. Multiplexed measurement of variant abundance and activity reveals VKOR topology, active site and human variant impact. Elife. 2020;9. https://doi.org/10.7554/eLife.58026.

  18. Livesey BJ, Marsh JA. Using deep mutational scanning to benchmark variant effect predictors and identify disease mutations. Mol Syst Biol. 2020;16:e9380.


  19. Frazer J, Notin P, Dias M, Gomez A, Min JK, Brock K, et al. Disease variant prediction with deep generative models of evolutionary data. Nature. 2021;599:91–5.


  20. Starr TN, Picton LK, Thornton JW. Alternative evolutionary histories in the sequence space of an ancient protein. Nature. 2017;549:409–13.


  21. Klein JC, Keith A, Agarwal V, Durham T, Shendure J. Functional characterization of enhancer evolution in the primate lineage. Genome Biol. 2018;19:99.


  22. Bloom JD. An experimentally determined evolutionary model dramatically improves phylogenetic fit. Mol Biol Evol. 2014;31:1956–78.


  23. Gallego Romero I, Lea AJ. Leveraging massively parallel reporter assays for evolutionary questions. Genome Biol. 2023;24:26.


  24. Lee JM, Huddleston J, Doud MB, Hooper KA, Wu NC, Bedford T, et al. Deep mutational scanning of hemagglutinin helps predict evolutionary fates of human H3N2 influenza variants. Proc Natl Acad Sci U S A. 2018;115:E8276–85.


  25. Stiffler MA, Hekstra DR, Ranganathan R. Evolvability as a function of purifying selection in TEM-1 β-lactamase. Cell. 2015;160:882–92.


  26. Wright CF, Campbell P, Eberhardt RY, Aitken S, Perrett D, Brent S, et al. Optimising diagnostic yield in highly penetrant genomic disease. medRxiv. 2022. Available from: https://www.medrxiv.org/content/10.1101/2022.07.25.22278008v1.

  27. Tabet D, Parikh V, Mali P, Roth FP, Claussnitzer M. Scalable functional assays for the interpretation of human genetic variation. Annu Rev Genet. 2022;56:441–65.


  28. Starita LM, Ahituv N, Dunham MJ, Kitzman JO, Roth FP, Seelig G, et al. Variant interpretation: functional assays to the rescue. Am J Hum Genet. 2017;101:315–25.


  29. Gasperini M, Starita L, Shendure J. The power of multiplexed functional analysis of genetic variants. Nat Protoc. 2016;11:1782–7.


  30. Fowler DM, Araya CL, Fleishman SJ, Kellogg EH, Stephany JJ, Baker D, et al. High-resolution mapping of protein sequence-function relationships. Nat Methods. 2010;7:741–6.


  31. Zhang H, Torkamani A, Jones TM, Ruiz DI, Pons J, Lerner RA. Phenotype-information-phenotype cycle for deconvolution of combinatorial antibody libraries selected against complex systems. Proc Natl Acad Sci U S A. 2011;108:13456–61.


  32. Ernst A, Gfeller D, Kan Z, Seshagiri S, Kim PM, Bader GD, et al. Coevolution of PDZ domain-ligand interactions analyzed by high-throughput phage display and deep sequencing. Mol Biosyst. 2010;6:1782–90.


  33. Kim I, Miller CR, Young DL, Fields S. High-throughput analysis of in vivo protein stability. Mol Cell Proteomics. 2013;12:3370–8.


  34. Araya CL, Fowler DM, Chen W, Muniez I, Kelly JW, Fields S. A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function. Proc Natl Acad Sci U S A. 2012;109:16858–63.


  35. Hietpas RT, Jensen JD, Bolon DNA. Experimental illumination of a fitness landscape. Proc Natl Acad Sci U S A. 2011;108:7896–901.


  36. Weile J, Roth FP. Multiplexed assays of variant effects contribute to a growing genotype–phenotype atlas. Hum Genet. 2018;137:665–78.


  37. Richards S, Aziz N, Bale S, Bick D, Das S, Gastier-Foster J, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med. 2015;17:405–24.


  38. Brnich SE, Abou Tayoun AN, Couch FJ, Cutting GR, Greenblatt MS, Heinen CD, et al. Recommendations for application of the functional evidence PS3/BS3 criterion using the ACMG/AMP sequence variant interpretation framework. Genome Med. 2019;12:3.


  39. Fayer S, Horton C, Dines JN, Rubin AF, Richardson ME, McGoldrick K, et al. Closing the gap: systematic integration of multiplexed functional data resolves variants of uncertain significance in BRCA1, TP53, and PTEN. Am J Hum Genet. 2021;108:2248–58.


  40. Radford EJ, Tan HK, Andersson MHL, Stephenson JD, Gardner EJ, Ironfield H, et al. Saturation genome editing of DDX3X clarifies pathogenicity of germline and somatic variation. medRxiv [Internet]. Cold Spring Harbor Laboratory Press; 2022; Available from: https://www.medrxiv.org/content/10.1101/2022.06.10.22276179v1.

  41. Scott A, Hernandez F, Chamberlin A, Smith C, Karam R, Kitzman JO. Saturation-scale functional evidence supports clinical variant interpretation in Lynch syndrome. Genome Biol. 2022;23:266.


  42. Adkar BV, Tripathi A, Sahoo A, Bajaj K, Goswami D, Chakrabarti P, et al. Protein model discrimination using mutational sensitivity derived from deep sequencing. Structure. 2012;20:371–81.


  43. Kachroo AH, Laurent JM, Yellman CM, Meyer AG, Wilke CO, Marcotte EM. Systematic humanization of yeast genes reveals conserved functions and genetic modularity. Science. 2015;348:921–5.


  44. wwPDB consortium. Protein Data Bank: the single global archive for 3D macromolecular structure data. Nucleic Acids Res. 2019;47:D520-8.


  45. Esposito D, Weile J, Shendure J, Starita LM, Papenfuss AT, Roth FP, et al. MaveDB: an open-source platform to distribute and interpret data from multiplexed assays of variant effect. Genome Biol. 2019;20:223.

  46. Rubin AF, Min JK, Rollins NJ, Da EY, Esposito D, Harrington M, et al. MaveDB v2: a curated community database with over three million variant effects from multiplexed functional assays. bioRxiv. 2022. p. 2021.11.29.470445. Available from: https://www.biorxiv.org/content/10.1101/2021.11.29.470445v2. [Cited 2022 Dec 5].

  47. Gelman H, Dines JN, Berg J, Berger AH, Brnich S, Hisama FM, et al. Recommendations for the collection and use of multiplexed functional data for clinical variant interpretation. Genome Med. 2019;11:85.

  48. Kuang D, Weile J, Kishore N, Rubin AF, Fields S, Fowler DM, et al. MaveRegistry: a collaboration platform for multiplexed assays of variant effect. Bioinformatics. 2021;37:3382–3.

  49. Mighell TL, Thacker S, Fombonne E, Eng C, O’Roak BJ. An integrated deep-mutational-scanning approach provides clinical insights on PTEN genotype-phenotype relationships. Am J Hum Genet. 2020;106:818–29.

  50. Suiter CC, Moriyama T, Matreyek KA, Yang W, Scaletti ER, Nishii R, et al. Massively parallel variant characterization identifies NUDT15 alleles associated with thiopurine toxicity. Proc Natl Acad Sci U S A. 2020;117:5394–401.

  51. Jepsen MM, Fowler DM, Hartmann-Petersen R, Stein A, Lindorff-Larsen K. Chapter 5 - Classifying disease-associated variants using measures of protein activity and stability. In: Pey AL, editor. Protein Homeostasis Diseases. Academic Press; 2020. p. 91–107. https://doi.org/10.1101/688234, https://www.biorxiv.org/content/10.1101/688234v2.full.pdf.

  52. Cagiada M, Johansson KE, Valanciute A, Nielsen SV, Hartmann-Petersen R, Yang JJ, et al. Understanding the origins of loss of protein function by analyzing the effects of thousands of variants on activity and abundance. Mol Biol Evol. 2021;38:3235–46.

  53. Wilkinson MD, Dumontier M, Jan Aalbersberg I, Appleton G, Axton M, Baak A, et al. Addendum: the FAIR guiding principles for scientific data management and stewardship. Sci Data. 2019;6:6.

  54. DECIPHER v11.16: Mapping the clinical genome. Available from: http://www.deciphergenomics.org. [Cited 2022 Dec 3].

  55. Landrum MJ, Lee JM, Riley GR, Jang W, Rubinstein WS, Church DM, et al. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 2014;42:D980–5.

  56. Atlas of variant effects alliance. Atlas of Variant Effects Alliance. Available from: http://www.varianteffect.org. [Cited 2022 Dec 3].

  57. AVE Alliance Founding Members. The Atlas of Variant Effects (AVE) Alliance: understanding genetic variation at nucleotide resolution. Zenodo; 2021. Available from: https://zenodo.org/record/4989960.

Acknowledgements

We thank all members of the Atlas of Variant Effects Alliance for their work and contributions. Uta Mackensen helped prepare final figure graphics. We thank Carlos Araya for helpful input. We thank Alex Hopkins for administrative support.

To fairly give credit, and for the purposes of PubMed, we list the following AVE Alliance contributing authors, who provided substantial comments and edits to this manuscript.

Atlas of Variant Effects Alliance contributing authors

Nadav Ahituv14, Orli G. Bahcal15, Dustin Baldridge16, Jonathan S. Berg17, Alice H. Berger18, Aisha Haley Bianchi19, Benedetta Bolognesi20, Michael Boutros21, Steven Brenner22, Matthew H. Brush23, Vanessa Bryant24, Carol J. Bult25, Martha Bulyk26, Melissa Call27, Hannah Carter28, Melina Claussnitzer28,29, Feng Chen30, Melissa S. Cline31, Josh T. Cuperus1, Moez Dawood32, Hannah N. De Jong33, Mafalda Dias34, Michael Dunn5, Jesse Engreitz35, Kyle Farh30, Phillip G. Febbo30, Stanley Fields1, Gregory M. Findlay36, Helen Firth37, James S. Fraser38, Jonathan Frazer34, Mattia Frontini39, Irene Gallego Romero40, Andrew M. Glazer41, Murat Güler21, Rasmus Hartmann-Petersen42, Richard Houlston43, Kuan-lin Huang44, Carolyn M. Hutter45, Sujatha Jagannathan46,47, Richard G. James48, Martin Kampmann49,50, Rachel Karchin51, Justin B. Kinney52, Alexis C. Komor53, Sriram Kosuri54, Ben Lehner5,34,55,56, Kresten Lindorff-Larsen42, Zané Lombard57, Daniel G. MacArthur58, Maria Martin59, Ultan McDermott60, Shannon M. McNulty61, Alex N. Nguyen Ba62, Anne O'Donnell-Luria63,64, Brian J. O'Roak65, Victoria N. Parikh66, Leopold Parts5, Michael J. Pazin45, Tina Pesaran67, Slavé Petrovski68, Christine Queitsch1,3, David E. Root8, Jay Shendure1,3, Amanda B. Spurdle69, Kevin L. Taylor70, Clare Turnbull43, Judit Villén1, L.E.L.M. Vissers71, Alex H. Wagner72,73, Matthew J. Wakefield74, Jochen Weile10, Jenny Xiao75

14 Department of Bioengineering and Therapeutic Sciences and Institute for Human Genetics, University of California San Francisco USA

15 Cell Genomics, New York, NY USA

16 Washington University School of Medicine, St. Louis, MO USA

17 The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA

18 Fred Hutchinson Cancer Center, Seattle, WA USA

19 National Institute on Aging, Baltimore, Maryland USA

20 Institute for Bioengineering of Catalunya (IBEC) and The Barcelona Institute for Science and Technology (BIST), Barcelona, Spain

21 German Cancer Research Center (DKFZ) and Heidelberg University, Germany

22 University of California, Berkeley, Berkeley, CA USA

23 Department of Biomedical Informatics, University of Colorado, Aurora, CO USA

24 The Walter and Eliza Hall Institute of Medical Research/Department of Medical Biology, University of Melbourne/Dept Clinical Immunology, Royal Melbourne Hospital, Australia

25 The Jackson Laboratory/Mouse Genome Informatics (MGI) consortium

26 Brigham & Women's Hospital and Harvard Medical School, USA

27 The Walter and Eliza Hall Institute of Medical Research/Department of Medical Biology, University of Melbourne, Australia

28 Department of Medicine, University of California San Diego, La Jolla CA USA

29 The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease USA

30 Illumina USA

31 UC Santa Cruz Genomics Institute, Santa Cruz, CA USA

32 Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA

33 Department of Genetics, Stanford University School of Medicine, Stanford, CA USA

34 Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST) and Universitat Pompeu Fabra (UPF), Barcelona, Spain

35 Department of Genetics, Stanford University School of Medicine, CA USA

36 The Francis Crick Institute, London, UK

37 Sanger/Cambridge Hospitals UK

38 Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA USA

39 Department of Clinical and Biomedical Sciences, University of Exeter Medical School, Faculty of Health and Life Sciences, RILD Building, Barrack Road, Exeter, EX2 5DW.

40 School of BioSciences, University of Melbourne, Parkville, Australia

41 Department of Medicine, Vanderbilt University Medical Center, Nashville, TN USA

42 Department of Biology, University of Copenhagen, Copenhagen, Denmark

43 Division of Genetics and Epidemiology, The Institute of Cancer Research, Sutton, Surrey UK

44 Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY USA

45 Division of Genome Sciences, NHGRI, Bethesda, MD, USA

46 Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA

47 RNA Bioscience Initiative, University of Colorado Anschutz Medical Campus, Aurora, CO, USA

48 Seattle Children's Research Institute, Seattle WA USA

49 Department of Biochemistry & Biophysics, University of California, San Francisco CA USA

50 Institute for Neurodegenerative Diseases, University of California, San Francisco CA USA

51 Johns Hopkins University, Baltimore, Maryland USA

52 Cold Spring Harbor Laboratory, NY USA

53 Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA USA

54 Octant Inc USA

55 University Pompeu Fabra (UPF), Barcelona, Spain

56 Institució Catalana de Recerca i estudis Avançats (ICREA), Barcelona, Spain

57 Division of Human Genetics, National Health Laboratory Service, and School of Pathology, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa

58 Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, Australia

59 European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK

60 R&D Oncology, AstraZeneca UK

61 Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC USA

62 Department of Biology, University of Toronto, Toronto, Canada

63 Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA USA

64 Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA USA

65 Oregon Health & Science University, Portland, OR USA

66 Stanford Center for Inherited Cardiovascular Disease, Stanford School of Medicine, CA USA

67 Ambry Genetics, Aliso Viejo, CA USA

68 Centre for Genomics Research, Discovery Sciences, R&D, AstraZeneca UK

69 QIMR Berghofer Medical Research Institute, Brisbane, Australia

70 Proteogenomics, BioLegend USA

71 Department Human Genetics, Radboud University, Nijmegen, NL

72 The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH 43215, USA

73 Departments of Pediatrics and Biomedical Informatics, The Ohio State University College of Medicine, Columbus, OH 43210, USA

74 The Walter and Eliza Hall Institute, Parkville, Vic, Australia & Department of Obstetrics and Gynaecology, The University of Melbourne, Australia

75 Guardant Health, Palo Alto USA

76 Center for Genomic Medicine Massachusetts General Hospital, Harvard Medical School USA

Funding

F.P.R. acknowledges support from the NIH/NHGRI Impact of Genomic Variation on Function (IGVF) Initiative (HG011989), from an NIH/NHLBI R01 grant (HL164675) and from a Canadian Institutes of Health Research Foundation Grant. L.M.S., L.A.M., D.M.F. and A.F.R. acknowledge support from the NIH/NHGRI Impact of Genomic Variation on Function (IGVF) Initiative (HG011969). L.M.S., L.A.M., D.M.F., A.F.R. and F.P.R. all receive support from the NIH/NHGRI Center of Excellence in Genomic Science (HG010461). D.M.F. also receives support from R01HL152066. L.M.S. is also supported by the Brotman Baty Institute. A.L.G. is a Wellcome Trust Senior Fellow (200837/Z/16/Z) and is also supported by NIDDK (UM-1DK126185). W.C.H. acknowledges support from NIH/NCI U01CA176058. J.T.N. acknowledges support from the Novo Nordisk Foundation (NNF21SA0072102) and from an NIH DP2 grant (1DP2GM146252). D.J.A. is supported by Cancer Research UK (CG-MAVE: EDDPGM-Nov22/100004) and the Wellcome Trust. A.F.R. received grant funding from the Australian Government. D.S.M. acknowledges support from the Chan Zuckerberg Initiative (CZI2018-191853) and an NIH TR01 grant (1R01CA260415).

Author information

Contributions

D.M.F., A.L.G. and M.E.H. contributed to the conceptualization and writing of the original draft. The remaining co-authors contributed to the writing of the original draft. Contributing authors listed in the acknowledgements reviewed and edited the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Douglas M. Fowler, Anna L. Gloyn or Matthew E. Hurles.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Competing interests

A.L.G. declares that her spouse is an employee of Genentech and holds stock options in Roche. D.J.A. is a consultant for Microbiotica and AstraZeneca. D.S.M. is a consultant for Insitro, Dyno and Octant. J.T.N. receives research support from Bristol Myers Squibb. F.P.R. holds shares in Ranomics, Inc., and is an investor and advisor for SeqWell, Inc. and Constantiam Biosciences, Inc. L.M.S. is a consultant for Nostos Genomics. W.C.H. is a consultant for Thermo Fisher, Solasta Ventures, MPM Capital, Tyra Biosciences, Frontier Medicines, Jubilant Therapeutics, KSQ Therapeutics, RAPPTA Therapeutics, Serinus Biosciences, Hexagon Bio, Function Oncology, Riva Therapeutics, and Calyx. M.E.H. is a consultant for AstraZeneca and a co-founder, director and shareholder of Congenica Ltd.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Fowler, D.M., Adams, D.J., Gloyn, A.L. et al. An Atlas of Variant Effects to understand the genome at nucleotide resolution. Genome Biol 24, 147 (2023). https://doi.org/10.1186/s13059-023-02986-x

  • DOI: https://doi.org/10.1186/s13059-023-02986-x

Keywords