The MHC locus and genetic susceptibility to autoimmune and infectious diseases
Genome Biology volume 18, Article number: 76 (2017)
In the past 50 years, variants in the major histocompatibility complex (MHC) locus, also known as the human leukocyte antigen (HLA), have been reported as major risk factors for complex diseases. Recent advances, including large genetic screens, imputation, and analyses of non-additive and epistatic effects, have contributed to a better understanding of the shared and specific roles of MHC variants in different diseases. We review these advances and discuss the relationships between MHC variants involved in autoimmune and infectious diseases. Further work in this area will help to distinguish between alternative hypotheses for the role of pathogens in autoimmune disease development.
The major histocompatibility complex (MHC) locus, also known as the human leukocyte antigen (HLA) locus, spans around 4 Mbp on the short arm of chromosome 6 (6p21.3; Box 1). Molecules encoded by this region are involved in antigen presentation, inflammation regulation, the complement system, and the innate and adaptive immune responses, indicating the MHC’s importance in immune-mediated, autoimmune, and infectious diseases . Over the past 50 years, polymorphisms in the MHC locus have been shown to influence many critical biological traits and individuals’ susceptibility to complex, autoimmune, and infectious diseases (Boxes 2 and 3). In addition to autoimmune and inflammatory diseases, the MHC has recently been found to play a role in some neurological disorders [2,3,4,5,6], implicating autoimmune components in these diseases.
The genetic structure of the MHC is characterized by high levels of linkage disequilibrium (LD) compared to the rest of the genome, which means there are technical challenges in identifying MHC single nucleotide polymorphisms (SNPs), alleles, and amino acids. However, the recent availability of dense genotyping platforms, such as the custom-made Illumina Infinium SNP chip (Immunochip) , and of MHC reference panels has helped to fine-map the locus, improving our understanding of its disease associations and our ability to identify functional variants.
In this review, we discuss recent advances in mapping susceptibility variants in the MHC, using autoimmune and infectious diseases as examples (Boxes 2 and 3). We also discuss the relationships between the MHC variants involved in both autoimmune and infectious diseases and offer insights into the MHC-associated immune responses underlying disease onset and pathogenesis. Finally, we discuss future directions for studying genetic variation in the MHC and how learning about the variation at this locus will aid understanding of disease pathogenesis.
Advances in mapping susceptibility variants in the MHC locus
Several computational and empirical challenges complicate the mapping of MHC susceptibility variants. One fundamental challenge is that the MHC has many sequence and structural variations , which differ between populations and complicate haplotype inference. Another is that high and extensive LD in the locus makes it difficult to identify causal and independent loci. Non-additive allelic effects in the MHC, and epistatic effects between the MHC and other loci, can also complicate inference of the underlying haplotype structure and disease susceptibility variants.
In recent years, large volumes of sequencing data have made it possible to impute MHC variation on a wide scale, thereby improving our understanding of variability at this locus and of the haplotype structures and enabling reference panels to be created. The availability of accurate reference panels and a large number of genotyped individuals has allowed the identification of independent variants and improved our understanding of their contribution to disease heritability and pathways underlying disease biology [9, 10].
Advances in laboratory-based mapping of MHC variation
Increased throughput, accuracy, and read length in next-generation sequencing (NGS) technologies, as well as the development of user-friendly bioinformatics tools, have enabled higher resolution MHC typing . For instance, whole-genome sequencing (WGS) was successfully used to type HLA-A alleles at full resolution in 1070 healthy Japanese individuals  and to fully evaluate HLA-E variability in West African populations . However, the main problem with MHC sequencing using current technologies is the relatively short read lengths, which limit the amount of allelic data that can be generated at a high resolution. Long-range PCR amplification approaches, such as the use of PacBio systems for single molecule real-time sequencing, significantly increase read-length and the resolution for typing MHC alleles . In a comparison of MHC typing in an Indian population using sequence-specific primers, NGS (Roche/454) and single molecule sequencing (PacBio RS II) platforms, higher resolution typing was achieved for MHC class I (HLA-A, HLA-B, and HLA-C ) and class II genes (HLA-DRB1 and HLA-DQB1) using the PacBio platform, with a median read length of 2780 nucleotides .
High-density SNP panels, such as the Immunochip platform , which has been widely implemented in immunogenetics studies, are a cheaper, faster, and easier alternative to genotyping than direct MHC typing and NGS methods. The Immunochip contains a dense panel of SNPs from the MHC locus, which enables missing classic MHC variants to be inferred in silico, where the imputation is based on the haplotype structure present in large reference panels (Fig. 1). This fine-mapping approach has been used for several autoimmune and inflammatory diseases (Table 1) and for a few infectious diseases (Additional file 1), thereby allowing comprehensive interrogation of the MHC. Moreover, population-specific reference panels made by deep sequencing and used to impute genotypes allow identification of very rare variants and novel single-nucleotide variants in the human genome. This is illustrated by a recent study in which the authors first built a Han Chinese MHC-specific database by deep sequencing the region in 9946 patients with psoriasis and 10,689 healthy controls, and then used this reference panel to impute genotype data to fine-map psoriasis-associated variants . Notably, functional variants in non-coding regions can be identified, as shown in a Japanese cohort of 1070 healthy individuals . These variants would be impossible to discover using SNP microarrays or low coverage sequencing on the same sample size (Fig. 1, Table 1).
MHC associations revealed by genome-wide association studies (GWAS) can often not be fine-mapped to a single allele at a single locus; rather they comprise independent effects from multiple loci (see “Role of MHC variants in human diseases”). The presence of these multiple, independent effects highlights the heterogeneous nature within and between diseases, which may lead to varying immunological responses. Fine-mapping has also shown that autoimmune diseases share MHC alleles and hence molecular pathways, which are likely to represent targets for shared therapies. For instance, the major associations within MHC class II across autoimmune diseases imply that modulating T-cell receptor (TCR) activation by using peptide-bearing MHC molecules on antigen-presenting cells (APCs) could be therapeutically useful . Shared MHC genetic factors have also been observed between autoimmune and infectious diseases, suggesting that human genetic architecture has evolved in response to natural selection as determined by various infectious pathogens .
Advances in computational approaches for mapping MHC variation
Long-range LD between loci and SNP markers across the MHC offers an alternative approach to interrogate functional MHC variation through imputation. The development of different imputation tools using population-specific reference panels has enhanced the interpretation of genotype data derived from genome-wide platforms. MHC imputation is done using reference panels containing both genetic information and classic HLA serotyping, thus allowing identification of MHC allelic and amino acid variants. It is advantageous to impute allele and amino acid variants in the MHC because background sequence diversity causes the binary SNP concept to fail, technically speaking, while many SNPs have more than two alleles and various amino acids can be contained in the same position. For instance, six possible amino acid variants at position 11 in the HLA-DRB1 gene show the strongest association to rheumatoid arthritis (RA) . Two of these (valine and leucine) confer susceptibility to RA, whereas the other four (asparagine, proline, glycine and serine) are protective.
Several tools allowing imputation of classic HLA alleles at four-digit resolution are now available for MHC imputation analysis; the most common are SNP2HLA , HLA*IMP:01 , and an improved HLA*IMP:02 . HLA*IMP:02 outperforms HLA*IMP:01 on heterogeneous European populations and it increases the power and accuracy in cross-European GWAS . Missing data are also better tolerated in HLA*IMP:02, while SNP genotyping platforms must be selected in HLA*IMP:01 [21, 22]. SNP2HLA not only imputes classic alleles but also amino acids by using two European reference panels, one based on data from HapMap-CEPH (90 individuals), and the other on the Type 1 Diabetes Genetics Consortium (T1DGC) study . Another tool, HLA-VBSeq, allows imputation of MHC alleles at full resolution from whole-genome sequence data . HLA-VBSeq does not require prior knowledge of MHC allele frequencies and can therefore be used for samples from genetically diverse populations . It has successfully typed HLA-A alleles at full resolution in a Japanese population and identified rare causal variants implicated in complex human diseases .
One commonly used European reference panel for imputation is the T1DGC panel, which covers SNP genotyping and classic HLA serotyping information for 5225 unrelated individuals . Similar population-specific reference panels have been developed for non-European studies to investigate the risk of psoriasis in Chinese populations  and of Graves’ disease and RA in Japanese populations. The panels have also been used to impute MHC alleles and amino acids for East Asian and Korean populations [24,25,26].
Using a single reference genome for regions like the MHC, which has substantial sequence and structural diversity, results in poor characterization. To counteract this, an algorithm was developed to infer much of the variation in the MHC; it allows genome inference from high-throughput sequencing data using known variation represented in a population reference graph (PRG) . Specifically, the PRG constructed for the MHC combined eight assembled haplotypes, the sequences of known classic HLA alleles, and 87,640 SNP variants from the 1000 Genomes Project . This approach is considered to be an intermediate step between de novo assembly and mapping to a single reference, but requires careful attention to the variation included in the PRG .
Despite the development of new tools to investigate MHC variation, the robustness of imputation depends largely on the reference panel and SNP selection. The frequency of alleles can differ between populations, thus highlighting the need to use population-specific reference panels to impute MHC alleles and amino acids. Additionally, the use of many samples is possible for analyzing the non-additive effects of MHC alleles on a wide scale, as described by Lenz et al. for celiac disease (CeD), psoriasis, and type 1 diabetes (T1D) . These non-additive effects could explain our inability to identify susceptibility variants. However, one important limitation of existing imputation methods is that they are limited to the classic MHC alleles and their amino acids. Another limitation is that accuracy is lower for low frequency or rare variants [20, 30]; this can be improved by increasing the reference panel size, together with the use of deep sequencing data. Ascertainment bias and lower LD also make it challenging to impute MHC variants in some non-European populations, such as Africans.
MHC genetic variation mediates susceptibility to a wide range of complex diseases, including infectious and autoimmune diseases. The large volume of data generated by recent GWAS has provided an excellent opportunity to apply imputation tools used to fine-map MHC associations to classic alleles and amino acids, as described below for autoimmune diseases. Overall, MHC imputation has proved to be a robust and cost-effective way to identify causal genes underlying disease pathogenesis. Ultimately, knowing the causal genes will help explain disease heritability and lead to a better understanding of the molecular pathways involved in disease pathogenesis. Such work helps to pinpoint potential therapeutic targets.
Role of MHC variants in human diseases
Insights into MHC susceptibility for autoimmune diseases: fine-mapping results, epistasis, and disease biology
Associations between the MHC and autoimmune diseases reported in the 1970s were some of the earliest described genetic associations [31, 32], and they remain the strongest risk factors for autoimmune diseases. After the development of wide-screen genotyping platforms and imputation pipelines, MHC imputation and fine-mapping were performed in European and Asian populations for most common autoimmune diseases, including RA [19, 25, 33, 34], CeD , psoriasis , ankylosing spondylitis (AS) , systemic lupus erythematosus (SLE) [33, 38,39,40,41], T1D [42, 43], multiple sclerosis (MS) [44, 45], Graves’ disease , inflammatory bowel disease (IBD) , and dermatomyositis (DM) . Table 1 shows the main associated variants and independently associated loci for autoimmune diseases.
In 2012, a pioneering MHC fine-mapping study, performed in individuals of European ancestry with RA , confirmed the strongest association with the class II HLA-DRB1 gene, as well as other independent associations. Previously an increased risk of RA was reported for a set of consensus amino acid sequences at positions 70–74 in the HLA-DRB1 gene, known as the “shared epitope” locus . The imputed data revealed the most significant associations were with two amino acids at position 11, located in a peptide-binding groove of the HLA-DR heterodimer. This suggested a functional role for this amino acid in binding the RA-triggering antigen. Similar fine-mapping studies followed for other autoimmune diseases (Table 1).
In general, in most autoimmune diseases, fine-mapping strategies have confirmed the main associated locus reported by serotype analysis within a certain MHC locus. Such strategies have also allowed identification of specific allelic variants or amino acids, as well as independent variants in different HLA classes. For instance, in CeD, the strongest association was with the known DQ-DR locus, and five other independent signals in classes I and II were also identified. CeD is the only autoimmune disease for which the antigen, gluten, is known and well studied. Gluten is a dietary product in wheat, barley, and rye. It is digested in the intestine and deamidated by tissue transglutaminase enzymes such that it perfectly fits the binding pockets of a particular CeD-risk DQ heterodimer (encoded by the DQ2.2, DQ2.5, and DQ8 haplotypes). This association was confirmed by MHC fine-mapping, which indicated roles for four amino acids in the DQ genes with the strongest independent associations to CeD risk . Similarly, the main associations were determined for T1D, MS, and SLE within the MHC class II locus (the associations for these three diseases are to a particular HLA-DQ-DR haplotype), and there are also independent, but weaker associations with the class I and/or III regions. In DM, fine mapping in an Asian population identified MHC associations driven by variants located around the MHC class II region, with HLA-DP1*17 being the most significant . In contrast, the primary and strongest associations in psoriasis and AS were to MHC class I molecules, while independent associations to the class I locus were also reported for IBD and Graves’ disease. Class III variants are weakly implicated in autoimmune diseases, but several associations in the MHC class III region were seen for MS; for instance, the association to rs2516489 belonging to the long haplotype between MICB and LST1 genes. The association signal to rs419788-T in the class III region gene SKIV2L has also been implicated in SLE susceptibility, representing a novel locus identified by fine-mapping in UK parent–child trios . An independent association signal to class III was also identified (rs8192591) by a large meta-analysis of European SLE cases and controls and, specifically, upstream of NOTCH4 . However, further studies are needed to explain how these genetic variations contribute to predisposition to SLE.
In addition to identifying independent variants, MHC fine-mapping studies permit analysis of epistatic and non-additive effects in the locus. These phenomena occur when the effect of one allele on disease manifestation depends on the genotype of another allele in the locus (non-additive effect), or on the genotype of the “modifier” gene in another locus (epistasis). Non-additive MHC effects were established in CeD, in which knowing gluten was the causal antigen offered an advantage in investigating the antigen-specific structure of the DQ-heterodimer. CeD risk is mediated by the presence of several HLA-DQ haplotypes, including the DQ2.5, DQ2.2, and DQ8 haplotypes, which form the specific pocket that efficiently presents gluten to T cells. These haplotypes can be encoded either in cis, when both DQA1 and DQB1 are located on the same chromosome, or in trans, when they are located on different chromosomes. Some DQ allelic variants confer susceptibility to CeD only in combination with certain other haplotypes, forming a CeD-predisposing trans-combination. For example, HLA-DQA1*0505-DQB1*0301 (DQ7) confers risk to CeD only if it is combined with DQ2.2 or DQ2.5, contributing to the formation of susceptible haplotypes in trans. In particular, DQ7/DQ2.2 heterozygosity confers a higher risk for CeD than homozygosity for either of these alleles, and is an example of a non-additive effect for both alleles.
Unlike CeD, the exact haplotypes and their associated properties remain unknown for most other autoimmune diseases; therefore, analyzing non-additive effects might yield new insights into potentially disease-causing antigens. Lenz et al. provided evidence of significant non-additive effects for autoimmune diseases, including CeD, RA, T1D, and psoriasis, which were explained by interactions between certain classic HLA alleles . For instance, specific interactions that increase T1D disease risk were described between HLA-DRB1*03:01-DQB1*02:01/DRB1*04:01-DQB1*03:02 genotypes  and for several combinations of the common HLA-DRB1, HLA-DQA1, and HLA-DQB1 haplotypes . In AS, epistatic interaction was observed for combinations of HLA-B60 and HLA-B27, indicating that individuals with the HLA-B27+/HLA-B60+ genotype have a high risk of developing AS . Moreover, a recent study in MS found evidence for two interactions involving class II alleles: HLA-DQA1*01:01-HLA-DRB1*15:01 and HLA-DQB1*03:01-HLA-DQB1*03:02, although their contribution to the missing heritability in MS was minor .
Epistatic interactions between MHC and non-MHC alleles have also been reported in several autoimmune diseases, including SLE, MS, AS, and psoriasis. For instance, in a large European cohort of SLE patients, the most significant epistatic interaction was identified between the MHC region and cytotoxic T lymphocyte antigen 4 (CTLA4) , which is upregulated in T cells upon encountering APCs. This highlights that appropriate antigen presentation and T-cell activation are important in SLE pathogenesis . Notably, interactions between MHC class I and specific killer immunoglobulin receptor (KIR) genes are important in predisposition to autoimmune diseases such as psoriatic arthritis, scleroderma, sarcoidosis, and T1D [51,52,53,54]. KIR genes are encoded by the leukocyte receptor complex on chromosome 19q13 and expressed on natural killer cells and subpopulations of T cells . Finally, epistatic interactions between MHC class I and ERAP1 have been described for AS, psoriasis, and Behçet’s disease .
Association of novel MHC variants and identification of interaction effects within the MHC are increasing our understanding of the biology underlying autoimmune and inflammatory diseases. Fine-mapping the main associated locus within HLA-DQ-DR haplotypes has allowed determination of the key amino acid positions in the DQ or DR heterodimer. Pinpointing specific amino acids leads to a better understanding of the structure and nature of potential antigens for autoimmune or inflammatory diseases, and these can then be tested through binding assays and molecular modeling. The fact that these positions are located in peptide-binding grooves suggests they have a functional impact on antigenic peptide presentation to T cells, either during early thymic development or during peripheral immune responses . In addition, analysis of non-additive effects in MHC-associated loci offers the possibility to identify antigen-specific binding pockets and key amino acid sequences. For example, identification of the protective, five-amino acid sequence DERAA as a key sequence in the RA-protective HLA-DRB1:13 allele, and its similarity to human and microbial peptides, led to identification of (citrullinated) vinculin and some pathogen sequences as novel RA antigens .
The identification of independent signals in MHC classes I and III for many autoimmune diseases implies that these diseases involve novel pathway mechanisms. For example, association of CeD to class I molecules suggests a role for innate-like intraepithelial leucocytes that are restricted to class I expression and that are important in epithelial integrity and pathogen recognition . Class I associations to RA, T1D, and other autoimmune diseases suggest that CD8+ cytotoxic cells are involved in disease pathogenesis, as well as CD4+ helper T cells.
Discovering the epistatic effects of MHC and non-MHC loci can also shed light on disease mechanisms. For example, ERAP1 loss-of-function variants reduce the risk of AS in individuals who are HLA-B27-positive and HLAB-40:01-positive, but not in carriers of other risk haplotypes . Similar epistatic effects were also observed for psoriasis, such that individuals who carry variants in ERAP1 showed an increased risk only when they also carried an HLA-C risk allele . In line with these observations, mouse studies have shown that ERAP1 determines the cleavage of related epitopes in such a way that they can be presented by the HLA-B27 molecule . Confirming that certain epitopes must be cleaved by ERAP1 to be efficiently presented by CD4+ and CD8+ cells will be a critical step in identifying specific triggers for autoimmune diseases.
The recent discoveries of genetic associations between MHC alleles and autoimmune diseases are remarkable and offer the potential to identify disease-causing antigens. This would be a major step towards developing new treatments and preventing disease. However, we still do not understand exactly how most associated alleles and haplotypes work, and extensive functional studies are needed to clarify their involvement in disease.
Explained heritability by independent MHC loci for autoimmune diseases
Heritability is an estimation of how much variation in a disease or phenotype can be explained by genetic variants. Estimating heritability is important for predicting diseases but, for common diseases, it is challenging and depends on methodological preferences, disease prevalence, and gene–environment interactions that differ for each phenotype . It is therefore difficult to compare heritability estimates across diseases. Nevertheless, for many diseases, estimates have been made as to how much phenotypic variance can be explained by the main locus and by independent MHC loci .
For autoimmune diseases with a main association signal coming from a class II locus, the reported variance explained by MHC alleles varies from 2 − 30% . The strongest effect is reported for T1D, in which the HLA-DR and HLA-DQ haplotypes explain 29.6% of phenotypic variance; independently associated loci in HLA-A, HLA-B, and HLA-DPB1 together explain about 4% of the total phenotypic variance, while all other non-MHC loci are responsible for 9% . Similarly, in CeD, which has the same main associated haplotype as T1D, the HLA-DQ-DR locus explains 23 − 29% of disease variance (depending on the estimated prevalence of disease, which is 1 − 3%), whereas other MHC alleles explain 2 − 3%, and non-MHC loci explain 6.5 − 9% . In seropositive RA, 9.7% of phenotypic variance is explained by all the associated DR haplotypes, whereas a model including three amino acid positions in DRB1, together with independently associated amino acids in HLA-B and HLA-DP loci, explains 12.7% of the phenotypic variance . This indicates that non-DR variants explain a proportion of heritability comparable to that in other non-MHC loci (4.7 − 5.5% in Asians and Europeans) . The non-additive effects of DQ-DR haplotypes can also explain a substantial proportion of phenotypic variance: 1.4% (RA), 4.0% (T1D), and 4.1% (CeD) . In MS, the major associated allele, DRB1*15:01, accounts for 10% of the phenotypic variance, whereas all the alleles in DRB1 explain 11.6%. A model including all of the independent variants (and those located in classes I, II, and III) explains 14.2% of the total variance in MS susceptibility .
In SLE, the proportion of variance explained by the MHC is notably lower, at only 2% , and is mostly due to class II variants. In IBD, the association with MHC is weaker than in classic autoimmune diseases, with a lower contribution seen in Crohn’s disease (CD) than in ulcerative colitis (UC) . The main and secondary variants can now explain 3.1% of heritability in CD and 6.2% in UC, which is two to ten times greater than previously attributed by main effect analysis in either disease (0.3% in CD and 2.3% in UC for the main SNP effect) . Among all the diseases discussed here, the main effect of the associated haplotype is far stronger than the independent effects from other loci (with the exception of IBD, in which the MHC association is weaker overall). However, independent MHC loci can now explain a comparable amount of the disease variance to that explained by the non-MHC associated genes known so far.
Insights into MHC susceptibility for infectious diseases: GWAS, fine-mapping results, and epistasis
In principle, an infectious disease is caused by interactions between a pathogen, the environment, and host genetics. Here, we discuss MHC genetic associations reported in infectious diseases from GWAS (Table 2) and how these findings can explain increased susceptibility or protection by affecting human immune responses. This is why certain MHC classes are important in infectious diseases. We note that fewer MHC associations have been found for infectious diseases than for autoimmune diseases, mainly because of the smaller cohort sizes for infectious diseases. Thus, extensive fine-mapping studies (and imputation) have yet to be performed, with the exception of a few studies on infections such as human immunodeficiency virus (HIV) , human hepatitis B virus (HBV) [63, 64], human hepatitis C virus (HCV) , human papilloma virus (HPV) seropositivity , and tuberculosis .
From a genetic viewpoint, one of the best-studied infectious diseases is HIV infection. MHC class I loci have strong effects on HIV control [62,69,70,, 68–71] and acquisition , viral load set point [69,70,71], and non-progression of disease  in Europeans [69, 70, 72, 73], and in multi-ethnic populations (Europeans, African-Americans, Hispanics, and Chinese) [62, 68, 71]. A GWAS of an African-American population indicated a similar HIV-1 mechanism in Europeans and African-Americans: about 9.6% of the observed variation in viral load set point can be explained by HLA-B*5701 in Europeans , while about 10% can be explained by HLA-B*5703 in African-Americans . In contrast, the MHC associations and imputed amino acids identified in Europeans and African-Americans were not replicated in Chinese populations, possibly because of the varied or low minor allele frequencies of these SNPs in Chinese people . A strong association to the MHC class I polypeptide-related sequence B (MICB) was also revealed by a recent GWAS for dengue shock syndrome (DSS) in Vietnamese children . This result was replicated in Thai patients, indicating MICB can be a strong risk factor for DSS in Southeastern Asians .
HLA-DP and HLA-DQ loci, along with other MHC or non-MHC loci (TCF19, EHMT2, HLA-C, HLA-DOA, UBE2L3, CFB, CD40, and NOTCH4) are consistently associated with susceptibility to HBV infection in Asian populations [76,77,78,79,80,81,82,83]. Significant associations between the HLA-DPA1 locus and HBV clearance were also confirmed in independent East Asian populations [79, 81]. A fine-mapping study of existing GWAS data from Han Chinese patients with chronic HBV infection used SNP2HLA as the imputation tool and a pan-Asian reference panel. It revealed four independent associations at HLA-DPβ1 positions 84–87, HLA-C amino acid position 15, rs400488 at HCG9, and HLA-DRB1*13; together, these four associations could explain over 72.94% of the phenotypic variance caused by genetic variations . Another recent study using imputed data from Japanese individuals indicated that class II alleles were more strongly associated with chronic HBV infection than class I alleles (Additional file 1) . Similarly, the HLA-DQ locus influences the spontaneous clearance of HCV infection in cohorts of European and African ancestry, while DQB1*03:01, which was identified by HLA genotyping together with the non-MHC IL28B, can explain 15% of spontaneous HCV infection clearance cases . HLA-DQB1*03 also confers susceptibility to chronic HCV in Japanese people . A GWAS in a European population revealed that HPV8 seropositivity is influenced by the MHC class II region . However, HPV type 8 showed a higher seropositivity prevalence than other HPV types at the population level ; this led to a limited power to detect associations with other HPV types. Fine-mapping using the same European population as in the GWAS  revealed significant associations with HPV8 and HPV77 seropositivity, but only with MHC class II alleles, not with class I alleles. This indicates a pivotal role for class II molecules in antibody immune responses in HPV infection. Notably in this study, imputation was performed using HLA*IMP:02 and reference panels from the HapMap Project  and the 1958 British Birth Cohort, as well as using SNP2HLA with another reference panel from the T1DGC. Both imputation tools provided comparable results, thus highlighting the important role of MHC class II alleles in antibody response to HPV infection .
A GWAS on leprosy in Chinese populations pointed to significant associations with HLA-DR-DQ loci [87, 88]; these results were replicated in an Indian population . Fine-mapping the MHC showed that variants in HLA class II were extensively associated with susceptibility to leprosy in Chinese people, with HLA-DRB1*15 being the most significant variant . HLA class II variants also influence the mycobacterial infection tuberculosis in European and African populations [67, 90]. Fine-mapping identified the DQA1*03 haplotype, which contains four missense variants and contributes to disease susceptibility . A meta-analysis showed that five variants (HLA-DRB1*04, *09, *10, *15, and *16) increase the risk of tuberculosis, especially in East Asian populations, whereas HLA-DRB1*11 is protective .
Using a population from Brazil, the first GWAS on visceral leishmaniasis revealed that the class II HLA-DRB1-HLA-DQA1 locus had the strongest association signal; this was replicated in an independent Indian population . This common association suggests that Brazilians and Indians share determining genetic factors that are independent of the different parasite species in these geographically distinct regions.
Finally, epistatic interactions between MHC class I alleles and certain KIR alleles (between KIR3DS1 combined with HLA-B alleles) are associated with slower progression to acquired immunodeficiency syndrome (AIDS)  and better resolution of HCV infection (between KIR2DL3 and its human leukocyte antigen C group 1, HLA-C1) .
Insights into the biology of infectious diseases
Associations with the MHC class I locus suggest a critical role for CD8+ T-cell responses in major viral infections such as HIV, dengue, and HCV. This critical role of CD8+ T-cell responses in HIV infection is reflected by the slow disease progression seen in infected individuals because of their increasing CD8+ T-cell responses that are specific to conserved HIV proteins such as Gap p24 . Interestingly, five out of six amino acid residues (Additional file 1) identified as associated with HIV control  lie in the MHC class I peptide-binding groove, implying that MHC variation affects peptide presentation to CD8+ T cells. In particular, the amino acid at position 97, which lies in the floor of the groove in HLA-B, was most significantly associated with HIV control (P = 4 × 10−45) . This amino acid is also implicated in MHC protein folding and cell surface expression . An association found in severe dengue disease also underscores the role of CD8+ T cells in disease pathogenesis: class I alleles that were associated with an increased risk of severe dengue disease were also associated with weaker CD8+ T-cell responses in a Sri Lankan population from an area of hyper-endemic dengue disease . In HCV, similar to the protective alleles against HIV infection , HLA-B*27 presents the most conserved epitopes of HCV to elicit strong cytotoxic T-cell responses, thereby reducing the ability of HCV to escape from host immune responses .
Associations between genetic variants in the MHC class II region and disease susceptibility imply that impaired antigen presentation or unstable MHC class II molecules contribute to insufficient CD4+ T-cell responses and, subsequently, to increased susceptibility to infections. For instance, the amino acid changes at positions of HLA-DPβ1 and HLA-DRβ1 in the antigen-binding groove that influence HBV infection may result in defective antigen presentation to CD4+ T cells or to impaired stability of MHC class II molecules, thereby increasing susceptibility to HBV infection . CD4+ T-cell responses are also critical in mycobacterial infections, such as has been described for leprosy and tuberculosis [99, 100]. Notably, monocyte-derived macrophages treated with live Mycobacterium leprae showed three main responses that explain infection persistence: downregulation of certain pro-inflammatory cytokines and MHC class II molecules (HLA-DR and HLA-DQ), preferentially primed regulatory T-cell responses, and reduced Th1-type and cytotoxic T-cell function . Macrophages isolated from the lesions of patients with the most severe disease form, lepromatous leprosy, also showed lower expression of MHC class II molecules, providing further evidence that defective antigen presentation by these molecules leads to more persistent and more severe M. leprae infection .
Recently, it has been shown that CD4+ T-cells are essential for the optimal production of IFNγ by CD8+ T-cells in the lungs of mice infected with M. tuberculosis, indicating that communication between these two distinct effector cell populations is critical for a protective immune response against this infection . Impaired antigen processing and presentation from Leishmania-infected macrophages (which are the primary resident cells for this parasite) to CD4+ T cells could explain increased susceptibility to leishmaniasis . The association between HPV seropositivity and the MHC class II region also suggests that class II molecules bind and present exogenous antigens more effectively to a subset of CD4+ T cells known as Th2. These Th2 cells help primed B lymphocytes to differentiate into plasma cells and to secrete antibodies against the HPV virus.
In support of the hypothesis that genetic effects on both CD8+ (class I) and CD4+ (class II) cells modify the predisposition to infections, it should be noted that some infectious diseases, such as HIV, HBV, HCV, and leprosy, show associations to more than one of the classic MHC classes and, in some cases, the associations differ between populations (Table 2). Moreover, consideration must be given to the differences between viral and bacterial genotypes in the same infection, which play a role in determining potentially protective effects. Overall, associations with multiple MHC loci reflect the complex and interactive nature of host immune responses when the host encounters a pathogen.
Relationship between the MHC variants involved in autoimmune and infectious diseases
Both autoimmune and infectious diseases seem to involve certain MHC classes (Fig. 2a), and only a few MHC alleles are shared between these two distinct disease groups (Fig. 2b). The identification of shared MHC variation has provided insight into the relationships between the MHC variants involved in autoimmune and infectious diseases and which have been uniquely shaped throughout human evolution .
Two hypotheses have been proposed to explain the relationships between the MHC variants involved in both groups of diseases. The first, known as the “pathogen-driven selection” hypothesis, states that pressure exerted on the human genome by pathogens has led to the advantageous selection of host defense genes and, subsequently, to much higher polymorphism in the MHC. This polymorphism has contributed to the development of complex immune defense mechanisms that protect humans against a broad range of pathogens. Thus, heterozygosity at MHC loci is evolutionarily favored and has become an efficient mechanism contributing to the highly polymorphic MHC (the “MHC heterozygosity advantage”) . Two examples of MHC heterozygote advantage are HIV-1-infected heterozygotes at class I loci, which are slower to progress to AIDS [104, 105], and HBV-infected heterozygotes at class II loci, which seem more likely to clear the infection . In addition, human populations exposed to a more diverse range of pathogens display higher class I genetic diversity than those exposed to a smaller range . However, the true effect of infectious diseases on selection might be underestimated because of the heterogeneity of many pathogens and the changing prevalence of infectious diseases over evolutionary time.
Positive selection of the advantageous effect of MHC polymorphism in infections may also be accompanied by a higher risk of developing autoimmune diseases. For example, the non-MHC locus SH2B3 rs3184504*A is a risk allele for CeD but has been under positive selection because it offers the human host protection against bacterial infections . To investigate whether other genetic variants in the MHC show this opposite direction effect between autoimmune and infectious diseases (Fig. 2b), we compared SNPs and alleles in the MHC identified by GWAS and fine-mapping studies on autoimmune diseases (Table 1; Additional file 2) with those identified in infectious diseases (Table 2; Additional file 1). On the one hand, HLA-B*27:05, which has one of the strongest associations to AS in the MHC (P < 1 × 10−2000)  and is present in all ethnic groups, increases AS risk. On the other hand, it also has a protective effect against HIV infection, showing a nominal significant value of 5.2 × 10–5 . The second example of opposite allelic effect is the association between the rs2395029*G allele and susceptibility to psoriasis (OR = 4.1; P = 2.13 × 10–26)  and AIDS non-progression (P = 9.36 × 10–12) . Located in the HLA complex P5 (HCP5), rs2395029 is a proxy for the HLA-B*57:01 allele , the strongest protective allele against AIDS progression . Non-progressors carrying the rs2395029-G allele had a lower viral load than other non-progressors .
Another study showed that psoriasis patients carry the same genetic variants as HIV controllers/non-progressors and that they are particularly enriched for the protective allele HLA-B*57:01 (P = 5.50 × 10–42) . Moreover, the intergenic variant rs10484554*A, which is in LD with HLA-C (r2 ≥ 0.8), was significantly associated with AIDS non-progression (P = 6.27 × 10–8)  and with susceptibility to psoriasis (OR = 4.66, P = 4 × 10–214) . HLA-C*06:02 (equivalent to HLA-Cw6) was most strongly associated with susceptibility to psoriasis (OR = 3.26; P = 2.1 × 10–201)  and is also protective against HIV infection (OR = 2.97; P = 2.1 × 10 –19) . The same allele has been associated with susceptibility to CD (OR = 1.17; P = 2 × 10–13) . Interestingly, the role of MHC in HIV control also relates to the influence of MHC expression levels. For instance, rs9264942 shows one of the most significant genome-wide effects observed on HIV control [62, 69, 70]: it is located 35 kb upstream of the HLA-C locus (Table 2) and has been associated with high HLA-C expression, conferring protection against HIV infection . Explaining this protective effect, HLA-C allelic expression was correlated with increasing likelihood of CD8+ T-cell cytotoxicity . However, the −35 SNP is not a causal variant, but is in LD with a SNP at the 3′ end of HLA-C; this affects HLA-C expression by influencing binding of the microRNA Hsa-miR-148a . Notably, high HLA-C expression has a deleterious effect by conferring risk for Crohn’s disease . The potential mechanism by which HLA expression levels confer resistance to pathogens, and also lead to greater autoimmunity, could be through promiscuous peptide binding . Lastly, HLA-DQB1*03:02 showed a dominant risk effect for MS (OR = 1.30; P = 1.8 × 10–22) , whereas it is a resistant allele against chronic HBV infection (OR = 0.59; P = 1.42 × 10–5) .
The second hypothesis states that pathogens can trigger autoimmunity, as suggested by epidemiological studies [115, 116]. For example, it has recently been shown that apoptosis of infected colonic epithelial cells in mice induces the proliferation of self-reactive CD4+ T cells that are specific to cellular and to pathogenic antigens . Self-reactive CD4+ T cells differentiate into Th17 cells, which promote production of auto-antibodies and auto-inflammation, implying that infections can trigger autoimmunity . Other mechanisms have been proposed, such as molecular mimicry, bystander activation, exposure of cryptic antigens, and superantigens . Common genetic signatures between autoimmune and infectious diseases indirectly imply that pathogens can indeed trigger autoimmunity. In line with this second hypothesis, we have identified common risk factors between autoimmune and infectious diseases, such as the alleles: HLA-DRB1*15 for MS, SLE (Table 1), and leprosy (OR = 2.11; P = 3.5 × 10–28) ; rs9275572*C, located in HLA-DQ, for chronic HCV infection (OR = 0.71; P = 2.62 × 10–6) , and SLE (P = 1.94 × 10–6) ; HLA-DQB1*03:02 for MS (OR = 1.30; P = 1.8 × 10–22)  and pulmonary tuberculosis (OR = 0.59; P = 2.48 × 10–5) ; HLA-C*12:02 for UC (OR = 2.25; P = 4 × 10–37) , CD (OR = 1.44; P = 3x 10–8) , and chronic HBV infection (OR = 1.70; P = 7.79 × 10–12) ; and rs378352*T, located in HLA-DOA, for chronic HBV infection (OR = 1.32; P = 1.16 × 10–7)  and RA (OR = 1.24; P = 4.6 × 10–6)  (Fig. 2a).
Associations within the MHC region for several autoimmune diseases, such as RA, CeD, AS, T1D, Graves’ disease, and DM, and HBV infection are driven by variants and alleles around HLA-DPB1 (Table 1), implying that viruses like HBV could trigger autoimmunity. Although there is no convincing evidence, HBV and HCV are associated with extra-hepatic autoimmune perturbations [120, 121]. Lastly, the DQA1*03:01 allele, which contributes to tuberculosis susceptibility (OR = 1.31; P = 3.1 × 10–8) , is also a well-known risk factor for CeD as part of the DQ8 (DQA1*03-DQB1*03:02) and DQ2.3 (trans-DQA1*03:01 and DQB1*02:01) haplotypes . DQA1*03 also increases susceptibility to T1D, RA, and juvenile myositis [123,124,125]. Overall, the direction of association is the same for most shared MHC class II loci, suggesting that bacteria and viruses can trigger immune responses. No viruses have been proven to cause an autoimmune disease thus far, but multiple virus infections could prime the immune system and eventually trigger an autoimmune response; this is a hypothesis that has been supported by animal studies on MS .
Conclusions and future perspectives
We have discussed recent advances in understanding the genetic variation in the MHC in relation to autoimmune and infectious diseases. However, confidence in the associations between MHC and infectious diseases is limited, mainly because of the relatively small patient cohort sizes available. Further limitations to identifying and replicating associations with infectious diseases include: strain differences, heterogeneity in clinical phenotypes, use of inappropriate controls (such as individuals with asymptomatic infections), and population-specific differences in allele frequency and/or haplotype structure. Finally, with the exception of a few described above, no imputation has been performed in most infectious disease studies. In certain populations, such as Africans, lower LD makes it challenging to perform MHC imputation.
Although application of a traditional GWAS is challenging for infectious diseases, other approaches may increase the power of genetic studies. For instance, a combination of transcriptional analysis and systems biology allowed the identification of a novel role for type I IFN signaling pathway in the human host immune response against Candida albicans . The use of control subjects for whom it is known whether they clear the infection, and who come from the same hospital as patients, could be appropriate for infectious diseases so that co-morbidities and clinical risk factors are as similar as possible between groups. Overall, initiating collaborative efforts to increase patient cohort numbers, designing better studies by using more appropriate controls and more homogenously clinically defined patient phenotypes, and applying imputation using population-specific reference genomes would open new avenues to study the genetics of infectious diseases.
In contrast to infectious diseases, the added value of fine-mapping the MHC to pinpoint genetic risk factors for autoimmune disease has been well demonstrated by numerous studies. The associations that have been found in both European and Asian populations to the same amino acids by fine-mapping the MHC suggest that the same molecular mechanism is involved, despite the differences in MHC allele frequencies seen between these ethnic groups.
MHC-based imputation approaches using genotype data, along with the use of population-specific reference panels for imputing MHC alleles and amino acids, has allowed identification of the MHC variation associated with complex diseases. Although identification is challenging, genetic variation in the MHC is of critical importance for two reasons. First, it sheds light on the development of autoimmunity, given the two hypotheses discussed above (pathogen-driven evolutionary selection of protective genes or pathogens as triggers of autoimmunity), and second, it yields greater understanding of the complexity of the human immune system. This knowledge will ultimately permit the design of better prophylactic and therapeutic strategies to achieve more balanced patient–immune responses during treatment.
Genome-wide association study
Hepatitis B virus
Hepatitis C virus
Human immunodeficiency virus
Human leukocyte antigen
Human papilloma virus
Inflammatory bowel disease
Killer immunoglobulin receptor
Major histocompatibility complex
Population reference graph
Systemic lupus erythematosus
Single nucleotide polymorphism
Shiina T, Hosomichi K, Inoko H, Kulski JK. The HLA genomic loci map: expression, interaction, diversity and disease. J Hum Genet. 2009;54:15–39.
Purcell SM, Wray NR, Stone JL, Visscher PM, O’Donovan MC, Sullivan PF, et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature. 2009;10:8192.
Song S, Miranda CJ, Braun L, Meyer K, Frakes AE, Ferraiuolo L, et al. Major histocompatibility compex class I molecules protect motor neurons from astrocyte-induced toxicity in amyotrophic lateral sclerosis (ALS). Nat Med. 2016;22:397–403.
Shatz CJ. MHC class I: an unexpected role in neuronal plasticity. Neuron. 2009;64(1):40–5.
Hamza TH, Zabetian CP, Tenesa A, Laederach A, Montimurro J, Yearout D, et al. Common genetic variation in the HLA region is associated with late-onset sporadic Parkinson’s disease. Nat Genet. 2010;42:781–5.
Hill-Burns EM, Factor SA, Zabetian CP, Thomson G, Payami H. Evidence for more than one Parkinson’s disease-associated variant within the HLA region. PLoS One. 2011;6, e27109.
Cortes A, Brown MA. Promise and pitfalls of the Immunochip. Arthritis Res Ther. 2011;13:101.
Horton R, Gibson R, Coggill P, Miretti M, Allcock RJ, Almeida J, et al. Variation analysis and gene annotation of eight MHC haplotypes: the MHC Haplotype Project. Immunogenetics. 2008;60:1–18.
Hughes T, Adler A, Kelly JA, Kaufman KM, Williams AH, Langefeld CD, et al. Evidence for gene-gene epistatic interactions among susceptibility loci for systemic lupus erythematosus. Arthritis Rheum. 2012;64:485–92.
Kirino Y, Bertsias G, Ishigatsubo Y, Mizuki N, Tugal-Tutkun I, Seyahi E, et al. Genome-wide association analysis identifies new susceptibility loci for Behçet’s disease and epistasis between HLA-B*51 and ERAP1. Nat Genet. 2013;45:202–7.
Carapito R, Radosavljevic M, Bahram S. Next-generation sequencing of the HLA locus: methods and impacts on HLA typing, population genetics and disease association studies. Hum Immunol. 2016;77(11):1016–23.
Nagasaki M, Yasuda J, Katsuoka F, Nariai N, Kojima K, Kawai Y, et al. Rare variant discovery by deep whole-genome sequencing of 1,070 Japanese individuals. Nat Commun. 2015;6:8018.
Castelli EC, Mendes-Junior CT, Sabbagh A, Porto IO, Garcia A, Ramalho J, et al. HLA-E coding and 3’ untranslated region variability determined by next-generation sequencing in two West-African population samples. Hum Immunol. 2015;76:945–53.
Rhoads A, Au KF. PacBio sequencing and its applications. Genomics Proteomics Bioinforma. 2015;13:278–89.
Gowda M, Ambardar S, Dighe N, Manjunath A, Shankaralingu C, Hallappa P, et al. Comparative analyses of low, medium and high-resolution HLA typing technologies for human populations. J Clin Cell Immunol. 2016;7:399.
Zhou F, Cao H, Zuo X, Zhang T, Zhang X, Liu X, et al. Deep sequencing of the MHC region in the Chinese population contributes to studies of complex disease. Nat Genet. 2016;48:1–11.
Cho JH, Feldman M. Heterogeneity of autoimmune diseases: pathophysiologic insights from genetics and implications for new therapies. Nat Med. 2015;21:730–8.
Barreiro LB, Quintana-Murci L. From evolutionary genetics to human immunology: how selection shapes host defence genes. Nat Rev Genet. 2010;11:17–30.
Raychaudhuri S, Sandor C, Stahl EA, Freudenberg J, Lee H, Jia X, et al. Five amino acids in three HLA proteins explain most of the association between MHC and seropositive rheumatoid arthritis. Nat Genet. 2012;44:291–6.
Jia X, Han B, Onengut-Gumuscu S, Chen WM, Concannon PJ, Rich SS, et al. Imputing amino acid polymorphisms in human leukocyte antigens. PLoS One. 2013;8(6), e64683.
Dilthey AT, Moutsianas L, Leslie S, McVean G. HLA*IMP--an integrated framework for imputing classical HLA alleles from SNP genotypes. Bioinformatics. 2011;27:968–72.
Dilthey A, Leslie S, Moutsianas L, Shen J, Cox C, Nelson MR, et al. Multi-population classical HLA type imputation. PLoS Comput Biol. 2013;9(2), e1002877.
Nariai N, Kojima K, Saito S, Mimori T, Sato Y, Kawai Y, et al. HLA-VBSeq: accurate HLA typing at full resolution from whole-genome sequencing data. BMC Genomics. 2015;16 Suppl 2:S7.
Okada Y, Momozawa Y, Ashikawa K, Kanai M, Matsuda K, Kamatani Y, et al. Construction of a population-specific HLA imputation reference panel and its application to Graves’ disease risk in Japanese. Nat Genet. 2015;47:798–802.
Okada Y, Suzuki A, Ikari K, Terao C, Kochi Y, Ohmura K, et al. Contribution of a non-classical HLA gene, HLA-DOA, to the risk of rheumatoid arthritis. Am J Hum Genet. 2016;99:366–74.
Kim K, Bang SY, Lee HS, Bae SC. Construction and application of a Korean reference panel for imputing classical alleles and amino acids of human leukocyte antigen genes. PLoS One. 2014;9(11), e112546.
Dilthey A, Cox C, Iqbal Z, Nelson MR, McVean G. Improved genome inference in the MHC using a population reference graph. Nat Genet. 2015;47:682–8.
The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature. 2015;526:68–74.
Lenz TL, Deutsch AJ, Han B, Hu X, Okada Y, Eyre S, et al. Widespread non-additive and interaction effects within HLA loci modulate the risk of autoimmune diseases. Nat Genet. 2015;47:4–7.
Marchini J, Howie B. Genotype imputation for genome-wide association studies. Nat Rev Genet. 2010;11:499–511.
Mulder DJ. HLA antigens and coeliac disease. Lancet. 1974;2:727.
Grumet FC, Coukell A, Bodmer JG, Bodmer WF, McDevitt HO. Histocompatibility (HL-A) antigens associated with systemic lupus erythematosus. A possible genetic predisposition to disease. N Engl J Med. 1971;285:193–6.
Kim K, Bang S-Y, Yoo DH, Cho S-K, Choi C-B, Sung Y-K, et al. Imputing variants in HLA-DR beta genes reveals that HLA-DRB1 is solely associated with rheumatoid arthritis and systemic lupus erythematosus. PLoS ONE. 2016;11(2):e0150283.
Okada Y, Kim K, Han B, Pillai NE, Ong RT-H, Saw W-Y, et al. Risk for ACPA-positive rheumatoid arthritis is driven by shared HLA amino acid polymorphisms in Asian and European populations. Hum Mol Genet. 2014;23:6916–26.
Gutierrez-Achury J, Zhernakova A, Pulit SL, Trynka G. Fine-mapping in the MHC region accounts for 18% additional genetic risk for celiac disease. Nat Genet. 2015;47:577–8.
Okada Y, Han B, Tsoi LC, Stuart PE, Ellinghaus E, Tejasvi T, et al. Fine mapping major histocompatibility complex associations in psoriasis and its clinical subtypes. Am J Hum Genet. 2014;95:162–72.
Cortes A, Pulit SL, Leo PJ, Pointon JJ, Robinson PC, Weisman MH, et al. Major histocompatibility complex associations of ankylosing spondylitis are complex and involve further epistasis with ERAP1. Nat Commun. 2015;6:7146.
Kim K, Bang SY, Lee HS, Okada Y, Han B, Saw WY, et al. The HLA-DRβ1 amino acid positions 11-13-26 explain the majority of SLE-MHC associations. Nat Commun. 2014;5:5902.
Fernando MM, Stevens CR, Sabeti PC, Walsh EC, McWhinnie AJ, Shah A, et al. Identification of two independent risk factors for lupus within the MHC in United Kingdom families. PLoS Genet. 2007;3, e192.
Morris DL, Taylor KE, Fernando MMA, Nititham J, Alarcón-Riquelme ME, Barcellos LF, et al. Unraveling multiple MHC gene associations with systemic lupus erythematosus: model choice indicates a role for HLA alleles and non-HLA genes in Europeans. Am J Hum Genet. 2012;91:778–93.
Sun C, Molineros JE, Looger LL, Zhou X, Okada Y, Ma J, et al. High-density genotyping of immune-related loci identifies new SLE risk variants in individuals with Asian ancestry. Nat Genet. 2016;48:323–30.
Howson JMM, Walker NM, Clayton D, Todd JA. Confirmation of HLA class II independent type 1 diabetes associations in the major histocompatibility complex including HLA-B and HLA-A. Diabetes Obes Metab. 2009;11:31–45.
Hu X, Deutsch AJ, Lenz TL, Onengut-Gumuscu S, Han B, Chen W-M, et al. Additive and interaction effects at three amino acid positions in HLA-DQ and HLA-DR molecules drive type 1 diabetes risk. Nat Genet. 2015;47:898–905.
Moutsianas L, Jostins L, Beecham AH, Dilthey AT, Xifara DK, Ban M, et al. Class II HLA interactions modulate genetic risk for multiple sclerosis. Nat Genet. 2015;47:1107–13.
Patsopoulos NA, Barcellos LF, Hintzen RQ, Schaefer C, van Duijn CM, Noble JA, et al. Fine-mapping the genetic gssociation of the major histocompatibility complex in multiple sclerosis: HLA and non-HLA effects. PLoS Genet. 2013;9(11), e1003926.
Goyette P, Boucher G, Mallon D, Ellinghaus E, Jostins L, Huang H, et al. High-density mapping of the MHC identifies a shared role for HLA-DRB1*01:03 in inflammatory bowel diseases and heterozygous advantage in ulcerative colitis. Nat Genet. 2015;47:172–9.
Zhang CE, Li Y, Wang ZX, Gao JP, Zhang XG, Zuo XB, et al. Variation at HLA-DPB1 is associated with dermatomyositis in Chinese population. J Dermatol. 2016;43:1307–13.
Gregersen PK, Silver J, Winchester RJ. The shared epitope hypothesis. An approach to understanding the molecular genetics of susceptibility to rheumatoid arthritis. Arthritis Rheum. 1987;30:1205–13.
Koeleman BP, Lie BA, Undlien DE, Dudbridge F, Thorsby E, de Vries RR, et al. Genotype effects and epistasis in type 1 diabetes and HLA-DQ trans dimer associations with disease. Genes Immun. 2004;5:381–8.
van Gaalen FA, Verduijn W, Roelen DL, Böhringer S, Huizinga TWJ, van der Heijde DM, et al. Epistasis between two HLA antigens defines a subset of individuals at a very high risk for ankylosing spondylitis. Ann Rheum Dis. 2013;72:974–8.
Mizuki M, Eklund A, Grunewald J. Altered expression of natural killer cell inhibitory receptors (KIRs) on T cells in bronchoalveolar lavage fluid and peripheral blood of sarcoidosis patients. Sarcoidosis Vasc Diffus Lung Dis. 2000;17:54–9.
Momot T, Koch S, Hunzelmann N, Krieg T, Ulbricht K, Schmidt RE, et al. Association of killer cell immunoglobulin-like receptors with scleroderma. Arthritis Rheum. 2004;50:1561–5.
Nelson GW, Martin MP, Gladman D, Wade J, Trowsdale J, Carrington M. Cutting edge: heterozygote advantage in autoimmune disease: hierarchy of protection/susceptibility conferred by HLA and killer Ig-like receptor combinations in psoriatic arthritis. J Immunol. 2004;173:4273–6.
van der Slik AR, Koeleman BP, Verduijn W, Bruining GJ, Roep BO, Giphart MJ. KIR in type 1 diabetes: disparate distribution of activating and inhibitory natural killer cell receptors in patients versus HLA-matched control subjects. Diabetes. 2003;52:2639–42.
Moretta L, Moretta A. Killer immunoglobulin-like receptors. Curr Opin Immunol. 2004;16(5):626–33.
van Heemst J, Jansen DT, Polydorides S, Moustakas AK, Bax M, Feitsma AL, et al. Crossreactivity to vinculin and microbes provides a molecular basis for HLA-based protection against rheumatoid arthritis. Nat Commun. 2015;6:6681.
Sheridan BS, Lefrançois L. Intraepithelial lymphocytes: to serve and protect. Curr Gastroenterol Rep. 2010;12(6):513–21.
Genetic Analysis of Psoriasis Consortium & the Wellcome Trust Case Control Consortium 2, Strange A, Capon F, Spencer CC, Knight J, Weale ME, et al. A genome-wide association study identifies new psoriasis susceptibility loci and an interaction between HLA-C and ERAP1. Nat Genet. 2010;42:985–90.
Visscher PM, Hill WG, Wray NR. Heritability in the genomics era--concepts and misconceptions. Nat Rev Genet. 2008;9:255–66.
Hu X, Deutsch AJ, Lenz TL, Onengut-Gumuscu S, Han B, Chen WM, et al. Additive and interaction effects at three amino acid positions in HLA-DQ and HLA-DR molecules drive type 1 diabetes risk. Nat Genet. 2016;47:898–905.
Satsangi J, Welsh KI, Bunce M, Julier C, Farrant JM, Bell JI, et al. Contribution of genes of the major histocompatibility complex to susceptibility and disease phenotype in inflammatory bowel disease. Lancet. 1996;347:1212–7.
International HIV Controllers Study, Pereyra F, Jia X, McLaren PJ, Telenti A, de Bakker PI, et al. The major genetic determinants of HIV-1 control affect HLA class I peptide presentation. Science. 2011;330:1551–7.
Nishida N, Ohashi J, Khor S, Sugiyama M, Tsuchiura T. Understanding of HLA-conferred susceptibility to chronic hepatitis B infection requires HLA genotyping- based association analysis. Sci Rep. 2016;6:24767.
Zhu M, Dai J, Wang C, Wang Y, Qin N, Ma H, et al. Fine mapping the MHC region identified four independent variants modifying susceptibility to chronic hepatitis B in han chinese. Hum Mol Genet. 2015;25:1225–32.
Duggal P, Thio CL, Wojcik GL, Goedert JJ, Mangia A, Latanich R, et al. Genome-wide association study of spontaneous resolution of hepatitis C virus infection: data from multiple cohorts. Ann Intern Med. 2013;158:235–45.
Chen D, Gaborieau V, Zhao Y, Chabrier A, Wang H, Waterboer T, et al. A systematic investigation of the contribution of genetic variation within the MHC region to HPV seropositivity. Hum Mol Genet. 2015;24:2681–8.
Sveinbjornsson G, Gudbjartsson DF, Halldorsson BV, Kristinsson KG, Gottfredsson M, Barrett JC, et al. HLA class II sequence variants influence tuberculosis risk in populations of European ancestry. Nat Genet. 2016;48:318–22.
Pelak K, Goldstein DB, Walley NM, Fellay J, Ge D, Shianna KV, et al. Host determinants of HIV-1 control in African Americans. J Infect Dis. 2010;201:1141–9.
Fellay J, Shianna KV, Ge D, Colombo S, Weale M, Zhang K, et al. A whole-genome association study of major determinants for host control of HIV-1. Science. 2007;317:944–7.
Fellay J, Ge D, Shianna KV, Colombo S, Ledergerber B, Cirulli ET, et al. Common genetic variation and the control of HIV-1 in humans. PLoS Genet. 2009;5, e1000791.
Wei Z, Liu Y, Xu H, Tang K, Wu H, Lu L, et al. Genome-wide association studies of HIV-1 host control in ethnically diverse Chinese populations. Sci Rep. 2015;5:10879.
McLaren PJ, Coulonges C, Ripke S, van den Berg L, Buchbinder S, Carrington M, et al. Association study of common genetic variants and HIV-1 acquisition in 6,300 infected cases and 7,200 controls. PLoS Pathog. 2013;9(7), e1003515.
Limou S, Le Clerc S, Coulonges C, Carpentier W, Dina C, Delaneau O, et al. Genomewide association study of an AIDS-nonprogression cohort emphasizes the role played by HLA genes (ANRS Genomewide Association Study 02). J Infect Dis. 2009;199:419–26.
Khor CC, Chau TNB, Pang J, Davila S, Long HT, Ong RTH, et al. Genome-wide association study identifies susceptibility loci for dengue shock syndrome at MICB and PLCE1. Nat Genet. 2011;43:1139–41.
Dang T, Naka I, Sa-Ngasang A, Anantapreecha S, Chanama S, Wichukchinda N, et al. A replication study confirms the association of GWAS-identified SNPs at MICB and PLCE1 in Thai patients with dengue shock syndrome. BMC Med Genet. 2014;15:58.
Hu Z, Liu Y, Zhai X, Dai J, Jin G, Wang L, et al. New loci associated with chronic hepatitis B virus infection in Han Chinese. Nat Genet. 2013;45:1499–503.
Kim YJ, Kim HY, Lee J-H, Yu SJ, Yoon J-H, Lee H-S, et al. A genome-wide association study identified new variants associated with the risk of chronic hepatitis B. Hum Mol Genet. 2013;22:4233–8.
Jiang D-K, Ma X-P, Yu H, Cao G, Ding D-L, Chen H, et al. Genetic variants in five novel loci including CFB and CD40 predispose to chronic hepatitis B. Hepatology. 2015;62:118–28.
Chang S-W, Fann CS-J, Su W-H, Wang YC, Weng CC, Yu C-J, et al. A genome-wide association study on chronic HBV infection and its clinical progression in male Han-Taiwanese. PLoS One. 2014;9, e99724.
Kamatani Y, Wattanapokayakit S, Ochi H, Kawaguchi T, Takahashi A, Hosono N, et al. A genome-wide association study identifies variants in the HLA-DP locus associated with chronic hepatitis B in Asians. Nat Genet. 2009;41:591–5.
Nishida N, Sawai H, Matsuura K, Sugiyama M, Ahn SH, Park JY, et al. Genome-wide association study confirming association of HLA-DP with protection against chronic hepatitis B and viral clearance in Japanese and Korean. PLoS One. 2012;7(6), e39175.
An P, Winkler C, Guan L, O’Brien SJ, Zeng Z, Yu Y, et al. A common HLA-DPA1 variant is a major determinant of hepatitis B virus clearance in Han Chinese. J Infect Dis. 2011;203:943–7.
Mbarek H, Ochi H, Urabe Y, Kumar V, Kubo M, Hosono N, et al. A genome-wide association study of chronic hepatitis B identified novel risk locus in a Japanese population. Hum Mol Genet. 2011;20:3884–92.
Miki D, Ochi H, Takahashi A, Hayes CN, Urabe Y, Abe H, et al. HLA-DQB1*03 confers susceptibility to chronic hepatitis C in Japanese: a genome-wide association study. PLoS One. 2013;8(12), e84226.
Chen D, McKay JD, Clifford G, Gaborieau V, Chabrier A, Waterboer T, et al. Genome-wide association study of HPV seropositivity. Hum Mol Genet. 2011;20:4714–23.
Frazer KA, Ballinger DG, Cox DR, Hinds DA, Stuve LL, Gibbs RA, et al. A second generation human haplotype map of over 3.1 million SNPs. Nature. 2007;449:851–61.
Liu H, Irwanto A, Fu X, Yu G, Yu Y, Sun Y, et al. Discovery of six new susceptibility loci and analysis of pleiotropic effects in leprosy. Nat Genet. 2015;47:267–71.
Zhang F-R, Huang W, Chen S-M, Sun L-D, Liu H, Li Y, et al. Genomewide association study of leprosy. N Engl J Med. 2009;361:2609–18.
Wong SH, Gochhait S, Malhotra D, Pettersson FH, Teo YY, Khor CC, et al. Leprosy and the adaptation of human toll-like receptor 1. PLoS Pathog. 2010;6:1–9.
Thye T, Vannberg FO, Wong SH, Owusu-Dabo E, Osei I, Gyapong J, et al. Genome-wide association analyses identifies a susceptibility locus for tuberculosis on chromosome 18q11.2. Nat Genet. 2010;42:739–41.
Tong X, Chen L, Liu S, Yan Z, Peng S, Zhang Y, et al. Polymorphisms in HLA-DRB1 gene and the risk of tuberculosis: a meta-analysis of 31 studies. Lung. 2015;193:309–18.
Fakiola M, Strange A, Cordell HJ, Miller EN, Pirinen M, Su Z, et al. Common variants in the HLA-DRB1-HLA-DQA1 HLA class II region are associated with susceptibility to visceral leishmaniasis. Nat Genet. 2013;45:208–13.
Martin MP, Gao X, Lee J-H, Nelson GW, Detels R, Goedert JJ, et al. Epistatic interaction between KIR3DS1 and HLA-B delays the progression to AIDS. Nat Genet. 2002;31:429–34.
Khakoo SI, Thio CL, Martin MP, Brooks CR, Gao X, Astemborski J, et al. HLA and NK cell inhibitory receptor genes in resolving hepatitis C virus infection. Science. 2004;305:872–4.
Borghans JAM, Mølgaard A, de Boer RJ, Keşmir C. HLA alleles associated with slow progression to AIDS truly prefer to present HIV-1 p24. PLoS One. 2007;2, e920.
Blanco-Gelaz MA, Suárez-Alvarez B, González S, López-Vázquez A, Martínez-Borra J, López-Larrea C. The amino acid at position 97 is involved in folding and surface expression of HLA-B27. Int Immunol. 2006;18:211–20.
Weiskopf D, Angelo MA, de Azeredo EL, Sidney J, Greenbaum JA, Fernando AN, et al. Comprehensive analysis of dengue virus-specific responses supports an HLA-linked protective role for CD8+ T cells. Proc Natl Acad Sci U S A. 2013;110:E2046–53.
Rao X, Hoof I, van Baarle D, Kesmir C, Textor J. HLA preferences for conserved epitopes: a potential mechanism for hepatitis C clearance. Front Immunol. 2015;6:1–9.
Yang D, Shui T, Miranda JW, Gilson DJ, Song Z, Chen J, et al. Mycobacterium leprae-infected macrophages preferentially primed regulatory T cell responses and was associated with lepromatous leprosy. PLoS Negl Trop Dis. 2016;10:1–13.
Winslow GM, Cooper A, Reiley W, Chatterjee M, Woodland DL. Early T-cell responses in tuberculosis immunity. Immunol Rev. 2008;225:284–99.
Bold TD, Ernst JD. CD4+ T cell-dependent IFN-γ production by CD8+ effector T cells in Mycobacterium tuberculosis infection. J Immunol. 2012;189:2530–6.
Liu D, Uzonna JE. The early interaction of Leishmania with macrophages and dendritic cells and its influence on the host immune response. Front Cell Infect Microbiol. 2012;2:83.
Sommer S. The importance of immune gene variability (MHC) in evolutionary ecology and conservation. Front Zool. 2005;2:16.
Carrington M, Nelson GW, Martin MP, Kissner T, Vlahov D, Goedert JJ, et al. HLA and HIV-1: heterozygote advantage and B*35-Cw*04 disadvantage. Science. 1999;283:1748–52.
Leslie A, Matthews PC, Listgarten J, Carlson JM, Kadie C, Ndung’u T, et al. Additive contribution of HLA Class I alleles in the immune control of HIV-1 infection. J Virol. 2010;84:9879–88.
Thursz MR, Thomas HC, Greenwood BM, Hill AVS. Heterozygote advantage for HLA class-II type in hepatitis B virus infection. Nat Genet. 1997;17:11–2.
Prugnolle F, Manica A, Charpentier M, Guégan JF, Guernier V, Balloux F. Pathogen-driven selection and worldwide HLA class I diversity. Curr Biol. 2005;15:1022–7.
Zhernakova A, Elbers CC, Ferwerda B, Romanos J, Trynka G, Dubois PC, et al. Evolutionary and functional analysis of celiac risk loci reveals SH2B3 as a protective factor against bacterial infection. Am J Hum Genet. 2010;86:970–7.
Liu Y, Helms C, Liao W, Zaba LC, Duan S, Gardner J, et al. A genome-wide association study of psoriasis and psoriatic arthritis identifies new disease loci. PLoS Genet. 2008;4(3), e1000041.
Migueles SA, Sabbaghian MS, Shupert WL, Bettinotti MP, Marincola FM, Martino L, et al. HLA B*5701 is highly associated with restriction of virus replication in a subgroup of HIV-infected long term nonprogressors. Proc Natl Acad Sci U S A. 2000;97:2709–14.
Chen H, Hayashi G, Lai OY, Dilthey A, Kuebler PJ, Wong TV, et al. Psoriasis patients are enriched for genetic variants that protect against HIV-1 disease. PLoS Genet. 2012;8:1–12.
Apps R, Qi Y, Carlson JM, Chen H, Gao X, Thomas R, et al. Influence of HLA-C expression level on HIV control. Science. 2013;340:87–91.
Kulkarni S, Qi Y, O’hUigin C, Pereyra F, Ramsuran V, McLaren P, et al. Genetic interplay between HLA-C and MIR148A in HIV control and Crohn disease. Proc Natl Acad Sci U S A. 2013;110:20705–10.
Chappell P, Meziane EK, Harrison M, Magiera L, Hermann C, Mears L, et al. Expression levels of MHC class I molecules are inversely correlated with promiscuity of peptide binding. Elife. 2015;2015:1–22.
Blander JM, Torchinsky MB, Campisi L. Revisiting the old link between infection and autoimmune disease with commensals and T helper 17 cells. Immunol Res. 2012;54:50–68.
Sfriso P, Ghirardello A, Botsios C, Tonon M, Zen M, Bassi N, et al. Infections and autoimmunity: the multifaceted relationship. J Leukoc Biol. 2010;87:385–95.
Campisi L, Barbet G, Ding Y, Esplugues E, Flavell RA, Blander JM. Apoptosis in response to microbial infection induces autoreactive TH17 cells. Nat Immunol. 2016;17:1084–92.
Ercolini AM, Miller SD. The role of infections in autoimmune disease. Clin Exp Immunol. 2009;155:1–15.
Armstrong DL, Zidovetzki R, Alarcón-Riquelme ME, Tsao BP, Criswell LA, Kimberly RP, et al. GWAS identifies novel SLE susceptibility genes and explains the association of the HLA region. Genes Immun. 2014;15:347–54.
McMurray RW, Elbourne K. Hepatitis C virus infection and autoimmunity. Semin Arthritis Rheum. 1997;26:689–701.
Maya R, Gershwin ME, Shoenfeld Y. Hepatitis B virus (HBV) and autoimmune disease. Clin Rev Allergy Immunol. 2008;34:85–102.
Sollid LM. Coeliac disease: dissecting a complex inflammatory disorder. Nat Rev Immunol. 2002;2:647–55.
Baschal EE, Eisenbarth GS. Extreme genetic risk for type 1A diabetes in the post-genome era. J Autoimmun. 2008;31(1):1–6.
Zanelli E, Breedveld FC, De Vries RR. HLA class II association with rheumatoid arthritis: facts and interpretations. Hum Immunol. 2000;61:1254–61.
Rider LG. The heterogeneity of juvenile myositis. Autoimmun Rev. 2007;6(4):241–7.
Libbey JE, Fujinami RS. Potential triggers of MS. Results Probl Cell Differ. 2010;51:21–42.
Smeekens SP, Ng A, Kumar V, Johnson MD, Plantinga TS, van Diemen C, et al. Functional genomics identifies type I interferon pathway as central for host defense against Candida albicans. Nat Commun. 2013;4:1342.
Gorer PA. The detection of a hereditary antigenic difference in the blood of mice by means of human group a serum. J Genet. 1936;32:17–31.
Bailey A, Dalchau N, Carter R, Emmott S, Phillips A, Werner JM, et al. Selector function of MHC I molecules is determined by protein plasticity. Sci Rep. 2015;5:14928.
Holling TM, Schooten E, van Den Elsen PJ. Function and regulation of MHC class II molecules in T-lymphocytes: of mice and men. Hum Immunol. 2004;65:282–90.
Burch HB, Cooper DS. Management of Graves disease: a review. JAMA. 2015;314:2544–54.
Burisch J, Jess T, Martinato M, Lakatos PL. The burden of inflammatory bowel disease in Europe. J Crohns Colitis. 2013;7(4):322–37.
Iaccarino L, Ghirardello A, Bettio S, Zen M, Gatto M, Punzi L, et al. The clinical features, diagnosis and classification of dermatomyositis. J Autoimmun. 2014;48–49:122–7.
Chang MH. Hepatitis B, virus infection. Semin Fetal Neonatal Med. 2007;12:160–7.
de Souza-Santana FC, Marcos EVC, Nogueira MES, Ura S, Tomimori J. Human leukocyte antigen class I and class II alleles are associated with susceptibility and resistance in borderline leprosy patients from Southeast Brazil. BMC Infect Dis. 2015;15:22.
We thank Kate McIntyre and Jackie Senior for editing the manuscript and Aggelo Matzaraki for help with graphic design.
VM is supported by a PhD scholarship from the Graduate School of Medical Sciences, University of Groningen, the Netherlands. VK is supported by the European Union's Framework Programme 7 (EU FP7) TANDEM project (HEALTH-F3-2012-305279). CW is supported by an FP7/2007-2013/ERC Advanced Grant (agreement 2012-322698), the Stiftelsen KG Jebsen Coeliac Disease Research Centre (Oslo, Norway), and a Spinoza Prize from the Netherlands Organization for Scientific Research (NWO SPI 92-266). AZ holds a Rosalind Franklin Fellowship (University of Groningen) and is also supported by CardioVasculair Onderzoek Nederland (CVON 2012-03).
VM and AZ drafted the manuscript. VK and CW critically revised the manuscript. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Matzaraki, V., Kumar, V., Wijmenga, C. et al. The MHC locus and genetic susceptibility to autoimmune and infectious diseases. Genome Biol 18, 76 (2017). https://doi.org/10.1186/s13059-017-1207-1
- Major Histocompatibility Complex
- Ankylose Spondylitis
- Human Leukocyte Antigen
- Human Papilloma Virus
- Classic Human Leukocyte Antigen