High-throughput sequencing to decipher the genetic heterogeneity of deafness
© BioMed Central Ltd 2012
Published: 29 May 2012
Identifying genes causing non-syndromic hearing loss has been challenging using traditional approaches. We describe the impact that high-throughput sequencing approaches are having in discovery of genes related to hearing loss and the implications for clinical diagnosis.
The identification of genes responsible for medically important traits is a major challenge in human genetics. This is particularly so for a disease with heterogeneous pathology, such as hearing loss (HL). Hearing impairment is the most common sensory defect, affecting approximately one in 500 newborns and 4% of people younger than 45 years, and reaching 50% by age 80. It is estimated that 278 million people worldwide suffer from HL [3, 4], which affects child development, education, social integration and the quality of life of the affected individual, with a substantial impact on public health. This high prevalence, combined with the striking genetic heterogeneity of deafness, has made this Mendelian disease a major challenge in terms of discovering its cause and deciphering the mechanisms underlying it. Sixty-three genes encoding proteins with a broad range of functions are known to be involved in HL. Many more are expected to be discovered, as over 100 loci have been mapped without the corresponding gene being identified. However, because of technological limitations in clinical diagnostics, mainly resulting from the large size of many genes and the high cost of Sanger sequencing, many hearing-impaired individuals with familial HL do not know the genetic cause of their HL. Given the complexity of the auditory apparatus and the vast genetic heterogeneity of HL, high-throughput sequencing, also known as massively parallel sequencing or next-generation sequencing (NGS), is the ideal tool to address this challenge (Box 1).
NGS has already had an important impact on both research and clinical diagnosis in other diseases, such as breast cancer, intractable inflammatory bowel disease and Kabuki syndrome, as it enables screening of a large number of genes in one test. In addition, this approach does not require the collection of DNA samples from large affected families, which previous linkage-based approaches to disease gene identification did require. The recent advances in molecular biology and genomics have raised hopes of elucidating the complex network of auditory genes, which is the first step toward implementing a cure for HL. These advances highlight the need for additional discovery and characterization of all genes involved in HL. Here, we focus on the impact of genomics, and in particular NGS, on the progress in gene discovery for inherited deafness.
At least 60% of HL is regarded as having a genetic cause, as many of the environmental causes, such as ototoxic drugs, mumps or rubella, have been eliminated by modern medicine. Most environmental causes, including exposure to ototoxic drugs, exposure to rubella during gestation, trauma and excessive noise, are considered to have a genetic basis as well, as both the onset and severity of acquired hearing impairment may depend on the genetic background of the individual. Approximately 70% of all genetic HL is non-syndromic (NSHL), with half predicted to be monogenic. NSHL is inherited in a recessive mode in approximately 80% of cases, in a dominant mode in approximately 20%, and is either X-linked or mitochondrial in origin in 2 to 3%.
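The proportions quoted above can be chained to give a rough sense of scale. The following Python sketch simply multiplies the rounded figures from this paragraph; the constant and function names are illustrative, and the results are order-of-magnitude guides under those rounded inputs, not epidemiological estimates.

```python
# Rounded proportions taken from the text above (not new data).
GENETIC_FRACTION = 0.60        # "at least 60%" of HL has a genetic cause
NON_SYNDROMIC_FRACTION = 0.70  # ~70% of genetic HL is non-syndromic (NSHL)
RECESSIVE_FRACTION = 0.80      # ~80% of NSHL is inherited recessively
DOMINANT_FRACTION = 0.20       # ~20% of NSHL is inherited dominantly


def share_of_all_hl(*fractions: float) -> float:
    """Multiply nested proportions to obtain a share of all hearing loss."""
    share = 1.0
    for fraction in fractions:
        share *= fraction
    return share


nshl_share = share_of_all_hl(GENETIC_FRACTION, NON_SYNDROMIC_FRACTION)
recessive_nshl_share = share_of_all_hl(
    GENETIC_FRACTION, NON_SYNDROMIC_FRACTION, RECESSIVE_FRACTION
)
dominant_nshl_share = share_of_all_hl(
    GENETIC_FRACTION, NON_SYNDROMIC_FRACTION, DOMINANT_FRACTION
)

print(f"Genetic NSHL as a share of all HL:   {nshl_share:.1%}")
print(f"Recessive NSHL as a share of all HL: {recessive_nshl_share:.1%}")
print(f"Dominant NSHL as a share of all HL:  {dominant_nshl_share:.1%}")
```

Because "at least 60%" is a lower bound, the computed shares should be read as lower bounds as well.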
Mutations in over 60 genes have been found to disrupt auditory function, resulting in similar or different phenotypes, according to the function, location, pathway or network of the proteins they encode. These genes encode a variety of proteins with a broad range of functions, including gap junctions (connexin 26 and connexin 32), transcription factors (POU4F3, POU3F4, TFCP2L3 and PAX3), ion channels (KCNQ1, KCNE1 and KCNQ4), molecular motors (myosin VI, myosin VIIA, SLC26A4 and prestin), extracellular proteins (α-tectorin, otoancorin and COL11A2) and structural proteins (otoferlin and diaphanous 1). Their expression patterns range from proteins that are expressed exclusively in the mammalian inner ear (α-tectorin, cochlin and EYA4) to proteins that are expressed in several tissues (myosin VI, POU4F3 and whirlin) but that, surprisingly, have mostly been implicated only in HL. The many proteins required for proper functioning of the inner ear correlate with the complex structure of its six organs: the cochlea, responsible for hearing, and the saccule, utricle and the three semicircular canals, which control balance and spatial orientation. The development, differentiation and maintenance of this complex machinery explain the involvement of such a large number of genes with mutations leading to HL. Distinct proteins are responsible for the function of each compartment of the inner ear, as well as for the physiological and mechanistic aspects.
The gene most frequently involved in HL is GJB2, encoding the connexin 26 protein, with more than 100 dominant or recessive deafness-causing mutations detected in this small gene, which has only one coding exon of approximately 1 kb. Other frequently mutated genes include SLC26A4, MYO15A, OTOF, CDH23 and TMC1, with over 20 mutations reported to be involved in HL for each of these genes. A comprehensive list of genes implicated in HL can be found at the deafness variation database. The number of mutations in the other 'deafness' genes is lower and most of them have been reported in consanguineous families. However, unlike GJB2, many of these genes are large, with dozens of exons, such as MYO15A and CDH23, which have 65 and 68 exons, respectively. The numbers of genes and mutations found are most likely underestimates as a result of the strategies used for gene identification in the pre-NGS era. For example, large genes are not routinely analyzed in full by Sanger sequencing, as this is too time-consuming and costly to be performed as part of routine clinical genetic diagnosis. Even for research purposes, the number of large genes involved in HL has made complete sequencing on a regular basis impractical, unless there were strong reasons to expect a mutation because of shared ethnic origin or phenotype. For the same practical reasons, the methods used for detection of known mutations (mutation-specific assays) have also led to an underestimation of the numbers of mutations in prevalent deafness genes, such as GJB2 and SLC26A4. In addition, some genes have been found to be associated with deafness in only one population or one family, and because mutations in such genes were found so infrequently, other scientists were less inclined to screen for them.
Although families with HL with different modes of inheritance are found all over the world, the majority of families reported with recessive deafness come from regions where consanguineous marriages are common, from North Africa through the Middle East to India and Pakistan. The deafness loci for these consanguineous families were easily mapped by linkage analysis and homozygosity mapping, allowing locus identification using only a single large family. Dominant HL, in contrast, was mainly identified in families originating in Europe, North America and Australia.
Before the high-throughput technology era, disease locus identification was performed mainly by genome-wide linkage analysis using genetic markers, such as microsatellites or single nucleotide polymorphisms (SNPs). The genetic linkage data obtained could be analyzed by various methods, such as parametric multipoint linkage analysis and, when relevant, homozygosity mapping, which is used to detect disease loci for autosomal recessive disorders, particularly in consanguineous pedigrees. Even though this approach led to the identification of many deafness genes, particularly in populations with a social preference for endogamous or consanguineous marriage and large family size, it has significant limitations: it is suitable only for families with recessive diseases and needs at least two affected offspring, preferably with related parents. The limitations of linkage methods, together with the long time required and the high cost of gene identification, left many cases unsolved, and the list of unresolved human loci linked with HL remained longer than the list of cloned genes. Moreover, the linkage methods identified only one gene in each experiment; thus, in many of these cases, mutations were found in only one family, and in many other cases the causative gene has remained unknown.
To overcome this obstacle, efforts at large-scale screening of deafness genes have emerged, for example, genotyping 198 mutations by primer extension across eight prevalent genes (GJB2, GJB6, GJB3, GJA1, SLC26A4, SLC26A5, MTRNR1 and MTTS1) in a single test. Chromosomal imbalances have been identified by array comparative genomic hybridization (array CGH). For example, an inverted genomic duplication of the TJP2 gene was identified as the cause of progressive NSHL at the DFNA51 locus using this method. However, this technique can detect only large deletions or duplications, and was used after other methods had failed to detect standard mutations. Moreover, a systematic study of unsolved deafness cases has not been undertaken using array CGH, so it is not known what proportion of deafness is caused by large duplications or deletions. Clearly, there has been a need to develop techniques that can screen a larger number of genes in a reasonable amount of time and in a more cost-effective manner, and that can detect all types of mutations underlying deafness.
An example demonstrating the underestimated numbers of deafness genes in a given population is found in the Jewish Israeli population before the NGS era. In this population, the number of NSHL genes was estimated to be up to 22 across the Jewish ethnic groups. Before high-throughput methods were used, nine NSHL genes were found over a period of 15 years. This number increased dramatically to 14 in a single experiment of targeted genomic capture followed by NGS, conducted on only five unrelated deaf individuals, promising a much larger number if all unsolved deaf probands were to be enrolled in this type of experiment.
Three commercial NGS technologies have been developed in recent years, each with its own advantages and disadvantages. These include pyrosequencing with the 454 sequencer (Roche Life Sciences), cyclic reversible termination technology using the Illumina platform (Illumina), and sequence-by-ligation technology using the SOLiD platform (Applied Biosystems) (for a comprehensive review, see [23, 24]). These platforms have addressed the main problem of detecting causative mutations for heterogeneous diseases, including those with dozens of genes involved, as is the case for deafness. Two of these platforms were compared in a targeted genomic capture and NGS experiment, in an effort to determine the most efficient method for identifying deafness genes for screening towards clinical diagnosis. Although both the SureSelect-Illumina and NimbleGen-454 platforms provided high specificity and sensitivity, the authors concluded that the former platform was preferable with regard to scale, sensitivity and cost under their conditions. This combined approach, targeted capture and sequencing, seems to be the ideal tool to address the challenges of deciphering the genetics underlying HL: it enables the detection of all types of mutations; it allows the screening of large genes that have previously been largely untested; it can include all known deafness genes in a single test; and it can be used in cases of isolated deafness.
The contribution of targeted capture and next-generation sequencing to hearing research
Locus and inheritance
Whole exome region analyzed by bioinformatics
2.9 Mb, chromosome 9q34.3
Rehman et al. 
3.1 Mb, 1p13.3
Walsh et al. 
Pierce et al. 
1.81 Mb, 3q27
Sirmaci et al. 
Dominant and recessive NSHL
54 known deafness genes, exons
Three novel mutations in known deafness genes
Shearer et al. 
20 Mb, 19q12-13.4
Zheng et al. 
4.142-Mb linkage region, chromosome 5q31
Pierce et al. 
12.9 Mb, Xp22
Schraders et al. 
88 genes, exons, 1 kb promoter regions, 17.5 Mb region, chromosome Xp22.12
17.5 Mb, Xp22.12
Huebner et al. 
3.4 Mb, 19p13.2
Klein et al. 
Dominant and recessive NSHL
246 genes responsible for deafness in humans and mice, exons and 40 bp flanking introns
Four novel mutations in known deafness genes
Brownstein et al. 
1,034 nuclear genes encoding mitochondrial proteins, entire mtDNA and exons
Calvo et al. 
36.9 Mb, chromosomes 8, 15, 16, 19, 21
Mutation in known deafness gene
Sirmaci et al. 
Mutation in known deafness gene
Winkelmann et al. 
On a larger scale, whole exome sequencing is extremely promising, as it screens the exons of all genes in the human genome, enabling the discovery of novel deafness genes. It is estimated that approximately 60% of genes for Mendelian disease could be discovered using this technology. The data analysis is rather complicated, and strategies are being used to simplify it. One such strategy is to conduct homozygosity mapping in recessive families to narrow down the regions to analyze. This was done in parallel with exome sequencing, leading to the identification of a GPSM2 mutation as the cause of HL associated with the DFNB82 locus and a GIPC3 mutation in a family with consanguineous parents.
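The idea of using homozygosity mapping to narrow the exome regions to analyze can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the pipeline used in the cited studies: the `Variant` record, the genotype labels, the function names and the `min_markers` threshold are all hypothetical, and real analyses define runs of homozygosity by physical or genetic length across many more markers.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    genotype: str  # hypothetical labels: "hom_ref", "het" or "hom_alt"


def homozygous_runs(
    markers: List[Variant], min_markers: int = 3
) -> List[Tuple[str, int, int]]:
    """Return (chrom, start, end) for runs of consecutive homozygous markers.

    Markers are assumed to be sorted by chromosome and position.
    """
    runs: List[Tuple[str, int, int]] = []
    current: List[Variant] = []

    def flush() -> None:
        # Keep the run only if it spans enough markers.
        if len(current) >= min_markers:
            runs.append((current[0].chrom, current[0].pos, current[-1].pos))

    for marker in markers:
        is_hom = marker.genotype != "het"
        same_chrom = bool(current) and current[-1].chrom == marker.chrom
        if is_hom and (not current or same_chrom):
            current.append(marker)
        else:
            # A heterozygous call or a new chromosome ends the current run.
            flush()
            current = [marker] if is_hom else []
    flush()
    return runs


def candidates_in_runs(
    variants: List[Variant], runs: List[Tuple[str, int, int]]
) -> List[Variant]:
    """Keep homozygous non-reference variants that fall inside a run."""
    return [
        v
        for v in variants
        if v.genotype == "hom_alt"
        and any(c == v.chrom and start <= v.pos <= end for c, start, end in runs)
    ]


# Toy genotypes from one affected child of related parents.
markers = [
    Variant("chr1", 100, "hom_alt"),
    Variant("chr1", 200, "hom_ref"),
    Variant("chr1", 300, "hom_alt"),
    Variant("chr1", 400, "het"),
    Variant("chr1", 500, "hom_alt"),
    Variant("chr1", 600, "het"),
]
exome_variants = [
    Variant("chr1", 150, "het"),
    Variant("chr1", 250, "hom_alt"),
    Variant("chr1", 550, "hom_alt"),
]

runs = homozygous_runs(markers)
candidates = candidates_in_runs(exome_variants, runs)
print(runs)
print(candidates)
```

In this toy data only the homozygous variant at position 250 survives, because it is the only non-reference homozygous call lying inside a run of at least three homozygous markers. The design choice is to treat the mapping step as a cheap positional filter applied before any functional annotation of the exome variants.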
In addition to clinical diagnostics, a major goal of gene discovery is to decipher the mechanisms involved in deafness. Exome sequencing, through the initial findings of new genes associated with deafness, has provided an entry point for new biological insights. For example, although exome sequencing indicated that SMPX was the most suitable candidate gene for progressive hearing impairment in a large Dutch family, it did not seem to be an obvious candidate from a biological perspective, because the protein encoded by this gene had been implicated in striated muscle [31, 32]. Further investigations revealed that SMPX is indeed expressed in the cochlea, and a role in the development and/or maintenance of the sensory hair cells through integrin signaling and/or the insulin-like growth factor-1 pathway has been suggested. Thus, the next-generation sequencing approach has implicated new genes and pathways in deafness.
There are several remaining questions with regard to deep sequencing for hereditary HL. Will this technique become routine for HL? Will it be adopted by clinical laboratories on a routine basis? What are the ethical considerations involved? Much of the effort invested in the field of genomics aims to improve medical strategies in diagnostics, treatment, cure and prevention of disabilities. Although most disabled people welcome the new technologies, others might be ambivalent or even antagonistic towards genetic medicine. For deafness in particular, there has been an ongoing debate about whether deafness is a real disability or rather a different culture. Although some communities, particularly religious communities in the Middle East, consider deafness to be an unwanted disability, at the other extreme, there are deaf parents who would prefer to have deaf children and are ready to go as far as terminating a pregnancy if the fetus does not have HL. Even though the majority of deaf couples are not interested in prenatal diagnosis for HL, and tend to feel that termination of pregnancy on the basis of hearing status (either deaf or hearing) should be illegal, many hearing parents, after the birth of a deaf child, seek genetic counseling to prevent the recurrence of deafness in their family, and some of them would consider terminating a pregnancy if the fetus is hearing impaired.
From a wider point of view, HL has serious consequences for public health, with major economic and social implications, as it is estimated that at least 20% of the population develop a significant HL at some time during their lives. Because of the high heterogeneity of deafness, most deaf people are born to hearing parents, and most of this population has a strong desire for treatment or cure. Therefore, a major challenge is to discover the cause and decipher the mechanisms of deafness. Recent advances in genomics have made it possible to perform large-scale population screening, as well as individualized testing. However, high-throughput genomics in the clinical setting is still in its infancy, and before it is implemented in routine clinical use, there is a clear need for standard laboratory procedures and regulations for quality control, data analysis and validation. Efforts in this direction have been initiated by the Centers for Disease Control and Prevention, which sponsored a conference in 2011 on Next-Generation Sequencing, Standardization of Clinical Testing. In addition, there are controversial ethical issues regarding the immense amount of high-throughput data obtained and their applications, because these data are usually much broader than the specific topic of research. Precise rules for the use, storage and sharing of NGS data among collaborative research groups are currently lacking. In many cases, research funding sources require data to be deposited in publicly accessible databases before publication. All of these issues make informed consent for genomics difficult to define.
Nevertheless, initial findings of these advanced technologies for detection of deafness-causing mutations promise to solve the major portion of genetic deafness in the next few years, which will lead to improved genetic counseling and much more efficient treatment, as phenotypes could be predicted by the solved genotypes. Characterization of the proteins encoded by these genes will shed light on the biological mechanisms involved in the pathophysiology of HL, forming the basis for genetic-based therapeutics.
The development of Sanger sequencing in 1977 marked a turning point in the molecular genetics revolution, and this was followed by the implementation of the polymerase chain reaction 6 years later. These landmarks paved the way for the Human Genome Project, which published its draft sequence in 2001. The field of genomics and its technological capacity have evolved at an extremely rapid pace since, leading to the development of high-throughput methods, including next-generation sequencing (NGS), also called massively parallel sequencing or deep sequencing. These advanced technologies, besides enhancing the ability to identify human disease mutations, have yielded a flood of genomic data. The increased wealth of genomic information has accelerated our understanding of complex biological processes and has broad clinical implications, but it has also required the development of advanced bioinformatics tools to deal with these massive amounts of data. NGS can produce over 10,000 times more data than the Sanger sequencing method. Whereas Sanger sequencing yields a 24-hour output of 120,000 bp at a cost of $4,000 per Mb sequenced, the output of a single NGS machine is larger than 30 Gb in 24 hours and costs less than $2 per Mb. The 3.2 Gb in a single human genome can therefore be sequenced in 1 day at a fairly low cost relative to what it would take with Sanger sequencing - up to 73 years at a cost of $200,000. We are rapidly reaching the long-held dream of the $1,000 genome in less than a day, although the subsequent bioinformatic analysis is becoming the new bottleneck for rapid genome-based diagnosis.
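The time comparison above follows directly from the daily outputs quoted in this box. The short Python sketch below reproduces that arithmetic; the constant and function names are illustrative, and it assumes a single pass over the genome on one instrument, whereas real projects sequence each base many times over.

```python
# Figures quoted in the text (per instrument, per 24 hours).
GENOME_BP = 3.2e9            # size of a single human genome, in bp
SANGER_BP_PER_DAY = 120_000  # Sanger output per 24 hours
NGS_BP_PER_DAY = 30e9        # NGS output per 24 hours ("larger than 30 Gb")


def days_to_sequence(genome_bp: float, bp_per_day: float) -> float:
    """Days needed to read genome_bp once at a constant daily output."""
    return genome_bp / bp_per_day


sanger_days = days_to_sequence(GENOME_BP, SANGER_BP_PER_DAY)
sanger_years = sanger_days / 365
ngs_days = days_to_sequence(GENOME_BP, NGS_BP_PER_DAY)
throughput_ratio = NGS_BP_PER_DAY / SANGER_BP_PER_DAY

print(f"Sanger: {sanger_days:,.0f} days, about {sanger_years:.0f} years")
print(f"NGS:    {ngs_days:.2f} days, about {ngs_days * 24:.1f} hours")
print(f"NGS produces {throughput_ratio:,.0f} times more bases per day")
```

With these inputs the Sanger estimate comes to roughly 73 years, matching the figure in the text, while a single NGS run covers the same number of bases in well under a day.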
Research in the Avraham laboratory for human genomics research is funded by the National Institutes of Health (NIDCD) R01DC011835, I-CORE Center No. 41/11, and the Hedrich Charitable Trust.