A low-cost open-source SNP genotyping platform for association mapping applications
© Macdonald et al.; licensee BioMed Central Ltd. 2005
Received: 6 June 2005
Accepted: 21 October 2005
Published: 2 December 2005
Abstract
Association mapping aimed at identifying DNA polymorphisms that contribute to variation in complex traits entails genotyping a large number of single-nucleotide polymorphisms (SNPs) in a very large panel of individuals. Few technologies, however, provide inexpensive high-throughput genotyping. Here, we present an efficient approach developed specifically for genotyping large fixed panels of diploid individuals. The cost-effective, open-source nature of our methodology may make it particularly attractive to those working in nonmodel systems.
Background
Understanding the genetic architecture of complex polygenic traits is a fundamental goal of modern biological and medical research, and the currently favored experimental paradigm is association mapping (reviewed by Carlson et al. [1]). Association studies genotype a dense set of single nucleotide polymorphisms (SNPs) in a large panel of individuals and test each SNP, or set of local haplotypes constructed from the SNP data, for a phenotype/disease association. A significant association at a query SNP suggests it is the causal polymorphism, or is in strong linkage disequilibrium with the causal site [2–4]. As a class, SNPs represent the most abundant form of genetic variation, with approximately two intermediate frequency SNPs per kilobase in the human genome. Thus, even with some a priori knowledge of a candidate gene region contributing to a disease phenotype, a large number of SNPs need to be genotyped in an association mapping study to ensure one of the genotyped SNPs is causative or is in strong linkage disequilibrium with the causative site. It is also important that SNPs are genotyped in a very large panel of individuals to provide sufficient power to detect variants that may have only subtle phenotypic effects. Studies suggest that panel sizes much larger than 1,000 individuals are required to achieve modest power to detect associations if they are present [4, 6, 7].
A plethora of SNP genotyping platforms is currently available (reviewed by Kwok [8] and Syvänen [9, 10]). Several excellent technologies genotype thousands of sites simultaneously, for example, Perlegen Sciences Inc. genotyping arrays, Affymetrix Inc. GeneChip arrays [12–15], and Illumina Inc. BeadArray technology coupled with the GoldenGate genotyping assay [16–18]. Such methods may not be cost effective for genotyping a large panel for a more modest number of SNPs. Other methods, such as Biotage Inc. Pyrosequencing [19, 20], Applied Biosystems TaqMan approach [21, 22], and certain template-directed single base extension methods, are readily applied to a large panel, but optimal probes must be designed for each SNP, and multiplexing may be difficult or impossible. Between these two extremes (ultra-high multiplexing and low/no multiplexing) it is difficult to identify the right genotyping system to efficiently and cost-effectively generate genotypes for a very large sample (thousands of individuals) for an intermediate number of SNPs (tens to hundreds of sites). This may be particularly true for those working on nonhuman systems. For human biologists there are several 'off-the-shelf' commercial genotyping solutions. For instance, Affymetrix produce GeneChip 100K arrays, offering a fixed set of 100,000 SNPs distributed across the human genome, and pre-designed Applied Biosystems TaqMan assays [21, 22] are available for over two million human SNPs. Outside of humans, however, inexpensive genotyping solutions are not readily available, and this is likely to remain the case for some time. Thus, even as the cost of sequencing continues to fall, and the number of SNPs identified in a variety of nonhuman organisms increases, researchers in nonmodel systems may have difficulty identifying a genotyping system that suits their needs.
Here we describe a low cost SNP genotyping platform developed specifically for large panel sizes and an intermediate number of SNPs. Our platform allows hundreds of SNPs and insertion/deletion polymorphisms to be genotyped in thousands of individuals, and thus may be particularly appropriate for dissecting complex traits in cases where the search space is limited to a set of candidate gene regions. In common with many SNP genotyping systems used today, our method is an amalgam of well-known, robust techniques, including PCR [24, 25], hybridization, and the oligonucleotide ligation assay (OLA). We employ a multiplexed OLA, ligation-dependent amplification of allele-specific products, and array-based allele-detection. Our approach builds on the work of Gerry et al., and shares a number of similarities with commercial technologies, including Keygene's SNPWave and Applied Biosystems' SNPlex, yet offers potentially higher throughput as it detects allele-specific products via arrays as opposed to size separation using a capillary sequencing instrument. Our method is cost-effective for very large panels of individuals (less than $0.03/genotype), does not entail purchasing expensive proprietary equipment or modified long oligonucleotides, and allows robust, parallelized genotyping of many SNPs with limited sample handling. In pursuit of an open-source genotyping system, in the manner of the Brown-style microarray technology, we have made all details of the method available in the Additional data files. These include plans for constructing a Cartesian arraying robot, the associated controller software, detailed protocols for the molecular biology steps, and software for designing the SNP assays and for calling genotypes.
Results and discussion
We designed SNP genotyping assays for 156 biallelic polymorphisms in the Enhancer of split locus and 12 SNPs upstream of the hairy locus in Drosophila melanogaster. These 168 polymorphisms were genotyped in a fixed panel of approximately 2,000 flies from a single outbred population. DNA extracted from the fly population was arrayed into six 384-well plates, and used as template for 12 long (2 to 3 kb) PCR amplicons, which in turn were used as template for multiplexed OLA reactions. Twenty 8-plex OLA reactions were performed on single 2 to 3 kb amplicons as template, and one 8-plex reaction used two pooled PCR amplicons as template. Following amplification of the products of ligation, each sample was printed in duplicate onto nylon membranes. This resulted in a set of 10 membranes holding SNPs incorporating barcode pairs 01 to 08, and a set of 11 membranes holding SNPs incorporating barcode pairs 09 to 16. Within each set, membranes were combined and sequentially hybridized with the appropriate 16 labeled barcodes to generate the genotype data. The background-subtracted array intensity data are provided in Additional data files 9 (replicate spot 1) and 10 (replicate spot 2), and the genotypes assigned to the individuals are given in Additional data file 11.
Sensitivity to secondary SNPs
All OLA-based genotyping approaches rely on oligos binding to a small region flanking the query SNP. If this flanking region harbors a minor allele at a SNP other than the query SNP, binding and subsequent ligation efficiency could be hindered if designed OLA oligonucleotides only match the major allele at this secondary SNP. Thus, a secondary SNP could cause the entire genotyping assay to fail, or in double heterozygotes for the query and secondary SNPs, result in incorrect genotype assignment. Because full resequencing data were available around each of the query SNPs (16 alleles for Enhancer of split and 10 alleles for hairy), we were able to assess the sensitivity of OLA-based genotyping to secondary SNPs in oligo binding regions.
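As a rough illustration of how secondary SNPs in an oligo binding region might be flagged from a set of resequenced alleles, the sketch below scans a window around a query site and reports any other variable columns. It is not the SNPatron script from the Additional data files: the function name `flank_report`, the gap-free input, and the use of IUPAC ambiguity codes to summarize each column are assumptions made for this example, and the 16 bp default window simply mirrors the flank length discussed in this section.

```python
# Hypothetical helper, not the authors' SNPatron script.
# Assumes equal-length, gap-free aligned alleles containing only A/C/G/T.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def flank_report(alleles, query_pos, flank=16):
    """Summarize the window around a query SNP.

    alleles   -- list of aligned sequences (one per resequenced allele)
    query_pos -- 0-based alignment column of the query SNP
    flank     -- number of bases examined on each side (assumed 16)

    Returns (window, secondary) where window is the IUPAC-coded sequence
    of the window (query column included) and secondary is a list of
    signed offsets, relative to the query SNP, of other variable columns.
    """
    length = len(alleles[0])
    start = max(0, query_pos - flank)
    end = min(length, query_pos + flank + 1)
    window = []
    secondary = []
    for col in range(start, end):
        bases = frozenset(seq[col].upper() for seq in alleles)
        window.append(IUPAC[bases])
        if len(bases) > 1 and col != query_pos:
            secondary.append(col - query_pos)
    return "".join(window), secondary


# Toy alignment: query SNP (C/T) at column 5, secondary SNP 2 bp upstream.
toy = ["ACGTACGTAC", "ACGTATGTAC", "ACGCATGTAC"]
window, secondary = flank_report(toy, query_pos=5, flank=3)
print(window, secondary)
```

Any site with a non-empty `secondary` list would be a candidate for either redesign or for degenerate positions in the genotyping oligos.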
When the resequencing data indicate that there are no secondary SNPs flanking a query SNP, 86% (104/121) of the assays we designed converted. In contrast, just 65% (22/34) of the assays converted when a secondary SNP was present, and OLA oligos were designed to match only the major allele at that secondary SNP. It is of interest that the likelihood of an assay with a secondary SNP failing did not seem to depend on whether the secondary SNP was in the upstream or downstream oligo binding region, or on the distance of the secondary SNP from the query SNP. If we controlled for secondary SNPs by incorporating degenerate bases into the OLA oligos, then the success rate was equivalent (85%, 11/13) to that observed for query SNPs without secondary SNPs. Thus, our data suggest that if SNPs are identified via resequencing, employing degenerate bases in the OLA oligos can control for secondary SNPs.
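The conversion counts above can be compared directly. The sketch below recomputes the three conversion rates and applies a two-sided Fisher's exact test to the contrast between assays with no secondary SNP and assays with an uncontrolled secondary SNP. The choice of Fisher's exact test is an assumption made for illustration; the text does not state that any formal test was applied to these counts.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# (converted, attempted) counts taken from the text.
counts = {
    "no secondary SNP": (104, 121),
    "uncontrolled secondary SNP": (22, 34),
    "degenerate bases": (11, 13),
}
rates = {label: conv / n for label, (conv, n) in counts.items()}
for label, rate in rates.items():
    print(f"{label}: {rate:.1%}")

# Converted vs failed, no secondary SNP vs uncontrolled secondary SNP.
p_value = fisher_exact_two_sided(104, 121 - 104, 22, 34 - 22)
print(f"Fisher's exact P = {p_value:.4f}")
```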
For OLA assays that convert, but have an uncontrolled secondary SNP, the miscall rate can be appreciably higher than for sites without a secondary SNP. The OLA assay for site es09.C20633T in Enhancer of split did not control for a pair of secondary SNPs (one 8 base pairs (bp) upstream and one 9 bp downstream, both at a frequency of 1/16 in the resequenced alleles) and converted to an apparently working assay. To check the accuracy of the OLA genotypes for es09.C20633T we sequenced 354 diploid individuals (GenBank accession numbers AY905900 to AY906258), and 3.1% (11/354) gave discordant genotypes. In each case a true C/T heterozygote was incorrectly called a T/T homozygote due to heterozygosity at a secondary SNP: in 10/11 individuals one of the previously identified segregating sites was to blame, while the remaining error was due to a previously unidentified SNP 1 bp downstream of the query SNP. Secondary SNPs may present a general problem for OLA-based genotyping methodologies, although their impact is dependent on the likelihood of there being a segregating site within the 16 base pairs upstream and downstream of the query SNP. Thus, for species with high levels of nucleotide diversity, such as Drosophila, the effect of secondary SNPs on OLA-based genotyping is expected to be more pronounced than for species with lower levels of diversity, such as humans.
Adherence to Hardy-Weinberg equilibrium (HWE) is a common criterion with which to assess the quality of a genotyping assay, as a deviation can suggest incorrect genotype assignments. However, selection, mutation or migration can also cause deviation from HWE, and the power to detect these processes increases with the sample size. Of our 115 converting OLA assays with either no secondary SNPs or secondary SNPs controlled for via degenerate bases in the OLA oligos, 34 showed a deviation from HWE at P < 0.05. This is more than expected by chance, although the deviations from HWE were generally slight (the absolute mean disequilibrium for these 34 sites is D = 0.012). We hypothesized that the large panel size employed in our study (2,000 individuals) enabled detection of subtle violations of the HWE assumptions, which would not have been observed in a smaller panel. To test this hypothesis, we sampled 96 genotyped individuals at random from the population, and estimated the deviation from HWE for the same 115 SNPs. Over 1,000 sampled replicates, the average number of assays deviating from HWE was 8, similar to the 6 expected by chance alone.
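The HWE check and the subsampling experiment can be sketched as follows. The text does not specify which test statistic was used, so this example assumes the standard one-degree-of-freedom chi-square test on the disequilibrium coefficient, defined here as D = P(AA) − p², with χ² = N·D²/(p²q²). The function names and the genotype encoding ("AA", "AB", "BB") are illustrative only.

```python
import math
import random


def hwe_test(n_aa, n_ab, n_bb):
    """Return (D, chi2, p_value) for genotype counts at a biallelic site.

    D is the disequilibrium coefficient P(AA) - p**2 (assumed definition);
    the P value comes from a chi-square distribution with 1 d.f.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1 - p
    if p in (0.0, 1.0):
        return 0.0, 0.0, 1.0          # monomorphic: nothing to test
    d = n_aa / n - p * p
    chi2 = n * d * d / (p * p * q * q)
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival, 1 d.f.
    return d, chi2, p_value


def subsample_counts(genotypes, k, rng):
    """Draw k individuals without replacement and count genotype classes."""
    sample = rng.sample(genotypes, k)
    return sample.count("AA"), sample.count("AB"), sample.count("BB")


# Number of sites expected to reject at P < 0.05 by chance among 115 assays.
expected_by_chance = 115 * 0.05

# Toy example: 100 individuals with a slight heterozygote deficit.
d, chi2, p_value = hwe_test(30, 40, 30)
print(f"D = {d:.3f}, chi2 = {chi2:.2f}, P = {p_value:.4f}")

# Subsample 96 individuals from a simulated panel of 2,000.
rng = random.Random(1)
panel = ["AA"] * 520 + ["AB"] * 960 + ["BB"] * 520
print(hwe_test(*subsample_counts(panel, 96, rng)))
```

Repeating the subsampling step many times per site and counting how often P < 0.05 would mimic the 1,000-replicate experiment described above.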
Table: OLA genotypes compared with sequence data (columns: Number of OLA and sequence data points; Identical data points).
In the SNP genotyping literature, repeatability, or how often a technology gives concordant genotypes across replicates, is sometimes used as a surrogate for accuracy, or how often a technology yields the correct genotype. We suspect that the cases of incorrect genotype calls caused by uncontrolled secondary SNPs that we mention above are highly repeatable. Thus, for ligation-based genotyping of material not subject to resequencing multiple alleles, measures of repeatability will overestimate the genotyping accuracy for some SNPs.
Conversion and call rate
We attempted to genotype 168 SNPs and biallelic insertion/deletion polymorphisms. If we ignore the 34 assays developed without regard to secondary SNPs in OLA genotyping oligo binding regions, 86% (115/134) of the assays convert. This conversion rate is particularly notable because it is derived from the actual production genotyping pipeline rather than independent proof-of-principle experiments. Furthermore, subsequent work has demonstrated very similar conversion rates for OLA genotyping assays conducted at 12- and 16-plex (data not shown). The call rate (that is, the number of individuals assigned a genotype) for the 115 converting assays here averages 93.9%, and we estimate the miscall rate to be <0.35%. Over the 115 converting assays, on average 1.1% of the individuals were assigned a genotype for only one of the two replicate spots on the membrane, and just 0.06% were assigned different genotypes for each replicate spot. Thus, for a very slight reduction in assay robustness, one could effectively double membrane density, and therefore throughput, by spotting samples only once.
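The replicate-spot statistics quoted above (individuals called on only one spot, or called differently on the two spots) can be tabulated with a few lines of code. The sketch below is a hypothetical summary routine, not the R genotype-calling script from Additional data file 8; the category names and the use of `None` for an uncalled spot are assumptions.

```python
from collections import Counter


def classify_pair(call1, call2):
    """Classify the two replicate-spot calls for one individual.

    None means no genotype was assigned for that spot.
    """
    if call1 is None and call2 is None:
        return "uncalled"
    if call1 is None or call2 is None:
        return "single"
    return "concordant" if call1 == call2 else "discordant"


def summarize(pairs):
    """Return the fraction of individuals in each replicate category."""
    tally = Counter(classify_pair(a, b) for a, b in pairs)
    n = len(pairs)
    return {key: tally[key] / n
            for key in ("concordant", "single", "discordant", "uncalled")}


# Toy data for one assay: (replicate spot 1, replicate spot 2).
toy_pairs = [("CC", "CC"), ("CT", None), ("CT", "TT"), (None, None)]
summary = summarize(toy_pairs)
print(summary)
```

Under single spotting, the individuals in the "single" category are the ones at risk of losing their call, which is the trade-off against doubled membrane density noted above.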
Comparison with existing methods
It has been pointed out by Syvänen [9, 10] that while a plethora of SNP genotyping platforms exist, they are generally based on only a small number of basic reaction principles (for example, OLA, allele-specific oligonucleotide (ASO) hybridization [24, 40, 41], and single-base extension), assay formats (for example, arrays, beads/microparticles, electrophoresis), and allele detection methods (for example, fluorescence, radiation, size separation, mass spectrometry). As such, most SNP genotyping platforms can be seen as modular, and the system we describe here is no exception: following an initial, complexity-reducing PCR amplification, we genotype multiple SNPs in liquid-phase using OLA reactions, and subsequently detect SNP alleles by hybridizing radiolabeled probes to nylon membrane arrays.
Ligation-based genotyping was originally developed by Landegren et al., and many SNP genotyping methods have taken advantage of its high specificity and multiplexing capability [17, 18, 22, 28, 29, 36, 44–55]. A common way to distinguish the products of a multiplex genotyping reaction (not only OLA-based reactions) is to incorporate specific nucleotide sequences (variously called barcodes, addresses, zip-codes, stuffer sequences, or tags) into the allele-specific genotyping oligos [17, 18, 28, 29, 35, 37, 38, 42, 44, 53–57]. In combination with fluorescent labeling of oligonucleotides, this procedure allows different SNPs, and alternative SNP alleles, to be recognized. In the system we describe, alleles are detected by hybridizing radiolabeled oligonucleotide probes - complementary to the barcode sequences - to nylon membrane arrays of denatured, PCR amplified OLA products. This has the advantage of allowing a very large sample of individuals (up to 4,608) to be simultaneously genotyped for an intermediate number of SNPs (by probing multiple membranes). A reverse approach, pioneered by Gerry et al., is to probe universal barcode, or tag arrays, with the genotyping reaction products, and discriminate alleles with fluorescent labels. Tag arrays have been employed in a variety of SNP genotyping technologies [16–18, 28, 35, 37, 38, 42, 54, 55, 57]. Given that the density of features on a tag array can be very high, methods that make use of them can genotype a very large number of SNPs simultaneously. However, because the number of individuals assayed is dependent on how many tag arrays can be examined, projects may be limited to hundreds, rather than thousands, of individuals. To increase the number of individuals assayed for a more modest number of SNPs, some researchers have had success using arrays-of-arrays [37, 58]. Small tag arrays are printed in standard microtitre plate format, such that the contents of each well (a multiplexed genotyping reaction for a single individual) are hybridized to a separate array.
Array-based technologies are in widespread use. Arrays are used for applications as diverse as whole-genome expression profiling, polymorphism identification, and sequencing, and some of the companies providing ultra high-throughput genotyping solutions (for example, Illumina, Affymetrix) employ arrays. Nevertheless, SNP genotyping on arrays may not be an ideal solution for all researchers, particularly those with moderate genotyping requirements who may not wish to invest in array equipment. There are a variety of methods available that use the flexibility of ligation-based genotyping to generate sets of fluorescently labeled products of differing electrophoretic mobility that can be resolved on an automated capillary sequencing instrument [22, 29, 44, 46, 48, 52].
The full cost of any method is difficult to measure, and also may not translate well among institutions. We estimate that the cost in consumables (for example, oligonucleotides, reagents, plasticware, nylon membranes, and radiation/disposal costs), including the cost of failing assays, for the work presented in this study is less than $0.03/genotype. Across genotyping technologies, this is at the lower end of the cost per genotype scale. In common with every other genotyping method, some form of robotic liquid-handling system is required for our approach, as is a reasonable thermocycling capacity. Unlike some other methods, however, the platform-specific requirements of the method we outline are few (membrane arraying robot, hybridization oven, phosphor imager, and phosphor screens), and we contend that much of this equipment is available to the majority of academic researchers, or in the case of the arraying robot, can be inexpensively built (Additional data file 6).
An ideal genotyping system, capable of genotyping millions of SNPs for thousands of individuals at low cost, does not exist. Therefore, the best genotyping method must be chosen on the basis of the specific requirements of the envisioned genotyping project, and the resources available. Our method adds to the diversity of the available technology, in particular because it fits into a multiplexing niche (high panel size, moderate number of SNPs) not well covered by existing technologies, and because of its open-source nature. Our method has been developed specifically for the high-depth association mapping applications we carry out in our laboratory (for example, Macdonald et al.), and the method achieves cost-effectiveness in large part due to the very large panel sizes employed. Thus, the method is very unlikely to be suitable for projects involving thousands of SNPs in just a few hundred individuals, or for projects that do not involve a large fixed panel of individuals. Radioactive allele-detection also contributes to the low cost of the presented method. Such a detection strategy is clearly unwieldy in an ultra-high-throughput genome center. As such, we envisage our technology being employed in individual academic research laboratories where, given the widespread use of other radiation-based approaches, utilizing radiation is presumably not a barrier. The open-source nature of our platform, in contrast to similar commercial genotyping systems (for example, Applied Biosystems' SNPlex), may also be attractive to some researchers, as it allows the technology to be altered to suit a specific need. The method we outline may fill a genotyping niche in an academic research environment where commercial solutions are unavailable, as is regularly the case for those working on the genetics of nonhuman systems.
Conclusion
We describe a genotyping pipeline that uses a multiplexed OLA applied to PCR amplified DNA, followed by amplification of ligation products using common primers, and array-based detection. We tested 168 genotyping assays in parallel for a panel of 2,000 D. melanogaster individuals, and collected over a quarter of a million genotypes at a cost of less than $0.03/genotype. The assay conversion rate was 86%, and for converting assays 94% of the individuals were assigned a genotype with 99.65% accuracy, as determined by dideoxy sequencing. The methods we describe do not require a great deal of specialized equipment, and may be of great utility for carrying out high-power association mapping of candidate gene regions in individual laboratories. The methodology may help bridge the gap between highly multiplexed technologies capable of genotyping thousands of sites simultaneously, but which can be very costly for large samples of individuals, and methods that are easily extended to large populations, but can be difficult to multiplex beyond a small number of SNPs.
Materials and methods
Genomic DNA and PCR amplification
Over 2,000 Drosophila melanogaster flies were collected from a single outbred population, and genomic DNA was extracted from each using the PureGene cell and tissue DNA isolation kit (Gentra Systems Inc., Minneapolis, MN, USA). The DNA from each fly was diluted to 200 μl in 0.1 × TE (1 mM Tris-HCl pH 8.0, 0.1 mM EDTA), and 1 μl aliquoted directly into a series of 384-well plates and dried down. The resulting DNA panel consisted of six 384-well plates (including the 2,000 outbred individuals and a variety of controls), and each set of DNA was used as template in standard 5 μl PCR reactions. We amplified twelve 2 to 3 kb amplicons for the complete panel of D. melanogaster DNA: eleven amplicons were developed across the Enhancer of split locus, and a single amplicon was developed upstream of the hairy locus. Oligo sequences are listed in Additional data file 2.
We identified polymorphisms using an alignment of 16 resequenced alleles for the Enhancer of split locus (GenBank accession numbers AY779906 to AY779921; Additional data file 3), and designed genotyping assays for 156 biallelic polymorphisms (both SNPs and simple insertion/deletion events). Also, an alignment of 10 alleles for the hairy locus (GenBank accession numbers AY055833 to AY055842) was used to design genotyping assays for 12 SNPs upstream of the hairy gene. Genotyping oligo sequences are listed in Additional data file 2, and details of the polymorphisms are provided in Additional data file 4.
Probes and barcode sequences
Unmodified genotyping oligos were purchased at the lowest synthesis scale from Illumina Inc. (San Diego, CA, USA) and from Sigma-Genosys (St. Louis, MO, USA), and resuspended at a concentration of 100 μM in 1 × low EDTA TE (10 mM Tris-HCl pH 8.0, 0.1 mM EDTA). Downstream genotyping oligos were individually phosphorylated at the 5' end in 12.5 μl reactions containing 1 × T4 polynucleotide kinase buffer (New England Biolabs Inc., Ipswich, MA, USA), 1 mM ATP, 10 units T4 polynucleotide kinase (NEB), and 200 pmol oligo. These reactions were incubated for 60 minutes at 37°C and 20 minutes at 65°C. We found it difficult to reliably phosphorylate several oligonucleotides simultaneously (data not shown).
OLA and OLA amplification reaction conditions
The OLA reactions are just 3 μl in volume, and contain 1 × OLA buffer (50 mM Tris-HCl pH 8.5, 50 mM KCl, 7.5 mM MgCl2, 1 mM NAD), 2.5 mM dithiothreitol, 1.6 units Taq (Thermus aquaticus) DNA ligase (NEB), and 0.03 pmol of each genotyping oligo. Each OLA reaction mix is spiked with 0.2 μl of PCR product using a HydraII 96-syringe pipetting unit (Matrix Technologies Corporation, Hudson, NH, USA). The small reaction sizes ensure that reagent costs are kept to a minimum. Ligation is performed using the following cycling profile: initial denaturation for 5 minutes at 95°C, followed by 3 cycles of 30 s at 95°C and 25 minutes at 45°C, followed by storage at 4°C. When perfectly matched up- and downstream oligos are juxtaposed to form a duplex with the amplified DNA they are ligated together (Figure 1a). The OLA is very efficient at discriminating between perfectly and imperfectly matched upstream oligonucleotides [27, 44, 62]. We genotyped 168 query polymorphisms using this approach; 160 of these were assayed in twenty 8-plex OLA reactions using single 2 to 3 kb amplicons as template, while the remaining 8 were genotyped in a single 8-plex reaction using two pooled PCR amplicons as template.
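The scale of the ligation step follows from the numbers given above and in the description of the DNA panel. The short sketch below works through that arithmetic: the number of 8-plex reactions, the number of individual OLA wells across six 384-well plates, the total reaction volume, and the programmed hold time of the ligation profile. Grouping sites into reactions by list order is a placeholder assumption (in the study, sites were grouped by PCR amplicon), and the hold time ignores thermal cycler ramping.

```python
PLEX = 8                 # sites per OLA reaction
N_SITES = 168            # query polymorphisms
WELLS_PER_PLATE = 384
N_PLATES = 6             # DNA panel size, in plates
OLA_VOLUME_UL = 3.0      # volume of one OLA reaction


def chunk(items, size):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# Placeholder site names; real assays are grouped by template amplicon.
sites = [f"site{i:03d}" for i in range(1, N_SITES + 1)]
reactions = chunk(sites, PLEX)

n_wells = WELLS_PER_PLATE * N_PLATES
total_ola_wells = len(reactions) * n_wells
total_volume_ml = total_ola_wells * OLA_VOLUME_UL / 1000

# 5 min initial denaturation + 3 cycles of (30 s + 25 min); ramping ignored.
ligation_minutes = 5 + 3 * (0.5 + 25)

print(len(reactions), "multiplex reactions per sample")
print(total_ola_wells, "OLA wells in total")
print(f"{total_volume_ml:.1f} ml total OLA reaction volume")
print(f"{ligation_minutes} min programmed ligation time per plate")
```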
Ligation products are PCR amplified using M13 forward and reverse primers matching the tails incorporated into the up- and downstream genotyping oligos (Figure 1b). To minimize plate handling, this is achieved by adding 12 μl post-OLA amplification cocktail directly to the OLA reactions. The amplification cocktail consists of 1 × amplification buffer (50 mM KCl, 0.1% Triton X-100), 50 μM each dNTP (NEB), Taq DNA polymerase, and 1 μM of the M13 forward and reverse amplification oligos. The ligation products are amplified using the following cycling profile: initial denaturation for 2 minutes at 94°C, followed by 32 cycles of 25 s at 94°C, 35 s at 58°C and 35 s at 72°C, followed by 2 minutes at 72°C, and storage at 4°C.
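For planning thermocycler capacity, the amplification profile can be written down as a small data structure and its programmed hold time summed. This is a minimal sketch: the class names `Step` and `Stage` are invented for the example, the final 4°C storage step is omitted, and ramp times between temperatures are ignored, so real run times will be longer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float    # block temperature in degrees Celsius
    seconds: int     # hold time at that temperature


@dataclass(frozen=True)
class Stage:
    steps: tuple     # Steps run in order
    cycles: int = 1  # number of times the steps are repeated

    def hold_seconds(self):
        return self.cycles * sum(step.seconds for step in self.steps)


# Post-OLA amplification profile as described in the text.
AMPLIFICATION = (
    Stage((Step(94, 120),)),                                   # initial denaturation
    Stage((Step(94, 25), Step(58, 35), Step(72, 35)), cycles=32),
    Stage((Step(72, 120),)),                                   # final extension
)

hold_seconds = sum(stage.hold_seconds() for stage in AMPLIFICATION)

# 3 ul OLA reaction plus 12 ul amplification cocktail.
final_volume_ul = 3 + 12

print(f"{hold_seconds} s ({hold_seconds / 60:.1f} min) programmed hold time")
print(f"{final_volume_ul} ul final reaction volume")
```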
Array-based allele detection
The 15 μl OLA amplification reactions are dried down at 65°C in a thermal cycler, resuspended in denaturing buffer (0.5 M NaOH, 1.5 M NaCl), heated for 15 minutes at 65°C and 5 minutes at 95°C, and immediately arrayed onto uncharged nylon membranes (Millipore Corporation, Billerica, MA, USA) without cleanup. Following UV cross-linking at 50 mJ, membranes are bathed in neutralization buffer (0.4 M Tris-HCl pH 7.4, 2× SSC) for 30 minutes, and stored at 4°C in neutralization buffer until required. Our home-built Cartesian arraying robot uses 384 solid pins (V & P Scientific Inc., San Diego, CA, USA), can be inexpensively constructed (Additional data file 6), and is controlled by our custom Arrayatron perlscript (Additional data file 7) from a regular PC. Our standard production macroarray membranes are 120 mm × 75 mm, and hold 4,608 features. Each sample was printed in duplicate, resulting in a set of 10 membranes holding SNPs incorporating barcode pairs 01 to 08, and a set of 11 membranes holding SNPs incorporating barcode pairs 09 to 16. Each set of membranes was combined in a single hybridization tube, and pre-hybridized for 3 hours (overnight for first use of membranes) at 42°C in 5 ml hybridization buffer (0.525 M sodium phosphate buffer pH 7.2, 7% SDS, 1 mM EDTA, 10 mg/ml bovine serum albumin) containing 0.1 mg/ml denatured sonicated herring sperm DNA. Following pre-hybridization, the membranes were hybridized for 4 hours at 42°C in 5 ml hybridization buffer with 0.1 mg/ml denatured sonicated herring sperm DNA and a [γ-33P]ATP end-labeled oligonucleotide probe (complementary to the barcode sequence; Table 2). The 10 μl end-labeling reaction contains 1 × T4 polynucleotide kinase buffer (NEB), 10 units T4 polynucleotide kinase (NEB), 1 μM oligo, and 2 μCi/μl [γ-33P]ATP (PerkinElmer Life and Analytical Sciences Inc., Boston, MA, USA), and is incubated for 40 minutes at 37°C and 15 minutes at 80°C.
After hybridization, the membranes are washed five times for 20 minutes at 40°C in 50 ml washing buffer (5 × SSPE, 0.1% SDS), and exposed against phosphor screens (Figure 1c). After scanning, membranes are stripped for 15 minutes at 80°C in 50 ml stripping buffer (0.1% SDS), and stored at 4°C in neutralization buffer until re-probing.
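The membrane capacity and the number of probings needed follow from figures given in this section. The sketch below checks that six 384-well plates spotted in duplicate fill the 4,608 features of one membrane, and counts the hybridizations required when each membrane set is probed sequentially with 16 barcode probes (8 barcode pairs, one probe per allele). The variable names are illustrative, and the assumption that every membrane carries exactly one 8-plex reaction is inferred from the 21 reactions and 21 membranes described in the text.

```python
WELLS_PER_PLATE = 384
N_PLATES = 6
REPLICATE_SPOTS = 2
SITES_PER_MEMBRANE = 8    # one 8-plex OLA reaction per membrane (assumed)
PROBES_PER_SET = 16       # 8 barcode pairs x 2 alleles

# Number of membranes in each set, keyed by the barcode pairs they carry.
membrane_sets = {"barcode pairs 01-08": 10, "barcode pairs 09-16": 11}

features_per_membrane = WELLS_PER_PLATE * N_PLATES * REPLICATE_SPOTS
n_membranes = sum(membrane_sets.values())
sites_scored = n_membranes * SITES_PER_MEMBRANE

# Each set shares one hybridization tube and is probed once per barcode.
hybridizations = len(membrane_sets) * PROBES_PER_SET

# Every probing reads one allele of one site on every membrane in the set.
allele_readouts = sum(n * PROBES_PER_SET for n in membrane_sets.values())

print(features_per_membrane, "features per membrane")
print(sites_scored, "sites scored on", n_membranes, "membranes")
print(hybridizations, "hybridizations;", allele_readouts, "site-allele readouts")
```

Because barcodes are reused across sites, the number of probings depends on the plex level and the number of membrane sets rather than on the total number of sites.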
In concert with recycling barcodes across different SNPs, hybridizing multiple membranes allows simultaneous scoring of many SNPs. Radioactive detection is cost-effective, robust, and does not require a great deal of equipment (for example, hybridization oven, phosphor imager) not already available to many investigators. We have found, however, that the same arrays simultaneously probed with two infrared-labeled probes (IR-700 and IR-800) and detected using an Odyssey imaging system (Li-Cor Inc., Lincoln, NE, USA) yield equivalent genotypes. This non-radioactive detection system has several advantages and may prove a worthwhile extension to our method.
Additional data files
The following additional files are available with the online version of this article. Additional data file 1 is a PDF providing full step-by-step protocols for the described SNP genotyping platform. Additional data file 2 is a spreadsheet giving all of the oligonucleotide sequences used for PCR, sequencing and genotyping. Additional data file 3 holds the alignment of the 16 D. melanogaster alleles sequenced for the Enhancer of split gene region. Additional data file 4 is a spreadsheet providing details of the 168 polymorphisms assayed in this study. Additional data file 5 is the SNPatron perlscript, used to extract the sequence flanking all SNPs and polymorphic insertion/deletion events from a set of aligned sequences. Additional data file 6 is a PDF describing the construction of our arraying robot. Additional data file 7 presents the Arrayatron perlscript used to control the arraying robot. Additional data file 8 gives the script used to call genotypes, which is written in the statistical programming language R. The background-subtracted array intensity data for each allele from each genotyped site are provided in Additional data files 9 (replicate spot 1) and 10 (replicate spot 2), and the called genotypes are given in Additional data file 11. Additional data file 12 plots the intensity data for the entire panel of individuals for the 19 SNPs used in the genotype-validation test, with the tested individuals color-coded by the genotype assigned.
Acknowledgements
We thank JD Gruber and three anonymous reviewers for helpful comments on the manuscript. This work was supported by National Institutes of Health grant GM 58564 to A.D.L.
References
- Carlson CS, Eberle MA, Kruglyak L, Nickerson DA: Mapping complex disease loci in whole-genome association studies. Nature. 2004, 429: 446-452. 10.1038/nature02623.PubMedView ArticleGoogle Scholar
- Risch N, Merikangas K: The future of genetic studies of complex human diseases. Science. 1996, 273: 1516-1517.PubMedView ArticleGoogle Scholar
- Kruglyak L: Prospects of whole-genome linkage disequilibrium mapping of common disease genes. Nat Genet. 1999, 22: 139-144. 10.1038/9642.PubMedView ArticleGoogle Scholar
- Long AD, Langley CH: The power of association studies to detect the contribution of candidate genetic loci to variation in complex traits. Genome Res. 1999, 9: 720-731.PubMedPubMed CentralGoogle Scholar
- Kruglyak L, Nickerson DA: Variation is the spice of life. Nat Genet. 2001, 27: 234-236. 10.1038/85776.PubMedView ArticleGoogle Scholar
- Altshuler D, Hirschhorn JN, Klannemark M, Lindgren CM, Vohl MC, Nemesh J, Lane CR, Schaffner SF, Bolk S, Brewer C, et al: The common PPARγPro12Ala polymorphism is associated with decreased risk of type 2 diabetes. Nat Genet. 2000, 26: 76-80. 10.1038/79839.PubMedView ArticleGoogle Scholar
- Lohmueller KE, Pearce CL, Pike M, Lander ES, Hirschhorn JN: Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease. Nat Genet. 2003, 33: 177-182. 10.1038/ng1071.
- Kwok PY: Methods for genotyping single nucleotide polymorphisms. Annu Rev Genomics Hum Genet. 2001, 2: 235-258. 10.1146/annurev.genom.2.1.235.
- Syvänen AC: Accessing genetic variation: genotyping single nucleotide polymorphisms. Nat Rev Genet. 2001, 2: 930-942. 10.1038/35103535.
- Syvänen AC: Toward genome-wide SNP genotyping. Nat Genet. 2005, 37 Suppl: S5-S10. 10.1038/ng1558.
- Hinds DA, Stuve LL, Nilsen GB, Halperin E, Eskin E, Ballinger DG, Frazer KA, Cox DR: Whole-genome patterns of common DNA variation in three human populations. Science. 2005, 307: 1072-1079. 10.1126/science.1105436.
- Fodor SP, Read JL, Pirrung MC, Stryer L, Lu AT, Solas D: Light-directed, spatially addressable parallel chemical synthesis. Science. 1991, 251: 767-773.
- Pease AC, Solas D, Sullivan EJ, Cronin MT, Holmes CP, Fodor SP: Light-generated oligonucleotide arrays for rapid DNA sequence analysis. Proc Natl Acad Sci USA. 1994, 91: 5022-5026.
- Matsuzaki H, Loi H, Dong S, Tsai YY, Fang J, Law J, Di X, Liu WM, Yang G, Liu G, et al: Parallel genotyping of over 10,000 SNPs using a one-primer assay on a high-density oligonucleotide array. Genome Res. 2004, 14: 414-425. 10.1101/gr.2014904.
- Matsuzaki H, Dong S, Loi H, Di X, Liu G, Hubbell E, Law J, Berntsen T, Chadha M, Hui H, et al: Genotyping over 100,000 SNPs on a pair of oligonucleotide arrays. Nat Methods. 2004, 1: 109-111. 10.1038/nmeth718.
- Oliphant A, Barker DL, Stuelpnagel JR, Chee MS: BeadArray technology: enabling an accurate, cost-effective approach to high-throughput genotyping. Biotechniques. 2002, Suppl: 56-58.
- Fan JB, Oliphant A, Shen R, Kermani BG, Garcia F, Gunderson KL, Hansen M, Steemers F, Butler SL, Deloukas P, et al: Highly parallel SNP genotyping. Cold Spring Harbor Symp Quant Biol. 2003, 68: 69-78. 10.1101/sqb.2003.68.69.
- Shen R, Fan JB, Campbell D, Chang W, Chen J, Doucet D, Yeakley J, Bibikova M, Wickham Garcia E, McBride C, et al: High-throughput SNP genotyping on universal bead arrays. Mutat Res. 2005, 573: 70-82.
- Ronaghi M, Uhlen M, Nyren P: A sequencing method based on real-time pyrophosphate. Science. 1998, 281: 363-365. 10.1126/science.281.5375.363.
- Alderborn A, Kristofferson A, Hammerling U: Determination of single-nucleotide polymorphisms by real-time pyrophosphate DNA sequencing. Genome Res. 2000, 10: 1249-1258. 10.1101/gr.10.8.1249.
- Livak KJ: Allelic discrimination using fluorogenic probes and the 5' nuclease assay. Genet Anal. 1999, 14: 143-149.
- De La Vega FM, Lazaruk KD, Rhodes MD, Wenz MH: Assessment of two flexible and compatible SNP genotyping platforms: TaqMan SNP genotyping assays and the SNPlex genotyping system. Mutat Res. 2005, 573: 111-135.
- Chen X, Levine L, Kwok PY: Fluorescence polarization in homogeneous nucleic acid analysis. Genome Res. 1999, 9: 492-498.
- Saiki RK, Bugawan TL, Horn GT, Mullis KB, Erlich HA: Analysis of enzymatically amplified β-globin and HLA-DQα DNA with allele-specific oligonucleotide probes. Nature. 1986, 324: 163-166. 10.1038/324163a0.
- Mullis KB, Faloona FA: Specific synthesis of DNA in vitro via a polymerase-catalyzed chain reaction. Methods Enzymol. 1987, 155: 335-350.
- Southern EM: Detection of specific sequences among DNA fragments separated by gel electrophoresis. J Mol Biol. 1975, 98: 503-517.
- Landegren U, Kaiser R, Sanders J, Hood L: A ligase-mediated gene detection technique. Science. 1988, 241: 1077-1080.
- Gerry NP, Witowski NE, Day J, Hammer RP, Barany G, Barany F: Universal DNA microarray method for multiplex detection of low abundance point mutations. J Mol Biol. 1999, 292: 251-262. 10.1006/jmbi.1999.3063.
- van Eijk MJ, Broekhof JL, van der Poel HJ, Hogers RC, Schneiders H, Kamerbeek J, Verstege E, van Aart JW, Geerlings H, Buntjer JB, et al: SNPWave: a flexible multiplexed SNP genotyping technology. Nucleic Acids Res. 2004, 32: e47. 10.1093/nar/gnh045.
- The Patrick Brown Laboratory Guide to Microarraying. [http://cmgm.stanford.edu/pbrown/mguide/index.html]
- Macdonald SJ, Long AD: Identifying signatures of selection at the Enhancer of split neurogenic gene complex in Drosophila. Mol Biol Evol. 2005, 22: 607-619. 10.1093/molbev/msi046.
- Robin C, Lyman RF, Long AD, Langley CH, Mackay TF: hairy: a quantitative trait locus for Drosophila sensory bristle number. Genetics. 2002, 162: 155-164.
- Hosking L, Lumsden S, Lewis K, Yeo A, McCarthy L, Bansal A, Riley J, Purvis I, Xu CF: Detection of genotyping errors by Hardy-Weinberg equilibrium testing. Eur J Hum Genet. 2004, 12: 395-399. 10.1038/sj.ejhg.5201164.
- Weir BS: Genetic Data Analysis II. 1996, Sunderland, Massachusetts: Sinauer Associates, Inc. Publishers.
- Hirschhorn JN, Sklar P, Lindblad-Toh K, Lim YM, Ruiz-Gutierrez M, Bolk S, Langhorst B, Schaffner S, Winchester E, Lander ES: SBE-TAGS: an array-based method for efficient single-nucleotide polymorphism genotyping. Proc Natl Acad Sci USA. 2000, 97: 12164-12169. 10.1073/pnas.210394597.
- Faruqi AF, Hosono S, Driscoll MD, Dean FB, Alsmadi O, Bandaru R, Kumar G, Grimwade B, Zong Q, Sun Z, et al: High-throughput genotyping of single nucleotide polymorphisms with rolling circle amplification. BMC Genomics. 2001, 2: 4. 10.1186/1471-2164-2-4.
- Bell PA, Chaturvedi S, Gelfand CA, Huang CY, Kochersperger M, Kopla R, Modica F, Pohl M, Varde S, Zhao R, et al: SNPstream UHT: ultra-high throughput SNP genotyping for pharmacogenomics and drug discovery. Biotechniques. 2002, Suppl: 70-72.
- Hardenbol P, Banér J, Jain M, Nilsson M, Namsaraev EA, Karlin-Neumann GA, Fakhrai-Rad H, Ronaghi M, Willis TD, Landegren U, Davis RW: Multiplexed genotyping with sequence-tagged molecular inversion probes. Nat Biotechnol. 2003, 21: 673-678. 10.1038/nbt821.
- Genissel A, Pastinen T, Dowell A, Mackay TF, Long AD: No evidence for an association between common nonsynonymous polymorphisms in Delta and bristle number variation in natural and laboratory populations of Drosophila melanogaster. Genetics. 2004, 166: 291-306. 10.1534/genetics.166.1.291.
- Wallace RB, Shaffer J, Murphy RF, Bonner J, Hirose T, Itakura K: Hybridization of synthetic oligodeoxyribonucleotides to Φχ174 DNA: the effect of single base pair mismatch. Nucleic Acids Res. 1979, 6: 3543-3557.
- Conner BJ, Reyes AA, Morin C, Itakura K, Teplitz RL, Wallace RB: Detection of sickle cell βS-globin allele by hybridization with synthetic oligonucleotides. Proc Natl Acad Sci USA. 1983, 80: 278-282.
- Hardenbol P, Yu F, Belmont J, MacKenzie J, Bruckner C, Brundage T, Boudreau A, Chow S, Eberle J, Erbilgin A, et al: Highly multiplexed molecular inversion probe genotyping. Over 10,000 targeted SNPs genotyped in a single tube assay. Genome Res. 2005, 15: 269-275. 10.1101/gr.3185605.
- Syvänen AC, Aalto-Setala K, Harju L, Kontula K, Söderlund H: A primer-guided nucleotide incorporation assay in the genotyping of apolipoprotein E. Genomics. 1990, 8: 684-692. 10.1016/0888-7543(90)90255-S.
- Schouten JP, McElgunn CJ, Waaijer R, Zwijnenburg D, Diepvens F, Pals G: Relative quantification of 40 nucleic acid sequences by multiplex ligation-dependent probe amplification. Nucleic Acids Res. 2002, 30: e57. 10.1093/nar/gnf056.
- Nickerson DA, Kaiser R, Lappin S, Stewart J, Hood L, Landegren U: Automated DNA diagnostics using an ELISA-based oligonucleotide ligation assay. Proc Natl Acad Sci USA. 1990, 87: 8923-8927.
- Grossman PD, Bloch W, Brinson E, Chang CC, Eggerding FA, Fung S, Iovannisci DM, Woo S, Winn-Deen ES: High-density multiplex detection of nucleic acid sequences: oligonucleotide ligation assay and sequence-coded separation. Nucleic Acids Res. 1994, 22: 4527-4534.
- Samiotaki M, Kwiatkowski M, Parik J, Landegren U: Dual-color detection of DNA sequence variants by ligase-mediated analysis. Genomics. 1994, 20: 238-242. 10.1006/geno.1994.1159.
- Eggerding FA: A one-step coupled amplification and oligonucleotide ligation procedure for multiplex genetic typing. PCR Methods Appl. 1995, 4: 337-345.
- Delahunty C, Ankener W, Deng Q, Eng J, Nickerson DA: Testing the feasibility of DNA typing for human identification by PCR and an oligonucleotide ligation assay. Am J Hum Genet. 1996, 58: 1239-1246.
- Tobe VO, Taylor SL, Nickerson DA: Single-well genotyping of diallelic sequence variations by a two-color ELISA-based oligonucleotide ligation assay. Nucleic Acids Res. 1996, 24: 3728-3732. 10.1093/nar/24.19.3728.
- Nilsson M, Krejci K, Koch J, Kwiatkowski M, Gustavsson P, Landegren U: Padlock probes reveal single-nucleotide differences, parent of origin and in situ distribution of centromeric sequences in human chromosomes 13 and 21. Nat Genet. 1997, 16: 252-255. 10.1038/ng0797-252.
- Favis R, Day JP, Gerry NP, Phelan C, Narod S, Barany F: Universal DNA array detection of small insertions and deletions in BRCA1 and BRCA2. Nat Biotechnol. 2000, 18: 561-564. 10.1038/75452.
- Iannone MA, Taylor JD, Chen J, Li MS, Rivers P, Slentz-Kesler KA, Weiner MP: Multiplexed single nucleotide polymorphism genotyping by oligonucleotide ligation and flow cytometry. Cytometry. 2000, 39: 131-140. 10.1002/(SICI)1097-0320(20000201)39:2<131::AID-CYTO6>3.0.CO;2-U.
- Busti E, Bordoni R, Castiglioni B, Monciardini P, Sosio M, Donadio S, Consolandi C, Rossi Bernardi L, Battaglia C, De Bellis G: Bacterial discrimination by means of a universal array approach mediated by LDR (ligase detection reaction). BMC Microbiol. 2002, 2: 27. 10.1186/1471-2180-2-27.
- Banér J, Isaksson A, Waldenström E, Jarvius J, Landegren U, Nilsson M: Parallel gene analysis with allele-specific padlock probes and tag microarrays. Nucleic Acids Res. 2003, 31: e103. 10.1093/nar/gng104.
- Chen J, Iannone MA, Li MS, Taylor JD, Rivers P, Nelsen AJ, Slentz-Kesler KA, Roses A, Weiner MP: A microsphere-based assay for multiplexed single nucleotide polymorphism analysis using single base chain extension. Genome Res. 2000, 10: 549-557. 10.1101/gr.10.4.549.
- Fan JB, Chen X, Halushka MK, Berno A, Huang X, Ryder T, Lipshutz RJ, Lockhart DJ, Chakravarti A: Parallel genotyping of human SNPs using generic high-density oligonucleotide tag arrays. Genome Res. 2000, 10: 853-860. 10.1101/gr.10.6.853.
- Pastinen T, Raitio M, Lindroos K, Tainola P, Peltonen L, Syvänen AC: A system for specific, high-throughput genotyping by allele-specific primer extension on microarrays. Genome Res. 2000, 10: 1031-1042. 10.1101/gr.10.7.1031.
- Borevitz JO, Liang D, Plouffe D, Chang HS, Zhu T, Weigel D, Berry CC, Winzeler E, Chory J: Large-scale identification of single-feature polymorphisms in complex genomes. Genome Res. 2003, 13: 513-523. 10.1101/gr.541303.
- Zwick ME, Mcafee F, Cutler DJ, Read TD, Ravel J, Bowman GR, Galloway DR, Mateczun A: Microarray-based resequencing of multiple Bacillus anthracis isolates. Genome Biol. 2005, 6: R10. 10.1186/gb-2004-6-1-r10.
- Macdonald SJ, Pastinen T, Long AD: The effect of polymorphisms in the Enhancer of split gene complex on bristle number variation in a large wild-caught cohort of Drosophila melanogaster. Genetics. 2005.
- Luo J, Bergstrom DE, Barany F: Improving the fidelity of Thermus thermophilus DNA ligase. Nucleic Acids Res. 1996, 24: 3071-3078. 10.1093/nar/24.15.3071.
- The R Project for Statistical Computing. [http://www.R-project.org]
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.