Optimization of oligonucleotide arrays and RNA amplification protocols for analysis of transcript structure and alternative splicing
- John Castle†1,
- Phil Garrett-Engele†1,
- Christopher D Armour1,
- Sven J Duenwald1,
- Patrick M Loerch1,
- Michael R Meyer1,
- Eric E Schadt1,
- Roland Stoughton1,
- Mark L Parrish1,
- Daniel D Shoemaker1 and
- Jason M Johnson1
© 2003 Castle et al., licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
Received: 30 May 2003
Accepted: 15 August 2003
Published: 19 September 2003
Microarrays offer a high-resolution means for monitoring pre-mRNA splicing on a genomic scale. We have developed a novel, unbiased amplification protocol that permits labeling of entire transcripts. Also, hybridization conditions, probe characteristics, and analysis algorithms were optimized for detection of exons, exon-intron edges, and exon junctions. These optimized protocols can be used to detect small variations and isoform mixtures, map the tissue specificity of known human alternative isoforms, and provide a robust, scalable platform for high-throughput discovery of alternative splicing.
As the sequencing of the human and mouse genomes nears completion, the apparently similar number of genes in species of different complexities suggests that other sources of genomic richness are important, such as gene regulation, post-translational modification, and alternative splicing. Recent estimates from expressed sequence tag (EST) studies indicate that 40-60% of human genes are alternatively spliced [2, 3], and in many cases alternative isoforms result in proteins of distinct function. Biologically relevant isoform differences range from subtle, such as a few nucleotides at an alternative 5' or 3' splice site, to skipping several consecutive exons. Variant isoforms can be specific to tissue types or developmental stages and are involved in a large number of normal cellular functions. Defects in splicing also account for a substantial fraction of human genetic disease [5, 6].
The most common ways to identify alternative splicing events involve aligning and comparing EST and cDNA sequences from the same gene [2, 3, 7–16]. These methods are effective, but have significant limitations as a result of biases in transcript coverage and non-uniformity of tissue libraries or sampling. Reverse transcriptase polymerase chain reaction (RT-PCR) experiments followed by sequencing may also be used to discover novel isoforms. This approach can be powerful for analyzing a few genes in a small number of tissues, but it only provides a limited view of a gene's structure and is labor-intensive and challenging to scale up to thousands of genes and hundreds of tissues.
The highly parallel and sensitive nature of microarrays makes them ideal for monitoring gene expression on a tissue-specific, genome-wide level [17, 18]. Initial efforts have demonstrated that microarrays can be used to detect pre-mRNA splicing [19–21]. However, these early efforts have significant limitations. For instance, a typical experiment using oligonucleotide microarrays involves a 3'-biased labeling protocol and by necessity a probe or probes placed near the 3' end of the mRNA transcript. This experimental set-up limits discovery and monitoring of alternatively spliced isoforms to regions near the 3' end of the transcript. Placing probes within the 3' UTR, or omitting probes that span exon-exon junctions, also limits the types of isoforms that can potentially be monitored and detected. Methods using fiber-optic arrays require pre-selection of known isoforms of interest and were not designed for novel isoform discovery. The utility of probes to exon junctions for measuring intron retention in yeast has been demonstrated, but the use of array probes was not experimentally optimized to monitor and discover alternatively spliced isoforms in complex human samples. In addition, the RNA labeling approach used in the yeast system would not be appropriate for samples that require an amplification step because of limited tissue or RNA availability.
One contribution of this work is a full-length RNA amplification protocol that samples complete transcripts. This provides an alternative to standard amplification methods that prime from the 3' poly(A) tail and do not accurately reproduce sequences distant from the 3' end. This new protocol generates sufficient material for several hybridizations from as little as 5 μg total RNA or 50 ng mRNA as starting material. We also provide the results of array experiments used to define experimental parameters and analysis strategies for mapping intron-exon structure and alternative splicing at high resolution. Together these methods facilitate high-throughput discovery of alternative splicing events on a genomic scale.
Results and discussion
Our optimization efforts focused on two well characterized genes, retinoblastoma (RB1) and synexin (ANXA7). We selected RB1 because it has a well characterized genomic region and a relatively large number (27) of exons, whereas ANXA7 has two known isoforms that are differentially expressed in smooth and skeletal muscle.
We sought to develop and optimize microarray-based methods of determining the structure of transcripts that scale easily to many genes and many tissues. The methods we describe extend expression profiling to sub-exon resolution sufficient for detecting and discriminating between alternative splice forms. As alternative splicing might occur anywhere in a transcript, it is essential to use a protocol that labels the entire length of a transcript. One solution is simply to random-prime large amounts of mRNA using a one-step reverse transcription reaction [17, 21]. Although effective, this approach requires more than approximately 1 μg mRNA for every one to two hybridizations, which is unacceptably high for small or rare tissue samples.
Distinguishing exonic regions from intron and intragenic regions is a prerequisite for monitoring transcript structure. To determine optimal probe length and hybridization conditions for detecting and defining exons, we designed tiling arrays with probes placed in one-nucleotide intervals across the entire 180-kilobase (kb) genomic region encoding the RB1 gene. Separate tiling arrays for probe lengths 20, 25, 30, 35, 40, 45, 50, and 60 nucleotides were hybridized with labeled cDNA from two cell lines, Jurkat and K562, over a range of stringencies controlled by adjusting the amount of formamide in the hybridization buffer (20, 25, 30, 35, 40, and 45% formamide). We found that detection sensitivity for finding exons within genomic sequence, measured as the ratio between the intensity of probes in exons and probes in introns, peaks near 35% formamide and probe lengths of 50-60 nucleotides. These results are consistent with prior studies with ink-jet arrays [17, 24].
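The sensitivity measure described above, the ratio between exon-probe and intron-probe intensities, can be illustrated with a minimal sketch. The function names, the use of the arithmetic mean, and the toy numbers are assumptions made for illustration; the text does not specify whether means or medians, or raw or log intensities, were used.

```python
from statistics import mean

def exon_intron_ratio(intensities, is_exon):
    """Ratio of mean exon-probe intensity to mean intron-probe intensity.

    `intensities` and `is_exon` are parallel lists, one entry per probe.
    Using the mean of raw intensities is an assumption of this sketch.
    """
    exon = [x for x, flag in zip(intensities, is_exon) if flag]
    intron = [x for x, flag in zip(intensities, is_exon) if not flag]
    if not exon or not intron:
        raise ValueError("need at least one exon probe and one intron probe")
    return mean(exon) / mean(intron)

def best_condition(ratio_by_condition):
    """Return the (probe_length, formamide_percent) key with the highest ratio."""
    return max(ratio_by_condition, key=ratio_by_condition.get)

# Toy data (hypothetical values, not taken from the experiments).
toy_intensities = [100.0, 120.0, 10.0, 12.0, 8.0, 110.0]
toy_is_exon = [True, True, False, False, False, True]
toy_ratio = exon_intron_ratio(toy_intensities, toy_is_exon)

# One ratio per (probe length, % formamide) array; values are made up.
toy_grid = {(20, 35): 3.0, (50, 35): toy_ratio, (60, 45): 6.5}
print(toy_ratio, best_condition(toy_grid))
```

Computing one such ratio per array and comparing across the grid of probe lengths and formamide concentrations gives a simple way to locate the best-performing condition.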
Several conclusions can be drawn from these data. First, exon edges can be clearly identified by tiling at high resolution. Second, the intensities of shorter probes fall off sharply at exon boundaries, which may allow more precise estimates of splice sites (see below). One can also observe in Figure 4 a slight asymmetry in the probe intensity profiles extending from the 5' and 3' sides of the exon. This effect, due to polarity in probe sensitivity with respect to distance from the glass surface, necessitates slightly different edge-detection parameters for each side of the exon.
The accuracy of splice-site detection was also measured for different tiling step intervals. By computationally removing data from the original one-nucleotide step dataset (with 30% formamide and 35-nucleotide probes), we simulated data collected from probes at step intervals of 2-10 nucleotides. In Jurkat cells, where RB1 is highly expressed, edge-detection accuracy remains high even as the step interval increased to eight nucleotides between probes. However, in K562, the consistency of the predicted location decreased significantly for tiling step intervals greater than five nucleotides, suggesting that this parameter is sensitive to the expression level of the gene. In summary, splice sites can be detected by tiled microarray 'edge' probes to an accuracy of approximately 10 nucleotides, and step sizes equal to or less than five nucleotides are sufficient for this accuracy.
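The computational thinning of the one-nucleotide dataset can be sketched as follows. This is only an illustration of the idea of keeping every k-th probe; the function name and the `phase` parameter (which probe the thinned series starts on) are assumptions, not details given in the text.

```python
def subsample_tiling(positions, intensities, step, phase=0):
    """Simulate a coarser tiling by keeping every `step`-th probe.

    `positions` and `intensities` are parallel lists from a one-nucleotide
    tiling; `phase` selects which probe the thinned series starts on.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    if len(positions) != len(intensities):
        raise ValueError("positions and intensities must have equal length")
    return positions[phase::step], intensities[phase::step]

# Toy one-nucleotide tiling over 20 genomic positions (hypothetical values).
toy_positions = list(range(100, 120))
toy_intensities = [float(i) for i in range(20)]

for step in (2, 5, 10):
    kept_positions, _ = subsample_tiling(toy_positions, toy_intensities, step)
    print(step, kept_positions)
```

Repeating the edge estimate for each `phase` at a given step gives a set of predictions whose spread is one way to quantify the consistency discussed above.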
The idea of using probes complementary to exon-exon junctions to monitor splicing was published as early as 2000. These probes span two exons and should be at maximum intensity only when both exons are present and connected [15, 20, 21]. Here we show the results of experiments aimed at optimizing the use of junction probes (length, placement, and hybridization stringency) for monitoring alternative splicing.
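A junction probe can be pictured as a window of fixed length laid across the point where two exons are joined. The sketch below builds the transcript-sense sequence under such a window; the function names, the default length of 40 nucleotides, and the sign convention for `offset` are assumptions for illustration. Whether the synthesized probe is this sequence or its reverse complement depends on which strand is labeled, so a helper for the latter is included.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(sequence):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return sequence.translate(_COMPLEMENT)[::-1]

def junction_window(upstream_exon, downstream_exon, length=40, offset=0):
    """Transcript-sense sequence of `length` nt spanning an exon-exon junction.

    offset = 0 centers the window on the junction; a positive offset shifts
    it toward the downstream exon, a negative one toward the upstream exon.
    """
    n_up = length // 2 - offset      # nucleotides taken from the upstream exon
    n_down = length - n_up           # nucleotides taken from the downstream exon
    if n_up < 1 or n_down < 1:
        raise ValueError("window must include at least 1 nt from each exon")
    if n_up > len(upstream_exon) or n_down > len(downstream_exon):
        raise ValueError("exon shorter than the requested window half")
    return upstream_exon[-n_up:] + downstream_exon[:n_down]

# Toy exons (hypothetical sequences).
print(junction_window("AAAAAAAAAA", "CCCCCCCCCC", length=8))            # centered
print(junction_window("AAAAAAAAAA", "CCCCCCCCCC", length=8, offset=2))  # shifted 3'
```

Scanning `offset` over a small range around zero reproduces the kind of placement series used to ask how far a probe can be shifted before junction specificity is lost.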
Optimal probe lengths, probe positions, and formamide concentrations for purposes of exon monitoring, exon edge detection, and exon-exon junction monitoring
50 to 60 nucleotides
35 to 40 nucleotides
≤ 5-nucleotide steps
Centered ± 5 nucleotides
30 to 40%
Alternative splicing of the ANXA7 transcript
The methods described above were then applied to the alternatively spliced isoforms of ANXA7. Genomic mapping of the ANXA7 transcript shows 14 exons, and previous studies have shown that a 66-nucleotide exon (exon 6) is absent in smooth muscle but present in skeletal muscle. Tiling arrays were designed with probes placed at one-nucleotide intervals through the 39-kb genomic region of ANXA7. Exon-6 probes had much higher intensities when hybridized with skeletal muscle RNA than with smooth muscle RNA, as expected (data not shown).
Extension to many tissues
It is straightforward to extend the methods described here to parallel measurements of larger numbers of genes and tissues. As exon-exon junction probes can map transcript structure with a small number of probes, the alternative splicing events of thousands of genes can be monitored simultaneously on the same array. By further modifying the amplification protocol, we were able to use as little as 50 ng of mRNA or 5 μg of total RNA as input, and we have automated the protocol to amplify and label 96 tissues simultaneously.
In summary, we have developed a protocol for effective full-length RNA labeling and have optimized experimental and computational microarray-based methods for determining transcript structures at high resolution. With a combination of finely spaced 'exon' and 'intron' probes, 'edge' probes, and 'junction' probes, these methods can: discriminate exons from introns; detect splice sites at less than 10 nucleotides resolution using array data alone; assemble and order exons in a transcript using junction probe intensities; identify changes in exon-exon junctions as small as one to three nucleotides; and detect alternatively spliced isoforms which are tissue-specific or present in mixtures. The experimental and computational methods described here are currently being used to carry out high-throughput detection of alternative splicing events on a genomic scale.
Materials and methods
Genomic mapping and array design
RB1 and ANXA7 mRNA sequences were mapped to genome sequence based on the assembly available through the National Center for Biotechnology Information (NCBI) using sim4. Repeat sequences were masked using the Scylla repeat-masking program from Paracel (Pasadena, CA) that uses the HASTE algorithm to identify repeat sequences, including both simple and interspersed repeats. Overlapping probes of lengths 20, 30, 40, 50, and 60 nucleotides were designed at one-nucleotide steps: (1) throughout each exon, (2) across each intron-exon edge, starting approximately 100 nucleotides into the intron, (3) across each exon-exon junction as described in the preceding text, and (4) at 10-nucleotide steps through each intron. For 20-nucleotide probes in ANXA7, for example, this resulted in 1,810 exon probes, 3,156 edge probes, 4,387 junction probes (including modified junctions), and 11,944 intron probes. For 20-nucleotide probes in RB1, this resulted in 4,179 exon probes, 5,852 edge probes, 5,550 junction probes, and 7,529 intron probes. All probes of lengths less than 60 nucleotides were placed on stilts of 10 thymidines. Probe intensities were background corrected and log values were used. Oligonucleotide arrays were synthesized on 1 × 3-inch glass slides with ink-jet technology. Hybridizations used mRNA samples obtained from Clontech (Palo Alto, CA) and from the cell lines Jurkat (T lymphocyte, ATCC no. TIB-152) and K562 (chronic myelogenous leukemia, ATCC no. CCL-243).
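The tiling scheme can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and attaching the stilt by prepending it to the probe string is an assumption, since the text does not state which end of the oligonucleotide carries the thymidines.

```python
STILT = "T" * 10  # 10-thymidine stilt for probes shorter than 60 nt

def tile_probes(sequence, probe_length, step=1):
    """Return (start, probe) pairs tiled across `sequence` at `step`-nt intervals."""
    if probe_length < 1 or step < 1:
        raise ValueError("probe_length and step must be positive")
    last_start = len(sequence) - probe_length
    return [(start, sequence[start:start + probe_length])
            for start in range(0, last_start + 1, step)]

def add_stilt(probe, full_length=60, stilt=STILT):
    """Attach the stilt to probes shorter than `full_length`.

    Prepending is an assumption of this sketch; only the rule that short
    probes receive a stilt comes from the text.
    """
    return stilt + probe if len(probe) < full_length else probe

# Toy region (hypothetical sequence): dense tiling for an exon,
# sparse 10-nt tiling as used for introns.
toy_region = "ACGTACGTACGTACGTACGTACGTACGTAC"  # 30 nt
exon_probes = tile_probes(toy_region, probe_length=20, step=1)
intron_probes = tile_probes(toy_region, probe_length=20, step=10)
print(len(exon_probes), len(intron_probes), add_stilt(exon_probes[0][1]))
```

For a region of n nucleotides and probes of length p, one-nucleotide tiling yields n - p + 1 probes, which is why dense tiling of introns was replaced by 10-nucleotide steps.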
Preparation of labeled cDNA
Hybridization material was generated through a random-priming amplification procedure (RP-Amp) using primers with a random sequence at the 3' end and a fixed motif at the 5' end. The following amplification protocol generated enough labeled material for approximately 400 hybridizations starting from 1.5 μg of mRNA. shDNP256 (first-strand synthesis): 5'-TAGATGCTGTTGNNNNNNNNN-3', and shT7N9 (second-strand synthesis): 5'-ACTATAGGGAGANNNNNNNNN-3'. mRNA (1.5 μg) was reverse-transcribed with Superscript II and the DP256 random primer for 20 min at 42°C (10 mM DTT, 50 mM Tris-HCl pH 8.3, 75 mM KCl, 8 mM MgCl2, 0.5 mM dNTPs, 5 U/μl Superscript II). The RNA was degraded with the addition of 20 μl volume of 0.5 N sodium hydroxide and 0.25 M EDTA for 20 min at 65°C. The single-stranded cDNA was purified using a commercial kit (Qiagen Qiaquick). The resulting single-stranded cDNA product was placed in its entirety in a second-strand reaction. Second-strand synthesis reactions utilized shT7N9 random primer and the Klenow fragment of DNA polymerase utilizing standard reaction conditions (37°C for 60 min, 0.2 mM DTT, 2.1 mM Tris-HCl pH 7.9, 2.1 mM MgCl2, 10.7 mM NaCl, 1.07 mM dNTPs, 0.1 U/μl Klenow), followed by another Qiaquick purification. Multiple PCRs were run using 0.15 μg double-stranded (ds) DNA and standard reaction conditions. Amplification was achieved using 10 cycles of PCR with the DP256 and T7 primers (20 mM Tris-HCl pH 8.4, 50 mM KCl, 0.01 mM dNTPs, 1.5 mM MgCl2, 0.01 U/μl Taq Polymerase), where DP256: 5'-GTTCGAGACCTCTAGATGCTGTTG-3', and T7: 5'-AATTAATACGACTCACTATAGGGAGA-3', followed by Qiaquick purification. Further amplification was achieved using in vitro transcription reactions with 0.5 μg dsDNA and T7 RNA polymerase (7.5 mM DTT, 40 mM Tris-HCl pH 7.5, 14.25 mM MgCl2, 10 mM NaCl, 2 mM spermidine, 125 U/ml RNAguard, 2.5 mM dNTPs, 15 U/ml IPPase, 25 kU/ml T7 polymerase) for 16 h at 42°C. 
The cRNA was purified (RNeasy) and reverse transcribed using Superscript II, random 9-mers, and amino-allyl dUTP (42°C for 20 min, 10 mM DTT, 50 mM Tris-HCl pH 8.3, 75 mM KCl, 8 mM MgCl2, 0.5 mM dNTPs, 0.5 mM amino-allyl dUTP, 5 U/μl Superscript II). The final product was coupled to Cy3 or Cy5 dye in 1 M bicarbonate buffer for 1 h. Reactions were finished with the addition of 4 M hydroxylamine followed by purification. The percentage dye incorporation and total cDNA yield were determined spectrophotometrically. Formamide concentrations in the hybridization solution were adjusted while keeping the overall volume at 2 ml. Pairs of Cy3/Cy5-labeled cDNA samples were combined and hybridized as described previously. Arrays were hybridized for 48 h then washed and scanned on Agilent Microarray Scanners.
A higher-throughput version of the above protocol was designed for use with automation and a lower requirement for total RNA. This was accomplished by modifying a magnetic bead-based mRNA extraction (Ambion, Poly(A) Purist) for use with a Biomek FX and 96-well plates. Starting with as little as 5 μg total RNA (totRP-Amp) or 50 ng mRNA (mRP-Amp) will yield enough material for six dye-coupled hybridizations. The entire mRNA yield was used for first-strand synthesis. Subsequent steps were performed as above, except that reaction purifications were done using 96-well Qiaquick and RNeasy products. All material from the first-strand synthesis, second-strand synthesis, and PCR was concentrated by means of evaporation and used in its entirety without quantitation.
RT-PCR of ANXA7
We designed primers to exons flanking the cassette exon, used the Qiagen OneStep RT-PCR kit, and resolved the RT-PCR products on 2% agarose gels. Forward primer: 5'-TTCACAGTCTTATGGAGGTGGT-3'; reverse: 5'-CTTACGAAGAATTTCTGCATCTC-3'.
A cDNA representing exon 2 to exon 10 of ANXA7 (NM_004034) was generated with the T7 motif at the 5' end in a two-step reaction. Skeletal muscle mRNA was used to generate the long form (1,026 base-pairs (bp)) and smooth muscle mRNA to generate the short form (940 bp). RT-PCR was performed with the ANXA7-specific primers followed by additional rounds of PCR to incorporate the T7. In vitro transcription of this cDNA generated cRNA that was subsequently reverse transcribed using Superscript and random 9-mers and amino-allyl dUTP. The percentage dye incorporation and total cDNA yield were determined spectrophotometrically and the long and short forms were combined in the following ratios (5:0, 4:1, 3:2, 1:4, 0:5) with a final mass of 500 ng for each hybridization. ANXA121: 5'-GTCAGGAGTCATCTTTTCCCCCTTC-3'; ANXA1147: 5'-AGATTCATCGGTCCCTAGTCTCCCC-3'; T7ANXA121: 5'-ACTATAGGGAGAGTCAGGAGTCATC-3'
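The amount of each isoform in a mixture follows directly from the ratio and the 500 ng total. The sketch below assumes the stated ratios are mass ratios, which the text does not say explicitly; because the two forms differ in length by less than 10%, a molar interpretation would give slightly different masses. The function name is hypothetical.

```python
def isoform_masses(long_parts, short_parts, total_ng=500.0):
    """Split `total_ng` between long and short forms for a long:short ratio.

    Treating the ratio as a mass ratio is an assumption of this sketch.
    """
    parts = long_parts + short_parts
    if long_parts < 0 or short_parts < 0 or parts == 0:
        raise ValueError("ratio parts must be non-negative and not both zero")
    long_ng = total_ng * long_parts / parts
    return long_ng, total_ng - long_ng

# Ratios listed in the text (long:short).
RATIOS = [(5, 0), (4, 1), (3, 2), (1, 4), (0, 5)]
for long_parts, short_parts in RATIOS:
    long_ng, short_ng = isoform_masses(long_parts, short_parts)
    print(f"{long_parts}:{short_parts} -> long {long_ng:.0f} ng, short {short_ng:.0f} ng")
```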
Calculation of exon edges
To estimate splice-site positions from tiling data, we tested Green's function deconvolution, derivative estimates, and wavelets. For the Green's function deconvolution method, we averaged probes placed across all 27 RB1 intron-exon boundaries to generate 5' and 3' intensity profiles of intron-exon edges. For each intron-exon boundary, this averaged profile was deconvolved from probe intensities; ideally, this should result in a spike marking the intron-exon boundary, but this method was sensitive to noise. We also tested using the derivative of each smoothed intron-exon profile. Finally, we tested convolving each intron-exon intensity profile with wavelets, including Haar wavelets (step functions), b-spline wavelets, and Gaussian wavelets. For each of the wavelets, we varied the size of the wavelet and, for the Gaussian wavelets, the number of zero-crossings. Of the Green's function, derivative, and wavelet methods, we chose a 50-point Haar wavelet on the basis of its performance and simplicity. Thus, to estimate each intron-exon boundary, we convolved this 50-point Haar wavelet with the corresponding intron-exon probe intensity profile, starting 100 nucleotides into the intron and continuing through the exon, and identified the maximum value. The location of the maximum does not coincide with the splice site, but lies at a constant offset from it. This constant depends on probe length; it can be found by profiling through known intron-exon edges (for example in RB1) and subsequently applied to new data.
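The Haar-wavelet edge estimate can be sketched as a sliding step-function filter. This is a minimal illustration, not the analysis code used in the study: the function names are hypothetical, the filter is written as a sliding correlation oriented for a rising (intron-to-exon) edge, and the `offset` default of 0 is a placeholder for the probe-length-dependent constant that would have to be calibrated on known edges.

```python
def haar_response(profile, size=50):
    """Slide a Haar (step) wavelet of `size` points along `profile`.

    response[i] = sum of the right half-window minus sum of the left
    half-window for the window starting at index i. This orientation
    gives a positive peak at a rising (intron-to-exon) edge.
    """
    if size < 2 or size % 2:
        raise ValueError("size must be a positive even integer")
    if len(profile) < size:
        raise ValueError("profile shorter than the wavelet")
    half = size // 2
    response = []
    for i in range(len(profile) - size + 1):
        window = profile[i:i + size]
        response.append(sum(window[half:]) - sum(window[:half]))
    return response

def estimate_edge(profile, size=50, offset=0):
    """Index of the estimated edge: wavelet center at maximum response + offset.

    `offset` stands in for the calibration constant described in the text.
    """
    response = haar_response(profile, size)
    i_max = max(range(len(response)), key=response.__getitem__)
    return i_max + size // 2 + offset

# Toy log-intensity profile (hypothetical): 100 intron probes, then 60 exon probes.
toy_profile = [1.0] * 100 + [5.0] * 60
print(estimate_edge(toy_profile))
```

For an exon-to-intron edge the same filter can be applied to the reversed profile, or the sign of the response flipped. In real data the transition is a ramp rather than a step, because probes that overlap the boundary hybridize partially, which is why the maximum sits at a constant offset from the splice site.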
Note added in proof
While this article was under review, a related article was published by Wang et al., who describe an algorithm for determining the relative abundance of known alternative isoforms from microarray data. This article makes an important contribution to analysis methods for monitoring mRNA isoform levels using multiple oligonucleotide probes and is a useful complement to the work presented here.
Additional data files
Additional data, available with this article online, include a figure showing the optimization of formamide concentration and probe length for detecting insertions and deletions in exon-exon junctions (Additional data file 1), a figure showing the ANXA7 junction probe intensities for all pairwise combinations of exons (with 40-nucleotide probes, centrally positioned, in 35% formamide, without 'edge probe' normalization) (Additional data file 2), and a figure illustrating the detection of isoform mixtures in a single sample (using purified ANXA7 RT-PCR products from skeletal and smooth muscle over a series of isoform ratios) (Additional data file 3). Additional data file 4 provides additional details of the full-length mRNA amplification and labeling protocol.
G. Cavet, S. Carlson, and J. Burchard provided assistance and helpful discussions. J. Schelter helped develop the precursor to the protocols presented here and provided useful insight.
- Maniatis T, Tasic B: Alternative pre-mRNA splicing and proteome expansion in metazoans. Nature. 2002, 418: 236-243. 10.1038/418236a.
- Modrek B, Resch A, Grasso C, Lee C: Genome-wide detection of alternative splicing in expressed sequences of human genes. Nucleic Acids Res. 2001, 29: 2850-2859. 10.1093/nar/29.13.2850.
- Kan Z, Rouchka EC, Gish WR, States DJ: Gene structure prediction and alternative splicing analysis using genomically aligned ESTs. Genome Res. 2001, 11: 889-900. 10.1101/gr.155001.
- Black DL: Protein diversity from alternative splicing: a challenge for bioinformatics and post-genome biology. Cell. 2000, 103: 367-370. 10.1016/S0092-8674(00)00128-8.
- Cooper TA, Mattox W: The regulation of splice-site selection, and its role in human disease. Am J Hum Genet. 1997, 61: 259-266.
- Blencowe BJ: Exonic splicing enhancers: mechanism of action, diversity and role in human genetic diseases. Trends Biochem Sci. 2000, 25: 106-110. 10.1016/S0968-0004(00)01549-8.
- Mironov AA, Fickett JW, Gelfand MS: Frequent alternative splicing of human genes. Genome Res. 1999, 9: 1288-1293. 10.1101/gr.9.12.1288.
- Croft L, Schandorff S, Clark F, Burrage K, Arctander P, Mattick JS: ISIS, the intron information system, reveals the high frequency of alternative splicing in the human genome. Nat Genet. 2000, 24: 340-341. 10.1038/74153.
- Brett D, Hanke J, Lehmann G, Haase S, Delbruck S, Krueger S, Reich J, Bork P: EST comparison indicates 38% of human mRNAs contain possible alternative splice forms. FEBS Lett. 2000, 474: 83-86. 10.1016/S0014-5793(00)01581-7.
- Hide WA, Babenko VN, van Heusden PA, Seoighe C, Kelso JF: The contribution of exon-skipping events on chromosome 22 to protein coding diversity. Genome Res. 2001, 11: 1848-1853.
- Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, et al: Initial sequencing and analysis of the human genome. Nature. 2001, 409: 860-921. 10.1038/35057062.
- Burge CB: Chipping away at the transcriptome. Nat Genet. 2001, 27: 232-234. 10.1038/85772.
- Xu Q, Modrek B, Lee C: Genome-wide detection of tissue-specific alternative splicing in the human transcriptome. Nucleic Acids Res. 2002, 30: 3754-3766. 10.1093/nar/gkf492.
- Zavolan M, Van Nimwegen E, Gaasterland T: Splice variation in mouse full-length cDNAs identified by mapping to the mouse genome. Genome Res. 2002, 12: 1377-1385. 10.1101/gr.191702.
- Modrek B, Lee C: A genomic view of alternative splicing. Nat Genet. 2002, 30: 13-19. 10.1038/ng0102-13.
- Kochiwa H, Suzuki R, Washio T, Saito R, Bono H, Carninci P, Okazaki Y, Miki R, Hayashizaki Y, Tomita M: Inferring alternative splicing patterns in mouse from a full-length cDNA library and microarray data. Genome Res. 2002, 12: 1286-1293. 10.1101/gr.220302.
- Shoemaker DD, Schadt EE, Armour CD, He YD, Garrett-Engele P, McDonagh PD, Loerch PM, Leonardson A, Lum PY, Cavet G, et al: Experimental annotation of the human genome using microarray technology. Nature. 2001, 409: 922-927. 10.1038/35057141.
- Kapranov P, Cawley SE, Drenkow J, Bekiranov S, Strausberg RL, Fodor SP, Gingeras TR: Large-scale transcriptional activity in chromosomes 21 and 22. Science. 2002, 296: 916-919. 10.1126/science.1068597.
- Hu GK, Madore SJ, Moldover B, Jatkoe T, Balaban D, Thomas J, Wang Y: Predicting splice variant from DNA chip expression data. Genome Res. 2001, 11: 1237-1245. 10.1101/gr.165501.
- Yeakley JM, Fan JB, Doucet D, Luo L, Wickham E, Ye Z, Chee MS, Fu XD: Profiling alternative splicing on fiber-optic arrays. Nat Biotechnol. 2002, 20: 353-358. 10.1038/nbt0402-353.
- Clark TA, Sugnet CW, Ares M: Genomewide analysis of mRNA processing in yeast using splicing-specific microarrays. Science. 2002, 296: 907-910. 10.1126/science.1069415.
- Friend SH, Bernards R, Rogelj S, Weinberg RA, Rapaport JM, Albert DM, Dryja TP: A human DNA segment with properties of the gene that predisposes to retinoblastoma and osteosarcoma. Nature. 1986, 323: 643-646. 10.1038/323643a0.
- Magendzo K, Shirvan A, Cultraro C, Srivastava M, Pollard HB, Burns AL: Alternative splicing of human synexin mRNA in brain, cardiac, and skeletal muscle alters the unique N-terminal domain. J Biol Chem. 1991, 266: 3228-3232.
- Hughes TR, Mao M, Jones AR, Burchard J, Marton MJ, Shannon KW, Lefkowitz SM, Ziman M, Schelter JM, Meyer MR, et al: Expression profiling using microarrays fabricated by an ink-jet oligonucleotide synthesizer. Nat Biotechnol. 2001, 19: 342-347. 10.1038/86730.
- Thanaraj TA: A clean data set of EST-confirmed splice sites from Homo sapiens and standards for clean-up procedures. Nucleic Acids Res. 1999, 27: 2627-2637. 10.1093/nar/27.13.2627.
- Burset M, Seledtsov IA, Solovyev VV: Analysis of canonical and non-canonical splice sites in mammalian genomes. Nucleic Acids Res. 2000, 28: 4364-4375. 10.1093/nar/28.21.4364.
- Florea L, Hartzell G, Zhang Z, Rubin GM, Miller W: A computer program for aligning a cDNA sequence with a genomic DNA sequence. Genome Res. 1998, 8: 967-974.
- Boysen C, Smith CP, Pao S, Paul C, Borkowski JA: The Paracel filtering package (PFP): a novel approach to filtering and masking of DNA and protein sequences. ISMB Proc. 2001, 36-
- Roberts CJ, Nelson B, Marton MJ, Stoughton R, Meyer MR, Bennett HA, He YD, Dai H, Walker WL, Hughes TR, et al: Signaling and circuitry of multiple MAPK pathways revealed by a matrix of global gene expression profiles. Science. 2000, 287: 873-880. 10.1126/science.287.5454.873.
- Wang H, Hubbell E, Hu JS, Mei G, Cline M, Lu G, Clark T, Siani-Rose MA, Ares M, Kulp DC, Haussler D: Gene structure-based splice variant deconvolution using a microarray platform. Bioinformatics. 2003, 19 (suppl 1): I315-I322. 10.1093/bioinformatics/btg1044.