Cell culture and RNA isolation
HL60 (human promyelocytic leukemia) cells were cultured in RPMI supplemented with 10% fetal bovine serum and antibiotics. Cells were treated with 1 μmol/l tretinoin (all-trans retinoic acid; Sigma-Aldrich, St Louis, MO, USA) in dimethyl sulfoxide (DMSO; final concentration 0.1%) or DMSO alone for five days. Total RNA was isolated from bulk cultures with TRIzol Reagent (Invitrogen, Carlsbad, CA, USA), in accordance with the manufacturer's directions. For the classification exercise, microtiter plate cultures were treated with 200 nmol/l tretinoin or DMSO for two days to mimic the submaximal signatures likely to be encountered in a small molecule screen, and were prepared for mRNA capture by the addition of Lysis Buffer (RNAture, Irvine, CA, USA).
Microarrays
Total RNA was amplified and labeled using a modified Eberwine method, the resulting cRNA was hybridized to Affymetrix GeneChip HG-U133A oligonucleotide microarrays, and the arrays were scanned in accordance with the manufacturer's directions. Raw data were deposited in the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO [13]) and are accessible through GEO series accession number GSE5007. Intensity values were scaled such that the overall fluorescence intensity of each microarray was equivalent. Expression values below an arbitrary baseline (20) were set to 20.
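The scaling and thresholding steps can be sketched as follows. This is a minimal illustration only: the text does not state which summary of "overall fluorescence intensity" was equalized, so the use of the per-array mean, the function name `scale_and_floor`, and the toy intensities are all assumptions.

```python
import numpy as np

def scale_and_floor(intensities, baseline=20.0):
    """Scale each array (column) to a common overall intensity, then floor.

    Rows are probe sets, columns are microarrays. Using the per-array mean
    as the measure of overall intensity is an assumption of this sketch.
    """
    x = np.asarray(intensities, dtype=float)
    per_array = x.mean(axis=0)          # overall intensity of each array
    target = per_array.mean()           # common level shared by all arrays
    scaled = x * (target / per_array)   # one multiplicative factor per array
    return np.maximum(scaled, baseline) # values below baseline are set to it

# Toy data (hypothetical): three probe sets measured on two arrays.
raw = np.array([[10.0, 40.0],
                [100.0, 200.0],
                [490.0, 960.0]])
processed = scale_and_floor(raw)
```

In this toy case the second array is twice as bright as the first, so the two columns are multiplied by 1.5 and 0.75, respectively, before the floor of 20 is applied.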
Gene selection
The 9,466 probe sets reporting above baseline were first divided into upregulated and downregulated groups by differences in mean expression levels between tretinoin and vehicle treatments. Each of these groups was further divided into three sets of approximately equal size on the basis of the lower mean expression level. The selected basal expression categories were 20-60 (low), 60-125 (moderate), and >125 (high). Probe sets reporting small (1.5-2.5×), medium (3-4.5×), or large (>5×) changes in mean expression level within each basal expression category were extracted and ranked by signal to noise ratio. The top five probes mapping to unique RefSeq identifiers according to NetAffx [14] in each of the 18 categories were selected, populating nine sets of 10 genes (Additional data file 1).
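The binning and ranking logic can be sketched as below. Several details are assumptions rather than statements from the text: the signal to noise ratio is taken to be the difference of class means divided by the sum of class standard deviations (a common definition), values falling exactly on 60 or 125 are assigned to the lower category, and the function names and replicate values are hypothetical.

```python
import numpy as np

def basal_category(lower_mean):
    """Basal expression category from the lower of the two class means."""
    if lower_mean <= 60:
        return "low"        # 20-60 (boundary handling is an assumption)
    if lower_mean <= 125:
        return "moderate"   # 60-125
    return "high"           # >125

def fold_category(fold):
    """Fold-change category; values in the gaps between ranges return None."""
    if 1.5 <= fold <= 2.5:
        return "small"
    if 3.0 <= fold <= 4.5:
        return "medium"
    if fold > 5.0:
        return "large"
    return None

def signal_to_noise(treated, vehicle):
    """Assumed definition: (mean_t - mean_v) / (sd_t + sd_v)."""
    t = np.asarray(treated, dtype=float)
    v = np.asarray(vehicle, dtype=float)
    return (t.mean() - v.mean()) / (t.std(ddof=1) + v.std(ddof=1))

def categorize(treated, vehicle):
    """Return (direction, basal category, fold category) for one probe set."""
    t_mean, v_mean = np.mean(treated), np.mean(vehicle)
    direction = "up" if t_mean > v_mean else "down"
    lower, higher = sorted((t_mean, v_mean))
    return direction, basal_category(lower), fold_category(higher / lower)

# Hypothetical replicate values for a single probe set.
treated = [390.0, 410.0, 400.0]
vehicle = [45.0, 55.0, 50.0]
category = categorize(treated, vehicle)
s2n = signal_to_noise(treated, vehicle)
```

Within each of the 18 direction-by-basal-by-fold categories, probe sets would then be sorted by the absolute value of `s2n` and the top five with unique RefSeq identifiers retained.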
Probes and primers
Upstream probes were composed (5' to 3') of the complement of the T7 primer site (TAA TAC GAC TCA CTA TAG GG), a 24 nucleotide (nt) barcode, and a 20 nt gene-specific sequence. Downstream probes were 5'-phosphorylated, and contained a 20 nt gene-specific sequence and the T3 primer site (TCC CTT TAG TGA GGG TTA AT). Barcode sequences were developed by Tm Bioscience (Toronto, Ontario, Canada) [15] and are detailed in the Luminex FlexMAP Microspheres Product Information Sheet [8]. Gene-specific fragments of probes were designed against the Oligator Human Genome RefSet, keyed by RefSeq identifier, where available. A 40 nt region was manually selected from within these 70 nt sequences to yield two fragments of equal length with roughly similar base composition and juxtaposing nucleotides being C-G or G-C, where possible. Probe sequences are provided in Additional data file 2. Capture probes contained the complement of the barcode sequences and had a 5'-amino modification and a C12 linker. The T7 primer (5'-TAA TAC GAC TCA CTA TAG GG-3') was 5'-biotinylated. The T3 primer had the sequence 5'-ATT AAC CCT CAC TAA AGG GA-3'. Oligonucleotides (all with standard desalting) were from Integrated DNA Technologies (Coralville, IA, USA).
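The probe layout described above can be illustrated with a short sketch. The primer-site sequences are those printed in the text; the barcode and the 40 nt target region are made-up placeholders, and the function names are hypothetical. The sketch concatenates the parts in the order given and does not model strand orientation relative to the captured cDNA.

```python
# Sequences as printed in the text (spaces removed).
T7_SITE = "TAATACGACTCACTATAGGG"    # 5' end of the upstream probe
T3_SITE = "TCCCTTTAGTGAGGGTTAAT"    # 3' end of the downstream probe
T3_PRIMER = "ATTAACCCTCACTAAAGGGA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def build_probes(barcode, target40):
    """Split a 40 nt target region into two 20 nt halves and add the tails."""
    if len(barcode) != 24 or len(target40) != 40:
        raise ValueError("expected a 24 nt barcode and a 40 nt target region")
    upstream = T7_SITE + barcode + target40[:20]
    downstream = target40[20:] + T3_SITE   # 5'-phosphorylated in practice
    return upstream, downstream

def junction_is_cg(target40):
    """True when the two nucleotides flanking the ligation junction are CG or GC."""
    return target40[19:21] in ("CG", "GC")

# Placeholder inputs (NOT real barcode or gene sequences).
barcode = "ACGT" * 6
target = "ATGGCTAGCTAGGATCCTAC" + "GTTAGCCATGGATCCAAGTC"
upstream, downstream = build_probes(barcode, target)
t3_matches = reverse_complement(T3_SITE) == T3_PRIMER
```

With these lengths the upstream probe is 64 nt and the downstream probe is 40 nt, so a ligated product is 104 nt. The final line checks that the T3 primer, as printed, is the reverse complement of the T3 site carried by the downstream probe.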
Beads and bead coupling
Luminex xMAP Multi-Analyte COOH Microspheres [8] were coupled to capture probes in a semi-automated microtiter plate format. Approximately 2.5 × 10⁶ microspheres were dispensed to the wells of a V-bottomed microtiter plate, pelleted by centrifugation at 1800 g for 3 minutes, and the supernatant removed. Beads were resuspended in 25 μl binding buffer (0.1 M 2-[N-morpholino]ethanesulfonic acid; pH 4.5) by sonication and pipetting, and 100 pmol capture probe was added. A volume of 2.5 μl of a freshly prepared 10 mg/ml aqueous solution of 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide hydrochloride (Pierce, Milwaukee, WI, USA) was added, and the plate incubated at room temperature in the dark for 30 minutes. This addition and incubation step was repeated, and 180 μl 0.02% Tween-20 was added with mixing. Beads were pelleted by centrifugation, as before, and washed sequentially in 180 μl 0.1% sodium dodecyl sulfate and 180 μl tris-EDTA (TE) (pH 8.0) with intervening spins. Coupled microspheres were resuspended in 50 μl TE (pH 8.0) and stored in the dark at 4°C for up to one month. Bead mixes were freshly prepared and contained about 1.5 × 10⁵/ml of each microsphere in 1.5× TMAC buffer (4.5 mol/l tetramethylammonium chloride, 0.15% N-lauryl sarcosine, 75 mmol/l tris-HCl [pH 8.0], and 6 mmol/l EDTA [pH 8.0]). The mapping of bead number to capture probe sequence is provided in Additional data file 3.
Ligation-mediated amplification
Transcripts were captured in oligo-dT coated 384 well plates (GenePlateHT; RNAture) from total RNA (500 ng) in Lysis Buffer (RNAture) or whole cell lysates (20 μl). Plates were covered and centrifuged at 500 g for one minute, and incubated at room temperature for one hour. Unbound material was removed by inverting the plate onto an absorbent towel and spinning as before. A volume of 5 μl of an M-MLV reverse transcriptase reaction mix (Promega, Madison, WI, USA) containing 125 μmol/l of each dNTP (Invitrogen) was added. The plate was covered, spun as before, and incubated at 37°C for 90 minutes. Wells were emptied by centrifugation, as before. A total of 10 fmol of each probe was added in 5 μl of 1× Taq Ligase Buffer (New England Biolabs, Ipswich, MA, USA), the plate covered, spun as before, heated at 95°C for two minutes and maintained at 50°C for six hours. Unannealed probes were removed by centrifugation, as before. A volume of 5 μl of 1× Taq Ligase Buffer containing 2.5 U Taq DNA ligase (New England Biolabs) was added, the plate covered, spun as before, and incubated at 45°C for one hour followed by 65°C for 10 minutes. Wells were emptied by centrifugation, as before. A volume of 15 μl of a HotStarTaq DNA Polymerase mix (Qiagen, Hilden, Germany) containing 16 μmol/l of each dNTP (Invitrogen) and 100 nmol/l of T3 primer and biotinylated T7 primer was added. The plate was covered, spun as before, and polymerase chain reaction was performed in a Thermo Electron (Milford, MA, USA) MBS 384 Satellite Thermal Cycler (initial denaturation at 92°C for 9 minutes; 39 cycles of 92°C for 30 s, 60°C for 30 s, and 72°C for 30 s; final extension at 72°C for 5 minutes). Total time from the addition of lysis buffer to hybridization-ready product for 96 samples processed in parallel in a single microtiter plate is approximately 14 hours.
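As a rough check on the stated turnaround, the programmed incubation times can be summed as in the sketch below. It counts only the hold times given in the text; thermal cycler ramping, centrifugation, and liquid handling are not included, and the step labels and function name are illustrative.

```python
# Programmed incubation times from the protocol, in minutes.
incubations = {
    "mRNA capture (room temperature)": 60,
    "reverse transcription (37 C)": 90,
    "probe denaturation (95 C)": 2,
    "probe annealing (50 C)": 360,
    "ligation (45 C)": 60,
    "post-ligation heat step (65 C)": 10,
}

def pcr_minutes(initial_min=9, cycles=39, step_seconds=(30, 30, 30), final_min=5):
    """Hold time of the PCR program, ignoring temperature ramping."""
    return initial_min + cycles * sum(step_seconds) / 60 + final_min

pcr = pcr_minutes()                             # 9 + 39 * 1.5 + 5 minutes
total_min = sum(incubations.values()) + pcr
total_hours = total_min / 60
```

The hold times alone come to 654.5 minutes (just under 11 hours), which leaves roughly three hours of the approximately 14 hour total for handling, spins, and ramping.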
Hybridization and detection
A volume of 15 μl of LMA reaction product was mixed with 5 μl TE (pH 8.0) and 30 μl bead mix (about 4,500 of each microsphere) in the wells of a Thermowell P microtiter plate (Costar, Corning, NY, USA). The plate was covered and incubated at 95°C for two minutes and maintained at 45°C for 60 minutes. A volume of 20 μl of a reporter mix containing 10 ng/μl streptavidin R-phycoerythrin conjugate (Molecular Probes, Eugene, OR, USA) in 1× TMAC buffer (3 mol/l tetramethylammonium chloride, 0.1% N-lauryl sarcosine, 50 mmol/l tris-HCl [pH 8.0], 4 mmol/l EDTA [pH 8.0]) was added with mixing and incubation continued at 45°C for five minutes. Beads were analyzed with a Luminex 100 instrument [8]. Sample volume was set at 50 μl and flow rate was 60 μl/minute. A minimum of 100 events were recorded for each bead set and median fluorescence intensities (MFIs) computed. Total time from the start of hybridization to download of raw data from the instrument for 96 samples processed in parallel in a single microtiter plate is approximately three hours. Expression values for each transcript were corrected for background signal by subtracting the MFI of corresponding bead sets from blank (TE only) wells. Values below an arbitrary baseline (5) were set to 5, and all were normalized against an internal control feature (GAPDH_3).
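The background correction, thresholding, and normalization steps can be sketched as follows. The text does not say how blank wells were summarized or how values were normalized to GAPDH_3; this sketch assumes a single blank MFI per bead set and normalization by division. The feature names other than GAPDH_3, the function name, and the MFI values are hypothetical.

```python
import numpy as np

def process_mfi(sample_mfi, blank_mfi, features, control="GAPDH_3", baseline=5.0):
    """Background-subtract, floor, and normalize MFIs for one sample well.

    sample_mfi and blank_mfi hold one median fluorescence intensity per
    bead set, in the same order as `features`.
    """
    sample = np.asarray(sample_mfi, dtype=float)
    blank = np.asarray(blank_mfi, dtype=float)
    corrected = np.maximum(sample - blank, baseline)  # values below 5 set to 5
    control_value = corrected[features.index(control)]
    # Normalization by division is an assumption of this sketch.
    return dict(zip(features, corrected / control_value))

# Hypothetical values for three bead sets.
features = ["GENE_A", "GENE_B", "GAPDH_3"]
sample = [520.0, 32.0, 2030.0]
blank = [20.0, 30.0, 30.0]
normalized = process_mfi(sample, blank, features)
```

Here GENE_B falls to 2 after background subtraction and is raised to the baseline of 5 before being expressed relative to the control feature.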
k-Nearest-neighbor classifier
The microarray-derived expression signature from long duration, high-dose tretinoin or vehicle treatments was used to train a series of KNN classifiers in the spaces of the full 90-member gene set and each of the nine 10-member gene categories. These were applied to the corresponding data from the 88 LMF test samples whose internal reference feature (GAPDH_3) was within two standard deviations of the mean. To permit the cross-platform analysis, both the training and test data sets were normalized so that each gene had a mean of zero and a standard deviation of one. The KNN algorithm classifies a sample by assigning it the label most frequently represented among the k nearest samples. In this case k was set to 3. The votes of the nearest neighbors were weighted by one minus the cosine distance. This analysis was performed with the GenePattern software package [16].
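The classification scheme can be illustrated with the following sketch. It is not the GenePattern implementation: it is a plain NumPy rendering that assumes cosine distance is also used to find the neighbors, standardizes the training and test sets separately, and uses small made-up expression matrices in place of the real data. Function names are hypothetical.

```python
import numpy as np

def zscore_genes(x):
    """Standardize each gene (column) to mean 0 and standard deviation 1."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean(axis=0)) / x.std(axis=0)

def knn_predict(train_x, train_y, test_x, k=3):
    """k-nearest-neighbor vote weighted by (1 - cosine distance)."""
    train_x = np.asarray(train_x, dtype=float)
    test_x = np.asarray(test_x, dtype=float)
    train_unit = train_x / np.linalg.norm(train_x, axis=1, keepdims=True)
    test_unit = test_x / np.linalg.norm(test_x, axis=1, keepdims=True)
    similarity = test_unit @ train_unit.T   # equals 1 - cosine distance
    predictions = []
    for row in similarity:
        nearest = np.argsort(-row)[:k]      # smallest cosine distance first
        votes = {}
        for i in nearest:
            votes[train_y[i]] = votes.get(train_y[i], 0.0) + row[i]
        predictions.append(max(votes, key=votes.get))
    return predictions

# Made-up data: rows are samples, columns are genes.
train = [[10, 9, 1, 2], [9, 10, 2, 1], [10, 10, 1, 1],
         [1, 2, 10, 9], [2, 1, 9, 10], [1, 1, 10, 10]]
labels = ["tretinoin"] * 3 + ["DMSO"] * 3
# Test samples on a different scale with a weaker (submaximal) signature.
test = [[6, 5, 3, 3], [5, 6, 3, 4], [3, 3, 5, 6], [3, 4, 6, 5]]

predicted = knn_predict(zscore_genes(train), labels, zscore_genes(test), k=3)
```

Standardizing each gene within each data set is what allows samples measured on different platforms and at different signal strengths to be compared by the direction of their expression pattern.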