Figure 3 | Genome Biology

Figure 3

From: An investigation of biomarkers derived from legacy microarray data for their utility in the RNA-seq era


The strategy for cross-platform gene mapping and the consistency of cross-platform gene expression measurements. Microarray probes/probe sets are mapped to RNA-Seq genes in one of two ways: public gene ID mapping or genome location mapping (a). Gene ID mapping requires that one of the following public gene IDs be available: gene symbol, RefSeq transcript ID, Ensembl gene ID, or Entrez gene ID. Genome location mapping requires an RNA-Seq gene annotation file in either the Gene Transfer Format (GTF) or the General Feature Format (GFF). The process produces separate mapping lists for microarrays and for RNA-Seq, each consisting of groups A, B, C, and D. Microarray group A corresponds to RNA-Seq group A. Microarray group B is a subset of RNA-Seq group C, and vice versa. Group D on each platform contains the probes/probe sets (microarray) or genes (RNA-Seq) that cannot be mapped to the other platform. The intensities of Affymetrix microarray probe sets in mapping groups A, B, and C are compared with the corresponding RNA-Seq gene counts in panels (b), (c), and (d), respectively, for one of the eight RNA samples in the NCTR toxicogenomics data set. The microarray data are from Rat_230_2 arrays normalized with the MAS5 algorithm. The RNA-Seq reads are from the Illumina GA II platform with the single-end 36-base-pair RNA-Seq protocol, and the gene counts are from the P2 pipeline (Novoalign with RefSeq rat gene models). The mappings from microarray probe sets to RNA-Seq genes are based on the genome location mapping approach.
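The genome location mapping described in the caption could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names (`parse_gtf_genes`, `map_by_location`), the input layout for probe sets, and the overlap rule (same chromosome, same strand, at least one shared base with the gene span) are all assumptions. It identifies only the unmapped probe sets and genes (the D groups) and does not attempt to reproduce the A, B, and C groups, whose exact criteria are not given in the caption.

```python
def parse_gtf_genes(gtf_lines):
    """Collapse GTF exon records into one (chrom, start, end, strand) span per gene_id."""
    spans = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        chrom, start, end, strand, attrs = cols[0], int(cols[3]), int(cols[4]), cols[6], cols[8]
        # GTF attributes look like: gene_id "GeneX"; transcript_id "NM_X";
        gene_id = None
        for field in attrs.split(";"):
            parts = field.strip().split(None, 1)
            if len(parts) == 2 and parts[0] == "gene_id":
                gene_id = parts[1].strip('"')
        if gene_id is None:
            continue
        if gene_id in spans:
            _, old_start, old_end, _ = spans[gene_id]
            spans[gene_id] = (chrom, min(old_start, start), max(old_end, end), strand)
        else:
            spans[gene_id] = (chrom, start, end, strand)
    return spans


def map_by_location(probe_sets, gene_spans):
    """Map probe sets to genes by interval overlap (assumed rule, see lead-in).

    probe_sets: {probe_set_id: (chrom, start, end, strand)}, a hypothetical layout.
    Returns (mapping, unmapped_probe_sets, unmapped_genes).
    """
    mapping = {}
    for pid, (p_chrom, p_start, p_end, p_strand) in probe_sets.items():
        mapping[pid] = {
            gid
            for gid, (g_chrom, g_start, g_end, g_strand) in gene_spans.items()
            if g_chrom == p_chrom and g_strand == p_strand
            and p_start <= g_end and g_start <= p_end
        }
    mapped_genes = set().union(*mapping.values()) if mapping else set()
    unmapped_probe_sets = sorted(p for p, hits in mapping.items() if not hits)
    unmapped_genes = sorted(set(gene_spans) - mapped_genes)
    return mapping, unmapped_probe_sets, unmapped_genes
```

The linear scan over all genes keeps the sketch short; for whole-genome annotation files an interval index per chromosome would be the more practical choice.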
