Unsupervised cluster analysis of the combined gene expression data for 232 human breast tumor samples and 122 mouse mammary tumor samples. (a) A color-coded matrix below the dendrogram identifies each sample. The first two rows show clinical ER and HER2 status, respectively (red = positive, green = negative, gray = not tested). The third row includes all human samples, colored by intrinsic subtype as determined from Additional data file 6 (red = basal-like, blue = luminal, pink = HER2+/ER-, yellow = claudin-low, green = normal breast-like). The remaining rows correspond to the murine models indicated at the right. (b) A gene cluster containing basal epithelial genes. (c) A luminal epithelial gene cluster that includes XBP1 and GATA3. (d) A second luminal cluster containing keratins 8 and 18. (e) Proliferation gene cluster. (f) Interferon-regulated genes. (g) Fibroblast/mesenchymal-enriched gene cluster. (h) The Kras2 amplicon cluster. See Additional data file 5 for the complete cluster diagram.