Unsupervised analysis of DNA methylation in acute lymphoblastic leukemia (ALL) samples and non-leukemic reference samples. (A) Principal component analysis (PCA) of the DNA methylation data for 435,941 CpG sites across all samples included in the study. The data from 764 ALL patients and 137 reference samples are plotted along the first two principal components (PCs). The top left panel shows the data for the ALL samples, with each individual sample indicated by a ring; BCP ALL samples are shown in blue and T-ALL samples in red. In each panel, the samples with the indicated cytogenetic subtype of ALL are highlighted. The samples from the four cell types in the reference cell panel are plotted as triangles, with the cell types indicated by the color key to the right of the panels. (B) The fraction of the variance explained by each principal component. The first two PCs, shown in (A), together explain approximately 63% of the variance in methylation levels. (C) Hierarchical clustering of the ALL and reference samples based on the methylation levels of 435,941 CpG sites. The heatmap shows the 1,000 most variable CpG sites. The grouping of samples by cell type and cytogenetic profile is shown below the heatmap.
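The quantities described in panels (A) to (C) can be illustrated with a minimal sketch on synthetic data. It is not the study's pipeline: the toy matrix, the group structure, and the helper names `pca_explained_variance` and `top_variable_sites` are assumptions made for illustration, and PCA is computed here by a singular value decomposition of the column-centered matrix. The hierarchical clustering step of panel (C) is not reproduced.

```python
import numpy as np


def pca_explained_variance(beta):
    """PCA via SVD of a samples x sites matrix (hypothetical helper).

    Returns the sample scores on each PC and the fraction of the
    total variance explained by each PC, as plotted in panel (B).
    """
    centered = beta - beta.mean(axis=0)          # center each CpG site
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u * s                               # sample coordinates on the PCs
    variance = s ** 2                            # proportional to per-PC variance
    return scores, variance / variance.sum()


def top_variable_sites(beta, k):
    """Indices of the k sites with the largest variance across samples."""
    site_variance = beta.var(axis=0)
    return np.argsort(site_variance)[::-1][:k]


# Toy methylation levels (assumed): 2 groups of 20 samples, 500 sites.
# The groups differ only at the first 50 sites.
rng = np.random.default_rng(0)
n_per_group, n_sites, n_diff = 20, 500, 50
group = np.repeat([0, 1], n_per_group)
beta = rng.uniform(0.4, 0.6, size=(2 * n_per_group, n_sites))
beta[group == 1, :n_diff] += 0.3

scores, frac = pca_explained_variance(beta)
top = top_variable_sites(beta, n_diff)

# Fraction of variance carried by the first two PCs, cf. panel (B).
first_two = frac[:2].sum()
print(f"PC1 + PC2 explain {first_two:.1%} of the variance")
```

In this toy setting the first PC separates the two groups and the most variable sites are the ones that differ between them, which is the rationale for restricting a heatmap to the most variable CpG sites.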