Duplication of subunits in protein complexes. (a) Nearly 40% of the protein complexes have homologous subunits (gray bars). These levels are higher than expected by chance (white bars). Random expectation levels are averages over 10,000 randomized protein complex datasets in which the complex size distribution is kept constant. While the MIPS dataset is the result of manual curation (see Table 1), both TAP and HMS-PCI are the result of large-scale experiments, and some redundancy may exist because multiple baits can pick up the same complex. For the TAP dataset, the authors provide a smaller set of predicted complexes based on bioinformatics methods. We repeated the calculation on this set of predicted TAP complexes and found that 47% of the complexes have duplicated subunits, while 18 ± 2% would be expected at random. The significance level for this predicted set of complexes is the same as for the raw purification data. (b) Percentage of complexes of known three-dimensional structure that have duplicated subunits, as a function of complex size. Gray bars are for the complete dataset, whereas black bars are for a dataset that excludes purely homomeric complexes, as these dominate the dataset (see Table 1) and may distort the results. On average, between 9% and 30% of the complexes display duplicated subunits (including and excluding purely homomeric complexes, respectively). This is not an artifact introduced by the complex size distribution.
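
The random expectation in panel (a) can be illustrated with a minimal sketch. The caption states only that 10,000 randomized datasets were averaged with the complex size distribution held constant; everything else here is an assumption made for illustration. In particular, the null model (refilling each complex with proteins drawn without replacement from the pool of all proteins in the dataset), the use of a family label to stand in for homology, and the names `has_duplicated_subunits`, `fraction_with_duplicates`, `randomized_expectation`, and `family_of` are hypothetical and not taken from the original analysis.

```python
import random
from statistics import mean, stdev


def has_duplicated_subunits(members, family_of):
    """Return True if two or more subunits share a family label.

    `family_of` maps protein id -> family label (assumed proxy for homology).
    Proteins without a label are treated as having no homologs.
    """
    seen = set()
    for protein in members:
        family = family_of.get(protein)
        if family is None:
            continue
        if family in seen:
            return True
        seen.add(family)
    return False


def fraction_with_duplicates(complexes, family_of):
    """Fraction of complexes that contain at least one homologous pair."""
    hits = sum(has_duplicated_subunits(c, family_of) for c in complexes)
    return hits / len(complexes)


def randomized_expectation(complexes, family_of, n_rounds=10_000, seed=0):
    """Mean and standard deviation of the duplicated fraction under shuffling.

    Assumed null model: every complex keeps its original size but is
    refilled with distinct proteins sampled from the pool of all proteins
    observed in the dataset.
    """
    rng = random.Random(seed)
    pool = sorted({p for c in complexes for p in c})
    sizes = [len(c) for c in complexes]
    fractions = []
    for _ in range(n_rounds):
        shuffled = [rng.sample(pool, size) for size in sizes]
        fractions.append(fraction_with_duplicates(shuffled, family_of))
    return mean(fractions), stdev(fractions)
```

Comparing `fraction_with_duplicates` on the real complexes with the mean and spread returned by `randomized_expectation` gives an observed-versus-expected pair analogous to the gray and white bars. A different null model (for example, permuting family labels instead of memberships) would give different expectation values.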