Clusters in PINs. (a) A small section of a PIN in S. cerevisiae is represented as a graph where nodes correspond to proteins and edges to physical interactions between pairs of proteins. One definition of a module in this work is a highly connected subgraph, such as that shaded in the figure (left), in which the central (green) node has a maximum clustering coefficient (C = 1). A clustering coefficient can be calculated for each protein in the network and measures the number of interactions between neighbors of that protein, divided by the total number of possible interactions between those neighbors. In this example, the green node and its fully connected neighborhood correspond to the protein complex AP-2. Fully connected subgraphs can also represent interactions that are dissociated in time and/or in space. For example, the shaded cluster on the right represents members of the basic helix-loop-helix transcriptional regulator family, in which duplication of a homodimeric protein with inheritance of interactions resulted in Max existing as a homodimer, as well as distinct dimers of paralogous proteins (c-Myc and Mad1) [34,35]. (b) Cumulative frequency distribution of the clustering coefficients in the yeast PIN and in randomized networks with exactly the same degree distribution (scale-free random; see the Randomization by link shuffling section in Materials and methods for details). This shows that high clustering of real PINs, and thus their modularity, is a characteristic of their biology and not of the degree distribution. (c) Cartoon illustrating the consequences of duplication with conservation of interactions for the clustering coefficient of node (protein) i (C_i). In each case the network is shown before and after duplication of a protein that either interacts with itself or does not. The bottom part of the cartoon summarizes the effect on the clustering coefficient of the protein. (d) Cumulative frequency distribution of clustering coefficients in the simulated networks, with varying proportions of self-interactors at the start of the simulation. The fraction of proteins with higher clustering coefficients increases with the proportion of self-interactors.
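
The clustering coefficient defined in (a) and the duplication scenario in (c) can be illustrated with a minimal Python sketch. This is not the authors' simulation code: the function names `clustering_coefficient` and `duplicate`, the toy proteins `i`, `a`, `b` and the paralog `a2` are all hypothetical. The sketch also assumes that a self-interaction does not count as a neighbor, that a protein with fewer than two neighbors has C = 0, and that duplicating a self-interactor produces two homodimers plus an interaction between the paralogs.

```python
from itertools import combinations


def clustering_coefficient(adj, node):
    """Links among the neighbors of `node` divided by the possible links among them."""
    neighbors = adj[node] - {node}  # assumption: a self-interaction is not a neighbor
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumption: undefined cases are reported as 0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return links / (k * (k - 1) / 2)


def duplicate(adj, node, copy):
    """Return a new network in which `copy` inherits every interaction of `node`."""
    new = {n: set(nbrs) for n, nbrs in adj.items()}
    new[copy] = set()
    for nbr in adj[node]:
        if nbr == node:
            # Self-interactor: both paralogs self-interact and also bind each other.
            new[copy].update({copy, node})
            new[node].add(copy)
        else:
            new[copy].add(nbr)
            new[nbr].add(copy)
    return new


# Toy network: protein i binds a and b; a and b do not bind each other.
plain = {"i": {"a", "b"}, "a": {"i"}, "b": {"i"}}
# Same network, but a also interacts with itself (a homodimer).
homodimer = {"i": {"a", "b"}, "a": {"i", "a"}, "b": {"i"}}

c_before = clustering_coefficient(plain, "i")
c_after_plain = clustering_coefficient(duplicate(plain, "a", "a2"), "i")
c_after_self = clustering_coefficient(duplicate(homodimer, "a", "a2"), "i")

print("C_i before duplication:", c_before)
print("C_i after duplicating a non-self-interactor:", c_after_plain)
print("C_i after duplicating a self-interactor:", c_after_self)
```

In this toy case C_i stays at 0 when the duplicated neighbor does not interact with itself, and rises to 1/3 when it does, because the two paralogs are now linked to each other as well as to i. The adjacency is kept as a plain dictionary of sets so that self-interactions can be stored explicitly without a graph library.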