Figure 1 | Genome Biology

Figure 1

From: How biologically relevant are interaction-based modules in protein networks?

Overlap algorithm and multi-response randomization test method. (a) Overlap algorithm. C-based and L-based matrices are obtained from the interaction matrix. These matrices are the input to a standard hierarchical agglomerative average-linkage clustering algorithm [20], which extracts modules according to a given number of branches in the clustering tree (see text). Finally, in the C-based modular structure, we keep in each module only those components that also appear in the L-based module with which that C-based module has the greatest overlap. The organization thus obtained is the putative modular organization of the network under consideration. (b) Multi-response permutation procedure. We validate this modular organization using the phylogenetic conservation of module protein constituents across species. We calculate a matrix of mean pairwise similarities (or distances) among the phylogenetic profiles [18] of proteins belonging to the same module, W_i, or to each pair of modules, W_ij, and compute a representative statistic, ξ_observed. P-values are obtained by randomly permuting the data and recomputing the statistic; this step is repeated a large number of times (10,000 in our case). The resulting values form a randomized distribution, against which the observed value from the original data is compared to compute the P-value.
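
The permutation step in panel (b) can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: phylogenetic profiles are treated as binary presence/absence vectors, the pairwise measure is the Hamming distance, the statistic is a size-weighted mean of within-module mean distances, and the P-value is one-sided with a +1 correction. The function names `within_module_stat` and `permutation_p_value` are hypothetical.

```python
import numpy as np


def within_module_stat(dist, labels):
    """Size-weighted mean of the within-module mean pairwise distances.

    Assumed form of the statistic; the caption does not define it.
    """
    total, weight = 0.0, 0
    for module in np.unique(labels):
        idx = np.flatnonzero(labels == module)
        if idx.size < 2:
            continue  # a singleton module has no protein pairs
        sub = dist[np.ix_(idx, idx)]
        upper = np.triu_indices(idx.size, k=1)  # each pair counted once
        total += idx.size * sub[upper].mean()
        weight += idx.size
    if weight == 0:
        raise ValueError("no module contains at least two proteins")
    return total / weight


def permutation_p_value(profiles, labels, n_perm=10_000, seed=0):
    """Compare the observed statistic with a label-permutation distribution."""
    rng = np.random.default_rng(seed)
    profiles = np.asarray(profiles, dtype=float)
    labels = np.asarray(labels)

    # Hamming distance between every pair of profiles (assumed measure).
    dist = np.abs(profiles[:, None, :] - profiles[None, :, :]).mean(axis=2)

    observed = within_module_stat(dist, labels)

    # Shuffling the labels keeps module sizes fixed but breaks any link
    # between module membership and phylogenetic profile.
    null = np.array([
        within_module_stat(dist, rng.permutation(labels))
        for _ in range(n_perm)
    ])

    # Small within-module distances indicate coherent modules, so count
    # permutations at least as extreme (as small) as the observed value.
    p_value = (1 + np.count_nonzero(null <= observed)) / (n_perm + 1)
    return observed, p_value
```

Calling `permutation_p_value(profiles, labels)` with one profile row per protein and one module label per protein returns the observed statistic and its permutation P-value. The default of 10,000 permutations matches the number quoted in the caption.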
