Coverage and error models of protein-protein interaction data by directed graph analysis

Table 5 Estimates for the FP errors of each filtered dataset

Dataset (ref.)	N	m	P_FP	E[Y₁]	E[Y₂]	U_obs	R_obs
ItoFull [1]	720	258,840	0.0008	414	0.17	435	68
ItoCore [1]	128	8,128	0.0025	41	0.05	43	36
Uetz et al. [6]	108	5,778	0.003	35	0.05	36	10
Gavin et al. [10]	852	362,526	0.0017	1,230	1.10	1,201	743
Krogan et al. [11]	1,458	1,062,153	0.0019	4,029	3.80	3,945	538

Shown are the expected numbers of false positive (FP) errors in the filtered datasets of [1,6,10,11]. N is the number of proteins in each filtered dataset. The values for P_FP and m are estimated upper bounds obtained by setting P_FN = 0 and using the solution curves of Figure 6c,d. Y₁ denotes the random variable for the number of unreciprocated FP observations, and Y₂ the random variable for the number of reciprocated FP observations. U_obs and R_obs are the observed numbers of unreciprocated and reciprocated interactions in the data, respectively. In every dataset E[Y₂] is far smaller than R_obs, so the table implies that even in the worst-case scenario of maximal P_FP, reciprocated edges mostly report true interactions.
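
The caption does not state how E[Y₁] and E[Y₂] are computed, so the following is only a sketch of one model that is consistent with the tabulated values. It assumes that m equals the number of unordered protein pairs, N(N-1)/2 (which matches the m column), and that each of the two directed tests of a pair yields a false positive independently with probability P_FP. Under those assumptions E[Y₁] = 2·m·P_FP·(1 - P_FP) and E[Y₂] = m·P_FP². The function name and the dictionary are hypothetical, introduced here for illustration.

```python
from math import comb


def expected_fp_counts(n_proteins, p_fp):
    """Expected FP counts under an assumed independent-error model.

    Assumption (not stated in the caption): each direction of each
    unordered protein pair is a false positive independently with
    probability p_fp, and every pair is treated as non-interacting.
    """
    m = comb(n_proteins, 2)            # unordered pairs, N(N-1)/2
    e_y1 = 2 * m * p_fp * (1 - p_fp)   # exactly one direction is an FP
    e_y2 = m * p_fp ** 2               # both directions are FPs
    return m, e_y1, e_y2


# N and P_FP as listed in Table 5.
TABLE5 = {
    "ItoFull": (720, 0.0008),
    "ItoCore": (128, 0.0025),
    "Uetz et al.": (108, 0.003),
    "Gavin et al.": (852, 0.0017),
    "Krogan et al.": (1458, 0.0019),
}

for name, (n, p) in TABLE5.items():
    m, e1, e2 = expected_fp_counts(n, p)
    print(f"{name}: m={m}, E[Y1]={e1:.1f}, E[Y2]={e2:.2f}")
```

Because the tabulated P_FP values are rounded, the computed expectations should be close to, but not necessarily identical with, the E[Y₁] and E[Y₂] columns.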

ISSN: 1474-760X