Coverage and error models of protein-protein interaction data by directed graph analysis

Table 6 Estimates for the FN errors of each filtered dataset

Dataset (ref.)	N	n	P_FN	E[W₁]	E[W₂]	U_obs	R_obs
ItoFull [1]	720	1,200	0.76	438	693	435	259,132
ItoCore [1]	128	100	0.38	47	14	43	8,156
Uetz et al. [6]	108	78	0.65	35	33	36	5,822
Gavin et al. [10]	852	2,429	0.44	1,197	470	1,201	362,209
Krogan et al. [11]	1,458	11,744	0.80	3,758	7,516	3,945	1,062,344

Expected numbers of false-negative (FN) errors in the filtered datasets of [1,6,10,11]. N is the number of proteins in each filtered dataset. The values of P_FN and n are estimated upper bounds, obtained by setting P_FP = 0 and using the solution curves of Figure 6c,d. W₁ denotes the random variable for the number of unreciprocated FN observations, and W₂ the random variable for the number of reciprocated FN observations. U_obs and R_obs are the observed numbers of unreciprocated and reciprocated interactions in the data, respectively. The table implies that, in the worst-case scenario for P_FN, the doubly tested, reciprocated noninteracting protein pairs do not give a conclusive indication of the presence or absence of an interaction. Resolving this requires more data.
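The tabulated expectations can be reproduced with a short calculation. The sketch below rests on an assumption inferred from the numbers, not stated in the caption: n is treated as the number of true interactions, each is tested once in each direction, and each directed test is missed independently with probability P_FN. Under that assumption, E[W₁] = 2nP_FN(1 − P_FN) and E[W₂] = nP_FN². The names `expected_fn_counts` and `TABLE_6` are illustrative only.

```python
def expected_fn_counts(n, p_fn):
    """Expected FN counts under an ASSUMED independent-error model.

    n    : estimated number of true interactions (assumed meaning of n)
    p_fn : probability that a single directed test misses a true interaction
    """
    e_w1 = 2 * n * p_fn * (1 - p_fn)  # exactly one of the two directions missed
    e_w2 = n * p_fn ** 2              # both directions missed
    return e_w1, e_w2


# (n, P_FN, published E[W1], published E[W2]) copied from Table 6
TABLE_6 = {
    "ItoFull": (1200, 0.76, 438, 693),
    "ItoCore": (100, 0.38, 47, 14),
    "Uetz et al.": (78, 0.65, 35, 33),
    "Gavin et al.": (2429, 0.44, 1197, 470),
    "Krogan et al.": (11744, 0.80, 3758, 7516),
}

for name, (n, p_fn, w1_table, w2_table) in TABLE_6.items():
    e_w1, e_w2 = expected_fn_counts(n, p_fn)
    print(f"{name}: E[W1] = {e_w1:.0f} (table: {w1_table}), "
          f"E[W2] = {e_w2:.0f} (table: {w2_table})")
```

Comparing E[W₁] with U_obs row by row shows why the caption speaks of a worst case: at these upper-bound values of P_FN, the expected number of unreciprocated FN observations is close to the number of unreciprocated observations actually seen.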

ISSN: 1474-760X