Table 2 Summary of evaluation criteria for the single and multi-species methods for the B. subtilis-B. anthracis pairing

From: Multi-species integrative biclustering

         

| Method | Conservation score | Mean correlation: absolute value | Mean net P-value (-log10) | Mean number of genes | Mean number of conditions | Number of biclusters | Coverage element-wise | Mean overlap element-wise | GO: percent (bi)clusters enriched (P < 0.01) | GO: number of unique enriched terms | KEGG: percent (bi)clusters enriched (P < 0.01) | KEGG: number of unique enriched pathways |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| EO MSCM-SH | 1 | 0.52 (0.69) | 8.21 (6.45) | 16.78 (16.78) | 125.74 (25.86) | 148 (148) | 18.69% (15.73%) | 4.76% (5.20%) | 33.78% (37.16%) | 378 (338) | 4.05% (6.76%) | 10 (16) |
| FD MSCM-SH | 1 | 0.59 (0.85) | 9.10 (8.57) | 21.82 (21.82) | 116.97 (24.87) | 150 (150) | 21.71% (18.53%) | 5.33% (5.93%) | 51.33% (51.33%) | 575 (500) | 12.67% (12.67%) | 24 (28) |
| ISA-P | 1 | 0.60 (0.56) | 5.92 (5.63) | 16.90 (16.90) | 10.22 (6.85) | 41 (41) | 0.41% (0.95%) | 22.24% (34.64%) | 53.66% (75.61%) | 160 (164) | 19.51% (19.51%) | 12 (15) |
| MSKM-SH | 1 | 0.58 (0.52) | 11.49 (11.62) | 14.99 (14.99) | 314 (51) | 148 (148) | 56.49% (37.83%) | 0% (0%) | 50.68% (39.19%) | 617 (559) | 14.19% (14.86%) | 22 (25) |
| BMSKM-SH | 1 | 0.49 (0.72) | 9.89 (12.19) | 15.00 (15.00) | 314 (51) | 148 (148) | 56.52% (37.85%) | 0% (0%) | 50.00% (48.65%) | 658 (578) | 16.89% (15.54%) | 29 (34) |
| EO MSCM-EL | 0.907 | 0.54 (0.69) | 7.41 (6.35) | 22.74 (23.60) | 129.69 (27.07) | 148 (148) | 25.03% (21.68%) | 4.38% (5.06%) | 40.54% (60.81%) | 449 (485) | 11.49% (10.81%) | 18 (18) |
| FD MSCM-EL | 0.852 | 0.61 (0.84) | 7.64 (8.65) | 33.75 (34.63) | 119.87 (26.26) | 150 (150) | 31.29% (29.90%) | 4.00% (5.72%) | 56.00% (72.67%) | 649 (664) | 15.33% (21.33%) | 30 (37) |
| ISA-R | 0.093 | 0.55 (0.51) | 3.54 (8.87) | 106.05 (335.71) | 10.22 (6.93) | 41 (41) | 2.36% (6.90%) | 18.34% (46.28%) | 95.12% (100.00%) | 287 (235) | 24.39% (58.54%) | 10 (20) |
| MSKM-EL | 0.956 | 0.56 (0.58) | 10.27 (6.65) | 26.49 (39.44) | 314 (51) | 148 (148) | 99.80% (99.52%) | 0% (0%) | 63.51% (75.68%) | 732 (675) | 14.86% (12.16%) | 31 (30) |
| BMSKM-EL | 0.959 | 0.50 (0.71) | 8.58 (7.93) | 26.54 (39.63) | 314 (51) | 148 (148) | 100% (100%) | 0% (0%) | 52.70% (81.76%) | 743 (710) | 15.54% (11.49%) | 35 (25) |
| EO SSCM | 0.098 | 0.70 (0.91) | 8.58 (7.43) | 26.19 (34.11) | 193.40 (38.66) | 161 (210) | 39.48% (46.81%) | 9.44% (14.10%) | 42.24% (66.19%) | 499 (629) | 10.56% (17.62%) | 19 (29) |
| FD SSCM | 0.124 | 0.56 (0.82) | 10.14 (7.31) | 23.06 (40.65) | 200.76 (39.81) | 295 (315) | 54.55% (61.24%) | 7.53% (15.46%) | 50.51% (61.59%) | 746 (712) | 11.53% (9.52%) | 32 (31) |
| EO COAL | 0.107 | 0.58 (0.64) | 5.21 (5.06) | 86.65 (115.71) | 20.09 (13.13) | 300 (158) | 40.21% (66.40%) | 1.94% (2.12%) | 63.67% (76.58%) | 744 (659) | 17.67% (9.49%) | 32 (24) |
| FD COAL | 0.101 | 0.59 (0.62) | 5.27 (5.69) | 88.16 (131.12) | 20.24 (14.24) | 287 (136) | 39.39% (66.63%) | 2.06% (2.16%) | 64.81% (80.88%) | 776 (686) | 16.03% (14.71%) | 24 (24) |
| QUBIC | 0.054 | 0.36 (0.49) | 1.38 (5.90) | 71.59 (188.25) | 25.45 (12.63) | 150 (150) | 2.43% (12.95%) | 38.34% (26.49%) | 43.33% (88.67%) | 227 (331) | 3.33% (14.67%) | 5 (13) |

We compare several metrics of bicluster conservation, coverage, and functional enrichment. In all cases, metrics are averaged over all biclusters produced by that method for each species. In each column, the results for B. subtilis are listed first, with those for B. anthracis listed in parentheses. 'Conservation score' provides an estimate of the conservation identified between biclusters of the different organisms, as defined in the methods. 'Mean correlation' measures the coherence of the biclusters given the expression data. 'Mean net P-value' measures the enrichment of network edges within biclusters. 'Mean number of genes', 'Mean number of conditions' and 'Number of biclusters' summarize the size distributions of the (bi)clusters identified. 'Coverage element-wise' is the percentage of the total expression data that is found in one or more (bi)clusters. 'Mean overlap element-wise' estimates the redundancy of the (bi)clusters; overlap is calculated as the mean of the maximum percentage of overlap for each bicluster in the full set of biclusters for a given method. 'Percent (bi)clusters enriched (P < 0.01)' for GO and KEGG provides an estimate of the functional significance of the (bi)clusters identified. 'Number of unique enriched terms' for GO and 'Number of unique enriched pathways' for KEGG are the number of unique terms/pathways across all biclusters for that method; this number of enriched terms/pathways provides an estimate of the redundancy of the biological functions enriched in one or more biclusters across the full set of biclusters for any given method. Further explanations of these metrics can be found within the text and Additional file 1.
GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; ISA, Iterative Signature Algorithm; EO, expression only; MSCM, multi-species cMonkey; SH, shared biclusters; MSISA, multi-species ISA; P, purified biclusters; MSKM, multi-species k-means; BMSKM, balanced multi-species k-means; EL, elaborated biclusters; R, refined biclusters (applies only to the ISA algorithm (MSISA-R)); FD, full data; SSCM, single-species cMonkey; COAL, Coalesce biclustering method; QUBIC, QUalitative BIClustering algorithm.
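
The two element-wise metrics described in the table footnote can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that an "element" is a single (gene, condition) cell of the expression matrix, and that the overlap of one bicluster with another is the share of its own cells that also occur in the other; the footnote does not state the denominator, so that choice is an assumption. The function names and the toy data are hypothetical.

```python
from itertools import product


def cells(genes, conditions):
    """Return the set of (gene, condition) cells covered by one bicluster."""
    return set(product(genes, conditions))


def coverage_elementwise(biclusters, n_genes, n_conditions):
    """Percentage of matrix cells that fall in one or more biclusters."""
    covered = set()
    for genes, conditions in biclusters:
        covered |= cells(genes, conditions)
    return 100.0 * len(covered) / (n_genes * n_conditions)


def mean_max_overlap(biclusters):
    """Mean, over biclusters, of the maximum percent overlap with any other.

    Assumption: overlap of A with B is |A & B| / |A|, in cells.
    """
    cell_sets = [cells(g, c) for g, c in biclusters]
    maxima = []
    for i, a in enumerate(cell_sets):
        best = 0.0
        for j, b in enumerate(cell_sets):
            if i != j and a:
                best = max(best, 100.0 * len(a & b) / len(a))
        maxima.append(best)
    return sum(maxima) / len(maxima) if maxima else 0.0


# Hypothetical toy data: a 4-gene x 4-condition matrix with two biclusters
# that share exactly one cell, (g2, c2).
toy = [
    ({"g1", "g2"}, {"c1", "c2"}),
    ({"g2", "g3"}, {"c2", "c3"}),
]

print(coverage_elementwise(toy, n_genes=4, n_conditions=4))  # 7 of 16 cells
print(mean_max_overlap(toy))  # each bicluster shares 1 of its 4 cells
```

Under these assumptions, a method that partitions genes and uses all conditions for every cluster gives an overlap of 0%, which matches the 0% overlap reported for the k-means rows of the table.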