Table 6 Summary of prediction accuracy results

From: Erratum to: Multiclass classification of microarray data with repeated measurements: application to cancer

| Data | Parameters | EWUSC | USC | SC | Published results |
| --- | --- | --- | --- | --- | --- |
| NCI 60 data* | ρ0 | NA | 0.6 | 1.0 | NA |
| | Δ | NA | 0.6 | 0.9 | NA |
| | # relevant genes | NA | 2,116 (2315) | 3,998 | 200 |
| | Prediction accuracy | NA | 72% | 72% | ~40–60% [4] |
| Multiple tumor data (estimated optimal parameters) † | ρ0 | 0.8 | 0.8 | 1.0 | NA |
| | Δ | 4.8 (5.6) | 4 (5.6) | 8.8 | NA |
| | # relevant genes | 241 (680) | 356 (735) | 3902 | All genes |
| | Prediction accuracy | 93% | 82% (85%) | 63% (78%) | 78% [5] |
| Multiple tumor data (global optimal parameters) ‡ | ρ0 | 0.9 | 0.9 | 1.0 | NA |
| | Δ | 0 | 0 | 0.4 | NA |
| | # relevant genes | 1,622 (1626) | 1634 | 7129 | All genes |
| | Prediction accuracy | 74% (78%) | 74% | 59% (74%) | 78% [5] |
| Breast cancer data | ρ0 | 0.6 (0.7) | 0.6 | 1.0 | NA |
| | Δ | 0.80 | 0.55 (1.15) | 0.5 (1.1) | NA |
| | # relevant genes | 189 (271) | 1,114 (82) | 3,193 (187) | 70 |
| | Prediction accuracy | 84% (89%) | 84% (79%) | 84% | 89% [6] |

Results different from those previously reported are highlighted in bold, with the previous results in parentheses. Results that improved on those previously reported are in italic, while results worse than previously reported are underlined. The table summarizes the optimal parameters (ρ0 and Δ), the number of relevant genes chosen, and the prediction accuracy for the NCI 60 data, multiple tumor data and breast cancer data. Both EWUSC (error-weighted, uncorrelated shrunken centroid) and USC (uncorrelated shrunken centroid) were motivated by SC (shrunken centroid) [2]. Both EWUSC and USC take advantage of interdependence between genes by removing highly correlated relevant genes. EWUSC also makes use of error estimates or variability over repeated measurements. SC [2] is equivalent to USC at ρ0 = 1. The optimal parameters (Δ, ρ0) for EWUSC are estimated from the cross-validation results of EWUSC, while the optimal parameters (Δ, ρ0) for USC are independently estimated from the cross-validation results of USC.

* Since no repeated measurements or error estimates are available, EWUSC is not applicable to the NCI 60 data. In addition, no separate test set is available for the NCI 60 data, so typical results of random partitions of the original 61 samples into training and test sets are shown.

† The prediction accuracy and number of relevant genes are produced using optimal parameters (Δ, ρ0) estimated by visual observation of 'bends' in the random cross-validation curves.

‡ The prediction accuracy and number of relevant genes are produced using global optimal parameters, that is, the (Δ, ρ0) that produces the minimum average number of cross-validation errors over all Δ and all ρ0.
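The global optimal parameters described above amount to picking the grid point with the lowest average number of cross-validation errors. A minimal sketch of that selection rule follows. The function name `global_optimal_parameters`, the dictionary layout, and the error counts are all hypothetical and made up for illustration; none of them come from the paper or its software.

```python
from statistics import mean


def global_optimal_parameters(cv_errors):
    """Return the (delta, rho0) pair with the lowest mean error count.

    cv_errors maps a (delta, rho0) tuple to a list of cross-validation
    error counts, one per random cross-validation run (assumed layout).
    """
    # Average the error counts at each grid point.
    averaged = {params: mean(errors) for params, errors in cv_errors.items()}
    # Pick the grid point with the smallest average; on ties, min() keeps
    # the first such point in the dictionary's insertion order.
    return min(averaged, key=averaged.get)


# Made-up error counts for illustration only; not results from the paper.
example_cv_errors = {
    (0.5, 0.7): [12, 14, 13],
    (1.0, 0.8): [9, 11, 10],
    (1.5, 1.0): [15, 16, 14],
}

best_delta, best_rho0 = global_optimal_parameters(example_cv_errors)
print(f"delta = {best_delta}, rho0 = {best_rho0}")
```

This rule differs from the estimated optimal parameters, which the footnote says were chosen by visually inspecting 'bends' in the cross-validation curves; that step is a judgment call and is not reproduced here.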