
Table 2 Monorail performance metrics run on TACC, AWS and MARCC

From: recount3: summaries and queries for large-scale RNA-seq expression and splicing

| Metric | Human SRA TACC | Human SRA AWS | Mouse SRA TACC | Mouse SRA AWS | Human GTEx MARCC | Human TCGA MARCC | Totals |
|---|---|---|---|---|---|---|---|
| Sequencing Runs Processed | 286,000 | 27,618 | 304,131 | 109,889 | 19,214 | 11,348 | 758,200 |
| Compressed input size (TBs) | 441.78 | 44.2 | 236.28 | 111.873 | 81 | 75 | 990.133 |
| Compressed output size (TBs) | 64.81 | 6.5 | 39.7 | 16.7 | 11.6 | 7.0 | 146.31 |
| Node hours (NHs) | 10,133 | 798 | 8,179 | 5,967 | 2,421 | 1,467 | 28,965 |
| NHs per sequencing run | 0.035 | 0.029 | 0.027 | 0.054 | 0.126 | 0.129 | 0.038 |
| NHs per compressed input TB | 22.9 | 18.1 | 34.6 | 53.3 | 29.9 | 19.6 | 29.3 |
| Sequencing runs per NH | 28 | 35 | 37 | 18 | 8 | 8 | 26 |
| Compressed input TB per NH | 0.044 | 0.055 | 0.029 | 0.019 | 0.033 | 0.051 | 0.034 |

  1. Statistics for GTEx and TCGA were extrapolated from a subset of each project (9277 and 1567 samples, respectively). GTEx output was increased by keeping whole BAM files for a subset of the samples. These numbers tally the run accessions processed, which can exceed the numbers in Table 1 because some runs were processed multiple times and because some runs were later removed for QC or metadata reasons. Missing from this table are several thousand SRA human run accessions that were analyzed on MARCC but whose log files were discarded.
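
The four per-unit rows in the table follow from the sequencing-run, compressed-input and node-hour rows by simple division, and the Totals column is the sum across the six columns. The sketch below recomputes them in Python from the values printed in the table. The names `RAW`, `derived_metrics` and `totals` are hypothetical and chosen for illustration only; they are not part of Monorail or recount3.

```python
# Raw rows copied from Table 2, keyed by column.
# All names here (RAW, derived_metrics, totals) are illustrative assumptions.
RAW = {
    "Human SRA TACC":   {"runs": 286_000, "input_tb": 441.78,  "node_hours": 10_133},
    "Human SRA AWS":    {"runs": 27_618,  "input_tb": 44.2,    "node_hours": 798},
    "Mouse SRA TACC":   {"runs": 304_131, "input_tb": 236.28,  "node_hours": 8_179},
    "Mouse SRA AWS":    {"runs": 109_889, "input_tb": 111.873, "node_hours": 5_967},
    "Human GTEx MARCC": {"runs": 19_214,  "input_tb": 81.0,    "node_hours": 2_421},
    "Human TCGA MARCC": {"runs": 11_348,  "input_tb": 75.0,    "node_hours": 1_467},
}


def derived_metrics(runs, input_tb, node_hours):
    """Return the four per-unit rows of the table for one column."""
    return {
        "nh_per_run": node_hours / runs,
        "nh_per_input_tb": node_hours / input_tb,
        "runs_per_nh": runs / node_hours,
        "input_tb_per_nh": input_tb / node_hours,
    }


def totals(raw):
    """Sum each raw row across all columns (the Totals column)."""
    keys = ("runs", "input_tb", "node_hours")
    return {k: sum(col[k] for col in raw.values()) for k in keys}


if __name__ == "__main__":
    columns = dict(RAW, Totals=totals(RAW))
    for name, col in columns.items():
        m = derived_metrics(**col)
        print(
            f"{name:17s} "
            f"NH/run={m['nh_per_run']:.3f}  "
            f"NH/TB={m['nh_per_input_tb']:.1f}  "
            f"runs/NH={m['runs_per_nh']:.0f}  "
            f"TB/NH={m['input_tb_per_nh']:.3f}"
        )
```

Note that the Totals entries for the per-unit rows are ratios of the summed raw rows, not averages of the six per-column ratios.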