Table 1 Size and computational cost of Monorail runs. Processing wall-clock times are estimated from run logs and are approximate. Wall times are roughly “node hours”, where a typical node used here has 48 cores and 192 GB of RAM. Node types vary somewhat across the clusters used. Input and output sizes are calculated from compressed files.

From: recount3: summaries and queries for large-scale RNA-seq expression and splicing

| Collection | Input size (TB) | Output size (TB) | # Sequence runs | # Studies | # Junctions (M) | # bigWigs (M) | Processing wall time (h) |
|---|---|---|---|---|---|---|---|
| SRA Human v3 | 474 | 72 | 316,443 | 8,677 | 228 | 1.2 | 1,728 |
| SRA Mouse v1 | 362 | 62 | 416,859 | 10,088 | 148 | 1.7 | 1,608 |
| TCGA | 75 | 7 | 11,348 | 1 | 31.5 | 0.045 | 170 |
| GTEx v6 | 35 | 6.7¹ | 9,911 | 1 | 22 | 0.040 | 168 |
| GTEx v7 & v8 | 46 | 4.9 | 9,303 | 1 | 10.6² | 0.037 | 123 |
| Total | 992 | 152.6 | 762,995 | 18,768 | 396³ | 3.022 | 3,797 |

  1. Output size for GTEx v6 includes BAM files in addition to typical summaries
  2. Number of additional junctions beyond those in GTEx v6
  3. Total after counting only the distinct junctions
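
The caption describes wall times as roughly "node hours" on a typical 48-core node, so a rough core-hour figure and a per-collection throughput can be derived from the table. The sketch below is illustrative only: the values are transcribed from Table 1, while the names `TABLE1`, `CORES_PER_NODE`, and `summarize` are hypothetical and not part of Monorail or recount3. Because wall times are approximate and node types vary across clusters, the derived numbers should be read as order-of-magnitude estimates.

```python
# Rough derived metrics from Table 1. All names here are hypothetical helpers,
# not part of Monorail or recount3.

# "Typical node" size from the caption; real node types varied across clusters.
CORES_PER_NODE = 48

# collection -> (input TB, output TB, sequence runs, wall time in node hours)
TABLE1 = {
    "SRA Human v3": (474, 72, 316_443, 1_728),
    "SRA Mouse v1": (362, 62, 416_859, 1_608),
    "TCGA": (75, 7, 11_348, 170),
    "GTEx v6": (35, 6.7, 9_911, 168),  # output size includes BAM files
    "GTEx v7 & v8": (46, 4.9, 9_303, 123),
}


def summarize(input_tb, output_tb, runs, node_hours):
    """Return approximate derived metrics for one collection."""
    return {
        # Assumes every node hour was spent on a 48-core node.
        "core_hours": node_hours * CORES_PER_NODE,
        # Sequence runs processed per node hour.
        "runs_per_node_hour": runs / node_hours,
        # Compressed output size as a fraction of compressed input size.
        "output_to_input": output_tb / input_tb,
    }


if __name__ == "__main__":
    for name, row in TABLE1.items():
        s = summarize(*row)
        print(
            f"{name:13s} core-hours ~{s['core_hours']:>7,d}  "
            f"runs/node-hour ~{s['runs_per_node_hour']:6.1f}  "
            f"output/input ~{s['output_to_input']:.2f}"
        )
    total_node_hours = sum(row[3] for row in TABLE1.values())
    print(f"Total node hours: {total_node_hours:,d}")
```

The output-to-input ratio for GTEx v6 is not directly comparable with the other collections, since its output size includes BAM files.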