recount3: summaries and queries for large-scale RNA-seq expression and splicing

Table 1 Size and computational cost of Monorail runs. Processing wall-clock times are approximate estimates taken from run logs. They are roughly equivalent to “node hours”, where a typical node has 48 cores and 192 GB of RAM, although node types vary somewhat across the clusters used. Input and output sizes are calculated from compressed files.

Collection	Input Size (TB)	Output Size (TB)	# Sequence Runs	# Studies	# Junctions (M)	# bigWigs (M)	Processing Wall Time (h)
SRA Human v3	474	72	316,443	8,677	228	1.2	1,728
SRA Mouse v1	362	62	416,859	10,088	148	1.7	1,608
TCGA	75	7	11,348	1	31.5	0.045	170
GTEx v6	35	6.7 ^∗	9,911	1	22	0.040	168
GTEx v7 & v8	46	4.9	9,303	1	10.6 ^†	0.037	123
Total	992	152.6	762,995	18,768	396 ^‡	3.022	3,797
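
To give a feel for the scale of the wall times above, the following minimal sketch converts the table's approximate node-hours into rough core-hours. It assumes every node has the 48 cores the caption describes as typical; since node types vary across clusters, the result is only an order-of-magnitude estimate. The names `NODE_HOURS`, `CORES_PER_NODE`, and `approx_core_hours` are illustrative and are not part of Monorail or recount3.

```python
# Values copied from Table 1: processing wall time in approximate node-hours.
NODE_HOURS = {
    "SRA Human v3": 1728,
    "SRA Mouse v1": 1608,
    "TCGA": 170,
    "GTEx v6": 168,
    "GTEx v7 & v8": 123,
}

# Assumption taken from the caption: a typical node has 48 cores.
# Node types vary by cluster, so this is only a rough conversion factor.
CORES_PER_NODE = 48


def approx_core_hours(node_hours: int, cores_per_node: int = CORES_PER_NODE) -> int:
    """Convert node-hours to a rough core-hour figure."""
    return node_hours * cores_per_node


# Per-collection and overall estimates.
core_hours = {name: approx_core_hours(h) for name, h in NODE_HOURS.items()}
total_node_hours = sum(NODE_HOURS.values())
total_core_hours = approx_core_hours(total_node_hours)

for name, value in core_hours.items():
    print(f"{name}: ~{value:,} core-hours")
print(f"Total: {total_node_hours:,} node-hours, ~{total_core_hours:,} core-hours")
```

Under these assumptions, the 3,797 node-hours in the Total row correspond to roughly 182,000 core-hours.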

ISSN: 1474-760X