Skip to main content

Table 1 Estimates of a whole-genome sequence and an RNA-seq experiment using Illumina HiSeq 2000 machine

From: The real cost of sequencing: higher than you think!

   Whole genome sequencing   RNA-Seq
   2011 cost output 2011 time   2011 cost output 2011 time
Sample collection and experimental design from blood samples (easy to collect) to brain tissue (hard to collect) ~$100 onwards   from a few hours to several days same as for whole genome sequencing
Sequencing library preparation + running the sequencer (whole dual flow cell) ~$6500 = ~$500 + ~$6000 ~380 M reads/lane; 1 individual: ~1140 M total reads (~3 lanes for a 30 × coverage); ~250 Gb (intermediate files) ~11-12 day library preparation + running the sequencer (whole dual flow cell) ~$3300 = ~$300 + ~$3000 ~380 M reads/lane ~12-14 day
  Data storage, low-level processing        
  Alignment (transfer* and storing raw data + mapping) ~$40 = ~$33 + ~$7 300 Gb (BAM file) ~1/2 day *** (including transferring 250 Gb FASTQ ~7.5 hrs) Alignment (transfer* and storing raw data + mapping) ~$5 = ~$3 + ~$2 ~30 Gb (BAM); ~22 Gb (MRF) < 2 hrs ***
  (data transfer and storage for 10 days)*; ** ~$40   ~8.5 hrs (data transfer and storage for 10 days)*; ** < $4   < 1 hr
Data reduction and management High-level summaries***        
  SNP calling (compute + transfer out) < $5 = ~$4 + ~$0.60 < 1 Gb ~3 hrs Gene and exon
expression quantification
< $1 < 1 M < 1 hr (1 CPU)
  Indel calling (compute + transfer out) < $35 = ~$32 + ~$0.60 < 1 Gb ~1 day Isoform quantification ~$6 < 1 M ~4 h
  SV calling (compute + transfer out) < $35 = ~$32 + ~$0.60 < 1 Gb ~1 day     
Downstream analyses   > $100 K ~310 Gb months   > $100 K ~30 Gb months
Total of sequencing, data management and reduction ~$6500 ~310 Gb ~15 days   ~3500 ~30 Gb ~12-14 days
  1. aAssuming a 10 MB/s transfer rate. bThe cost of transferring 300 GB (BAM file) if the mapping is performed locally. cSixteen CPUs were used for all calculations, unless specified. BAM, Binary Sequence Alignment/Map; CPU, central processing unit; GB, gigabyte; MB, megabyte; SNP, single nucleotide polymorphism; ~, approximately. The table gives an estimation of the current costs of a whole-genome sequencing experiment and a functional genomic experiment (RNA-seq) using an Illumina HiSeq 2000 machine. The cost of sequencing is based on that reported by the Center for Cancer Research [53]. It is assumed that the same human sample is used for the genome sequence as well as for the RNA-seq experiment. In this scenario, a group of four technicians and bioinformaticians can run the entire pipeline. The costs related to data processing are estimated by considering all tasks performed in the Amazon Web Services 'cloud' environment. Pricing is based on a 'US standard' of the S3 Amazon Services (storage) [54] and a cost of $0.68 per hour (US, East Virginia) for the use of the Amazon EC2 (computation) [55] (July 2011). See Additional file 1 for a version of this table in a colored layout for easier reading.