Fig. 3 | Genome Biology

Fig. 3

From: ReSeq simulates realistic Illumina high-throughput sequencing data

Coverage model. a GC bias for Ec-Hi2000-TruSeq at the four steps of the bias fit for 30 fits with different fragment lengths. The red dots in the normalized panel are the median values and represent the final result. The horizontal lines in the GC spline panel are the chosen knots for one example fragment length. b Flanking bias: the effect of nucleotides in the genome relative to the fragment start or end, with position 0 being the start/end. Negative positions are outside of the fragment. Only positions −3 to 11 are shown from the total model, which includes positions −10 to 19. Each box summarizes 30 to 40 fits with different fragment lengths. The boxes are arranged around their true position for improved readability. The three datasets are all created with Nextera adapters. c Comparison of combining the different positions in the flanking bias by a product or a sum. Each dot is one bin of 2·10⁵ fragment sites for one of 30 fits with different fragment lengths. The fragment sites are ordered by their predicted mean counts μₙ before binning. The x-axis is the mean of observed counts in the bin. The y-axis is the mean of predicted mean counts. For the sum, the dots scatter around the identity line, while for the product a curve is visible. d, e The observed counts kₙ for the bins defined in c are fitted with a negative binomial with constant dispersion r for Ec-Hi2000-TruSeq (d) and Ec-Hi4000-Nextera (e). While Ec-Hi2000-TruSeq shows a significant slope and nearly no y-intercept, Ec-Hi4000-Nextera shows the opposite: nearly no slope and a significant y-intercept.
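
The binning described for panel c (order fragment sites by predicted mean count, group them into equal-sized bins, then compare the mean observed count with the mean predicted count per bin) can be sketched as follows. This is only a minimal illustration under stated assumptions, not ReSeq's implementation: the function name `bin_by_predicted_mean` and its parameters are hypothetical, and sites that do not fill a complete final bin are simply dropped here.

```python
import numpy as np


def bin_by_predicted_mean(predicted, observed, bin_size):
    """Group sites into bins after sorting by predicted mean count.

    Hypothetical helper for illustration only.
    predicted: predicted mean count per fragment site
    observed:  observed count per fragment site
    bin_size:  number of sites per bin (2e5 in the figure)
    Returns (mean observed per bin, mean predicted per bin).
    """
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    if predicted.shape != observed.shape:
        raise ValueError("predicted and observed must have the same shape")

    # Order sites by their predicted mean count before binning.
    order = np.argsort(predicted, kind="stable")

    # Keep only as many sites as fit into complete bins (assumption).
    n_bins = len(order) // bin_size
    used = order[: n_bins * bin_size]

    pred_bins = predicted[used].reshape(n_bins, bin_size)
    obs_bins = observed[used].reshape(n_bins, bin_size)

    # One (x, y) point per bin: x = mean observed, y = mean predicted.
    return obs_bins.mean(axis=1), pred_bins.mean(axis=1)
```

If the predicted means are well calibrated, plotting the two returned arrays against each other should give points close to the identity line, which is the behavior the caption reports for the sum.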