Fig. 4 | Genome Biology

Fig. 4

From: DNA polymerase stalling at structured DNA constrains the expansion of short tandem repeats

DNA synthesis at structured STRs is a highly dynamic and error-prone process. Deep sequencing of the stalled products of replication allows DNA synthesis to be followed over time and stalled DNA polymerase to be mapped at single-nucleotide resolution. a Plotting the relative read coverage at STRs, e.g. the 72-nt long ACGT and AAGGG repeats folding into hairpin-like and G4 structures respectively, reveals time-dependent sharp transitions corresponding to the main positions where the DNA polymerase stalls and dissociates during DNA synthesis. Computing the distance of the stalling sites, i.e. the mid-transition value, from the start of the repeat for each STR at different time points shows that the distance travelled by the polymerase after 30 min depends on b the stall scores of the STRs and c their structures; 72-nt long STRs were binned either according to low (σ < 0.33), medium (0.33 ≤ σ < 0.66) and high (σ ≥ 0.66) stall scores at 30 min in b or according to their predicted structures in c. Deep sequencing of the extended products of replication allows the identification of new sequence variants within the pool of newly synthesised DNA molecules. Mutations were classified into either base substitutions (BS) or expansion/contraction events (EC). d Distribution of the stall scores at 30 min associated with STRs presenting either expansion/contraction (orange) or base substitution (light blue) events. e Pie charts representing the proportion of STRs folding into hairpin-like (blue), i-motif (yellow) or G4 (red) structures, or remaining unfolded (grey), when considering all STRs (All) or STRs marked by expansion/contraction (EC) or base substitution (BS) events. These charts show a depletion of structured STRs within the pool of STRs presenting length variation and an enrichment of structured STRs within the pool presenting sequence variation.
Frequencies of f expansion/contraction and g base substitution events observed at STRs when binned according to their stall scores at 30 min (low σ < 0.33, medium 0.33 ≤ σ < 0.66, high σ ≥ 0.66). Frequencies are defined as the ratio of the number of reads supporting a mutation to the total number of reads covering this mutation. Centre lines denote medians, boxes span the interquartile range and whiskers extend beyond the box limits by 1.5 times the interquartile range. P values for the comparison of the distributions were calculated using Kolmogorov–Smirnov tests, *P ≤ 0.05, **P < 0.01, ***P < 0.001
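
The stall-score binning and the mutation frequency described in the legend can be illustrated with a minimal Python sketch. The thresholds (0.33 and 0.66) and the frequency definition come from the legend; the function names `bin_stall_score` and `mutation_frequency`, and the input validation, are assumptions made for illustration and are not taken from the authors' analysis code.

```python
def bin_stall_score(sigma: float) -> str:
    """Assign a stall score to the bins used in the legend.

    low: sigma < 0.33, medium: 0.33 <= sigma < 0.66, high: sigma >= 0.66.
    """
    if sigma < 0.33:
        return "low"
    if sigma < 0.66:
        return "medium"
    return "high"


def mutation_frequency(supporting_reads: int, covering_reads: int) -> float:
    """Ratio of reads supporting a mutation to reads covering its position.

    The validation below is an illustrative assumption, not part of the legend.
    """
    if covering_reads <= 0:
        raise ValueError("covering_reads must be positive")
    if not 0 <= supporting_reads <= covering_reads:
        raise ValueError("supporting_reads must lie between 0 and covering_reads")
    return supporting_reads / covering_reads


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise the bin boundaries.
    for sigma in (0.10, 0.33, 0.66):
        print(sigma, bin_stall_score(sigma))
    print(mutation_frequency(5, 200))
```

Note that the boundaries are half-open, so a stall score of exactly 0.33 falls in the medium bin and a score of exactly 0.66 falls in the high bin, matching the inequalities given in the legend.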
