Fig. 2 | Genome Biology

Fig. 2

From: Characterizing the interplay between gene nucleotide composition bias and splicing

Nucleotide composition bias and splicing-related features. a Nucleotide frequency (%) maps in different sets of exons and their flanking intronic sequences. b Heatmap representing the average frequency (%, as compared to control exons) of A, T, G, or C nucleotides in a window of 25 nucleotides downstream of GC exons (left panel) or AT exons (right panel). “*” corresponds to Wald’s test FDR < 0.05. c Heatmap representing the average frequency (%, as compared to control exons) of A, T, G, or C nucleotides in a window of 25 nucleotides upstream of GC exons (left panel) or AT exons (right panel). “*” corresponds to Wald’s test FDR < 0.05. d Minimum free energy (MFE) at the 5′ ss (left panel) and the 3′ ss (right panel) of GC exons or AT exons. MFE was computed using 25 nucleotides within exons and 25 nucleotides within introns. The red lines indicate the median values calculated for control exons. “$” and “*” correspond to Tukey’s test FDR < 10⁻¹⁶ when comparing GC exons to AT exons or when comparing GC exons or AT exons to control exons, respectively. e Proportion (%) of GC exons or AT exons with two or more predicted BPs in a window of 100 nucleotides in their upstream intron (left panel). Number of hydrogen bonds measured between the U2 snRNA and the BP sequence found in the 25 nucleotides upstream of GC exons and AT exons (right panel). The red lines indicate the median values calculated for control exons. “**” and “$$” correspond to χ² test P < 10⁻¹³ when comparing GC exons to AT exons or when comparing GC exons or AT exons to control exons, respectively. “$” and “*” correspond to Tukey’s test P < 0.02 when comparing GC exons to AT exons and when comparing GC exons to control exons, respectively. f Weblogos generated using sequences flanking the BPs with the best score in a 25-nucleotide-long window upstream of GC exons or AT exons, and the boxplot summarizing their GC content. “$” corresponds to Tukey’s test FDR < 10⁻¹⁶. g Boxplot representing the number of TNA sequences within the last 50 nucleotides of the upstream introns of GC exons and AT exons (left panel). Boxplot representing the number of T-rich low-complexity sequences in a window between positions −35 and −75 upstream of the 3′ ss of GC exons and AT exons (right panel). The red lines indicate the median values calculated for control exons. “$$” and “**” correspond to Tukey’s test FDR < 10⁻¹⁶ when comparing GC exons to AT exons and when comparing GC exons or AT exons to control exons, respectively. h Density of peaks obtained from publicly available U2AF2-CLIP datasets generated from HEK293T (left panel) or HeLa (right panel) cells and mapped upstream of GC exons and AT exons. The green arrows indicate peaks that mapped upstream of the Py tract.
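The window-based counts in panels b, c, and g can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy intron sequence, and the choice to count overlapping TNA matches are all assumptions made for illustration.

```python
import re
from collections import Counter


def upstream_window(intron: str, size: int) -> str:
    """Return the last `size` nucleotides of an intron (the part next to the 3' ss)."""
    return intron[max(0, len(intron) - size):]


def nucleotide_frequencies(seq: str) -> dict[str, float]:
    """Percentage of A, C, G and T in `seq`; any other symbol is ignored."""
    seq = seq.upper().replace("U", "T")
    counts = Counter(base for base in seq if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return {base: 0.0 for base in "ACGT"}
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def count_tna(seq: str) -> int:
    """Count TNA trinucleotides (N = any of A, C, G, T).

    Assumption: overlapping occurrences are counted, hence the lookahead.
    """
    seq = seq.upper().replace("U", "T")
    return len(re.findall(r"(?=T[ACGT]A)", seq))


if __name__ == "__main__":
    # Hypothetical upstream intron sequence, for illustration only.
    intron = "GTAAGTCTGACTTTTCTTTCTAACTTTCCTTTTTCTCTCCTTTTGCAG"
    print(nucleotide_frequencies(upstream_window(intron, 25)))
    print(count_tna(upstream_window(intron, 50)))
```

Comparing such per-exon values with those of control exons, as in the heatmaps and boxplots, would require the exon sets and statistical tests described in the article, which this sketch does not attempt to reproduce.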
