Fig. 1 | Genome Biology

Fig. 1

From: Multilayered control of exon acquisition permits the emergence of novel forms of regulatory control

Genomic features of introns with exonization events. a Workflow to identify novel exonization events (see the “Methods” section). Briefly, RNA-seq data from shRNA knockdown of RNA-binding proteins in HepG2 cells are analyzed with STAR in 2-pass mode, and the novel junctions are then incorporated into index files analyzed by Whippet [12]. Identified exons are filtered to remove exon-exon junctions, events occurring in any of the matched control samples, and events annotated in genome databases. Only events supported by > 5 reads mapping over exon-exon junctions and with a percent spliced in (PSI) greater than 5% are included. b Plot showing the results of a logistic linear regression analysis aimed at identifying features important in discriminating introns prone to exonization events from all other expressed introns. Features in bold contribute significantly to the model (p < 0.01, Student t test). TSS, transcription start site; ppt_len, polypyrimidine tract length; 5′ss, 5′-splice site; 3′ss, 3′-splice site; bp_scr, branchpoint score; SS_dist, splice site distance; BP_num, branchpoint number; AGEZ, AG dinucleotide exclusion zone length; TAD, topologically associating domain; ppt_scr, polypyrimidine tract score (n = 13,103). c Plot showing the results of a logistic linear regression analysis aimed at identifying the types of transposable elements that most effectively discriminate introns prone to exonization events from all other expressed introns. Features in bold contribute significantly to the model (p < 0.01, Student t test). Nodes are colored by the average estimated age at which the transposable elements arose (n = 13,103). d Enrichment map for GO, REACTOME, and KEGG functional categories of genes that contain Alu-exonization events, with representative GO terms shown for each sub-network (see Additional file 1: Figure S1 for an annotated version). Node size is proportional to the number of genes associated with the GO category, and edge width is proportional to the number of genes shared between GO categories.
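
The inclusion criteria in panel a (more than 5 junction-spanning reads, PSI above 5%, absent from matched controls, not already annotated) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `ExonEvent` record, its field names, and the `keep_event` helper are hypothetical, and PSI is assumed to be stored as a fraction between 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class ExonEvent:
    """Hypothetical record for one candidate exonization event."""
    event_id: str
    junction_reads: int   # reads mapping over the exon-exon junctions
    psi: float            # percent spliced in, assumed to be a fraction (0..1)
    in_control: bool      # seen in any matched control sample
    annotated: bool       # already present in a genome annotation database


def keep_event(ev: ExonEvent, min_reads: int = 5, min_psi: float = 0.05) -> bool:
    """Apply the thresholds stated in the caption (strictly greater than)."""
    return (
        ev.junction_reads > min_reads
        and ev.psi > min_psi
        and not ev.in_control
        and not ev.annotated
    )


def filter_events(events: list[ExonEvent]) -> list[ExonEvent]:
    """Return only the events that pass every criterion."""
    return [ev for ev in events if keep_event(ev)]


# Made-up example records, for illustration only.
candidates = [
    ExonEvent("ev1", junction_reads=12, psi=0.20, in_control=False, annotated=False),
    ExonEvent("ev2", junction_reads=5, psi=0.30, in_control=False, annotated=False),
    ExonEvent("ev3", junction_reads=40, psi=0.04, in_control=False, annotated=False),
    ExonEvent("ev4", junction_reads=40, psi=0.50, in_control=True, annotated=False),
]
kept_ids = [ev.event_id for ev in filter_events(candidates)]
```

Both thresholds are strict inequalities, so an event with exactly 5 supporting reads or a PSI of exactly 5% is excluded.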
