From: RNase-mediated protein footprint sequencing reveals protein-binding sites throughout the human transcriptome

Overview of the PIP-seq method. (A) In the PIP-seq method, cells are cross-linked with formaldehyde or 254-nm UV light, or not cross-linked. They are lysed and divided into footprint and RNase digestion control samples. The footprint sample is treated with an RNase (ss- or dsRNase), which results in a population of RNase-protected RNA–RBP complexes. The protein cross-links are then reversed (by heating for formaldehyde cross-links or by proteinase K treatment for UV cross-links), leaving only the footprints where the RNA was protein-bound. For the RNase digestion control sample, which is designed to control for RNase-insensitive regions, the order of operations is reversed: bound proteins are first removed by treatment with SDS and proteinase K, and then the unprotected RNA sample is subjected to RNase treatment. Strand-specific high-throughput sequencing libraries are prepared from both footprint and RNase digestion control samples and normalized using rehybridization and duplex-specific nuclease (DSN) treatment. PPSs are identified from the sequencing data using a Poisson model. Screenshots show UCSC browser views of sequencing reads from the footprint and RNase digestion control samples (same scale) and PPSs identified from the regions of the genes listed. (B,C) Absolute distribution of PPSs throughout RNA species for formaldehyde (B) and UV (C) cross-linked PIP-seq experiments. (D,E) Average PPS count per RNA molecule, classified by RNA type (mRNA and lncRNA) and transcript region (for example, 5′ UTR), for formaldehyde (D) and UV (E) cross-linked PIP-seq experiments. Percentages indicate the fraction of each RNA type or region that contains PPS information. (F) Average expression (y-axis) of human mRNAs separated by total number of PPSs identified in their sequence (x-axis) for PPSs identified using formaldehyde cross-linking.
CDS, coding sequence; DSN, duplex-specific nuclease; dsRNase, double-stranded RNase; lncRNA, long non-coding RNA; PIP-seq, protein interaction profile sequencing; PPS, protein-protected site; ssRNase, single-stranded RNase; UTR, untranslated region.
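
The caption states only that PPSs are identified "using a Poisson model" and gives no further detail. As a rough illustration of what such a test could look like, the sketch below compares per-window read counts in the footprint sample against the RNase digestion control and flags windows where the footprint count is unexpectedly high. Everything specific here is an assumption made for illustration and not the authors' procedure: the function names `poisson_sf` and `call_pps`, the fixed windows, the library-size scaling, the pseudocount, and the `alpha` cutoff (a real analysis would also need multiple-testing correction).

```python
import math


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    if lam <= 0:
        return 0.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def call_pps(footprint, control, alpha=0.05, pseudocount=1.0):
    """Return indices of windows enriched in footprint over control.

    footprint, control: read counts per window, same length and order.
    The control count (plus a hypothetical pseudocount), scaled by the
    ratio of library sizes, serves as the expected Poisson rate.
    """
    scale = sum(footprint) / sum(control)  # assumes control has reads
    hits = []
    for i, (f, c) in enumerate(zip(footprint, control)):
        lam = (c + pseudocount) * scale
        if poisson_sf(f, lam) < alpha:
            hits.append(i)
    return hits


# Toy counts (made up): only the second window is strongly protected.
footprint_counts = [10, 60, 12, 8, 10]
control_counts = [20, 20, 25, 15, 20]
candidate_windows = call_pps(footprint_counts, control_counts)
```

Using the control sample as the background rate mirrors its stated purpose in the caption: regions that survive RNase digestion without bound protein are RNase-insensitive and should not be called as protein-protected.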

