PAR-CLIP data indicate that Nrd1-Nab3-dependent transcription termination regulates expression of hundreds of protein coding genes in yeast
© Webb et al.; licensee BioMed Central Ltd. 2014
Received: 8 October 2013
Accepted: 7 January 2014
Published: 7 January 2014
Nrd1 and Nab3 are essential sequence-specific yeast RNA binding proteins that function as a heterodimer in the processing and degradation of diverse classes of RNAs. These proteins also regulate several mRNA coding genes; however, it remains unclear exactly what percentage of the mRNA component of the transcriptome these proteins control. To address this question, we used the pyCRAC software package developed in our laboratory to analyze CRAC and PAR-CLIP data for Nrd1-Nab3-RNA interactions.
We generated high-resolution maps of Nrd1-Nab3-RNA interactions, from which we have uncovered hundreds of new Nrd1-Nab3 mRNA targets, representing between 20 and 30% of protein-coding transcripts. Although Nrd1 and Nab3 showed a preference for binding near 5′ ends of relatively short transcripts, they bound transcripts throughout coding sequences and 3′ UTRs. Moreover, our data for Nrd1-Nab3 binding to 3′ UTRs were consistent with a role for these proteins in the termination of transcription. Our data also support a tight integration of Nrd1-Nab3 with the nutrient response pathway. Finally, we provide experimental evidence for some of our predictions, using northern blot and RT-PCR assays.
Collectively, our data support the notion that Nrd1 and Nab3 function is tightly integrated with the nutrient response and indicate a role for these proteins in the regulation of many mRNA coding genes. Further, we provide evidence to support the hypothesis that Nrd1-Nab3 represents a failsafe termination mechanism in instances of readthrough transcription.
RNA binding proteins play crucial roles in the synthesis, processing and degradation of RNA in a cell. To better understand the function of RNA binding proteins, it is important to identify their RNA substrates and the sites of interaction. This helps to better predict their function and leads to the design of more focused functional analyses. Only recently, the development of cross-linking and immunoprecipitation (CLIP) and related techniques has made it possible to identify direct protein-RNA interactions in vivo at a very high resolution [1–5]. To isolate direct protein-RNA interactions, cells are UV irradiated to forge covalent bonds between the protein of interest and bound RNAs. The target protein is subsequently affinity purified under stringent conditions, and UV cross-linked RNAs are partially digested, ligated to adapters, RT-PCR amplified and sequenced. CLIP methods are becoming increasingly popular and produce valuable data. The number of papers describing the technique seems to double every year and it is now being applied in a wide range of organisms. The method is also under constant development: the individual-nucleotide resolution CLIP (iCLIP) approach has improved the accuracy of mapping cross-linking sites [2, 4], and incorporating photoactivatable nucleotides in RNA can enhance the UV cross-linking efficiency. We have recently developed a stringent affinity-tag-based CLIP protocol (cross-linking and cDNA analysis (CRAC)) that can provide a higher specificity, and the tag-based approach is becoming more widely adopted [4, 6]. The combination of CLIP with high-throughput sequencing (for example, HITS-CLIP) has markedly increased the sensitivity of the methodology and provided an unparalleled capability to identify protein-RNA interactions transcriptome-wide [3, 5, 7]. This approach is producing large volumes of valuable high-throughput sequencing data.
Fortunately, many bioinformatics tools tailored to the large CRAC/CLIP datasets are now becoming available [8–11]. We have recently developed a Python package, dubbed pyCRAC, that conveniently combines many popular CLIP/CRAC analysis methods in an easy-to-use package.
Nrd1 and Nab3 are essential sequence-specific yeast RNA binding proteins that function as a heterodimer in processing and degradation of diverse classes of RNAs [12–19]. Transcription termination of RNA polymerase (Pol) II transcripts generally involves mRNA cleavage and addition of long polyA tails (cleavage and polyadenylation (CPF) pathway), which label the RNA ready for nuclear export (reviewed in ). By contrast, transcripts terminated by Nrd1-Nab3 generally contain short polyA tails and are substrates for the nuclear RNA degradation machinery [21, 22]. This activity is also important for small nucleolar RNA (snoRNA) maturation and degradation of cryptic unstable transcripts (CUTs) and stable unannotated transcripts (SUTs) [12, 23–26]. Nrd1 and Nab3 direct transcription termination of nascent transcripts by interacting with the highly conserved carboxy-terminal domain (CTD) of RNA polymerase II. Because this interaction requires phosphorylation at serine 5 in the CTD, Nrd1 and Nab3 are believed to primarily operate on promoter proximal regions where serine 5 phosphorylation levels are high [27, 28].
Recent high-throughput studies have indicated that Nrd1 and Nab3 frequently UV cross-link to mRNAs [6, 24, 29] and thousands of mRNA coding genes harbor Nrd1 and Nab3 binding sequences (see below). However, thus far a relatively small number of mRNAs have been reported to be targeted by Nrd1 and Nab3 [25, 30–33]. Indeed, it is not clear exactly what percentage of the mRNA transcriptome these proteins control. To address this question, we reanalyzed CRAC and PAR-CLIP data using the pyCRAC software package. We generated high-resolution maps of Nrd1-Nab3-RNA interactions, focusing on the presence of known RNA binding motifs in the sequencing data. We also confirmed some of our predictions experimentally. Our analyses revealed that Nrd1-Nab3 bound between 20 and 30% of protein-coding transcripts, several hundred of which had binding sites in untranslated regions (UTRs). Although Nrd1 and Nab3 showed a preference for binding near 5′ ends of relatively short transcripts, they bound transcripts throughout coding sequences and 3′ UTRs. Our data suggest that Nrd1-Nab3 can terminate transcription of a long (approximately 5 kb) transcript by binding 3′ UTRs and we speculate that the fate of many mRNAs is dictated by kinetic competition between Nrd1-Nab3 and the CPF termination pathways. Statistical analyses revealed that Nrd1 and Nab3 targets are significantly enriched for enzymes and permeases involved in nucleotide/amino acid synthesis and uptake, and for proteins involved in mitochondrial organization. Collectively, our data support the notion that Nrd1 and Nab3 function is tightly integrated with the nutrient response and indicate a role for these proteins in the regulation of many mRNA coding genes.
Results and discussion
Identification of Nrd1-Nab3 binding sites in PAR-CLIP data
Previous genetic and biochemical studies have identified a number of short Nrd1 and Nab3 RNA binding motifs (UCUU and CUUG in Nab3; UGUA and GUAG in Nrd1) [6, 15, 16, 18, 24, 29]. Not surprisingly, almost every single mRNA coding gene in the yeast genome contains at least one copy of these motifs and could therefore be Nrd1 and Nab3 targets (see below). To get an impression of how many mRNAs are actually targeted by Nrd1 and Nab3 in yeast, we analyzed data from Nrd1 and Nab3 CLIP/CRAC experiments using the pyCRAC software package.
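To illustrate why nearly every gene contains at least one of these four-nucleotide motifs, the following minimal Python sketch scans a sequence for the Nrd1 and Nab3 motifs quoted above. It is not the pyCRAC implementation; the function name find_motifs and the example sequence are hypothetical and chosen for illustration only.

```python
import re

# Motifs quoted in the text (RNA alphabet).
NRD1_MOTIFS = ("UGUA", "GUAG")
NAB3_MOTIFS = ("UCUU", "CUUG")


def find_motifs(sequence, motifs):
    """Return sorted (start, motif) pairs (0-based), including overlapping hits.

    Hypothetical helper for illustration; DNA input is converted to RNA.
    """
    rna = sequence.upper().replace("T", "U")
    hits = []
    for motif in motifs:
        # A zero-width lookahead reports occurrences that overlap each other.
        for match in re.finditer("(?=" + motif + ")", rna):
            hits.append((match.start(), motif))
    return sorted(hits)


# Made-up 11-nucleotide example sequence.
example = "ATGTAGTCTTG"
nrd1_hits = find_motifs(example, NRD1_MOTIFS)
nab3_hits = find_motifs(example, NAB3_MOTIFS)
```

Because a given four-nucleotide motif is expected by chance roughly once every 256 nucleotides, motif presence alone says little about binding, which is why the analyses here rely on cross-linking evidence in addition to sequence.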
At least a quarter of the mRNAs are Nrd1-Nab3 targets
Figure 2A provides an overview of the percentage of genes in the genome that contain Nrd1 (UGUA, GUAG) and Nab3 (UCUU, CUUG) motifs. The vast majority of motifs were found in protein coding genes and cryptic Pol II transcripts such as CUTs and SUTs. Although generally fewer motifs were present in short non-coding RNA genes (tRNAs, small nuclear RNAs (snRNAs) and snoRNAs; Figure 2A), a high percentage of these motifs contained T-C substitutions in the PAR-CLIP data (Figure 2C). Many Nrd1 and Nab3 motifs are located in snoRNA flanking regions, which were not included in our analyses. Therefore, the number provided here is an underestimation of the total snoRNA targets. Strikingly, the PAR-CLIP analyses showed that Nrd1 and Nab3 cross-linked to 20 to 30% of the approximately 6,300 mRNA transcripts analyzed (Figure 2B), although only a relatively small fraction of all motifs present in the genomic sequence contained T-C substitutions (less than 5%; Figure 2C). Around 50% of the cross-linked motifs mapped to untranslated regions, with a preference for 5′ UTRs (Figure 2D). Consistent with recently published data, our analyses identified the telomerase RNA (TLC1) as a Nrd1-Nab3 target [29, 36]. Other non-coding RNA targets included the RNase P RNA (RPR1), the signal recognition particle RNA (SCR1) and ICR1. Collectively, our analyses uncovered over a thousand mRNAs that could be regulated by Nrd1 and Nab3.
Nrd1 and Nab3 preferentially bind to 5′ ends of a subset of mRNA transcripts
Gene Ontology term analyses on this list of transcripts also revealed a significant enrichment of enzymes with oxidoreductase activity (almost 10%; P-value <0.02) and genes involved in cellular transport activities, such as the transport of nitrogen compounds (8.8%; P-value = 0.0069). These included genes involved in ergosterol biosynthesis (Figure 5C; ERG24, ERG3 and ERG4), nucleoporins (KAP114, KAP108/SXM1, KAP121/PSE1, KAP142/MSN5), several nucleoside and amino acid permeases (FUR4, MEP3, MMP1, DIP5, CAN1, FCY2, BAP3; Figure 5D) and various other transporters (TPO1, TPO3, TAT1, YCF1).
Regulation of many genes involved in nucleotide biosynthesis is dictated by nucleotide availability and involves selection of alternative TSSs (IMD2, URA2, URA8 and ADE12) [42–45]. When nucleotide levels are sufficient, transcription starts at upstream TSSs and the elongating polymerase reads through Nrd1-Nab3 binding sites. When Nrd1-Nab3 bind these transcripts they are targeted for degradation. Indeed, several of the transcripts that originate from alternative TSSs have been annotated as CUTs. For a number of genes we could also detect cross-linked motifs upstream of the TSSs. Interestingly, cryptic transcription (XUTs and/or CUTs) was detected just upstream of AIM44, CDC47/MCM7, DIP5, ERG24, EMI2, FCY2, FRE1, GPM2, IRA2, MIG2, MYO1, TIR2, TEX1, YOR352W and YGR269W [38, 39] (red colored gene names in Figure 4), hinting that these genes could also be regulated via alternative start site selection.
Collectively, these data are consistent with a role for Nrd1 and Nab3 in the nutrient response pathway and we speculate that Nrd1-Nab3-dependent premature termination is a more widely used mechanism for regulating mRNA levels than was previously anticipated.
Nrd1 and Nab3 bind 3′ UTRs of several hundred mRNAs
Nrd1 and Nab3 have been shown to regulate expression of mRNA transcripts by binding 3′ UTRs. It was proposed that in cases where the polymerase fails to terminate at conventional polyadenylation sites, Nrd1 and Nab3 binding to 3′ UTRs could act as a transcription termination ‘fail-safe’ mechanism. From our data we predict that this is likely a widely used mechanism to prevent Pol II from transcribing beyond normal transcription termination sites.
We identified a total of 373 transcripts (approximately 6% of all protein coding genes analyzed) where cross-linked Nrd1 and/or Nab3 motifs mapped to 3′ UTRs (Table S2 in Additional file 1). Two examples are shown in Figure 5B,E. We identified several cross-linked Nrd1 and Nab3 motifs downstream of the MSN1 and NAB2 coding sequences. We speculate that these are examples of ‘fail-safe’ termination, where Nrd1 and Nab3 prevent read-through transcription into neighboring genes located on the same (TRF4) or opposite strand (RPS2). This arrangement of termination sites is reminiscent of the region downstream of RPL9B (Figure 5F), where the CPF and Nrd1-Nab3 termination machineries act in competition. Cross-linked Nrd1 motifs also appeared enriched near the 3′ ends of protein coding genes (Figure 5A,B). The Nrd1 GUAG and GUAA motifs contain stop codons and we found that indeed a fraction of the cross-linked Nrd1 motifs recovered from the PAR-CLIP data overlapped with stop codons (Figure 5C).
Nrd1-Nab3 and mitochondrion organization
The Corden laboratory recently demonstrated a role for Nrd1 in mitochondrial DNA maintenance. An nrd1-102 temperature-sensitive mutant showed a higher mitochondrial DNA content and was synthetically lethal with an AIM37 deletion, a gene involved in mitochondrial inheritance [30, 47]. Remarkably, a statistically significant fraction of the cross-linked Nrd1 and Nab3 motifs located in 3′ UTRs mapped to genes involved in mitochondrial organization and maintenance (37 genes, P-value 0.011). These include those encoding the mitochondrial DNA binding protein (ILV5), the nuclear pore associated protein (AIM4; Figure 5G), a large number of proteins that localize to the mitochondrial inner membrane (COX16, COX17, FCJ1, TIM12, TIM14/PAM18, TIM54, YLH47, YTA12, CYC2, COA3, OXA1) and several mitochondrial ribosomal proteins (NAM9, MRP13, MRPL3, MRPL21, MRPL22 and MRPL38). Notably, cells lacking AIM4 show similar defects in mitochondrial biogenesis as an aim37Δ strain.
Collectively, the data suggest that Nrd1 and Nab3 play an important role in mitochondrial function and development.
Nab3 is required for fail-safe termination of the convergent HHT1 and IPP1 genes
The convergent HHT1 and IPP1 genes came to our attention because we identified a cross-linked Nab3 motif that mapped to a XUT located directly downstream of the HHT1 gene (Figure 7A). XUTs can silence expression of neighboring sense genes by modulating their chromatin state; therefore, this XUT could play a role in regulating IPP1 expression. In addition, substantial Nab3 cross-linking was also observed to anti-sense HHT1 transcripts (Figure 7A). We predicted that Nab3 was required to suppress multiple cryptic transcriptional activities in this region.
Quantification of the northern data shown in Figure 7D revealed a two- to four-fold reduction in HHT1 and IPP1 mRNA levels in the absence of Nrd1 and/or Nab3 (Figure 7E). These results indicate a role for Nrd1 and Nab3 in regulating mRNA levels of these genes.
We were unable to detect the XUT by northern blotting, presumably because it is rapidly degraded by RNA surveillance machineries (using oligo 3; Figure 7A; data not shown). However, quantitative RT-PCR (qRT-PCR) results showed a staggering approximately 25-fold increase in XUT levels in the absence of Nab3 (Figure 7F), clearly demonstrating a role for Nab3 in suppressing the expression of this XUT. The Pol II PAR-CLIP data revealed transcription downstream of the IPP1 polyadenylation signals (Figure 7A), indicating that a fraction of polymerases did not terminate at these sites. Depletion of Nab3 resulted in an approximately six-fold increase in transcription downstream of the annotated IPP1 polyadenylation sites (Figure 7G) and low levels of IPP1 read-through transcripts could be detected by northern blotting and end-point RT-PCR (Figure 7D,H). We conclude that here Nab3 functions as a ‘fail-safe’ terminator by preventing the polymerase from transcribing beyond the IPP1 polyadenylation sites into the HHT1 gene. Consistent with the low level of Nrd1 cross-linking in this region, Nrd1 depletion only modestly increased the XUT levels and no significant increase in read-through transcription of IPP1 could be detected (Figure 7A,D,G). These data indicate a role for Nab3 in fail-safe termination of IPP1 and suppressing XUT expression, which may interfere with transcription of genes on the opposite strand.
Nrd1-Nab3-dependent transcription termination of long mRNA transcripts
The level of serine 5 phosphorylated CTD gradually decreases during transcription of coding sequences, and it has been shown that Nrd1-dependent transcription termination becomes less efficient once approximately 900 nucleotides have been transcribed [27, 28]. Almost half of the transcripts bound by both Nrd1 and Nab3 in the 3′ UTR were longer than approximately 800 nucleotides (Figure 8A). However, compared to the length distribution of all the analyzed protein coding genes, both proteins did preferentially cross-link to transcripts smaller than 1 kb (Figure 8B). To determine whether Nrd1-Nab3 can terminate transcripts longer than 1 kb, we monitored transcription of the approximately 4.7 kb YTA7 gene in Nrd1-Nab3 depleted cells. The YTA7 transcript was selected because significant cross-linking of Nrd1 and Nab3 was detected mainly in the 3′ UTR. Notably, contrary to the IPP1 transcript, Nrd1-Nab3 cross-linked primarily upstream of polyadenylation sites, indicating that Nrd1-Nab3 termination could precede CPF-dependent termination (Figure 8C,D). The strength of Nrd1-Nab3-dependent transcription termination depends on at least three factors: (1) the number of clustered Nrd1-Nab3 motifs in a sequence, (2) the organization of the binding sites and (3) the presence of AU-rich sequences surrounding the binding sites [16, 35]. Three Nab3 motifs were located within 70 nucleotides of the cross-linked Nrd1 motif in the 3′ UTR of YTA7, which were surrounded by AU-rich polyadenylation sequences (Figure 8D). This indicates that this region has the required signals for Nrd1-Nab3-directed transcription termination. To address this, we performed qRT-PCR with oligonucleotides that amplify sequences downstream of the YTA7 3′ UTR. We also measured YTA7 mRNA levels by using oligonucleotides that amplify a fragment of the YTA7 exon (Figure 8E). 
The results show that depletion of Nrd1 and/or Nab3 led to an increase in transcription downstream of the YTA7 3′ UTR (Figure 8E), indicating read-through. However, we cannot exclude the possibility that these transcripts represent different isoforms of the same gene. As with IPP1, depletion of Nab3 had by far the strongest effect (Figure 8E). Strikingly, we could also detect a two- to four-fold increase in YTA7 mRNA levels in the absence of these proteins. This suggests that, by default, a significant fraction of YTA7 is degraded via the Nrd1-Nab3 termination pathway.
Genome-wide ChIP data had indicated that Nrd1 binding correlated with serine 7 phosphorylation of the Pol II CTD, whereas recruitment of factors required for the conventional CPF pathway correlated with serine 2 phosphorylation. Both serine 7 and serine 2 phosphorylation peaked in the 3′ UTR of YTA7 (Figure 8C), indicating that both the Nrd1-Nab3 and CPF termination pathways are active in this region. This organization of termination signals is frequently found in cryptic transcripts (CUTs), many of which are downregulated via the Nrd1-Nab3 pathway. It appears that a similar mechanism is used to regulate YTA7 mRNA levels and our bioinformatics analyses suggest that several hundred genes could be regulated in this way; we are currently investigating this in more detail. Transcriptome-wide, the Nrd1-Nab3 UV cross-linking profiles change when cells are starved of glucose. It is conceivable, therefore, that the expression levels of these genes are dictated by nutrient availability.
We have presented a comprehensive analysis of Nrd1 and Nab3 PAR-CLIP datasets using the pyCRAC tool suite. We have uncovered more than a thousand potential Nrd1-Nab3 mRNA targets and our data indicate that Nrd1-Nab3 play an important role in the nutrient response and mitochondrial function. We have also provided valuable biological insights into regulation of mRNA transcription by the Nrd1-Nab3 termination pathway. Our data support a role for Nab3 in ‘fail-safe’ termination and regulation of XUT expression. Moreover, we demonstrate that Nrd1-Nab3 can terminate transcription of long transcripts and downregulate mRNA levels by binding to 3′ UTRs. We speculate that at least several hundred genes are regulated in this way. We are confident that the analyses presented here will be a useful resource for groups working on transcription termination.
Materials and methods
The data described here were generated using pyCRAC version 1.1, which can be downloaded from . The Galaxy version is available on the Galaxy tool-shed at  and requires pyCRAC to be installed in the /usr/local/bin/ directory.
Sequence and feature files
All Gene Transfer Format (GTF) annotation and genomic sequence files were obtained from ENSEMBL. Genomic coordinates for annotated CUTs, SUTs, TSSs, polyadenylation sites and UTRs were obtained from the Saccharomyces Genome Database (SGD) [22, 38–41]. To visualize the data in the UCSC genome browser the pyGTF2bed and pyGTF2bedGraph tools were used to convert pyCRAC GTF output files to a UCSC compatible bed format.
Raw data processing and reference sequence alignment
Nrd1, Nab3 and Pol II (Rpb2) PAR-CLIP datasets were downloaded from the Gene Expression Omnibus (GEO) database (GSM791764, Nrd1; GSM791765, Rpb2; GSM791767, Nab3). The fastx_toolkit was used to remove low quality reads, read artifacts and adapter sequences from fastq files. Duplicate reads were removed using the pyCRAC pyFastqDuplicateRemover tool. Reads were mapped to the 2008 S. cerevisiae genome (version EF2.59) using novoalign version 2.07 and only cDNAs that mapped to a single genomic location were considered.
Counting overlap with genomic features
PyReadCounters was used to calculate overlap between aligned cDNAs and yeast genomic features. To simplify the analyses, we excluded intron-containing mRNAs. UTR coordinates were obtained from the Saccharomyces Genome Database (SGD) [40, 52]. The yeast genome version EF2.59 genomic feature file (2008; ENSEMBL) was used for all the analyses described here.
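The overlap calculation can be pictured as a strand-aware interval intersection. The sketch below is a simplified illustration and not the PyReadCounters implementation; the function name count_overlaps, the tuple layout and the toy coordinates are all hypothetical.

```python
def count_overlaps(reads, features):
    """Count reads overlapping each feature on the same chromosome and strand.

    reads: iterable of (chrom, start, end, strand), 1-based inclusive.
    features: dict mapping a feature name to (chrom, start, end, strand).
    Hypothetical helper; a quadratic scan is fine for a small illustration.
    """
    counts = {name: 0 for name in features}
    for r_chrom, r_start, r_end, r_strand in reads:
        for name, (f_chrom, f_start, f_end, f_strand) in features.items():
            same_locus = r_chrom == f_chrom and r_strand == f_strand
            # Two closed intervals overlap when each starts before the other ends.
            if same_locus and r_start <= f_end and f_start <= r_end:
                counts[name] += 1
    return counts


# Toy coordinates, invented for illustration only.
toy_features = {
    "geneA": ("chrI", 100, 500, "+"),
    "geneB": ("chrI", 450, 900, "-"),
}
toy_reads = [
    ("chrI", 90, 120, "+"),    # overlaps geneA
    ("chrI", 480, 510, "+"),   # overlaps geneA only (geneB is on the other strand)
    ("chrI", 480, 510, "-"),   # overlaps geneB
    ("chrII", 100, 130, "+"),  # different chromosome, no overlap
]
toy_counts = count_overlaps(toy_reads, toy_features)
```

For genome-scale data an indexed approach (sorted intervals or an interval tree) would be the usual choice; the nested loop is kept here only for readability.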
Calculation of motif false discovery rates
The pyCalculateFDRs script uses a modified version of an FDR algorithm implemented in Pyicos. For a detailed explanation of how the algorithm works, please see the pyCRAC documentation. Reads overlapping a gene or genomic feature were randomly distributed a hundred times over the gene sequence and FDRs were calculated by dividing the probability of finding a region in the PAR-CLIP data with the same coverage by the probability of finding the same coverage in the gene in the randomized data. We only selected regions with an FDR ≤0.01.
The motif analyses were performed using the pyMotif tool from the pyCRAC suite. To indicate overrepresentation of a k-mer sequence in the experimental data, pyMotif calculates Z-scores for each k-mer, defined as the number of standard deviations by which an actual k-mer count minus the k-mer count from random data exceeds zero. K-mers were extracted from contigs that mapped sense or anti-sense to yeast genomic features. Repetitive sequences in reads or clusters were only counted once to remove biases towards homopolymeric sequences. Bedtools was used to extract motifs that overlap with genomic features such as exons and UTRs and plots were generated using Gnuplot. The EMBOSS tool fuzznuc was used to extract genomic coordinates for all possible Nrd1 and Nab3 binding motifs and the output files were converted to the GTF format.
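The Z-score definition above can be sketched in a few lines of Python. This is an illustration under stated assumptions and not the pyMotif code: the function names are hypothetical, and the standard deviation is assumed to be taken over the actual-minus-random differences of all k-mers, a detail the text does not specify.

```python
from collections import Counter
from statistics import pstdev


def count_kmers(sequences, k=4):
    """Count k-mers, counting each k-mer at most once per sequence.

    Counting once per read or cluster mirrors the rule described in the text
    for avoiding biases towards homopolymeric sequences.
    """
    counts = Counter()
    for seq in sequences:
        unique_kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        counts.update(unique_kmers)
    return counts


def kmer_z_scores(actual, random_counts):
    """Z-score per k-mer: (actual - random) divided by the spread of differences.

    Assumption: the spread is the population standard deviation of the
    differences across all k-mers; pyMotif may define it differently.
    """
    kmers = sorted(set(actual) | set(random_counts))
    if not kmers:
        return {}
    diffs = {kmer: actual[kmer] - random_counts[kmer] for kmer in kmers}
    spread = pstdev(list(diffs.values()))
    if spread == 0:
        return {kmer: 0.0 for kmer in kmers}
    return {kmer: diff / spread for kmer, diff in diffs.items()}


# Toy counts, invented for illustration only.
actual_counts = Counter({"UCUU": 5, "AAAA": 1})
shuffled_counts = Counter({"UCUU": 1, "AAAA": 1})
scores = kmer_z_scores(actual_counts, shuffled_counts)
```

In this toy case UCUU is four counts above the randomized background while AAAA is unchanged, so UCUU receives the higher score.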
Generation of genome-wide coverage plots
PyBinCollector was used to generate the coverage plots. To normalize the gene lengths, the tool divided the gene sequences over an equal number of bins. For each read, cluster (and their mutations), it calculated the number of nucleotides that map to each bin (referred to as nucleotide densities). To plot the distribution of T-C mutations over the 4 nucleotide Nrd1-Nab3 RNA binding motifs, we added 50 nucleotides up- and downstream of genomic coordinates for each identified motif, and divided these into 104 bins, yielding one nucleotide per bin and the motif start at bin 51. We then calculated the number of T-C substitutions that map to each bin and divided the number by the total number of Ts in each bin, yielding T-C substitution percentages. To plot the distribution of cross-linked motifs around TSSs, we included 500 nucleotides up- and downstream of the start sites and divided these into 1,001 bins, yielding one nucleotide per bin. To generate the heat maps shown in Figures 3 and 4, we used the --outputall flag in pyBinCollector. The resulting data were K-means clustered using Cluster 3.0. Heat maps were generated using TreeView.
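The binning arithmetic described above can be made concrete with a short Python sketch. It is not the pyBinCollector implementation; the function names, the 0-based half-open coordinate convention and the toy numbers are assumptions made for illustration.

```python
def bin_densities(gene_start, gene_end, intervals, n_bins):
    """Nucleotide densities: nucleotides of each interval falling in each bin.

    The gene [gene_start, gene_end) is split into n_bins bins of equal width,
    which normalizes genes of different lengths. Coordinates are 0-based,
    half-open (an assumption of this sketch).
    """
    length = gene_end - gene_start
    densities = [0] * n_bins
    for start, end in intervals:
        for pos in range(max(start, gene_start), min(end, gene_end)):
            densities[(pos - gene_start) * n_bins // length] += 1
    return densities


def motif_window(motif_start, motif_length=4, flank=50):
    """Window of flank + motif + flank nucleotides, one nucleotide per bin."""
    return motif_start - flank, motif_start + motif_length + flank


def tc_percentages(t_counts, tc_counts):
    """Per-bin T-C substitution percentage: substitutions divided by total Ts."""
    return [100.0 * tc / t if t else 0.0 for t, tc in zip(t_counts, tc_counts)]


# Toy values, invented for illustration only.
densities = bin_densities(0, 100, [(0, 10), (95, 100)], n_bins=10)
window_start, window_end = motif_window(1000)
n_window_bins = window_end - window_start          # 50 + 4 + 50
motif_bin = 1000 - window_start + 1                # 1-based bin of the motif start
percentages = tc_percentages([10, 0, 4], [1, 0, 2])
```

With a 4-nucleotide motif and 50-nucleotide flanks the window holds 104 positions and the motif begins at bin 51, matching the numbers given in the text; the TSS plots follow the same logic with 500 + 1 + 500 = 1,001 bins.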
Western and northern blot analyses
Western blot analyses and genetic depletion of Nrd1-Nab3 using GAL::3HA strains were performed as previously described. Briefly, cells were grown in YPGalRaf (2% galactose, 2% raffinose) to an OD600 of approximately 0.5 and shifted to YPD medium (2% glucose) for 9 (GAL::3HA-nrd1/GAL::3HA-nab3), 10 (GAL::3HA-nrd1) or 12 hours (GAL::3HA-nab3). Total RNA extraction was performed as previously described. Northern blotting analyses were performed using ULTRAhyb-Oligo according to the manufacturer’s procedures (Ambion, Austin, TX, USA). Oligonucleotides used in this study are listed in Table S3 in Additional file 1. Nrd1 and Nab3 proteins were detected using horseradish peroxidase-conjugated anti-HA antibodies (Santa Cruz, Dallas, TX, USA; 1:5,000).
The oligonucleotide primers used for the RT-PCR analyses are listed in Table S3 in Additional file 1. Total RNA was treated with DNase I (Ambion) according to the manufacturer’s instructions. For the qRT-PCR analyses, RNA was reverse-transcribed and amplified using qScript One-Step SYBR Green qRT-PCR (Quanta Bioscience, Gaithersburg, MD, USA), performed on a Roche LightCycler 480 according to the manufacturer’s instructions (Roche, Burgess Hill, UK). Each reaction contained 50 ng template RNA and 250 nM gene-specific primers. Thermal cycling conditions were 50°C for 5 minutes and 95°C for 2 minutes, followed by 40 cycles of 95°C for 3 s and 60°C for 30 s. Appropriate no-RT and no-template controls were included in each assay, and a dissociation analysis was performed to test assay specificity. Relative quantification of gene expression was calculated using the Roche LightCycler 480 Software. YTA7 levels were normalized to the levels of the PPM2 transcript (NM_00118395) where no significant cross-linking of Nrd1 and Nab3 was detected. For the end-point RT-PCR reactions, 100 ng of total RNA was reverse transcribed using Superscript III at 50°C according to the manufacturer’s instructions (Invitrogen, Paisley, UK) and 2 μM of IPP1 reverse primer. The PCR included 200 nM of forward primers. Thermal cycling conditions were 35 cycles of: 95°C for 30 s, 60°C for 30 s and then 72°C for 1 minute.
CLIP: Cross-linking and immunoprecipitation
CPF: Cleavage and polyadenylation
CRAC: Cross-linking and cDNA analysis
CUT: Cryptic unstable transcript
FDR: False discovery rate
GTF: Gene Transfer Format
PCR: Polymerase chain reaction
snoRNA: Small nucleolar RNA
snRNA: Small nuclear RNA
SUT: Stable unannotated transcript
TSS: Transcription start site
XUT: Xrn1-sensitive unstable transcript.
We would like to thank Jai Tree, Louise McGibbon, Rebecca Holmes, Alex Tuck and many beta-testers from outside our University for testing the pyCRAC tools and help with debugging the scripts. We are very grateful to Steve West, Lidia Vasilieva and Jeffry Corden for critically reading the manuscript and David Tollervey for his support and advice. We would like to thank Alastair Kerr, Stuart Aitken and Christos Josephides for helpful suggestions on the implementation of algorithms for statistical analyses. This work was supported by Wellcome Trust Research and Career Development Grants (097383 to GK, 091549 to SG), the Medical Research Council (GK) and the Wellcome Trust Centre for Cell Biology core grant (092076).
- Hafner M, Landthaler M, Burger L, Khorshid M, Hausser J, Berninger P, Rothballer A, Ascano M, Jungkamp A-C, Munschauer M, Ulrich A, Wardle GS, Dewell S, Zavolan M, Tuschl T: Transcriptome-wide identification of RNA-binding protein and microRNA target sites by PAR-CLIP. Cell. 2010, 141: 129-141. 10.1016/j.cell.2010.03.009.View ArticleGoogle Scholar
- König J, Zarnack K, Rot G, Curk T, Kayikci M, Zupan B, Turner DJ, Luscombe NM, Ule J: iCLIP reveals the function of hnRNP particles in splicing at individual nucleotide resolution. Nat Struct Mol Biol. 2010, 17: 909-915. 10.1038/nsmb.1838.View ArticleGoogle Scholar
- Licatalosi DD, Mele A, Fak JJ, Ule J, Kayikci M, Chi SW, Clark TA, Schweitzer AC, Blume JE, Wang X, Darnell JC, Darnell RB: HITS-CLIP yields genome-wide insights into brain alternative RNA processing. Nature. 2008, 456: 464-469. 10.1038/nature07488.View ArticleGoogle Scholar
- Wang Z, Kayikci M, Briese M, Zarnack K: iCLIP predicts the dual splicing effects of TIA-RNA interactions. PLoS Biol. 2010, 8: e1000530-10.1371/journal.pbio.1000530.View ArticleGoogle Scholar
- Granneman S, Kudla G, Petfalski E, Tollervey D: Identification of protein binding sites on U3 snoRNA and pre-rRNA by UV cross-linking and high-throughput analysis of cDNAs. Proc Natl Acad Sci U S A. 2009, 106: 9613-9618. 10.1073/pnas.0901997106.View ArticleGoogle Scholar
- Jamonnak N, Creamer TJ, Darby MM, Schaughency P, Wheelan SJ, Corden JL: Yeast Nrd1, Nab3, and Sen1 transcriptome-wide binding maps suggest multiple roles in post-transcriptional RNA processing. RNA. 2011, 17: 2011-2025. 10.1261/rna.2840711.View ArticleGoogle Scholar
- Wang Z, Tollervey J, Briese M, Turner D, Ule J: CLIP: construction of cDNA libraries for high-throughput sequencing from RNAs cross-linked to proteins in vivo. Methods. 2009, 48: 287-293. 10.1016/j.ymeth.2009.02.021.View ArticleGoogle Scholar
- Corcoran DL, Georgiev S, Mukherjee N, Gottwein E, Skalsky RL, Keene JD, Ohler U: PARalyzer: definition of RNA binding sites from PAR-CLIP short-read sequence data. Genome Biol. 2011, 12: R79-10.1186/gb-2011-12-8-r79.View ArticleGoogle Scholar
- Althammer S, González-Vallinas J, Ballaré C, Beato M, Eyras E: Pyicos: a versatile toolkit for the analysis of high-throughput sequencing data. Bioinformatics. 2011, 27: 3333-3340. 10.1093/bioinformatics/btr570.View ArticleGoogle Scholar
- Sievers CC, Schlumpf TT, Sawarkar RR, Comoglio FF, Paro RR: Mixture models and wavelet transforms reveal high confidence RNA-protein interaction sites in MOV10 PAR-CLIP data. Nucleic Acids Res. 2012, 40: e160-e160. 10.1093/nar/gks697.View ArticleGoogle Scholar
- Khorshid M, Rodak C, Zavolan M: CLIPZ: a database and analysis environment for experimentally determined binding sites of RNA-binding proteins. Nucleic Acids Res. 2011, 39: D245-D252. 10.1093/nar/gkq940.View ArticleGoogle Scholar
- Vasiljeva L, Buratowski S: Nrd1 interacts with the nuclear exosome for 3′ processing of RNA polymerase II transcripts. Mol Cell. 2006, 21: 239-248. 10.1016/j.molcel.2005.11.028.View ArticleGoogle Scholar
- Steinmetz EJ, Conrad NK, Brow DA, Corden JL: RNA-binding protein Nrd1 directs poly(A)-independent 3′-end formation of RNA polymerase II transcripts. Nature. 2001, 413: 327-331. 10.1038/35095090.View ArticleGoogle Scholar
- Hobor F, Pergoli R, Kubicek K, Hrossova D, Bacikova V, Zimmermann M, Pasulka J, Hofr C, Vanacova S, Stefl R: Recognition of transcription termination signal by the nuclear polyadenylated RNA-binding (NAB) 3 protein. J Biol Chem. 2011, 286: 3645-3657. 10.1074/jbc.M110.158774.View ArticleGoogle Scholar
- Carroll KL, Pradhan DA, Granek JA, Clarke ND, Corden JL: Identification of cis elements directing termination of yeast nonpolyadenylated snoRNA transcripts. Mol Cell Biol. 2004, 24: 6241-6252. 10.1128/MCB.24.14.6241-6252.2004.View ArticleGoogle Scholar
- Carroll KL, Ghirlando R, Ames JM, Corden JL: Interaction of yeast RNA-binding proteins Nrd1 and Nab3 with RNA polymerase II terminator elements. RNA. 2007, 13: 361-373. 10.1261/rna.338407.View ArticleGoogle Scholar
- Lunde BM, Hörner M, Meinhart A: Structural insights into cis element recognition of non-polyadenylated RNAs by the Nab3-RRM. Nucleic Acids Res. 2011, 39: 337-346. 10.1093/nar/gkq751.View ArticleGoogle Scholar
- Hogan DJ, Riordan DP, Gerber AP, Herschlag D, Brown PO: Diverse RNA-binding proteins interact with functionally related sets of RNAs, suggesting an extensive regulatory system. PLoS Biol. 2008, 6: e255-10.1371/journal.pbio.0060255.View ArticleGoogle Scholar
- Thiebaut M, Kisseleva-Romanova E, Rougemaille M, Boulay J, Libri D: Transcription termination and nuclear degradation of cryptic unstable transcripts: a role for the nrd1-nab3 pathway in genome surveillance. Mol Cell. 2006, 23: 853-864. 10.1016/j.molcel.2006.07.029.
- Kuehner JN, Pearson EL, Moore C: Unravelling the means to an end: RNA polymerase II transcription termination. Nat Rev Mol Cell Biol. 2011, 12: 283-294. 10.1038/nrm3098.
- LaCava J, Houseley J, Saveanu C, Petfalski E, Thompson E, Jacquier A, Tollervey D: RNA degradation by the exosome is promoted by a nuclear polyadenylation complex. Cell. 2005, 121: 713-724. 10.1016/j.cell.2005.04.029.
- Wyers F, Rougemaille M, Badis G, Rousselle JC, Dufour ME, Boulay J, Regnault B, Devaux F, Namane A, Séraphin B, Libri D, Jacquier A: Cryptic pol II transcripts are degraded by a nuclear quality control pathway involving a new poly(A) polymerase. Cell. 2005, 121: 725-737. 10.1016/j.cell.2005.04.030.
- Grzechnik P, Kufel J: Polyadenylation linked to transcription termination directs the processing of snoRNA precursors in yeast. Mol Cell. 2008, 32: 247-258. 10.1016/j.molcel.2008.10.003.
- Wlotzka W, Kudla G, Granneman S, Tollervey D: The nuclear RNA polymerase II surveillance system targets polymerase III transcripts. EMBO J. 2011, 30: 1790-1803. 10.1038/emboj.2011.97.
- Arigo JT, Carroll KL, Ames JM, Corden JL: Regulation of yeast NRD1 expression by premature transcription termination. Mol Cell. 2006, 21: 641-651. 10.1016/j.molcel.2006.02.005.
- Kim M, Vasiljeva L, Rando OJ, Zhelkovsky A, Moore C, Buratowski S: Distinct pathways for snoRNA and mRNA termination. Mol Cell. 2006, 24: 723-734. 10.1016/j.molcel.2006.11.011.
- Gudipati RK, Villa T, Boulay J, Libri D: Phosphorylation of the RNA polymerase II C-terminal domain dictates transcription termination choice. Nat Struct Mol Biol. 2008, 15: 786-794. 10.1038/nsmb.1460.
- Vasiljeva L, Kim M, Mutschler H, Buratowski S, Meinhart A: The Nrd1-Nab3-Sen1 termination complex interacts with the Ser5-phosphorylated RNA polymerase II C-terminal domain. Nat Struct Mol Biol. 2008, 15: 795-804. 10.1038/nsmb.1468.
- Creamer TJ, Darby MM, Jamonnak N, Schaughency P, Hao H, Wheelan SJ, Corden JL: Transcriptome-Wide Binding Sites for Components of the Saccharomyces cerevisiae Non-Poly(A) Termination Pathway: Nrd1, Nab3, and Sen1. PLoS Genet. 2011, 7: e1002329. 10.1371/journal.pgen.1002329.
- Darby MM, Serebreni L, Pan X, Boeke JD, Corden JL: The S. cerevisiae Nrd1-Nab3 Transcription Termination Pathway Acts in Opposition to Ras Signaling and Mediates Response to Nutrient Depletion. Mol Cell Biol. 2012, 32: 1762-1775. 10.1128/MCB.00050-12.
- Ciais D, Bohnsack MT, Tollervey D: The mRNA encoding the yeast ARE-binding protein Cth2 is generated by a novel 3′ processing pathway. Nucleic Acids Res. 2008, 36: 3075-3084. 10.1093/nar/gkn160.
- Rondón AG, Mischo HE, Kawauchi J, Proudfoot NJ: Fail-safe transcriptional termination for protein-coding genes in S. cerevisiae. Mol Cell. 2009, 36: 88-98. 10.1016/j.molcel.2009.07.028.
- Gudipati RK, Xu Z, Lebreton A, Séraphin B, Steinmetz LM, Jacquier A, Libri D: Extensive degradation of RNA Precursors by the Exosome in Wild-Type Cells. Mol Cell. 2012, 48: 409-421. 10.1016/j.molcel.2012.08.018.
- pyCRAC command line tools. [https://bitbucket.org/sgrann/pycrac]
- Porrua O, Hobor F, Boulay J, Kubicek K, D’Aubenton-Carafa Y, Gudipati RK, Stefl R, Libri D: In vivo SELEX reveals novel sequence and structural determinants of Nrd1-Nab3-Sen1-dependent transcription termination. EMBO J. 2012, 31: 3935-3948. 10.1038/emboj.2012.237.
- Noël J-F, Larose S, Abou Elela S, Wellinger RJ: Budding yeast telomerase RNA transcription termination is dictated by the Nrd1/Nab3 non-coding RNA termination pathway. Nucleic Acids Res. 2012, 40: 5625-5636. 10.1093/nar/gks200.
- Kim H, Erickson B, Luo W, Seward D, Graber JH, Pollock DD, Megee PC, Bentley DL: Gene-specific RNA polymerase II phosphorylation and the CTD code. Nat Struct Mol Biol. 2010, 17: 1279-1286. 10.1038/nsmb.1913.
- Neil H, Malabat C, D’Aubenton-Carafa Y, Xu Z, Steinmetz LM, Jacquier A: Widespread bidirectional promoters are the major source of cryptic transcripts in yeast. Nature. 2009, 457: 1038-1042. 10.1038/nature07747.
- Xu Z, Wei W, Gagneur J, Perocchi F, Clauder-Münster S, Camblong J, Guffanti E, Stutz F, Huber W, Steinmetz LM: Bidirectional promoters generate pervasive transcription in yeast. Nature. 2009, 457: 1033-1037. 10.1038/nature07728.
- Nagalakshmi U, Wang Z, Waern K, Shou C, Raha D, Gerstein M, Snyder M: The transcriptional landscape of the yeast genome defined by RNA sequencing. Science. 2008, 320: 1344-1349. 10.1126/science.1158441.
- Ozsolak F, Kapranov P, Foissac S, Kim SW, Fishilevich E, Monaghan AP, John B, Milos PM: Comprehensive polyadenylation site maps in yeast and human reveal pervasive alternative polyadenylation. Cell. 2010, 143: 1018-1029. 10.1016/j.cell.2010.11.020.
- Kuehner JN, Brow DA: Regulation of a eukaryotic gene by GTP-dependent start site selection and transcription attenuation. Mol Cell. 2008, 31: 201-211. 10.1016/j.molcel.2008.05.018.
- Thiebaut M, Colin J, Neil H, Jacquier A, Séraphin B, Lacroute F, Libri D: Futile cycle of transcription initiation and termination modulates the response to nucleotide shortage in S. cerevisiae. Mol Cell. 2008, 31: 671-682. 10.1016/j.molcel.2008.08.010.
- Jenks MH, O’Rourke TW, Reines D: Properties of an intergenic terminator and start site switch that regulate IMD2 transcription in yeast. Mol Cell Biol. 2008, 28: 3883-3893. 10.1128/MCB.00380-08.
- Kopcewicz KA, O’Rourke TW, Reines D: Metabolic regulation of IMD2 transcription and an unusual DNA element that generates short transcripts. Mol Cell Biol. 2007, 27: 2821-2829. 10.1128/MCB.02159-06.
- van Dijk EL, Chen CL, d’Aubenton-Carafa Y, Gourvennec S, Kwapisz M, Roche V, Bertrand C, Silvain M, Legoix-Né P, Loeillet S, Nicolas A, Thermes C, Morillon A: XUTs are a class of Xrn1-sensitive antisense regulatory non-coding RNA in yeast. Nature. 2011, 475: 114-117. 10.1038/nature10118.
- Hess DC, Myers CL, Huttenhower C, Hibbs MA, Hayes AP, Paw J, Clore JJ, Mendoza RM, Luis BS, Nislow C, Giaever G, Costanzo M, Troyanskaya OG, Caudy AA: Computationally driven, quantitative experiments discover genes required for mitochondrial biogenesis. PLoS Genet. 2009, 5: e1000407. 10.1371/journal.pgen.1000407.
- Pelechano V, Wei W, Steinmetz LM: Extensive transcriptional heterogeneity revealed by isoform profiling. Nature. 2013, 497: 127-131. 10.1038/nature12121.
- pyCRAC for Galaxy. [http://toolshed.g2.bx.psu.edu/view/swebb/pycrac]
- FASTX-Toolkit. [http://hannonlab.cshl.edu/fastx_toolkit/]
- Novoalign. [http://www.novocraft.com/main/index.php]
- Saccharomyces Genome Database. [http://www.yeastgenome.org]
- Cluster. [http://bonsai.hgc.jp/~mdehoon/software/cluster/index.html]
- Java TreeView. [http://jtreeview.sourceforge.net]
- Tollervey D, Mattaj IW: Fungal small nuclear ribonucleoproteins share properties with plant and vertebrate U-snRNPs. EMBO J. 1987, 6: 469-476.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.