High resolution transcriptome maps for wild-type and nonsense-mediated decay-defective Caenorhabditis elegans
- Arun K Ramani†1, 2,
- Andrew C Nelson†2, 3,
- Philipp Kapranov4, 5,
- Ian Bell4,
- Thomas R Gingeras4, 6 and
- Andrew G Fraser1, 2
© Ramani et al.; licensee BioMed Central Ltd. 2009
Received: 5 June 2009
Accepted: 24 September 2009
Published: 24 September 2009
While many genome sequences are complete, transcriptomes are less well characterized. We used both genome-scale tiling arrays and massively parallel sequencing to map the Caenorhabditis elegans transcriptome across development. We utilized this framework to identify transcriptome changes in animals lacking the nonsense-mediated decay (NMD) pathway.
We find that while the majority of detectable transcripts map to known gene structures, >5% of transcribed regions fall outside current gene annotations. We show that >40% of these are novel exons. Using both technologies to assess isoform complexity, we estimate that >17% of genes change isoform across development. Next we examined how the transcriptome is perturbed in animals lacking NMD. NMD prevents expression of truncated proteins by degrading transcripts containing premature termination codons. We find that approximately 20% of genes produce transcripts that appear to be NMD targets. While most of these arise from splicing errors, NMD targets are enriched for transcripts containing open reading frames upstream of the predicted translational start (uORFs). We identify a relationship between the Kozak consensus surrounding the true start codon and the degree to which uORF-containing transcripts are targeted by NMD and speculate that translational efficiency may be coupled to transcript turnover via the NMD pathway for some transcripts.
We generated a high-resolution transcriptome map for C. elegans and used it to identify endogenous targets of NMD. We find that these transcripts arise principally through splicing errors, strengthening the prevailing view that splicing and NMD are highly interlinked processes.
Identifying genes whose mRNA expression is perturbed in a mutant can yield great insight into a wide range of biological problems. For example, comparing gene expression in wild-type organisms with that seen in mutants can be used to identify the targets of transcription factors or signaling pathways, to organize genes into modules [2–7], and to order genes in pathways [8, 9]. Recently, genome-scale tiling arrays and massively parallel sequence analysis of transcriptomes have emerged as powerful new tools for transcriptome analysis [10–14]. Both rely on the availability of high quality genome sequence, and both offer the promise of transcriptome analysis at unprecedented depth and efficiency.
Each technology has different strengths. In the case of tiling arrays, the entire transcriptome can be queried at the same depth in a single hybridization, making it a very cost-effective way to achieve excellent coverage. However, the resolution with which any transcript can be mapped is limited by the resolution of the array (which for most complex genomes is not at single base-pair resolution) and, furthermore, while one can rapidly identify the regions of the genome that correspond to mature transcripts, the arrays contain no implicit information about how these are connected. Deep sequencing of the transcriptome on the other hand generates data at single-base resolution. While the sequence reads from all current technologies are short (typically 35 to 70 bp), it is possible to assemble these into longer contiguous reads and to link these contigs together. However, since the range of gene expression extends over many orders of magnitude, achieving good coverage for a complex transcriptome is still costly, and assembly of the data is still computationally intensive.
Since tiling arrays and sequencing have complementary benefits for transcriptome analysis, we decided to use both technologies to examine the Caenorhabditis elegans transcriptome across a series of developmental stages. The C. elegans genome is completely sequenced and, while it contains a similar number of genes to the human genome, it is much more compact - around 27% of the worm genome is coding compared with 1.5% in humans. Genome annotation is generally of high quality in the worm; the genome is relatively small (approximately 100 Mb compared with approximately 3 Gb in human) and unrepetitive, making both tiling- and sequence-based approaches comparatively straightforward in the worm. Both technologies allow examination not only of levels of gene expression but also of splice changes across development; they also allow identification of novel transcripts that do not lie in annotated gene structures. They thus provide an unbiased and rich view of the changing transcriptome across development. Our immediate goal was to map the wild-type transcriptome at good coverage and resolution and, thus, to provide a framework to analyze perturbations of the transcriptome in mutants.
In addition to mapping the wild-type transcriptome across several developmental stages, we wished to assess the usefulness of these data for examining how the transcriptome is perturbed in mutant animals. To this end, we used both tiling arrays and sequencing to examine the transcriptome of worms defective for nonsense-mediated decay (NMD), identified in animals by Hodgkin et al. and reviewed in [18, 19]. The central cellular role of the NMD pathway is to prevent the expression of prematurely truncated proteins, which are likely to have deleterious consequences. The NMD pathway recognizes transcripts containing premature termination codons (PTCs) and targets them for degradation, thus eliminating them from the cell. The role of the NMD pathway in eliminating PTC-containing transcripts is highly conserved and indeed many of the components are shared from yeast to human (see [21, 22] for reviews), including the core components SMG-2, SMG-3 and SMG-4 (Upf1-3 in Saccharomyces cerevisiae).
The PTC-containing transcripts that are targets for NMD recognition and degradation arise from three principal sources [21, 23–27]. The first comprises transcripts deriving from genes containing nonsense mutations, whether inherited or somatic. Nonsense mutations play a clear role in many human genetic diseases, and in several of these, NMD has been shown to affect the severity of the disease phenotype; thus, this class of target, though rare, has key medical importance. The second class comprises transcripts that contain PTCs that arise during alternative splicing - either retention of introns or errors in splice site selection [29–32]. Finally, transcripts can be targeted by NMD despite having no PTCs in the principal open reading frame (ORF); instead, these transcripts contain a short ORF upstream of the true start ATG, known as an upstream ORF (uORF). The stop codon of this uORF is recognized as a premature stop codon and the transcript is thus recognized as an NMD target. Recently, genome-scale studies using standard expression microarrays have identified endogenous transcripts that are targets of NMD in yeast [33, 34], Drosophila [35, 36], and humans [37–39]. In all three organisms examined, approximately 10% of genes give rise to a transcript that is targeted for degradation via the NMD pathway, a surprisingly large number.
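To make the uORF definition concrete, the following minimal sketch scans the region upstream of the annotated start codon for an ATG followed by an in-frame stop codon. The function name `find_uorfs` and the requirement that the uORF stop codon also lies upstream of the true start are assumptions made for this illustration; they are not taken from our analysis pipeline.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_uorfs(transcript, main_start):
    """Return (start, end) coordinates of uORFs in the 5' UTR.

    transcript: mRNA sequence written as DNA (A, C, G, T).
    main_start: 0-based index of the annotated start ATG.
    Assumption: only uORFs that terminate before main_start are reported.
    """
    utr = transcript[:main_start].upper()
    uorfs = []
    for i in range(len(utr) - 2):
        if utr[i:i + 3] != "ATG":
            continue
        # Walk codon by codon from the upstream ATG until a stop is found.
        for j in range(i + 3, len(utr) - 2, 3):
            if utr[j:j + 3] in STOP_CODONS:
                uorfs.append((i, j + 3))
                break
    return uorfs
```

A transcript for which this returns a non-empty list carries a stop codon upstream of the principal ORF, which is the feature that can cause it to be recognized as an NMD target.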
Much is thus known already about NMD: the molecular components of the NMD pathway are well-characterized, many of the molecular features that cause a specific transcript to be degraded via the NMD pathway are known, and many endogenous transcripts are found to be affected by NMD. However, other than in yeast, no genome-scale studies have examined the effect of NMD on wild-type transcriptomes with the resolution that either tiling arrays or transcriptome sequencing can provide. We thus set out to compare the transcriptomes of wild-type animals with those of worms that are defective for NMD using both tiling arrays and deep sequencing to determine whether the increased resolution of such analyses can provide new insight into the effect of NMD on the transcriptome of normal developing worms.
Results and discussion
Outline of approach and overview of data
Both genome-scale tiling arrays and deep sequencing approaches were used to generate a high resolution, high coverage 'reference transcriptome' for C. elegans to use as a tool to guide analysis of perturbed transcriptomes such as those of mutant animals. At the time of initiating these studies, there was a great difference in the cost to analyze any specific RNA sample by tiling arrays or by deep sequencing, and we thus chose to use tiling arrays as our primary method to map the transcriptome across multiple developmental stages, and deep sequencing to validate the tiling data and to refine the resolution of the transcript mapping at a more limited number of developmental stages. Combining the data in this way pairs the cost-effectiveness of tiling with the higher resolution of sequencing to generate a high quality transcriptome map.
For our tiling analysis, we purified total RNA from wild-type N2 animals at four different stages of the C. elegans life-cycle (larval stages L3 and L4, young adults, and gravid adults). For each developmental stage, RNA samples were prepared in triplicate and hybridized individually to genome-scale tiling arrays - these have a 35 bp resolution and allow an unbiased view of the majority (70%) of the genome. We initially examined these data to assess coverage and to compare data quality between tiling and sequencing. At any single developmental stage, we detect expression of around a third of all predicted genes on tiling arrays (see Materials and methods; Table S1 in Additional data file 1); across all examined stages, we detect approximately 50% (9,515 out of 19,169 annotated genes in WS150 release of Wormbase) of genes. This is comparable to the detection sensitivity of conventional microarrays. We find that approximately 95% of transcribed features (so-called 'transfrags', the individual contiguous regions of the genome that are transcribed; see Materials and methods and [11, 13] for definition) map to currently predicted transcripts (Table S2 in Additional data file 1), a far higher proportion than that observed in either Drosophila or human. We note that while the proportion of novel transfrags is far lower in the worm, this is in keeping with previous results and is broadly as expected for the worm genome given the far higher proportion of predicted coding sequence relative to that found in many other animal genomes. In addition, this low proportion of novel transcribed regions indicates that genome annotation and gene prediction in the worm is of generally excellent quality.
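As an illustration of what a transfrag is, the sketch below groups neighboring above-threshold probes into contiguous intervals. The function name `call_transfrags` and the `threshold`, `max_gap` and `min_run` parameters are hypothetical values chosen for this example; the procedure actually used is described in Materials and methods and in [11, 13].

```python
def call_transfrags(probes, threshold, max_gap=50, min_run=40):
    """Merge above-threshold probes into contiguous transcribed intervals.

    probes: list of (start, end, intensity) tuples sorted by start,
            all from one chromosome strand.
    max_gap, min_run: assumed tolerances in bp, for illustration only.
    """
    transfrags = []
    current = None  # [start, end] of the interval being extended
    for start, end, intensity in probes:
        if intensity < threshold:
            continue
        if current is not None and start - current[1] <= max_gap:
            # Close enough to the previous positive probe: extend.
            current[1] = max(current[1], end)
        else:
            if current is not None and current[1] - current[0] >= min_run:
                transfrags.append(tuple(current))
            current = [start, end]
    if current is not None and current[1] - current[0] >= min_run:
        transfrags.append(tuple(current))
    return transfrags
```

Note that with a non-zero `max_gap` a single below-threshold probe can be bridged, while an isolated positive probe shorter than `min_run` is discarded.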
Novel transcribed regions of the C. elegans genome
Novel transfrags can either arise from entirely new transcripts that have not been predicted or they could alternatively be novel exons of previously predicted genes. In the latter case, it should be possible to connect these novel transfrags to known gene annotations. To examine this, we used Illumina paired end sequencing on poly-A+ RNA derived from mixed stage populations of worms. We identified reads mapping to novel transfrags and asked whether the paired sequence read mapped to a known gene structure. In approximately 60% of cases (Figure 2a), we could unambiguously connect a novel transfrag confirmed by both sequence and tiling to a previously predicted transcript, suggesting that these are novel exons (Figure 2b, c). Of such novel exons, 65% (20% 5' and 45% 3') are either 5' or 3' to the coding region of the gene, consistent with a view that terminal exons are more variable; therefore, predicting transcript ends is considerably harder and more error prone than predicting internal coding exons.
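The logic of linking a novel transfrag to a known gene through read pairs can be sketched as follows. This is a simplified illustration, assuming each read is reduced to a single mapped coordinate on one chromosome; the function names and the rule that a link is 'unambiguous' when exactly one gene is reached are assumptions of the example.

```python
def overlaps(pos, interval):
    """True if a coordinate falls inside a half-open (start, end) interval."""
    start, end = interval
    return start <= pos < end

def link_transfrags(read_pairs, novel_transfrags, gene_exons):
    """Connect novel transfrags to genes through paired-end reads.

    read_pairs: list of (pos1, pos2) mapped coordinates of the two mates.
    novel_transfrags: {transfrag_id: (start, end)}
    gene_exons: {gene_id: [(start, end), ...]}
    Returns {transfrag_id: gene_id} for transfrags linked to one gene only.
    """
    links = {}
    for pos1, pos2 in read_pairs:
        # Either mate may be the one that falls in the novel transfrag.
        for in_tf, mate in ((pos1, pos2), (pos2, pos1)):
            for tf_id, tf in novel_transfrags.items():
                if not overlaps(in_tf, tf):
                    continue
                for gene, exons in gene_exons.items():
                    if any(overlaps(mate, exon) for exon in exons):
                        links.setdefault(tf_id, set()).add(gene)
    return {tf: next(iter(genes)) for tf, genes in links.items()
            if len(genes) == 1}
```

Transfrags whose mates reach more than one gene are dropped rather than assigned, which mirrors the requirement that the connection be unambiguous.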
Finally, to further investigate the novel transfrags identified by tiling, we compared our tiling data to multiple other gene models in C. elegans. First, we examined the proportion of our novel transfrags (based on gene models in version WS150 of Wormbase) that were still novel in later sets of gene models (WS160, WS170, WS180, and WS190) (Figure 2d). We find that approximately 30% (670 of 2,229) of transfrags that were outside gene models in WS150 have since been incorporated in newer gene models; 90% of those that are novel exons are now confirmed in gene models. Of the remaining (approximately 70%; 1,530 of 2,229), we note that many map to alternative gene models outside the canonical C. elegans gene models - for example, approximately 15% (204 of 1,530) overlap with Twinscan models. We believe that many of these novel transfrags are likely to represent errors in standard gene models, since other gene models predict many of them relatively well. Thus, our data, like previous work [44, 45], may contribute to refining de novo gene models.
In total, then, we identified 10,073 (the unique non-overlapping set from the four stages) novel transcribed regions relative to gene models in WS150 using tiling arrays. Most of these (Table S4 in Additional data file 1) could be confirmed by sequence and of those identified by both technologies, approximately 30% appear to be novel exons of previously annotated transcripts.
Alternative splicing detection using tiling arrays and transcriptome sequencing
Tiling arrays can be used to measure gene expression at the level of mRNA. However, unlike conventional expression arrays, tiling arrays can also be used to examine the expression of individual exons and their relative inclusion into transcripts deriving from any gene. Changes in the relative inclusion of an exon across development indicate changes in splicing and we thus investigated the extent to which we could identify splice changes across C. elegans development using our tiling data. For each exon, we computed its normalized intensity at each developmental stage based on tiling data. The normalized intensity (NI) of any exon is the expression level of the exon relative to the expression level of the gene that includes it. An NI of approximately 1 for an exon indicates that essentially all transcripts deriving from that gene include that exon; an NI of approximately 0 indicates that this exon is skipped from almost all transcripts deriving from that gene. We note that just as gene expression levels measured by tiling and sequence correlate very highly, this is also the case for levels of expression of each individual exon (Figure 1a).
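The NI definition above amounts to a simple ratio, sketched here for a single gene at a single stage. The function name and the handling of undetected genes (returning `None`) are assumptions of this example, as is the choice of how the gene-level expression value is summarized.

```python
def normalized_intensity(exon_intensities, gene_intensity):
    """Return {exon: NI}, where NI = exon expression / gene expression.

    exon_intensities: {exon_id: expression level of that exon}
    gene_intensity: expression level of the gene containing the exons
                    (assumed to be computed elsewhere).
    """
    if gene_intensity <= 0:
        # Gene not detected: NI is undefined for all of its exons.
        return {exon: None for exon in exon_intensities}
    return {exon: value / gene_intensity
            for exon, value in exon_intensities.items()}
```

For example, an exon measured at 50 in a gene measured at 100 has an NI of 0.5, meaning that roughly half of the transcripts from that gene include the exon.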
Having established that normalized exon intensity can identify alternatively spliced exons with reasonable accuracy, we searched for exons whose NI changes between any two developmental stages. A significant change in NI for an exon between two stages indicates a shift in isoform levels for the gene in question. Note that we cannot infer the isoforms themselves, but only identify that they have changed based on changing NI. We find that approximately 5% (459 of 9,515) of expressed genes contain at least one exon whose NI changes by over 2-fold between any two developmental stages - that is, at least approximately 5% of genes change their relative isoform levels across development. While approximately 18% (based on WS150 release of the genome annotation) of worm genes are currently annotated to have multiple isoforms, this is the first genome-scale analysis of how isoforms change across development, although the identification of 5% of genes whose isoform patterns change across development is highly likely to be an underestimate.
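The search for isoform changes described above can be sketched as a filter on the largest NI ratio between any two stages. The data layout, the function names and the small `floor` used to avoid division by zero are assumptions of this illustration; only the 2-fold cut-off comes from the text.

```python
def max_ni_fold_change(ni_by_stage, floor=0.05):
    """Largest NI ratio between any two stages for one exon.

    ni_by_stage: {stage: NI}. Values below `floor` (an assumed constant)
    are raised to it so that the ratio is always defined.
    """
    values = [max(v, floor) for v in ni_by_stage.values()]
    return max(values) / min(values)

def genes_with_isoform_change(exon_ni, min_fold=2.0):
    """Genes with at least one exon whose NI changes by over min_fold.

    exon_ni: {gene: {exon: {stage: NI}}}
    """
    return sorted(
        gene for gene, exons in exon_ni.items()
        if any(max_ni_fold_change(stages) > min_fold
               for stages in exons.values())
    )
```

Because the ratio of the maximum to the minimum NI equals the largest pairwise ratio, no explicit loop over stage pairs is needed.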
Identification of endogenous targets of nonsense-mediated decay
As described above, we have combined tiling arrays and deep sequencing to construct a high-resolution reference transcriptome for wild-type animals across several developmental stages. We wished to use this to examine how the transcriptome is perturbed in mutant animals. Specifically, we were curious to see whether the high resolution of our transcriptome map could yield insights that would not have been evident using standard gene expression microarrays. As a test case, we chose to examine the transcriptome of worms defective for NMD. This was first identified in animals by Hodgkin et al. and is a highly conserved cellular program evolved to prevent the expression of prematurely truncated proteins (reviewed in Chang et al.). The NMD pathway recognizes transcripts containing PTCs and targets them for degradation - thus, transcripts that are targets for NMD should have elevated expression in cells that have no functional NMD pathway. This has been used to identify endogenous transcripts that are NMD targets in S. cerevisiae, Drosophila and human cells [34, 36–39, 46]. We mapped the transcriptome of worms that are defective for NMD using a combination of tiling arrays and sequencing (exactly as was done for the wild-type transcriptome above), and compared it to our wild-type reference transcriptome (Additional data file 3). This should allow us first to identify the endogenous targets of NMD in C. elegans, which has never been done at the genome-scale; second, to examine the features of transcripts that are endogenous targets of NMD in C. elegans to see whether these are similar to those found in other organisms; and finally to determine whether we can gain novel insights from our high resolution map that are not seen simply by examining overall gene expression.
We examined the transcriptome of smg-1(r861) mutant worms - smg-1 encodes a central kinase in the NMD pathway, is highly conserved in eukaryotes, and is absolutely required for NMD in C. elegans. We purified total RNA from smg-1(r861) mutant animals at the same four developmental stages (L3, L4, young adults, and gravid adults) examined in the wild-type animals and hybridized these in triplicate to genome-wide tiling arrays. We computed gene intensities as for the wild-type transcriptome and could thus identify genes whose expression levels are perturbed in the smg-1(r861) mutants. We also used Illumina sequencing to examine the smg-1(r861) transcriptome at two developmental stages to check our tiling data. We find that just as gene expression levels and exon expression levels were highly similar between tiling and sequencing in the wild-type transcriptome, they are highly correlated for the smg-1(r861) data (Additional data file 3).
Looking across all developmental stages, approximately 17% (1,645 of 9,515) of all detectable genes differ in expression level by at least 1.5-fold between wild-type and smg-1(r861) worms and in the great majority of cases (approximately 75% overall), transcript levels are higher in the smg-1(r861) mutant, consistent with these being NMD targets. To confirm that these are not somehow specific to the smg-1(r861) strain, we also examined one time point (L4 larvae) in animals mutant for SMG-5, a key phosphatase in the NMD pathway, and find that the great majority (318 out of 437 genes in this stage; approximately 73%) of genes whose expression differs between wild-type and smg-1(r861) animals also differ between wild-type and smg-5(r860) animals, confirming that the majority of these differences are indeed the result of loss of NMD and are not somehow specific to smg-1(r861). We thus estimate that at least 10% of genes produce a transcript that is elevated in an NMD mutant animal; this is likely to be an underestimate. This is a very similar proportion to that seen in yeast, fly, and human and suggests that while genome complexity and transcriptional regulation are very different in these organisms, the proportion of genes whose expression is affected by NMD is very similar. We next sought to examine the features of the transcripts that have elevated expression in the smg-1(r861) mutants.
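The 1.5-fold comparison between wild-type and mutant can be sketched as below. The function name and the dictionary-based input are assumptions of this example, and no replicate-based statistics are modeled; only the 1.5-fold cut-off comes from the text.

```python
def changed_genes(wild_type, mutant, min_fold=1.5):
    """Split genes into those higher or lower in the mutant by >= min_fold.

    wild_type, mutant: {gene: expression level}. Genes missing from either
    dictionary, or with non-positive values, are skipped.
    """
    higher, lower = set(), set()
    for gene in wild_type.keys() & mutant.keys():
        wt, mut = wild_type[gene], mutant[gene]
        if wt <= 0 or mut <= 0:
            continue
        ratio = mut / wt
        if ratio >= min_fold:
            higher.add(gene)       # candidate NMD target
        elif ratio <= 1 / min_fold:
            lower.add(gene)
    return higher, lower
```

Running the same function on a second mutant (here, smg-5) and intersecting the resulting sets gives the kind of overlap reported above.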
Features of NMD-regulated transcripts in C. elegans identified from gene models
We have identified many genes whose expression is increased in animals that have no functional NMD pathway. We note that identifying a higher level of expression of any gene in an NMD mutant does not mean that every transcript deriving from that gene is an NMD target, nor can we be sure that the effect is direct. However, what is clear is that all true endogenous NMD targets (ENTs) will have increased expression in mutant animals where NMD is completely lost. We therefore reasoned that transcript features associated with true NMD targets will be enriched in the sets of transcripts that are more highly expressed in the smg-1(r861) mutants even if some expression changes are the result of downstream effects.
Intron retention in NMD-regulated transcripts in C. elegans
We conclude that precise intron excision is relatively efficient in the worm since the great majority of introns are correctly excised and thus undetectable. However, at least 7% (1,274 of 20,000) of genes produce transcripts in which an intron has failed to be excised and that are ultimately degraded by NMD in wild-type animals. Intriguingly, we find that the retained introns are not a random set, but instead share certain features. Most particularly, we notice a decrease in usage of the canonical TTTCAG splice site consensus at the 3' end of these introns and a corresponding increase in the usage of less common splice sites. While nearly 75% of annotated introns contain the canonical TTNCAG, only 55% of the expressed introns have these 3' splice sites (P-value = 0.004). We thus suggest that the number of errors associated with excision of introns flanked by splice sites that closely match the consensus sequences is far lower than for introns that have more divergent splice sites. This is similar to the relationship between diminished 5' splice donor and branch point consensus and intron retention leading to NMD as observed in yeast.
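The comparison of 3' splice site usage can be illustrated by counting how many introns in a set end with the TTNCAG pattern. This is a minimal sketch: the function name is hypothetical and the example assumes that each intron is supplied as its own 5'-to-3' sequence, so that the splice acceptor is the final six bases.

```python
import re

# TTNCAG at the very end of the intron, where N is any base.
CONSENSUS_3PRIME = re.compile(r"TT[ACGT]CAG$")

def consensus_fraction(intron_seqs):
    """Fraction of introns whose last six bases match TTNCAG."""
    seqs = [seq.upper() for seq in intron_seqs]
    if not seqs:
        return 0.0
    matches = sum(1 for seq in seqs if CONSENSUS_3PRIME.search(seq))
    return matches / len(seqs)
```

Applying the function separately to all annotated introns and to the introns detected as retained would give the two proportions compared above.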
Splicing changes in NMD-regulated transcripts in C. elegans
It is becoming increasingly apparent that splicing and NMD are two highly linked processes, with almost entire families of splicing factors being alternative splicing-dependent NMD targets themselves [29, 39, 40, 52]. We examined differences in splicing between wild-type and smg-1(r861) animals at the level of exon intensity. As described above, we computed NIs for all cassette exons in each of the developmental stages in the smg-1(r861) animals and compared these NIs to those of each exon in wild-type animals. A difference in NI between wild-type and smg-1(r861) animals for any exon indicates a difference in overall levels of inclusion of that exon in the entire set of transcripts deriving from that gene in worms lacking NMD; this may be a direct or indirect effect of loss of NMD. In the case of direct effects, the exact same set of transcripts is synthesized in wild-type and smg-1(r861) animals; however, some splice variants contain PTCs and are thus degraded by NMD in wild-type worms. The difference in NI for an exon between wild-type and smg-1(r861) is thus due to the failure to degrade these PTC-containing isoforms in smg-1(r861) animals. Retention of these PTC-containing isoforms in smg-1(r861) animals will affect overall gene intensity; hence, in the case of splice changes that are the direct effects of NMD, not only will the NI of any exon be perturbed in the smg-1(r861) mutant, but the overall gene intensity will also be affected. Alternatively, a change in NI for an exon between wild-type and smg-1(r861) might be completely indirect, some downstream consequence of loss of NMD. In such cases, the difference in NI for an exon is due to a difference in the isoforms synthesized between wild-type and smg-1(r861) mutants. 
In these indirect cases, the NI difference reflects a difference in splicing between wild-type and smg-1(r861) mutants rather than a difference in transcript turnover/retention, and there will thus be no accompanying difference in gene intensity between wild-type and smg-1(r861) mutants. We thus distinguish direct from indirect effects of NMD on splicing patterns by assessing whether there is a gene intensity change accompanying any splice change that we see; if there is a concomitant change in gene expression, we deduce that the splice change is a direct effect of NMD.
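The decision rule described above can be written down as a small classifier. The function name is hypothetical, and the default thresholds (2-fold for NI, 1.5-fold for gene intensity) are assumptions borrowed from cut-offs used elsewhere in this analysis; all inputs are assumed to be positive.

```python
def fold(a, b):
    """Symmetric fold difference between two positive values."""
    return max(a, b) / min(a, b)

def classify_splice_change(ni_wt, ni_mut, gene_wt, gene_mut,
                           ni_fold=2.0, gene_fold=1.5):
    """Label an exon's NI difference as a direct or indirect effect of NMD.

    direct:   NI differs and overall gene intensity also differs, as
              expected if PTC-containing isoforms escape degradation.
    indirect: NI differs without an accompanying gene intensity change,
              as expected if splicing itself has changed.
    """
    if fold(ni_wt, ni_mut) < ni_fold:
        return "no splice change"
    if fold(gene_wt, gene_mut) >= gene_fold:
        return "direct"
    return "indirect"
```

For instance, an exon whose NI rises from 0.2 to 0.6 while the gene intensity rises 1.8-fold would be labeled direct, whereas the same NI change with a flat gene intensity would be labeled indirect.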
We find that 485 genes have an exon whose NI differs between wild-type and smg-1(r861) mutants by 2-fold or more. In 350 of these 485 genes (72%) we find that the varying exon has an NI of 0.5 or more in the smg-1(r861) mutant; that is, in animals that have lost NMD, that exon is present in at least 50% of the transcripts deriving from that gene. However, only a minority of these genes (approximately 22%) show any difference in expression level in the NMD mutant animals. This is surprising - if these splice changes were the result of retaining transcripts with PTCs, we would expect a substantial difference in overall gene expression levels in the mutant. This suggests that many of the NI changes seen are likely to be indirect effects of loss of NMD (since transcripts are not NMD targets with or without the varying exon) and may be indicative of a more general perturbation in splice site selection in the NMD mutants. One possibility to explain indirect effects on splicing in NMD mutants is that NMD is affecting expression of splice factors themselves and there is previous evidence to suggest this may be true. One particular class of genes whose expression is affected by NMD comprises the rsp genes that encode the SR family of splice factors [30, 40, 50]. Indeed, when we examine our tiling data we see that seven of the eight C. elegans rsp genes give rise to NMD-targeted transcripts that have retained introns and we confirmed all these events by RT-PCR (Additional data file 5). We also find that the SR and hnRNP families of splice factors are over-expressed in the smg-1(r861) mutants based on both our tiling and sequence data (Table S7 in Additional data file 1), and wanted to extend this analysis to other splice factors to see if this is generally true.
We examined a manually curated list of all well-annotated splice factors (Table S8 in Additional data file 1) to see if NMD has an effect on expression of other splice factors and find that while only 13% (2,631 of 20,000) of all genes have an exon with an NI differing by 1.5-fold or more between wild-type and smg-1(r861) worms, approximately 33% (44 of 132; P-value < 0.0001) of splice factors have an exon with an NI differing by 1.5-fold. This is a strong enrichment and accords with previous findings [29, 30, 39, 40, 50, 53]. Crucially, the great majority of these differences in NI in splice factors in smg-1(r861) worms appear to be direct consequences of the loss of NMD - while only 30% of genes with strong differences (3-fold or higher) in NI between smg-1(r861) and wild-type show any difference in expression level in the NMD mutant animals, over 90% of splice factors with similar NI differences (3-fold) show increased expression levels (1.5-fold or higher) in the smg-1(r861) worms. We thus propose that the majority of splice differences between wild-type worms and worms defective for NMD are indirect; however, in the case of splice factors themselves, most splice differences are highly likely to be direct, the immediate result of expression of PTC-containing isoforms of these genes.
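One way to gauge the strength of this enrichment is a one-sided hypergeometric test, sketched below with the counts quoted above. The text does not state which test produced the reported P-value, so the choice of test here is an assumption, as is treating the 132 splice factors as a draw from the same set of 20,000 genes.

```python
from math import comb

def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) when drawing `draws` items without replacement
    from `population` items of which `successes` carry the property."""
    total = comb(population, draws)
    upper = min(draws, successes)
    # Sum exact integer counts first, then divide once to limit rounding.
    favorable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favorable / total

# Counts from the text: 2,631 of 20,000 genes and 44 of 132 splice factors.
p_value = hypergeom_upper_tail(20000, 2631, 132, 44)
```

Under these assumptions the resulting probability is far below 0.0001, in line with the strong enrichment described above.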
Translational initiation efficiency at the true start ATG affects NMD targeting of transcripts
Both genome-scale tiling arrays and new generation sequencing technologies can be used to examine the transcriptome at great depth and coverage. We used both technologies to make a 'reference' transcriptome for C. elegans, examining several different developmental stages of wild-type worms using high resolution tiling arrays and validating the data using deep sequencing. Our principal aim was not to make an exhaustive and complete map of the C. elegans transcriptome, but rather to generate a high-resolution scaffold that can be used to identify subtle perturbations of this transcriptome in mutant animals. Nonetheless, mapping the wild-type transcriptome itself identified a number of interesting findings and novel features.
First, we find that both sequencing and tiling analysis yielded very similar transcriptome maps, as had previously been found in Schizosaccharomyces pombe. The transcribed regions identified corresponded extremely well between these two technologies, and the levels of expression of either genes or individual exons were very similar. Since the methodologies underlying tiling arrays and sequencing are very different, this suggests that both methods are providing accurate maps of the transcriptome and expression levels. Second, we used both methodologies to assess levels of alternative splicing in the worm, and to identify changes in splicing across development. We find that most of the alternative splicing events inferred from tiling arrays can be directly validated by sequence reads spanning exon junctions and find that at least 5% of genes have major changes to their isoforms between any two developmental stages. This suggests that using either technology, or a combination of both, to examine perturbations in alternative splicing either in different conditions or different mutant backgrounds will be very powerful. Finally, we used tiling arrays to identify over 10,000 regions of the C. elegans genome that are transcribed but lie outside current canonical gene models. We find that over 60% of these can be confirmed by sequence, suggesting that most of these are real transcripts. Other gene finding programs such as Twinscan predict many of these, indicating that there may be a systematic bias against these specific transcribed regions in the models used to build the current canonical gene models. We examined our sequence data and find that approximately 40% of these novel transcribed regions can be connected to current gene models; these are thus novel exons. The remainder likely represent entirely novel genes and their identification in this way may guide future refinements to both final gene models and de novo gene-finding algorithms.
Having made a 'reference' wild-type transcriptome, we used this to examine how this transcriptome is perturbed in worms lacking a functional NMD pathway. Transcripts containing PTCs are normally degraded in wild-type animals but will be retained in NMD mutants and such transcripts will thus be expressed at higher levels in the mutant. We find that in C. elegans (as in yeast, fly and human) approximately 10% of genes have higher expression levels in NMD mutant animals. The transcripts deriving from these genes share several features. First, there is a clear enrichment for the presence of a uORF; second, such transcripts are more likely to have a long 3' UTR; third, we see clear evidence for intron retention in many of these transcripts; and finally, we identify many transcripts that appear to be direct targets of NMD due to alternative splicing events. Taken together, we can identify one or more such features in over 55% of genes that have at least 1.5-fold increased expression in any single developmental stage in smg-1(r861) animals. This number rises to over 80% for genes that have higher expression in the smg-1(r861) mutant in all four stages. This suggests that most increases in gene expression seen in smg-1(r861) animals are direct effects of loss of NMD and that the genes whose expression is increased in smg-1(r861) mutant animals make PTC-containing transcripts that are normally degraded via NMD.
Identifying genes with perturbed expression in NMD mutants could have been done using conventional expression arrays (as was previously done in yeast, fly, and human). Since the resolution with which we can examine the transcriptome is far higher using tiling and sequencing, we also investigated whether other more subtle changes can be seen in the transcriptome of NMD mutant animals. We examined our data to determine whether we could detect any changes in splicing in any genes between the wild-type and mutant transcriptomes that would not have been possible using conventional expression arrays. We identified a large number of introns that have increased expression in NMD mutant animals. We infer that they are likely to be retained in both wild-type and NMD mutant animals due to inefficient splicing, causing PTCs - these PTC-containing transcripts are then degraded in wild-type animals but persist (and hence are present at higher levels) in the NMD mutants. Intriguingly, the retained introns do not appear to be a random set - fewer of these have the 'TTTCAG' splice site consensus at their 3' end and we thus suggest that efficiency of excision of introns flanked by splice sites that closely match the canonical sequences is far higher than for introns that have more divergent splice sites. In total, approximately 7% of genes give rise to transcripts with retained introns - these are usually degraded in wild-type animals. Overall, however, we find that intron excision is highly efficient - the vast majority of introns are undetectable in either wild-type or NMD mutant transcriptomes, and those that fail to be excised (and hence have elevated levels in NMD mutants) are usually present at low levels compared with overall gene expression levels.
Finally, we identified a set of genes that have different isoform levels in wild-type and NMD mutant transcriptomes. Intriguingly, the majority of genes that show a difference in splicing in the NMD mutants do not differ in overall expression level between the mutants and the wild-type animals. We infer that the difference in isoform levels is not due to the retention of PTC-containing isoform variants in the mutants (as this would manifest itself as a detectable difference in expression levels) but is instead an indirect consequence of the loss of NMD. In other words, it is not that the pattern of splicing is identical in wild-type and mutant animals and isoform levels differ only because NMD targets persist in the mutant; rather, the pattern of splicing itself differs between wild-type and mutant. Previous data from a variety of organisms suggested that some splice factors are themselves targets of NMD, and indeed we find that approximately 30% of all well-annotated splice factor genes appear to make transcripts that are NMD targets. More intriguing is the finding that genes encoding splicing factors are far more likely to produce PTC-containing splice variants than random sets of genes (30% of splice factors, 13% of other genes). This finding supports the view that alternative splicing and NMD are two highly interlinked processes [49, 50, 53, 55–61].
We found that many of the splice changes seen in NMD mutant animals are likely to be indirect consequences of loss of NMD. Since many genes encoding splice factors are NMD targets, could this account for the indirect differences in splicing of many other genes seen in NMD mutant animals? We believe that this may be the case and that it may result from one of two possible effects. One possibility is that the level of splice factors is perturbed in animals defective for NMD and that this leads to the many changes in splicing seen in these mutants. Splice site selection is known to be highly sensitive to expression levels of splice factors, so this is a plausible explanation. However, we note that although we detect a difference in overall transcript levels for many splice factors in NMD mutant animals, this does not imply that the levels of expressed full-length splice factor proteins differ in these animals compared with wild-type. The difference in expression levels of approximately 30% of splice factors in smg-1(r861) mutants is due to the synthesis of both full-length encoding transcripts that are stable in both wild-type and NMD mutant animals and PTC-containing transcripts that are degraded in wild-type animals but not in animals lacking NMD. The second possibility, which we suggest here, is that the retention of these PTC-containing splice factor-encoding transcripts in smg-1(r861) animals results in expression of truncated proteins that interfere with splice site specification by the full-length splice factors. The NMD pathway evolved to prevent precisely these kinds of detrimental effects following the expression of truncated proteins from PTC-containing transcripts. We note that this model is speculative, that the two alternatives are not mutually exclusive, and that our data alone cannot distinguish between them.
Intron retention, incorrect exon splicing, and the inheritance of nonsense mutations all lead to transcripts with PTCs and, thus, these are all classical targets of NMD. Even if such transcripts were translated, they could not make full-length protein. The activation of NMD to degrade a transcript due to the presence of a uORF in the 5' UTR is more intriguing, however, since the principal ORF has no premature stops and thus could be translated. Current models suggest that the uORF stop is 'seen' by the NMD machinery as premature since the rest of the transcript is seen as an artificially elongated 'faux' 3' UTR [23, 48, 62–66]; our data are entirely consistent with this. However, we also find an intriguing correlation between the sequences surrounding the true ATG of the main ORF and the extent to which NMD targets uORF-containing transcripts for degradation. Even in transcripts containing a uORF, if the true ATG is surrounded by sequences that conform closely to a Kozak consensus, and thus is used efficiently for translational initiation, the transcript is less likely to be an NMD target. The effect is slight and poorly predictive: it does not explain why most transcripts containing a uORF evade NMD, and many other determinants must affect this. However, it fits well with a model where a first exploratory round of translation is used to find the start ATG, followed by subsequent rounds of steady-state translation. If the true ATG is in an excellent initiation consensus, it will be used in subsequent rounds of initiation and the uORF will be translated at a lower rate; hence, there will be a lower impact of NMD.
The relationship between the efficiency of translational initiation at the true ATG and the level of NMD targeting suggests that, in transcripts containing a uORF, selection of sequences surrounding the true ATG may in some cases fine-tune protein levels not only by affecting the rates of translational initiation but also by affecting transcript turnover and hence transcript levels. It will be interesting to see whether this speculative model holds true as experimentalists studying NMD identify the exact sequence features that mark out targets of NMD.
Examining the transcriptome of NMD mutant animals has given us a window into the change in cellular transcription and transcript processing machinery since we can see all transcripts that are made in the wild-type animals but degraded by NMD. We find that, in general, transcription in the worm appears to be very specific - the great majority of transcripts identified in the NMD mutants are also present in wild-type animals and are not affected by loss of NMD. The cell thus does not appear to make a large quantity of aberrant PTC-containing transcripts. Splicing appears to be slightly more error-prone, however, with approximately 7% of genes making transcripts that are NMD targets due to a failure to excise introns. Finally, while some changes in splicing lead to PTCs in transcripts, many of the differences in splicing that we can detect in NMD mutants could be indirect, resulting from perturbations in transcripts encoding splice factors themselves. The high resolution transcriptome map that we used in this study was invaluable, allowing us not only to analyze expression levels of predicted genes, but also to examine each exon and intron individually as well as to identify novel transcribed regions. We hope that the availability of this scaffold will help direct future transcriptome research in C. elegans, in particular in the analysis of splicing and RNA stability.
Materials and methods
Strain maintenance and RNA preparation and processing
C. elegans strains were maintained on NGM agar plates seeded with OP50 Escherichia coli according to standard protocols. Strains for which data are presented in this paper are Bristol N2, smg-1(r861) and smg-5(r860). All strains were supplied by the Caenorhabditis Genetics Centre (CGC), University of Minnesota, USA. Total RNA was prepared using Trizol solution according to the manufacturer's protocol, cleaned using RNeasy columns (QIAGEN, Venlo, Limburg, The Netherlands) and DNase treated for 30 minutes with 10 U DNase I (Roche, Basel, Switzerland) in 1× One-Phor-All buffer (GE Healthcare, Little Chalfont, Buckinghamshire, UK). RNA was then re-purified using RNeasy columns before labeling and hybridizing to Affymetrix GeneChip C. elegans Tiling 1.0R Arrays as previously described [11, 13]. For sequence data, polyA+ RNA was purified from total RNA using Oligotex Midi Kits (QIAGEN) according to the manufacturer's protocol. cDNA was then produced using the SuperScript™ Double-Stranded cDNA Synthesis Kit (Invitrogen, Carlsbad, CA, USA) and purified using a QIAGEN PCR Purification Kit. Sequence data for the resulting cDNA were then obtained as described by Wilhelm et al.
Processing of tiling microarray data
Raw spot intensity files (.CEL files) were quantile normalized in R. The normalized data were processed and exported as .BAR files using Affymetrix Tiling Analysis Software version 1.1 for visualization in Affymetrix Integrated Genome Browser. A background cutoff was calculated to include the top 5% of all non-genic probes (relative to WS150) for each condition, and interval analysis was then performed in Tiling Analysis Software to identify transcribed regions (transfrags) [11, 13] above this cutoff. Maxgap and minrun parameters were set to 35 bp and 70 bp, respectively [11, 13]. Genes were considered expressed if ≥50% of probes were above background in ≥50% of unique exons. Gene intensities were then calculated as the median intensity of the exonic probes above background within these filtered exons. Exon intensities used for the splicing analysis were the median intensity of probes above background in the exons for which ≥50% of probes were above background.
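The interval and expression rules above can be illustrated with a short sketch. It is not the Tiling Analysis Software implementation: it assumes a simplified reading in which probes above the cutoff are joined when they lie no more than maxgap apart and an interval is kept when it spans at least minrun. The function names, the input layout (probe position and intensity pairs, per-exon lists of probe intensities) and the example values are hypothetical.

```python
from statistics import median


def call_transfrags(probes, cutoff, maxgap=35, minrun=70):
    """Join above-cutoff probes into intervals (simplified maxgap/minrun rule).

    probes: list of (position, intensity) tuples sorted by position.
    Returns a list of (start, end) positions of the retained intervals.
    """
    transfrags = []
    start = end = None
    for pos, intensity in probes:
        if intensity <= cutoff:
            continue  # probes at or below background never extend an interval
        if start is not None and pos - end <= maxgap:
            end = pos  # close enough to the previous positive probe: extend
        else:
            if start is not None and end - start >= minrun:
                transfrags.append((start, end))
            start = end = pos  # open a new interval
    if start is not None and end - start >= minrun:
        transfrags.append((start, end))
    return transfrags


def exon_passes(exon_probes, cutoff):
    """True if at least 50% of the exon's probes are above background."""
    above = sum(1 for x in exon_probes if x > cutoff)
    return 2 * above >= len(exon_probes)


def gene_intensity(exons, cutoff):
    """Median above-background probe intensity across passing exons.

    exons: list of per-exon probe intensity lists (unique exons of one gene).
    Returns None when fewer than 50% of exons pass, i.e. gene not expressed.
    """
    passing = [e for e in exons if e and exon_passes(e, cutoff)]
    if not exons or 2 * len(passing) < len(exons):
        return None
    return median(x for e in passing for x in e if x > cutoff)


# Hypothetical example data
probes = [(0, 10), (25, 12), (50, 11), (75, 13), (100, 1),
          (200, 9), (225, 2), (300, 15)]
transfrags = call_transfrags(probes, cutoff=5)
expressed = gene_intensity([[10, 12, 1], [2, 1, 3], [8, 9]], cutoff=5)
silent = gene_intensity([[1, 2], [10, 1, 1]], cutoff=5)
```

In this toy input the first four probes form a single retained interval, while the isolated positive probes at positions 200 and 300 are discarded because they do not reach the minimum run length.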
Normalized intensity and splice index calculation
where E_i is the median probe intensity above background of the exon, G_i is that of the gene, and t1 and t2 are the two time points being compared.
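These definitions match the widely used splicing-index formulation, in which the exon intensity is first normalized to the intensity of its gene (NI = E_i / G_i) and the normalized intensities at the two time points are then compared as a log ratio (SI = log2(NI_t1 / NI_t2)). The sketch below assumes that formulation; the exact expressions, the log base and the function names are assumptions rather than details taken from the text.

```python
import math


def normalized_intensity(exon_intensity, gene_intensity):
    """Exon intensity relative to the intensity of its gene (NI = E_i / G_i)."""
    return exon_intensity / gene_intensity


def splice_index(e_t1, g_t1, e_t2, g_t2):
    """log2 ratio of normalized exon intensities at time points t1 and t2."""
    ni_t1 = normalized_intensity(e_t1, g_t1)
    ni_t2 = normalized_intensity(e_t2, g_t2)
    return math.log2(ni_t1 / ni_t2)


# Hypothetical values: the gene level is unchanged between t1 and t2,
# but the exon is included four times more often at t1.
si = splice_index(e_t1=200.0, g_t1=100.0, e_t2=50.0, g_t2=100.0)
```

Under this formulation a splice index near zero indicates that the exon tracks the overall gene level, whereas a large positive or negative value indicates a change in exon usage that is independent of the change in gene expression.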
Mapping sequence data to the genome
Reads obtained from sequencing were mapped to the genome using MAQ version 0.6.6. A quality threshold of 30 was used as the cutoff for accepting aligned reads. This yields a count of the number of reads assigned to each nucleotide position in the genome. Each nucleotide can therefore be given an intensity score, which is the number of times it occurs in mapped reads. Gene intensities from sequence data are calculated as the median number of reads that map to each nucleotide of the given gene for which there is at least one read. Exon and intron expression are calculated as the median number of reads mapping across the exon or intron.
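The per-nucleotide scoring described above can be sketched as follows. The sketch assumes that reads have already been filtered at the mapping quality threshold and are given as start and length pairs on a single chromosome; it also assumes that exon and intron medians are taken over every position in the region, whereas gene medians use only covered positions. The function names and coordinates are hypothetical.

```python
from collections import Counter
from statistics import median


def per_base_coverage(reads):
    """Count how many reads cover each position.

    reads: iterable of (start, length) with 0-based starts, already
    filtered for mapping quality.
    """
    coverage = Counter()
    for start, length in reads:
        for pos in range(start, start + length):
            coverage[pos] += 1
    return coverage


def region_intensity(coverage, start, end, covered_only):
    """Median read depth over the half-open region [start, end).

    covered_only=True ignores positions without any read (gene-level rule);
    covered_only=False uses every position (assumed exon/intron rule).
    """
    depths = [coverage.get(pos, 0) for pos in range(start, end)]
    if covered_only:
        depths = [d for d in depths if d > 0]
    return median(depths) if depths else 0


# Hypothetical example: three short reads on a ten-nucleotide region
coverage = per_base_coverage([(0, 4), (2, 4), (3, 2)])
gene_level = region_intensity(coverage, 0, 10, covered_only=True)
exon_level = region_intensity(coverage, 0, 10, covered_only=False)
```

Restricting the gene-level median to covered positions prevents long unsequenced stretches from driving the score to zero, at the cost of overstating expression for genes that are only partially covered.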
Identifying reads mapping to splice sites
To map reads to splice junctions, we created a non-redundant set of sequences 66 nucleotides long corresponding to all possible splice junctions (annotated adjacent and non-adjacent exons based on WS150 were used; see Additional data file 6). Each sequence was created by combining 33 nucleotides from the 3' end of the upstream exon with 33 nucleotides from the 5' end of the downstream exon. Sequence reads were first mapped to the genome using MAQ, and reads that mapped with a quality below the threshold of 30 were deemed unmapped. These unmapped reads were then aligned to the splice junction sequences using BLAT, and a read was identified as mapping to a junction if there was at least a four-nucleotide overlap over either exonic half of the corresponding junction. Reads that had multiple hits to different junctions were eliminated. Junction sequences formed by combining non-adjacent exons and having reads mapping to them uniquely were determined to be alternative splice sites. Currently, the calls are binary: every alternative junction with reads mapping to it under the above criteria is considered a positive call.
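The construction of the junction set can be sketched as follows. The sketch assumes that the exons of a gene are supplied as sequences in transcript order and that the overlap rule requires at least four aligned nucleotides on each side of the junction, which is one reading of the criterion above. The BLAT alignment step itself is not shown, and all names are hypothetical.

```python
from itertools import combinations


def junction_sequence(upstream_exon, downstream_exon, flank=33):
    """Last `flank` nt of the upstream exon joined to the first `flank` nt
    of the downstream exon (shorter exons contribute their full length)."""
    return upstream_exon[-flank:] + downstream_exon[:flank]


def gene_junctions(exons, flank=33):
    """All upstream/downstream exon pairs of one gene, adjacent or not.

    exons: list of exon sequences in transcript order.
    Returns {(i, j): junction sequence} for every i < j.
    """
    return {(i, j): junction_sequence(exons[i], exons[j], flank)
            for i, j in combinations(range(len(exons)), 2)}


def is_alternative(i, j):
    """A junction that skips at least one annotated exon."""
    return j - i > 1


def spans_junction(aln_start, aln_end, flank=33, min_overlap=4):
    """True if an alignment covers at least `min_overlap` nt on each side.

    aln_start, aln_end: 0-based half-open alignment coordinates on the
    junction sequence, whose exon boundary lies at position `flank`.
    """
    left = flank - aln_start   # aligned bases in the upstream half
    right = aln_end - flank    # aligned bases in the downstream half
    return left >= min_overlap and right >= min_overlap


# Hypothetical three-exon gene built from placeholder sequences
exons = ["A" * 40, "C" * 40, "G" * 40]
junctions = gene_junctions(exons)
```

A read aligned to positions 29 to 65 of a junction sequence would be accepted under this rule, since it covers four nucleotides of the upstream exon, whereas one aligned to positions 31 to 66 would be rejected.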
The raw data can be accessed from two independent locations. The first is WormBase, where all the tiling data and sequence data will be available as a download. The tiling data are viewable in WormBase as tracks in the Gene View section. The sequence data are also available from the NCBI Short Read Archive under the tracking number SRA009279.
Additional data files
The following additional data files are available with the online version of this paper: Tables S1 to S8 (Additional data file 1); a figure showing the distribution of mapped sequence reads (Additional data file 2); a figure detailing a comparison of tiling and sequence data for smg-1(r861), which is analogous to Figure 1 for N2 (wild type) (Additional data file 3); a text file that contains the set of genes that are over-expressed twofold or more in the NMD mutant compared to wild type (Additional data file 4); a figure showing the structural changes in SR gene transcripts between N2 and smg-1(r861) (Additional data file 5); a fasta file containing all the exon junction sequences (Additional data file 6).
endogenous NMD target
MAQ: Mapping and Assembly with Qualities
ORF: open reading frame
PTC: premature termination codon
We thank Wolfgang Huber for critical comments and help with the data analysis, Sanger core sequencing for their sequencing efforts and the Fraser lab for comments and suggestions. The authors declare that they have no competing financial interests.
- Gaudet J, Muttumu S, Horner M, Mango SE: Whole-genome analysis of temporal gene expression during foregut development. PLoS Biol. 2004, 2: e352. 10.1371/journal.pbio.0020352.
- Friedman N: Inferring cellular networks using probabilistic graphical models. Science. 2004, 303: 799-805. 10.1126/science.1094068.
- Segal E, Shapira M, Regev A, Pe'er D, Botstein D, Koller D, Friedman N: Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data. Nat Genet. 2003, 34: 166-176. 10.1038/ng1165.
- Ben-Tabou de-Leon S, Davidson EH: Gene regulation: gene control network in development. Annu Rev Biophys Biomol Struct. 2007, 36: 191. 10.1146/annurev.biophys.35.040405.102002.
- Howard ML, Davidson EH: cis-Regulatory control circuits in development. Dev Biol. 2004, 271: 109-118. 10.1016/j.ydbio.2004.03.031.
- Karlebach G, Shamir R: Modelling and analysis of gene regulatory networks. Nat Rev Mol Cell Biol. 2008, 9: 770-780. 10.1038/nrm2503.
- Hughes TR, Marton MJ, Jones AR, Roberts CJ, Stoughton R, Armour CD, Bennett HA, Coffey E, Dai H, He YD, Kidd MJ, King AM, Meyer MR, Slade D, Lum PY, Stepaniants SB, Shoemaker DD, Gachotte D, Chakraburtty K, Simon J, Bard M, Friend SH: Functional discovery via a compendium of expression profiles. Cell. 2000, 102: 109-126. 10.1016/S0092-8674(00)00015-5.
- Muller P, Kuttenkeuler D, Gesellchen V, Zeidler MP, Boutros M: Identification of JAK/STAT signalling components by genome-wide RNA interference. Nature. 2005, 436: 871-875. 10.1038/nature03869.
- Sachs K, Perez O, Pe'er D, Lauffenburger DA, Nolan GP: Causal protein-signaling networks derived from multiparameter single-cell data. Science. 2005, 308: 523-529. 10.1126/science.1105809.
- Bertone P, Stolc V, Royce TE, Rozowsky JS, Urban AE, Zhu X, Rinn JL, Tongprasit W, Samanta M, Weissman S, Gerstein M, Snyder M: Global identification of human transcribed sequences with genome tiling arrays. Science. 2004, 306: 2242-2246. 10.1126/science.1103388.
- Cheng J, Kapranov P, Drenkow J, Dike S, Brubaker S, Patel S, Long J, Stern D, Tammana H, Helt G, Sementchenko V, Piccolboni A, Bekiranov S, Bailey DK, Ganesh M, Ghosh S, Bell I, Gerhard DS, Gingeras TR: Transcriptional maps of 10 human chromosomes at 5-nucleotide resolution. Science. 2005, 308: 1149-1154. 10.1126/science.1108625.
- He H, Wang J, Liu T, Liu XS, Li T, Wang Y, Qian Z, Zheng H, Zhu X, Wu T, Shi B, Deng W, Zhou W, Skogerbø G, Chen R: Mapping the C. elegans noncoding transcriptome with a whole-genome tiling microarray. Genome Res. 2007, 17: 1471-1477. 10.1101/gr.6611807.
- Kapranov P, Cawley SE, Drenkow J, Bekiranov S, Strausberg RL, Fodor SP, Gingeras TR: Large-scale transcriptional activity in chromosomes 21 and 22. Science. 2002, 296: 916-919. 10.1126/science.1068597.
- Wilhelm BT, Marguerat S, Watt S, Schubert F, Wood V, Goodhead I, Penkett CJ, Rogers J, Bahler J: Dynamic repertoire of a eukaryotic transcriptome surveyed at single-nucleotide resolution. Nature. 2008, 453: 1239-1243. 10.1038/nature07002.
- C. elegans Sequencing Consortium: Genome sequence of the nematode C. elegans: a platform for investigating biology. Science. 1998, 282: 2012-2018. 10.1126/science.282.5396.2012.
- Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, et al: Initial sequencing and analysis of the human genome. Nature. 2001, 409: 860-921. 10.1038/35057062.
- Hodgkin J, Papp A, Pulak R, Ambros V, Anderson P: A new kind of informational suppression in the nematode Caenorhabditis elegans. Genetics. 1989, 123: 301-313.
- Chang YF, Imam JS, Wilkinson MF: The nonsense-mediated decay RNA surveillance pathway. Annu Rev Biochem. 2007, 76: 51-74. 10.1146/annurev.biochem.76.050106.093909.
- Maquat L: Nonsense-mediated mRNA Decay. 2006, Georgetown, TX: Landes Bioscience.
- Isken O, Maquat LE: The multiple lives of NMD factors: balancing roles in gene and genome regulation. Nat Rev Genet. 2008.
- Behm-Ansmant I, Kashima I, Rehwinkel J, Sauliere J, Wittkopp N, Izaurralde E: mRNA quality control: an ancient machinery recognizes and degrades mRNAs with nonsense codons. FEBS Lett. 2007, 581: 2845-2853. 10.1016/j.febslet.2007.05.027.
- Maquat LE: Nonsense-mediated mRNA decay: splicing, translation and mRNP dynamics. Nat Rev Mol Cell Biol. 2004, 5: 89-99. 10.1038/nrm1310.
- Muhlrad D, Parker R: Aberrant mRNAs with extended 3' UTRs are substrates for rapid degradation by mRNA surveillance. RNA. 1999, 5: 1299-1307. 10.1017/S1355838299990829.
- Oliveira CC, McCarthy JE: The relationship between eukaryotic translation and mRNA stability. A short upstream open reading frame strongly inhibits translational initiation and greatly accelerates mRNA degradation in the yeast Saccharomyces cerevisiae. J Biol Chem. 1995, 270: 8936-8943. 10.1074/jbc.270.15.8936.
- Pulak R, Anderson P: mRNA surveillance by the Caenorhabditis elegans smg genes. Genes Dev. 1993, 7: 1885-1897. 10.1101/gad.7.10.1885.
- Pulak RA, Anderson P: Structures of spontaneous deletions in Caenorhabditis elegans. Mol Cell Biol. 1988, 8: 3748-3754.
- Welch EM, Jacobson A: An internal open reading frame triggers nonsense-mediated decay of the yeast SPT10 mRNA. EMBO J. 1999, 18: 6134-6145. 10.1093/emboj/18.21.6134.
- Khajavi M, Inoue K, Lupski JR: Nonsense-mediated mRNA decay modulates clinical outcome of genetic disease. Eur J Hum Genet. 2006, 14: 1074-1081. 10.1038/sj.ejhg.5201649.
- Barberan-Soler S, Zahler AM: Alternative splicing regulation during C. elegans development: splicing factors as regulated targets. PLoS Genet. 2008, 4: e1000001. 10.1371/journal.pgen.1000001.
- Lareau LF, Brooks AN, Soergel DA, Meng Q, Brenner SE: The coupling of alternative splicing and nonsense-mediated mRNA decay. Adv Exp Med Biol. 2007, 623: 190-211.
- Jaillon O, Bouhouche K, Gout JF, Aury JM, Noel B, Saudemont B, Nowacki M, Serrano V, Porcel BM, Segurens B, Le Mouël A, Lepère G, Schächter V, Bétermier M, Cohen J, Wincker P, Sperling L, Duret L, Meyer E: Translational control of intron splicing in eukaryotes. Nature. 2008, 451: 359-362. 10.1038/nature06495.
- Sayani S, Janis M, Lee CY, Toesca I, Chanfreau GF: Widespread impact of nonsense-mediated mRNA decay on the yeast intronome. Mol Cell. 2008, 31: 360-370. 10.1016/j.molcel.2008.07.005.
- Guan Q, Zheng W, Tang S, Liu X, Zinkel RA, Tsui KW, Yandell BS, Culbertson MR: Impact of nonsense-mediated mRNA decay on the global expression profile of budding yeast. PLoS Genet. 2006, 2: e203. 10.1371/journal.pgen.0020203.
- He F, Li X, Spatrick P, Casillo R, Dong S, Jacobson A: Genome-wide analysis of mRNAs regulated by the nonsense-mediated and 5' to 3' mRNA decay pathways in yeast. Mol Cell. 2003, 12: 1439-1452. 10.1016/S1097-2765(03)00446-5.
- Metzstein MM, Krasnow MA: Functions of the nonsense-mediated mRNA decay pathway in Drosophila development. PLoS Genet. 2006, 2: e180. 10.1371/journal.pgen.0020180.
- Rehwinkel J, Raes J, Izaurralde E: Nonsense-mediated mRNA decay: Target genes and functional diversification of effectors. Trends Biochem Sci. 2006, 31: 639-646. 10.1016/j.tibs.2006.09.005.
- Mendell JT, Sharifi NA, Meyers JL, Martinez-Murillo F, Dietz HC: Nonsense surveillance regulates expression of diverse classes of mammalian transcripts and mutes genomic noise. Nat Genet. 2004, 36: 1073-1078. 10.1038/ng1429.
- Pan Q, Saltzman AL, Kim YK, Misquitta C, Shai O, Maquat LE, Frey BJ, Blencowe BJ: Quantitative microarray profiling provides evidence against widespread coupling of alternative splicing with nonsense-mediated mRNA decay to control gene expression. Genes Dev. 2006, 20: 153-158. 10.1101/gad.1382806.
- Saltzman AL, Kim YK, Pan Q, Fagnani MM, Maquat LE, Blencowe BJ: Regulation of multiple core spliceosomal proteins by alternative splicing-coupled nonsense-mediated mRNA decay. Mol Cell Biol. 2008, 28: 4320-4330. 10.1128/MCB.00361-08.
- Barberan-Soler S, Zahler AM: Alternative splicing and the steady-state ratios of mRNA isoforms generated by it are under strong stabilizing selection in Caenorhabditis elegans. Mol Biol Evol. 2008, 25: 2431-2437. 10.1093/molbev/msn181.
- Rogers A, Antoshechkin I, Bieri T, Blasiar D, Bastiani C, Canaran P, Chan J, Chen WJ, Davis P, Fernandes J, Fiedler TJ, Han M, Harris TW, Kishore R, Lee R, McKay S, Müller HM, Nakamura C, Ozersky P, Petcherski A, Schindelman G, Schwarz EM, Spooner W, Tuli MA, Van Auken K, Wang D, Wang X, Williams G, Yook K, Durbin R, et al: WormBase 2007. Nucleic Acids Res. 2008, 36: D612-617. 10.1093/nar/gkm975.
- Manak JR, Dike S, Sementchenko V, Kapranov P, Biemar F, Long J, Cheng J, Bell I, Ghosh S, Piccolboni A, Gingeras TR: Biological function of unannotated transcription during the early development of Drosophila melanogaster. Nat Genet. 2006, 38: 1151-1158. 10.1038/ng1875.
- Li H, Ruan J, Durbin R: Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res. 2008, 18: 1851-1858. 10.1101/gr.078212.108.
- Hillier LW, Reinke V, Green P, Hirst M, Marra MA, Waterston RH: Massively parallel sequencing of the polyadenylated transcriptome of C. elegans. Genome Res. 2009, 19: 657-666. 10.1101/gr.088112.108.
- Shin H, Hirst M, Bainbridge MN, Magrini V, Mardis E, Moerman DG, Marra MA, Baillie DL, Jones SJ: Transcriptome analysis for Caenorhabditis elegans based on novel expressed sequence tags. BMC Biol. 2008, 6: 30. 10.1186/1741-7007-6-30.
- Lelivelt MJ, Culbertson MR: Yeast Upf proteins required for RNA surveillance affect global expression of the yeast transcriptome. Mol Cell Biol. 1999, 19: 6710-6719.
- Anders KR, Grimson A, Anderson P: SMG-5, required for C. elegans nonsense-mediated mRNA decay, associates with SMG-2 and protein phosphatase 2A. EMBO J. 2003, 22: 641-650. 10.1093/emboj/cdg056.
- Brogna S, Wen J: Nonsense-mediated mRNA decay (NMD) mechanisms. Nat Struct Mol Biol. 2009, 16: 107-113. 10.1038/nsmb.1550.
- McGlincy NJ, Smith CW: Alternative splicing resulting in nonsense-mediated mRNA decay: what is the meaning of nonsense?. Trends Biochem Sci. 2008, 33: 385-393. 10.1016/j.tibs.2008.06.001.
- Lareau LF, Inada M, Green RE, Wengrod JC, Brenner SE: Unproductive splicing of SR genes associated with highly conserved and ultraconserved DNA elements. Nature. 2007, 446: 926-929. 10.1038/nature05676.
- Zhang H, Blumenthal T: Functional analysis of an intron 3' splice site in Caenorhabditis elegans. RNA. 1996, 2: 380-388.
- Ni JZ, Grate L, Donohue JP, Preston C, Nobida N, O'Brien G, Shiue L, Clark TA, Blume JE, Ares M: Ultraconserved elements are associated with homeostatic control of splicing regulators by alternative splicing and nonsense-mediated decay. Genes Dev. 2007, 21: 708-718. 10.1101/gad.1525507.
- Lewis BP, Green RE, Brenner SE: Evidence for the widespread coupling of alternative splicing and nonsense-mediated mRNA decay in humans. Proc Natl Acad Sci USA. 2003, 100: 189-192. 10.1073/pnas.0136770100.
- Nakagawa S, Niimura Y, Gojobori T, Tanaka H, Miura K: Diversity of preferred nucleotide sequences around the translation initiation codon in eukaryote genomes. Nucleic Acids Res. 2008, 36: 861-871. 10.1093/nar/gkm1102.
- Cuccurese M, Russo G, Russo A, Pietropaolo C: Alternative splicing and nonsense-mediated mRNA decay regulate mammalian ribosomal gene expression. Nucleic Acids Res. 2005, 33: 5965-5977. 10.1093/nar/gki905.
- Green RE, Lewis BP, Hillman RT, Blanchette M, Lareau LF, Garnett AT, Rio DC, Brenner SE: Widespread predicted nonsense-mediated mRNA decay of alternatively-spliced transcripts of human normal and disease genes. Bioinformatics. 2003, 19 (Suppl 1): i118-121. 10.1093/bioinformatics/btg1015.
- Jumaa H, Nielsen PJ: The splicing factor SRp20 modifies splicing of its own mRNA and ASF/SF2 antagonizes this regulation. EMBO J. 1997, 16: 5077-5085. 10.1093/emboj/16.16.5077.
- Lejeune F, Cavaloc Y, Stevenin J: Alternative splicing of intron 3 of the serine/arginine-rich protein 9G8 gene. Identification of flanking exonic splicing enhancers and involvement of 9G8 as a trans-acting factor. J Biol Chem. 2001, 276: 7850-7858. 10.1074/jbc.M009510200.
- Mitrovich QM, Anderson P: Unproductively spliced ribosomal protein mRNAs are natural targets of mRNA surveillance in C. elegans. Genes Dev. 2000, 14: 2173-2184. 10.1101/gad.819900.
- Sureau A, Gattoni R, Dooghe Y, Stevenin J, Soret J: SC35 autoregulates its expression by promoting splicing events that destabilize its mRNAs. EMBO J. 2001, 20: 1785-1796. 10.1093/emboj/20.7.1785.
- Wollerton MC, Gooding C, Wagner EJ, Garcia-Blanco MA, Smith CW: Autoregulation of polypyrimidine tract binding protein by alternative splicing leading to nonsense-mediated decay. Mol Cell. 2004, 13: 91-100. 10.1016/S1097-2765(03)00502-1.
- Buhler M, Steiner S, Mohn F, Paillusson A, Muhlemann O: EJC-independent degradation of nonsense immunoglobulin-mu mRNA depends on 3' UTR length. Nat Struct Mol Biol. 2006, 13: 462-464. 10.1038/nsmb1081.
- Ivanov PV, Gehring NH, Kunz JB, Hentze MW, Kulozik AE: Interactions between UPF1, eRFs, PABP and the exon junction complex suggest an integrated model for mammalian NMD pathways. EMBO J. 2008, 27: 736-747. 10.1038/emboj.2008.17.
- Silva AL, Ribeiro P, Inacio A, Liebhaber SA, Romao L: Proximity of the poly(A)-binding protein to a premature termination codon inhibits mammalian nonsense-mediated mRNA decay. RNA. 2008, 14: 563-576. 10.1261/rna.815108.
- Singh G, Rebbapragada I, Lykke-Andersen J: A competition between stimulators and antagonists of Upf complex recruitment governs human nonsense-mediated mRNA decay. PLoS Biol. 2008, 6: e111. 10.1371/journal.pbio.0060111.
- Eberle AB, Stalder L, Mathys H, Orozco RZ, Muhlemann O: Posttranscriptional gene regulation by spatial rearrangement of the 3' untranslated region. PLoS Biol. 2008, 6: e92. 10.1371/journal.pbio.0060092.
- Ishigaki Y, Li X, Serin G, Maquat LE: Evidence for a pioneer round of mRNA translation: mRNAs subject to nonsense-mediated decay in mammalian cells are bound by CBP80 and CBP20. Cell. 2001, 106: 607-617. 10.1016/S0092-8674(01)00475-5.
- Brenner S: The genetics of Caenorhabditis elegans. Genetics. 1974, 77: 71-94.
- Caenorhabditis Genetic Center (CGC). [http://biosci.umn.edu/CGC/]
- Crooks GE, Hon G, Chandonia JM, Brenner SE: WebLogo: a sequence logo generator. Genome Res. 2004, 14: 1188-1190. 10.1101/gr.849004.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.