Genome-wide DNA mutations in Arabidopsis plants after multigenerational exposure to high temperatures

Background: Elevated temperatures can cause physiological, biochemical, and molecular responses in plants that can greatly affect their growth and development. Mutations are the most fundamental force driving biological evolution. However, how long-term elevations in temperature influence the accumulation of mutations in plants remains unknown.
Results: Multigenerational exposure of Arabidopsis MA (mutation accumulation) lines and MA populations to extreme heat and moderate warming results in significantly increased mutation rates in single-nucleotide variants (SNVs) and small indels. We observe distinctive mutational spectra under extreme and moderately elevated temperatures, with significant increases in transition and transversion frequencies. Mutation occurs more frequently in intergenic regions, coding regions, and transposable elements in plants grown under elevated temperatures. At elevated temperatures, more mutations accumulate in genes associated with defense responses, DNA repair, and signaling. Notably, the distribution patterns of mutations among all progeny differ between MA populations and MA lines, suggesting that stronger selection effects occurred in populations. Methylation is observed more frequently at mutation sites, indicating its contribution to the mutation process at elevated temperatures. Mutations occurring within the same genome under elevated temperatures are significantly biased toward low gene density regions, special trinucleotides, tandem repeats, and adjacent simple repeats. Additionally, mutations found in all progeny overlap significantly with genetic variations reported in 1001 Genomes, suggesting non-uniform distribution of de novo mutations through the genome.
Conclusion: Collectively, our results suggest that elevated temperatures can accelerate the accumulation, and alter the molecular profiles, of DNA mutations in plants, thus providing significant insight into how environmental temperatures fuel plant evolution.
Supplementary Information The online version contains supplementary material available at 10.1186/s13059-021-02381-4.

In the submitted manuscript, Lu et al. estimated the Arabidopsis mutation rate under elevated temperature in multi-generation mutation accumulation lines (single-seed descent) and mutation accumulation populations (35 individuals). Their results showed that mutation rates were elevated when plants were grown under high-temperature conditions. The estimated mutation rate, the mutational spectrum, and the excess overlap between spontaneous mutations and natural variants were all consistent with previous studies. The manuscript was well written, and the authors explained the experimental protocol and variant-calling procedure in detail.
The study confirmed several previously reported aspects of mutation parameters in Arabidopsis. One unique feature of the study was the MA population. I felt that the comparison between the MA population and the MA lines was underutilized in the manuscript; it might inform us about the interplay of selection, fitness, population size, and mutation. For example, compared with the MA lines (n=1), the decrease in mutation rate at coding regions in the MA population (n=35) might be used to estimate the strength of selection and the fitness effects on these coding genes.
Under heat, the authors found mutations in DNA repair genes and also high mutation rates. These two observations could be connected to each other and have nothing to do with heat. I wonder if the authors have considered a different scenario: a mutation occurred in a DNA repair gene just by chance in the heat MA line, unrelated to the heat treatment. Because the DNA repair gene was mutated, the mutation rate also increased. The observed increase in mutation rate under heat would then be due to the mutation in the DNA repair gene rather than to heat. Please comment on this.
Minor comments:
1. Line 78. The authors define large duplications/deletions as del, but this is not consistent with line 161, where del refers to a 1-3 bp short deletion. Please make changes and be consistent.
3. Line 352, 377 and lines 553-555. The authors used the word "ability" to describe the bias in mutation toward low gene density regions and certain trinucleotide contexts. I found it a bit awkward to describe plants as able to control their mutations in a certain way. Changes in mutational parameters should be considered an outcome of the interplay between multiple evolutionary processes rather than something controlled by the plant itself. I would suggest modifying the wording.
4. Figure S2 has a mismatched figure legend.
5. Figure S3. It's not clear why the non-linear smooth function was used in the figure.
Question 1: In the abstract, the authors claimed "Mutation occurred more frequently in intergenic regions, coding regions (especially nonsynonymous mutations), and transposable elements (TEs)", but in a later section, the authors mentioned "the mutations in lines and populations grown under elevated temperatures were significantly biased toward low gene density regions, special trinucleotides (GC context), tandem repeats, and adjacent simple repeats". These two sentences seem to contradict each other.

Response:
Thank you for pointing out this issue. After reading your comment, we realized that the two sentences seem to be confusing. We originally tried to emphasize the two different mutation properties in the lines and populations grown under elevated temperatures.
For the first sentence "Mutation occurred more frequently in intergenic regions, coding regions (especially nonsynonymous mutations), and transposable elements (TEs)", we actually meant the relatively higher mutation frequency in intergenic, coding regions and TEs of plants grown under elevated temperatures compared to those under control temperature.
For the second sentence "the mutations in lines and populations grown under elevated temperatures were significantly biased toward low gene density regions, special trinucleotides (GC context), tandem repeats, and adjacent simple repeats", we were comparing the mutation rates among different genomic regions within the DNA in the plants obtained under elevated temperatures, but not a comparison between plants under elevated temperatures and control.
We have revised the two sentences as follows:
Question 4: In Figure 5c, the color keys for frameshift insertion and nonsynonymous are very similar; please change the colors to make them easy to distinguish from each other.
Response: Thank you for your suggestion. We have changed the color key of frameshift insertion in Figure 5c in the revised manuscript.
Figure 4, the mutation frequency: one would expect the value to be 1, but here the value is larger than 1; a more detailed explanation is needed.
Response: Thank you for pointing out the issue and for your valuable comments.

Response:
1) We are sorry that the original sentence was not described correctly. We meant that the mutation frequency of coding region in Heat B is lower than that in warming C. We have corrected the description in the revised manuscript.
Response: As suggested, we have deleted the sentence in the revised manuscript.

Response to Reviewer II
Comments to the Author:
In [...] So the observed mutation rate increase in heat was due to the mutation in the DNA repair gene rather than heat. Please comment on this.
Response: Thank you for your comments. Indeed, we found only one heat MA line (not all heat lines) that carried mutations in a DNA repair gene. Therefore, it is unlikely that the DNA repair gene is a mutation hotspot driven by heat treatment.
The previously proposed "DNA repair" hypothesis suggests that variation in the fidelity of DNA repair may account for variation in mutation rate (Baer et al. 2007). Therefore, we agree with your comment that a mutated DNA repair gene could be one cause of an increased mutation rate. On the other hand, in our results, many Heat MA individuals have significantly increased mutation rates without mutations in DNA repair genes. We speculate on some possible reasons as follows:
1) Under heat stress, the mutation propensity of plants may not be directly driven by high temperature, but may arise because some important biological processes are significantly affected, including metabolic rate, cell division, DNA replication, and the level of reactive oxygen species (ROS). In addition, the most basic chemical reactions of cellular activities, such as redox reactions and electron transfer, as well as the structure of macromolecules such as proteins and enzymes, would also be affected by high temperature. In this regard, heat stress may act as a disruptor, interfering with the normal physiological, biochemical, and growth processes of plants and resulting in increased disorder of cellular metabolism and instability in homeostasis.
2) When Arabidopsis plants are grown under high temperature, heat stress induces more transcriptional activity of stress-responsive genes, increasing the probability of errors in DNA unwinding and transcription. Moreover, heat stress would probably impair DNA repair activity, for example, through potential up-regulation of the error-prone polymerases typical of the SOS and SIM mechanisms identified in bacteria, yeast, and human cancer cells (Baer et al. 2007; Bindra et al. 2011; Al Mamun et al. 2012; Shor et al. 2013).
All these negative effects on plants are produced by heat stress and may contribute to the increased mutation rates in Heat MA lines and populations.
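The comparison described above (heat lines with and without a DNA repair gene mutation, relative to control) can be illustrated with a minimal sketch. The line names, rates, and the function `fold_change_excluding` are hypothetical and are not taken from the manuscript; the sketch only shows the form of the check, not the analysis actually performed.

```python
import statistics

def fold_change_excluding(heat_rates, control_rates, flagged):
    """Mean heat rate divided by mean control rate, skipping flagged lines.

    heat_rates: dict mapping line name -> mutation rate (hypothetical units)
    control_rates: list of control-line mutation rates
    flagged: set of heat-line names carrying a DNA repair gene mutation
    """
    kept = [rate for name, rate in heat_rates.items() if name not in flagged]
    if not kept or not control_rates:
        raise ValueError("need at least one heat line and one control line")
    return statistics.fmean(kept) / statistics.fmean(control_rates)

# Hypothetical values for illustration only (not data from the study).
heat = {"H1": 10.0, "H2": 12.0, "H3": 30.0}
control = [5.0, 5.0]
repair_mutants = {"H3"}  # assumed: the single line with a repair-gene mutation

fc_all = fold_change_excluding(heat, control, set())
fc_without_repair_mutants = fold_change_excluding(heat, control, repair_mutants)
print(f"all heat lines: {fc_all:.2f}x control")
print(f"excluding repair mutants: {fc_without_repair_mutants:.2f}x control")
```

If the fold change stays well above 1 after the flagged line is removed, a repair-gene mutation alone would not explain the increase. A formal test of significance would still be needed.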
Question 3: Line 78. The authors define large duplications/deletions as del, but this is not consistent with line 161 where del refers to 1-3 short deletion. Please make changes and be consistent.

Response:
In the Background, we have deleted the first "del" (Page 5, line 80) and retained "1-3bp short deletion" as "del" in the Results section (Page 9, line 163).
Question 4: Line 181-205 and figure 3. I would suggest including the standard error of the mean (SEM) in writing. SEM was shown in figure 3 but not written in line 181-205.
Response: As suggested, we have added the standard error of the mean (SEM) in the revised manuscript. Please see pages 10-11, lines 186-208.
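For reference, the SEM reported alongside each mean is the sample standard deviation divided by the square root of the number of lines. A minimal sketch follows; the function name `mean_and_sem` and the example rates are illustrative assumptions, not values from the manuscript.

```python
import math
import statistics

def mean_and_sem(rates):
    """Return (mean, SEM), where SEM = sample SD / sqrt(n)."""
    n = len(rates)
    if n < 2:
        raise ValueError("SEM needs at least two observations")
    mean = statistics.fmean(rates)
    sem = statistics.stdev(rates) / math.sqrt(n)  # stdev uses the n-1 denominator
    return mean, sem

# Hypothetical per-line mutation rates (arbitrary units), for illustration only.
example_rates = [2.0, 4.0, 6.0, 8.0]
mean, sem = mean_and_sem(example_rates)
print(f"{mean:.2f} ± {sem:.2f}")
```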
Question 5: Line 352, 377 and lines 553-555. The authors used the word "ability" to describe the bias in mutation toward low gene density region and certain trinucleotide context. I found it a bit awkward to describe plants can control their mutation in a certain way. Changes in mutational parameters should be considered as an outcome the interplays between multiple evolutionary processes rather than controlled by the plant itself. I would suggest to modify the wording.

Response:
We fully agree with your comment that accumulated mutational changes are the outcome of the interplay between multiple evolutionary processes rather than controlled by the plant itself. As per your suggestion, we have reworded the corresponding sentences.
2) We have replaced the sentence in the original line 377 "However, the trinucleotides CCG (or GGC) and GCG (or CGC) appeared to have strong mutation abilities in all MA groups, regardless of temperature treatment" with "However, the trinucleotides CCG (or GGC) and GCG (or CGC) appeared to have high mutation rates in all MA groups, regardless of temperature treatment".
3) We have revised the sentence in the original lines 553-555 "These results suggest that plants alter their genes to re-adjust their own physiological and developmental processes in response to long-term high temperature stress." to "These results suggest that multigenerational exposure of A. thaliana to high temperatures promotes the mutations in stress response genes, thereby exerting influences on physiological and developmental processes in response to long-term high temperature stress."
Question 6: Figure S2 has a mismatched figure legend.

Response:
We are sorry that the legend of Figure S2 was not presented correctly. In the revised manuscript, we have made the necessary corrections. Please see Additional file 2.
Question 7: Figure S3. It's not clear why the non-linear smooth function was used in the figure.
Response: In the original analysis, we used both LOESS (locally weighted smoothing) and linear regression models (see Figure 1 below) to fit smooth curves for the correlation between GC content and mutation rate. Both models produced a generally flat fitted line, indicating that GC content had no significant impact on the mutation rate. Therefore, we showed only the correlation between GC content and observed mutation rates fitted with the LOESS method (using loess in the stat_smooth function of the ggplot2 package). We have added the corresponding figure legend in Additional file 2: Figure S3.
Question 8: Additional file 3. Page 6. The citation for GSE118298 was incorrect.

Response:
We are sorry for the mistake. In the revised manuscript, we have corrected the reference citation (Wang et al., 2020) and added the reference in the revised Additional file 3.
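As a supplement to the response to Question 7 above, the "flat fitted line" criterion for the linear model can be sketched as an ordinary least-squares fit of mutation rate on GC content. This covers only the linear-regression half of the comparison; the LOESS fit produced with ggplot2 is not reproduced here. The function name and the example values are hypothetical and are not data from the study.

```python
def least_squares_fit(x, y):
    """Ordinary least-squares fit y = intercept + slope * x; returns (slope, intercept)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical windows: GC fraction and mutation rate (arbitrary units).
gc_content = [0.30, 0.35, 0.40, 0.45]
mutation_rate = [1.0, 1.2, 1.2, 1.0]

slope, intercept = least_squares_fit(gc_content, mutation_rate)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

A slope close to zero relative to its uncertainty corresponds to the flat line described in the response. Assessing significance would additionally require a standard error or confidence band for the slope.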