- Open Access
Whole genomes and transcriptomes reveal adaptation and domestication of pistachio
- Lin Zeng†1, 5,
- Xiao-Long Tu†2,
- He Dai†3,
- Feng-Ming Han†3,
- Bing-She Lu†7,
- Ming-Shan Wang1,
- Hojjat Asadollahpour Nanaei4,
- Ali Tajabadipour8,
- Mehdi Mansouri9,
- Xiao-Long Li3,
- Li-Li Ji2,
- David M. Irwin6,
- Hong Zhou10,
- Min Liu3,
- Hong-Kun Zheng3,
- Ali Esmailizadeh4 (corresponding author) and
- Dong-Dong Wu1, 11 (corresponding author)
© The Author(s). 2019
- Received: 4 July 2018
- Accepted: 1 April 2019
- Published: 18 April 2019
Pistachio (Pistacia vera), one of the most important commercial nut crops worldwide, is highly adaptable to abiotic stresses and is tolerant to drought and salt stresses.
Here, we provide a draft de novo genome of pistachio as well as large-scale genome resequencing. Comparative genomic analyses reveal stress adaptation of pistachio is likely attributable to the expanded cytochrome P450 and chitinase gene families. Particularly, a comparative transcriptomic analysis shows that the jasmonic acid (JA) biosynthetic pathway plays an important role in salt tolerance in pistachio. Moreover, we resequence 93 cultivars and 14 wild P. vera genomes and 35 closely related wild Pistacia genomes, to provide insights into population structure, genetic diversity, and domestication. We find that frequent genetic admixture occurred among the different wild Pistacia species. Comparative population genomic analyses reveal that pistachio was domesticated about 8000 years ago and suggest that key genes for domestication related to tree and seed size experienced artificial selection.
Our study provides insight into genetic underpinning of local adaptation and domestication of pistachio. The Pistacia genome sequences should facilitate future studies to understand the genetic basis of agronomically and environmentally related traits of desert crops.
- Pistacia vera
- Crop domestication
- Artificial selection
With agricultural acreage shrinking and the human population growing, feeding the world is becoming an increasing problem. Deserts, which take up about one third of the land surface area of Earth, are extreme environments: barren landscapes with little precipitation and often dry, alkaline soils, and therefore hostile living conditions for most plant and animal life. However, some crops can still be cultivated in some desert areas. Insight into the environmental adaptations and economic characters of these species should facilitate the planting and breeding of these crops in different desert regions, which might contribute to easing the world’s food crisis.
Although the rapid development of genome sequencing has facilitated the discovery of the genetic underpinnings of domestication and improvement in many crops, there are very few studies on pistachio. The genome size of pistachio has been estimated to be about 600 Mb with a high heterozygosity rate. Moazzzam Jazi et al. used a genome-wide transcriptome to discover salinity tolerance-related markers and stress response mechanisms by comparing two pistachio cultivars under control and salt treatment.
In the present study, to better understand the molecular evolutionary history underpinning pistachio domestication, we assembled a draft genome of pistachio and resequenced 107 whole genomes, including 93 domestic and 14 wild individuals of P. vera and 35 other genomes from different wild Pistacia species. Integrating genomic and transcriptomic analyses revealed expanded gene families (e.g., cytochrome P450 and chitinase) and the jasmonic acid (JA) biosynthetic pathway that are likely involved in stress adaptation. Comparative population genomic analyses revealed that pistachio was domesticated ~ 8000 years ago and that likely key genes for domestication are those involved in tree and seed size, which experienced artificial selection (Additional file 1). These genome sequences should facilitate future studies to understand the genetic underpinnings of agronomically and environmentally related traits of desert crops.
Genome evolution of pistachio
We first sequenced the genome of a P. vera L. cultivar on the Illumina HiSeq 2500 platform using multiple paired-end libraries, including two small-insert libraries (270 bp and 500 bp) and six long-insert mate-pair libraries (3 kb, 4 kb, 8 kb, 10 kb, 15 kb, and 17 kb). A draft genome of 569.12 Mb was assembled, with contig and scaffold N50 sizes of 20.69 kb and 768.39 kb, respectively (Additional file 2: Tables S1-S2, version 1). To improve the continuity, we further generated a total of 4,038,150 filtered long reads with an average length of 14,568 bp from 59 Gb of sequencing data on the PacBio Sequel System. Finally, a draft genome of 671 Mb was assembled, with contig and scaffold N50 sizes of 75.7 kb and 949.2 kb, respectively (Additional file 2: Table S2, version 2). The genome quality is comparable to that of previously reported plant genomes (Additional file 2: Table S2) and is sufficient to support the analyses presented here. The assembly size is somewhat larger than the estimated genome size, which is likely due to the high heterozygosity of pistachio (1.72%). This pattern has also been reported in other assembled genomes with high heterozygosity [4–6]. Transposable elements occupied 70.7% of the pistachio genome, of which 46.75% were long terminal repeat retrotransposons (Additional file 2: Table S3). A total of 31,784 protein-coding genes and 161 miRNAs were annotated by integrating different methods (Additional file 2: Tables S4-S5). Core Eukaryotic Genes Mapping Approach (CEGMA) analyses indicated that 96.94% of the core protein-coding genes were recovered in our assembled genome. The assembly sequence was also assessed with BUSCO v3.0.2b, which found 1361 complete gene models out of 1440 (94.51%) and 29 fragmented (2.01%); 18.96% of complete genes were found in more than one copy (Additional file 2: Table S6).
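For readers unfamiliar with the N50 statistic quoted for contigs and scaffolds, the following is a minimal sketch of the usual definition: the length L such that sequences of length L or longer together cover at least half of the assembly. The function name and the toy lengths are illustrative assumptions and are not taken from the pistachio assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed length of all sequences.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example (hypothetical contig lengths, not real data).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Because the statistic depends only on the length distribution, it measures continuity and says nothing about correctness, which is why completeness checks such as CEGMA and BUSCO are reported alongside it.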
We first performed a comparative genomic investigation to assess the palaeohistory of this species. Phylogenomic analysis using genes extracted from single-copy families in nine plant genomes indicated that the pistachio diverged from Citrus sinensis ~ 58 million years ago and from Populus trichocarpa ~ 105 million years ago (Additional file 1: Figure S1). Analysis of fourfold degenerate third-codon transversion sites demonstrated that the pistachio genome had not experienced a lineage-specific whole genome duplication subsequent to its divergence from these species (Additional file 1: Figure S2). We also performed a genomic synteny analysis by aligning the pistachio genome to the genome of the basal angiosperm Amborella trichopoda. The macrosynteny analyses showed that each Amborella region had up to three pistachio regions, while each pistachio region had up to two P. trichocarpa regions (Fig. 1b, Additional file 1: Figure S3). The synteny analyses support the conclusion that no lineage-specific genome duplication occurred in pistachio, that pistachio shares the gamma duplication that occurred within eudicots, and that Populus experienced a lineage-specific genome duplication event [7, 8].
Expanded gene families related with stress adaptation of pistachio
To reveal the genetic basis underpinning the pistachio phenotype (e.g., salt tolerance), we investigated the evolution of gene families by identifying unique and shared gene families among different plants using OrthoMCL. Among the gene families identified in P. vera, 9735 were shared with Arabidopsis thaliana, C. sinensis, Gossypium raimondii, and Vitis vinifera, while 707 gene families, containing 1381 genes, were specific to pistachio (Additional file 1: Figure S4). To assess the function of these genes, we performed Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses using the DAVID and g:Profiler programs. Both programs found many genes involved in “defense response” (GO: 0006952, Additional file 2: Table S7), which included many genes containing the NB-ARC domain and the NBS-LRR domain. Genes of this kind are well known for their role in disease resistance in plants and are plausibly important for the defense response in pistachio.
Next, we examined the expansion and contraction of gene families in pistachio (Fig. 1c). As it is difficult to reach conclusions concerning adaptation from contraction in gene family size, or from genes that were not successfully assembled in this reference genome, we only analyzed the expanded gene families. Gene enrichment analysis of the expanded gene families found them to be enriched in categories of metabolism, such as the biosynthesis of terpenoids, flavonoids, sesquiterpenoids, and alkaloids (Additional file 2: Table S8). The expansion of gene families occurs over long-term evolution and drives the evolutionary difference between Pistacia and Citrus, rather than arising during the very short-term evolution of pistachio domestication from the wild. Therefore, we propose that the expansion of genes in the above categories is probably related to the metabolism of organic compounds found in wild Pistacia species. Phytochemical screening of wild Pistacia species found many phytochemicals such as alkaloids, flavonoids, coumarins, sterols, tannins, terpenoids, and sesquiterpenes [12–14].
In addition, the enriched term “oxidation-reduction process” (GO: 0055114, P ≪ 0.001) contains many cytochrome P450 genes, which encode proteins that are involved in multiple metabolic pathways with complex functions and that play important roles in multiple processes, particularly stress responses. To assess the function of these genes, we used BLASTP to search the A. thaliana proteome and identified the best-hit genes (E value < 1e−10) (Additional file 2: Table S9). Among the 187 cytochrome P450 genes, we found that many probably function in salt tolerance. For example, a previous study found that elevated levels of CYP94 family gene expression alleviate the jasmonate response and enhance salt tolerance in rice. Among the expanded gene families in pistachio, there are 14 members of CYP94 genes. In soybean, CYP82A3 is involved in the jasmonic acid and ethylene signaling pathway and enhances resistance to salinity and drought, and there are 20 members of CYP82 genes among the expanded gene families in pistachio. Ectopic expression of P. trichocarpa CYP714A3 enhanced salt tolerance in rice, and there are 10 members of CYP714A genes among the expanded gene families in pistachio. Thus, it is likely that some cytochrome P450 genes are responsible for salt tolerance in pistachio.
RNA sequencing reveals a genetic mechanism underlying salt adaptation of pistachio
Differentially expressed genes were also found to be enriched in “chitin binding” (P = 0.03 after correction, Additional file 2: Table S10), with four genes encoding chitinases (CHIB, EP3, ChiC, AT2G43590). Plant chitinases are involved in diverse biological systems. Some chitinases in plants are expressed in response to environmental stresses (e.g., high salt concentration, cold, and drought) and can be upregulated by phytohormones such as ethylene, jasmonic acid, and salicylic acid [24, 25]. For example, the gene ChiC encodes a class V chitinase, and its expression can be induced by jasmonic acid and by salinity stress in Arabidopsis thaliana. CHIB encodes a basic chitinase involved in the jasmonic acid-mediated signaling pathway. Our transcriptomic analysis suggests that genes encoding chitinases and those involved in the JA biosynthetic pathway likely contribute to the adaptation of pistachio to saline environments.
Admixture occurred among different wild relatives
A two-step domestication of pistachio
Genetic mechanisms underlying pistachio domestication
In this study, we assembled a draft reference genome and used comparative genomics to reveal the genetic underpinning of salt tolerance in pistachio, one of the most important commercial nut crops cultivated in desert regions. This reference genome will enable future evolutionary and ecological research on pistachio. We also generated a genome-wide dataset of SNPs from 93 domestic and 14 wild individuals and identified some genes for agronomically related traits, such as fruit weight and tree size, that might have been selected during the domestication of pistachio. Sequence information from diverse cultivar accessions should be helpful to researchers and breeders in genome-wide association mapping studies of agronomically related traits and should support marker-assisted and genomic selection-based approaches for plant improvement. Insight into the environmental adaptations and economic characters of this crop should facilitate planting and breeding programs for different desert regions, which might contribute to easing the world’s food crisis.
Illumina sequencing for de novo genome
An individual of P. vera, cultivar name Batoury, a variety widely cultivated in China, was chosen for genome sequencing and assembly. Genomic DNA was extracted using the DNeasy Plant Mini Kit (Qiagen). Multiple paired-end libraries, including two types of small-insert libraries (270 bp and 500 bp) and six types of long-insert mate-pair libraries (3 kb, 4 kb, 8 kb, 10 kb, 15 kb, and 17 kb), were constructed using the Illumina paired-end and mate-pair kits according to the manufacturer’s instructions.
The libraries were sequenced on the Illumina HiSeq 2500 platform. For the raw reads, sequencing adaptors were removed and contaminated reads (chloroplast, mitochondrial, bacterial and viral sequences, etc.) were screened by alignment to the NCBI NR database using BWA v0.7.13 with default parameters. FastUniq v1.1 was used to remove duplicated read pairs, and low-quality reads were filtered out under the following conditions: (1) reads with ≥ 10% unidentified nucleotides (N); (2) reads with > 10 nucleotides aligned to the adapter, allowing ≤ 10% mismatch; and (3) reads with > 50% of bases having Phred quality < 5. Finally, we generated a total of 98.08 Gb and 55.87 Gb of clean reads for the paired-end and the mate-pair libraries, respectively.
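The low-quality filtering conditions (1) and (3) above can be sketched as a simple per-read predicate. This is only an illustration under stated assumptions: the function name and defaults are hypothetical, quality values are taken as already-decoded Phred integers, and the adapter-alignment condition (2) is left out because it requires an aligner.

```python
def passes_quality_filter(seq, quals, max_n_frac=0.10, low_q=5, max_low_frac=0.50):
    """Return True if a read survives conditions (1) and (3).

    seq   : read sequence as a string
    quals : list of integer Phred scores, one per base
    A read is discarded when >= 10% of its bases are N, or when more than
    50% of its bases have Phred quality below 5.
    """
    if not seq or len(seq) != len(quals):
        raise ValueError("seq and quals must be non-empty and of equal length")
    n_frac = seq.upper().count("N") / len(seq)
    if n_frac >= max_n_frac:          # condition (1)
        return False
    low_frac = sum(q < low_q for q in quals) / len(quals)
    if low_frac > max_low_frac:       # condition (3)
        return False
    return True


# Toy reads (hypothetical): a clean read and a read with 10% N.
print(passes_quality_filter("ACGTACGTAC", [30] * 10))
print(passes_quality_filter("NCGTACGTAC", [30] * 10))
```

Note the asymmetry in the thresholds as stated in the text: the N fraction is tested with "greater than or equal", whereas the low-quality fraction is tested with a strict "greater than".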
PacBio sequencing and assembly
Single-molecule sequencing was performed on the PacBio Sequel System, which yielded a total of 4,038,150 filtered subreads with an average length of 14,568 bp for P. vera. Only PacBio subreads of 500 bp or longer were used for gap filling. Starting from the ALLPATHS-LG assembly, we used PBJelly to fill gaps with the filtered PacBio subreads to improve the genome assembly, with the option “<blasr> -minMatch 8 -minPctIdentity 70 -bestn 1 -nCandidates 20 -maxScore -500 -noSplitSubreads </blasr>”. The final polishing was completed by Pilon v1.22 (available at https://github.com/broadinstitute/pilon) using Illumina data with the parameters “--mindepth 10 --changes --threads 4 --fix bases”.
Genome size estimation
A total of 58.80 Gb of Illumina reads from the 270 bp library were selected for genome size estimation. The distribution of 21-mers showed a major peak at 84×. Based on the total number of k-mers and the corresponding k-mer depth of 84, the pistachio genome size was estimated to be ~ 519.17 Mb using the formula: Genome size = kmer_Number/Peak_Depth.
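The formula above is a single division, sketched below. The total k-mer count is not reported in the text, so the value used here is back-calculated from the reported estimate (519.17 Mb) and peak depth (84) purely to show that the numbers are self-consistent; it is an assumption, not a reported figure, and the function name is hypothetical.

```python
def estimate_genome_size(kmer_number, peak_depth):
    """Genome size = kmer_Number / Peak_Depth (both in k-mer units)."""
    if peak_depth <= 0:
        raise ValueError("peak_depth must be positive")
    return kmer_number / peak_depth


peak_depth = 84                 # major 21-mer peak reported in the text
reported_size_bp = 519.17e6     # genome size estimate reported in the text

# Back-calculated total k-mer count (an assumption, roughly 4.36e10).
implied_kmers = reported_size_bp * peak_depth

size_bp = estimate_genome_size(implied_kmers, peak_depth)
print(f"{size_bp / 1e6:.2f} Mb")
```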
De novo assembly and assessment
The whole genome was de novo assembled into contigs using ALLPATHS-LG with the default parameters. Because highly heterozygous genomic regions are generally recovered as alternative contigs, REDUNDANS v.013b was used to identify heterozygous contigs based on at least 85% identity and an overlap of at least 66% of the shorter sequence length in each pairwise comparison, with only the longer of the two redundant contigs retained. Finally, adjacent contigs connected by mate-pair information were joined into scaffolds using SSPACE v2.3, and gaps were filled using GapCloser v1.12 from the SOAPdenovo package.
After calling variants from the corresponding sequencing data, the SNP rate, excluding sites attributable to genome heterozygosity, was taken as an estimate of the error rate of the genome assembly. Completeness of the assembly was assessed by mapping 458 conserved core eukaryotic genes (CEGs) and 248 highly conserved CEGs to the genome using CEGMA v2.5.
The amount of the assembly composed of repeats was estimated by building a repeat library with the de novo prediction programs LTR-FINDER, MITE-Hunter, RepeatScout v1.0.5, and PILER-DF. The new repetitive elements were classified using PASTEClassifier v1.0 and combined with the Repbase database v20.01 to create the final repeat library. Repeat sequences in the pistachio genome were identified and classified using the RepeatMasker program v4.0.6. The criterion used for LTR family classification was that the 5′LTR sequences of members of the same family share at least 80% identity over at least 80% of their length.
Protein-coding genes were predicted using de novo and protein homology-based approaches. Genscan v1.0, Augustus v2.5.5, GlimmerHMM v3.0.1, GeneID v1.3, and SNAP were used for de novo gene prediction. Homologous peptides from the A. thaliana (TAIR 10), Oryza sativa (Nipponbare, IRGSP-1.0), Theobroma cacao (Phytozome v12.1), and C. sinensis (Phytozome v12.1) genomes were aligned to our assembly to identify homologous genes with GeMoMa v1.4.2. The RNA-Seq reads were assembled de novo into unigenes using Trinity, the resulting unigenes were aligned to the repeat-masked assembly using BLAT, and the gene structures of the BLAT alignments were modeled using PASA; the protein-coding regions were then identified with TransDecoder v3.0.1 and GeneMarkS-T. Consensus gene models were generated by integrating the de novo predictions and protein alignments using EVidenceModeler. Annotation of the predicted genes was performed by blasting the gene (and predicted protein) sequences against a number of nucleotide and protein sequence databases, including COG, KEGG, NCBI-NR, and Swiss-Prot, with an E value cutoff of 1e−5 used to assess orthology.
rDNA genes were identified by aligning with rRNA template sequences (Pfam database v22.0) using BLAST with an E value of 1e−10 and an identity cutoff of 95% or more. The tRNAScan-SE v2.0 algorithm with default parameters was applied to predict tRNA genes. miRNA genes were predicted using INFERNAL v1.1 software with the Rfam database v11.0 and a cutoff score of 30 or more. The minimum cutoff score was based on the settings which yielded a false-positive rate of 30 bits.
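The redundancy criterion described above (at least 85% identity, overlap of at least 66% of the shorter sequence, keep the longer contig) can be expressed as a small predicate. This is a sketch of the stated thresholds only, not the REDUNDANS implementation; function names, argument names, and the tie-breaking rule are assumptions.

```python
def is_redundant_pair(identity, overlap_len, len_a, len_b,
                      min_identity=0.85, min_overlap=0.66):
    """Return True if two contigs meet the heterozygous-redundancy criterion.

    identity    : fractional identity of the pairwise alignment (0..1)
    overlap_len : aligned length in bp
    len_a/len_b : lengths of the two contigs in bp
    """
    shorter = min(len_a, len_b)
    return identity >= min_identity and overlap_len >= min_overlap * shorter


def contig_to_drop(name_a, len_a, name_b, len_b):
    """Of a redundant pair, the shorter contig is dropped (ties drop the second)."""
    return name_b if len_a >= len_b else name_a


# Toy pair (hypothetical): a 1 kb contig aligned over 700 bp to a 5 kb contig.
if is_redundant_pair(0.90, 700, 1000, 5000):
    print("drop", contig_to_drop("ctg_short", 1000, "ctg_long", 5000))
```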
Comparative analysis of gene families
To identify homologous relationships among pistachio and related plants, the pistachio proteome was globally compared with the A. thaliana, C. sinensis, G. raimondii (Phytozome v12.1), and V. vinifera (Phytozome v12.1) proteomes filtered for transposable elements and alternative splicing. An all-against-all comparison was performed using BLASTP (E value = 1e−5) followed by clustering with OrthoMCL v2.0.9 (inflation = 1.5). Analysis of species-specific gene families was made with a Fisher’s exact test (P < 0.0001) on the Pfam domains.
A phylogenetic tree was constructed for 9 species: V. vinifera, Eucalyptus grandis (Phytozome v12.1), G. raimondii, Medicago sativa (Phytozome v12.1), Brassica rapa (Phytozome v12.1), A. thaliana, P. trichocarpa (http://ensemblgenomes.org, release 21), C. sinensis, and pistachio. The tree was built using PhyML software v3.0 based on 1096 shared single-copy genes. The divergence time between pistachio and the 8 other sequenced species (V. vinifera as the outgroup) was estimated using the MCMCtree program implemented in the PAML package v4.9. Calibration times were obtained from the TimeTree database (http://www.timetree.org/). Expansion and contraction of OrthoMCL-derived gene clusters were determined by a CAFE v2.1 calculation on the basis of changes in gene family size in the inferred phylogenetic history. KEGG and GO annotation of gene families was completed by aligning the genes to the KEGG database and the NCBI non-redundant database, respectively, using BlastP with an E value of 1e−5. BLAST2GO was used to obtain the associated GO terms. The enrichment score is defined as the hypergeometric test value.
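The text defines the enrichment score as a hypergeometric test value without giving further detail. A common reading, assumed here, is the upper-tail probability of seeing at least the observed number of annotated genes in the gene set of interest. The sketch below uses only the standard library; the function name, argument names, and numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Upper-tail hypergeometric probability P(X >= hits).

    population : total number of genes in the background
    annotated  : background genes carrying the term (e.g., one GO term)
    sample     : number of genes in the set being tested
    hits       : genes in the set that carry the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy example: 10 background genes, 4 annotated, 3 sampled, 2 of them annotated.
print(hypergeom_enrichment_p(10, 4, 3, 2))
```

In practice such P values are usually corrected for the number of terms tested, as the text does for the "chitin binding" category.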
Whole genome duplication analysis
The all-against-all BLASTP method (E value < 1e−5) was used to detect paralogous genes in pistachio, C. sinensis, V. vinifera, and P. trichocarpa, as well as orthologous genes between pistachio and C. sinensis and between pistachio and V. vinifera. Homologous blocks were then detected using Mcscan v1.1. The synonymous substitution (Ks) and fourfold degenerate site transversion (4DTV) values of the blocks were calculated using the HKY model.
Syntenic depth refers to the number of times a genomic region (or genome) is syntenic to the regions in another genome. We performed synteny searches to compare the pistachio genome structure with the A. trichopoda (ε-WGD) (http://amborella.huck.psu.edu, version 1.0) and P. trichocarpa (ε-WGD, γ-WGD, and β-WGD) genomes.
A greenhouse experiment was conducted in 2016 at the Pistachio Research Center, Rafsanjan, Iran (30° 24′ 49.0″ N, 55° 59′ 20.8″ E, 1528 masl). Seeds of the cultivated pistachio rootstock P. vera L. cv. Ohadi and of the wild type (Sarakhs) were surface sterilized with a 5% solution of sodium hypochlorite in distilled water and then incubated at 30 °C on a sterile moist cloth for 1 week. Ten germinated seeds were sown at a depth of about 2 cm in each pot, which was filled with about 7 kg of sandy-loam soil. The number of seedlings per pot was reduced to three uniform seedlings when the emergence period was complete, about 3 weeks after planting.
The experiment consisted of a completely randomized design with three replications for a total of six pots. All pots were irrigated for 2 months using Rafsanjan normal urban water (with a salinity of 0.6 dS m−1) at field capacity. Two months after planting, when seedlings were well established, salinity treatment was imposed on half of the pots (salinity group) by irrigation with water with an electrical conductivity of 4.5 dS m−1, while the other half of the pots (control group) were irrigated using the Rafsanjan normal urban water throughout the entire experiment. Salinity stress continued for 1 month, until symptoms of salinity toxicity in the leaves (burned edges caused by salt toxicity) were observed. Leaf samples were taken before applying the salinity stress, when the seedlings were 2 months of age. Both leaf and root samples from all replicates were taken at the end of the experiment, when the seedlings were 3 months of age. The fresh leaf and root tissue samples were immediately dissected, submerged in five volumes of RNAlater® solution, and stored in a − 80 °C freezer for subsequent RNA extraction. During the experiment, the maximum temperature was 34 ± 5 °C, the minimum temperature was 21 ± 4 °C, and the relative humidity was 40 ± 5%.
The leaves and roots from the three salinity-treated samples of cultivar, three samples under control condition of cultivar, and three samples under control condition of wild were prepared for RNA sequencing. Total RNA was extracted using the RNAplant Plus Reagent according to the manufacturers’ instructions (Tiangen, Beijing, China). Before library construction, we evaluated the degradation of the RNA on a 1% agarose gel and checked the purity using the Qubit® 3.0 Fluorometer (Life Technologies, CA, USA), with integrity and concentration assessed using an RNA Nano 6000 Assay Kit on the Bioanalyzer 2100 system (Agilent Technologies, CA, USA). Sequencing libraries were generated using NEBNext® Ultra™ RNA Library Prep Kit for Illumina® (#E7530L, NEB, USA) following the manufacturer’s recommendations and then sequenced on the Illumina HiSeq X Ten platform to generate 150 bp paired-end reads.
Before alignment, reads were trimmed based on their quality scores using the quality trimming program Btrim. Reads were aligned to our de novo genome of Pistacia vera L. using TopHat (v2.1.1) and then assembled using Cufflinks (v2.2.1 with the -G parameter). Differential expression of genes in the different tissues was calculated using Cuffdiff.
Fruit size-related traits for pistachio were measured based on the pistachio descriptor (IPGRI. 1997. Descriptors for pistachio (P. vera L.), International Plant Genetic Resources Institute, Rome, Italy). The following phenotypes were recorded:
- dried pistachio fruit weight (g): the mean weight of 100 healthy dry nuts
- fresh fruit weight with green skin (g): the mean weight of 100 healthy fresh nuts with green skin
- dried pistachio fruit length (mm): average of 20 healthy nuts, measured from the most distant points along the main seed axis
- dried pistachio fruit diameter (mm): average of 20 healthy nuts, measured at the widest part perpendicular to the suture
- dried pistachio fruit width (mm): average of 20 healthy nuts, measured from the widest points perpendicular to the main seed axis
- dried pistachio fruit and kernel shape: roundish = 1, ovoid = 2, elongated = 3, narrowly cordate = 4, cordate = 5
- dried kernel weight (g): average weight of 100 healthy dry kernels
- kernel diameter (mm): average of 20 healthy kernels, measured at the widest part perpendicular to the cotyledon suture
- kernel width (mm): average of 20 healthy kernels, measured on the widest points perpendicular to the main seed axis
- kernel length (mm): average of 20 healthy kernels, measured from the most distant points along the main seed axis
Plant material and DNA extraction for genome resequencing
Fresh leaves (4–5 g) were sampled from the germplasm collections of the Pistachio Research Institute in Rafsanjan, Iran: 30° 24′ 49.0″ N, 55° 59′ 20.8″ E, at 1528 masl; the pistachio germplasm of Ardakan, Iran: 32° 18′ 36″ N, 54° 1′ 3″ E, at 1040 masl; and Jiroft, Iran: 28° 40′ 41″ N, 57° 44′ 26″ E, at 650 masl. Leaf tissues were harvested during the 2015–2017 period and were transported on ice and stored at − 80 °C in the Biotechnology Laboratory, Animal Science Department, Shahid Bahonar University of Kerman, Iran, until subjected to DNA extraction.
Total genomic DNA was extracted from 1 g fresh leaves using hexadecyl trimethyl ammonium bromide (CTAB) protocol with some modifications. The quantity and quality of isolated DNA were assessed by NanoDrop spectrophotometer and 1% agarose gel electrophoresis, looking for a single absorbance peak at 260 nm, a 260/280 absorbance ratio of 1.8–2.0, and no evidence of substantial band shearing or contamination. Extracted DNA was dissolved in 20 μl TE buffer (10 mM Tris, pH 8, 1 mM EDTA) and stored at − 20 °C for subsequent NGS analysis.
Genome resequencing and SNP calling
Among the above samples, 93 domestic and 14 wild pistachio individuals and another 13 P. mutica, 13 P. khinjuk, 4 P. integerrima, and 5 P. palaestina were chosen for genome resequencing. Ten micrograms of genomic DNA, prepared by the standard CTAB extraction protocol, was used to construct libraries with 350 bp insert size. Sequence libraries were constructed according to the Illumina library preparation pipeline and were sequenced on the Illumina Hiseq 4000 platform to generate 150 bp paired-end reads.
We mapped the reads to our reference genome (version 1) with BWA-MEM. After sorting and duplicate marking of the BAM files with Picard tools 1.56 (http://picard.sourceforge.net), we called SNPs using the Genome Analysis Toolkit (GATK). The criteria used to filter the raw SNPs were “QUAL < 40.0, MQ < 25.0, MQ0 >= 4 && ((MQ0/(1.0*DP)) > 0.1), -cluster 3 -window 10”. We ignored multi-nucleotide polymorphisms and required each SNP locus to be present in at least 90% of individuals. A total of 14,767,700 high-quality SNPs were identified and used in the subsequent analyses.
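The site-level filters above can be read as a predicate on the annotation values of each raw SNP. The sketch below assumes that the comma-separated expressions are independent failure conditions (a site is removed if any one is true) and that a missing genotype is represented as None; it does not implement the SNP-cluster option ("-cluster 3 -window 10"). Function and argument names are hypothetical.

```python
def fails_hard_filter(qual, mq, mq0, dp):
    """Return True if a raw SNP is removed by the quality expressions.

    qual : site QUAL score
    mq   : RMS mapping quality (MQ)
    mq0  : number of reads with mapping quality zero (MQ0)
    dp   : total read depth (DP)
    """
    if qual < 40.0 or mq < 25.0:
        return True
    # MQ0 >= 4 && ((MQ0 / (1.0 * DP)) > 0.1)
    return mq0 >= 4 and dp > 0 and (mq0 / (1.0 * dp)) > 0.1


def call_rate_ok(genotypes, min_rate=0.90):
    """Return True if the locus is genotyped in at least 90% of individuals."""
    if not genotypes:
        raise ValueError("genotypes must not be empty")
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes) >= min_rate


# Toy site (hypothetical values): good quality, genotyped in 9 of 10 samples.
keep = (not fails_hard_filter(50.0, 60.0, 0, 30)) and call_rate_ok([0] * 9 + [None])
print(keep)
```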
Phylogenetic relationship of resequenced pistachio
We constructed a neighbor-joining tree from a p-distance matrix using TreeBeST (http://treesoft.sourceforge.net/treebest.shtml) with 100 bootstrap replications. Principal component analysis (PCA) was performed using the R package SNPRelate. We ran frappe to estimate the genetic ancestry of each sample, specifying a range of 2–5 hypothetical ancestral populations. The maximum number of iterations was set to 10,000 in the frappe analysis.
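A p-distance is simply the proportion of compared sites at which two samples differ. The sketch below builds such a matrix for haploid allele strings; how heterozygous genotypes were encoded for TreeBeST is not described in the text, so the haploid representation, the missing-data symbol, and the function names are all assumptions made for illustration.

```python
def p_distance(seq_a, seq_b, missing="N"):
    """Proportion of differing sites, skipping sites missing in either sample."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    compared = 0
    diffs = 0
    for a, b in zip(seq_a, seq_b):
        if a == missing or b == missing:
            continue
        compared += 1
        if a != b:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def distance_matrix(seqs):
    """Symmetric matrix of pairwise p-distances for a list of sequences."""
    n = len(seqs)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = p_distance(seqs[i], seqs[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Toy SNP strings for three hypothetical samples.
toy = ["ACGT", "ACGA", "TCGA"]
for row in distance_matrix(toy):
    print(row)
```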
Artificial selection analysis
We calculated the genome-wide distribution of the population fixation statistic FST and of the ratios of nucleotide diversity θπ and Watterson’s estimator θw in sliding windows with a window size of 50 kb and a step size of 25 kb (windows with fewer than 20 variants were ignored). Putative selection targets were the windows falling in the top 5% for both the log θπ ratio and FST. Our approach was to identify genomic regions with high differentiation between cultivated and wild Pistacia vera (ncultivar = 13, nwild = 14).
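The window layout and the two diversity estimators can be sketched as follows. The text does not say which software computed θπ and θw, so the formulas below are the standard textbook definitions for biallelic sites, assumed here for illustration; the function names and toy inputs are hypothetical, and values are per window rather than per site.

```python
def sliding_windows(seq_len, size=50_000, step=25_000):
    """Yield (start, end) half-open windows of `size` bp every `step` bp."""
    start = 0
    while start < seq_len:
        yield start, min(start + size, seq_len)
        start += step


def theta_pi(alt_counts, n):
    """Nucleotide diversity: sum over sites of 2k(n-k) / (n(n-1)).

    alt_counts : alternate-allele count k at each biallelic site
    n          : number of sampled chromosomes
    """
    return sum(2 * k * (n - k) / (n * (n - 1)) for k in alt_counts)


def theta_w(alt_counts, n):
    """Watterson's estimator: segregating sites divided by a_n = sum(1/i)."""
    segregating = sum(0 < k < n for k in alt_counts)
    a_n = sum(1 / i for i in range(1, n))
    return segregating / a_n


# Toy example: a 100 kb scaffold and two sites in a sample of 4 chromosomes.
print(list(sliding_windows(100_000)))
print(theta_pi([2, 2], 4), theta_w([2, 2], 4))
```

With a 25 kb step and a 50 kb window, consecutive windows overlap by half, so neighboring outlier windows are not independent signals.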
Analysis of genetic introgression
We inferred gene flow between the diverged populations using the maximum likelihood method implemented in TreeMix. First, we inferred the maximum likelihood (ML) tree with the command “-i input -bootstrap -o output”. Second, from one to four migration events were added stepwise to the ML tree of the five species with the command “-i input -bootstrap -k 1000 -m migration events -o output”. Genetic introgression was also analyzed by the D statistic (ABBA-BABA test).
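The D statistic compares counts of two discordant site patterns across four taxa (P1, P2, P3, outgroup): D = (ABBA − BABA) / (ABBA + BABA), where A is the outgroup (ancestral) allele and B the derived allele. The text gives no details of how D was computed, so the sketch below is a simplified illustration that assumes one haploid sequence per taxon; the function name and toy sequences are hypothetical.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Patterson's D from one haploid sequence per taxon.

    ABBA sites: P1 ancestral, P2 and P3 derived.
    BABA sites: P2 ancestral, P1 and P3 derived.
    Only biallelic sites where P3 carries the derived allele are informative.
    """
    abba = 0
    baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if len({a, b, c, o}) != 2 or c == o:
            continue
        if a == o and b == c:
            abba += 1
        elif b == o and a == c:
            baba += 1
    if abba + baba == 0:
        raise ValueError("no ABBA or BABA sites found")
    return (abba - baba) / (abba + baba)


# Toy alignment: two ABBA sites, one BABA site, one uninformative site.
print(d_statistic("AATA", "TTAA", "TTTT", "AAAA"))
```

A positive D indicates an excess of allele sharing between P2 and P3, a negative D an excess between P1 and P3; significance is usually assessed with a block jackknife, which is not shown here.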
Recent demographic history inference using δaδi
To uncover the recent demographic history of the cultivated and wild P. vera, we only considered SNPs with more than 40-fold sequencing coverage at the population level in intergenic regions to ensure their neutrality. Missing genotypes were imputed using the program BEAGLE. After investigating the empirical distributions of the MAFs, haplotypes were inferred for all genotype sites with MAF > 0.01. Two divergence models were considered between the two populations of P. vera. The model with the maximum log-likelihood value was chosen as the optimal one.
GO and KEGG enrichment analysis
This work was supported by the Animal Branch of the Germplasm Bank of Wild Species, Chinese Academy of Sciences (the Large Research Infrastructure Funding) and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB13020600).
Availability of data and materials
All the sequences reported in this study are deposited in the Genome Sequence Archive database under Accession ID CRA000978 (http://bigd.big.ac.cn/search?dbId=gsa&q=CRA000978) and in the NCBI Sequence Read Archive under Accession ID PRJNA526975 (http://www.ncbi.nlm.nih.gov/bioproject/526975).
D-DW and AE led the project and conceived and designed the study. D-DW, LZ, X-LT, and DMI prepared the manuscript. D-DW, LZ, X-LT, HD, F-MH, M-SW, X-LL, L-LJ, ML, HZ, and H-KZ performed the data analysis. B-SL, HAN, AT, and MM performed some of the sampling and experiments. All authors read and approved the final manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Walker AS. What is a desert? In: Deserts: geology and resources. General Interest Publication. 1992;3–6.
- Moazzam Jazi M, Seyedi SM, Ebrahimie E, Ebrahimi M, De Moro G, Botanga C. A genome-wide transcriptome map of pistachio (Pistacia vera L.) provides novel insights into salinity-related genes and marker discovery. BMC Genomics. 2017;18:627.
- Ziya Motalebipour E, Kafkas S, Khodaeiaminjan M, Çoban N, Gözel H. Genome survey of pistachio (Pistacia vera L.) by next generation sequencing: development of novel SSR markers and genetic diversity in Pistacia species. BMC Genomics. 2016;17:998.
- Yang X, Yue Y, Li H, Ding W, Chen G, Shi T, Chen J, Park MS, Chen F, Wang L. The chromosome-level quality genome provides insights into the evolution of the biosynthesis genes for aroma compounds of Osmanthus fragrans. Horticulture Res. 2018;5:72.
- Wei C, Yang H, Wang S, Zhao J, Liu C, Gao L, Xia E, Lu Y, Tai Y, She G, et al. Draft genome sequence of Camellia sinensis var. sinensis provides insights into the evolution of the tea genome and tea quality. Proc Natl Acad Sci U S A. 2018;115:E4151.
- Yan L, Wang X, Liu H, Tian Y, Lian J, Yang R, Hao S, Wang X, Yang S, Li Q, et al. The Genome of Dendrobium officinale illuminates the biology of the important traditional Chinese orchid herb. Mol Plant. 2015;8:922–34.
- Amborella Genome Project. The Amborella genome and the evolution of flowering plants. Science. 2013;342:1241089.
- Tuskan GA, DiFazio S, Jansson S, Bohlmann J, Grigoriev I, Hellsten U, Putnam N, Ralph S, Rombauts S, Salamov A, et al. The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science. 2006;313:1596–604.
- Li L, Stoeckert CJ Jr, Roos DS. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 2003;13:2178–89.
- McHale L, Tan X, Koehl P, Michelmore RW. Plant NBS-LRR proteins: adaptable guards. Genome Biol. 2006;7:212.
- Wang M-S, Yang H-C, Otecko NO, Wu D-D, Zhang Y-P. Olfactory genes in Tibetan wild boar. Nat Genet. 2016;48:972.
- Uddin G, Rauf A, Rehman T, Qaisar M. Phytochemical screening of Pistacia chinensis var. integerrima. Middle-East J Sci Res. 2011;7:707–11.
- Belhachat D, Aid F, Mekimene L, Belhachat M. Phytochemical screening and in vitro antioxidant activity of Pistacia lentiscus berries ethanolic extract growing in Algeria. Mediterr J Nutr Metab. 2017;10:273–85.
- Rand K, Bar E, Ben-Ari M, Lewinsohn E, Inbar M. The mono- and sesquiterpene content of aphid-induced galls on Pistacia palaestina is not a simple reflection of their composition in intact leaves. J Chem Ecol. 2014;40:632–42.
- Ogawa D, Kurotani K-I, Hayashi K, Hatanaka S, Hattori T, Toda Y, Takeda S, Ichikawa H, Ueda M, Tashita R, et al. Elevated levels of CYP94 family gene expression alleviate the jasmonate response and enhance salt tolerance in rice. Plant Cell Physiol. 2015;56:779–89.
- Yan Q, Cui X, Lin S, Gan S, Xing H, Dou D. GmCYP82A3, a soybean cytochrome P450 family gene involved in the jasmonic acid and ethylene signaling pathway, enhances plant resistance to biotic and abiotic stresses. PLoS One. 2016;11:e0162253.
- Wang C, Yang Y, Wang H, Ran X, Li B, Zhang J, Zhang H. Ectopic expression of a cytochrome P450 monooxygenase gene PtCYP714A3 from Populus trichocarpa reduces shoot growth and improves tolerance to salt stress in transgenic rice. Plant Biotechnol J. 2016;14:1838–51.
- Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, Pimentel H, Salzberg SL, Rinn JL, Pachter L. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc. 2012;7:562.
- Park JH, Halitschke R, Kim HB, Baldwin IT, Feldmann KA, Feyereisen R. A knock-out mutation in allene oxide synthase results in male sterility and defective wound signal transduction in Arabidopsis due to a block in jasmonic acid biosynthesis. Plant J. 2002;31:1–12.
- Moons A, Prinsen E, Bauw G, Van Montagu M. Antagonistic effects of abscisic acid and jasmonates on salt stress-inducible transcripts in rice roots. Plant Cell. 1997;9:2243–59.
- Du H, Liu H, Xiong L. Endogenous auxin and jasmonic acid levels are differentially modulated by abiotic stresses in rice. Front Plant Sci. 2013;4:397.
- Wang Y, Mopper S, Hasenstein KH. Effects of salinity on endogenous ABA, IAA, JA, and SA in Iris hexagona. J Chem Ecol. 2001;27:327–42.
- Wasternack C, Hause B. A bypass in jasmonate biosynthesis-the OPR3-independent formation. Trends Plant Sci. 2018;S1360–1385(1318):30042–6.
- Kasprzewska A. Plant chitinases--regulation and function. Cell Mol Biol Lett. 2003;8:809–24.
- Grover A. Plant chitinases: genetic diversity and physiological roles. Crit Rev Plant Sci. 2012;31:57–73.
- Ohnuma T, Numata T, Osawa T, Mizuhara M, Lampela O, Juffer AH, Skriver K, Fukamizo T. A class V chitinase from Arabidopsis thaliana: gene responses, enzymatic properties, and crystallographic analysis. Planta. 2011;234:123–37.
- Zander M, La Camera S, Lamotte O, Métraux J-P, Gatz C. Arabidopsis thaliana class-II TGA transcription factors are essential activators of jasmonic acid/ethylene-induced defense responses. Plant J. 2010;61:200–10.
- McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20:1297–303.
- Pickrell JK, Pritchard JK. Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genet. 2012;8:e1002967.
- Vigueira CC, Olsen KM, Caicedo AL. The red queen in the corn: agricultural weeds as models of rapid adaptive evolution. Heredity. 2012;110:303.
- Vieira J, Vieira CP. On the identification of human selected loci in grapevines. Heredity. 2009;104:327.
- Kashaninejad M, Tabil LG. Pistachio (Pistacia vera L.). In: Yahia EM, editor. Postharvest Biology and Technology of Tropical and Subtropical Fruits. Woodhead Publishing. 2011;218–247.
- Guilfoyle TJ, Hagen G. Auxin response factors. Curr Opin Plant Biol. 2007;10:453–60.
- Xu X, Liu X, Ge S, Jensen JD, Hu F, Li X, Dong Y, Gutenkunst RN, Fang L, Huang L, et al. Resequencing 50 accessions of cultivated and wild rice yields markers for identifying agronomically important genes. Nat Biotechnol. 2011;30:105–11.
- Avni R, Nave M, Barad O, Baruch K, Twardziok SO, Gundlach H, Hale I, Mascher M, Spannagl M, Wiebe K, et al. Wild emmer genome architecture and diversity elucidate wheat evolution and domestication. Science. 2017;357:93–7.
- Collins C, Dewitte W, Murray JAH. D-type cyclins control cell division and developmental rate during Arabidopsis seed development. J Exp Bot. 2012;63:3571–86.
- Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60.
- Xu H, Luo X, Qian J, Pang X, Song J, Qian G, Chen J, Chen S. FastUniq: a fast de novo duplicates removal tool for paired short reads. PLoS One. 2012;7:e52249.
- English AC, Richards S, Han Y, Wang M, Vee V, Qu J, Qin X, Muzny DM, Reid JG, Worley KC, Gibbs RA. Mind the gap: upgrading genomes with Pacific Biosciences RS long-read sequencing technology. PLoS One. 2012;7:e47768.
- Gnerre S, Maccallum I, Przybylski D, Ribeiro FJ, Burton JN, Walker BJ, Sharpe T, Hall G, Shea TP, Sykes S, et al. High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc Natl Acad Sci U S A. 2011;108:1513–8.
- Pryszcz LP, Gabaldon T. Redundans: an assembly pipeline for highly heterozygous genomes. Nucleic Acids Res. 2016;44:e113.
- Boetzer M, Henkel CV, Jansen HJ, Butler D, Pirovano W. Scaffolding pre-assembled contigs using SSPACE. Bioinformatics. 2011;27:578–9.
- Luo R, Liu B, Xie Y, Li Z, Huang W, Yuan J, He G, Chen Y, Pan Q, Liu Y, et al. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. Gigascience. 2012;1:18.
- Parra G, Bradnam K, Korf I. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics. 2007;23:1061–7.
- Xu Z, Wang H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 2007;35:W265–8.
- Han Y, Wessler SR. MITE-Hunter: a program for discovering miniature inverted-repeat transposable elements from genomic sequences. Nucleic Acids Res. 2010;38:e199.
- Price AL, Jones NC, Pevzner PA. De novo identification of repeat families in large genomes. Bioinformatics. 2005;21(Suppl 1):i351–8.
- Edgar RC, Myers EW. PILER: identification and classification of genomic repeats. Bioinformatics. 2005;21(Suppl 1):i152–8.
- Wicker T, Sabot F, Hua-Van A, Bennetzen JL, Capy P, Chalhoub B, Flavell A, Leroy P, Morgante M, Panaud O, et al. A unified classification system for eukaryotic transposable elements. Nat Rev Genet. 2007;8:973–82.
- Bao W, Kojima KK, Kohany O. Repbase update, a database of repetitive elements in eukaryotic genomes. Mob DNA. 2015;6:11.
- Tarailo-Graovac M, Chen N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protoc Bioinformatics. 2009;Chapter 4:Unit 4 10.
- Burge C, Karlin S. Prediction of complete gene structures in human genomic DNA. J Mol Biol. 1997;268:78–94.
- Stanke M, Waack S. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics. 2003;19(Suppl 2):ii215–25.
- Majoros WH, Pertea M, Salzberg SL. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics. 2004;20:2878–9.
- Blanco E, Parra G, Guigó R. Using geneid to identify genes. Curr Protoc Bioinformatics. 2007;4(3):1–4.3. 28.
- Korf I. Gene finding in novel genomes. BMC Bioinformatics. 2004;5:59.
- Keilwagen J, Wenk M, Erickson JL, Schattat MH, Grau J, Hartung F. Using intron position conservation for homology-based gene prediction. Nucleic Acids Res. 2016;44:e89.
- Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011;29:644–52.
- Kent WJ. BLAT--the BLAST-like alignment tool. Genome Res. 2002;12:656–64.
- Haas BJ, Delcher AL, Mount SM, Wortman JR, Smith RK Jr, Hannick LI, Maiti R, Ronning CM, Rusch DB, Town CD, et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 2003;31:5654–66.
- Haas B, Papanicolaou A. TransDecoder (find coding regions within transcripts); 2016.
- Tang S, Lomsadze A, Borodovsky M. Identification of protein coding regions in RNA transcripts. Nucleic Acids Res. 2015;43:e78.
- Haas BJ, Salzberg SL, Zhu W, Pertea M, Allen JE, Orvis J, White O, Buell CR, Wortman JR. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol. 2008;9:R7.
- Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, Krylov DM, Mazumder R, Mekhedov SL, Nikolskaya AN, et al. The COG database: an updated version includes eukaryotes. BMC Bioinformatics. 2003;4:41.
- Kanehisa M, Goto S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 2000;28:27–30.
- Boeckmann B, Bairoch A, Apweiler R, Blatter MC, Estreicher A, Gasteiger E, Martin MJ, Michoud K, O’Donovan C, Phan I, et al. The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res. 2003;31:365–70.
- Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, Heger A, Hetherington K, Holm L, Mistry J, et al. Pfam: the protein families database. Nucleic Acids Res. 2014;42:D222–30.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–10.
- Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25:955–64.
- Nawrocki EP, Eddy SR. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics. 2013;29:2933–5.
- Griffiths-Jones S, Bateman A, Marshall M, Khanna A, Eddy SR. Rfam: an RNA family database. Nucleic Acids Res. 2003;31:439–41.
- Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W, Gascuel O. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol. 2010;59:307–21.
- Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24:1586–91.
- De Bie T, Cristianini N, Demuth JP, Hahn MW. CAFE: a computational tool for the study of gene family evolution. Bioinformatics. 2006;22:1269–71.
- Conesa A, Gotz S, Garcia-Gomez JM, Terol J, Talon M, Robles M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21:3674–6.
- Goebels F, Frishman D. Prediction of protein interaction types based on sequence and network features. BMC Syst Biol. 2013;7(Suppl 6):S5.
- Tang H, Bowers JE, Wang X, Ming R, Alam M, Paterson AH. Synteny and collinearity in plant genomes. Science. 2008;320:486–8.
- Hasegawa M, Kishino H, Yano T. Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J Mol Evol. 1985;22:160–74.
- Ibarra-Laclette E, Lyons E, Hernandez-Guzman G, Perez-Torres CA, Carretero-Paulet L, Chang TH, Lan T, Welch AJ, Juarez MJ, Simpson J, et al. Architecture and evolution of a minute plant genome. Nature. 2013;498:94–8.
- Kong Y. Btrim: a fast, lightweight adapter and quality trimming program for next-generation sequencing technologies. Genomics. 2011;98:152–3.
- Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv. 2013;arXiv:1303.3997v2.
- Weir BS, Cockerham CC. Estimating F-statistics for the analysis of population structure. Evolution. 1984;38:1358–70.
- Nei M, Li WH. Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc Natl Acad Sci U S A. 1979;76:5269–73.
- Green RE, Krause J, Briggs AW, Maricic T, Stenzel U, Kircher M, Patterson N, Li H, Zhai W, Fritz MH-Y, et al. A draft sequence of the Neandertal genome. Science. 2010;328:710–22.
- Browning SR, Browning BL. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am J Hum Genet. 2007;81:1084–97.
- Li H, Durbin R. Inference of human population history from individual whole-genome sequences. Nature. 2011;475:493–6.
- Parfitt DE, Badenes ML. Phylogeny of the genus Pistacia as determined from analysis of the chloroplast genome. Proc Natl Acad Sci U S A. 1997;94:7987–92.
- Zeng L, Wu D-D. genome and transcriptome of pistachio. Genome Sequence Archive. 2019; http://bigd.big.ac.cn/search?dbId=gsa&q=CRA000978. Accessed 2019.
- Zeng L, Wu D-D. genome and transcriptome of pistachio. Sequence Read Archive. 2019; http://www.ncbi.nlm.nih.gov/bioproject/526975. Accessed 2019.