- Open Access
Genomic signatures of near-extinction and rebirth of the crested ibis and other endangered bird species
Genome Biologyvolume 15, Article number: 557 (2014)
Nearly one-quarter of all avian species is either threatened or nearly threatened. Of these, 73 species are currently being rescued from going extinct in wildlife sanctuaries. One of the previously most critically-endangered is the crested ibis, Nipponia nippon. Once widespread across North-East Asia, by 1981 only seven individuals from two breeding pairs remained in the wild. The recovering crested ibis populations thus provide an excellent example for conservation genomics since every individual bird has been recruited for genomic and demographic studies.
Using high-quality genome sequences of multiple crested ibis individuals, its thriving co-habitant, the little egret, Egretta garzetta, and the recently sequenced genomes of 41 other avian species that are under various degrees of survival threats, including the bald eagle, we carry out comparative analyses for genomic signatures of near extinction events in association with environmental and behavioral attributes of species. We confirm that both loss of genetic diversity and enrichment of deleterious mutations of protein-coding genes contribute to the major genetic defects of the endangered species. We further identify that genetic inbreeding and loss-of-function genes in the crested ibis may all constitute genetic susceptibility to other factors including long-term climate change, over-hunting, and agrochemical overuse. We also establish a genome-wide DNA identification platform for molecular breeding and conservation practices, to facilitate sustainable recovery of endangered species.
These findings demonstrate common genomic signatures of population decline across avian species and pave a way for further effort in saving endangered species and enhancing conservation genomic efforts.
The International Union for Conservation of Nature (IUCN) and Bird Life Species has recognized over 20% of approximately 10,000 extant bird species as being threatened. As of 2014, the IUCN RedList has declaredfive, 1,373, and 959 species as extinct in the wild, threatened, and near threatened, respectively. Between 1988 and 2008, the conservation status of 235 species was upgraded to higher categories of endangerment, as compared to just 32 species that were downgraded . Furthermore, historical records document the extinction of at least 150 avian species since the 16th century. The principal threats leading to avian population decline have been linked to man-made environmental disasters, including over-hunting, habitat loss, pesticide abuse, and invasive species introduction . To combat the ongoing decline, conservation efforts have been made, such as protective legislation, habitat restoration, captive breeding, and reintroduction, and all are responsible for the successful recovery of 49 species that were near-extinct between 1994 and 2004 .
Recent conservation genetic studies - have demonstrated that small populations are susceptible to allelic drift, leading to allele loss/fixation, and the process can be accelerated by inbreeding. Likewise, in small captive populations, rapid genetic deterioration, such as inbreeding depression and genetic adaptation to artificial environment, can also occur . Deleterious mutation tends to accumulate due to reduced selective strength . Furthermore, extinction rate in small wild populations increases significantly as heterozygosity decreases . Several genetic studies have attempted to characterize this effect from conservation-related bottlenecks among avian species, albeit based on limited markers of allozymes or microsatellites ,. It has been proposed that studies using up-to-date and more informative markers at a genome-scale will be necessary .
One of the most recently endangered bird species in the world is the crested ibis (Nipponia Nippon; IUCN Red Data Book, BirdLife International 2001). It was once widespread across Northeast Asia, with a range encompassing China, Russia, Korea, and Japan until the 1860s (Figure 1a). Suffering from over-hunting and habitat loss, the crested ibis populations had finally collapsed in the late 19th and early 20th centuries, to the extent that it was thought to be completely extinct from the wild, when the last five birds were taken into captivity in Japan in 1981  (Figure 1b; Additional file 1: Tables S1, S2). Although extensive private and government-supported efforts were made to revive the species in Japan, the final five individuals taken into captivity died without producing offspring. However, through an extensive international survey effort aimed at saving the species from extinction, a remarkable discovery was made later in 1981. After over 17 years of no sightings in China (since one loner bird seen in the wild in 1964), seven individuals belonging to two breeding couples (four adults, of which one couple had three chicks) were found at the edge of their normal habitats in a remote location in the southern foothills of the Qinling Mountains (Yangxian County, Shanxi). To protect these seven birds, China immediately established the Yangxian Nature Reserve, and a conservation program for both wild and captive birds was started. Monitor stations were set up for every nest, GPS-based systems for individual identification and tracking, and guidelines that control pesticides and habitat destruction were introduced -. As a result, the two breeding pairs became what can be considered the ‘Adam and Eve’ for the recovered crested ibis population that has gone through two phases in over 33 years: initially, a small size (<100) between 1981 to 2001, and up to over 2,000 individuals currently (Figure 1b).
To provide genome-scale insights into the near-extinction and rescue, we sequenced the genomes of multiple individuals from both the crested ibis (n = 9; from the rescued population) and its co-habitant, not-endangered close-relative, the little egret (Egretta garzetta; n = 6; from the same order Pelecaniformes; diverged approximately 57 million years ago) . We compared their genome sequences with those of 41 other avian species described in companion publications in this issue and elsewhere , which include seven Endangered + Vulnerable (EV) species listed by the IUCN within the recent past (crowned crane, Balearica regulorum; macQueen’s bustard, Chlamydotis macqueenii; brown mesite, Mesitornis unicolor; kea, Nestor notabilis; dalmatian pelican, Pelecanus crispus; white-tailed eagle, Haliaeetus albicilla; and bald eagle, Haliaeetus leucocephalus) and 31 Least Concern (LC) species (Additional file 1: Table S3). We found common genomic signatures among the endangered or recently endangered species and that in the ibis was associated with feeding behavior, climate change, environmental hazard, and man-made disasters. We also found that the ibis populations are rapidly evolving and possess greater genetic diversity than expected in the recovery process. To better assist protection and recovery efforts for the crested ibis, we developed technical platforms and molecular tools, which may also be useful for the rescue and protection of other endangered wildlife.
The reference genome assemblies and annotations of the crested ibis and its cohabitant the little egret
With the crested ibis being our focused EV species, we first assembled its reference genome de novo from high-coverage (approximately 156×) sequence reads generated from a series of libraries constructed with various insert sizes, using SoapDenovo . The assembly reached a contig N50 size of 67 kb and a scaffold N50 size of 10.7 Mb (Table 1 and Additional file 1: Table S4), and upon adding 282 Gb single-molecule optical mapping data, its super-scaffold N50 size increased to 26 Mb. Using the conserved chromosomal organization of the chicken and zebra finch genomes, we anchored 1 Gb super-scaffolds to the chromosomes, which constitute 82% of the estimated genome size. The final reference genome represents one of the more complete assemblies in the current avian genome study . We validated the reference genome with alignment to eight fully-assembled fosmid sequences (98.4% alignment; Additional file 1: Table S5) and transcriptomic data (RNA-seq from two other blood samples; 95.0% alignment; Additional file 1: Table S6). Using similar procedures, we also generated a reference genome (approximately 70× coverage) from a male little egret, which has N50 contig and scaffold sizes of 24 kb and 3.0 Mb, respectively (Additional file 1: Table S4). Our annotation efforts predicted 17,163 and 17,032 genes for the crested ibis and the little egret, respectively (Additional file 2: Figure S1). For the crested ibis, approximately 55% of the predicted genes were validated based on about 107 million mRNA sequences from blood (≥1 RPKM (reads per kilobase per million)). In addition, the repeat contents of the two birds constitute 6.6% and 7.4% of the genome length for the ibis and egret genomes, respectively (Additional file 1: Table S7), similar to that of the zebra finch  but less than that of the chicken . The comparative genome assembly statistics and annotations of the other 41 EV and LC avian species are reported in . Our study represents the first effort of comparative genomic analyses based on the genome information generated from these EV and LC species.
Low heterozygosity among the EV species, and its ongoing loss but with signs of increased diversity in the recovered crested ibis population
Genetic diversity has been shown to buffer species against widespread epidemics of infectious agents and parasites; its decrease has been thought to have detrimental effects on population health and survival ,. Prior studies have proposed that EV species have low genetic diversity ,, but all were based a limited number of neutral genetic markers. We measure genetic diversity of the EV species by aligning high-quality reads from the genome sequences of individual birds against their reference genomes -. We first compared two to three species in the same order according to the phylogeny based on whole genome sequences from a companion study , but with different ICUN conservation statuses. In all cases, the EV species within the pairing scheme showed significant heterozygosity reduction relative to the LC control species, with the severest found in the crested ibis and the kea (Table 2). The numbers of heterozygous SNP loci are 478,836 (heterozygosity, 0.43 × 10-3) in the crested ibis genome, which is approximately 6 times less than in the little egret genome (2,683,899; heterozygosity, 2.51 × 10-3; Figure 2a). This finding was confirmed in analyses that compared all eight EV and 32 LC species regardless of phylogenetic relationship, showing that the average heterozygosity rate of the EV species is 1.08 × 10-3, significantly smaller than that of the LC species of 2.49 × 10-3 (Figure 2b; Additional file 1: Table S8). Our analyses on heterozygous SNPs of protein-coding sequences confirmed similar heterozygosity reduction in the EV species (Table 1; Additional file 2: Figure S2).
To confirm this genomic signature at the population level, we analyzed SNP and STR calls, using the moderate-coverage genome sequences (approximately 20×) of eight crested ibis and five little egret individuals, which were sampled from the same populations as used for the reference genomes. We found a dramatic one-eighth (8 times less) SNP density reduction of the autosomes in the crested ibis population as compared to the little egret population (Additional file 1: Tables S9 and S10). The average frequency of short (1 to 2 bp) STR loci for the crested ibis genomes (0.7%; 2-bp, ≥4 alleles; n = 6) was an order of magnitude lower than that of the little egret genomes (8.0%, n = 6; Figure 2c). The longer STR loci (3- to 6-bp) also showed magnitudes lower frequencies, similar to what were seen in the crested ibis (Figure 2c).
Taking advantage of the extensive identity-tracking data from the living crested ibis populations, we asked if there is still ongoing heterozygosity reduction over time during species recovery. By analyzing 31 well-defined STR loci (4-bp unit), we estimated Ht/H0 , (heterozygosity at generation t/initial heterozygosity) from 105 individuals (Additional file 1: Table S11) and found a negative Ht/H0 correlation (r = -0.61) with population recovery time (with a slope of 0.017 units lost per year; Figure 2d). The SNP-based Ht/H0 of the eight re-sequenced ibis individuals also displays a strong negative correlation with population recovery time (r = -0.66), although there was no significant change with time, most likely due to the limited data points (Figure 2d). To further investigate the genetic basis of this heterozygosity reduction, we calculated fixation index among four sub-populations derived from the two original breeding pairs, including their offspring kept in the original Yangxian Nature Reserve population. We found remarkably large fixation indices among the sub-populations (Additional file 2: Figure S3), despite that the first population split occurred only about 20 years ago. This points to the presence of signs for rapidly increased genomic diversity between separated populations, even though its smaller population size and physical isolation all lead to ongoing heterozygosity reduction.
Accumulation of deleterious mutations in the threatened species
Non-synonymous changes often lead to functional, sometimes deleterious, changes in proteins , and inbreeding is thought to contribute to the increase of deleterious mutations in a population . We wondered if there are genes bearing more non-synonymous mutations in EV species and assessed the ratio of non-synonymous/synonymous (NS/S) heterozygous SNPs between the two haploid sequences within a diploid genome of a given species. We found that the eight EV species show much higher NS/S (0.68, median) than their closely related LC species (Table 2) or the combined ratio of all 32 LC species (0.50; Figure 3a, Additional file 1: Table S8), with the highest found in the two eagles and dalmation pelican, followed by the kea and the crested ibis (Table 2). Similarly, at the population level, the NS/S ratio of the eight other crested ibis individuals ranges from 0.66 to 0.70, as compared to 0.44 to 0.48 for the five little egret individuals, and this difference is significant (Additional file 1: Table S10).
We tested whether the higher NS/S ratios can be attributed to stronger linkage disequilibrium due to inbreeding in a small effective population , using the multiple sequenced individuals. The crested ibis population has a slow linkage disequilibrium (LD) decay with reduced r 2 correlation coefficient at half of its maximum and at a distance of approximately 60 kb as compared to the little egret population with a distance of approximately 1 kb (Figure 3b). A similar slow LD decay has been observed in highly-inbred domestic species, such as horse  and dog . Furthermore, the synonymous SNP fraction of the derived alleles at a low frequency of 0.1 in the crested ibis population is just half that of the non-synonymous SNPs, whereas the two values are either higher or comparable in the little egret (Figure 3c). It appears that the decreased proportion of low-frequency synonymous SNPs relative to non-synonymous SNPs is a result of inbreeding fixation in the small crested ibis population.
Genes involved in brain function and cytochrome P450 metabolism have allelic fixation in the recovered crested ibis population
To find out whether the SNP fixation we observed in the recovered ibis population was randomly distributed among the genomes or specific to certain segments and genes, we utilized a method that identifies differences in rates of fixed SNPs, which are assumed, but does not necessarily have to occur by selective sweeps within the genomes . Specifically, we calculated heterozygosity (H p ) and its Z transformations, ZH p (Figure 4, see Materials and methods) in 500-kb sliding windows (n = 2,514) along whole genomes (except for sex-chromosome scaffolds) for the most and least frequently observed alleles at all SNP positions. From the distribution of observed H p and ZH p (Figure 4a), we defined a threshold of fixed SNPs (ZH p score equal to -2.326 or less, P <0.01, Figure 4b). The smallest H p values represent the least frequently observed alleles. We found approximately 1.4% of the windows (n = 36) had a ZH p score -2.326 or less (Figure 4b), and thus were significant outliers for the whole genome. We examined all candidate genes that resided in these regions (Additional file 1: Tables S12), and categorized them according to Gene Ontology (GO) terms (Additional file 1: Tables S13 and S14). Seven categories were statistically enriched, and nearly all involved in brain function: postsynaptic membrane, synapse part, GABA-A receptor activity, GABA signaling pathway, extracellular ligand-gated ion channel activity, chloride transport, and clathrin adaptor complex. Of the genes involved, 10 were over-represented in the neuroactive ligand-receptor interaction pathway (KEGG map04080) that processes information from exogenous signals using neurotransmitters and signaling molecules, including epinephrine, norepinephrine, neuromedin, melanin-concentrating hormone, GABA, and acethycholine (Additional file 1: Table S13). In addition, four of the cytochrome P450 genes stood out, which function in metabolism of hormones and toxins, including arachidonic acid (map00590) and linoleic acid (map00591) metabolism pathways that process essential fatty acids and play an important role in immune modulation . These findings suggest that the fixation in the genome is not random.
Population bottlenecks of the crested ibis and immune genes
It is possible that the alleles became fixed through a bottleneck affect reducing the polymorphisms in the genome driven by genetic drift. Population bottleneck refers to a sharp population size reduction due to environmental events or human activities. With sequence data from whole genomes, and of multiple individuals, we can calculate population bottlenecks more reliably than with several genes and neutral markers. We reconstructed the crested ibis’ demographic history using our sequence data based on a pairwise sequential Markovian coalescent (PSMC) model  and a diffusion approximation method for demographic inference (∂a∂i)  (see Materials and methods). This analysis revealed two ancient and one recent bottleneck (Figure 5a). The two ancient events (1 to 0.01 MYA) occurred during the Naynayxungla glaciation (0.78 to 0.50 MYA)  and the last glaciation (Ice Age, 0.01 MYA), resulting in a precipitous population drop to approximately 10,000 individuals at the end of the last glaciation. The recent event leads to a drastic population reduction throughout the last 100 years, terminating approximately 25 years ago with the human assisted recovery effort (Figure 5b, Additional file 1: Table S15). This timeframe is not only consistent with the known historic records of the ibis species , but also reflects a blend of global climate change , intensified human activity , and remarkably, population recovery after conservation efforts commenced some 30 years ago. This result is very different from the little egret in which population decreased during the Penultimate glaciation (0.3 to 0.13 MYA), but only slightly decreased in the last glaciation (Additional file 2: Figure S4).
To search for possible genetic causes for the recorded animal deaths among the recovered population, we scrutinized the records and found that the crested ibis population in the wild, while in the human-assisted recovery, has still been suffering from parasitic infection and other diseases, which account for 46.3% of total deaths from 1981 to 2003 (Figure 5c) . Since the major histocompatibility complexes (MHC) and its genetic variants are critical for immunity , we analyzed the nucleotide sequences of the classical MHC class II β gene (BLB) antigen binding region (encoding a protein for presenting antigenic peptides to helper T cells). This gene shows a much lower genetic diversity (1 locus with ≥3 amino acid alleles) in the antigen binding domain than its homolog of the little egret (6 loci with ≥3 amino acid alleles; Additional file 2: Figure S5). Such a low level of genetic diversity in the BLB and other MHC genes may result in defective immunity of the crested ibis population.
Historic agrochemical overuse and relevant mutated genes in threatened avian species
Overuse of various nondegradable agrochemicals has been suggested as one of the principal reasons for the population decline of seven of the eight EV species examined: bald eagle , white-tailed eagle , kea , Adele penguin , emperor penguin , chimney swift , and the Dalmatian pelican . To investigate whether agrochemical use also contributed to the decline of the crested ibis survival, we first examined the relevant historical evidence. In Japan, during the Meiji Restoration (in the late 19th century), traditional protection measures were disregarded and rampant hunting rapidly reduced the crested ibis population to the extent that by the time when the species was enlisted for protection on the hunting ordinances in 1908, it was almost extinct there . Although relevant evidence was poorly documented in Korea, Northern China, and Russia in the first half of the 20th century, in central China, the crested ibis was common in Gansu and Shaanxi Provinces before 1950 but nearly extinct by the end of 1950s. We found a negative correlation between the estimated crested ibis population size in the Gansu and Shaanxi Provinces and the amount of fertilizers and pesticides used in the region (Figure 5d). These findings suggest that overuse of agrochemicals may be associated with very dramatic and obvious decline in the crested ibis population of the region from which our genomes were sequenced.
We compared 6,332 orthologs genes among EV (all are carnivorous species; n = 8) and also to the LC carnivorous species (n = 15), since carnivorous species are also often apex predators and more sensitive to agrochemicals ,,,. We identified 44 genes that have a significantly higher rate of being inactivated (null mutations that alter protein structure) in the EV carnivorous species, and only nine genes with a significantly higher rate in the LC carnivorous species (Fisher’s exact test, P <0.05; Additional file 1: Table S16). Among them, 17 genes are metabolism related enzymes; for instance, one of them, SLCO1A2, a sodium-independent transporter mediating cellular uptake of organic anions in the liver , has lost its function in three threatened species (37.5%) and in none (0%) of the LC species. Another, HACL1, catalyzing a carbon-carbon cleavage reaction, is necessary for the catabolism of phytanoic acid in carnivores , which has lost its function in three threatened species. CHIA, which degrades chitin-chitotriose and participates in the defense against pathogens , has lost its function in five threatened species. These findings suggest that carnivorous EV species have a greater genetic susceptibility to agrochemicals.
Genome-wide STR profiling of the crested ibis population for marker-assisted breeding
Conservation and rescue of a species from near-extinction has been partly assisted with having genetic loci to track individuals to reduce inbreeding . However, there have been limited number of markers that can do so, including for the crested ibis. Further, changing conservation practice from small-scale captive breeding to STR-assisted breeding in large Nature Reserves can theoretically be benefitted by genome-scale approaches for genetic markers. In this study, we identified approximately 166,000 degenerate STR loci (sequences containing insertions, deletions, or mismatches) from the crested ibis genome sequences (Figure 6a) and tested them against a population of nine crested ibis individuals (18 haplotypes). On average, minor STR alleles exhibited a 4-bp difference from their major alleles and 18% of the allelic differences differed by >5 bp over their major alleles (Figure 6b). Focusing on the 4-bp STRs, we confirmed heterozygosity for 300 such loci within and across some of the crested ibis sub-populations (105 individuals) and selected a set of 23 informative STR loci, including 22 autosomal loci and one sex-linked locus (distinguishing male from female with a 31-bp insertion in W chromosome) to establish a DNA identification profiling (DIP) platform (Figure 6c and d, and Additional file 2: Figure S6; Additional file 1: Table S18). The new sex marker accurately discriminated between males (ZZ) and females (ZW) (Additional file 2: Figure S7), an important advance since morphological-based gender determination for the crested ibis is quite difficult. This DIP marker set had a collective discrimination power of 0.628 and an exclusion probability of 0.994 (Additional file 1: Table S19). This platform demonstrated an estimated probability of paternity up to 99.997% on pedigree analysis in a four-generation family (Additional file 1: Table S20). Our DIP platform is now being used in reconstructing pedigree relationships, establishing individual identity for the recovering crested ibis populations and assisting non-sibling or genetically distant parental pairing.
Our genome-wide analysis on the endangered crested ibis and seven other recently endangered and rescued avian species across the Neoaves phylogenetic tree provides direct evidence at a genomic-scale for support of previous hypothesis and novel insights into consequences of heterozygosity loss, deleterious mutation accumulations, population bottlenecks, and genetic drifts. The convergent inactivation (or pseudogenization) of xenobiotic metabolism related genes in the ibis and other endangered top predators suggests a reduction of adaptive genetic plasticity in these species to agrochemical overuse. However, the increasing genomic diversity among the isolated ibis populations derived from the offspring of the last wild pair identified in 1981 indicate that rapidly diverging sequences in the recovering ibis population are being fixed in less than 10 generations.
Our genome-wide data are important for exploring causative factors of the near-extinction and exact demographic reconstruction of the endangered species, and both are necessary for distinguishing long-term climate change from recent human-mediated events ,. In our case, we identified distant bottlenecks due to the past glaciations and the most recent bottleneck that is clearly unrelated to global glaciations (although severe periodic temperature drops may happen to accelerate the process) but associated with some man-made factors. The man-made induced bottleneck was much more severe than the glaciations. It appears that the crested ibis is more sensitive to these environmental challenges than the little egret. Similar events may have affected endangered non-avian species, such as the giant panda . However, our observations on the EV avian species provide an avian model for conservation genomics, which is distinctly different from giant panda whose genetic diversity remains high , albeit with a similar population size (approximately 2,000 for the crested ibis vs. approximately 2,500 for the giant panda). We propose that, regardless of the past conservation success, an immediate evaluation of genetic diversity and sequence variation should be imposed for risk assessment on all endangered species.
The genetic drift for fixation of changes in brain and metabolism genes of the rescued crested ibis population in China is intriguing, and could mean either deleterious mutations in these genes or the possibility of enhanced functions for certain brain behaviors and enhanced metabolism of toxins for survival of the species. Consistent with the former possibility, the surviving crested ibis in China may have gone through a change in foraging behavior . As a wading bird, the crested ibis uses a ‘remote touch’ mechanism to detect the movement of their prey in the mud through a series of sensory receptors , and either sacrifice or gain of sensing and digesting abilities are all relevant. We do not know if this change occurred demographically before or through genetic drift after the two breeding pairs were rescued in 1981.
One interpretation of the fixation findings based on the methods we used is that there have been selective sweeps for specific SNPs in specific brain and metabolism genes over the past 30 years. These sweeps could have artificially occurred due to controlled inbreeding, or naturally occur due to selection. But such selective sweeps within such a short time, for animals that reach sexual maturity at around 3 years with limited generations seems remarkable; although we see more rapid increasing diversity in the genome than expected. An alternative, more likely interpretation is that greater fixation of these alleles was already present in the two last breeding pairs before near extinction due to demographic differences . This difference is difficult to test considering all the animals we sequenced are descended from the last seven individuals in the wild from one population in 1981, where all others are extinct. If a demographic explanation were true, it would mean that these alleles became fixed through a bottleneck affect reducing the polymorphisms in the genomes by genetic drift.
One question that can now be better addressed is why the crested ibis nearly suffered extinction, whereas its cohabitant, the little egret, did not? One possible reason is that while both species exploit aquatic environments, such as eating mollusks, crustaceans, fish, and frogs, the little egret also consumes plant seeds in the winter or under drought and thus still thrives strongly in the same habitat. This foraging behavioral difference is consistent with genetic differences in enzymes for food digestion. Another possible reason is that the little egret might have become resident birds and gone through a bottleneck already by changing their foraging behavior at the same time.
Our genome-wide STR markers and its application to conservation genomics also provide a more powerful platform for breeding and tracking of endangered species both in partial captivity and in the wild. With this method, we believe that it is possible that immediate genome sequencing and evaluation of genetic diversity and loss-of-function genes for risk assessment can be done for generating rescuing strategies for other currently endangered avian species.
Our study is the first, of which we are aware, to conduct genome-scale analyses of species that were endangered, including near extinct, across a vertebrate class. With a focus on the crested ibis, we were able to identify genetic associations before, during, and after the near extinction events and population bottlenecks. We confirmed some expected changes, but genome-wide, such as reduced heterozygosity, accumulation of deleterious mutations, and susceptibility to agrochemical overuse by humans. We also identified seemingly positive changes in the recovering crested ibis population, such as more rapidly increasing genetic diversity between new populations than expected, and changes in some gene families that could potentially be related to surviving extinction or recovery. Our genome-scale derived STR platform is now assisting in that recovery. We hope that the knowledge and lessons learned from this study will be applicable to not only the one-quarter of avian species that are threatened or near threatened, but to threatened species broadly.
Materials and methods
For de novo assembly, we extracted DNA samples from peripheral venous blood of a 3-year-old female crested ibis in the Yangxian County Reserve and a male little egret captured from the same county in southern Qinling Mountains, Shaanxi Province, China. For our resequencing effort, blood DNA samples were from eight crested ibis and five little egret individuals from the same locality. For meta-analysis of endangered and least concerned species, we used the genome sequences of 41 additional avian species (Additional file 1: Table S3) . For DNA profiling, we used 105 individual crested ibis from four sub-populations of Yangxian, Huayang, Louguan, and Ningshan Counties (Additional file 1: Table S11).
Skin cells were grown in DMEM medium supplemented with 15% fetal bovine serum. Metaphase preparations for flow sorting were generated . The crested ibis chromosomes were numbered according to convention . Chromosome preparations were stained with Hoechst 33258 (Sigma, St Louis, MO, USA) and Chromomycin A3 (Sigma) and then sorted  (MoFlo, DAKO, Glostrup, Denmark DAKO).
For genome assembly, we constructed sequencing libraries with variable insert sizes (180 bp, 500 bp, 800 bp, 2 kb, 5 kb, 10 kb, and 20 kb) by following the manufacturer’s instruction for Illumina’s HiSeq 2000. Sequences of approximately 266 Gb and 127 Gb (reads length: 100 bp for libraries with insert size <2 kb, 45 bp for other libraries) were generated for the crested ibis and the little egret, respectively, and after quality-filtering, approximately 194 Gb (roughly 156×) and about 90 Gb (71×) sequences survived for the assembly. To achieve better contiguity, we also used a new physical mapping technology developed by the Argus System (OpGen) and its assembly software (Genome-Builder), which produced, based on KpnI digest, a large optical mapping dataset (about 282 Gb) from 34 high-density MapCards and containing 799,678 single-molecule restriction maps (>250 kb) with an average size of 353 kb.
The genome sequences for the crested ibis and little egret were assembled by using the de Bruijn graph-based assembler SOAPdenovo . Prior to assembly, potential sequencing errors were removed or corrected based on k-mer frequency methodology. Reads from libraries with insert sizes ranging from 170 bp to 800 bp were split into 41-mers to construct de Bruijn graphs and contigs. The paired-end reads were aligned to construct scaffolds. Super-scaffolds for the crested ibis were constructed and aided with optical mapping data. The crested ibis chromosomes were built by using super-scaffolds based on conserved synteny between the assembly and genome data of chicken and zebra finch.
To assess the large-scale and local assembly accuracy, we also sequenced (Sanger sequencing technology) and assembled (phred-phrap-consed) eight randomly selected fosmids (average approximately 39 kb long) from a genomic library for the crested ibis (same DNA used for the reference assembly). We also assessed the completeness and accuracy of our assembly using 98,881 transcripts from blood, which were sequenced and assembled independently. A total of 94,709 assembled transcripts (>95%) were mapped to the assembly (BLASTN, E <10-5, coverage ≥90%), yielding a single-base accuracy of approximately 98% for the assembled sequences with >20 reads coverage and excluding sequence gaps.
Gene and repeat annotations
To predict genes, we used both homology-based and de novo methods as follows. First, we obtained protein sequences of chicken, zebra finch, and human from Ensembl (release 60) and mapped them onto the genome assemblies using Tblastn with E-value 1e-5. All high-score segments were grouped into gene-like structures (genBlastA ). The homologous genome sequences with flanking sequences (2,000 bp) were aligned to the protein sequences by using Genewise  to define gene models. We clustered predicted transcripts >100 bp and took cross-species synteny into account (otherwise, a transcript with the best aligning score was used). Single-exon genes containing one or >1 frame shift errors and multi-exon genes containing >3 frame errors were not taken into account. Second, we clustered transcripts using TopHat  and Cufflinks  and aligned them (>150 bp) to SwissProt/TrEMBL database  with a cutoff E-value <1e-5. Third, we predicted protein-coding genes (>150 bp) using Genscan  (gene model parameters trained with Homo sapiens genes) and Augustus  (gene model parameters trained with chicken genes) and defined TE-derived proteins (BlastP with E-value <1e-5 and >50% alignment).
For the reference gene set, we constructed gene models following three criteria: (1) candidate genes clustered with >100 bp overlap; (2) one cluster per gene (homology-based model > RNA-seq model > de novo predicted model); and (3) if not (2), 30% alignment to a known protein in the SwissProt/TrEMBL database  (>2 exons). Functional annotations were based on the best match principle using SwissProt, InterPro , and KEGG  databases. Treefam  was used to define gene family (Blastp, E-value <1e-7; Hscore >10; minimum edge density >1/3) and CAFE  to define gene loss and gain.
We annotated transposable elements (TEs) based on homology to RepBase sequencesusing RepeatProteinMask and RepeatMasker  with default parameters. We also constructed de novo repeat libraries (transposable elements) using RepeatModeler (http://repeatmasker.org/RepeatModeler.html) with default parameters.
Resequencing data analysis
Resequencing reads were generated from a single-size insert library (500 bp) per individual and mapped high-quality reads onto the references with BWA , followed by removal of unmapped reads (average quality <10 or average map quality <20 or multiple-site reads). SNPs were called by using SOAPsnp  with thresholds of quality score ≥20, sequencing depth >8X and <40X, copy number of flanking sequences < = 2, >3 uniquely mapped reads, and distance between SNPs ≥5 bp.
We calculated the correlation coefficient (r 2) of alleles at SNP locus after setting -maxdistance 300 -dprime -minGeno 0.6 -minMAF 0.1 -hwcutoff 0.001 using the Haploview software . Since sample size is an important parameter influencing LD patterns, we randomly selected five crested ibises three times to repeat the experiment and the analysis. To reconstruct ancient demographic history, we ran the PSMC program (parameters: -N30, -t15, -r5, and -p ‘4 + 25*2 + 4 + 6’) using autosomal sequences (scaffold length ≥50 kb and a total of 478,758 heterozygous loci). We performed bootstrapping (100 times) to estimate the variance of simulated results and estimated the neutral mutation rate μ (mutations per base per generation) using the estimated genome-wide nucleotide divergence (10.31%) and divergence time (38.98 × 106) between the crested ibis and the little egret. Based on mean generation time (3 years for crested ibis), we calculated μ = (0.1031 × 3)/(2 × 38.98 × 106) = 3.968 × 10-9 mutations per generation for the crested ibis.
To reconstruct recent demographic history, we used the ∂a∂i program  and paired-end reads (500 bp in size) from nine samples (eight re-sequencing individuals and one de novo assembly individual). To minimize the effect of low-coverage sequencing, we extracted the sites that were covered by high-quality reads (at least six of nine individuals covered by >2X reads). To prepare for ∂a∂i program, we called 1,420,399 SNPs using a published method . The little egret reference genome sequence was used to infer ancestral alleles. We considered four models and chose the one with highest maximum log-likelihood value. The ancestral population size (Na) was estimated on the basis of the calculated θ value and the mutation rate. Population size and corresponding time were derived from parameters scaled based on Na.
Purifying selection analysis
For each 500-kb window, we determined the number of reads corresponding to the most and least abundant SNP alleles (nMAJ and nMIN), H p = 2∑nMAJ∑nMIN/(∑nMAJ + ∑nMIN)2, and transformed H p into Z scores: ZH p = (H p-μH p)/σH p . We applied a threshold of ZH p = -2.326 (q <0.01 in normal distribution) for putative selective sweeps.
Genome-wide STR profiling
We defined STRs using Tandem Repeat Finder  (parameters: Match = 2, Mismatch = 7, Delta = 7, PM = 80, PI = 10, Minscore = 30, and MaxPeriod = 6), which were validated in the following steps. DNA was extracted with the E.Z.N.A.™ Blood DNA Kit (Omega Bio-Tek Inc., USA) according to its instruction (E.Z.N.A.™ Blood DNA Isolation Protocols, Revised June 2009). All DNA samples were quantified with the TIANamp Genomic DNA Kit. PCR amplification was performed in a reaction volume of 25 μL with MicroAmp® reaction tubes (Applied Biosystems, CA, USA; the GeneAmp® PCR Systems 9700 with gold-plated silver or silver 96-well blocks). Amplified products were separated in ABI3730 DNA Genetic Analyzer 48-capillary array system (Applied Biosystems) according to the manufacturer’s protocol. The genotypes were analyzed by using Genemapper 3.5 (Applied Biosystems).
Genome data of crested ibis and little egret are uploaded to NCBI (PRJNA232572 and PRJNA232959). The raw reads in the SRA (SRP035852 and SRP035853). The NCBI accession numbers of the assembled genomes of all species are described in Additional file 1: Table S3.
- BLB :
MHC class II β gene
- CHIA :
DNA identification profiling (DIP) platform
Combined EN and VU
- HACL1 :
2-hydroxyacyl-CoA lyase 1
- Hp :
Heterozygosity in 500-kb sliding windows
Heterozygosity at generation t/initial heterozygosity
International Union for Conservation of Nature
Kyoto Encyclopedia of Genes and Genomes
Major histocompatibility complexes
Million years ago
Pairwise sequential Markovian coalescent
Reads per kilobase per million
- SLCO1A2 :
Solute carrier organic anion transporter family member 1 A2
Single nucleotide polymorphisms
- ZHp :
Z transformations of Hp
Attenborough D: Help us build a brighter future: the need to save the world’s most threatened birds. 2008, BirdLife International, Cambridge
Collar NJ, Andreev AV, Chan S: Threatened Birds of Asia: The BirdLife International Red Data Book. 2001, BirdLife International, Cambridge, 5
Brooke Mde L, Butchart SH, Garnett ST, Crowley GM, Mantilla-Beniers NB, Stattersfield AJ: Rates of movement of threatened bird species between IUCN red list categories and toward extinction. Conserv Biol. 2008, 22: 417-427. 10.1111/j.1523-1739.2008.00905.x.
Frankham R: Genetics and extinction. Biol Conserv. 2005, 126: 131-140. 10.1016/j.biocon.2005.05.002.
Saccheri I, Kuussaari M, Kankare M, Vikman P, Fortelius W, Hanski I: Inbreeding and extinction in a butterfly metapopulation. Nature. 1998, 392: 491-494. 10.1038/33136.
Keller LF, Waller DM: Inbreeding effects in wild populations. Trends Ecol Evol. 2002, 17: 230-241. 10.1016/S0169-5347(02)02489-8.
Lande R: Mutation and conservation. Conserv Biol. 1995, 9: 782-791. 10.1046/j.1523-1739.1995.09040782.x.
Woodworth L, Montgomery M, Briscoe D, Frankham R: Rapid genetic deterioration in captive populations: causes and conservation implications. Conserv Genet. 2002, 3: 277-288. 10.1023/A:1019954801089.
Evans SR, Sheldon BC: Interspecific patterns of genetic diversity in birds: correlations with extinction risk. Conserv Biol. 2008, 22: 1016-1025. 10.1111/j.1523-1739.2008.00972.x.
Spielman D, Brook BW, Frankham R: Most species are not driven to extinction before genetic factors impact them. Proc Natl Acad Sci U S A. 2004, 101: 15261-15264. 10.1073/pnas.0403809101.
Kohn MH, Murphy WJ, Ostrander EA, Wayne RK: Genomics and conservation genetics. Trends Ecol Evol. 2006, 21: 629-637. 10.1016/j.tree.2006.08.001.
Liu YZ: Rediscovery of crested ibis Nipponia nippon in Qinling Mountain. Chin J Zool. 1981, 27: 237-
Shi D, Cao Y: Chinese Crested Ibis. 2001, China Forestry Press, Beijing
Ding C: Research on the Crested Ibis. 2004, Shanghai Scientific and Technological Educational Publishing House Press, Shanghai
Jarvis ED, Mirarab S, Aberer AJ, Li B, Houde P, Li C, Ho SYW, Faircloth BC, Nabholz B, Howard JT, Suh A, Weber CC, Fonseca RR, Li J, Zhang F, Li H, Zhou L, Narula N, Liu L, Ganapathy G, Boussau B, Bayzid MS, Zavidovych V, Subramanian S, Gabaldón T, Capella- Gutiérrez S, Huerta-Cepas J, Rekepalli B, Munch K, Schierup M, et al: Using whole genomes to resolve the tree of life of modern birds. Science. 2014, 346: 1320-1331. 10.1126/science.1253451.
Zhang G, Li C, Li Q, Li B, Larkin DM, Lee C, Storz JF, Antunes A, Greenwold MJ, Meredith RW, Ödeen A, Cui J, Zhou Q, Xu L, Pan H, Wang Z, Jin L, Zhang P, Hu H, Yang W, Hu J, Xiao J, Yang Z, Liu Y, Xie Q, Yu H, Lian J, Wen P, Zhang F, Li H, et al: Comparative genomics across modern bird species reveal insights into avian genome evolution and adaptation. Science. 2014, 346: 1311-1320. 10.1126/science.1251385.
Li R, Zhu H, Ruan J, Qian W, Fang X, Shi Z, Li Y, Li S, Shan G, Kristiansen K, Li S, Yang H, Wang J, Wang J: De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 2010, 20: 265-272. 10.1101/gr.097261.109.
Yarmolinsky DA, Zuker CS, Ryba NJ: Common sense about taste: from mammals to insects. Cell. 2009, 139: 234-244. 10.1016/j.cell.2009.10.001.
Warren WC, Clayton DF, Ellegren H, Arnold AP, Hillier LW, Kunstner A, Searle S, White S, Vilella AJ, Fairley S, Heger A, Kong L, Ponting CP, Jarvis ED, Mello CV, Minx P, Lovell P, Velho TA, Ferris M, Balakrishnan CN, Sinha S, Blatti C, London SE, Li Y, Lin YC, George J, Sweedler J, Southey B, Gunaratne P, Watson M, et al: The genome of a songbird. Nature. 2010, 464: 757-762. 10.1038/nature08819.
International Chicken Genome Sequencing Consortium: Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature. 2004, 432: 695-716. 10.1038/nature03154.
Reed DH, Frankham R: Correlation between fitness and genetic diversity. Conserv Biol. 2003, 17: 230-237. 10.1046/j.1523-1739.2003.01236.x.
Messaoudi I, Patino JAG, Dyall R, LeMaoult J, Nikolich-Zugich J: Direct link between mhc Polymorphism, T cell avidity, and diversity in immune defense. Science. 2002, 298: 1797-1800. 10.1126/science.1076064.
Fang X, Zhang Y, Zhang R, Yang L, Li M, Ye K, Guo X, Wang J, Su B: Genome sequence and global sequence variation map with 5.5 million SNPs in Chinese rhesus macaque. Genome Biol. 2011, 12: R63-10.1186/gb-2011-12-7-r63.
Kim EB, Fang X, Fushan AA, Huang Z, Lobanov AV, Han L, Marino SM, Sun X, Turanov AA, Yang P, Yim SH, Zhao X, Kasaikina MV, Stoletzki N, Peng C, Polak P, Xiong Z, Kiezun A, Zhu Y, Chen Y, Kryukov GV, Zhang Q, Peshkin L, Yang L, Bronson RT, Buffenstein R, Wang B, Han C, Li Q, Chen L, et al: Genome sequencing reveals insights into physiology and longevity of the naked mole rat. Nature. 2011, 479: 223-227. 10.1038/nature10533.
Li R, Fan W, Tian G, Zhu H, He L, Cai J, Huang Q, Cai Q, Li B, Bai Y, Zhang Z, Zhang Y, Wang W, Li J, Wei F, Li H, Jian M, Li J, Zhang Z, Nielsen R, Li D, Gu W, Yang Z, Xuan Z, Ryder OA, Leung FC, Zhou Y, Cao J, Sun X, Fu Y, et al: The sequence and de novo assembly of the giant panda genome. Nature. 2009, 463: 311-317. 10.1038/nature08696.
Gymrek M, Golan D, Rosset S, Erlich Y: lobSTR: A short tandem repeat profiler for personal genomes. Genome Res. 2012, 22: 1154-1162. 10.1101/gr.135780.111.
Crow JF, Kimura M: An Introduction to Population Genetics Theory. 1970, Harper & Row, New York
Reumers J, Schymkowitz J, Ferkinghoff-Borg J, Stricher F, Serrano L, Rousseau F: SNPeffect: a database mapping molecular phenotypic effects of human non-synonymous coding SNPs. Nucleic Acids Res. 2005, 33: D527-D532. 10.1093/nar/gki086.
Mettler LE, Gregg TG: Population genetics and evolution. 1969, Prentice-Hall, Inc., Englewood Cliffs (NJ)
Biswas S, Akey JM: Genomic insights into positive selection. Trends Genet. 2006, 22: 437-446. 10.1016/j.tig.2006.06.005.
Wade CM, Giulotto E, Sigurdsson S, Zoli M, Gnerre S, Imsland F, Lear TL, Adelson DL, Bailey E, Bellone RR, Blöcker H, Distl O, Edgar RC, Garber M, Leeb T, Mauceli E, MacLeod JN, Penedo MC, Raison JM, Sharpe T, Vogel J, Andersson L, Antczak DF, Biagi T, Binns MM, Chowdhary BP, Coleman SJ, Della Valle G, Fryc S, Guérin G, et al: Genome sequence, comparative analysis, and population genetics of the domestic horse. Science. 2009, 326: 865-867. 10.1126/science.1178158.
Lindblad-Toh K, Wade CM, Mikkelsen TS, Karlsson EK, Jaffe DB, Kamal M, Clamp M, Chang JL, Kulbokas EJ, Zody MC, Mauceli E, Xie X, Breen M, Wayne RK, Ostrander EA, Ponting CP, Galibert F, Smith DR, DeJong PJ, Kirkness E, Alvarez P, Biagi T, Brockman W, Butler J, Chin CW, Cook A, Cuff J, Daly MJ, DeCaprio D, Gnerre S, et al: Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature. 2005, 438: 803-819. 10.1038/nature04338.
Rubin CJ, Zody MC, Eriksson J, Meadows JR, Sherwood E, Webster MT, Jiang L, Ingman M, Sharpe T, Ka S, Hallböök F, Besnier F, Carlborg O, Bed’hom B, Tixier-Boichard M, Jensen P, Siegel P, Lindblad-Toh K, Andersson L: Whole-genome resequencing reveals loci under selection during chicken domestication. Nature. 2010, 464: 587-591. 10.1038/nature08832.
Lands B: Consequences of essential fatty acids. Nutrients. 2012, 4: 1338-1357. 10.3390/nu4091338.
Li H, Durbin R: Inference of human population history from individual whole-genome sequences. Nature. 2011, 475: 493-496. 10.1038/nature10231.
Gutenkunst RN, Hernandez RD, Williamson SH, Bustamante CD: Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data. PLoS Genet. 2009, 5: e1000695-10.1371/journal.pgen.1000695.
Zheng B, Xu Q, Shen Y: The relationship between climate change and Quaternary glacial cycles on the Qinghai–Tibetan Plateau: review and speculation. Quaternary Int. 2002, 97–98: 93-101. 10.1016/S1040-6182(02)00054-X.
Donaldson GM, Shutt JL, Hunter P: Organochlorine contamination in bald eagle eggs and nestlings from the canadian great lakes. Arch Environ Contam Toxicol. 1999, 36: 70-80. 10.1007/s002449900444.
Helander B, Olsson A, Bignert A, Asplund L, Litzen K: The role of DDE, PCB, coplanar PCB and eggshell parameters for reproduction in the white-tailed sea eagle (Haliaeetus albicilla) in Sweden. Ambio. 2002, 31: 386-403.
Youl JM: Lead exposure in free-ranging kea (Nestor notabilis), takahe (Porphyrio hochstetteri) and Australasian harriers (Circus approximans) in New Zealand. 2009, Massey University, Palmerston North
Shinsuke Tanabe AS, Hidaka H, Tatsukawa R: Transfer rates and pattern of PCB isomers and congeners and p, p’-DDE from mother to egg in Adelie Penguin (Pygoscelis adeliae). Chemosphere. 1986, 15: 9-
Corsolini S, Borghesi N, Ademollo N, Focardi S: Chlorinated biphenyls and pesticides in migrating and resident seabirds from East and West Antarctica. Environ Int. 2011, 37: 1329-1335. 10.1016/j.envint.2011.05.017.
Nocera JJ, Blais JM, Beresford DV, Finity LK, Grooms C, Kimpe LE, Kyser K, Michelutti N, Reudink MW, Smol JP: Historical pesticide applications coincided with an altered diet of aerially foraging insectivorous chimney swifts. 2012, 279:3114–3120
Albanis TA, Hela GD, Hatzilakos D: Organochlorine residues in eggs of Pelecanus crispus and its prey in wetlands of Amvrakikos Gulf, North-western Greece. Chemosphere. 1995, 31: 9-10.1016/0045-6535(95)00302-O.
Badagnani I, Castro RA, Taylor TR, Brett CM, Huang CC, Stryke D, Kawamoto M, Johns SJ, Ferrin TE, Carlson EJ, Burchard EG, Giacomini KM: Interaction of Methotrexate with organic-anion transporting polypeptide 1A2 and its genetic variants. J Pharmacol Exp Ther. 2006, 318: 521-529. 10.1124/jpet.106.104364.
Foulon V, Antonenkov VD, Croes K, Waelkens E, Mannaerts GP, Van Veldhoven PP, Casteels M: Purification, molecular cloning, and expression of 2-hydroxyphytanoyl-CoA lyase, a peroxisomal thiamine pyrophosphate-dependent enzyme that catalyzes the carbon-carbon bond cleavage during alpha-oxidation of 3-methyl-branched fatty acids. Proc Natl Acad Sci U S A. 1999, 96: 10039-10044. 10.1073/pnas.96.18.10039.
Maizels RM: Infections and allergy - helminths, hygiene and host immune regulation. Curr Opin Immunol. 2005, 17: 656-661. 10.1016/j.coi.2005.09.001.
Steiner CC, Putnam AS, Hoeck PEA, Ryder OA: Conservation genomics of threatened animal species. Annu Rev Anim Biosci. 2013, 1: 261-281. 10.1146/annurev-animal-031412-103636.
Zhao S, Zheng P, Dong S, Zhan X, Wu Q, Guo X, Hu Y, He W, Zhang S, Fan W, Zhu L, Li D, Zhang X, Chen Q, Zhang H, Zhang Z, Jin X, Zhang J, Yang H, Wang J, Wang J, Wei F: Whole-genome sequencing of giant pandas provides insights into demographic history and local adaptation. Nat Genet. 2012, 45: 67-71. 10.1038/ng.2494.
Cunningham SJ, Alley MR, Castro I, Potter MA, Cunningham M, Pyne MJ: Bill morphology of ibises suggests a remote-tactile sensory system for prey detection. Auk. 2010, 127: 308-316. 10.1525/auk.2009.09117.
Nielsen R, Williamson S, Kim Y, Hubisz MJ, Clark AG, Bustamante C: Genomic scans for selective sweeps using SNP data. Genome Res. 2005, 15: 1566-1575. 10.1101/gr.4252305.
Yang F, Carter NP, Shi L, Ferguson-Smith MA: A comparative study of karyotypes of muntjacs by chromosome painting. Chromosoma. 1995, 103: 642-652. 10.1007/BF00357691.
Ladjali-Mohammedi K, Bitgood JJ, Tixier-Boichard M, Ponce De Leon FA: International system for standardized avian karyotypes (ISSAK): standardized banded karyotypes of the domestic fowl (Gallus domesticus). Cytogenet Cell Genet. 1999, 86: 271-276. 10.1159/000015318.
Ng BL, Carter NP: Factors affecting flow karyotype resolution. Cytometry A. 2006, 69: 1028-1036. 10.1002/cyto.a.20330.
She R, Chu JS, Wang K, Pei J, Chen N: GenBlastA: enabling BLAST to identify homologous gene sequences. Genome Res. 2009, 19: 143-149. 10.1101/gr.082081.108.
Birney E, Clamp M, Durbin R: GeneWise and Genomewise. Genome Res. 2004, 14: 988-995. 10.1101/gr.1865504.
Trapnell C, Pachter L, Salzberg SL: TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009, 25: 1105-1111. 10.1093/bioinformatics/btp120.
Roberts A, Pimentel H, Trapnell C, Pachter L: Identification of novel transcripts in annotated genomes using RNA-Seq. Bioinformatics. 2011, 27: 2325-2329. 10.1093/bioinformatics/btr355.
Bairoch A, Apweiler R: The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. 2000, 28: 45-48. 10.1093/nar/28.1.45.
Salamov AA, Solovyev VV: Ab initio gene finding in Drosophila genomic DNA. Genome Res. 2000, 10: 516-522. 10.1101/gr.10.4.516.
Stanke M, Waack S: Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics. 2003, 19: 215-225. 10.1093/bioinformatics/btg1080.
Mulder N, Apweiler R: InterPro and InterProScan: tools for protein sequence classification and comparison. Methods Mol Biol. 2007, 396: 59-70. 10.1007/978-1-59745-515-2_5.
Kanehisa M, Goto S: KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000, 28: 27-30. 10.1093/nar/28.1.27.
Li H, Coghlan A, Ruan J, Coin LJ, Heriche JK, Osmotherly L, Li R, Liu T, Zhang Z, Bolund L, Wong GK, Zheng W, Dehal P, Wang J, Durbin R: TreeFam: a curated database of phylogenetic trees of animal gene families. Nucleic Acids Res. 2006, 34: D572-D580. 10.1093/nar/gkj118.
De Bie T, Cristianini N, Demuth JP, Hahn MW: CAFE: a computational tool for the study of gene family evolution. Bioinformatics. 2006, 22: 1269-1271. 10.1093/bioinformatics/btl097.
Steinberg S, Chen L, Wei L, Moser A, Moser H, Cutting G, Braverman N: The PEX Gene Screen: molecular diagnosis of peroxisome biogenesis disorders in the Zellweger syndrome spectrum. Mol Genet Metab. 2004, 83: 252-263. 10.1016/j.ymgme.2004.08.008.
Li R, Li Y, Fang X, Yang H, Wang J, Kristiansen K: SNP detection for massively parallel whole-genome resequencing. Genome Res. 2009, 19: 1124-1132. 10.1101/gr.088013.108.
Barrett JC, Fry B, Maller J, Daly MJ: Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 2005, 21: 263-265. 10.1093/bioinformatics/bth457.
Yi X, Liang Y, Huerta-Sanchez E, Jin X, Cuo ZX, Pool JE, Xu X, Jiang H, Vinckenbosch N, Korneliussen TS, Zheng H, Liu T, He W, Li K, Luo R, Nie X, Wu H, Zhao M, Cao H, Zou J, Shan Y, Li S, Yang Q, Ni P, Tian G, Xu J, Liu X, Jiang T, Wu R, Asan, et al: Sequencing of 50 human exomes reveals adaptation to high altitude. Science. 2010, 329: 75-78. 10.1126/science.1190371.
Benson G: Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999, 27: 573-580. 10.1093/nar/27.2.573.
We thank Laurie Goodman for help in editing the manuscript and Zhaozheng Guo for overseeing experimental bioethics. We thank Jing Liu, Hao Li, Liang Yang, Xiaojing Wei, Fang Cao, and Rong Su in Xi’an Jiaotong University for help in experiment. The Ibis and Egret sequencing and analyses part of this project was supported financially with funds from the state forestry administration of the People’s Republic of China (No. 2012717), Higher Education Innovation Fund of Xi’an Jiaotong University, and Dr. Xi Guang Assistant to the President, Kwang-Hua Education Foundation (No. 2010822), and Dr. Yan Hong Assistant to the President, Dr. Chu Longfei and Dr. Ding Haihua in Shaanxi branch of State Forestry Administration, the Breeding and Releasing programs of crested ibis population (No. 2010, No. 2008), the Shenzhen Municipal Government of China (Nos. CXB201108250096A, JC201005260191A), ShenZhen Engineering Laboratory for Genomics-Assisted Animal Breeding, and China National GeneBank-Shenzhen. EDJ’s contribution was supported by HHMI.
The authors declare that they have no competing interests.
SBL, BL, and CC conceived the study. JQY, JY, and HMY, managed the project. QBL, HBZ, and SGW prepared samples. BL, ZJX, SXZ, and GJZ performed genome sequencing, assembly, annotation, and transcriptomes analyses. SBL, JHL, QBL, CC, LC, HBZ, and BY carried out genetic analyses. BL, SBL, CC, and LC discussed the data. All authors contributed to data interpretation. SBL, BL, CC, EDJ, TG, and GJZ wrote the manuscript. EDJ, TG, GJZ also supervised and provided data in addition to those of the crested ibis and the little egret for comparative analysis from the avian phylogenomics project, and TG, GJZ, EDJ, and JY edited the manuscript. All authors read and approved the final manuscript.
Shengbin Li, Bo Li, Cheng Cheng contributed equally to this work.