Evolution, ecology and the engineered organism: lessons for synthetic biology
© BioMed Central Ltd 2009
Published: 30 November 2009
As the scope and complexity of synthetic biology grows, an understanding of evolution and ecology will be critical to its success.
One of the most powerful and controversial aspects of engineering living organisms is that they reproduce, evolve, and interact with their environment. Humans have been engineering plants and animals through breeding and artificial selection, for the purpose of domestication, since the advent of agriculture approximately 12,000 years ago. The evolution of corn from the small grass teosinte, or the transformation of the wolf into 'man's best friend' (the dog), are testaments to the success of this approach. We have even 'domesticated' microorganisms, using yeast and bacteria for the production of beer, wine, cheese and yogurt as well as numerous other products we consume every day [3, 4].
Although powerful, genetic engineering by classical breeding and selection is slow, and results in a large number of unknown genetic changes that are hard to reconcile and may have unintended secondary effects. What we need is a rational approach to the engineering of biological systems that makes the process fast, cheap and safe, to solve problems in energy, health, agriculture and the environment. First steps towards realizing this aim began with the advent of recombinant DNA technology in the latter half of the 20th century, which created visions of a new era of 'synthetic biology' where novel genes could be designed and constructed for useful purposes [5–7]. Since then we have made incredible advances in our ability to manipulate genes, genomes and organisms, and this has led to a renewed interest in making synthetic biology a reality.
One of the central goals of synthetic biology is to develop genetic elements with encapsulated functions, such as regulatory circuits or environmental sensors, that can be combined to create new pathways with predictable behaviours. Despite our ability to synthesize genes and even genomes, we still lack the sophistication to design de novo those genetic elements needed for advanced synthetic biology applications. Fortunately, evolution has already provided us with an immense diversity of biomolecular functions that can be used individually or combined by putting together natural functional modules.
Bacteria and archaea represent perhaps the largest reservoirs of new genes and new biochemical functions that can be harnessed by the synthetic biologist. Current estimates of the number of bacterial species range from 1 million to as many as 1 billion [16, 17], each representing a unique genetic solution to the environmental challenges posed by diverse ecological niches. This incredible diversity of species in turn encodes a vast universe of protein functions. As of October 2009, there were 11,912 protein families in the Pfam database alone [18, 19]. Despite this large number, our sampling of protein function is still incomplete, and many new activities still remain to be discovered in nature. In addition, there is probably a vast array of non-coding RNA functions and DNA regulatory sequences that would serve as useful genetic elements for synthetic biology but which are difficult to detect by typical sequencing methods because of their fast rate of evolution.
This plethora of gene functions derived from evolution has not gone unnoticed, and it has been standard practice in genetic engineering to mix and match genes from many organisms. One driving force behind this has been to make bacteria such as Escherichia coli into 'chemical factories' for the production of drugs, fine chemicals and other commercially important compounds. Recent successes include the production of amorpha-4,11-diene (a precursor of the antimalarial drug artemisinin), the production of putrescine (used for the production of the plastic nylon-4,6) and the production of 4-hydroxyvalerate (which can be converted into polyesters and other plastics). Along with other examples, such as the production of the amino acids L-valine and L-threonine from engineered bacteria, such successes have founded a field of metabolic engineering that strives to leverage the metabolic flexibility of microbes to convert simple inputs such as sugars to desirable complex compounds [11, 26]. For many applications, the gene function or enzymatic chemistry is already available in nature, but if not, there are experimental strategies that can circumvent this problem (see Lesson 2).
Even if a gene function exists in nature, our ability to use it to engineer complex biological systems with new composite functions relies on the modularity inherent in naturally evolved systems. Modular biological systems are composed of functional domains that can be individually swapped or altered to change the overall characteristics of the system. Examples of modularity in biology abound at nearly all scales, and include basic gene regulation elements (promoters and binding sites for transcription factors), protein domains, macromolecular protein complexes, and cellular regulatory networks [27–31]. A number of compelling studies have demonstrated that modularity in biological systems arises under selection in a changeable environment [32, 33], and modularity seems to have been selected because it makes 'rewiring' on an evolutionary timescale more effective. The ability to rewire natural biological systems makes nature a vast source of modular 'parts' for the synthetic biologist. However, we must be careful to obey the rules of modularity and domain boundaries that nature uses. Understanding these rules, at both the molecular and organismal levels, is currently an active area of research [35–37].
As discussed above, evolution has provided a vast universe of genes and factorable modules that can be harnessed by the synthetic biologist to engineer new biological systems. In the simplest scenario, the desired function can be used 'as is' without any further modification. However, many synthetic designs require that we modify or tweak a gene function, such as altering an enzyme activity or changing a regulatory element. In extreme cases we need a gene function or activity that does not actually exist in nature. For example, incorporation of unnatural amino acids (for example, p-boronophenylalanine) into proteins is now possible using tRNA synthetases created in the laboratory and this enables the site-specific modification of proteins using boronate-based chemistry. Enzymes that catalyze the Kemp elimination reaction have been produced by using a combination of computational protein design and molecular evolution (see below).
Fortunately, a suite of experimental techniques exists that can create new gene function in the laboratory on the basis of a deep understanding of the fundamental mechanisms of evolutionary change - variation by mutation and recombination, differential reproduction and heredity. These so-called 'in vitro evolution' methods have been applied successfully to DNA, RNA and proteins [40–44]. Like classical breeding and artificial selection, they are iterative processes, involving rounds of library creation, screening and selection. Here, we focus on the library-creation step because it has benefited most from our knowledge of evolutionary mechanisms. The traditional approach to library creation involves generating random variation, for which there are a number of standard methods such as random DNA synthesis, error-prone PCR, chemical mutagenesis or the use of mutator strains.
Random mutagenesis by itself is inefficient, and computer simulations of evolution have demonstrated that a low level of point mutation plus recombination is an optimal strategy for creating diversity. This observation led to the development of gene shuffling, which is a powerful technique for the rapid evolution of protein function. In this process, random DNA fragmentation and reassembly by PCR is used to simulate recombination in the laboratory. Gene shuffling has been used to increase enzyme activity, alter substrate specificity and improve the properties of green fluorescent protein.
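The iterative cycle described above (library creation by low-level point mutation plus recombination, followed by screening and selection) can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions: the "screen" simply counts matches to a hypothetical target sequence, and the names `fitness`, `mutate`, `shuffle_pair` and `evolve`, along with all parameter values, are illustrative choices rather than any published protocol.

```python
import random

ALPHABET = "ACGT"

def fitness(seq, target):
    """Toy screen (assumption): count positions matching a hypothetical target."""
    return sum(a == b for a, b in zip(seq, target))

def mutate(seq, rate, rng):
    """Redraw each base with a low probability (error-prone PCR analogue).
    A redraw may return the same base, so the effective rate is slightly lower."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else base
                   for base in seq)

def shuffle_pair(a, b, rng, n_crossovers=2):
    """Recombine two equal-length parents at random crossover points
    (a crude analogue of fragmentation and reassembly in gene shuffling)."""
    points = sorted(rng.sample(range(1, len(a)), n_crossovers))
    parents, child, last, which = (a, b), [], 0, 0
    for p in points + [len(a)]:
        child.append(parents[which][last:p])
        last, which = p, 1 - which
    return "".join(child)

def evolve(target, pop_size=50, keep=10, rounds=30, rate=0.01, seed=0):
    """Run rounds of selection followed by library creation.
    Returns the best sequence found and the best score seen in each round."""
    rng = random.Random(seed)
    pop = ["".join(rng.choice(ALPHABET) for _ in target) for _ in range(pop_size)]
    history = []
    for _ in range(rounds):
        # Screening and selection: keep the top-scoring variants.
        pop.sort(key=lambda s: fitness(s, target), reverse=True)
        survivors = pop[:keep]
        history.append(fitness(survivors[0], target))
        # Library creation: carry survivors over unchanged, then fill the
        # library with recombined and lightly mutated offspring.
        library = list(survivors)
        while len(library) < pop_size:
            a, b = rng.sample(survivors, 2)
            library.append(mutate(shuffle_pair(a, b, rng), rate, rng))
        pop = library
    pop.sort(key=lambda s: fitness(s, target), reverse=True)
    return pop[0], history

if __name__ == "__main__":
    best, history = evolve("ACGTTGCAACGTTGCAACGTTGCAACGTTGCA")
    print(best, history[0], history[-1])
```

Because the survivors are carried over unchanged, the best score in this sketch can never decrease between rounds. Setting `n_crossovers` to a larger value or `rate` to zero gives a rough feel for how recombination and point mutation each contribute in this toy setting.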
Gene shuffling has been further expanded to genome shuffling, which combines mutagenesis with protoplast fusion to rapidly evolve microbes for the purpose of strain improvement. Because multiple advantageous mutations may be combined during each round of mutagenesis and protoplast fusion, genome shuffling has proved superior to classical methods for strain improvement (that is, mutagenesis plus selection); however, it still suffers from the limitation that the genetic basis for the improvement is not known. Most recently, a method for rapid genome engineering in bacteria has been developed, called multiplexed automated genome engineering (MAGE), that allows at least 20 directed genomic mutations at once by using mutagenic oligos. The combination of MAGE, genome shuffling and the means to vary the selection pressure to enable bouts of random mutation without selection (that is, neutral evolution) might be a powerful approach to the more rapid evolution of strains with desired characteristics. This method could be applied to developing strains with increased metabolic flux through an engineered pathway, or to improve tolerance to environmental stresses, such as pH or temperature. The take-home lesson is that evolutionary mechanisms have provided a powerful set of experimental tools for the rapid engineering of biological function. As we continue to understand how natural systems evolve, we can further exploit these processes for engineering genes and genomes in the laboratory.
Ultimately, even laboratory evolution is not sufficient for the engineering of complex biological systems. As designs become more complex, directed evolution at multiple genetic loci starts to resemble classical breeding and selection - where we do not understand the connection between genotype and phenotype. Furthermore, these evolution-based strategies require that we have selections or screens for the desired traits, which rapidly becomes too difficult as we move beyond the simplest applications. We envisage that synthetic biologists will use a hybrid approach starting with rational design using modular parts (Lesson 1), followed by organism-level evolution around the designed genetic architecture of the system for final optimization.
Even though we may use evolution as a tool to create novel function and optimize designs, we must be aware that its driving force for change does not stop when we deploy a system in a bioreactor or in the environment. Once a system is ready for use we would like to halt evolution, or at least minimize it, so that our system can perform without diverging from its original specifications. All the mechanisms of evolutionary change that were exploited to develop our system now need to be counteracted. This is quite a challenge and requires a focus on the two main sources of evolutionary change in nature - horizontal gene transfer (HGT) and random mutation.
One strategy for minimizing evolution is to prevent HGT. HGT can occur in three ways: by conjugation, transduction or transformation. Conjugation is the transfer of genetic material (often a plasmid) between bacteria through direct cell-to-cell contact. Many plasmids encode their own mobilization and transfer functions and can move between bacteria by conjugation. In the early days of recombinant DNA research it was recognized that these plasmid sequences could be deleted, thus preventing their transfer. In addition, cell-envelope proteins that are necessary for conjugation can be mutated.
By contrast, transduction and transformation enable transfer of DNA without cell contact. Transduction is mediated by bacteriophages whereas transformation is the uptake of free DNA from the environment. Transduction can be prevented by mutating a wide range of bacteriophage receptors to give phage-resistant strains. Ideally, we could develop broad-range phage resistance, and there is evidence that such mutations exist. In one example, three mutants of Streptococcus thermophilus were identified that were resistant to 14 phages after screening for resistance to just one lytic phage, Sfi19. Other strategies for broad-range phage resistance could include engineering the CRISPR (clustered, regularly interspaced, short palindromic repeat) genes, which have recently been hypothesized to be a bacterial 'immune system' that targets the degradation and silencing of foreign DNA.
The third mechanism of HGT involves natural transformation, and one strategy to prevent this is to mutate com genes and thus prevent uptake of DNA from the environment. Competence (com) genes encode a set of proteins that are localized in the bacterial cell envelope and are critical for processing and uptake of DNA. If all else fails and foreign DNA does get inside the cell of an engineered strain, it could be prevented from integrating into the genome by using a rec-deficient strain background or by installing a strong restriction/modification system. Recombination (rec) genes are essential for homologous recombination, so a rec-deficient strain would not be able to recombine the foreign DNA into its chromosome. Restriction/modification systems degrade incoming DNA that is not specifically 'marked' by methylation by the host bacterium, and so would block HGT before the recombination step.
A second strategy for minimizing evolution is to modulate the mutation rate. Defects in the mismatch repair system, for example, dramatically increase the mutation rate. The mismatch repair system recognizes mispaired nucleotides that arise during errors in DNA replication and recombination and recruits the necessary enzymes to repair the mistake. Many of these genes were first identified as mutator (mut) genes, which led to an increase in mutation frequency when deleted. For example, loss of function of mutS or mutL leads to a 100- to 1,000-fold increase in the frequency of transition and frameshift mutations. By contrast, overexpression of MutS or MutL leads to a decrease in the mutation frequency, and this could be one strategy for minimizing evolution. This study suggested that other genes might exist that increase the mutation rate when overexpressed. In this regard, a multicopy genetic screen in E. coli identified 15 loci that when overexpressed led to a mutator-like phenotype, and 12 of these were previously uncharacterized. In theory, every mechanism that nature uses to increase the mutation rate could be reversed by overexpression or deletion of the appropriate genes, although this general idea remains to be tested.
At present, the cutting edge of genetic manipulation is in metabolic engineering [21, 22, 50]. The bacterium E. coli has long been a workhorse in this field, largely because of its ease of genetic manipulation and the large amount of knowledge and resources accumulated. However, when we start to consider applications of synthetic biology beyond the bioreactor, such as bioremediation or therapeutic use in the human body, we must consider the complex nature of these environments. In particular, we must ensure that our engineered biological system works to specification without unintended disruptions to the natural ecology of the environment or human host, and that it can be easily identified and removed if necessary.
Bioremediation is the use of living organisms to return an ecosystem to its natural state after toxic contamination. Ever since the advent of recombinant DNA technology, the use of genetically modified (GM) organisms for bioremediation has been a holy grail. Unfortunately, most attempts at using GM bacteria for bioremediation have failed because the engineered strain had reduced fitness and competed poorly with indigenous microbial communities. Although E. coli is a natural choice for use as a chemical factory in a laboratory bioreactor, it makes no sense to engineer a bacterium that normally resides in the human gut for bioremediation of a toxic-waste site. It is more appropriate to engineer organisms that are derived directly from the target ecology. This is not without its challenges, however.
The industrial chemical 2-chlorotoluene is produced in large amounts and is used in a variety of consumer products. It is toxic to aquatic environments and humans, is inert to chemical hydrolysis in environmental conditions, and is therefore an interesting target for microbial bioremediation. Initial attempts at engineering soil-derived Pseudomonas species for 2-chlorotoluene degradation failed because of the complex nature of environmental influences on gene regulation. Given the tools of synthetic and systems biology, there is renewed hope that such problems, which are due to strong coupling of engineered organisms to target ecologies, can now be overcome.
One of the principal areas that needs development is the characterization of organisms for use in different bioremediation applications. This will mean identifying the key organisms responsible for the biotransformation process of interest, isolating and culturing their communities in the laboratory so they can be engineered for enhanced bioremediation and ecological stabilization, and then reintroducing them into the environment. Although there will be many difficulties in implementing this strategy, metagenomic techniques have greatly advanced the identification of the complex microbial communities that exist in the environment. Recent work also shows that we now have the technology to manipulate previously genetically intractable systems: the complete genome of Mycoplasma mycoides was transferred into yeast, altered using yeast genetic tools, and then transplanted back into a Mycoplasma cell to yield a new M. mycoides strain.
When considering the 'real-world' applications of synthetic biology such as bioremediation, the environmental impact and safety of the engineered organisms are important considerations. Introducing an engineered organism into a bioremediation site can be thought of as purposefully introducing an invasive species. Whether it is successful and competes with the native organisms depends on its relative fitness and its ability to evolve and adapt to its environment. Even though these engineered strains may be less fit and perhaps even less effective than the native species, they have the advantage that they can be engineered with a 'leash' or other system to prevent their unwanted spread. Such safeguards have been in place since the beginning of recombinant DNA research, and have been further developed over the years [66–68].
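The dependence of an introduced strain's success on its relative fitness can be illustrated with a standard textbook model of selection between two asexual types. This is only a sketch under strong assumptions: constant relative fitness, no mutation or adaptation, and a single well-mixed population. The function name `compete` and the parameter values are hypothetical, and real microbial communities are far more complex.

```python
def compete(p0, relative_fitness, generations):
    """Frequency of an engineered strain competing with a native population.

    p0               -- starting frequency of the engineered strain (0..1)
    relative_fitness -- its fitness relative to the native strain (native = 1.0)
    Returns a list of frequencies, one per generation, starting with p0.
    """
    freqs = [p0]
    p = p0
    for _ in range(generations):
        # Each type grows in proportion to its fitness, then the
        # frequencies are renormalised so that they sum to one.
        p = p * relative_fitness / (p * relative_fitness + (1.0 - p))
        freqs.append(p)
    return freqs

if __name__ == "__main__":
    # Assumed example: the engineered strain starts at 50% with a 10% fitness cost.
    for generation, freq in enumerate(compete(0.5, 0.9, 50)):
        if generation % 10 == 0:
            print(generation, round(freq, 4))
```

In this model the odds p/(1 - p) are multiplied by the relative fitness every generation, so even a modest fitness cost drives the engineered strain toward extinction unless it adapts or is continually reintroduced.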
One worry is that engineered strains will evolve around introduced safeguards, and Lesson 3 highlights ways in which we might address this possibility. Even so, the DNA of the engineered system could still be released after cell death and could be taken up by other bacteria in the ecosystem by natural transformation. How can we prevent the spread of engineered DNA by this route? If we could engineer strains that use an alternative genetic code, then even if the DNA gets transferred into other bacteria, translation would produce a functionless protein. This would similarly prevent 'natural' DNA accidentally imported into the engineered organism from being expressed. Alternative genetic codes exist in mitochondria and ciliates, and there are many examples of artificial alternative codes based on the tRNA synthetase system first developed by Schultz and co-workers. There are even translation systems that work orthogonally to the natural host system, and that would not function in bacteria that did not have the correct ribosomal apparatus.
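The idea that a gene written in one genetic code yields a functionless product in a host that uses another can be illustrated with a natural example. In the ciliate nuclear code, the codons TAA and TAG encode glutamine instead of acting as stop signals. The sketch below translates the same sequence under both codes. The example gene is invented for illustration, and the function `translate` is a hypothetical helper, not part of any particular library.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position.
STANDARD_AAS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
                "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
STANDARD = {"".join(codon): aa
            for codon, aa in zip(product(BASES, repeat=3), STANDARD_AAS)}

# Ciliate nuclear code: TAA and TAG are read as glutamine (Q), not stop.
CILIATE = dict(STANDARD, TAA="Q", TAG="Q")

def translate(dna, table):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = table[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# Invented example gene that uses TAA and TAG as glutamine codons.
gene = "ATGAAATAAGGTTAGCTGTGA"

print(translate(gene, CILIATE))   # MKQGQL (full-length product)
print(translate(gene, STANDARD))  # MK (truncated at the first TAA)
```

A recipient cell that reads this gene with the standard code terminates translation at the first reassigned codon, which is the kind of barrier to expression that the text proposes exploiting.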
Whatever the strategy we choose to follow to prevent unwanted spread, understanding the interplay between ecology and synthetic biology is critical to predicting how an engineered system might evolve in and interact with a natural environment. Once we take our engineered system out of the laboratory, whether into an industrial fermentation tank, the environment (for example, bioremediation) or a human host (for example, a therapeutic organism), we need to understand how our design will evolve according to the selective pressures of its environment, and how it will affect the ecology of its environment. The synthetic biologist is constantly in a state of tension - on one hand, exploiting the mechanisms of evolution to engineer more complex biological systems, and on the other trying to keep the design robust to evolution once it is released. Once introduced into the environment, the engineered biological system also needs to 'play well with others' and not adversely disrupt the natural ecology. There are complex considerations, both ethical and legal, in releasing genetically modified bacteria into the environment for study or application or even in disclosing the technology that enables the engineering of organisms able to survive in the outside world. However, having a deeper understanding of the interplay between evolution, ecology and synthetic biology will be critical in moving our designs 'beyond the bioreactor' into the real world where they can safely and effectively benefit society.
JBL and APA acknowledge the support of the Synthetic Biology Engineering Research Center under NSF grant number 04-570/0506186. JBL acknowledges the Miller Institute for financial support. JMS and APA would also like to acknowledge support of the Energy Biosciences Institute, University of California, Berkeley.