genetically stable. Even if mutagenesis and evolvability cannot be totally
repressed, genetic change should be kept at the minimum to preserve the
designed functionality. Third, by eliminating dispensable, energy-consuming
components, the chassis should function more economically and utilize the
resources efficiently, allowing high-yield product formation under well-defined
conditions. In addition, the biological chassis should be safe for health and for
the environment. Embedding genetic barriers in the blueprint can prevent
accidental release and genetic mixing with natural organisms.
Construction of a simple cell can be attempted in two ways. On one hand,
building genomes from scratch using synthetic oligonucleotide assemblies is an
approach of great potential [3]. Despite the theoretical challenges of bottom-up
genome design, the toolbox of genome assembly and transplantation into a living
cell is undergoing continuous development [24–26]. The ambitious project of
synthesizing a minimal genome (a genome comprising only essential genes),
as accomplished for Mycoplasma mycoides, may therefore become general practice one day
[15]. On the other hand, rational simplification and optimization of existing
robust cells in routine laboratory use is a less challenging and less risky endeavor.
Beyond the elimination of unnecessary genes (streamlining), creating a chassis
might involve other modifications as well: altering the genetic code (codon
swaps), introducing non-interfering subsystems (orthogonality), and redesigning
and rewiring (optimization) [6, 21]. Here we will discuss genome streamlining
by focusing on the reduction of the E. coli genome.
By genome streamlining, we do not mean creating an absolute minimal set of
genes required for life. Rather, the aim here is to produce a significantly
reduced genome that retains all the important genes required for robust growth
and easy genetic manipulation in a practical laboratory or industrial setting.
4.3 The E. coli Genome
E. coli is an important commensal and pathogen, an excellent model for research,
and one of the most widely used industrial organisms. Among thousands of
isolates, five strains (K-12, B, C, Crooks, and W) and their derivatives have been
used extensively in laboratories for over 70 years [27]. Biotechnological
applications range from production of commodity chemicals and biofuels to vaccine
development and bioremediation. Notably, nearly 30% of approved recombinant
therapeutic proteins are currently produced in E. coli.
The popularity of E. coli is owed to its versatility, simple culturability, and ease of
genetic manipulation. E. coli can utilize a wide range of carbon and energy
sources, is capable of aerobic growth and anaerobic fermentation, and can
survive not only in the intestinal tract but also in the outside environment. The
versatility of the bacterium is reflected in its relatively large (4.5–5.5 Mb) [28],
high gene-density genome.
The genome sequence of the prototype laboratory strain K-12 MG1655
became available in 1997 [18] (selected features shown in Figure 4.1). The
4.6 Mb chromosome contains ~4300 protein-coding genes, accounting for
about 88% of the genome. The remaining part encodes stable RNAs (0.8%) and