patterns” (previously identified as their core sequence, ApA dinucleotides
[91, 92]), spanning 5–10 helix turns, and present every few kilobases of the whole
genome. These patterns have been repeatedly observed and usually thought to
result from the local enrichment of ApA dinucleotides, but likely to result from
more complex patterns [93–95]. This constraint is so strong that it appears to
bias the nature up to one in five nucleotides in the genome [95]. In general A‐
tracts have been related to DNA curvature, and they are expected to play a con
siderable role in DNA compaction and regulation of gene expression [96–98].
The importance of this feature has not yet been explored in SynBio constructs.
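As a rough illustration of how such an enrichment could be looked for in a construct, the sketch below lists the positions of ApA dinucleotides in a sequence and tabulates the distances between them. It is a minimal sketch and not the method used in the cited studies: the function names `apa_positions` and `spacing_histogram`, the `max_distance` cutoff, and the toy sequence are all assumptions made for the example. A surplus of distances near multiples of one helix turn (roughly 10–11 bp in B‐DNA) would be the kind of signal to examine further.

```python
from collections import Counter


def apa_positions(seq):
    """Return 0-based start positions of every ApA ('AA') dinucleotide.

    Overlapping occurrences are counted, so 'AAA' yields two positions.
    """
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AA"]


def spacing_histogram(positions, max_distance=40):
    """Count distances between all pairs of ApA positions.

    Only pairs separated by at most `max_distance` bases are counted
    (hypothetical cutoff, chosen to cover a few helix turns).
    `positions` must be sorted in increasing order.
    """
    hist = Counter()
    for idx, start in enumerate(positions):
        for other in positions[idx + 1:]:
            distance = other - start
            if distance > max_distance:
                break  # positions are sorted, so later ones are farther away
            hist[distance] += 1
    return hist


# Toy sequence (made up): ApA repeated every 10 bases.
toy = "AA" + "C" * 8 + "AA" + "C" * 8 + "AA"
positions = apa_positions(toy)
histogram = spacing_histogram(positions)
print(positions)        # [0, 10, 20]
print(dict(histogram))  # {10: 2, 20: 1}
```

On a real genome or construct the histogram would need to be compared with a shuffled or composition‐matched control, since an A‐rich sequence produces many short distances by chance alone.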
Managing space is also essential to organize gene expression (references in
[84, 90, 99, 100]). Indeed, while the DNA molecule is a linear structure, the
membrane is a 2D structure and the cytoplasm is a 3D structure; they all need to work
in concert. Allowing coordination of the different space scales, gene expression
[101] and distribution of genes within transcription units are finely tuned in
most Bacteria and Archaea, in particular in terms of coordination of metabolic
fluxes [84, 102]. For example, in the lactose operon, the gene for cytoplasmic
beta‐galactosidase, lacZ, is separated from that of the membrane protein lactose
permease, lacY, by a regulatory transcription attenuator. This results in
considerably less expression of the distal genes lacY and lacA, as compared with that of
lacZ, and allows matching the production level of the cytoplasmic enzyme with
that of the membrane transport protein [103]. In general there is a relationship
between the genome organization and the pattern of transcripts and protein
distribution in the cell [86, 104].
The genome DNA is considerably longer than the cell itself, and this allows
folding of the chromosome in a way that can compensate for the one dimension/
three dimensions dichotomy. Furthermore there seems to exist a relationship
between the overall cell architecture and that of the genome; Tamames and
coworkers found a remarkable correlation between the distribution of genes in the
mur‐fts gene clusters and the overall shape of the cell [105, 106]. This
observation may fit with the view that transcripts are systematically distributed in
specific regions of the cell, as shown by local biases in codon usage, forming islands
10–30 kb long [87] in agreement with the data reviewed by Willenbrock and
Ussery [100]. In general, analysis of the folding of the chromosome revealed the
existence of a core structure linking together between 12 and 80 loops per
chromosome [88, 107]. Many studies have explored the role of the distribution of the
genes in the bacterial chromosome, in particular with the prospect of improving
gene expression in biotechnological constructs (see [108] for further references).
Despite the widespread view of the chromosome as extremely plastic, it rapidly
appeared that while some regions were prone to harbor a variety of genes, others
remained fairly constant. Indeed, macrodomain organization appears to display
rigid constraints that limit genome plasticity [109]. This was further illustrated
with the comparison between a large number of Escherichia coli strains
[110, 111]. It was also found that functionally related genes clustered together
into islands in a way that should have considerable impact on gene expression
[100, 108, 112].
The two extremes of gene distribution are clustering and its opposite,
uniform distribution (which creates an apparently periodical distribution, so that