two proteins, α-globin and β-globin. In the amphibian Xenopus laevis the genes for both globin types are
found in the same region of the genome. In birds the two gene families have split from each other and this
is also seen for mammals, where there has been an increase in gene number and complexity. Mammals
need to supply their unborn young with oxygen and this cannot happen efficiently unless the foetal globin
can sequester oxygen from the adult globin. For this to happen, the protein sequence, and hence the DNA
sequence, of foetally expressed globin must diverge from that of the adult. Thus natural selection has
promoted the proliferation and diversification of haemoglobin genes.
The globin locus of humans shows several other interesting features (Figure 6.7). First, there are two
foetal genes (Gγ and Aγ) that encode nearly identical proteins. This is probably the result of an evolutionarily
recent duplication of the foetal gene in this lineage. Second, we see several examples of defective genes
carrying frame shifts and stop codons. These pseudogenes^16 may be the result of gene evolution having
taken a wrong path. Pseudogenes are very common in mammals but surprisingly quite rare in plants.
Third, each globin gene has the same transcriptional orientation. This is also true for all vertebrates and
is a consequence of the mechanism of gene duplication, which involves unequal homologous recombination
between repeated sequences flanking the genes (Figure 6.8).
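The unequal exchange described above can be illustrated with a small toy model. The sketch below is only an illustration under simplifying assumptions: a chromosome is represented as a list of labels, "R" stands for a flanking repeat, "gene>" stands for a gene whose arrow marks its transcriptional orientation, and the function name unequal_crossover is hypothetical. It shows why one misaligned exchange yields one product with a duplicated gene and one with a deletion, and why the duplicated copies keep the same orientation.

```python
from typing import List, Tuple

# Toy representation (an assumption for illustration only):
# "R" is a flanking repeat, "gene>" is a gene transcribed left to right.
Chromosome = List[str]


def unequal_crossover(
    chrom_a: Chromosome, chrom_b: Chromosome, i: int, j: int
) -> Tuple[Chromosome, Chromosome]:
    """Recombine chrom_a and chrom_b at paired repeats.

    The repeat at position i of chrom_a pairs with the repeat at
    position j of chrom_b. When i and j differ, the pairing is
    misaligned and the exchange is unequal.
    """
    if chrom_a[i] != "R" or chrom_b[j] != "R":
        raise ValueError("crossover must occur between two repeats")
    # Left part of one chromosome is joined to the right part of the other.
    recombinant_1 = chrom_a[: i + 1] + chrom_b[j + 1 :]
    recombinant_2 = chrom_b[: j + 1] + chrom_a[i + 1 :]
    return recombinant_1, recombinant_2


# Two identical homologues, each with one gene flanked by repeats.
parent: Chromosome = ["R", "gene>", "R"]

# Misaligned pairing: downstream repeat of one homologue (index 2)
# pairs with the upstream repeat of the other (index 0).
duplicated, deleted = unequal_crossover(parent, parent, 2, 0)

print("duplication product:", " ".join(duplicated))
print("deletion product:   ", " ".join(deleted))
```

Because both gene copies in the duplication product are inherited without inversion, they point the same way, which is consistent with the shared transcriptional orientation of the globin genes. Repeating the exchange on the duplication product would give three copies, again all in the same orientation.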
6.3 Intergenic DNA
In prokaryotes there is very little extra DNA besides that encoding genes. However, the genomes of
eukaryotes are very different. For example, a typical stretch of the maize genome contains relatively few
genes and a large amount of repetitious DNA (Figure 6.9).^17 But this is not a fixed rule; for example,
Figure 6.8 Gene duplication and deletion promoted by unequal exchange between flanking repetitious DNAs
Figure 6.9 Intergenic DNA in plants. Maize, a plant with a large genome, has complex sets of retrotransposon insertions
(Section 6.10.3) between two genes (indicated by arrows). Arabidopsis thaliana has six genes with relatively
little intergenic DNA