42 3 Aspects of the Molecular Biology of Microorganisms of Relevance to the Aquatic Environment
mixed and electrophoresed together. Because the
ddNTPs are of different colors, a scanner can scan the
gel and record each color (nucleotide) separately. This
method is suitable for relatively short fragments of DNA,
of about 700–800 nucleotides.
3.4.2 Sequencing of Genomes or Large DNA Fragments
The best-known example of the sequencing of a genome is
perhaps the human genome, which was completed a
few years ago. Two approaches were followed during
the sequencing of the human genome: the use of
bacterial artificial chromosomes (BACs) and the
shotgun approach.
3.4.2.1 Use of Bacterial Artificial Chromosomes
The Human Genome Project was publicly funded, and
the National Institutes of Health and the National
Science Foundation have funded the creation of
“libraries” of BAC clones. Each BAC carries a large
piece of human genomic DNA, on the order of
100–300 kb. All of these BACs overlap randomly, so
that any one gene is probably on several different
overlapping BACs. The BACs can be replicated as
many times as necessary, so there is a virtually endless
supply of each large human DNA fragment. In the
publicly funded project, the BACs were subjected to
shotgun sequencing (see below) to determine their
sequence. Once all the BACs have been sequenced,
enough of the sequence is known in overlapping
segments to reconstruct the original chromosome
sequence.
3.4.2.2 Use of the Shotgun Approach
An innovative approach to sequencing the human
genome was pioneered by a privately funded sequencing
project: Celera Genomics. The founders of this
company realized that it might be possible to skip the
entire step of making libraries of BAC clones. Instead,
they blasted apart the entire human genome into
fragments of 2–10 kb and sequenced those. The challenge
was then to assemble those fragments of sequence into
the whole genome sequence. The BAC approach is like
having hundreds of 500-piece puzzles, each being
assembled by a team of puzzle experts using
puzzle-solving computers. Those puzzles are like BACs:
smaller puzzles that make a big genome manageable.
Celera, in effect, threw all those puzzles together into
one room and scrambled the pieces. The company,
however, had scanners to scan all the puzzle pieces and
huge, powerful computers to fit the pieces together.
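The puzzle-fitting step can be illustrated with a minimal sketch of greedy overlap assembly, written here in Python. This is a toy illustration only, not the method actually used by Celera: the function names (`overlap`, `greedy_assemble`), the `min_len` threshold, and the example reads are all hypothetical, and real assemblers must also cope with sequencing errors, repeats, and reads from both strands.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` nucleotides exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of fragments with the longest overlap.

    Assumes error-free reads that all come from the same strand.
    Returns the list of contigs left when no overlaps remain.
    """
    reads = list(dict.fromkeys(reads))  # drop exact duplicates
    # Drop reads that are wholly contained in another read.
    reads = [r for r in reads if not any(r != s and r in s for s in reads)]
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # remaining fragments do not overlap
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Hypothetical fragments of the toy sequence ATGGCGTACGTTAGCCTAGGA,
# given in scrambled order.
fragments = ["TTAGCCTAGGA", "ATGGCGTAC", "GTACGTTAG"]
contigs = greedy_assemble(fragments)
print(contigs)
```

With these three fragments the sketch recovers a single contig, because each neighboring pair shares a four-nucleotide overlap. In a real genome, repeated sequences make many overlaps ambiguous, which is why the task requires far more computing power than this example suggests.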
3.5 The Open Reading Frame and the Identification of Genes
Regions of DNA that encode proteins are first transcribed
into messenger RNA and then translated into
protein. By examining the DNA sequence alone, we
can determine the putative sequence of amino acids
that will appear in the final protein. In translation,
codons of three nucleotides determine which amino
acid will be added next in the growing protein chain.
The start codon is usually AUG, while the stop codons
are UAA, UAG, and UGA (in the DNA sequence, ATG
and TAA, TAG, and TGA, respectively). The open reading
frame (ORF) is that portion of a DNA segment that
putatively codes for a protein; it begins with a start
codon and ends with a stop codon.
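The definition of an ORF lends itself to a short sketch in Python. The code below scans the three forward reading frames of a DNA string for a start codon followed in-frame by a stop codon, then translates each ORF using the standard genetic code. The names `find_orfs` and `translate`, the `min_codons` parameter, and the example sequence are hypothetical. The sketch ignores the reverse strand, introns, and alternative start codons, so it shows the idea rather than serving as a gene-finding tool.

```python
START_CODON = "ATG"                 # AUG in the mRNA
STOP_CODONS = {"TAA", "TAG", "TGA"}  # UAA, UAG, UGA in the mRNA

# Standard genetic code, written with DNA letters (T instead of U).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for ORFs on the forward strand.

    `start` and `end` are 0-based positions; `end` is exclusive and
    includes the stop codon. `min_codons` counts the stop codon too.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, dna[start:end]))
                start = None
    return orfs


def translate(orf):
    """Translate an ORF into one-letter amino acid codes, stopping at the stop codon."""
    protein = []
    for i in range(0, len(orf) - 2, 3):
        amino_acid = CODON_TABLE[orf[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical example sequence containing one short ORF.
example = "CCATGGCGTACTAAGG"
for start, end, seq in find_orfs(example):
    print(start, end, seq, translate(seq))  # prints: 2 14 ATGGCGTACTAA MAY
```

In the example, the ORF starts at the ATG in the third reading frame and ends at TAA, giving the putative peptide Met-Ala-Tyr (MAY in one-letter code).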
Fig. 3.7 Diagram illustrating an autoradiograph of a sequencing
gel from the chain-terminating DNA sequencing method (the arrow
shows the direction of electrophoresis; by convention, the
autoradiograph is read from bottom to top)
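The bottom-to-top reading convention follows from the fact that the shortest fragments migrate farthest down the gel. A minimal Python sketch of this reading step is given below. The function name `read_gel` and the band positions are hypothetical, and each band is represented simply by the length of its fragment (in nucleotides added to the primer) rather than by a measured migration distance.

```python
def read_gel(lanes):
    """Read a chain-termination gel from bottom (shortest) to top (longest).

    `lanes` maps each ddNTP lane ("A", "C", "G", "T") to the lengths of
    the fragments seen as bands in that lane. The result is the sequence
    of the newly synthesized strand, 5' to 3'.
    """
    bands = []
    for base, lengths in lanes.items():
        for length in lengths:
            bands.append((length, base))
    bands.sort()  # shortest fragment first, i.e. the bottom of the gel
    return "".join(base for _, base in bands)


# Hypothetical gel with six bands spread over the four lanes.
gel = {"A": [1, 5], "C": [3], "G": [2, 6], "T": [4]}
print(read_gel(gel))  # prints: AGCTAG
```

The sequence obtained this way is that of the synthesized strand, which is complementary to the template being sequenced.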