Nucleic Acids in Chemistry and Biology

(Rick Simeone) #1

genomic sequences. But when these are carefully tested on well-characterised genomic regions containing
known genes, they often fall short, either by missing exons, by predicting exons to be complete genes or
by predicting exons where none exist. Such failings are particularly pronounced for complex genes such
as Ubx (Figure 6.4).
One way to help locate genes in DNA is to sequence large numbers of transcribed sequences, because
in general only genes are transcribed, and to compare them with the complete genome sequence.
Typically, a cDNA library is made by reverse transcription from the RNA of the organism, and
thousands of individual sequences are then determined. To find the rare RNAs, a single sequencing
experiment (typically ca. 500 bp) is often carried out on each of a large number of subclones,
rather than determining the complete sequence of relatively few RNAs.


6.5.2 Genome Maps


Once the genomic sequence is reasonably well ordered into accurate, large contiguous pieces (contigs),
which eventually extend to whole chromosomes, the cDNA sequences can be mapped onto the respective
genome. Such maps are useful in identifying genes, which may be associated with important traits, such
as the predisposition to inherited diseases. The gene map obtained from such a study can be aligned against
other important maps, showing the extents of large insert clones (BAC, YAC clones, etc., see Section 5.2.1).
BAC and YAC contigs are obtained by sequence analysis of the ends of randomly selected large insert
clones and then by a search of previously acquired data for identical sequences.
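The end-sequence search described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: the clone names, the sequences and the helper find_end_matches are all hypothetical, matching is exact and on one strand only, and a real analysis would also consider the reverse complement and tolerate sequencing errors.

```python
def find_end_matches(clone_ends, known_sequences):
    """Search previously acquired sequences for exact copies of clone ends.

    Returns a list of (clone, end_name, known_id, position) tuples, where
    position is the 0-based offset of the end sequence in the known sequence.
    """
    matches = []
    for clone, ends in clone_ends.items():
        for end_name, end_seq in ends.items():
            for known_id, known_seq in known_sequences.items():
                pos = known_seq.find(end_seq)  # -1 when there is no exact match
                if pos != -1:
                    matches.append((clone, end_name, known_id, pos))
    return matches


# Hypothetical toy data: one previously sequenced contig and two clones,
# each with a "left" and a "right" end read.
known_sequences = {"contigA": "ATGCCGTTAGGCTTAACGGATCC"}
clone_ends = {
    "clone1": {"left": "CCGTTAGG", "right": "TTTTTTTT"},
    "clone2": {"left": "GGCTTAAC", "right": "AAAAAAAA"},
}

matches = find_end_matches(clone_ends, known_sequences)
for clone, end_name, known_id, pos in matches:
    print(f"{clone} ({end_name} end) matches {known_id} at offset {pos}")
```

In this toy case both clones have one end anchored in the same contig, which is the kind of evidence used to place clones relative to one another.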


6.5.3 Molecular Marker Maps


Another important map, which can be aligned against the genome and cDNA maps, is a molecular marker
map. A molecular marker is any difference in DNA sequence observed at a precise genomic location
between two individuals of an organism, for example two human beings. Such differences represent just a
tiny fraction of the huge amount of genetic variation in a species and mostly lie in non-coding DNA that
is not subject to natural selection to preserve its sequence. Molecular markers are useful research tools in
that they can be mapped genetically in the same way as visible traits, such as Mendel’s pea seed traits. They
can also be physically mapped on genomic DNA. Indeed, there are now hundreds of times more molecular
markers mapped, both genetically and physically, on the human genome than there are genes identified.
Genetic markers that are tightly linked to particular gene variants (alleles) can be of medical importance. For
example, whether a baby carries a defective cystic fibrosis gene can now be assessed by a simple marker
assay on DNA isolated from a pinprick of blood, rather than having to clone and sequence the gene itself.


6.5.4 Molecular Marker Types


The first type of molecular marker is the restriction fragment length polymorphism (RFLP). RFLPs arise
when DNA sequence variants (usually point mutations or small insertions or deletions) create or
destroy a restriction enzyme cleavage site (Section 5.3.1), thereby altering the restriction map of
the genomic region in which they reside. Such DNA alterations can be detected either by Southern
blot analysis (Section 5.5.2) or, more often nowadays, by PCR (Section 5.2.2) followed by
restriction digestion of the amplified DNA to reveal the polymorphic restriction site.
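As a minimal sketch of why a point mutation shows up as a fragment length difference, the following example digests two hypothetical amplified alleles in silico. The sequences and the digest helper are assumptions made for illustration. The recognition site GAATTC, cut after the first G, is that of EcoRI. Only the top strand is modelled.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Return the fragment lengths produced by cutting seq at every site.

    cut_offset is the number of bases into the site at which the top
    strand is cleaved (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    start = 0
    while True:
        pos = seq.find(site, start)
        if pos == -1:
            break
        cuts.append(pos + cut_offset)
        start = pos + 1
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical PCR products from two alleles of the same locus.
# Allele B differs from allele A by a single A-to-G change inside the site.
allele_a = "ATGCATGCAT" + "GAATTC" + "CCGGTTAACC"
allele_b = "ATGCATGCAT" + "GAGTTC" + "CCGGTTAACC"

fragments_a = digest(allele_a)  # site present: two fragments
fragments_b = digest(allele_b)  # site destroyed: product stays uncut
print(fragments_a, fragments_b)
```

On a gel, allele A would give two bands and allele B a single band at the full length of the amplified product, so the two alleles can be distinguished without sequencing.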
Two more important molecular marker types in use now are microsatellites^31 and single nucleotide
polymorphisms (SNPs) (Section 5.5.3).^32 Microsatellites are also called simple sequence repeats (SSRs).
SSRs contain a varying number of repeats of a short unit, typically 2–3 base pairs long. At a given locus
(genomic region), one individual might have six repeats of the dinucleotide GT whilst another may have
nine such repeats. These differences are revealed by DNA amplification of the region containing the
repeat and by determination of its length. SNPs are merely single nucleotide changes in a given genomic
region, e.g. the substitution of a G by an A at position 543. Much effort is currently being invested in
finding cheap and efficient methods for identifying such SNPs.
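The two marker types can be illustrated with a short sketch. Everything here is hypothetical: the flanking sequences, the example alleles and the helper names longest_repeat_run and find_snps are invented for illustration, and the SNP comparison assumes two already aligned sequences of equal length.

```python
import re


def longest_repeat_run(seq, unit="GT"):
    """Count the tandem copies of unit in the longest uninterrupted run."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


def find_snps(ref, alt):
    """List (1-based position, ref base, alt base) for each mismatch."""
    if len(ref) != len(alt):
        raise ValueError("sequences must be aligned and of equal length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(ref, alt), start=1) if a != b]


# Microsatellite: the same locus amplified from two individuals,
# one with six GT repeats and one with nine.
left_flank, right_flank = "ACCTA", "CAGGA"
individual_1 = left_flank + "GT" * 6 + right_flank
individual_2 = left_flank + "GT" * 9 + right_flank

repeats_1 = longest_repeat_run(individual_1)
repeats_2 = longest_repeat_run(individual_2)
length_difference = len(individual_2) - len(individual_1)

# SNP: a G replaced by an A at one position of a short region.
snps = find_snps("ACGTGCA", "ACGTACA")
print(repeats_1, repeats_2, length_difference, snps)
```

The three extra GT units make the second amplified product 6 bp longer, which is the length difference that is measured in practice, whereas the SNP leaves the length unchanged and has to be detected by other means.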

