strategies (a whole-genome assembly and a regional chromosome assembly)
were used, each combining sequence data from Celera and the publicly
funded genome effort. Analysis of the genome sequence revealed 26,588
protein-encoding transcripts for which there was strong corroborating
evidence and approximately 12,000 additional computationally derived genes
with mouse matches or other weak supporting evidence. Although gene-dense
clusters are obvious, almost half the genes are dispersed in low G+C sequence
separated by large tracts of apparently noncoding sequence. Only 1.1% of
the genome is spanned by exons, whereas 24% is in introns, with 75% of the
genome being intergenic DNA. DNA sequence comparisons between the
consensus sequence and publicly funded genome data provided locations of
2.1 million single-nucleotide polymorphisms (SNPs). An SNP is a change in
which a single base in the DNA under study differs from the usual base at that
position. Many SNPs are normal variations in the genome, whereas others are
responsible for diseases such as sickle cell anemia. A random pair of human
haploid genomes differed at a rate of 1 bp per 1250 on average, but there was
marked heterogeneity in the level of polymorphism across the genome. Less
than 1% of all SNPs resulted in variation in proteins, but the task of
determining which SNPs have functional consequences remains an open challenge.
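
The idea of locating SNPs by comparing one sequence against another can be illustrated with a minimal Python sketch. It assumes two already-aligned sequences of equal length and ignores insertions, deletions, and sequencing error, all of which a real comparison must handle. The function names and the two 20-base sequences are invented for illustration and are not taken from the genome data.

```python
def find_snps(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (position, reference_base, sample_base) for each single-base mismatch.

    Positions are 0-based. Both sequences are assumed to be pre-aligned.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, r, s)
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    ]


def bases_per_difference(reference: str, sample: str) -> float:
    """Average number of bases per single-base difference (infinite if identical)."""
    snps = find_snps(reference, sample)
    return len(reference) / len(snps) if snps else float("inf")


# Hypothetical toy sequences, far shorter than any real genomic region.
reference = "ACGTACGTACGTACGTACGT"
sample = "ACGTACGAACGTACGTACCT"

snps = find_snps(reference, sample)
rate = bases_per_difference(reference, sample)
print(snps)  # two mismatches, at positions 7 and 18
print(rate)  # 10.0, i.e. one difference per 10 bases in this toy example
```

The toy pair differs once per 10 bases, far more often than the 1 bp per 1250 average quoted above for a random pair of human haploid genomes.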
An interesting short history of the effort to sequence the human genome,
"Controversial from the Start" by reporter Leslie Roberts,
was published in Science magazine in 2001.^22 Scientists continue to use the
information generated by the human genome sequencing publications to
understand how genes function, how genetic variations predispose the
organism to disease, and how gene function can be used in disease detection,
prevention, and treatment regimens.
2.4 ZINC-FINGER PROTEINS
Zinc-finger proteins, discussed briefly here, provide examples of a bioinorganic
topic intimately associated with biochemical knowledge of both proteins and
nucleic acids. Sporting a well-recognized finger-like motif, these proteins are
known to participate in one of the many and varied protein–DNA interactions
that command the attention of bioinorganic researchers. It was known in the
1970s that zinc was crucial to DNA and RNA synthesis and to cell division.
In the 1980s it was discovered that transcription factor IIIA (TFIIIA) from
the African clawed toad Xenopus contained 2–3 mol zinc/mol of protein.^23 TFIIIA
is a site-specific DNA-binding regulatory protein that activates the
transcription of the 5S RNA gene into RNA. It was found that protein isolated from
the 5S RNA complex, containing zinc, was bound to a 45-base-pair DNA
sequence and that the protein protects the DNA from nuclease digestion.^2 It
was soon discovered that the two cysteine (cys) and two histidine (his) residues
per 30-amino-acid unit of TFIIIA form a tetrahedral coordination complex
with each of 7–11 zinc ions. These generate peptide domains, now called