Genetic Analysis 293
many repeats (the Alu family, for example, is represented by ∼300,000 members,
interspersed with nonrepetitive DNA, in the haploid human genome). Although
transposons may be of parasitic origin, many have turned out to fulfill important
(regulatory) functions in the eukaryotic genome. These, and other repeated sequences,
comprise a significant proportion of the genome: only 1.5-2% of the human genome
is today considered to code for proteins.
In the years that followed, the neat bottom-up picture of the Central Dogma,
of DNA transcribed into messenger RNA that is translated on the ribosomes
sequentially and unequivocally into a polypeptide that folds into a protein, had to
be greatly modified.
An increasing number of specialized RNA-polymerases were found to be in-
volved in the transcription process. The reductionist negative feedback gene reg-
ulation, in which the resources or the products interact automatically with the
regulator protein [Monod et al., 1963] was found to be only one component of
regulation. Differential activation by the interaction of a very large number of
transcription factors, many of which attach in rather Baroque constellations to
the DNA sequences of the promoter regions, provided the framework of cell-driven,
top-down positive gene regulation by directing the RNA polymerases to the
site of transcription initiation. These proteins are crucial to transcription initia-
tion in all eukaryotic cells. More than 2,000 transcription factors are encoded in
the human genome. Furthermore, various proteins that affect the binding of the
initiation factors bind to the regulatory sequences upstream of the promoter. The
rate of transcription is regulated by interaction with another kind of sequence,
the enhancer sequences, sometimes many thousands of nucleotides away from the
transcribed sequence, which act by binding activator proteins that stimulate (or
inhibit) the transcription complex. (For a review, see, e.g., Lewin [2004].)
The concept of the gene as a basic structural DNA entity of heredity actually
received its final blow with the discovery that genes of eukaryotes as a rule are
not sequences of continuous coding information for translation into polypeptide
sequences. Rather, DNA sequences are transcribed into heterogeneous nuclear
RNAs, which are processed by a complex, cell-directed splicing mechanism that
excises several sections (introns) from the transcribed sequences, so that only
the remaining series of sequences (exons) is eventually translated into polypeptide
sequences of amino acids. Furthermore, as a rule, alternative splicing allows
the exons of a given transcribed sequence to be combined in more than one way
for translation. Often the same primary transcribed sequence may be shuffled
into several, sometimes very many, alternative RNA sequences. (For a review see
Maniatis and Tasic [2002].)
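
The combinatorial point about alternative splicing can be made concrete with a toy sketch. It is an illustration only, under loud assumptions: the transcript string is invented, exons are marked simply by uppercase letters, and the only splicing mode modeled is the skipping of internal exons, with every combination treated as possible. Real splicing is regulated by the cell and realizes only some of these combinations. The function names `exon_spans`, `splice` and `alternative_isoforms` are hypothetical.

```python
import re
from itertools import combinations

def exon_spans(pre_mrna):
    """Return (start, end) spans of exons; in this toy encoding exons are uppercase runs."""
    return [m.span() for m in re.finditer(r"[A-Z]+", pre_mrna)]

def splice(pre_mrna, exons):
    """Join the given exon spans in order, dropping everything between them (introns)."""
    return "".join(pre_mrna[start:end] for start, end in exons)

def alternative_isoforms(pre_mrna):
    """Enumerate isoforms that keep the first and last exon and any subset
    of the internal exons, always in their original order (exon skipping only)."""
    first, *internal, last = exon_spans(pre_mrna)
    isoforms = []
    for k in range(len(internal) + 1):
        for kept in combinations(internal, k):
            isoforms.append(splice(pre_mrna, [first, *kept, last]))
    return isoforms

# Invented transcript: uppercase = exon, lowercase = intron (assumption for illustration).
pre_mrna = "AUGGCguaagCCAUUguaagAGGAguaagUAA"
isoforms = alternative_isoforms(pre_mrna)
for rna in isoforms:
    print(rna)
```

With two internal exons the toy transcript yields four mature sequences; under the same assumptions, n internal exons would yield 2^n, which is one way to see how a single primary transcript can give rise to very many alternative RNA sequences.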
With an increasing number of further editing processes of the pre-translational
RNAs, such as exon repetition (homotypic trans-splicing), co-transcriptional splic-
ing between two ORFs, anti-sense trans-splicing, alternative trans-splicing from
the same or from different chromosomes, antisense overlapping genes, overlapping
genes without shared sequences and others, with shared, but alternative reading
frames, and many more, the notion of the gene lost its original material concep-