Textbook of Personalized Medicine - Second Edition [2015]



human genome with an average density of one INDEL per 7.2 kb of DNA. Variation
hotspots were identified with up to 48-fold regional increases in INDEL and/or SNP
variation compared with the chromosomal averages for the same chromosomes.
The scientists expect to expand the map to between 1 and 2 million INDELs by continuing
their efforts with additional human sequences. INDELs can be grouped into five
major categories, depending on their effect on the genome:



  1. Insertions or deletions of single base pairs

  2. Expansions by only one base pair (monomeric base pair expansions)

  3. Multi-base pair expansions of 2–15 bp repeat units

  4. Transposon insertions (insertions of mobile elements)

  5. Random DNA sequence insertions or deletions.
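The five categories above can be illustrated with a minimal Python sketch. It is a simplified heuristic, not the method used in the study: it inspects only the inserted or deleted sequence, whereas a real classification would also examine the flanking reference sequence. The function names, the `is_transposon` flag (standing in for a lookup against a mobile-element library) and the requirement of at least two copies of a repeat unit are all assumptions made for illustration.

```python
def smallest_repeat_unit(seq: str) -> str:
    """Return the shortest unit that, repeated end to end, spells out seq."""
    n = len(seq)
    for size in range(1, n + 1):
        if n % size == 0 and seq[:size] * (n // size) == seq:
            return seq[:size]
    return seq


def classify_indel(seq: str, is_transposon: bool = False) -> int:
    """Assign an INDEL sequence to one of the five categories (1-5).

    seq           -- the inserted or deleted bases (hypothetical input format)
    is_transposon -- assumed to come from an external mobile-element annotation
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("INDEL sequence must not be empty")
    if is_transposon:
        return 4  # transposon (mobile element) insertion
    if len(seq) == 1:
        return 1  # single base pair insertion or deletion
    unit = smallest_repeat_unit(seq)
    copies = len(seq) // len(unit)
    if len(unit) == 1:
        return 2  # monomeric expansion, e.g. AAAA
    if 2 <= len(unit) <= 15 and copies >= 2:
        return 3  # multi-base pair expansion, e.g. CAGCAGCAG
    return 5  # no internal repeat structure: treated as random sequence
```

For example, `classify_indel("CAGCAGCAG")` falls into category 3, while a three-base deletion such as `"CTT"` has no internal repeat and falls into category 5 under this heuristic.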


INDELs are already known to cause human diseases. For example, cystic fibrosis
is frequently caused by a three-base-pair deletion in the CFTR gene, and DNA
insertions called triplet repeat expansions are implicated in fragile X syndrome and
Huntington’s disease. Transposon insertions have been identified in hemophilia,
muscular dystrophy and cancer. INDEL maps will be used together with SNP maps
to create a single unified map of variation that can identify specific patterns of
genetic variation to help predict the future health of an individual. The next phase of
this work is to determine which of these variants correspond to changes in human health and
to develop personalized treatments. All the INDELs identified in the study have
been deposited into dbSNP, a publicly available SNP database hosted by the
National Center for Biotechnology Information. The National Human Genome
Research Institute of the NIH funded the research.
The GeneVa™ structural genomic variations platform (Compugen) provides predicted
non-SNP, medium- and large-scale genetic variations in the human genome.
Currently, it incorporates a database, developed during the past year, of approximately
200,000 novel predicted insertions, deletions and copy-number variations in
the human genome. This database was created by analyzing genomic, EST
(Expressed Sequence Tag), disease-related and other databases. A specialized
computational biology analysis platform was developed to handle and integrate
these disparate data sources, identify possible genomic structural variations and predict
their association with specific disease pathways such as those associated with
breast and colon cancer, type II diabetes and Parkinson’s disease.


Large Scale Variation in Human Genome


Large-scale differences in the DNA of healthy people have been revealed, which
challenge previous findings and reveal a largely ignored source of genome variation.
This finding implies that healthy persons can have large portions of DNA that
are repeated or large portions that are missing for no known reason. This previously
unappreciated heterogeneity may underlie certain human phenotypic variation and
susceptibility to disease and argues for a more dynamic human genome structure.


1 Basic Aspects