multiple susceptibility alleles, each with a small effect size
(typically increasing disease risk to between 1.2 and 2 times the
population risk) [1]. GWAS became feasible several years ago, once the
necessary resources and technologies were in place: a catalog of human
genetic variants, low-cost and accurate genotyping methods for
identifying gene variants, large numbers of informative samples, and
efficient statistical designs for analysis.
Since the Human Genome Project (HGP) was completed in 2003, many DNA
sequence variants have been identified and made available for use in
GWAS. The International HapMap Project then provided the locations of
~4 million common SNPs in populations of different geographical
origins, together with the allelic associations between SNPs, termed
linkage disequilibrium (LD). With these resources, we can now search
for disease-predisposing genetic variants for complex traits [2].
Using high-throughput genotyping technology, GWAS test for association
between hundreds of thousands of SNPs (usually called tag SNPs) and
clinical conditions or measurable traits.
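As a rough illustration of what testing a single SNP for association
involves, the sketch below runs an allelic chi-square test on a 2x2
table of allele counts in cases and controls and reports the odds
ratio. The text above does not specify a test; this common choice, the
function name allelic_test, and the counts in the example are all
assumptions made for illustration.

```python
import math


def allelic_test(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Allelic chi-square test (1 degree of freedom) for one SNP.

    Arguments are allele counts (hypothetical names):
      case_alt, case_ref: alternate/reference alleles seen in cases
      ctrl_alt, ctrl_ref: alternate/reference alleles seen in controls
    Returns (chi2, p_value, odds_ratio).
    """
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0 or b * c == 0:
        raise ValueError("counts too sparse for this simple test")

    # Pearson chi-square statistic for a 2x2 table, no continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / margins

    # For 1 degree of freedom, the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))

    # Odds of carrying the alternate allele in cases relative to controls.
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio


if __name__ == "__main__":
    # Made-up counts: 1,000 cases and 1,000 controls (2,000 alleles each).
    chi2, p_value, odds_ratio = allelic_test(600, 1400, 500, 1500)
    print(f"chi2={chi2:.2f}  p={p_value:.2e}  OR={odds_ratio:.2f}")
```

With these made-up counts the odds ratio is about 1.29, the kind of
modest effect that GWAS are designed to detect. A real study would
repeat such a test for every genotyped SNP and correct for multiple
testing.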
1.1 Concepts Underlying the Study Design
The ultimate goal of GWAS is to test a large portion of the common
single-nucleotide genetic variation for association with a complex
disease or with variation in a quantitative trait. Understanding the
biological basis of these genetic effects is important for developing
new prevention and treatment strategies, and new medical therapies,
for those who are at risk.
Single-nucleotide polymorphisms (SNPs) are single base-pair changes in
the DNA sequence that occur with high frequency in the human genome;
they are also known as the modern units of genetic variation. These
genetic polymorphisms have proven to be very useful as genetic markers
and can be used to detect disease variants via LD. Because of this
correlation among SNPs, genotyping only a set of informative SNPs that
serve as proxy markers (usually called tagging SNPs, with r² > 0.8) is
sufficient to capture most of the genetic information of the SNPs that
are not genotyped, with only a slight loss of statistical power. r² is
a measure of “correlation,” or LD, between two SNPs; its value ranges
from 0 to 1 (an r² of 1 indicates complete LD). r² depends on both the
allele frequencies of the two SNPs and the recombination between them.
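To make the definition concrete, r² for two biallelic SNPs can be
computed from haplotype and allele frequencies as
r² = D² / (pA(1 - pA) pB(1 - pB)), where D = pAB - pA pB. The Python
sketch below is a minimal illustration; the function names and the
example frequencies are hypothetical, and the 0.8 cutoff is the
tagging threshold quoted above.

```python
def ld_r2(p_ab, p_a, p_b):
    """r^2 between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at the first SNP
          and allele B at the second SNP
    p_a:  frequency of allele A at the first SNP
    p_b:  frequency of allele B at the second SNP
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both SNPs must be polymorphic")
    # D is the deviation of the observed haplotype frequency from the
    # frequency expected if the two alleles were independent.
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def is_tag(r2, threshold=0.8):
    """True if r^2 exceeds the tagging threshold (0.8 by default)."""
    return r2 > threshold


if __name__ == "__main__":
    # Alleles always travel together and have equal frequencies.
    print(ld_r2(0.3, 0.3, 0.3))
    # Allele A (frequency 0.2) is found only on haplotypes carrying
    # allele B (frequency 0.5), yet the frequencies differ.
    print(ld_r2(0.2, 0.2, 0.5))
```

In the second example allele A never occurs without allele B, but r²
is only about 0.25 because the two allele frequencies differ, so this
pair would not qualify as a tag under the 0.8 threshold. This is the
dependence on allele frequencies noted above.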
Linkage analysis was subsequently applied successfully to identify
genetic variants that contribute to rare disorders like Huntington
disease. When applied to more common disorders, like heart disease or
various forms of cancer, linkage analysis has not fared as well. This
implies that the genetic mechanisms that influence common disorders
are different from those that cause rare disorders.
98 Michelle Chang et al.