broaden the range of discovered CNVs. CNVnator was calibrated by using the
extensive validation performed by the 1000 Genomes Project. Because of this,
CNVnator could be used for CNV discovery and genotyping in a population and
characterization of atypical CNVs, such as de novo and multi-allelic events. Overall,
CNVnator has high sensitivity, low false-discovery rate, high genotyping accuracy,
and high resolution in breakpoint discovery. Furthermore, CNVnator is complementary
in a straightforward way to split-read and read-pair approaches. It misses
CNVs created by retrotransposable elements, but more than half of the validated
CNVs that it identifies are not detected by split-read or read-pair. By genotyping
CNVs in the CEPH, Yoruba, and Chinese-Japanese populations, it was estimated
that at least 11% of all CNV loci involve complex, multi-allelic events, a considerably
higher estimate than reported earlier. Moreover, among these events, the
authors observed cases with allele distribution strongly deviating from Hardy-Weinberg
equilibrium, possibly implying selection on certain complex loci. Finally,
by combining discovery and genotyping, they identified six potential de novo CNVs
in two family trios.
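To make the Hardy-Weinberg comparison concrete, the following minimal Python sketch computes a chi-square statistic for a simple biallelic locus. It is an illustration only and not the procedure used by the CNVnator authors: multi-allelic CNV loci require more genotype classes, and the function name and genotype counts here are hypothetical.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for deviation from Hardy-Weinberg equilibrium.

    Assumes a biallelic locus with observed genotype counts AA, AB, BB
    (a simplification; the loci discussed in the text are multi-allelic).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotyped individual is required")
    # Allele frequency of A estimated from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    # Genotype counts expected under Hardy-Weinberg proportions.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic locus).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical counts: the first sample matches HWE, the second has no heterozygotes.
in_equilibrium = hwe_chi_square(25, 50, 25)
no_heterozygotes = hwe_chi_square(50, 0, 50)
```

With one degree of freedom, a statistic above roughly 3.84 corresponds to deviation at the 5% level; a large value alone does not establish selection, since genotyping error can produce the same pattern.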
Study of Rare Variants in Pinpointing Disease-Causing Genes
Genome-wide association studies (GWAS) use gene chips in automated systems
that analyze about 500,000 to 1 million sites where SNPs tend to occur. Using
these SNP chips over the past decade to compare DNA samples from healthy
subjects and patients, scientists have identified thousands of SNPs that associate
with common complex diseases. However, SNPs investigated by the gene chips usually do
not themselves cause a disease, but instead serve as markers linked to the actual
causal mutations that may reside in a nearby region. After a GWAS finds SNPs
linked to a disease, researchers then perform a “fine-mapping” study by additional
genotyping, i.e. sequencing of the gene regions near the SNP signal, to uncover an
altered gene that harbors a mutation responsible for the disease.
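As a rough illustration of how a single SNP is tested for association, the sketch below compares allele counts between cases and controls with a 2x2 chi-square test and an odds ratio. This is one common textbook form of the test, not a description of any particular GWAS pipeline; the function name and the counts are hypothetical.

```python
def allelic_association(case_a, case_b, ctrl_a, ctrl_b):
    """Return (chi_square, odds_ratio) for a 2x2 table of allele counts.

    Rows are cases and controls; columns are counts of allele A and allele B.
    All four counts are assumed to be positive.
    """
    if min(case_a, case_b, ctrl_a, ctrl_b) <= 0:
        raise ValueError("all allele counts must be positive")
    total = case_a + case_b + ctrl_a + ctrl_b
    # Shortcut formula for the Pearson chi-square of a 2x2 table.
    cross = case_a * ctrl_b - case_b * ctrl_a
    chi_square = total * cross ** 2 / (
        (case_a + case_b) * (ctrl_a + ctrl_b)
        * (case_a + ctrl_a) * (case_b + ctrl_b)
    )
    # Odds of carrying allele A in cases relative to controls.
    odds_ratio = (case_a * ctrl_b) / (case_b * ctrl_a)
    return chi_square, odds_ratio


# Hypothetical allele counts for one SNP.
chi_square, odds_ratio = allelic_association(60, 40, 40, 60)
```

In a real study this test is repeated at every genotyped site, so the significance threshold has to be corrected for the number of SNPs tested.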
GWAS have been successful in identifying disease susceptibility loci, but pinpointing
the causal variants in subsequent fine-mapping studies remains a challenge.
A conventional fine-mapping effort starts by sequencing dozens of randomly
selected samples at susceptibility loci to discover candidate variants, which are then
placed on custom arrays and analyzed with algorithms to find the causal variants. A new
study challenges the prevailing view that common diseases are usually caused by
common gene variants (mutations), suggesting instead that the culprits may be numerous rare variants
located in DNA sequences farther away from the original “hot spots” than scientists
have been accustomed to look (Wang et al. 2010). The authors propose that one or
several rare or low-frequency causal variants can hitchhike on the same common tag
SNP, so causal variants may not be easily unveiled by conventional efforts. They
demonstrated that the true effect size and proportion of variance explained by a
collection of rare causal variants can be underestimated by a common
tag SNP, thereby accounting for some of the “missing heritability” in GWAS.
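The underestimation argument can be illustrated with a small deterministic calculation. The sketch below is not the model used by Wang et al.; it assumes a hypothetical multiplicative haplotype risk model in which a single rare causal variant occurs only on haplotypes carrying the common tag allele, and in which controls have the population haplotype frequencies. All names and numbers are illustrative assumptions.

```python
def allelic_odds_ratio(case_freq, control_freq):
    """Odds ratio comparing an allele's frequency in cases and controls."""
    return (case_freq / (1 - case_freq)) / (control_freq / (1 - control_freq))


def tag_vs_causal_odds_ratio(tag_freq, rare_freq, haplotype_risk):
    """Return (tag_or, causal_or) under the assumed haplotype model.

    tag_freq: population frequency of the common tag allele.
    rare_freq: frequency of the rare causal variant (all on tag haplotypes).
    haplotype_risk: relative risk of a haplotype carrying the rare variant.
    """
    if not 0 < rare_freq < tag_freq < 1:
        raise ValueError("require 0 < rare_freq < tag_freq < 1")
    # Population (control) haplotype frequencies.
    population = {
        "tag_and_rare": rare_freq,
        "tag_only": tag_freq - rare_freq,
        "other": 1 - tag_freq,
    }
    risk = {"tag_and_rare": haplotype_risk, "tag_only": 1.0, "other": 1.0}
    # Haplotype frequencies in cases are weighted by relative risk.
    norm = sum(population[h] * risk[h] for h in population)
    cases = {h: population[h] * risk[h] / norm for h in population}
    causal_or = allelic_odds_ratio(cases["tag_and_rare"], rare_freq)
    tag_or = allelic_odds_ratio(cases["tag_and_rare"] + cases["tag_only"], tag_freq)
    return tag_or, causal_or


# Hypothetical values: 30% tag allele, 1% rare variant, fivefold haplotype risk.
tag_or, causal_or = tag_vs_causal_odds_ratio(0.30, 0.01, 5.0)
```

Under these assumptions the causal variant has an odds ratio of 5, while the tag SNP shows an odds ratio of only about 1.13, because most tag-carrying haplotypes do not carry the rare variant.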
2 Molecular Diagnostics in Personalized Medicine