Computational Systems Biology Methods and Protocols.7z

remaining samples while a combined analysis of data from both
stages is conducted [1]. Significant signals, that is, the SNPs
flagged in the GWAS, are subsequently tested for replication.
Power calculation software such as CaTS can be used to determine
the required sample size and thresholds and to obtain power
calculations for two-stage genome-wide association studies [17].


  1. A recommended call-rate threshold, below which SNPs are
    removed, is approximately 98–99%; however, this threshold may
    vary from study to study [18].

  2. The frequency of a SNP is given in terms of the minor allele
    frequency, that is, the frequency of the less common allele. For
    instance, a SNP with a minor allele (A) frequency of 0.40
    implies that the A allele accounts for 40% of the alleles at that
    locus in the population, versus 60% for the more common
    allele (the major allele).
    Also consider the low-frequency and rare variants (minor allele
    frequency < 5%) that appear in your study. Poor coverage of
    rare variants and low LD with SNP markers make rare variants
    hard to identify. A larger sample, together with knowledge of
    the relative proportion of common and rare variants in the
    total genetic contribution, can increase statistical power. In
    recent years, rare variant association studies (RVASs) have
    become a growing area of genetic association research [19].

  3. Hardy-Weinberg equilibrium allows allele and genotype
    frequencies to be estimated from one generation to the next.
    Departure from this equilibrium may indicate genotyping
    errors, population stratification, or a true association with
    the trait under study [18].

  4. In the GWA approach, it is very important to correct for
    multiple testing to avoid false-positive results. A few factors
    help minimize false positives: (1) statistical adjustment such
    as Bonferroni correction, false discovery rate (FDR), or
    permutation testing; (2) a stringent P-value threshold for
    declaring that the allele frequency differs significantly
    between the two sample groups; and (3) a large sample size for
    both the genome-wide scan and the replication studies.

  5. The Bonferroni correction adjusts the alpha value from α =
    0.05 to α = 0.05/n, where n is the number of statistical tests
    conducted, which is also the number of GWAS markers to be
    investigated. Be aware that the Bonferroni correction will be too
    conservative when some of the SNPs are in LD, because it assumes
    that each association test of the SNPs is independent of all
    other tests. You can use LD information from SNPSpD or
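
The quantities discussed in Notes 1 to 5 (call rate, minor allele frequency, a Hardy-Weinberg test, the Bonferroni-adjusted alpha, and an FDR procedure) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the chapter's own procedure: the function names, the toy data, and the genotype coding (0, 1, or 2 copies of one allele, with None for a missing call) are hypothetical. The Hardy-Weinberg check uses a 1-df chi-square test, and the FDR step uses the Benjamini-Hochberg procedure as one common choice; the text does not name a specific method for either.

```python
import math

def called(genotypes):
    """Drop missing calls (None) from a list of 0/1/2 genotype codes."""
    return [g for g in genotypes if g is not None]

def call_rate(genotypes):
    """Fraction of samples with a non-missing genotype for this SNP."""
    return len(called(genotypes)) / len(genotypes)

def minor_allele_frequency(genotypes):
    """Frequency of the less common allele among the called genotypes."""
    calls = called(genotypes)
    freq = sum(calls) / (2 * len(calls))  # frequency of the counted allele
    return min(freq, 1 - freq)

def hwe_p_value(genotypes):
    """P-value of a 1-df chi-square test for departure from HWE."""
    calls = called(genotypes)
    n = len(calls)
    p = sum(calls) / (2 * n)
    q = 1 - p
    observed = [calls.count(k) for k in (0, 1, 2)]
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests

def benjamini_hochberg(p_values, fdr=0.05):
    """Flag the tests declared significant at the given FDR level."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * fdr:
            cutoff_rank = rank  # keep the largest rank that passes
    significant = [False] * m
    for i in order[:cutoff_rank]:
        significant[i] = True
    return significant

# Toy data (assumed): one SNP genotyped in ten samples, one call missing.
snp = [0, 0, 0, 0, 1, 1, 1, 2, None, 0]
print("call rate:", call_rate(snp))
print("MAF:", minor_allele_frequency(snp))
print("HWE p-value:", hwe_p_value(snp))
print("Bonferroni alpha for 1,000,000 markers:", bonferroni_alpha(0.05, 1_000_000))
```

The chi-square approximation is unreliable when expected genotype counts are small, so an exact test may be preferable for rare variants. Any cutoff applied to the Hardy-Weinberg p-value would be a study-specific choice that the text does not specify.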


106 Michelle Chang et al.
