Computational Systems Biology Methods and Protocols.7z

(nextflipdebug5) #1
Genomic questions are now so complex that answering them requires
great depth of information. In ultra-high-throughput sequencing, as
many as 500,000 sequencing-by-synthesis operations may be run
in parallel [3]. With its unprecedented throughput, speed, and
scalability compared with traditional DNA sequencing, NGS
enables researchers to study biological problems at a new level
and has been widely implemented in commercial DNA sequencers.
Table 1 compares several high-throughput sequencing methods [4].
Among those NGS methods, 454 pyrosequencing is arguably the most
classic. It does not require ddNTPs for chain termination. Instead,
the template is amplified by emulsion PCR, and the sequence is read
by detecting the pyrophosphate released each time a nucleotide is
incorporated during strand elongation. The data are stored in
standard flowgram format (SFF) files for downstream analysis.
The process can be divided into the following steps:


  1. Library construction. The library DNA fragments, carrying
    454-specific adaptors, are denatured into single strands.

  2. Surface attachment and bridge amplification.

  3. Denaturation and complete amplification, for example by
    emulsion PCR.

  4. Single base extension and sequencing.


The principle can be summarized as follows:
When DNA polymerase incorporates a dNTP (dATP, dGTP, dCTP, or dTTP)
complementary to the next base of the template strand, one
pyrophosphate (PPi) is released. ATP sulfurylase then catalyzes the
reaction of PPi with adenosine-5′-phosphosulfate (APS) to generate
ATP. Using this ATP, luciferase converts luciferin into oxyluciferin
and emits visible light, which is captured by a CCD system. The
signal is then analyzed by computer to determine the DNA sequence.
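The way a series of light signals becomes a sequence can be illustrated
with a small sketch. It is not the sequencer's actual base-calling
software. It assumes that nucleotides are flowed one at a time in a fixed
cyclic order (the order "TACG" used here is an assumption), and that the
light signal in each flow is roughly proportional to the number of
nucleotides incorporated, so a homopolymer run gives a proportionally
stronger signal. The function names are hypothetical.

```python
FLOW_ORDER = "TACG"  # assumed cyclic nucleotide flow order


def simulate_flowgram(read, flow_order=FLOW_ORDER):
    """Return ideal, noise-free signals for a synthesized strand.

    Each flow offers one nucleotide; the signal is the number of
    consecutive bases of that type incorporated in that flow.
    """
    unknown = set(read) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not in flow order: {sorted(unknown)}")
    signals = []
    pos = 0   # next position in the read still to be synthesized
    flow = 0  # index of the current nucleotide flow
    while pos < len(read):
        base = flow_order[flow % len(flow_order)]
        run = 0
        while pos < len(read) and read[pos] == base:
            run += 1
            pos += 1
        signals.append(float(run))
        flow += 1
    return signals


def call_bases(signals, flow_order=FLOW_ORDER):
    """Decode signals by rounding each to the nearest run length."""
    bases = []
    for i, signal in enumerate(signals):
        count = max(0, int(round(signal)))
        bases.append(flow_order[i % len(flow_order)] * count)
    return "".join(bases)


ideal = simulate_flowgram("TTAGC")
noisy = [2.1, 0.9, 0.1, 1.05, 0.0, 0.2, 0.95]  # made-up noisy signals
decoded_ideal = call_bases(ideal)
decoded_noisy = call_bases(noisy)
```

Rounding to the nearest integer is the simplest possible decision rule.
It becomes unreliable for long homopolymer runs, where the gap between
neighboring signal levels is small relative to the noise.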
Although next-generation sequencing methods remain the most widely
used technologies, third-generation sequencing (TGS), also known as
single-molecule sequencing (SMS), is developing rapidly. It detects
the signal from a single molecule and no longer needs PCR, aiming to
increase throughput and to reduce time to result and cost by
eliminating the need for excessive reagents and harnessing the
processivity of DNA polymerase [5].

2 Methods for DNA Sequencing Data Analysis


After obtaining the sequences of the nucleic acid, it is usually
necessary to assess the quality of the outcome, to extract target
