A Practical Guide to Cancer Systems Biology




Mapping reads using TopHat or HISAT


In this step, we demonstrate how to use two alignment algorithms, TopHat^5
and HISAT,^6 to generate read alignment data (BAM) for downstream
analysis.


TopHat:



  1. In the Tools panel, expand “NGS: RNA Analysis” and select the “TopHat”
    tool.

  2. Set “Is this single-end or paired-end data?” to “Paired-end (as individual
    datasets)”.

  3. Set “Single end or paired reads?” to “Individual paired reads”.

  4. Set “RNA-Seq FASTQ file, forward reads” to the FASTQ dataset
    corresponding to the forward reads, i.e., name contains “R1”.

  5. Set “RNA-Seq FASTQ file, reverse reads” to the FASTQ dataset
    corresponding to the reverse reads, i.e., name contains “R2”.

  6. Set “Mean Inner Distance between Mate Pairs” and “Std. Dev for Distance
    between Mate Pairs” based on the protocol of library construction.
    The inner distance is the fragment length minus the combined length of
    the two reads. For example, for paired-end runs with fragments selected
    at 300 bp, where each end is 50 bp, set “Mean Inner Distance between
    Mate Pairs” to 200 (300 − 2 × 50).

  7. Use a built-in genome and select the proper reference genome. In our
    dataset, select “Human: hg38”.

  8. Set “TopHat settings to use” to “Use Defaults” or “Full parameter list”
    to adjust each TopHat parameter.

  9. Click the “Execute” button to start the job.

  10. Repeat steps 1–9 for each set of paired-end data.
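
The arithmetic behind step 6 can be sketched in a few lines of Python. This is only an illustration of the calculation described above; the function name mean_inner_distance is hypothetical and is not part of TopHat or Galaxy.

```python
def mean_inner_distance(fragment_length: int, read_length: int) -> int:
    """Return the expected inner distance between mates, in bp.

    Assumes both mates have the same read length, as in the example
    in step 6. The result is negative when the two reads overlap.
    """
    return fragment_length - 2 * read_length


# Example from step 6: 300 bp fragments sequenced as 2 x 50 bp reads.
print(mean_inner_distance(300, 50))  # 200
```

The value printed here is the one to enter in “Mean Inner Distance between Mate Pairs”; the standard deviation still has to come from the library construction protocol.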


HISAT:



  1. In the Tools panel, expand “NGS: RNA Analysis” and select the “HISAT”
    tool.

  2. Set “Input data format” to “FASTQ”.

  3. Set “Single end or paired reads?” to “Individual paired reads”.

  4. Set “Forward reads” to the FASTQ dataset corresponding to the forward
    reads, i.e., name contains “R1”.

  5. Set “Reverse reads” to the FASTQ dataset corresponding to the reverse
    reads, i.e., name contains “R2”.

  6. Use a built-in genome and select the proper reference genome. In our
    dataset, select “Human: hg38”.
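
When many samples have to be aligned, it can help to work out the forward/reverse pairing before filling in the tool form. The following is a minimal sketch, assuming dataset names carry an “_R1” or “_R2” tag followed by “_” or “.”, as in the naming convention mentioned above; the helper pair_fastq and the example file names are hypothetical.

```python
import re

# Matches the mate tag "_R1" or "_R2" when followed by "_" or "."
# (assumed naming convention, e.g. "tumor_R1.fastq").
MATE_TAG = re.compile(r"_R([12])(?=[_.])")


def pair_fastq(names):
    """Group FASTQ names into {sample: (forward, reverse)} pairs.

    Names without a mate tag, and samples missing one of the two
    mates, are left out of the result.
    """
    mates = {}
    for name in names:
        match = MATE_TAG.search(name)
        if match is None:
            continue
        sample = name[:match.start()]  # text before the mate tag
        mates.setdefault(sample, {})[match.group(1)] = name
    return {
        sample: (found["1"], found["2"])
        for sample, found in mates.items()
        if "1" in found and "2" in found
    }


pairs = pair_fastq(["tumor_R1.fastq", "tumor_R2.fastq", "normal_R1.fastq"])
print(pairs)  # only "tumor" has both mates
```

Each resulting pair corresponds to one run of the tool: the first name goes in the forward reads field and the second in the reverse reads field.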
