58 A Practical Guide to Cancer Systems Biology
Mapping reads using TopHat or HISAT
In this step, we demonstrate how to use two alignment algorithms, TopHat^5
and HISAT,^6 to generate the read alignment data (BAM) needed for downstream
analysis.
TopHat:
- In the Tools panel, expand “NGS: RNA Analysis” and select the “TopHat” tool.
- Set “Is this single-end or paired-end data?” to “Paired-end (as individual datasets)”.
- Set “Single end or paired reads?” to “Individual paired reads”.
- Set “RNA-Seq FASTQ file, forward reads” to the FASTQ dataset corresponding to the forward reads, i.e., the one whose name contains “R1”.
- Set “RNA-Seq FASTQ file, reverse reads” to the FASTQ dataset corresponding to the reverse reads, i.e., the one whose name contains “R2”.
- Set “Mean Inner Distance between Mate Pairs” and “Std. Dev for Distance between Mate Pairs” according to the library construction protocol. For example, for paired-end runs with fragments selected at 300 bp, where each end is 50 bp, set “Mean Inner Distance between Mate Pairs” to 200 (300 − 2 × 50 = 200).
- Use a built-in genome and select the proper reference genome. For our dataset, select “Human: hg38”.
- Set “TopHat settings to use” to “Use Defaults”, or to “Full parameter list” to adjust each TopHat parameter.
- Click the “Execute” button to start the job.
- Repeat steps 1–9 for each set of paired-end data.
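The inner-distance setting above follows from simple arithmetic: the selected fragment length minus the lengths of the two reads. The short Python sketch below reproduces that calculation and, for orientation only, assembles a roughly equivalent TopHat command line for readers working outside Galaxy. The function names, the file names, the index name, and the standard deviation of 20 are illustrative assumptions, not values taken from this protocol; the options shown (-r, --mate-std-dev, -o) are standard TopHat command-line options, but check them against the manual of your installed version.

```python
def mean_inner_distance(fragment_length: int, read_length: int) -> int:
    """Return the expected distance between the inner ends of a read pair.

    The inner distance is the fragment length minus the two read lengths.
    """
    return fragment_length - 2 * read_length


def tophat_command(index_base, forward_fastq, reverse_fastq,
                   inner_distance, std_dev, output_dir):
    """Build (but do not run) a TopHat command line as a list of arguments.

    All argument values are supplied by the caller; nothing here is a
    recommended default.
    """
    return [
        "tophat",
        "-r", str(inner_distance),        # mean inner distance between mates
        "--mate-std-dev", str(std_dev),   # std. dev. of the inner distance
        "-o", output_dir,                 # output directory
        index_base,                       # basename of the genome index
        forward_fastq,                    # forward reads (R1)
        reverse_fastq,                    # reverse reads (R2)
    ]


if __name__ == "__main__":
    # Example from the text: 300 bp fragments, 50 bp reads at each end.
    inner = mean_inner_distance(fragment_length=300, read_length=50)
    print(inner)

    # Hypothetical file and index names, used only for illustration.
    cmd = tophat_command(
        index_base="hg38_index",
        forward_fastq="sample_R1.fastq",
        reverse_fastq="sample_R2.fastq",
        inner_distance=inner,
        std_dev=20,  # assumed value; set it from your library protocol
        output_dir="tophat_out",
    )
    print(" ".join(cmd))
```

Building the command as a list rather than a single string keeps each argument separate, so it can be passed directly to `subprocess.run` once the placeholder names are replaced with real paths.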
HISAT:
- In the Tools panel, expand “NGS: RNA Analysis” and select the “HISAT” tool.
- Set “Input data format” to “FASTQ”.
- Set “Single end or paired reads?” to “Individual paired reads”.
- Set “Forward reads” to the FASTQ dataset corresponding to the forward reads, i.e., the one whose name contains “R1”.
- Set “Reverse reads” to the FASTQ dataset corresponding to the reverse reads, i.e., the one whose name contains “R2”.
- Use a built-in genome and select the proper reference genome. For our dataset, select “Human: hg38”.