56 A Practical Guide to Cancer Systems Biology
The first way to obtain the annotation file is from the GENCODE
project.^4
- Go to the GENCODE official website (https://www.gencodegenes.org/).
- On the page for the current human release, you can find all the
available files. Download the GTF or GFF file, or copy its link address
directly.
- Go back to the Galaxy website. In the Tools panel, expand “Get Data” and
click “Upload file”.
- If the files have been saved on your computer, drag and drop them into
the pop-up window, or click “Choose local file” to select them from
your computer. Alternatively, click “Paste/Fetch data”, paste the
URL, e.g.,
ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_25/gencode.v25.basic.annotation.gff3.gz,
into the text-entry box, and then click the “Start” button. In this case,
the original file is compressed in gz format, and Galaxy will
automatically decompress it to gff3 format.
- Click the pencil icon of that dataset to rename the dataset.
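As an optional check outside Galaxy, a downloaded GFF3 file can be decompressed and summarized by feature type. The following is a minimal sketch, not part of the Galaxy workflow. The function name `count_gff3_features` and the example records are hypothetical, and the sketch assumes a standard nine-column, tab-separated GFF3 layout.

```python
import gzip
import io
from collections import Counter


def count_gff3_features(gz_bytes):
    """Decompress gzip-compressed GFF3 content and count rows per feature type."""
    counts = Counter()
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            # Skip blank lines and comment/directive lines such as "##gff-version 3".
            if not line or line.startswith("#"):
                continue
            columns = line.split("\t")
            if len(columns) != 9:
                continue  # a GFF3 feature row has nine tab-separated columns
            counts[columns[2]] += 1  # the third column holds the feature type
    return counts


# Tiny made-up example standing in for a downloaded .gff3.gz file.
example_text = (
    "##gff-version 3\n"
    "chr1\tHAVANA\tgene\t11869\t14409\t.\t+\t.\tID=gene1\n"
    "chr1\tHAVANA\ttranscript\t11869\t14409\t.\t+\t.\tID=tx1;Parent=gene1\n"
    "chr1\tHAVANA\texon\t11869\t12227\t.\t+\t.\tID=exon1;Parent=tx1\n"
    "chr1\tHAVANA\texon\t12613\t12721\t.\t+\t.\tID=exon2;Parent=tx1\n"
)
example_gz = gzip.compress(example_text.encode("utf-8"))
feature_counts = count_gff3_features(example_gz)
print(dict(feature_counts))
```

For a real file, the bytes could be read with `open(path, "rb").read()`. For large annotation files it would be preferable to pass the path to `gzip.open` directly so the data is streamed from disk.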
The second way is to import the RefSeq annotation file through the UCSC
interface.
- In the Tools panel, expand “Get Data” and click the “UCSC main” link. This
tool will open the UCSC Table Browser in the Galaxy window.
- Set “genome” to “human”, “assembly” to “GRCh38/hg38”, “group”
to “Genes and Gene Predictions”, and “track” to “RefSeq Genes”.
- Select “genome” for “region”.
- Choose “GTF - gene transfer format” for “output format” and check
the “Galaxy” box.
- Click the “get output” button and then click the “Send query to Galaxy”
button.
- The imported RefSeq annotation file will be listed in the History panel.
Click the pencil icon of that dataset to rename the dataset.
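To illustrate what the imported GTF file contains, the sketch below splits one GTF record into its nine columns and parses the attribute column into key-value pairs. It is an illustration only. The function name `parse_gtf_line` and the example record are hypothetical, and the simple split on ";" assumes that attribute values contain no semicolons.

```python
def parse_gtf_line(line):
    """Parse one tab-separated GTF record into a dictionary."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated GTF columns")
    seqname, source, feature, start, end, score, strand, frame, attr_text = fields

    # The ninth column looks like: gene_id "X"; transcript_id "Y";
    attributes = {}
    for item in attr_text.split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(" ")
        attributes[key] = value.strip().strip('"')

    return {
        "seqname": seqname,
        "source": source,
        "feature": feature,
        "start": int(start),  # GTF coordinates are 1-based and inclusive
        "end": int(end),
        "score": score,
        "strand": strand,
        "frame": frame,
        "attributes": attributes,
    }


# Made-up record for illustration; real values will differ.
example_line = (
    "chr1\texample_source\texon\t11874\t12227\t.\t+\t.\t"
    'gene_id "NR_046018"; transcript_id "NR_046018";'
)
record = parse_gtf_line(example_line)
print(record["feature"], record["start"], record["end"], record["attributes"])
```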
Read quality assessment using FastQC
Before starting the RNA-seq analysis, run FastQC to check the sequence
reads for any unusual quality issues.
- In the Tools panel, expand “NGS: QC and manipulation” and click
“FastQC”.
- Under “Short read data from your current history”, select one or
more FASTQ files for assessment of read quality.
- Click “Execute” to start the job.
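For intuition about what a read-quality check examines, the sketch below computes the mean Phred quality score of each read in a FASTQ file. This is not how FastQC itself is implemented. The function name `mean_phred_per_read` and the example reads are hypothetical, and the sketch assumes four-line FASTQ records with Phred+33 quality encoding.

```python
def mean_phred_per_read(fastq_text, offset=33):
    """Return {read_id: mean Phred score} for four-line FASTQ records."""
    lines = fastq_text.strip().splitlines()
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ text must contain a multiple of four lines")

    means = {}
    for i in range(0, len(lines), 4):
        header, sequence, plus, quality = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record at line %d" % (i + 1))
        if not quality or len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ or are empty")
        # Each quality character encodes a Phred score as (ASCII code - offset).
        scores = [ord(char) - offset for char in quality]
        read_id = header[1:].split()[0]
        means[read_id] = sum(scores) / len(scores)
    return means


# Two made-up reads: 'I' encodes Phred 40 and '!' encodes Phred 0 (Phred+33).
example_fastq = (
    "@read1\n"
    "ACGT\n"
    "+\n"
    "IIII\n"
    "@read2\n"
    "ACGT\n"
    "+\n"
    "!!II\n"
)
means = mean_phred_per_read(example_fastq)
print(means)
```

Reads with a low mean score, such as the second example read, are the kind of pattern a FastQC report helps to spot before alignment.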