management system based on the Comprehensive R Archive
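Network (CRAN) [17]. The R installation should have the
following software packages installed (including all of their
dependencies): reshape2 [18], GSEABase {GSEABaseGen-
esete:tz}, GSVA [13], ggplot2 [19], and DESeq2 [20]. The
CRAN packages (reshape2 and ggplot2) can be installed using
the built-in "install.packages" function in R; GSEABase,
GSVA, and DESeq2 are distributed through Bioconductor and
can be installed with "BiocManager::install".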
- A data set of ortholog-based mappings of Ensembl gene
identifiers for the non-human species (in the example vignette
shown here, dog) to HUGO Gene Nomenclature Committee
(HGNC) gene symbols. Such a mapping can be obtained
using the BioMart tool [21] through the Ensembl genome
portal [22]. This information should be contained in a
single-column data frame "dog_ensgene_to_symbol" in which the
Ensembl gene identifiers are the row names, as shown here
(in this chapter, blue text indicates screen output from an R
session):
> head(dog_ensgene_to_symbol)
                   Associated.Gene.Name
ENSCAFG00000022708
ENSCAFG00000022709
ENSCAFG00000022710
ENSCAFG00000022711
ENSCAFG00000022712
ENSCAFG00000022713                  ND1
(see Note 1). In the above, the R function "head" is used; this
function prints the first n rows (by default, n = 6) of
the object passed as its argument.
- A data set of human gene annotations mapping Ensembl gene
identifiers to HGNC gene symbols. Such a mapping can be
obtained using Ensembl BioMart. This information should be
contained in a single-column data frame
"human_ensgene_to_symbol" in which the Ensembl gene
identifiers are the row names:
> head(human_ensgene_to_symbol)
                Associated.Gene.Name
ENSG00000210049                MT-TF
ENSG00000211459              MT-RNR1
ENSG00000210077                MT-TV
ENSG00000210082              MT-RNR2
ENSG00000209082               MT-TL1
ENSG00000198888               MT-ND1
- A file containing mappings of HGNC gene symbols to gene
functional annotation categories, in Gene Matrix Transposed
(GMT) format {GSEATeam:wt}. A comprehensive file of
human gene annotations (for the Gene Ontology functional
annotation categories [23]) can be obtained from the
Molecular Signatures Database (MSigDB [24, 25]) web site [26]
as the downloadable file "c5.all.v5.2.symbols.gmt".
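
The input objects described in the list above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the chapter's own code: the inline text stands in for a real BioMart export, the column name "Ensembl.Gene.ID" is assumed (check the header of the actual export), the helper "read_gmt" is hypothetical, and the GMT line written to a temporary file is a placeholder rather than real MSigDB content.

```r
# Sketch only: the inline data are small stand-ins for a BioMart export
# and an MSigDB GMT file, not real downloads.

# 1. Ensembl-ID-to-symbol mapping as a single-column data frame.
# The column names below are assumed; adjust them to the actual export.
biomart_export <- read.delim(
  text = paste(
    "Ensembl.Gene.ID\tAssociated.Gene.Name",
    "ENSCAFG00000022712\t",
    "ENSCAFG00000022713\tND1",
    sep = "\n"
  ),
  colClasses = "character"
)

# Row names must be unique, so keep the first record per Ensembl identifier
# (an assumption; one-to-many mappings may need different handling).
biomart_export <- biomart_export[!duplicated(biomart_export$Ensembl.Gene.ID), ]

dog_ensgene_to_symbol <- data.frame(
  Associated.Gene.Name = biomart_export$Associated.Gene.Name,
  row.names = biomart_export$Ensembl.Gene.ID,
  stringsAsFactors = FALSE
)

# 2. GMT file: one gene set per line with tab-separated fields:
#    set name, description, then the member gene symbols.
# read_gmt is a hypothetical helper that returns a named list of symbols.
read_gmt <- function(path) {
  fields <- strsplit(readLines(path), "\t", fixed = TRUE)
  gene_sets <- lapply(fields, function(x) x[-(1:2)])
  names(gene_sets) <- vapply(fields, `[`, character(1), 1)
  gene_sets
}

# Placeholder GMT content written to a temporary file for demonstration.
gmt_path <- tempfile(fileext = ".gmt")
writeLines("EXAMPLE_SET\tplaceholder description\tMT-ND1\tMT-RNR1", gmt_path)
gene_sets <- read_gmt(gmt_path)
```

With the real inputs, "gmt_path" would point to the downloaded GMT file and the inline text would be replaced by the path to the BioMart export. When the GSEABase package is installed, its "getGmt" function reads the same format into a GeneSetCollection object, which may be more convenient than a plain list.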