Systems Biology (Methods in Molecular Biology)

Sample group information for the mRNA-seq datasets for each
species. For each species, the sample group information should
be contained in a single-column data frame in which the row
names are unique sample names. A portion of the data frame
for the example human-dog analysis vignette (along with the
dimensions of the data frame) is shown here:

>head(dog_sample_info)
    external_name
s01         TCC.1
s02         TCC.2
s03         TCC.3
s05      normal.1
s06         TCC.4
s34         TCC.5
>dim(dog_sample_info)
[1] 10 1

Sample information for the other species (in this vignette, human) should be stored in a similar data frame (in this vignette, we will assume the data frame is named "human_sample_info").
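The layout described above (a single column, with unique sample names as row names) can be checked before any analysis is run. The following is a minimal sketch, not part of the original protocol: the data frame is rebuilt by hand from only the six rows printed above (the real data frame has 10 rows), and check_sample_info is a hypothetical helper introduced here for illustration.

```r
# Hand-built copy of the six rows printed by head(); the full
# dog_sample_info in the vignette has 10 rows.
dog_sample_info <- data.frame(
  external_name = c("TCC.1", "TCC.2", "TCC.3", "normal.1", "TCC.4", "TCC.5"),
  row.names = c("s01", "s02", "s03", "s05", "s06", "s34"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not from any package): stops with an error if the
# object does not have the expected single-column layout.
check_sample_info <- function(sample_info) {
  stopifnot(
    is.data.frame(sample_info),
    ncol(sample_info) == 1,                  # exactly one column
    !anyDuplicated(rownames(sample_info)),   # sample names are unique
    !anyNA(sample_info[[1]])                 # every sample has a label
  )
  invisible(TRUE)
}

check_sample_info(dog_sample_info)
```

The same check could be applied to "human_sample_info", assuming it follows the same layout.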

Ortholog mappings between the two species, in the form of a
two-column data frame whose first column contains Ensembl
gene identifiers for the second species (in this example vignette,
human) and whose second column contains the Ensembl gene
identifier of an ortholog (if any) for the gene in the first species
(in this example vignette, dog). Such a mapping can be
obtained using Ensembl BioMart. A portion of the data
frame for the example human-dog analysis vignette (along
with the dimensions of the data frame) is shown here (see
Note 2).

>head(human_dog_ensg)
  Ensembl.Gene.ID Dog.Ensembl.Gene.ID
1 ENSG00000261657
2 ENSG00000223116
3 ENSG00000233440
4 ENSG00000207157
5 ENSG00000229483
6 ENSG00000252952  ENSCAFG00000025776
>dim(human_dog_ensg)
[1] 65999 2
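Because many human genes have no dog ortholog, it can help to inspect how much of the mapping table is actually populated. The sketch below is illustrative only and is not the chapter's own code. It rebuilds the six rows shown above by hand and makes two assumptions: that the single dog identifier in the printout belongs to row 6, and that a missing ortholog is stored as an empty string (or NA).

```r
# Hand-built copy of the six rows printed by head(); the full table in the
# vignette has 65999 rows. Placement of the dog ID in row 6 is an assumption.
human_dog_ensg <- data.frame(
  Ensembl.Gene.ID = c("ENSG00000261657", "ENSG00000223116", "ENSG00000233440",
                      "ENSG00000207157", "ENSG00000229483", "ENSG00000252952"),
  Dog.Ensembl.Gene.ID = c("", "", "", "", "", "ENSCAFG00000025776"),
  stringsAsFactors = FALSE
)

# Rows with a usable ortholog: neither NA nor an empty string.
dog_id <- human_dog_ensg$Dog.Ensembl.Gene.ID
has_ortholog <- !is.na(dog_id) & nzchar(dog_id)
mapped <- human_dog_ensg[has_ortholog, , drop = FALSE]

# Summary counts: mapped rows, and rows that repeat a human or dog gene
# already seen (i.e., one-to-many or many-to-one relationships).
n_mapped <- nrow(mapped)
n_repeat_human <- sum(duplicated(mapped$Ensembl.Gene.ID))
n_repeat_dog <- sum(duplicated(mapped$Dog.Ensembl.Gene.ID))
```

How repeated identifiers should be handled, if they occur, depends on the analysis and is left open here.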

3 Methods

Below, I outline the steps required to carry out an unsupervised and a supervised comparison of mRNA-seq data sets from two species, using as an example mRNA-seq data sets from a cross-species (dog and human) study of bladder cancer. The first five steps of the

Cross-Species RNA-Seq Analysis
