A Practical Guide to Cancer Systems Biology


  1. Proteomic Data Analysis 65


Figure 2. The organization of the cls file format for describing samples. The cls file
consists of three lines: the number of samples and classes, the names of the classes that
appear in the analysis report, and the class label for each sample. Labels should be
separated by a space or a tab in the cls file.


protein name into OfficialGeneSymbol using the DAVID Gene ID Conversion
Tool (http://david.abcc.ncifcrf.gov/conversion.jsp) if required. The data
demonstrated in this chapter are iTRAQ-labeled proteomics data, which can
be obtained from the supplementary files at http://pubs.acs.org^15 or from the
ProteomeXchange Consortium via the PRIDE partner repository with the
dataset identifier PXD001078.^16


Description of data (phenotype): CLS file (*.cls)


The phenotype file assigns a categorical class to each sample (e.g.,
control vs. treated, tumor vs. normal, or carcinoma in situ vs. metastasis).
The file should be prepared as a tab-delimited text file (Fig. 2).
The first line contains the total number of samples, the total number of
classes, and 1. The second line contains a user-visible name for each class
as it will appear in analysis reports. This line should begin with a pound sign (#)
followed by a space or a tab. The third line contains a class label for each
sample. The class label can be the class name, a number, or a text string.
The labels must be listed in the same order as the samples in the expression
data, and the classes should first appear in the same order as the class names
on the second line. For instance, CTL goes first and then MIR.
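
The three-line layout described above can be sketched in a few lines of Python. This is only an illustration, not part of the GSEA software: the function names build_cls and parse_cls are hypothetical, and the six-sample layout (three CTL, three MIR) is an assumed example rather than the actual design of the chapter's dataset. The sketch writes tab-separated fields and derives the class names from the order in which labels first appear, so the second and third lines stay consistent.

```python
def build_cls(labels):
    """Return CLS text for a list of per-sample class labels (hypothetical helper)."""
    if not labels:
        raise ValueError("at least one sample label is required")
    # Class names in order of first appearance, so line 2 matches line 3.
    classes = list(dict.fromkeys(labels))
    header = "\t".join([str(len(labels)), str(len(classes)), "1"])
    names = "# " + "\t".join(classes)
    return "\n".join([header, names, "\t".join(labels)]) + "\n"


def parse_cls(text):
    """Parse CLS text and check the counts on line 1 against lines 2 and 3."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) != 3:
        raise ValueError("expected exactly three non-empty lines")
    n_samples, n_classes, _ = (int(tok) for tok in lines[0].split())
    if not lines[1].startswith("#"):
        raise ValueError("second line must begin with a pound sign (#)")
    classes = lines[1][1:].split()  # split() accepts spaces or tabs
    labels = lines[2].split()
    if len(labels) != n_samples:
        raise ValueError("number of labels does not match the sample count")
    if len(classes) != n_classes or len(set(labels)) != n_classes:
        raise ValueError("number of classes does not match line 1")
    return classes, labels


# Assumed example: three control (CTL) samples followed by three MIR samples,
# listed in the same order as the columns of the expression data.
example_labels = ["CTL", "CTL", "CTL", "MIR", "MIR", "MIR"]
cls_text = build_cls(example_labels)
print(cls_text)
```

Running parse_cls on a hand-edited file before loading it is a cheap way to catch a sample count on the first line that no longer matches the number of labels on the third.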


II. Run GSEA


Load expression and phenotype files (Fig. 3).
