A Practical Guide to Cancer Systems Biology




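library(org.Hs.eg.db)
# Entrez IDs that have an entry in the Entrez-to-KEGG pathway map
x <- org.Hs.egPATH
mappedIDToKEGG <- mappedLkeys(x)
# Look up the pathways and drop any ID whose lookup returned NA
Gos <- mget(mappedIDToKEGG, org.Hs.egPATH, ifnotfound=NA)
haveKEGG <- as.vector(!sapply(Gos, function(x) any(is.na(x)), simplify=TRUE))
mappedIDToKEGG <- mappedIDToKEGG[haveKEGG]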
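The NA filter above can be tried without loading the annotation package. The following is a minimal sketch, assuming a hypothetical list toyMap that mimics the shape of what mget() returns (a character vector of pathway IDs per Entrez ID, or NA where nothing was found). The names toyMap, hasPath, and keptIDs, and all the IDs in the list, are illustrative only.

```r
# Hypothetical stand-in for the list returned by mget(): names play the role
# of Entrez IDs, values are KEGG pathway IDs, or NA when no mapping was found.
toyMap <- list("1001" = c("04110", "05200"),
               "1002" = NA,
               "1003" = "04115")

# TRUE for entries that contain no NA, i.e. that have at least one pathway
hasPath <- !vapply(toyMap, function(p) any(is.na(p)), logical(1))

# Keep only the IDs that have a mapping
keptIDs <- names(toyMap)[hasPath]
print(keptIDs)
```

Here vapply() is used instead of sapply() so that the result is guaranteed to be a logical vector; the filtering idea is otherwise the same.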



mappedIDToKEGG is a vector of Entrez IDs that have at least one mapping
to the KEGG pathways. Next, you may want to fetch Entrez IDs from the
previous SAM results stored in samrSummTable:



# The second column holds "/"-separated ID strings; keep the second piece
up <- samrSummTable$genes.up[,2]
up <- sapply(strsplit(up, "/"), "[", 2)
dn <- samrSummTable$genes.lo[,2]
dn <- sapply(strsplit(dn, "/"), "[", 2)
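To see what the split-and-extract step does, here is a small sketch on a standalone character vector. It assumes the ID strings have a "name/EntrezID" layout, which is inferred from the code above rather than stated in the text; the vector ids and the variable names are illustrative.

```r
# Illustrative ID strings; the "name/EntrezID" layout is an assumption
ids <- c("TP53/7157", "BRCA1/672", "MYC/4609")

# Split each string at the literal "/" and take the second piece
parts <- strsplit(ids, "/", fixed = TRUE)
entrez <- vapply(parts, function(p) p[2], character(1))
print(entrez)
```

If a string has no "/" at all, p[2] is NA, so it is worth checking the result for missing values before passing it on.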



You also need to define a gene universe for the hypergeometric test.
A typical choice is the full set of Entrez IDs used in the SAM test:



universe<- unique(ez)



You can then restrict these Entrez IDs to those having at least one KEGG
mapping:



up<- up[up %in% mappedIDToKEGG]
dn<- dn[dn %in% mappedIDToKEGG]
universe<- universe[universe %in% mappedIDToKEGG]
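To make the role of the universe concrete, the sketch below computes a textbook over-representation p-value for a single pathway with base R's phyper(). The counts are made-up assumptions, not results from the SAM analysis, and a dedicated package may apply additional handling beyond this calculation.

```r
# Illustrative counts (assumptions, not results from the SAM analysis)
N <- 1000  # genes in the universe
K <- 50    # universe genes annotated to the pathway
n <- 100   # genes in the selected list (e.g. up-regulated genes)
k <- 12    # selected genes annotated to the pathway

# Over-representation p-value: P(X >= k) for a hypergeometric X.
# phyper(q, ...) with lower.tail = FALSE gives P(X > q), hence k - 1.
pOver <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)

# The same quantity obtained by summing the point probabilities
pCheck <- sum(dhyper(k:min(K, n), K, N - K, n))
print(pOver)
```

Changing the universe changes N and K, which is why the selected genes and the universe should be filtered in the same way before testing.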



Once the Entrez IDs are prepared, you can run the hypergeometric test
using the function hyperGTest() in the R package GOstats.^10 The input to
hyperGTest() is a KEGGHyperGParams object created with several arguments
as follows:



library(GOstats)
params <- new("KEGGHyperGParams",
              geneIds=up,
              universeGeneIds=universe,
              annotation="org.Hs.eg.db",
              pvalueCutoff=0.05,
              testDirection="over")



where geneIds takes the Entrez IDs to be tested, universeGeneIds takes the
universe of Entrez IDs, annotation names the R package used for annotation,
pvalueCutoff is set to 0.05, and testDirection selects whether the test is for
overrepresentation or underrepresentation. This KEGGHyperGParams object is
