
mutlinks <- function(cls,m) {
   n <- nrow(m)
   nc <- length(cls)
   # determine which worker gets which chunk of i
   # (split() warns when n is not a multiple of nc, so silence it briefly)
   options(warn=-1)
   ichunks <- split(1:n,1:nc)
   options(warn=0)
   # scatter the chunks to the workers, then gather and combine the counts
   counts <- clusterApply(cls,ichunks,mtl,m)
   do.call(sum,counts) / (n*(n-1)/2)
}
Suppose we have this code in the file SnowMutLinks.R. Let’s first discuss
how to run it.
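Before involving a cluster, it can help to check the chunking and combining logic of mutlinks() serially. The following is only a sketch under stated assumptions: mtl_sketch() is a hypothetical stand-in for mtl() (which is defined outside this passage), lapply() takes the place of clusterApply(), and the 4-by-4 matrix is made up for illustration.

```r
# Hypothetical stand-in for mtl(): for each row i in the chunk, add up the
# number of columns in which row i and each later row both contain a 1.
mtl_sketch <- function(ichunk, m) {
  n <- nrow(m)
  total <- 0
  for (i in ichunk) {
    if (i < n) {
      later <- m[(i + 1):n, , drop = FALSE]  # rows below row i, kept as a matrix
      total <- total + sum(later %*% m[i, ])
    }
  }
  total
}

# Uneven case: 7 row indices over 2 workers. split() recycles the labels 1:2
# and warns because 7 is not a multiple of 2; the chunks are still usable.
chunks7 <- suppressWarnings(split(1:7, 1:2))

# Serial emulation of the scatter/gather in mutlinks() on a small, made-up matrix.
m <- matrix(c(0, 1, 1, 0,
              1, 0, 1, 1,
              1, 1, 0, 1,
              0, 1, 1, 0), nrow = 4, byrow = TRUE)
n <- nrow(m)
ichunks <- suppressWarnings(split(1:n, 1:2))  # chunk of i for each of 2 "workers"
counts <- lapply(ichunks, mtl_sketch, m)      # serial substitute for clusterApply()
result <- do.call(sum, counts) / (n * (n - 1) / 2)
```

Because the labels are recycled, the row indices are dealt out round-robin rather than in contiguous blocks: with two workers, the first gets the odd-numbered rows and the second the even-numbered ones.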
16.2.1 Running snow Code
Running the above snow code involves the following steps:
- Load the code.
- Load the snow library.
- Form a snow cluster.
- Set up the adjacency matrix of interest.
- Run your code on that matrix on the cluster you formed.
Assuming we are running on a dual-core machine, we issue the following
commands to R:
> source("SnowMutLinks.R")
> library(snow)
> cl <- makeCluster(type="SOCK",c("localhost","localhost"))
> testm <- matrix(sample(0:1,16,replace=T),nrow=4)
> mutlinks(cl,testm)
[1] 0.6666667
Here, we are instructing snow to start two new R processes on our
machine (localhost is a standard network name for the local machine),
which I will refer to here as workers. I’ll refer to the original R process
(the one in which we type the preceding commands) as the manager. So, at this
point, three instances of R will be running on the machine (visible by running
the ps command if you are in a Linux environment, for example).
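The manager/worker arrangement can also be observed from within R. The sketch below is an illustration rather than the text's own method: it assumes the parallel package that ships with R (whose cluster functions are modeled on those of snow) instead of snow itself, and the variable names are made up for this example.

```r
library(parallel)  # ships with R; used here as a stand-in for snow

cl <- makeCluster(2)                                # start two local worker processes
manager_pid <- Sys.getpid()                         # process ID of the manager
worker_pids <- unlist(clusterCall(cl, Sys.getpid))  # each worker reports its own ID
stopCluster(cl)                                     # shut the workers down when done

# Manager plus two workers: three distinct R processes in total.
all_pids <- c(manager_pid, worker_pids)
```

Calling stopCluster() when the work is finished keeps the worker processes from lingering on the machine.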
The workers form a cluster in snow parlance, which we have named cl.
The snow package uses what is known in the parallel-processing world as a
scatter/gather paradigm, which works as follows:
- The manager partitions the data into chunks and parcels them out to
the workers (scatter phase).