The Art of R Programming


mutlinks <- function(cls,m) {
   n <- nrow(m)
   nc <- length(cls)
   # determine which worker gets which chunk of i
   # (split() warns if n is not a multiple of nc, so warnings
   # are switched off around the call)
   options(warn=-1)
   ichunks <- split(1:n,1:nc)
   options(warn=0)
   # each worker runs mtl() on its chunk; the manager collects the counts
   counts <- clusterApply(cls,ichunks,mtl,m)
   do.call(sum,counts) / (n*(n-1)/2)
}
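To see what the split() call in mutlinks() produces, here is a small sketch in base R. The values of n and nc are illustrative assumptions, and suppressWarnings() is swapped in for the options(warn=-1) approach used above; it is not what the original code does.

```r
n <- 7    # number of rows (illustrative value)
nc <- 2   # number of workers (illustrative value)

# 1:nc is recycled along 1:n, so the row indices are dealt out
# round-robin. R warns when n is not a multiple of nc, which is
# why the warning is suppressed here.
ichunks <- suppressWarnings(split(1:n, 1:nc))
print(ichunks)
```

With these values, the first worker would get rows 1, 3, 5, and 7, and the second would get rows 2, 4, and 6.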


Suppose we have the preceding snow code, including mutlinks(), in the file
SnowMutLinks.R. Let’s first discuss how to run it.

16.2.1 Running snow Code


Running the above snow code involves the following steps:


  1. Load the code.

  2. Load the snow library.

  3. Form a snow cluster.

  4. Set up the adjacency matrix of interest.

  5. Run your code on that matrix on the cluster you formed.


Assuming we are running on a dual-core machine, we issue the following
commands to R:

> source("SnowMutLinks.R")
> library(snow)
> cl <- makeCluster(type="SOCK",c("localhost","localhost"))
> testm <- matrix(sample(0:1,16,replace=T),nrow=4)
> mutlinks(cl,testm)
[1] 0.6666667
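To sanity-check a result like this one, it may help to compute the same quantity serially. The sketch below is an assumption, not the book's code: the definition of mtl() is not shown in this section, so it supposes that mtl() counts, for each row i in its chunk, the mutual outlinks between row i and every later row. The name mutlinks_serial is hypothetical, and the matrix is fixed rather than random so that the result is reproducible.

```r
# Hypothetical serial reference, assuming mtl() counts mutual outlinks
# between row i and each later row j > i.
mutlinks_serial <- function(m) {
  n <- nrow(m)
  total <- 0
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      # number of columns in which rows i and j both have a 1
      total <- total + sum(m[i, ] * m[j, ])
    }
  }
  # average over all n*(n-1)/2 pairs of rows, as in mutlinks()
  total / (n * (n - 1) / 2)
}

# fixed 4x4 adjacency matrix (illustrative data)
testm_fixed <- matrix(c(0, 1, 1, 0,
                        1, 0, 1, 1,
                        1, 1, 0, 0,
                        0, 1, 1, 1), nrow = 4, byrow = TRUE)
result <- mutlinks_serial(testm_fixed)
print(result)
```

If the assumption about mtl() holds, calling mutlinks(cl,testm_fixed) on the cluster should give the same value.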

In the makeCluster() call, we are instructing snow to start two new R
processes on our machine (localhost is a standard network name for the local
machine), which I will refer to here as workers. I’ll refer to the original
R process, the one in which we type the preceding commands, as the manager.
So, at this point, three instances of R will be running on the machine
(visible by running the ps command if you are in a Linux environment, for
example).
The workers form a cluster in snow parlance, which we have named cl.
The snow package uses what is known in the parallel-processing world as a
scatter/gather paradigm, which works as follows:


  1. The manager partitions the data into chunks and parcels them out to
    the workers (scatter phase).

