The Art of R Programming

Recall that the argument df is the two-fangyan data frame, output from
merge2fy(). The arguments fromcol and tocol are the names of the source and
mapped columns. The string sourceval is the source value to be mapped. For
concreteness, consider the earlier examples in which sourceval was x.
The first task is to determine which rows in df correspond to sourceval.
This is accomplished via a straightforward application of which() in line 2.
This information is then used in line 3 to extract the relevant subdata frame.
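
As a rough sketch of these two steps, consider a small toy data frame. The data, the column names cancons and mancons, and the values are invented here for illustration and are not the actual fangyan data.

```r
# Hypothetical toy data: one row per character, with a source-column
# value and a mapped-column value (all names and values are made up).
df <- data.frame(
  char    = c("a", "b", "c", "d", "e"),
  cancons = c("x", "s", "x", "x", "k"),
  mancons = c("ch", "s", "h", "ch", "k"),
  stringsAsFactors = FALSE
)
fromcol   <- "cancons"
sourceval <- "x"

# Step 1: indices of the rows whose source column equals sourceval
base <- which(df[[fromcol]] == sourceval)
base                      # 1 3 4

# Step 2: extract those rows as a sub-data frame
basedf <- df[base, ]
nrow(basedf)              # 3
```

Here which() turns the logical vector produced by the comparison into row indices, and those indices then select the rows.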
In that latter frame, consider the form that basedf[[tocol]] will take in
line 6. These will be the values that x maps to (that is, ch, h, and so on). The
purpose of line 6 is to determine which rows of basedf contain which of these
mapped values. Here, we use R’s split() function. We’ll discuss split() in
detail in Section 6.2.2, but the salient point is that sp will be a list of data
frames: one for ch, one for h, and so on.
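
To see what split() produces in a case like this, here is a minimal sketch on a made-up basedf. The column name mancons and the rows are assumptions for illustration only.

```r
# Hypothetical sub-data frame: rows whose source value was "x"
basedf <- data.frame(
  char    = c("a", "c", "d"),
  mancons = c("ch", "h", "ch"),
  stringsAsFactors = FALSE
)
tocol <- "mancons"

# Group the rows of basedf by their mapped value
sp <- split(basedf, basedf[[tocol]])

class(sp)        # "list"
sp$ch            # the rows of basedf that map to ch
sp$h             # the rows of basedf that map to h
```

Each element of sp is itself a data frame, named after the mapped value it collects.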
This sets up line 8. Since sp will be a list of data frames (one for each
mapped value), applying the nrow() function via sapply() will give us the
number of characters for each of the mapped values, such as the number of
characters in which the map x → ch occurs (15 times, as seen in the
example call).
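
The counting step can be sketched the same way. The list below is built from invented data, so the counts are toy values, not the 15 from the example call.

```r
# Hypothetical list of data frames, as split() would return it
basedf <- data.frame(
  char    = c("a", "c", "d"),
  mancons = c("ch", "h", "ch"),
  stringsAsFactors = FALSE
)
sp <- split(basedf, basedf[["mancons"]])

# Apply nrow() to each data frame in the list; sapply() simplifies
# the result to a named vector with one count per mapped value
counts <- sapply(sp, nrow)
counts[["ch"]]   # 2
counts[["h"]]    # 1
```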
The complexity of the code here makes this a good time to comment on
programming style. Some readers may point out, correctly, that lines 2 and 3
could be replaced by a one-liner:

basedf <- df[df[[fromcol]] == sourceval,]

But to me, that line, with its numerous brackets, is harder to read.
My personal preference is to break down operations if they become too
complex.
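
For readers who want to compare the two forms, here is a small sketch on invented data. It also illustrates one behavioral difference that may matter in practice: if the source column contains NA values, which() silently drops them, while logical indexing keeps a row of NAs for each one. The data frames and column names are hypothetical.

```r
# NA-free toy data: the two forms select the same rows
clean <- data.frame(from = c("x", "s", "x"),
                    to   = c("ch", "s", "h"),
                    stringsAsFactors = FALSE)
twostep <- clean[which(clean[["from"]] == "x"), ]
oneline <- clean[clean[["from"]] == "x", ]
nrow(twostep)    # 2
nrow(oneline)    # 2

# Toy data with a missing source value: the forms now differ
dirty <- data.frame(from = c("x", "s", NA, "x"),
                    to   = c("ch", "s", "h", "h"),
                    stringsAsFactors = FALSE)
nrow(dirty[which(dirty[["from"]] == "x"), ])   # 2
nrow(dirty[dirty[["from"]] == "x", ])          # 3, including an all-NA row
```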
Similarly, the last few lines of code could be compacted to another
one-liner:

list(counts=sapply(sp,nrow),images=sp)

Among other things, this dispenses with the return(), conceivably speeding
up the code. Recall that in R, the last value computed by a function is
automatically returned anyway, without a return() call. However, the time savings
here are really small and rarely matter, and again, my personal belief is that
including the return() call is clearer.
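
The two styles can be placed side by side in a short sketch. The function names tallyexplicit and tallyimplicit and the toy data are invented for illustration, and the multi-line body is only assumed to resemble the longer form discussed above.

```r
# Explicit style: build the result step by step, then return() it
tallyexplicit <- function(sp) {
  retval <- list()
  retval$counts <- sapply(sp, nrow)
  retval$images <- sp
  return(retval)
}

# Compact style: the last value computed is returned automatically
tallyimplicit <- function(sp) {
  list(counts = sapply(sp, nrow), images = sp)
}

# Hypothetical input: a list of data frames, one per mapped value
basedf <- data.frame(char    = c("a", "c", "d"),
                     mancons = c("ch", "h", "ch"),
                     stringsAsFactors = FALSE)
sp <- split(basedf, basedf[["mancons"]])

res1 <- tallyexplicit(sp)
res2 <- tallyimplicit(sp)
names(res1)      # "counts" "images"
```

Both functions should produce the same list, so the choice between them is mainly one of readability.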
