The Art of R Programming

Here, I specified that the file DA had fields separated by commas. The
function then reported the number of fields in each record of the file, which
fortunately were all 5s.
I could have used all() to check this, rather than checking it visually, via
this call:

all(count.fields("DA",sep=",") >= 5)

A return value of TRUE would mean everything is fine. Alternatively, I
could have used this form:

table(count.fields("DA",sep=","))

I would then get counts of the numbers of records with five fields, four
fields, six fields, and so on.
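As a minimal sketch of both checks, the following uses a small temporary file in place of DA. The file contents and the variable names (tmp, nf, bad_lines) are invented for illustration. Note that this sketch tests with == 5 rather than >= 5, so that records with too many fields are flagged as well.

```r
# Hypothetical sample data written to a temporary file (stands in for "DA").
tmp <- tempfile(fileext = ".csv")
writeLines(c("EmpID,Name,Dept,Salary,Start",
             "1,Ann,Sales,50000,2001",
             "2,Bob,Ops,45000",            # this record has only 4 fields
             "3,Cy,Sales,52000,2005"), tmp)

# Number of fields on each line (the header line is counted too).
nf <- count.fields(tmp, sep = ",")

# Single TRUE/FALSE answer: does every line have exactly 5 fields?
all_five <- all(nf == 5)

# Which lines are the offenders?
bad_lines <- which(nf != 5)

# Tally of how many lines have each field count.
tab <- table(nf)
print(tab)
```

Here all_five is FALSE, and bad_lines points to the second line of the file, which is usually the quickest way to find the record that needs repair.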
After this check, I then read in the files as data frames:

da <- read.csv("DA",header=TRUE,stringsAsFactors=FALSE)
db <- read.csv("DB",header=FALSE,stringsAsFactors=FALSE)

I wanted to check for possible spelling errors in the various fields, so I
ran the following code:

for (col in seq_len(ncol(da)))
   print(unique(sort(da[,col])))

This gave me a list of the distinct values in each column so that I could
visually scan for incorrect spellings.
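To see what such a scan turns up, here is a small sketch on a made-up data frame. The frame emp, its columns, and the misspelling "Slaes" are all hypothetical. Adding table() is my own suggestion rather than part of the original session; it shows how often each value occurs, and misspellings tend to stand out as rare values.

```r
# Hypothetical data frame with one misspelled department name.
emp <- data.frame(id = c(1, 2, 3, 4),
                  dept = c("Sales", "Slaes", "Ops", "Sales"),
                  stringsAsFactors = FALSE)

# Print the distinct values of every column, in sorted order.
for (col in seq_along(emp))
   print(sort(unique(emp[, col])))

# Distinct department names, and how often each one occurs.
dept_values <- sort(unique(emp$dept))
dept_counts <- table(emp$dept)
print(dept_counts)
```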
I needed to merge the two data frames, matching by employee ID, so I
ran the following code:

mrg <- merge(da,db,by.x=1,by.y=1)

I specified that the first column would be the merge variable in both
cases. (As remarked earlier, I could also have used field names rather than
numbers here.)
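The following sketch shows the same kind of merge on two tiny, invented data frames (da_demo and db_demo are hypothetical stand-ins for the real files). It performs the merge once by column position and once by column name. By default, merge() keeps only the rows whose key appears in both frames.

```r
# Hypothetical employee tables; names and values are made up for illustration.
da_demo <- data.frame(EmpID = c(101, 102, 103),
                      Name = c("Ann", "Bob", "Cy"),
                      stringsAsFactors = FALSE)
# No header in the second file, so it gets default names V1, V2.
db_demo <- data.frame(V1 = c(103, 101, 104),
                      V2 = c(52000, 50000, 61000))

# Merge on the first column of each frame, identified by position.
m1 <- merge(da_demo, db_demo, by.x = 1, by.y = 1)

# The same merge, identifying the key columns by name instead.
m2 <- merge(da_demo, db_demo, by.x = "EmpID", by.y = "V1")

print(m1)
```

Employees 102 and 104 each appear in only one of the frames, so neither shows up in the result. If unmatched rows should be kept, merge() accepts all=TRUE (or all.x and all.y) for that purpose.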

5.4 Applying Functions to Data Frames


As with lists, you can use the lapply and sapply functions with data frames.

5.4.1 Using lapply() and sapply() on Data Frames


Keep in mind that data frames are special cases of lists, with the list
components consisting of the data frame's columns. Thus, if you call lapply()
on a data frame with a specified function f(), then f() will be called on each
of the frame's columns, with the return values placed in a list.
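As a brief illustration, the sketch below applies sort() and class() to the columns of a small data frame. The frame kd and its contents are invented for this example.

```r
# Hypothetical two-column data frame.
kd <- data.frame(kids = c("Jack", "Jill"),
                 ages = c(12, 10),
                 stringsAsFactors = FALSE)

# lapply() calls sort() on each column and returns a list
# with one component per column.
sorted_cols <- lapply(kd, sort)

# sapply() does the same kind of column-by-column call, but
# simplifies the result to a named vector when it can.
col_classes <- sapply(kd, class)

print(sorted_cols)
print(col_classes)
```

Note that each column is sorted on its own here, so the correspondence between rows is lost: after sorting, the first age (10) no longer belongs to the first name (Jack). A result like sorted_cols is therefore a list of independent vectors, not a reordered data frame.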
