The Art of R Programming


3 Jillian MA
4 John HI



> d2a <- rbind(d2,list(15,"Jill"))
> d2a
  ages    kids
1   12    Jack
2   10    Jill
3    7 Lillian
4   15    Jill
> merge(d1,d2a)
  kids states ages
1 Jack     CA   12
2 Jill     MA   10
3 Jill     MA   15



There are two Jills in d2a. There is a Jill in d1 who lives in Massachusetts
and another Jill with unknown residence. In our previous example,
merge(d1,d2), there was only one Jill, who was presumed to be the same person
in both data frames. But here, in the call merge(d1,d2a), it may have
been the case that only one of the Jills was a Massachusetts resident. It is
clear from this little example that you must choose matching variables with
great care.
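
One way to catch this kind of ambiguity before merging is to look for key values that occur more than once in either data frame. The following is a minimal sketch, not part of the original session: d1 and d2a are reconstructed here from the output shown above (an assumption), and the variable names dup1, dup2, and ambiguous are purely illustrative.

```r
# d1 and d2a are rebuilt from the printed output above (an assumption).
d1 <- data.frame(kids = c("Jack", "Jill", "Jillian", "John"),
                 states = c("CA", "MA", "MA", "HI"),
                 stringsAsFactors = FALSE)
d2a <- data.frame(ages = c(12, 10, 7, 15),
                  kids = c("Jack", "Jill", "Lillian", "Jill"),
                  stringsAsFactors = FALSE)

# Key values appearing more than once within a single data frame
dup1 <- unique(d1$kids[duplicated(d1$kids)])
dup2 <- unique(d2a$kids[duplicated(d2a$kids)])

# Any such key makes the match between the two data frames ambiguous
ambiguous <- union(dup1, dup2)
print(ambiguous)
```

With the data above, the only key flagged is "Jill". Whether a repeated key is an error or a legitimate one-to-many relationship depends on the data, so a check like this only tells you where to look.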


5.3.1 Extended Example: An Employee Database


The following is an adaptation of one of my consulting projects. At issue
was whether older workers were faring as well as younger ones. I had data
on several variables, such as age and performance ratings, which I used in
my comparison of the older and younger employees. I also had employee
ID numbers, which were crucial in being able to connect the two data files:
DA and DB.
The DA file had this header:


"EmpID","Perf 1","Perf 2","Perf 3","Job Title"


These are names for the employee ID, three performance ratings, and
the job title. DB had no header. The variables again began with the ID,
followed by start and end dates of employment.
Both files were in CSV format. Part of my data-cleaning phase consisted
of checking that each record contained the proper number of fields. DA, for
example, should have five fields per record. Here is the check:



count.fields("DA",sep=",")
[1]5555555555555555555555555555555555
5555
...
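
Scanning a long run of 5s by eye is error-prone, so the same check can be done programmatically. The sketch below is illustrative only: the real DA file is not shown, so it writes a small hypothetical CSV (with one deliberately short record) to a temporary file, and the names tmp, nf, and bad are assumptions.

```r
# Hypothetical stand-in for the DA file; the third line lacks one field.
tmp <- tempfile(fileext = ".csv")
writeLines(c('"EmpID","Perf 1","Perf 2","Perf 3","Job Title"',
             '101,3,4,5,"Analyst"',
             '102,2,4,"Clerk"',
             '103,5,5,4,"Manager"'), tmp)

# Number of comma-separated fields on each line, header included
nf <- count.fields(tmp, sep = ",")

# TRUE only if every line has exactly five fields
print(all(nf == 5))

# Line numbers of the offending records
bad <- which(nf != 5)
print(bad)
```

Here all(nf == 5) is FALSE and bad identifies line 3 of the file, the record to inspect or repair before reading the data into a data frame.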


