The Art of R Programming

7        # 2, and tones in row 3
8        saplout <- sapply((fy[[2]]),sepsoundtone)
9        # convert it to a data frame
10       tmpdf <- data.frame(fy[,1],t(saplout),row.names=NULL,
11          stringsAsFactors=FALSE)
12       # add names to the columns
13       consname <- paste(names(fy)[[2]]," cons",sep="")
14       restname <- paste(names(fy)[[2]]," sound",sep="")
15       tonename <- paste(names(fy)[[2]]," tone",sep="")
16       names(tmpdf) <- c("Ch char",consname,restname,tonename)
17       # need to use merge(), not cbind(), due to possibly different
18       # ordering of fy, outdf
19       outdf <- merge(outdf,tmpdf)
20    }
21    return(outdf)
22 }
23
24 # separates romanized pronunciation pronun into initial consonant, if any,
25 # the remainder of the sound, and the tone, if any
26 sepsoundtone <- function(pronun) {
27    nchr <- nchar(pronun)
28    vowels <- c("a","e","i","o","u")
29    # how many initial consonants?
30    numcons <- 0
31    for (i in seq_len(nchr)) {
32       ltr <- substr(pronun,i,i)
33       if (!ltr %in% vowels) numcons <- numcons + 1 else break
34    }
35    cons <- if (numcons > 0) substr(pronun,1,numcons) else NA
36    tone <- substr(pronun,nchr,nchr)
37    numtones <- !(tone %in% letters)  # tone present if last char is not a letter; T is 1, F is 0
38    if (numtones == 0) tone <- NA
39    therest <- substr(pronun,numcons+1,nchr-numtones)
40    return(c(cons,therest,tone))
41 }

So, even the merging code is not so simple. And this code makes some
simplifying assumptions, excluding some important cases. Textual analysis is
never for the faint of heart!
Not surprisingly, the merging process begins with a call to merge(), in
line 3. This creates a new data frame, outdf, to which we will append new
columns for the separated sound components.
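
As a minimal sketch of what that initial merge() call does, consider two small data frames that share one column. The frames can and man, their contents, and the column name char (standing in for the Chinese-character column) are hypothetical and chosen only for illustration.

```r
# Hypothetical toy data; 'char' stands in for the Chinese-character column.
can <- data.frame(char = c("c1", "c2", "c3"),
                  Can = c("jat1", "ji6", "saam1"),
                  stringsAsFactors = FALSE)
man <- data.frame(char = c("c2", "c3", "c4"),
                  Man = c("er4", "san1", "si4"),
                  stringsAsFactors = FALSE)

# With no 'by' argument, merge() joins on the columns the two frames
# have in common (here only 'char') and keeps the keys found in both.
outdf <- merge(can, man)
print(outdf)
```

Only the keys present in both inputs survive, and each surviving row carries the columns of both frames, which gives later steps a single frame to extend.
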
The real work, then, involves the separation of a romanization into its
sound components. For that, there is a loop in line 5 across the two input
data frames. In each iteration, the current data frame is split into sound
components, with the result appended to outdf in line 19. Note the
comment preceding that line regarding the unsuitability of cbind() in this
situation.
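
One iteration of such a loop might look like the following sketch. The splitter splitpron() is a hypothetical regular-expression stand-in for sepsoundtone(), not the listing's own code, and the data and column names are invented for illustration. The point is the last two statements: cbind() pairs rows by position, while merge() pairs them by key.

```r
# Hypothetical stand-in for sepsoundtone(): a regular expression splits a
# romanization into initial consonants, remaining letters, and a tone digit.
splitpron <- function(pronun) {
  pat <- "^([^aeiou0-9]*)([^0-9]*)([0-9]?)$"
  parts <- regmatches(pronun, regexec(pat, pronun))[[1]][2:4]
  parts[which(parts == "")] <- NA  # treat an empty piece as absent
  parts
}

# Toy data; 'char' stands in for the Chinese-character column.
outdf <- data.frame(char = c("c1", "c2", "c3"),
                    Can = c("jat1", "ji6", "saam1"),
                    stringsAsFactors = FALSE)
fy <- outdf[c(3, 1, 2), ]  # same rows, stored in a different order

# One column per pronunciation; rows hold consonant, remainder, tone.
saplout <- sapply(fy[[2]], splitpron)
tmpdf <- data.frame(fy[, 1], t(saplout), row.names = NULL,
                    stringsAsFactors = FALSE)
names(tmpdf) <- c("char", "cons", "sound", "tone")

# cbind() pairs rows by position, so the pieces land on the wrong rows.
bad <- cbind(outdf, tmpdf[, -1])
# merge() pairs rows by the shared 'char' key, whatever their order.
good <- merge(outdf, tmpdf)
```

Printing bad and good side by side shows the difference: in bad, the first row attaches the components of "saam1" to "jat1", whereas in good every pronunciation sits next to its own components.
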
