The Art of R Programming

This reports that “uat” did indeed appear in “Equator,” starting at
character position 3.

11.1.8 gregexpr()


The call gregexpr(pattern,text) is the same as regexpr(), but it finds all
instances of pattern. Here’s an example:

> gregexpr("iss","Mississippi")
[[1]]
[1] 2 5

This finds that “iss” appears twice in “Mississippi,” starting at character
positions 2 and 5.
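
As a minimal sketch of how one might work with this result, the snippet below pulls the start positions out of the list that gregexpr() returns. The variable names (m, starts, lens, found) are illustrative only. Note that the actual console output also prints attributes such as match.length, which the transcript above omits.

```r
# Find every occurrence of "iss" in "Mississippi".
m <- gregexpr("iss", "Mississippi")

# gregexpr() returns a list with one element per input string.
starts <- as.integer(m[[1]])            # start positions: 2 5
lens   <- attr(m[[1]], "match.length")  # length of each match: 3 3

# regmatches() extracts the matched substrings themselves.
found <- regmatches("Mississippi", m)[[1]]  # "iss" "iss"
```

Because the result is a list, the same code extends to a vector of strings by indexing m[[2]], m[[3]], and so on.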

11.2 Regular Expressions


When dealing with string-manipulation functions in programming languages,
the notion of regular expressions sometimes arises. In R, you must pay
attention to this point when using the string functions grep(), grepl(),
regexpr(), gregexpr(), sub(), gsub(), and strsplit().
A regular expression is a kind of wild card. It’s shorthand to specify
broad classes of strings. For example, the expression "[au]" refers to any
string that contains either of the letters a or u. You could use it like this:

> grep("[au]",c("Equator","North Pole","South Pole"))
[1] 1 3

This reports that elements 1 and 3 of ("Equator","North Pole","South
Pole"), that is, “Equator” and “South Pole,” contain either an a or a u.
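
The same pattern can be inspected in a few other ways. The following is a small sketch, assuming only base R; the variable names places, hits, and flags are illustrative. It shows grep() returning the matching strings rather than their indices, and grepl() returning one logical value per element.

```r
places <- c("Equator", "North Pole", "South Pole")

# value = TRUE returns the matching elements instead of their indices.
hits <- grep("[au]", places, value = TRUE)   # "Equator" "South Pole"

# grepl() reports a TRUE/FALSE answer for every element.
flags <- grepl("[au]", places)               # TRUE FALSE TRUE
```

The logical form is convenient for subsetting, as in places[flags].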
A period (.) represents any single character. Here’s an example of
using it:

> grep("o.e",c("Equator","North Pole","South Pole"))
[1] 2 3

This searches for three-character strings in which an o is followed by any
single character, which is in turn followed by an e. Here is an example of the
use of two periods to represent any pair of characters:

> grep("N..t",c("Equator","North Pole","South Pole"))
[1] 2

Here, we searched for four-letter strings consisting of an N, followed by
any pair of characters, followed by a t.
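
To see which substrings the periods actually matched, one option is to combine regexpr() with regmatches(). This is a sketch only, and the names places, ole, and nort are illustrative.

```r
places <- c("Equator", "North Pole", "South Pole")

# regexpr() locates the first match in each string (-1 where there is none);
# regmatches() then returns the matched text for the strings that matched.
ole  <- regmatches(places, regexpr("o.e", places))   # "ole" "ole"
nort <- regmatches(places, regexpr("N..t", places))  # "Nort"
```

In “North Pole” and “South Pole” the period stands in for the l of “Pole,” while the two periods in "N..t" stand in for the “or” of “North.”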
A period is an example of a metacharacter, which is a character that is not
to be taken literally. For example, if a period appears in the first argument
of grep(), it doesn’t actually mean a period; it means any character.
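
A short sketch makes the consequence concrete. The vector x is a made-up example, and the escaping and fixed = TRUE approaches shown are standard base R ways to ask for a literal period instead.

```r
x <- c("abc", "de", "f.g")

# As a metacharacter, "." matches any character, so every element matches.
any_char <- grep(".", x)                  # 1 2 3

# Escaping the period makes it literal (the backslash is doubled inside
# an R string).
escaped <- grep("\\.", x)                 # 3

# fixed = TRUE treats the whole pattern as plain text, not as a
# regular expression.
literal <- grep(".", x, fixed = TRUE)     # 3
```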
