CHAPTER 15 ■ GENERICS AND REGULAR EXPRESSIONS
Metacharacter Description
$ Matches the end of the string. Z$ matches a Z character at the end of the string.
| Matches the expression on either side of itself. (This character is sometimes called
the pipe character.) For example, this|that would match either “this” or “that”. The
expressions need not be string literals. [a-s]|[u-z] finds any lower-case character
other than t.
] Closes a set of characters. For example, [a-z] would match any single lower-case
letter.
} Closes a match count specifier. For example, n{2} would match two n characters in
a row: nn. ban{2} matches “bann”, and [a-z]{2}n matches any lower-case three-letter
string that ends with n, such as “sun”, “fun”, “ban”, “wan”, and so on. Of course, it
also matches nonsense strings, such as “dcn”.
) Closes a subpattern (a pattern within the larger pattern). For example,
identit(y|ies) lets you match either “identity” or “identities”.
Also ends the definition of a group. (Cat) treats those three characters as a single
unit for other regular expression operators.
? Matches the preceding character 0 or 1 times. For example, ban? matches “ba” and
“ban”.
* Matches the preceding character any number (including 0) of times. For example,
ban* matches “ba”, “ban”, “bann”, “bannn”, and so on.
+ Matches the preceding character one or more times. For example, ban+ matches “ban”,
“bann”, “bannn”, and so on. It does not match “ba” because the n character has to
appear at least once.
. Matches any single character (by default, any character except a line terminator).
For example, bar. matches “bark”, “bard”, “bar9”, and so on. .* matches any number
of any character. It is probably the most used regular expression, because it lets
you skip over any text you don't want to match to find the bits you do want to
match. We'll see some examples later in this chapter.
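If you want to check a few of the table's examples for yourself, here is a minimal sketch that uses java.util.regex.Pattern. The class name MetacharacterDemo and the helper method matchesWhole are hypothetical names made up for this illustration; they are not listings from elsewhere in the chapter. Note that Pattern.matches succeeds only when the entire input matches the expression, so the last line uses find() instead, which searches inside the input.

```java
import java.util.regex.Pattern;

public class MetacharacterDemo {

    // Hypothetical helper: true only if the WHOLE input matches the regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // | : the expression on either side may match
        System.out.println(matchesWhole("this|that", "that"));            // true
        // [a-s]|[u-z] : any lower-case letter other than t
        System.out.println(matchesWhole("[a-s]|[u-z]", "t"));             // false
        // {2} : exactly two of the preceding character or set
        System.out.println(matchesWhole("ban{2}", "bann"));               // true
        System.out.println(matchesWhole("[a-z]{2}n", "sun"));             // true
        // ( ) : a subpattern, here combined with |
        System.out.println(matchesWhole("identit(y|ies)", "identities")); // true
        // ? * + : zero or one, zero or more, one or more
        System.out.println(matchesWhole("ban?", "ba"));                   // true
        System.out.println(matchesWhole("ban*", "bannn"));                // true
        System.out.println(matchesWhole("ban+", "ba"));                   // false
        // . : any single character
        System.out.println(matchesWhole("bar.", "bar9"));                 // true
        // $ : end of the string; find() looks for a match anywhere in the input
        System.out.println(Pattern.compile("Z$").matcher("JAZZ").find()); // true
    }
}
```

Swapping in your own strings is a quick way to get a feel for each metacharacter before we move on to longer expressions.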
From all those examples, I bet you're beginning to get an idea of how powerful regular expressions
can be. In truth, though, describing the metacharacters is just scratching the surface of regular
expressions. There's a lot more to them than what I've shown here. Let's learn a little more by looking at
examples.
Returning to our example involving fictional characters named Sam, suppose we want to get the
whole name (including the separator, which is a semicolon). We might try something like the following:
(Sam).*;