Core PHP Programming: Using PHP to Build Dynamic Web Sites

Chapter 16. PARSING AND STRING EVALUATION


Tokenizing
Regular Expressions
Defining Regular Expressions
Using Regular Expressions in PHP Scripts


Parsing is the act of breaking a whole into components, usually a sentence into words.
PHP must parse the code you write as a first step in turning a script into an HTML
document. There will come a time when you are faced with extracting or verifying data
collected in a string. This could be as simple as a tab-delimited list. It could be as
complicated as the string a browser uses to identify itself to a Web server. You may
choose to tokenize the string, breaking it into pieces. Or you may choose to apply a
regular expression. This chapter examines PHP's functions for parsing and string
evaluation.


Tokenizing


PHP allows for a simple model for tokenizing a string. Certain characters, of your choice,
are considered separators. Strings of characters between separators are considered tokens.
You may change the set of separators with each token you pull from a string, which is
handy for irregular strings—that is, ones that aren't simply comma-separated lists.
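
To illustrate changing the separator set between tokens, here is a minimal sketch using strtok. The record format (a key, a colon, then comma-separated values) and the variable names are assumptions made up for this example, not taken from the book.

```php
<?php
// Hypothetical record: a key, then a colon, then comma-separated values.
$record = "colors:red,green,blue";

// First call: pass the string and the separator set for the first token.
$key = strtok($record, ":");

// Later calls: pass only a separator set. Here it switches to a comma,
// so the rest of the string is split on commas instead of colons.
$values = [];
while (($token = strtok(",")) !== false) {
    $values[] = $token;
}

echo $key, "\n";                     // colors
echo implode(" | ", $values), "\n";  // red | green | blue
```

The comparison against false uses !== so that a token such as "0" is not mistaken for the end of the string.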


Listing 16.1 accepts a sentence and breaks it into words using the strtok function,
described in Chapter 9, "Data Functions." As far as the script is concerned, a word
is surrounded by a space, punctuation, or either end of the sentence. Single and double
quotes are left as part of the word.
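
As a rough illustration of the rules just described (and not a reproduction of Listing 16.1), the sketch below splits a sentence into words with strtok. The function name tokenizeSentence, the exact separator set, and the sample sentence are assumptions chosen for this example.

```php
<?php
// Split a sentence into words with strtok().
// The separator set holds a space and common punctuation. Single and
// double quotes are deliberately left out, so they stay part of the word.
function tokenizeSentence(string $sentence): array
{
    $separators = " .,;:!?";
    $words = [];

    // The first call supplies the string; later calls supply only separators.
    $token = strtok($sentence, $separators);
    while ($token !== false) {
        $words[] = $token;
        $token = strtok($separators);
    }

    return $words;
}

$words = tokenizeSentence("Hello, world! It's a \"fine\" day.");
print_r($words);
```

For the sample sentence, the resulting words are Hello, world, It's, a, "fine" and day: runs of adjacent separators such as ", " are skipped as a unit, and the quotes remain attached to their words.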


Listing 16.1 Tokenizing a String
