PHP Objects, Patterns and Practice (3rd edition)


APPENDIX B




A Simple Parser


The Interpreter pattern discussed in Chapter 11 does not cover parsing. An interpreter without a parser
is pretty incomplete, unless you persuade your users to write PHP code to invoke the interpreter! Third-
party parsers are available that could be deployed to work with the Interpreter pattern, and that would
probably be the best choice in a real-world project. This appendix, however, presents a simple object-
oriented parser designed to work with the MarkLogic interpreter built in Chapter 11. Be aware that these
examples are no more than a proof of concept. They are not designed for use in real-world situations.


■Note The interface and broad structure of this parser code are based on Steven Metsker’s Building Parsers with
Java (Addison-Wesley, 2001). The brutally simplified implementation is my fault, however, and any mistakes
should be laid at my door. Steven has given kind permission for the use of his original concept.


The Scanner


In order to parse a statement, you must first break it down into a set of words and characters (known as
tokens). The following class uses a number of regular expressions to define tokens. It also provides a
convenient result stack that I will be using later in this section. Here is the Scanner class:
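Before turning to that listing, it may help to see the basic idea of regex-driven tokenizing in isolation. The following is a minimal standalone sketch, not the Scanner's own implementation: the function name sketchTokenize(), the string type labels, and the sample statement are all assumptions made for illustration, and the sketch assumes single-byte (ASCII) input.

```php
<?php
// Hypothetical helper for illustration only; it is not part of the
// gi\parse\Scanner class presented in this appendix.
function sketchTokenize(string $input): array
{
    // One named group per kind of token. Order matters: the catch-all
    // "char" group comes last so that it only claims what nothing else does.
    // The A modifier anchors each match at the current offset.
    $pattern = '/(?<word>[A-Za-z0-9_]+)'
             . '|(?<eol>\r\n|\n|\r)'
             . '|(?<whitespace>[ \t]+)'
             . '|(?<quote>")'
             . '|(?<apos>\')'
             . '|(?<char>.)/A';

    $types  = ['word', 'eol', 'whitespace', 'quote', 'apos', 'char'];
    $tokens = [];
    $offset = 0;
    $length = strlen($input);

    while ($offset < $length) {
        if (!preg_match($pattern, $input, $match, 0, $offset)) {
            break; // nothing matched at this position; stop scanning
        }
        // Find which named group actually captured text.
        foreach ($types as $type) {
            if (isset($match[$type]) && $match[$type] !== '') {
                $tokens[] = [$type, $match[$type]];
                $offset += strlen($match[$type]);
                break;
            }
        }
    }
    return $tokens;
}

// Usage: print each token's type and text for a sample statement.
foreach (sketchTokenize('$input equals "4"') as [$type, $text]) {
    printf("%-10s %s\n", $type, json_encode($text));
}
```

Each pass through the loop consumes at least one character, so the scan always terminates. The Scanner class itself, which follows, also tracks line and character positions, which this sketch leaves out.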


namespace gi\parse;

class Scanner {

    // token types
    const WORD = 1;
    const QUOTE = 2;
    const APOS = 3;
    const WHITESPACE = 6;
    const EOL = 8;
    const CHAR = 9;
    const EOF = 0;
    const SOF = -1;

    protected $line_no = 1;
    protected $char_no = 0;
