Concepts of Programming Languages

(Sean Pound) #1

190 Chapter 4 Lexical and Syntax Analysis


It is not difficult to see that together, these two rules generate the same lan-
guage as the two rules with which we began. However, these two pass the
pairwise disjointness test.
If the grammar is being used as the basis for a recursive-descent parser, an
alternative to left factoring is available. With an EBNF extension, the problem
disappears in a way that is very similar to the left factoring solution. Consider
the original rules above for <variable>. The subscript can be made optional by
placing it in square brackets, as in

<variable> → identifier [ [<expression] ]

In this rule, the outer brackets are metasymbols that indicate that what is inside
is optional. The inner brackets are terminal symbols of the programming lan-
guage being described. The point is that we replaced two rules with a single
rule that generates the same language but passes the pairwise disjointness test.
A formal algorithm for left factoring can be found in Aho et al. (2006). Left
factoring cannot solve all pairwise disjointness problems of grammars. In some
cases, rules must be rewritten in other ways to eliminate the problem.

4.5 Bottom-Up Parsing


This section introduces the general process of bottom-up parsing and includes
a description of the LR parsing algorithm.

4.5.1 The Parsing Problem for Bottom-Up Parsers


Consider the following grammar for arithmetic expressions:

E → E + T | T
T → T * F | F
F → (E) | id

Notice that this grammar generates the same arithmetic expressions as the
example in Section 4.4. The difference is that this grammar is left recursive,
which is acceptable to bottom-up parsers. Also note that grammars for bottom-
up parsers normally do not include metasymbols such as those used to specify
extensions to BNF. The following rightmost derivation illustrates this grammar:
E => E + T
=> E + T * F
=> E + T * id
=> E + F * id
=> E + id * id
=> T + id * id
=> F + id * id
=> id + id * id
Free download pdf