3.3 Parsing and Ambiguity 111
Thus, a parse tree of a string IX: demontrates the structure of the generation
of x from the grammar rules, disregarding the order of applications of these
Parsing Algorithms. The task of finding a parse tree for ~1: from a given
context-free grammar G is called parsing. Parsing is one of the most important
issues in compiler design. By adding certain constraints to the grammar,
people have developed a number of parsing techniques. Here, we only present
a general (but often inefficient) top-cdozun parsing method.
Let G = (V, C, R, S) b e a context-free grammar. A string w E (V U C)*
is called a sentential form if S 3 w, and it is called a left sentential form if
there is a leftmost derivation from S to w. If w is a string of terminal symbols,
then it has a leftmost derivation from S if and only if it has a derivation from
S. This is not true for sentential forms. It is possible that a sentential form
can only be obtained from S through a non-leftmost derivation.
For any context-free grammar G, we define the leftmost graph g(G) of G
as follows:
(a) The vertices of g(G) are all left sentential forms of G.
(b) For any two left sentential forms w1 and ~2, if there is a rule r = (A +
Z) of G such that wr = uAv and w2 = uxv for some u E C and
v E (VuC), th en there is an directed edge from WI to ~2, with label
r.
Thus, g(G) is an edge-labeled directed graph, which is often infinite. Note
that if all leftmost sentential forms of G have a unique leftmost derivation from
the starting symbol S, then g(G) is actually a tree, with the start symbol S
as the root.
To find a parse tree of a string IX: of terminals, we can simply perform a
search of the graph g(G). S ince the graph g(G) might be infinite, the graph
search is not guaranteed to halt on all inputs. Nevertheless, with a minor
restriction to the grammar G, we can be sure that the graph search always
halts. A grammar rule is called an E-rule, if it is of the form A + E for some
A E V. We observe that if a context-free grammar does not have an E-rule,
then the size of the sentential forms in a derivation must be nondecreasing.
Thus, to determine whether S 3 x:, we may simply restrict ourself to search
the subgraph of g(G) over vertices of length < 1x1. -
Example 3.22 Determine whether x = abaca belongs to L(G) or not, where
G = ({S}, {a, b, c}, {S + SbS 1 ScS 1 a), S>.
Solution. We start with the starting symbol S, and perform a depth-first
search of the graph s(G) 7 using the following ordering of the rules: rl : (S +
a), 7-2 1 (S + SbS), r3 : (S + ScS). D uring the search, we also use the
following conditions to prune the graph:
(1) If a vertex has length > 6, then - we delete this vertex.