3.3 Parsing and Ambiguity 117
nn
abaabb
1 J
is incorrect, since we should match the first two symbols a and t, as soon as
they are read. In the following, we will design a context-free grammar to
enforce this matching policy.
For any string w E {a$}*, let c(w) = #&0) - #I. We will use two
extra nonterminals X and Y. The nonterminal X will generate the set
A = {z E {a,b}* 1 c(z) = 0, (Vy, y is a prefix of z$[c(y) > - 01).
The nonterminal Y will generate the set
B = {x E {a, b}* 1 c(x) = 0, (Vy, y is a prefix of z)[c(y) < - 01).
We now construct an unambiguous context-free grammar Gx for set A:
Gx : X _jr ax&X 1 E.
First, we show that L(Gx) = A. It is clear that each string 2 E L(Gx)
has c(z) = 0. Furthermore, a simple induction on the length of z shows that
each prefix y of x must have c(y) > - 0: This is trivial if 1x1 = 0. If 1x1 > 0 and
if x is derived from X via
with X 3 u and X 3 w, then, by the inductive hypothesis on u, any
prefix y of au has c(y) > 1 and c(aub) = 0. Thus, for any prefix x of v,
c(aubz) = c(z) > - 0. It follows that L(Gx) C - A.
Conversely, we prove, by induction on the length of x, that each x E A is
in L(Gx). Again, this is trivial for x = E. For any nonempty string x E A,
it is easy to see that x must begin with a and end with b. Write x = aubv,
where au is the longest prefix of x with the following property:
PI : All (the nonempty prefixes x of au, including au itself, have c(z) > 0.
We claim that both u and v are in A. Indeed, for any prefix w of u, we
have c(w) = c(aw) - 1 > 0, since property PI implies c(aw) > 1. Also,
by the definition of au, we must have c(aub) = 0 and, hence,-c(u) = 0.
Therefore, u E A. We can also verify that c(v) = 0 since c(x) = 0 and
c(aub) = 0, and that all prefixes x of v satisfy c(z) > - 0, since c(aub) =^0
implies c(z) = c(aubx) > 0. -
From the claim and the inductive hypothesis, we have X 5 u and X 3 V.
Together, we get a derivation for x:
X + aXbX 3 aubX 3 aubv = x. (3 1).
This completes the proof of A C L(Gx). -