Problem Solving in Automata, Languages, and Complexity

(Dana P.) #1
94 CONTEXT-FREE LANGUAGES

then d(ul) = 1 and d(uk-1) = -1 (since d(uk) = d(u) = 0). It implies that
d( ua) = 0 for some i between 1 and k - 1. However, we assumed that u is the
shortest nonempty prefix of x with d(u) = 0; thus, this is a contradiction and
the claim is proven.


Now, from the above claim, we see that u can be written as u = bya for
some y E {atb), and x can be written as z = byax for some y, x E {a, b}.
Furthermore, since d(z) = d(u) = 0, we see that d(y) = d(x) = 0. That is, we
have a recursive expression for x: E L: x = byax with y, x E L.
Similarly, if u starts with symbol a, then it must end with symbol b, and
x: can be written as IX: = aybx for some y, x E L.
From the above analysis, we conclude that L can be generated by a gram-
mar G = ({S}, {a, b}, R, S) with the following rules:


S + E 1 uSbS 1 bSuS.

For instance, u4b8u4 has the following derivation:


S 3 uSbS * uuSbSbS 5 u4(Sb)4S $ u4b4S
3 u4b4b4(uS)4 3 u4b8u4. cl

All of the languages studied in the above examples are not regular. Thus,
it shows that a context-free language is not necessarily regular. The following
theorem shows that, on the other hand, a regular language must be context-
free. Thus, the class of regular languages is a proper subclass of the class of
context-free languages.


Theorem 3.7 For each regular language L, there exists a context-free grum-
mar G such that L = L(G).


Proof. A regular language L is accepted by a DFA A4 = (Q, C, 6, 40, F).
Without loss of generality, we may assume that QnE = 0. Now, we construct
a context-free grammar G = (V, C, R, S) with V = Q, S = qo and


R = (4 + UP I &I, a) =PPJ{f---,+fEF}*


Then, by a simple induction, we can prove that for any p, q E Q and any

x E c*, s(4,x) = p if and only if there is a derivation q 5 xp in G. In
particular, x E L if and only if there is a derivation


for some state f E F. This shows that L C L(G). Moreover, since the only
rules in R having a right-hand side with no nonterminal symbols are of the
form f + E with f E F, the above derivations are the only ones in G. It
follows that L = L(G). cl

Free download pdf