Context-Free Grammars: Examples and Construction

 
UNIT III
 
Context Free Grammar and
Languages
 
Ms. Shital Pawar
Assistant Professor
Computer Engineering Department
Bharati Vidyapeeth’s College of Engineering for Women,
Pune 43
 
Formal Definition
 
CFG stands for context-free grammar. It is  a formal grammar which is
used to generate all possible patterns of strings in a given formal
language. Context-free grammar G can be defined by four tuples as:
G = (V, T, P, S)  Where,
G is the grammar, which consists of a set of the production rule. It is
used to generate the string of a language.
T is the final set of a terminal symbol. It is denoted by lower case
letters.
V is the final set of a non-terminal symbol. It is denoted by capital
letters.
P is a set of production rules, which is used for replacing non-terminals
symbols(on the left side of the production) in a string with other
terminal or non-terminal symbols(on the right side of the production).
S is the start symbol which is used to derive the string. We can derive
the string by repeatedly replacing a non-terminal by the right-hand
side of the production until all non-terminal have been replaced by
terminal symbols.
 
 
Example 1:
Construct the CFG for the language having any number of a's
over the set ∑= {a}.
 
Solution:
As we know the regular expression for the above language is
r.e. = a*
Production rule for the Regular expression is as follows:
S → aS    rule 1
S → ε     rule 2
Now if we want to derive a string "aaaaaa", we can start with start
symbols.
 S
aS
aaS          rule 1
aaaS         rule 1
aaaaS        rule 1
aaaaaS       rule 1
aaaaaaS      rule 1
aaaaaaε      rule 2
aaaaaa
 
Example 2
Construct a CFG for the regular expression (0+1)*
 
Solution:
The CFG can be given by,
Production rule (P):
S → 0S | 1S                        e.g.    1010             1S        (S
1S)
S → ε                                                                   10S      (S
0S)
                                                                             101S    (S
1S)
                                                                             1010S   (S
0S)
                                                                              1010    (S
 ε
)
     The rules are in the combination of 0's and 1's with the start
symbol. Since (0+1)* indicates {ε, 0, 1, 01, 10, 00, 11, ....}. In this set,
ε is a string, so in the rule, we can set the rule S → ε.
 
Example 3
Construct a CFG for a language L = {wcwR | where w € (a, b)*}
 
Solution:
The string that can be generated for a given language is {aacaa, bcb,
abcba, bacab, abbcbba, ....}
The grammar could be:
S → aSa     rule 1
S → bSb     rule 2
S → c       rule 3
Now if we want to derive a string "abbcbba", we can start with start
symbols.
S → aSa
S → abSba          from rule 2
S → abbSbba     from rule 2
S → abbcbba     from rule 3
Thus any of this kind of string can be derived from the given
production rules.
 
Example 4
Construct a CFG for the language L = a
n
b
2n
 where n>=1.
 
Solution:
The string that can be generated for a given language is {abb, aabbbb,
aaabbbbbb....}.
The grammar could be:
S → aSbb | abb
 
Now if we want to derive a string "aabbbb", we can start with start
symbols.
 
S → aSbb
S → aabbbb
 
Example 5
Construct a CFG for the language L={w
ϵ
{a,b}*|w is palindrome of
odd length}
 
Solution:
     As string is odd length palindrome middle letter can be either a or b.
     Smallest string for given language can be either a or b.
 
e.g: L={a, b, aba, bab, babab, abbba, aabaa, baaab,….}
S 
aSa
S 
bSb
S 
a|b
It can be written as
S 
 aSa | bSb | a | b
 
Example 6
Construct a CFG for the language L={w
ϵ
{a,b}*|w is palindrome of
even length with |w| > 0}
 
Solution:
     As string is odd length palindrome middle letter can be either a or b.
     Smallest string for given language can be either aa or bb.
 
e.g: L={aa, bb, abba, baab, babbab, abbbba, aabbaa, baaaab,….}
S 
aSa
S 
bSb
S 
aa|bb
It can be written as
S 
 aSa | bSb | aa | bb
 
Example 7
Construct a CFG for the language L={w
ϵ
{a,b}*|w is palindrome of
even length or odd length with |w| > 0}
 
Solution:
     As string is odd length palindrome middle letter can be either a or b.
     Smallest string for given language can be either a or b.
 
e.g: L={a, b, aba, bab, babab, abbba, aabaa, baaab,aa, bb, abba, baab,
babbab, abbbba, aabbaa, baaaab,….}
S 
aSa
S 
bSb
S 
 a|b|aa|bb
It can be written as
S 
 aSa | bSb | a | b | aa | bb
 
Example 8
Construct a CFG for the language L={a
i
 b
j 
c
k 
 | i = j + k}
 
 
Solution:
Outer a’s should be matched with c’s then the remaining  a’s should be
matched with b’s .
e.g. L = {aabc, aaabcc, aaabbc, aaaabbcc, aaaabccc,…..}
 
S  
 aSc |aXc
X 
 aXb |ab
 
Example 9
Construct a CFG for the language L={a
i
 b
j 
c
k 
 | j = i + k}
 
 
Solution:
Outer a’s should be matched with c’s then the remaining  a’s should be
matched with b’s .
e.g. L = {abbc,aabbbc,abbbcc,aabbbbcc,………}
  a
i
 b
i   
b
k 
c
k
 
S  
 XY
X 
 aXb |ab
Y 
 bYc | bc
 
Example 10
Construct a CFG for the language (011+1)*(01)*
 
Solution:
Let us consider that (011 + 1)* is generated by variable X
X 
 011X | 1X | 
ε
Let us consider that (01)* is generated by variable Y
Y 
 01Y | 
ε
 
By using concatenation rule, we can combine above two production
rules:
So, CFG for given language is,
S  
 XY
X 
 011X | 1X | 
ε
Y 
 01Y | 
ε
 
Example 11
Construct a CFG for the language L={a
n
 b
m 
| n
 ≠ m
}
 
Solution:
As n ≠ m number a’s are different than number of b’s.
Let us consider a
n  
 is generated
  
 by variable X
X 
 aX | a
Let us consider b
m  
is generated
  
 by variable Y
Y 
 bY | b
So, CFG for given language is,
S 
 aSb | X | Y
X 
 aX | a
Y 
 bY | b
 
 
 
 
Derivation
 
Derivation is a sequence of production rules. It is used to get
the input string through these production rules. During
parsing, we have to take two decisions. These are as follows:
 
We have to decide the non-terminal which is to be replaced.
 
We have to decide the production rule by which the non-
terminal will be replaced.
 
We have two options to decide which non-terminal to be
placed with production rule.
 
 
 
Leftmost Derivation
 
In the leftmost derivation, the input is scanned and replaced with the
production rule from left to right. So in leftmost derivation, we read the
input string from left to right.
Example:
Production rules:
E = E + E
E = E - E
E = a | b
Input
a - b + a
The leftmost derivation is:
E = E + E
E = E - E + E
E = a - E + E
E = a - b + E
E = a - b + a
 
 Rightmost Derivation
 
In rightmost derivation, the input is scanned and replaced with the
production rule from right to left. So in rightmost derivation, we read the
input string from right to left.
Example
Production rules:
E = E + E
E = E - E
E = a | b
Input
a - b + a
The rightmost derivation is:
E = E - E
E = E - E + E
E = E - E + a
E = E - b + a
E = a - b + a
 
Note:
 When we use the leftmost derivation or rightmost derivation, we
may get the same string. This type of derivation does not affect on getting
of a string.
 
 
Example 1
Derive the string "abb" for leftmost derivation and rightmost
derivation using a CFG given by,
S → AB | ε
A → aB
B → Sb
Solution:
Leftmost derivation:                                           Rightmost derivatio
n:
 
 
Example 2
Derive the string "aabbabba" for leftmost derivation and rightmost
derivation using a CFG given by,
S → aB | bA
A → a | aS | bAA
B → b | bS | aBB
 
Solution:
Leftmost derivation:
S
aB            S → aB
aaBB          B → aBB
aabB          B → b
aabbS         B → bS
aabbaB        S → aB
aabbabS       B → bS
aabbabbA      S → bA
aabbabba      A → 
a
 
 
Example 2
Derive the string "aabbabba" for leftmost derivation and rightmost
derivation using a CFG given by,
S → aB | bA
A → a | aS | bAA
B → b | bS | aBB
Rightmost derivation:
S
aB            S → aB
aaBB          B → aBB
aaBbS         B → bS
aaBbbA        S → bA
aaBbba        A → a
aabSbba       B → bS
aabbAbba      S → bA
aabbabba      A → a
 
Example 3
Derive the string "00101" for leftmost derivation and rightmost
derivation using a CFG given by,
S → A1B
A → 0A | ε
B → 0B | 1B | ε
Solution:
Leftmost derivation:
S
A1B
0A1B
00A1B
001B
0010B
00101B
00101
 
Example 3
Derive the string "00101" for leftmost derivation and rightmost
derivation using a CFG given by,
S → A1B
A → 0A | ε
B → 0B | 1B | ε
 
Rightmost derivation:
S
A1B
A10B
A101B
A101
0A101
00A101
00101
 
Derivation Tree
 
Derivation tree is a graphical representation for the derivation
of the given production rules for a given CFG. It is the simple
way to show how the derivation can be done to obtain some
string from a given set of production rules. The derivation tree
is also called a parse tree.
 
Parse tree follows the precedence of operators. The deepest
sub-tree traversed first. So, the operator in the parent node
has less precedence over the operator in the sub-tree.
 
Properties of parse tree
 
 
 
The root node is always a node indicating start symbols.
The derivation is read from left to right.
The leaf node is always terminal nodes.
The interior nodes are always the non-terminal nodes.
 
Step 1:                                                                          Step 3:
 
 
 
 
 
 
 
 
 
 
 
Step 2:                                                                         Step 4:
 
Example 1
Production rules:
E = E + E
E = E * E
      E = a | b | c
Input
a * b + c
 
Example 2
Draw a derivation tree for the string "bab" from the CFG given by
S → bSb | a | b
 
Solution:
Now, the derivation tree for the string "bbabb" is as follows
:
 
 
 
The above tree is a derivation tree drawn for deriving a string bbabb. By
simply reading the leaf nodes, we can obtain the desired string. The same
tree can also be denoted by,
 
 
Example 3
Construct a derivation tree for the string aabbabba for the CFG
given by,
S → aB | bA
A → a | aS | bAA
B → b | bS | aBB
 
Solution:
To draw a tree, we will first try to obtain derivation for the string aabbabba
 
 
 
Derivation
 
            
Derivation Tree
 
Ambiguity in Grammar
 
A grammar is said to be ambiguous if there exists more
than one leftmost derivation or more than one rightmost
derivation or more than one parse tree for the given
input string. If the grammar is not ambiguous, then it is
called unambiguous.
If the grammar has ambiguity, then it is not good for
compiler construction. No method can automatically
detect and remove the ambiguity, but we can remove
ambiguity by re-writing the whole grammar without
ambiguity.
 
Example 1
Let us consider a grammar G with the production rule
E → I
E → E + E
E → E * E
E → (E)
I → 
ε | 0 | 1 | 2 | ... | 9
 
Solution:
      For the string "3 * 2 + 5", the above grammar can generate two parse
trees by leftmost derivation:
 
Since there are two parse trees for a single string "3 * 2 + 5", the grammar G is
ambiguous.
 
Example 2
Check whether the given grammar G is ambiguous or not.
S → aSb | SS
S → ε
 
Solution:
For the string "aabb" the above grammar can generate two parse trees
 
Since there are two parse trees for a single string "aabb", the grammar
G is ambiguous.
 
Example 3
Check whether the given grammar G is ambiguous or not.
A → AA
A → (A)
A → a
 
Solution:
     For the string "a(a)aa" the above grammar can generate two
parse trees:
 
Unambiguous Grammar
 
 
 
A grammar can be unambiguous if the grammar does not
contain ambiguity that means if it does not contain more
than one leftmost derivation or more than one rightmost
derivation or more than one parse tree for the given
input string.
 
To convert ambiguous grammar to unambiguous grammar, we
will apply the following rules:
 
1.  If the left associative operators (+, -, *, /) are used in the
production rule, then apply left recursion in the production
rule. Left recursion means that the leftmost symbol on the
right side is the same as the non-terminal on the left side. For
example,
           X → Xa
 
2.   If the right associative operates(^) is used in the production
rule then apply right recursion in the production rule. Right
recursion means that the rightmost symbol on the left side is
the same as the non-terminal on the right side. For example,
          X → aX
 
 
 
Example 1
Consider a grammar G is given as follows:
S → AB | aaB
A → a | Aa
B → b
 
      Determine whether the grammar G is ambiguous or not. If G is
ambiguous, construct an unambiguous grammar equivalent to G.
Solution:
Let us derive the string "aab"
 
As there are two different parse tree for
deriving the same string, the given
grammar is ambiguous.
Unambiguous grammar will be:
S → AB
A → Aa | a
B → b
 
Example 2
Show that the given grammar is ambiguous. Also, find an
equivalent unambiguous grammar.
S → ABA
A → aA | ε
B → bB | ε
 
Solution:
The given grammar is ambiguous because we can derive two different
parse tree for string aa.
 
The unambiguous grammar
is:
S → aXY | bYZ | ε
Z → aZ | a
X → aXY | a | ε
Y → bYZ | b | ε
 
Example 3
Show that the given grammar is ambiguous. Also, find an
equivalent unambiguous grammar.
E → E + E
E → E * E
E → id
 
Solution:
Let us derive the string "id + id * id“
 
 
As there are two different parse tree
for deriving the same string, the
given grammar is ambiguous.
Unambiguous grammar will be:
E → E + T
E → T
T → T * F
T → F
F → id
 
Example 4
Check that the given grammar is ambiguous or not. Also, find an
equivalent unambiguous grammar.
 S → S + S
S → S * S
S → S ^ S
S → a
 
Solution:
   
The given grammar is ambiguous because the derivation of string aab
can be represented by the following string:
 
Unambiguous grammar will be:
S → S + A |
A → A * B | B
B → C ^ B | C
C → a
 
Simplification of CFG
 
As we have seen, various languages can efficiently be
represented by a context-free grammar. All the grammar are
not always optimized that means the grammar may consist of
some extra symbols(non-terminal). Having extra symbols,
unnecessary increase the length of grammar.
Simplification of grammar means reduction of grammar by
removing useless symbols. The properties of reduced
grammar are given below:
Each variable (i.e. non-terminal) and each terminal of G
appears in the derivation of some word in L.
There should not be any production as X → Y where X and Y
are non-terminal.
If ε is not in the language L then there need not to be the
production X → ε.
 
 
Let us study the reduction process in detail.
Non
Reachable
Symbol
Non
generating
Symbol
 
Removal of Useless Symbols
 
 
A symbol can be useless if it does not appear on the right-
hand side of the production rule and does not take part in the
derivation of any string.
     That symbol is known as a useless symbol.
Similarly, a variable can be useless if it does not take part in
the derivation of any string.
     That variable is known as a useless variable.
 
Example 1
Remove the useless symbols from given grammar
T → aaB | abA | aaT                  A → aA
B → ab | b                                   C → ad
 
In the above example, the variable 'C' will never occur in the
derivation of any string, so the production C → ad is useless.
 So we will eliminate it, and the other productions are written in
such a way that variable C can never reach from the starting
variable 'T'.
Production A → aA is also useless because there is no way to
terminate it. If it never terminates, then it can never produce a
string. Hence this production can never take part in any derivation.
To remove this useless production A → aA, we will first find all the
variables which will never lead to a terminal string such as variable
'A'. Then we will remove all the productions in which the variable
‘A' occurs.
Simplified Grammar is:
      T → aaB | aaT
B → ab | b
 
Elimination of ε Production
 
The productions of type S → ε are called ε productions. These
type of productions can only be removed from those
grammars that do not generate ε.
Step 1:
 First find out all null able non-terminal variable which
derives ε.
Step 2:
 For each production A → a, construct all production A
→ x, where x is obtained from a by removing one or more
non-terminal from step 1.
Step 3:
 Now combine the result of step 2 with the original
production and remove ε productions.
 
Example 2
Remove the ε production from the following CFG by preserving
the meaning of it.
S → XYX
X → 0X | ε
Y → 1Y | ε
 
Solution:
      Now, while removing ε production, we are deleting the rule X → ε
and Y → ε. To preserve the meaning of CFG we are actually placing
ε at the right-hand side whenever X and Y have appeared.
Let us take
S → XYX
If the first X at right-hand side is ε. Then
S → YX
if the last X in R.H.S. = ε. Then
S → XY
If Y = ε then
S → XX
If Y and X are ε then,
S → X
 
 
Example 3
Remove the production from the following CFG by preserving the
meaning of it.
S → XYX
X → 0X | ε         Y → 1Y | ε
 
Solution:
If both X are replaced by ε
S → Y
Now,
S → XY | YX | XX | X | Y
Now let us consider
X → 0X
If we place ε at right-hand side for X then,
X → 0
X → 0X | 0
Similarly Y → 1Y | 1
Collectively we can rewrite the CFG with removed ε production as
S → XY | YX | XX | X | Y
X → 0X | 0
Y → 1Y | 1
 
Removing Unit Productions
 
The unit productions are the productions in which one non-terminal
gives another non-terminal. Use the following steps to remove unit
production:
 
Step 1:
 To remove X → Y, add production X → a to the grammar rule
whenever Y → a occurs in the grammar.
Step 2:
 Now delete X → Y from the grammar.
Step 3:
 Repeat step 1 and step 2 until all unit productions are
removed.
 
Example 4
 construct reduced grammar equivalent to the grammar
S → 0A | 1B | C
A → 0S | 00
B → 1 | A
C → 01
 
Solution:
S → C is a unit production. But while removing S → C we have to consider
what C gives. So, we can add a rule to S.
S → 0A | 1B | 01
Similarly, B → A is also a unit production so we can modify it as
B → 1 | 0S | 00
Thus finally we can write CFG without unit production as
S → 0A | 1B | 01
A → 0S | 00
B → 1 | 0S | 00
C → 01
As C is non-reachable symbol, we have to remove C. so simplified
grammar after removing C is:
S → 0A | 1B | 01
A → 0S | 00
B → 1 | 0S | 00
 
 
Example 5
construct reduced grammar equivalent to the grammar
S 
aAa
A
Sb | bcc | DaA
C
abb | DD
D 
 aDA
E 
 ac
 
Finding non - generating symbols:  D is non – generating symbol
So, reduced grammar is after elimination of non – generating symbol D is:
      S 
 aAa
A
 Sb | bcc
C 
 abb
E 
ac
Finding Non- reachable symbols: C and E are non- generating symbols:
So, reduced grammar after elimination of non -reachable symbols A and E is
       S 
 aAa
A 
 Sb | bcc
As there are no unit productions and 
ε
 productions in given grammar so
reduced grammar is :
        S 
 aAa
 A 
 Sb | bcc
 
Example 6
Simplify the following grammar:
S 
 0A0 | 1B1 | BB
A 
 C
B 
 S| A
C 
 S|
ε
 
Solution:
Elimination of 
ε
 production:
      
S 
 0A0 | 1B1 | BB | 00 | 11 | B
A 
 C
B 
 S| A
C 
 S
Eliminate Unit Productions: As B
 A 
 C 
 S 
 B
      
S 
 0A0 | 1B1 | BB | 00 | 11
      A
 
 0A0 | 1B1 | BB | 00 | 11
B
 0A0 | 1B1 | BB | 00 | 11
C
 
 0A0 | 1B1 | BB | 00 | 11
Elimination of non reachable symbol C:
       
S 
 0A0 | 1B1 | BB | 00 | 11
 A
 
 0A0 | 1B1 | BB | 00 | 11
 B
 0A0 | 1B1 | BB | 00 | 11
 
Example 7
Simplify the following grammar:
S
 ASB | 
ε
A
aAS  | a          B
 SbS | A | bb
 
Solution:
Eliminate 
ε
 productions:
       S 
 ASB | AB
       A 
 aAS| aA | a
       B 
 SbS | Sb | bS | b | A | bb
Eliminate Unit Production: Eliminate unit production A form production rule B
       S 
 ASB | AB
       A 
 AaS | Aa | a
       B 
 SbS | Sb | bS | b | aAS | aA | bb |a
As there are no useless symbols , so simplified grammar is:
       S 
 ASB | AB
       A 
 AaS | Aa | a
       B 
 SbS | Sb | bS | b | aAS | aA | bb |a
 
 
 
 
Example 8
Simplify the following grammar:
S 
 aC| SB
A 
 bSCa
B
 aSB|bBC
C
aBC|ad
 
Solution:
Elimination of non generation symbols: symbol B is non generating
symbol
      S 
 aC
A 
 bSCa
C
 ad
Elimination of non reachable symbols: Symbol A is non reachable
symbol
       S 
 aC
 C
 ad
As there is no 
ε
 production and there is no unit production in given
grammar the reduced grammar is:
      
S 
 aC
C
 ad
 
Example 9
Simplify the following grammar
S
ABA | BA | AA | AB | A | B
A 
Aa | a
B 
bB | b
 
Solution:
Eliminate unit production: A and B are unit productions.
      S
ABA | BA | AA | AB | Aa | a | bB | b
A 
Aa | a
B 
bB | b
As there is no 
ε
 production and useless symbol in this grammar. So
simplified grammar is:
      S
ABA | BA | AA | AB | Aa | a | bB | b
A 
Aa | a
B 
bB | b
 
Chomsky's Normal Form (CNF)
 
CNF stands for Chomsky normal form. A CFG(context free
grammar) is in CNF(Chomsky normal form) if all production
rules satisfy one of the following conditions:
Start symbol generating ε. For example, A → ε.
A non-terminal generating two non-terminals.
     For example, S → AB.
A non-terminal generating a terminal. For example, S → a.
 
Example
 
G1 = {S → AB, S → c, A → a, B → b}
G2 = {S → aA, A → a, B → c}
The production rules of Grammar G1 satisfy the rules
specified for CNF, so the grammar G1 is in CNF.
However, the production rule of Grammar G2 does not satisfy
the rules specified for CNF as S → aA contains terminal
followed by non-terminal. So the grammar G2 is not in CNF.
 
Steps for converting CFG into CNF
 
Step 1:
 In the grammar, remove the null, unit and useless
productions. (simplification of grammar)
Step 2:
 Eliminate terminals from the RHS of the production if
they exist with other non-terminals or terminals. For example,
production S → aA can be decomposed as:
S → RA
R → a
Step 3:
 Eliminate RHS with more than two non-terminals. For
example, S → ASB can be decomposed as:
S → RB
R → AS
 
Example 1
Convert the given CFG to CNF. Consider the given grammar G1
S → a | aA | B
A → aBB | ε
B → Aa | b
 
Solution:
Step 1:
 As grammar G1 contains A → ε null production, its
removal from the grammar yields:
S → a | aA | B
A → aBB
B → Aa | b | a
 Now, as grammar G1 contains Unit production S → B, its
removal yield:
S → a | aA | Aa | b
A → aBB
B → Aa | b | a
 
 
S → a | aA | B
A → aBB | ε
B → Aa | b
 
Step 2:
 In the production rule S → aA | Aa, S → aA | Aa, A → aBB and
B → Aa, terminal a exists on RHS with non-terminals. So we will
replace terminal a with X:
 S → a | XA | AX | b
A → XBB
B → AX | b | a
X → a
Step 3:
 In the production rule A → XBB, RHS has more than two
symbols, removing it from grammar yield:
S → a | XA | AX | b
A → RB
B → AX | b | a
X → a
R → XB
Hence, for the given grammar, this is the required CNF.
 
 
Example 2
Convert the following CFG into CNF
S → ASA | aB, A → B | S |  ε, B → b | ε
 
Solution:
1.
 Now we will remove the null productions −
     B → 
 and A → 
After removing B → ε, the production set becomes −
     S→ ASA | aB | a,        A → B | S | 
,              B → b
After removing A → 
, the production set becomes −
    S→ ASA | aB | a | AS | SA ,          A → B | S,        B → b
2.
 After removing unit production A→ B, the production set becomes −
     S→ ASA | aB | a | AS | SA  ,             A → S | b  ,                      B → b
   After removing A→ S, the production set becomes −
    S→ ASA | aB | a | AS | SA
   A → b |ASA | aB | a | AS | SA
   B → b
 
 
S → ASA | aB, A → B | S, B → b | ε
 
3. 
Now we will find out more than two variables in the R.H.S
Here, S → ASA, A→ ASA violates two Non-terminals in R.H.S.
Hence we will apply step 4 and step 5 to get the following final
production set which is in CNF −
S→ AX | aB | a | AS | SA
A → b |AX | aB | a | AS | SA
B → b
X → SA
4.
 We have to change the productions  S→ aB, A→ aB
And the final production set becomes −
S→ AX | YB | a | AS | SA
A → b |AX | YB | a | AS | SA
B → b
X → SA
Y → a
 
Example 3
Check whether given grammar is CNF
S 
 bA | aB            A 
bAA| aS | a        B 
 aBB|bS|b
 if it is not in CNF then find equivalent CNF.
 
The grammar is not in CNF as following productions are not allowed
in CNF.
      S 
 bA | aB                     A 
bAA| aS                  B 
 aBB|bS
Step 1: Simplification of grammar: grammar is already simplified.
Step 2: Some productions i.e.
 
S 
 bA | aB  , A 
bAA| aS  and  B 
aBB|bS are violating the rules of CNF. So we replace the terminals a
with variable X and terminal b with Y.
   S 
 YA | XB                     A 
YAA| XS |a                 B 
 XBB|YS |b
   X
a                                   Y
b
Now,  a-
YAA is violating the rules of CNF. So we replace AA with W
and BB with Z.
     S 
 YA | XB                     A 
YW| XS |a                 B 
 XZ|YS |b
     W 
 AA                 Z 
 BB                     X
a                          Y
b
 
Example 4
Convert given grammar in CNF form
S
Aba  A
aab    B
Ac
 
1. Simplification of given grammar
B is non reachable . So grammar after removing B is:
    S
Aba
    A
aab
2. Both productions 
S
Aba and A
aab are violating the rules of CNF.
So we replace a with X and b with Y
   S
AYX
   A
XXY
   X
a
   Y
b
3. Now ,
S
AYX and A
XXY are violating the rules of CNF. So we
replace AY with R and XX with W.
  S
RX            A
RY             R
 AY                   W
XX
  X
a               Y
b
 
Example 5
Convert the following grammar into CNF form
S
AACD       A
aAb|
ε
C
aC|a         A
aDa|bDb|
 ε
 
1. Simplification of grammar:
Removing 
ε
 production from production A .
        
S 
AACD | ACD | CD
        A 
 aAb |ab
        C 
 aC |a
        A
 aDa | bDb
Removing non reachable symbol D.
       S
C
       A
aAb | ab
       C
 aC | a
Removing unit production S
C
     S
 aC|a
     A
aAb | ab
     C
 aC|a
 
 Example 5
Convert the following grammar into CNF form
S
AACD       A
aAb|
ε
C
aC|a         A
aDa|bDb|
 ε
 
2. S
 aC  and C
 aC are violating the CNF rules. So we replace a with X and
replace b with Y.
     S
XC|a
    A
XAY | XY
     C
XC|a
     X
a
 
As A
 XAX is voilating the riles of CNF .So, Now substitute new production
Z
XA.
    S
XC|a
    A
ZY | XY
    C
XC|a
    X
a
     Z
XA
Now, the grammar is in CNF form.
 
 
 
Example 6
Convert following grammar into CNF form.
S
AACD       A
aAb|
ε
C
aC|a         D
aDa|bDb|
 ε
 
1.Simplification of grammar:
Removing 
ε
 productions from production A and D.
      
S
AACD | ACD | CD | AAC | AC | C
      A
aAb|ab
      C
aC|a
      D
aDa|bDb|aa |bb
 
Removing unit 
productions S
C
      S
AACD | ACD | CD | AAC | AC | aC|a
      A
aAb|ab
      C
aC|a
      D
aDa|bDb|aa |bb
 
 
Example 6
Convert following grammar into CNF form.
S
AACD       A
aAb|
ε
C
aC|a         D
aDa|bDb|
 ε
 
To convert above grammar in CNF form. Let we add two variables X for
a and Y for b.
      S
AACD | ACD | CD | AAC | AC | XC | a
      A
XAY|XY
      C
XC|a
      D
XDX|YDY|XX|YY
Let we add some more variables to above grammar to  convert it in
CNF form.
      S
 PQ | AQ | CD | PC | AC | XC |a
      A 
 RY | XY
      C
XC | a
      D
 TX | UY |XX |YY
      P
AA          Q
 CD      R
 XA         T 
 XD        U
YD
      X
a           Y
b
 
 
Greibach Normal Form (GNF)
 
GNF stands for Greibach normal form. A CFG(context free
grammar) is in GNF(Greibach normal form) if all the
production rules satisfy one of the following conditions:
A start symbol generating ε. For example, S → ε.
A non-terminal generating a terminal. For example, A →
a.
A non-terminal generating a terminal which is followed
by any number of non-terminals. For example, S → aASB.
 
Example
 
 
G1 = {S → aAB | aB, A → aA| a, B → bB | b}
G2 = {S → aAB | aB, A → aA | 
ε, 
B → bB | 
ε}
 
The production rules of Grammar G1 satisfy the rules
specified for GNF, so the grammar G1 is in GNF. However, the
production rule of Grammar G2 does not satisfy the rules
specified for GNF as A → 
ε 
and B → 
ε 
contains 
ε(
only start
symbol can generate 
ε). 
So the grammar G2 is not in GNF.
 
Steps for converting CFG into GNF
 
Step 1:
 Convert the grammar into CNF.
If the given grammar is not in CNF, convert it into CNF. You can
refer the following topic to convert the CFG into CNF:
Chomsky normal form
Step 2:
 If the grammar exists left recursion, eliminate it.
If the context free grammar contains left recursion, eliminate
it. You can refer the following topic to eliminate left recursion:
Left Recursion
Step 3:
 In the grammar, convert the given production rule into
GNF form.
If any production rule in the grammar is not in GNF form,
convert it.
 
Removing Left Recursion
 
The production of the form A
A
α
 is called as 
left recursive 
as the left
side variable appears as the first symbol on the right hand side.
Examples:
1. A
Aa | b
Grammar without left recursion is:
A
b | bA
B
 aB | a
 
2. A
Aa | b | c
Grammar without left recursion is:
A
 b | c | bA | cA
B 
 aB | a
 
Example 1
Construct the grammar in GNF which is equivalent to the grammar
S → XB | AA         A → a | SA
B → b             X → a
 
 
Solution:
As the given grammar G is already in CNF and there is no left recursion,
so we can skip step 1 and step 2 and directly go to step 3.
 
The production rule A → SA is not in GNF, so we substitute S → XB |
AA in the production rule A → SA as:
S → XB | AA
A → a | XBA | AAA
B → b              X → a
The production rule S → XB and A → XBA is not in GNF, so we
substitute X → a in the production rule S → XB and A → XBA as:
S → aB | AA
A → a | aBA | AAA
B → b               X → a
 
Example 1
Construct the grammar in GNF which is equivalent to the grammar
S → XB | AA         A → a | SA
B → b             X → a
 
S → aB | AA
A → a | aBA | AAA
B → b               X → a
Now we will remove left recursion (A → AAA), we get:
S → aB | AA
A → aA | aBAA | a | aBA
C → AAC |  AA
B → b            X → a
 
The production rule S → AA is not in GNF, so we substitute A → aC |
aBAC | a | aBA in production rule S → AA as:
S → aB | aAA | aBAAA | aA | aBAA
A → aA | aBACA| a | aBA
C → AAC
C → aAA | aBAAA | aA | aBAA
B → b              X → a
 
Example 1
Construct the grammar in GNF which is equivalent to the grammar
S → XB | AA         A → a | SA
B → b             X → a
 
The production rule C → AAC is not in GNF, so we substitute A →
aC | aBAC | a | aBA in production rule C → AAC as:
S → aB | aCA | aBACA | aA | aBAA
A → aA | aBAA | a | aBA
C →  aCAA | aBAAAC | aAC | aBAAC
C → aCA | aBACA | aA | aBAA
B → b
X → a
Hence, this is the GNF form for the grammar G.
 
Example 2
Convert following grammar to GNF
S
 ABA | AB | BA | AA | A | B
A
 aA | a
B 
 bB | b
 
Solution:
Removing unit productions: S
 A and S 
B
      S
 ABA | AB | BA | AA | aA | a | bB | b
A
 aA | a
B 
 bB | b
 
Production A  and B is already in GNF form. So, production S can be
brought into GNF form by using productions of A and B.
     S 
 aABA | aBA | aAB | aB | bBA| bA |aAA |aA |a |b
     A
 aA| a
     B
 bB | b
Now the grammar is in GNF form.
 
Exapmle 3
Find GNF of given grammar
S
ABAb|ab
B
 ABA|a
A
a|b
 
Solution:
As given grammar is simplified.
Let’s substitute and Y for b
      
S
 ABAY|aY
B
 ABA|a
A
a|b
      Y
b
Now substitute A
a |b in productions of S and B
      
S
 aBAY|bBAY|aY
B
 aBA|bBA|a
A
a|b
      Y
b
Now, the grammar is in GNF form.
 
 
Chomsky Classification for Grammar
 
According to Noam Chomsky, there are four types of grammars −
Type 0, Type 1, Type 2, and Type 3. The following table shows how
they differ from each other −
 
Scope of each Type of Grammar
 
Type - 3 Grammar
 
Type-3 grammars
 generate regular languages. Type-3
grammars must have a single non-terminal on the left-hand
side and a right-hand side consisting of a single terminal or
single terminal followed by a single non-terminal.
The productions must be in the form 
X → a or X → aY
where 
X, Y 
 N
 (Non terminal)
and 
a 
 T
 (Terminal)
The rule 
S → ε
 is allowed if 
S
 does not appear on the right
side of any rule.
Example
X → ε        X → a | aY         Y → b
 
Type - 2 Grammar
 
Type-2 grammars
 generate context-free languages.
The productions must be in the form 
A → γ
where 
A 
 N
 (Non terminal)
and 
γ 
 (T 
 N)*
 (String of terminals and non-terminals).
These languages generated by these grammars are be
recognized by a non-deterministic pushdown automaton.
Example
S → X a        X → a         X → aX        X → abc          X → ε
 
Type - 1 Grammar
 
Type-1 grammars
 generate context-sensitive languages. The
productions must be in the form
α A β → α γ β
where 
A 
 N
 (Non-terminal)
and 
α, β, γ 
 (T 
 N)*
 (Strings of terminals and non-terminals)
The strings 
α
 and 
β
 may be empty, but 
γ
 must be non-empty.
The rule 
S → ε
 is allowed if S does not appear on the right side
of any rule. The languages generated by these grammars are
recognized by a linear bounded automaton.
Example
AB → AbBc   A → bcA    B → b
 
Type - 0 Grammar
 
Type-0 grammars
 generate recursively enumerable languages. The
productions have no restrictions. They are any phase structure
grammar including all formal grammars.
They generate the languages that are recognized by a Turing
machine.
The productions can be in the form of 
α → β
 where 
α
 is a string of
terminals and non-terminals with at least one non-terminal
and 
α
 cannot be null. 
β
 is a string of terminals and non-terminals.
Example
S → ACaB       Bc → acB       CB → DB         aD → Db
 
Closure Properties of Context Free
Languages
 
Closure Properties of CFLs
• CFLs are closed under:
–Union
 –Concatenation
–Kleene closure
 –Reversal
 • CFLs are not closed under intersection, difference, or
complement
 
Closure Under Union
 
Let L and M be CFLs with grammars G and H, respectively
• Assume G and H have no variables in common
 – Rename the variables if necessary
– Names of variables do not affect the language
• Let S1 and S2 be the start symbols of G and H
• Form a new grammar for L 
 M by combining all the symbols
and productions of G and H
 
Closure Under Union
 
Then, add a new start symbol S
 • Add productions S → S1 | S2
 • In the new grammar, all derivations start with S
 • The first step replaces S with either S1 or S2
 • In the first case, the result must be a string in L(G) = L, and in
the second case, the result must be a string in L(H) = M
 
Closure Under Concatenation
 
• Let L and M be CFLs with grammars G and H, respectively
• Assume G and H have no variables in common
 • Let S1 and S2 be the start symbols of G and H
 • Form a new grammar for LM by starting with all symbols and
productions of G and H
• Add a new start symbol S and production S → S1S2
• Every derivation from S results in a string in L followed by one
in M
 
Closure Under Star
 
Let L have grammar G, with start symbol S1
 • Form a new grammar for L* by introducing to G a new start
symbol S and the productions S → S1S | ε
 • A rightmost derivation from S generates a sequence of zero or
more S1’s, each of which generates some string in L
 
Closure of Under Reversal
 
If L is a CFL with grammar G, form a grammar for LR by
reversing the body of every production
 • Example: Let G have S → 0S1 | 01
• The reversal of L(G) has grammar
S → 1S0 | 10
 
CFLs are Not Closed Under Intersection
 
Unlike the regular languages, the class of CFLs is not closed
under intersection
• For example, we know that L1 = {0n1n2n : n ≥ 1} is not context-
free (use the pumping lemma)
 • However, L2 = {0n1n2i : n ≥ 1, i ≥ 1} is context-free – CFG: S →
AB, A → 0A1 | 01, B → 2B | 2
• So is L3 = {0i1n2n : n ≥ 1, i ≥ 1}
 • But L1 = L2 ∩ L3 is not context-free
 
CFLs are Not Closed Under Difference
 
We can prove something more general: –
Any class of languages that is closed under difference is also
closed under intersection
 • Proof: L ∩ M = L – (L – M)
• Thus, if CFL’s were closed under difference, they would also be
closed under intersection, but they are not CFLs
 
CFLs are Not Closed Under Complement
 
L1 ∩ L2 = ¬(¬L1 
 ¬L2)
Recall that the context-free languages are closed under union.
So if they were closed under complement, they would also be
closed under intersection (which they are not)
 
CFLs are Not Closed Under Complement
 
Recall AnBnCn = {an bncn: n ≥ 0}.
Now consider L = ¬ (AnBnCn), which is L1 
 L2,
      where:
     L1 = {w 
 {a, b, c}* : the letters are out of order}
     L2 = {aibjck: i, j, k ≥ 0 and (i ≠ j or j ≠ k)}  (in other words,
unequal numbers of a’s, b’s, and c’s)
 A simple DFA can recognize L1. L2 can be built similar to the
one we created for accepting strings with an unequal number
of a’s and b’s.
Thus, ¬ (AnBnCn) is context-free, whereas AnBnCn is not
context-free (as we already proved).
 
Discussion Properties of CFL
 
Context Free Language
Context Free Languages(CFL) is the next
class of languages outside of Regular
Languages:
– Means for defining: Context Free Grammar
– Machine for accepting: Pushdown Automata
The decision properties of Context Free Language are:
Membership
Emptiness
Finiteness
 
 
Membership
 
Unlike FAs, we can’t just run the string through the
machine and see where it goes since PDAs are non-
deterministic.
 It must consider all possible paths.
Instead, start with your grammar in CNF.
The proof of the pumping lemma states that the longest
derivation path of a string of size n will be 2n – 1.
 Systematically generate all derivations with one step,
then two steps, …, then 2n – 1 steps where the length of
the string tested = n.
If one of the derivations derive x, return true, else return
false.
 
Emptiness
 
By the proof of the pumping lemma, if a grammar in CNF
has p states, the longest string, not subject to the pumping
lemma will have length n = 2 ^p+1
Systematically generate all strings with lengths less than n.
Test each one using membership algorithm.
If all fail and Epsilon not equal to L, then L is empty
Else L is not empty
We learned to eliminate variables that generate no
terminal string.
If the start symbol is one of these, then the CFL is empty;
otherwise not.
 
Finiteness
 
The idea is essentially the same as for regular languages.
Use the pumping lemma constant n.
If there is a string in the language of length between n and 2n-
1, then the language is infinite; otherwise not
.
 
Regular Grammar
 
The language accepted by finite automata can be described
using set of productions is known as regular grammar.
Let us consider some production rules:
     A
 a
     A
aA
     A
Ba
     A
ε
The language generated by regular grammar is known as
regular language.
There are two forms to represent the regular grammar.
1.
Right Linear form
2.
Left Linear form
 
 
Types of Regular Grammar
 
1.
Right Linear Grammar:
The productions of right linear grammar can be represented as:
        A
a
        A
 aB    [The variable B in this production appears at rightmost
                         position]
        A
ε
2.     Lef
t Linear Grammar:
The productions of left linear grammar can be represented as:
        A
a
        A
 Ba    [The variable B in this production appears at leftmost
                         position]
        A
ε
 
DFA to Right Linear Grammar
 
DFA can be described by using tuple set: M = (Q, ∑, 
δ, 
q
0
 , F)
Right linear grammar can be represented by using tuple set:          G= (V, T,
P, S)
Following are the steps to convert DFA to Right Linear Grammar:
1.
Rename 
q
0
 
 Q as S 
 V i.e. Relating start sate of M with start symbol of
G.
2.
Rename the states of Q as A, B,C,D,…. where A,B,C,D,….. 
 V
3.
Generating a set of productions P as:
       i) If 
q
0 ,
 
 F then add productions S
 ε
 to P.
       ii) for every production of the form
                             a
          Add production A
 aB where A is non final state.
 
          Add production  A
 aB, A
 a , where B is final state
A
B
A
 B
 
a
 
Example1
Convert following DFA to right linear grammar
 
Solution:
By Renaming states, we will get following DFA
 
From above DFA we can write following productions :
S 
 aA| bB
A 
 aS| bC| b
B 
 bS| aC| a
C 
 aB| bA
This is the right linear grammar for given DFA.
 
Example 2
Write a right linear grammar for given DFA.
 
Solution:
State D is dead state, so it is removed.
DFA after deletion of state D is:
 
From above DFA we can write following
productions:
A 
 
ε
A 
0B| 0
B 
1C
C 
 0B|0
 
Right Linear Grammar to DFA
 
Following are the steps to convert regular grammar to DFA:
1.
The production of the form A
 aB will generate the transition for
DFA as:
 
 
2. The production of the form A 
aB| a will generate the transition as
 
 
3. The production of the form A 
ε
 will make a final state.
 
 
4. The production of the form A 
b, will generate a transition
                                                      
Where F is new state and it should be final state
 
 
A
B
 
a
A
 
B
 
a
 
A
A
 
F
 
Example  1
Convert following right linear grammar to DFA.
S 
 bB     B 
 bC       B 
 aB        C
 a        B 
b
 
Solution:
By re-writing the productions grammar will be:
S 
 bB     B 
 bC  | b        C
 a        B 
 aB
Transition diagram for above grammar will be:
 
 
 
 
By adding E state to handle Ø transitions , the Final DFA will be:
 
 
Example 2
Convert following RG to DFA
S 
 0A | 1B        A 
0C | 1A | 0
B 
 1B | 1A | 1          C 
 0 | 0A
 
Solution:
For production A 
 0, B 
1, C 
0  add new state F .
 
 
 
 
 
Equivalent DFA for above transition diagram is:
 
DFA after renaming the states and dead state is introduced to handle Ø
transactions.
 
DFA to Left Linear Grammar
 
Following are the steps to write a left linear grammar to DFA:
 
1.
Interchange the starting state and final state .
2.
Reverse the direction of all transitions.
3.
Write the grammar from transition graph in left linear form.
 
Example 1
Write the left linear grammar for following DFA.
 
After interchanging the states and reversing the directions of
transitions , Transition graph will be
:
 
Equivalent left linear
grammar is:
S 
 Ba | Ab
A
 Sb| Ca | a
B 
 Sa | Cb | b
C 
 Bb | Aa
 
Example 2
Write left linear grammar for following DFA
 
In given DFA , there are three final states S, A and B. we can draw an
equivalent DFA with single final state D by adding 
ε
 transitions from S to
D, A to D and N to D.
 
By interchanging the starting state and final state and reversing the
direction of transitions , DFA will be:
 
An equivalent left linear grammar is:
S 
 D
S 
 A
S 
B
C 
Ba
B 
 Cb | Ab
A 
 Aa | Da | Ca | a
D 
 Bb | Db | b
 
By removing unit productions:
S 
 Cb | Ab | Aa | Da | Ca | a | Bb | Db | b
C 
Ba
B 
 Cb | Ab
A 
 Aa | Da | Ca | a
D 
 Bb | Db | b
 
 
Left Linear Grammar to DFA
 
Following are the steps to convert left linear grammar to DFA:
 
1.
Draw transition graph from the given left linear grammar.
2.
Reverse the direction of all the transitions.
3.
Interchange the start state and final state.
4.
Carry out conversion from FA to DFA.
 
Example 1
Construct DFA for accepting regular language generated by the linear
grammar given below:
S 
 Ca | Bb
C 
 Bb
B 
 Ba | b
 
Solution: 
Transition diagram for given left linear grammar is:
 
By reversing the directions of transitions and interchanging starting
state and final state:
 
State D is added for production B 
b
 
 
Conversion of FA to DFA
 
Example 2
Construct DFA for accepting the language generated by left linear
grammar given below:
S 
 B1 | A0 | C0
B 
 B1 | 1
A 
 A1 | B1 | C0 | 0
C 
A0
 
 
 
Solution:
 
     By reversing direction of transitions
and interchanging starting and final
state:
 
Transition diagram for given left
linear grammar is:
 
Conversion from FA to DFA
 
Right Linear Grammar to Left linear
Grammar
 
Following are the steps to convert right linear grammar to Left
Linear Grammar:
1.
Represent the right linear grammar using transition graph
and mark 
ε
 as a final state.
2.
Interchange the start state and final state.
3.
Reverse the direction of all transitions.
4.
Write left linear grammar from transition graph.
 
Example 1
Convert the following right linear grammar to an equivalent left
linear grammar
S 
bB | b
B 
bC
B 
 aB
C 
a
B 
b
 
Solution:
Transition graph for given right linear grammar is:
 
By interchanging the start state with final state and reversing the
transitions:
 
Left linear
grammar is:
S 
 b | Bb | Ca
B 
Ba | b
C 
 Bb
 
Example 2
Write an equivalent left linear grammar from the given right linear
grammar
S 
0A | 1B
A 
0C | 1A | 0
B 
 1B | 1A | 1
C 
0 | 0A
 
Solution:
Transition diagram for given right linear grammar is:
 
By interchanging the start state with final state and reversing direction of
transitions:
 
Left linear grammar is:
S 
 C0 | A0 | B1
A 
 A1 | C0 | B1 | 0
B 
 B1 | 1
C 
 A0
 
Left linear grammar to right linear
grammar
 
Following are the steps to convert left linear grammar to right
linear grammar:
1.
Represent the left linear grammar using transition graph and
mark  
ε
  as a final state.
2.
Interchange the start state and final state.
3.
Reverse the direction of all transitions.
4.
Write right linear grammar from transition graph.
 
Example 1
Write an equivalent right linear grammar from the given left linear
grammar
S 
 C0 | A0 | B1
A 
 A1 | C0 | B1 | 0
B 
 B1 | 1
C 
 A0
 
Solution:
The transition diagram for given left linear grammar is:
 
 
By interchanging the start state with final state and reversing the
direction of transition
 
Right linear grammar is:
S 
 1B | 0A
A 
 0C | 1A | 0
B 
 1B | 1A | 1
C 
  0A | 0
 
Example 2
Construct right linear grammar corresponding to the regular
expression R = (1+(01)*)1*(0+1)
 
Solution:
The transition diagram for given regular expression is:
 
Right linear grammar is:
S 
 1B | A
A 
 01A | B
B 
 1B| 0C | 1C | 0 | 1
 
After removing unit productions
grammar will be:
S 
 1B | 01A | 1B | 0C | 1C | 0 | 1
A 
 01A | 1B| 0C | 1C | 0 | 1
B 
 1B| 0C | 1C | 0 | 1
 
Example 3
Construct left linear and right linear grammar for the language
0*(1(0+1))*
 
Solution:
 
Right linear grammar from above transition diagram can be
written as:
S 
 0S | A | 
ε
A 
 1B
B 
 0A | 1A | 0 | 1
 
After removing the unit production, right linear grammar will be:
S 
 0S | 1B | 
ε
A 
 1B
B 
 0A | 1A | 0 | 1
 
 
By interchanging start state with final state and reversing the
direction of transitions resulting transition diagram is:
 
Left linear grammar is:
S 
 B0 | B1 | A | 
ε
B 
 S1
A 
 A0 | 0
 
After removing unit production left linear
grammar will be:
S 
 B0 | B1 | A0 | 0 | 
ε
B 
 S1
A 
 A0 | 0
 
Applications of CFL
 
Following are the applications of CFG:
1.
Parser
2.
Markup languages(XML)
Slide Note
Embed
Share

Context-free grammars (CFG) are formal grammars used to generate patterns in a given language. This content provides examples of constructing CFGs for different languages, showcasing the process with detailed explanations and visuals.

  • Context-free grammars
  • Examples
  • Language construction
  • Formal grammars
  • CFG

Uploaded on Aug 03, 2024 | 1 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. UNIT III Context Free Grammar and Languages Ms. Shital Pawar Assistant Professor Computer Engineering Department Bharati Vidyapeeth s College of Engineering for Women, Pune 43

  2. Formal Definition CFG stands for context-free grammar. It is a formal grammar which is used to generate all possible patterns of strings in a given formal language. Context-free grammar G can be defined by four tuples as: G = (V, T, P, S) Where, G is the grammar, which consists of a set of the production rule. It is used to generate the string of a language. T is the final set of a terminal symbol. It is denoted by lower case letters. V is the final set of a non-terminal symbol. It is denoted by capital letters. P is a set of production rules, which is used for replacing non-terminals symbols(on the left side of the production) in a string with other terminal or non-terminal symbols(on the right side of the production). S is the start symbol which is used to derive the string. We can derive the string by repeatedly replacing a non-terminal by the right-hand side of the production until all non-terminal have been replaced by terminal symbols.

  3. Example 1: Construct the CFG for the language having any number of a's over the set = {a}. Solution: As we know the regular expression for the above language is r.e. = a* Production rule for the Regular expression is as follows: S aS rule 1 S rule 2 Now if we want to derive a string "aaaaaa", we can start with start symbols. S aS aaS rule 1 aaaS rule 1 aaaaS rule 1 aaaaaS rule 1 aaaaaaS rule 1 aaaaaa rule 2 aaaaaa

  4. Example 2 Construct a CFG for the regular expression (0+1)* Solution: The CFG can be given by, Production rule (P): S 0S | 1S S e.g. 1010 1S (S 1S) 10S (S 0S) 101S (S 1S) 1010S (S 0S) 1010 (S ) The rules are in the combination of 0's and 1's with the start symbol. Since (0+1)* indicates { , 0, 1, 01, 10, 00, 11, ....}. In this set, is a string, so in the rule, we can set the rule S .

  5. Example 3 Construct a CFG for a language L = {wcwR | where w (a, b)*} Solution: The string that can be generated for a given language is {aacaa, bcb, abcba, bacab, abbcbba, ....} The grammar could be: S aSa rule 1 S bSb rule 2 S c rule 3 Now if we want to derive a string "abbcbba", we can start with start symbols. S aSa S abSba from rule 2 S abbSbba from rule 2 S abbcbba from rule 3 Thus any of this kind of string can be derived from the given production rules.

  6. Example 4 Construct a CFG for the language L = anb2nwhere n>=1. Solution: The string that can be generated for a given language is {abb, aabbbb, aaabbbbbb....}. The grammar could be: S aSbb | abb Now if we want to derive a string "aabbbb", we can start with start symbols. S aSbb S aabbbb

  7. Example 5 Construct a CFG for the language L={w {a,b}*|w is palindrome of odd length} Solution: As string is odd length palindrome middle letter can be either a or b. Smallest string for given language can be either a or b. e.g: L={a, b, aba, bab, babab, abbba, aabaa, baaab, .} S aSa S bSb S a|b It can be written as S aSa | bSb | a | b

  8. Example 6 Construct a CFG for the language L={w {a,b}*|w is palindrome of even length with |w| > 0} Solution: As string is odd length palindrome middle letter can be either a or b. Smallest string for given language can be either aa or bb. e.g: L={aa, bb, abba, baab, babbab, abbbba, aabbaa, baaaab, .} S aSa S bSb S aa|bb It can be written as S aSa | bSb | aa | bb

  9. Example 7 Construct a CFG for the language L={w {a,b}*|w is palindrome of even length or odd length with |w| > 0} Solution: As string is odd length palindrome middle letter can be either a or b. Smallest string for given language can be either a or b. e.g: L={a, b, aba, bab, babab, abbba, aabaa, baaab,aa, bb, abba, baab, babbab, abbbba, aabbaa, baaaab, .} S aSa S bSb S a|b|aa|bb It can be written as S aSa | bSb | a | b | aa | bb

  10. Example 8 Construct a CFG for the language L={aibjck | i = j + k} Solution: Outer a s should be matched with c s then the remaining a s should be matched with b s . e.g. L = {aabc, aaabcc, aaabbc, aaaabbcc, aaaabccc, ..} S aSc |aXc X aXb |ab

  11. Example 9 Construct a CFG for the language L={aibjck | j = i + k} Solution: Outer a s should be matched with c s then the remaining a s should be matched with b s . e.g. L = {abbc,aabbbc,abbbcc,aabbbbcc, } aibi bkck S XY X aXb |ab Y bYc | bc

  12. Example 10 Construct a CFG for the language (011+1)*(01)* Solution: Let us consider that (011 + 1)* is generated by variable X X 011X | 1X | Let us consider that (01)* is generated by variable Y Y 01Y | By using concatenation rule, we can combine above two production rules: So, CFG for given language is, S XY X 011X | 1X | Y 01Y |

  13. Example 11 Construct a CFG for the language L={anbm| n m} Solution: As n m number a s are different than number of b s. Let us consider an is generated by variable X X aX | a Let us consider bmis generated by variable Y Y bY | b So, CFG for given language is, S aSb | X | Y X aX | a Y bY | b

  14. Derivation Derivation is a sequence of production rules. It is used to get the input string through these production rules. During parsing, we have to take two decisions. These are as follows: We have to decide the non-terminal which is to be replaced. We have to decide the production rule by which the non- terminal will be replaced. We have two options to decide which non-terminal to be placed with production rule.

  15. Leftmost Derivation In the leftmost derivation, the input is scanned and replaced with the production rule from left to right. So in leftmost derivation, we read the input string from left to right. Example: Production rules: E = E + E E = E - E E = a | b Input a - b + a The leftmost derivation is: E = E + E E = E - E + E E = a - E + E E = a - b + E E = a - b + a

  16. Rightmost Derivation In rightmost derivation, the input is scanned and replaced with the production rule from right to left. So in rightmost derivation, we read the input string from right to left. Example Production rules: E = E + E E = E - E E = a | b Input a - b + a The rightmost derivation is: E = E - E E = E - E + E E = E - E + a E = E - b + a E = a - b + a Note: When we use the leftmost derivation or rightmost derivation, we may get the same string. This type of derivation does not affect on getting of a string.

  17. Example 1 Derive the string "abb" for leftmost derivation and rightmost derivation using a CFG given by, S AB | A aB B Sb Solution: Leftmost derivation: Rightmost derivation:

  18. Example 2 Derive the string "aabbabba" for leftmost derivation and rightmost derivation using a CFG given by, S aB | bA A a | aS | bAA B b | bS | aBB Solution: Leftmost derivation: S aB S aB aaBB aabB aabbS aabbaB aabbabS aabbabbA aabbabba B aBB B b B bS S aB B bS S bA A a

  19. Example 2 Derive the string "aabbabba" for leftmost derivation and rightmost derivation using a CFG given by, S aB | bA A a | aS | bAA B b | bS | aBB Rightmost derivation: S aB S aB aaBB B aBB aaBbS aaBbbA aaBbba aabSbba aabbAbba aabbabba B bS S bA A a B bS S bA A a

  20. Example 3 Derive the string "00101" for leftmost derivation and rightmost derivation using a CFG given by, S A1B A 0A | B 0B | 1B | Solution: Leftmost derivation: S A1B 0A1B 00A1B 001B 0010B 00101B 00101

  21. Example 3 Derive the string "00101" for leftmost derivation and rightmost derivation using a CFG given by, S A1B A 0A | B 0B | 1B | Rightmost derivation: S A1B A10B A101B A101 0A101 00A101 00101

  22. Derivation Tree Derivation tree is a graphical representation for the derivation of the given production rules for a given CFG. It is the simple way to show how the derivation can be done to obtain some string from a given set of production rules. The derivation tree is also called a parse tree. Parse tree follows the precedence of operators. The deepest sub-tree traversed first. So, the operator in the parent node has less precedence over the operator in the sub-tree.

  23. Properties of parse tree The root node is always a node indicating start symbols. The derivation is read from left to right. The leaf node is always terminal nodes. The interior nodes are always the non-terminal nodes.

  24. Example 1 Production rules: E = E + E E = E * E E = a | b | c Input a * b + c Step 1: Step 3: Step 2: Step 4:

  25. Example 2 Draw a derivation tree for the string "bab" from the CFG given by S bSb | a | b Solution: Now, the derivation tree for the string "bbabb" is as follows: The above tree is a derivation tree drawn for deriving a string bbabb. By simply reading the leaf nodes, we can obtain the desired string. The same tree can also be denoted by,

  26. Example 3 Construct a derivation tree for the string aabbabba for the CFG given by, S aB | bA A a | aS | bAA B b | bS | aBB Solution: To draw a tree, we will first try to obtain derivation for the string aabbabba Derivation Derivation Tree

  27. Ambiguity in Grammar A grammar is said to be ambiguous if there exists more than one leftmost derivation or more than one rightmost derivation or more than one parse tree for the given input string. If the grammar is not ambiguous, then it is called unambiguous. If the grammar has ambiguity, then it is not good for compiler construction. No method can automatically detect and remove the ambiguity, but we can remove ambiguity by re-writing the whole grammar without ambiguity.

  28. Example 1 Let us consider a grammar G with the production rule E I E E + E E E * E E (E) I | 0 | 1 | 2 | ... | 9 Solution: For the string "3 * 2 + 5", the above grammar can generate two parse trees by leftmost derivation: Since there are two parse trees for a single string "3 * 2 + 5", the grammar G is ambiguous.

  29. Example 2 Check whether the given grammar G is ambiguous or not. S aSb | SS S Solution: For the string "aabb" the above grammar can generate two parse trees Since there are two parse trees for a single string "aabb", the grammar G is ambiguous.

  30. Example 3 Check whether the given grammar G is ambiguous or not. A AA A (A) A a Solution: For the string "a(a)aa" the above grammar can generate two parse trees:

  31. Unambiguous Grammar A grammar can be unambiguous if the grammar does not contain ambiguity that means if it does not contain more than one leftmost derivation or more than one rightmost derivation or more than one parse tree for the given input string.

  32. To convert ambiguous grammar to unambiguous grammar, we will apply the following rules: 1. If the left associative operators (+, -, *, /) are used in the production rule, then apply left recursion in the production rule. Left recursion means that the leftmost symbol on the right side is the same as the non-terminal on the left side. For example, X Xa 2. If the right associative operates(^) is used in the production rule then apply right recursion in the production rule. Right recursion means that the rightmost symbol on the left side is the same as the non-terminal on the right side. For example, X aX

  33. Example 1 Consider a grammar G is given as follows: S AB | aaB A a | Aa B b Determine whether the grammar G is ambiguous or not. If G is ambiguous, construct an unambiguous grammar equivalent to G. Solution: Let us derive the string "aab" As there are two different parse tree for deriving the same string, the given grammar is ambiguous. Unambiguous grammar will be: S AB A Aa | a B b

  34. Example 2 Show that the given grammar is ambiguous. Also, find an equivalent unambiguous grammar. S ABA A aA | B bB | Solution: The given grammar is ambiguous because we can derive two different parse tree for string aa. The unambiguous grammar is: S aXY | bYZ | Z aZ | a X aXY | a | Y bYZ | b |

  35. Example 3 Show that the given grammar is ambiguous. Also, find an equivalent unambiguous grammar. E E + E E E * E E id Solution: Let us derive the string "id + id * id As there are two different parse tree for deriving the same string, the given grammar is ambiguous. Unambiguous grammar will be: E E + T E T T T * F T F F id

  36. Example 4 Check that the given grammar is ambiguous or not. Also, find an equivalent unambiguous grammar. S S + S S S * S S S ^ S S a Solution: The given grammar is ambiguous because the derivation of string aab can be represented by the following string: Unambiguous grammar will be: S S + A | A A * B | B B C ^ B | C C a

  37. Simplification of CFG As we have seen, various languages can efficiently be represented by a context-free grammar. All the grammar are not always optimized that means the grammar may consist of some extra symbols(non-terminal). Having extra symbols, unnecessary increase the length of grammar. Simplification of grammar means reduction of grammar by removing useless symbols. The properties of reduced grammar are given below: Each variable (i.e. non-terminal) and each terminal of G appears in the derivation of some word in L. There should not be any production as X Y where X and Y are non-terminal. If is not in the language L then there need not to be the production X .

  38. Let us study the reduction process in detail. Non Non Reachable Symbol generating Symbol

  39. Removal of Useless Symbols A symbol can be useless if it does not appear on the right- hand side of the production rule and does not take part in the derivation of any string. That symbol is known as a useless symbol. Similarly, a variable can be useless if it does not take part in the derivation of any string. That variable is known as a useless variable.

  40. Example 1 Remove the useless symbols from given grammar T aaB | abA | aaT A aA B ab | b C ad In the above example, the variable 'C' will never occur in the derivation of any string, so the production C ad is useless. So we will eliminate it, and the other productions are written in such a way that variable C can never reach from the starting variable 'T'. Production A aA is also useless because there is no way to terminate it. If it never terminates, then it can never produce a string. Hence this production can never take part in any derivation. To remove this useless production A aA, we will first find all the variables which will never lead to a terminal string such as variable 'A'. Then we will remove all the productions in which the variable A' occurs. Simplified Grammar is: T aaB | aaT B ab | b

  41. Elimination of Production The productions of type S are called productions. These type of productions can only be removed from those grammars that do not generate . Step 1: First find out all null able non-terminal variable which derives . Step 2: For each production A a, construct all production A x, where x is obtained from a by removing one or more non-terminal from step 1. Step 3: Now combine the result of step 2 with the original production and remove productions.

  42. Example 2 Remove the production from the following CFG by preserving the meaning of it. S XYX X 0X | Y 1Y | Solution: Now, while removing production, we are deleting the rule X and Y . To preserve the meaning of CFG we are actually placing at the right-hand side whenever X and Y have appeared. Let us take S XYX If the first X at right-hand side is . Then S YX if the last X in R.H.S. = . Then S XY If Y = then S XX If Y and X are then, S X

  43. Example 3 Remove the production from the following CFG by preserving the meaning of it. S XYX X 0X | Y 1Y | Solution: If both X are replaced by S Y Now, S XY | YX | XX | X | Y Now let us consider X 0X If we place at right-hand side for X then, X 0 X 0X | 0 Similarly Y 1Y | 1 Collectively we can rewrite the CFG with removed production as S XY | YX | XX | X | Y X 0X | 0 Y 1Y | 1

  44. Removing Unit Productions The unit productions are the productions in which one non-terminal gives another non-terminal. Use the following steps to remove unit production: Step 1: To remove X Y, add production X a to the grammar rule whenever Y a occurs in the grammar. Step 2: Now delete X Y from the grammar. Step 3: Repeat step 1 and step 2 until all unit productions are removed.

  45. Example 4 construct reduced grammar equivalent to the grammar S 0A | 1B | C A 0S | 00 B 1 | A C 01 Solution: S C is a unit production. But while removing S C we have to consider what C gives. So, we can add a rule to S. S 0A | 1B | 01 Similarly, B A is also a unit production so we can modify it as B 1 | 0S | 00 Thus finally we can write CFG without unit production as S 0A | 1B | 01 A 0S | 00 B 1 | 0S | 00 C 01 As C is non-reachable symbol, we have to remove C. so simplified grammar after removing C is: S 0A | 1B | 01 A 0S | 00 B 1 | 0S | 00

  46. Example 5 construct reduced grammar equivalent to the grammar S aAa A Sb | bcc | DaA C abb | DD D aDA E ac Finding non - generating symbols: D is non generating symbol So, reduced grammar is after elimination of non generating symbol D is: S aAa A Sb | bcc C abb E ac Finding Non- reachable symbols: C and E are non- generating symbols: So, reduced grammar after elimination of non -reachable symbols A and E is S aAa A Sb | bcc As there are no unit productions and productions in given grammar so reduced grammar is : S aAa A Sb | bcc

  47. Example 6 Simplify the following grammar: S 0A0 | 1B1 | BB A C B S| A C S| Solution: Elimination of production: S 0A0 | 1B1 | BB | 00 | 11 | B A C B S| A C S Eliminate Unit Productions: As B A C S B S 0A0 | 1B1 | BB | 00 | 11 A 0A0 | 1B1 | BB | 00 | 11 B 0A0 | 1B1 | BB | 00 | 11 C 0A0 | 1B1 | BB | 00 | 11 Elimination of non reachable symbol C: S 0A0 | 1B1 | BB | 00 | 11 A 0A0 | 1B1 | BB | 00 | 11 B 0A0 | 1B1 | BB | 00 | 11

  48. Example 7 Simplify the following grammar: S ASB | A aAS | a B SbS | A | bb Solution: Eliminate productions: S ASB | AB A aAS| aA | a B SbS | Sb | bS | b | A | bb Eliminate Unit Production: Eliminate unit production A form production rule B S ASB | AB A AaS | Aa | a B SbS | Sb | bS | b | aAS | aA | bb |a As there are no useless symbols , so simplified grammar is: S ASB | AB A AaS | Aa | a B SbS | Sb | bS | b | aAS | aA | bb |a

  49. Example 8 Simplify the following grammar: S aC| SB A bSCa B aSB|bBC C aBC|ad Solution: Elimination of non generation symbols: symbol B is non generating symbol S aC A bSCa C ad Elimination of non reachable symbols: Symbol A is non reachable symbol S aC C ad As there is no production and there is no unit production in given grammar the reduced grammar is: S aC C ad

  50. Example 9 Simplify the following grammar S ABA | BA | AA | AB | A | B A Aa | a B bB | b Solution: Eliminate unit production: A and B are unit productions. S ABA | BA | AA | AB | Aa | a | bB | b A Aa | a B bB | b As there is no production and useless symbol in this grammar. So simplified grammar is: S ABA | BA | AA | AB | Aa | a | bB | b A Aa | a B bB | b

More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#