Context Free Grammars
Context-free grammars can generate context-free languages. They do this by taking a set of
variables which are defined recursively, in terms of one another, by a set of production rules.
Context-free grammars are named as such because any of the production rules in the grammar
can be applied regardless of context—it does not depend on any other symbols that may or may
not be around a given symbol that is having a rule applied to it.
Context-free grammars have the following components:
A set of terminal symbols which are the characters that appear in the language/strings
generated by the grammar. Terminal symbols never appear on the left-hand side of the
production rule and are always on the right-hand side.
A set of nonterminal symbols (or variables) which are placeholders for patterns of terminal
symbols that can be generated by the nonterminal symbols. These are the symbols that will
always appear on the left-hand side of the production rules, though they can be included on the
right-hand side. The strings that a CFG produces will contain only symbols from the set of
nonterminal symbols.
A set of production rules which are the rules for replacing nonterminal symbols. Production
rules have the following form: variable → string of variables and terminals.
A start symbol which is a special nonterminal symbol that appears in the initial string
generated by the grammar.
For comparison, a context-sensitive grammar can have production rules where both the left-
hand and right-hand sides may be surrounded by a context of terminal and nonterminal
symbols.
To create a string from a context-free grammar, follow these steps:[1]
Begin the string with a start symbol.
Apply one of the production rules to the start symbol on the left-hand side by replacing
the start symbol with the right-hand side of the production.
Repeat the process of selecting nonterminal symbols in the string, and replacing them
with the right-hand side of some corresponding production, until all nonterminals have
been replaced by terminal symbols. Note, it could be that not all production rules are
used.
Formal Definition
A context-free grammar can be described by a four-element tuple (V,Σ,R,S), where
V is a finite set of variables (which are non-terminal);
Σ is a finite set (disjoint from V) of terminal symbols;
R is a set of production rules where each production rule maps a variable to a
string s∈(V∪Σ)∗;
S (which is in V) which is a start symbol.
Come up with a grammar that will generate the context-free (and also regular) language
that contains all strings with matched parentheses.
There are many grammars that can do this task. This solution is one way to do it, but
should give you a good idea of if your (possibly different) solution works too.
Starting symbol →→ S
Non-terminal variables = {(,)}{(,)}
Production rules:
S →→ ( )
S →→ SS
S →→ (S).
A way to condense production rules is as follows:
We can take
S →→ ( )
S →→ SS
S →→ (S)
and translate them into a single line: S →→ ( ) | SS | (S) | ,ϵ, where ϵ is an empty string.
Strings that end in 0 Strings that have an odd number of 1’s Strings that have an equal
number of 1’s and 0’s Strings that end in 1 Strings that have an even number of 1’s and 0’s
What does the following context-free grammar describe?
S is the start symbol and the set of nonterminal symbols is {0,1}{0,1}. Here are the
production rules:
S →→ 0
S →→ 0S
S →→ 1S.
Here is a context-free grammar that generates arithmetic expressions (subtraction, addition,
division, and multiplication)[1].
Start symbol = <expression>
Terminal symbols = {+,−,∗,/,(,),number},{+,−,∗,/,(,),number}, where
“number” is any number
Production rules:
1. <expression> → number
2. <expression> → (<expression>)
3. <expression> → <expression> + <expression>
4. <expression> →→ <expression> - <expression>
5. <expression> → <expression> * <expression>
6. <expression> → <expression> / <expression>
This allows us to construct whatever expressions using multiplication, addition, division,
and subtraction we want. What these production rules tell us is that the result of any
operation, for example, multiplication, is also an expression (denoted, <expression>).
Using the example above, show the steps of deriving the following
expression: (4+5)∗(2−6)(4+5)∗(2−6). Note, there are many ways to do this, but the
solution below should give you enough guidance to check if your derivation works.
<expression> →→ 4 (using rule 1)
<expression> →→ 5 (using rule 1)
<expression> →→ 4 + 5 (using rule 3)
<expression> →→ (4 + 5) (using rule 2)
<expression> →→ 2 (using rule 1)
<expression> →→ 6 (using rule 1)
<expression> →→ 2 - 6 (using rule 4)
<expression> →→ (2 - 6) (using rule 2)
<expression> →→ (4 + 5) * (2 - 6) (using rule 5)
Context-free grammars can be modeled as parse trees. The nodes of the tree represent the
symbols and the edges represent the use of production rules. The leaves of the tree are the
end result (terminal symbols) that make up the string the grammar is generating with that
particular sequence of symbols and production rules.
x + y * x - z * y / x + x( x + y ) * x - z * y / ( x + x )( x + y ) + x - z + y - ( x + x )x + y + z * y + x
*x
What expression does this parse tree generate?
The conext-free grammar wikipedia page
The parse trees below represent two ways to generate the string "a + a - a" with the
grammar
A→A+A∣A−A∣a.
Example of an ambiguous grammar—one that can have multiple ways of generating the same string[2]
Because this grammar can be implemented with multiple parse trees to get the same
resulting string, this is said to be ambiguous.