BITS Pilani
Principles of
Programming Language
Amit Dua
BITS Pilani Computer Science and Information Systems Department
BITS, Pilani
Pilani Campus
Story so far
• Motivation for studying the course
• Architecture on which programs run
• Parameters to evaluate a programming language
• Implementing Programming Languages
The Compilation Process
Pure Interpretation Process
Hybrid Implementation Process
Just-in-Time Implementation Systems
• Translate programs to an intermediate language.
• Compile the intermediate language of the subprograms
into machine code only when they are called.
• Machine code version is kept for subsequent calls.
• JIT systems are widely used for Java programs.
• .NET languages are implemented with a JIT system.
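A toy model of the bullets above, offered as a sketch rather than a real JIT system. Python's built-in compile() stands in for the translation to machine code, and the class name ToyJIT with its methods define and call is hypothetical.

```python
class ToyJIT:
    """Toy model of JIT: translate a subprogram only when it is first called."""

    def __init__(self):
        self.sources = {}     # subprogram name -> source text (the "intermediate" form)
        self.compiled = {}    # subprogram name -> callable (stands in for machine code)
        self.compile_count = 0

    def define(self, name, source):
        # Store the subprogram without translating it yet.
        self.sources[name] = source

    def call(self, name, *args):
        if name not in self.compiled:
            # First call: translate now and keep the result.
            namespace = {}
            exec(compile(self.sources[name], name, "exec"), namespace)
            self.compiled[name] = namespace[name]
            self.compile_count += 1
        # Subsequent calls reuse the kept version.
        return self.compiled[name](*args)


jit = ToyJIT()
jit.define("square", "def square(x):\n    return x * x\n")
jit.define("unused", "def unused():\n    return 0\n")
print(jit.call("square", 4), jit.call("square", 5), jit.compile_count)
```

The subprogram named unused is never translated, because it is never called.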
Examples
C and C++ are typically implemented with compilers
Python is typically implemented with an interpreter
Java uses a hybrid implementation, specifically a JIT implementation
Trade-off: execution speed versus on-the-fly features, user experience, and debugging
Code optimization
Eliminate unreachable code
Substitute variables for efficiency
Reduce execution frequency
Evaluations can be done at compile time
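As an illustration of evaluation at compile time, here is a minimal constant-folding sketch built on Python's standard ast module. It handles only + and * on numeric literals; the names ConstantFolder and fold are hypothetical, and real compilers do far more than this.

```python
import ast


class ConstantFolder(ast.NodeTransformer):
    """Replace + and * applied to two numeric literals with the computed value."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first
        left, right = node.left, node.right
        if (isinstance(left, ast.Constant) and isinstance(right, ast.Constant)
                and isinstance(left.value, (int, float))
                and isinstance(right.value, (int, float))):
            if isinstance(node.op, ast.Add):
                return ast.copy_location(ast.Constant(left.value + right.value), node)
            if isinstance(node.op, ast.Mult):
                return ast.copy_location(ast.Constant(left.value * right.value), node)
        return node


def fold(source):
    """Return the source text with foldable constant expressions evaluated."""
    tree = ConstantFolder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9 or later


print(fold("seconds = 60 * 60 * 24 + x"))
```

The product 60 * 60 * 24 is computed once, before the program runs, while the addition involving x is left for run time.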
Story so far
• Motivation for studying the course
• Architecture on which programs run
• Parameters to evaluate a programming language
• Implementing Programming Languages
Questions
Evaluating a language
Lecture 3
Topics
Introduction
The General Problem of Describing Syntax
Formal Methods of Describing Syntax
BNF and context-free grammars
Derivation and Parse trees
Ambiguity in grammars
EBNF
Introduction
Syntax: the form or structure of the expressions,
statements, and program units
Semantics: the meaning of the expressions, statements,
and program units
The General Problem of Describing Syntax: Terminology
A sentence is a string of characters from some alphabet
A language is a set of sentences
A lexeme is the lowest-level syntactic unit of a language
(e.g., numeric literals, operators, special symbols)
A token is a category of lexemes (e.g., identifier)
Example
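A small sketch of the lexeme/token distinction. The statement index = 2 * count + 17; and the token names (identifier, int_literal, and so on) are illustrative assumptions, as is the function name tokenize.

```python
import re

# Each token (category) is described by a pattern; the matched text is the lexeme.
TOKEN_SPEC = [
    ("int_literal", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("equal_sign", r"="),
    ("plus_op", r"\+"),
    ("mult_op", r"\*"),
    ("semicolon", r";"),
    ("skip", r"\s+"),
]
PATTERN = re.compile("|".join(f"(?P<{name}>{regex})" for name, regex in TOKEN_SPEC))


def tokenize(text):
    """Return (lexeme, token) pairs; raise ValueError on an unrecognised character."""
    pairs, pos = [], 0
    while pos < len(text):
        match = PATTERN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "skip":  # whitespace separates lexemes but is not one
            pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs


for lexeme, token in tokenize("index = 2 * count + 17;"):
    print(f"{lexeme:6} {token}")
```

Both index and count are different lexemes that belong to the same token, identifier.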
Formal Definition of Languages
Recognizers
– A recognition device reads input strings over the alphabet of the
language and decides whether the input strings belong to the
language
– Example: syntax analysis part of a compiler
Generators
– A device that generates sentences of a language
– One can determine whether a particular sentence is
syntactically correct by comparing it to the structure of the
generator
BNF and Context-Free Grammars
Context-Free Grammars
– Developed by Noam Chomsky in the mid-1950s
– Language generators, meant to describe the syntax of natural languages
– Define a class of languages called context-free languages
Backus-Naur Form (1959)
– Invented by John Backus to describe the syntax of Algol 58
– BNF is equivalent to context-free grammars
BNF Fundamentals
• In BNF, abstractions are used to represent classes of
syntactic structures; they act like syntactic variables
(abstractions are also called nonterminal symbols)
• Terminals are lexemes or tokens
• A rule (or production) has a left-hand side (LHS), which
is a nonterminal, and a right-hand side (RHS), which is a
string of terminals and/or non-terminals.
BNF Fundamentals (continued)
• Nonterminals are often enclosed in angle brackets
– Example of BNF rule:
– <if_stmt> → if <logic_expr> then <stmt>
• Grammar: a finite non-empty set of rules
• A start symbol is a special element of the nonterminals of
a grammar
BNF Rules
An abstraction (or nonterminal symbol) can have more
than one RHS
<stmt> → <single_stmt>
| begin <stmt_list> end
Another example:
Describing Lists
Syntactic lists are described using recursion
<ident_list> → identifier
| identifier, <ident_list>
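The recursion in this rule maps directly onto a recursive function. The sketch below is a recognizer for <ident_list>; the helper names is_ident_list and split_tokens, and the pattern used for an identifier, are assumptions made for illustration.

```python
import re


def is_ident_list(tokens):
    """<ident_list> -> identifier | identifier , <ident_list>"""
    if not tokens or not re.fullmatch(r"[A-Za-z_]\w*", tokens[0]):
        return False          # every alternative starts with an identifier
    if len(tokens) == 1:
        return True           # first alternative: a single identifier
    # Second alternative: identifier , <ident_list> (the recursive case).
    return tokens[1] == "," and is_ident_list(tokens[2:])


def split_tokens(text):
    """Split text into words and single punctuation characters."""
    return re.findall(r"\w+|[^\w\s]", text)


print(is_ident_list(split_tokens("a, b, c")))
print(is_ident_list(split_tokens("a, , c")))
```

The first call succeeds and the second fails, because a comma must always be followed by another <ident_list>.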
Grammars and Derivations
• A grammar is a generative device for defining languages.
• The sentences of the language are generated through a
sequence of applications of the rules, beginning with a
special nonterminal of the grammar called the start
symbol.
• This sequence of rule applications is called a derivation.
• A derivation is a repeated application of rules, starting
with the start symbol and ending with a sentence (all
terminal symbols).
Derivations
Every string of symbols in a derivation is a sentential form
A sentence is a sentential form that has only terminal
symbols
A leftmost derivation is one in which the leftmost
nonterminal in each sentential form is the one that is
expanded.
Derivation continues until the sentential form contains no
non-terminals.
Derivation order has no effect on the language generated
by the grammar.
Progress
Introduction
The General Problem of Describing Syntax
Formal Methods of Describing Syntax
BNF and context-free grammars
Derivation and Parse trees
Ambiguity in grammars
EBNF
A Grammar for a Small Language
<program> -> begin <stmt_list> end
<stmt_list> -> <stmt> | <stmt> ; <stmt_list>
<stmt> -> <var> = <expression>
<var> -> A | B | C
<expression> -> <var> + <var>
| <var> - <var>
| <var>
Does this grammar generate the following sentence?
begin
A=B+C; B=C
end
If yes, how do we prove it?
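One way to prove it is to exhibit a derivation. The sketch below replays a leftmost derivation of the sentence from <program>; the dictionary encoding of the grammar, the function name leftmost_derivation, and the list CHOICES (which alternative to apply at each step, counted from 0) are all assumptions made for illustration.

```python
GRAMMAR = {
    "<program>": [["begin", "<stmt_list>", "end"]],
    "<stmt_list>": [["<stmt>"], ["<stmt>", ";", "<stmt_list>"]],
    "<stmt>": [["<var>", "=", "<expression>"]],
    "<var>": [["A"], ["B"], ["C"]],
    "<expression>": [["<var>", "+", "<var>"], ["<var>", "-", "<var>"], ["<var>"]],
}


def leftmost_derivation(start, choices):
    """Expand the leftmost nonterminal at every step; return all sentential forms."""
    form = [start]
    steps = [form]
    for choice in choices:
        # Position of the leftmost nonterminal in the current sentential form.
        index = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        rhs = GRAMMAR[form[index]][choice]
        form = form[:index] + rhs + form[index + 1:]
        steps.append(form)
    return steps


# Alternatives chosen at each step to derive: begin A = B + C ; B = C end
CHOICES = [0, 1, 0, 0, 0, 1, 2, 0, 0, 1, 2, 2]
steps = leftmost_derivation("<program>", CHOICES)
for i, form in enumerate(steps):
    print("  " if i == 0 else "=>", " ".join(form))
```

The last sentential form printed contains only terminal symbols and matches the sentence, so the sentence is in the language.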
Derivation
Left Derivation
Left-most derivation applies a production to the leftmost
nonterminal at each step.
A=B*(A+C)
<assign> -> <id> = <expr>
<id> -> A | B | C
<expr> -> <id> + <expr>
| <id> * <expr>
| ( <expr> )
| <id>
Right Derivation
A right-most derivation applies a production rule to the
rightmost nonterminal at each step.
A Grammar for Simple Assignment Statements
<assign> → <id> = <expr>
<id> → A | B | C
<expr> → <id> + <expr>
| <id> * <expr>
| ( <expr> )
| <id>
A=B*(A+C)
<assign> => <id> = <expr>
=> <id> = <id> * <expr>
=> <id> = <id> * ( <expr> )
=> <id> = <id> * ( <id> + <expr> )
=> <id> = <id> * ( <id> + <id> )
=> <id> = <id> * ( <id> + C )
=> <id> = <id> * ( A + C )
=> <id> = B * ( A + C )
=> A = B * ( A + C )
A hierarchical representation of a derivation
Parse tree for A=B*(A+C)
Derivation
<assign> => <id> = <expr>
=> <id> = <id> * <expr>
=> <id> = <id> * ( <expr> )
=> <id> = <id> * ( <id> + <expr> )
=> <id> = <id> * ( <id> + <id> )
=> <id> = <id> * ( <id> + C )
=> <id> = <id> * ( A + C )
=> <id> = B * ( A + C )
=> A = B * ( A + C )
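The tree itself can be built mechanically. The following sketch is a small recursive-descent parser for the assignment grammar used in this derivation; it returns the parse tree as nested tuples and prints it with indentation. The function names and the tuple representation are assumptions made for illustration.

```python
def parse_id(tokens, pos):
    """<id> -> A | B | C"""
    if pos < len(tokens) and tokens[pos] in ("A", "B", "C"):
        return ("<id>", tokens[pos]), pos + 1
    raise SyntaxError("expected A, B or C")


def parse_expr(tokens, pos):
    """<expr> -> <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>"""
    if pos < len(tokens) and tokens[pos] == "(":
        inner, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return ("<expr>", "(", inner, ")"), pos + 1
    left, pos = parse_id(tokens, pos)
    if pos < len(tokens) and tokens[pos] in ("+", "*"):
        right, after = parse_expr(tokens, pos + 1)
        return ("<expr>", left, tokens[pos], right), after
    return ("<expr>", left), pos


def parse_assign(tokens):
    """<assign> -> <id> = <expr>"""
    target, pos = parse_id(tokens, 0)
    if pos >= len(tokens) or tokens[pos] != "=":
        raise SyntaxError("expected '='")
    expr, pos = parse_expr(tokens, pos + 1)
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return ("<assign>", target, "=", expr)


def show(node, depth=0):
    """Print one tree node per line, indented by its depth in the tree."""
    if isinstance(node, str):
        print("  " * depth + node)
        return
    print("  " * depth + node[0])
    for child in node[1:]:
        show(child, depth + 1)


tree = parse_assign(list("A=B*(A+C)"))
show(tree)
```

Every internal node of the printed tree is a nonterminal and every leaf is a terminal, which is exactly the structure of a parse tree.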
Ambiguity in grammar
• A grammar that generates a sentential form with two or
more distinct parse trees is said to be ambiguous.
<assign> -> <id> = <expr>
<id> -> A | B | C
<expr> -> <expr> + <expr>
| <expr> * <expr>
| ( <expr> )
| <id>
A=B+C*A
Two distinct parse trees
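The two trees can be found by brute force. This sketch enumerates every parse tree of the right-hand side B+C*A under the ambiguous <expr> rules by trying each operator as the root of the tree. The function name expr_trees and the tuple encoding (left, operator, right) are assumptions; real parsers do not work this way.

```python
def expr_trees(tokens):
    """All parse trees under <expr> -> <expr> + <expr> | <expr> * <expr> | ( <expr> ) | <id>."""
    trees = []
    if len(tokens) == 1 and tokens[0] in ("A", "B", "C"):
        trees.append(tokens[0])                                      # <expr> -> <id>
    if len(tokens) >= 3 and tokens[0] == "(" and tokens[-1] == ")":
        trees += [("(", t, ")") for t in expr_trees(tokens[1:-1])]  # <expr> -> ( <expr> )
    for i, tok in enumerate(tokens):
        if tok in ("+", "*"):                                        # try each operator as the root
            for left in expr_trees(tokens[:i]):
                for right in expr_trees(tokens[i + 1:]):
                    trees.append((left, tok, right))
    return trees


trees = expr_trees(list("B+C*A"))
for tree in trees:
    print(tree)
```

One tree groups the expression as B + (C * A) and the other as (B + C) * A, so the grammar is ambiguous.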
Resolving Ambiguity: Precedence
• Precedence: * over +
• An operator lower in the parse tree has higher precedence
Using Operator Precedence for Designing Unambiguous Grammar
• Use additional non-terminals and new rules.
• Force different operators to different levels in the parse
tree.
Ambiguous grammar to be rewritten:
<assign> -> <id> = <expr>
<id> -> A | B | C
<expr> -> <expr> + <expr>
| <expr> * <expr>
| ( <expr> )
| <id>
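A common textbook way to apply the two bullets above is to introduce the nonterminals <term> and <factor>, giving <expr> → <expr> + <term> | <term>, <term> → <term> * <factor> | <factor>, and <factor> → ( <expr> ) | <id>. This layered grammar is an assumption here, since the rewritten rules do not appear in the text. The sketch below enumerates parse trees by brute force to check that a sentence has exactly one; the function names are hypothetical.

```python
def factor_trees(t):
    """<factor> -> ( <expr> ) | <id>"""
    trees = []
    if len(t) == 1 and t[0] in ("A", "B", "C"):
        trees.append(t[0])
    if len(t) >= 3 and t[0] == "(" and t[-1] == ")":
        trees += [("(", e, ")") for e in expr_trees(t[1:-1])]
    return trees


def term_trees(t):
    """<term> -> <term> * <factor> | <factor>"""
    trees = list(factor_trees(t))
    for i, tok in enumerate(t):
        if tok == "*":
            trees += [(left, "*", right)
                      for left in term_trees(t[:i])
                      for right in factor_trees(t[i + 1:])]
    return trees


def expr_trees(t):
    """<expr> -> <expr> + <term> | <term>"""
    trees = list(term_trees(t))
    for i, tok in enumerate(t):
        if tok == "+":
            trees += [(left, "+", right)
                      for left in expr_trees(t[:i])
                      for right in term_trees(t[i + 1:])]
    return trees


print(expr_trees(list("B+C*A")))
print(expr_trees(list("B+C+A")))
```

Because * can only be introduced below <term>, it always ends up lower in the tree than +, and each sentence tried here has a single parse tree.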
Using Operator Precedence for Designing Unambiguous Grammar (continued)
A=B+C*A
Leftmost Derivation
A=B+C*A
Rightmost Derivation
A=B+C*A
Parse Tree
The parse tree generated is unique because the grammar is
unambiguous.
What about this?
A=B+C+A
A=B*C*A
Associativity of Operators
• Represent left associativity with left-recursive rules
and right associativity with right-recursive rules.
• A rule is left recursive when its LHS also appears at the
beginning of its RHS, and right recursive when its LHS
appears at the end of its RHS.
• +, -, *, / are all left associative.
• The exponentiation operator is right associative.
Example
Example
A right-recursive grammar represents the exponentiation
operator, which is right associative.
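To see why the direction of grouping matters, the sketch below evaluates the same operand list with both groupings. The function names eval_left_assoc and eval_right_assoc are hypothetical; the first mirrors the grouping produced by a left-recursive rule and the second the grouping produced by a right-recursive rule.

```python
from functools import reduce
import operator


def eval_left_assoc(op, operands):
    """Group as ((a op b) op c): the shape a left-recursive rule produces."""
    return reduce(op, operands)


def eval_right_assoc(op, operands):
    """Group as (a op (b op c)): the shape a right-recursive rule produces."""
    result = operands[-1]
    for value in reversed(operands[:-1]):
        result = op(value, result)
    return result


print(eval_left_assoc(operator.sub, [10, 4, 3]))    # (10 - 4) - 3
print(eval_right_assoc(operator.sub, [10, 4, 3]))   # 10 - (4 - 3)
print(eval_right_assoc(operator.pow, [2, 3, 2]))    # 2 ** (3 ** 2)
print(eval_left_assoc(operator.pow, [2, 3, 2]))     # (2 ** 3) ** 2
```

Subtraction gives the conventional answer only with left grouping, and exponentiation only with right grouping, which is why the grammar has to encode the difference.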
Problems with BNF Notation
• BNF descriptions tend to be long.
• Recursion must be used to specify repeated occurrences.
• A separate alternative must be written for every option.
Extended BNF
Does not add any descriptive power but increases the
readability and writability of BNF.
Three extensions to BNF:
• Optional parts of an RHS are placed in brackets [ ]
• Repetitions (0 or more) are placed inside braces { }
• Alternative parts of an RHS (multiple-choice options) are
placed inside parentheses and separated by vertical
bars
Tutorial problem
Conversion of BNF to EBNF:
– (i) Look for recursion in the grammar:
A -> aA | a becomes A -> a{a}
– (ii) Look for a common string that can be
factored out with grouping and options:
A -> aB | a becomes A -> a[B]
EBNF to BNF
EBNF to BNF:
– Option: [ ]
A -> a[B]C becomes A' -> aNC, N -> B | ε
– Repetition: { }
A -> a{B1B2...Bn}C becomes A' -> aNC, N -> B1B2...BnN | ε
Example
Convert the following BNF to EBNF. Assume that S is
the start symbol.
S → A | AC
C → bA | bAC
A → aD | abD
D → z
EBNF:
S → A { bA }
A → a [b] D
D → z
Example
Convert the following EBNF to BNF:
S → A { bA } ({ } means repetition)
A → a [b] D ([ ] means optional)
D → z
BNF
S → A | AC
C → bA | bAC
A → aD | abD
D→z
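A quick way to gain confidence in such a conversion is to test both grammars on many strings. The sketch below hand-codes one recognizer per grammar and compares them on every string over the letters a, b, z up to length 7. The function names are hypothetical, and agreement on short strings is evidence of equivalence, not a proof.

```python
from itertools import product


# BNF version: each helper returns the set of positions where a match can end.
def bnf_D(s, i):                      # D -> z
    return {i + 1} if s[i:i + 1] == "z" else set()


def bnf_A(s, i):                      # A -> aD | abD
    ends = set()
    if s[i:i + 1] == "a":
        ends |= bnf_D(s, i + 1)
        if s[i + 1:i + 2] == "b":
            ends |= bnf_D(s, i + 2)
    return ends


def bnf_C(s, i):                      # C -> bA | bAC
    ends = set()
    if s[i:i + 1] == "b":
        for j in bnf_A(s, i + 1):
            ends.add(j)               # C -> bA
            ends |= bnf_C(s, j)       # C -> bAC
    return ends


def bnf_S(s):                         # S -> A | AC
    ends = set()
    for j in bnf_A(s, 0):
        ends.add(j)
        ends |= bnf_C(s, j)
    return len(s) in ends


# EBNF version: [ ] becomes an if statement, { } becomes a while loop.
def ebnf_A(s, i):                     # A -> a [b] D
    if s[i:i + 1] != "a":
        return None
    i += 1
    if s[i:i + 1] == "b":             # optional part
        i += 1
    return i + 1 if s[i:i + 1] == "z" else None


def ebnf_S(s):                        # S -> A { bA }
    i = ebnf_A(s, 0)
    while i is not None and i < len(s):
        if s[i] != "b":
            return False
        i = ebnf_A(s, i + 1)
    return i == len(s)


strings = ("".join(p) for n in range(8) for p in product("abz", repeat=n))
mismatches = [s for s in strings if bnf_S(s) != ebnf_S(s)]
print(mismatches)
print(bnf_S("azbabz"), ebnf_S("azbabz"), ebnf_S("azb"))
```

The EBNF recognizer needs no helper for C at all: the repetition that BNF expresses through the recursive nonterminal C collapses into a single loop.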
BNF and EBNF Versions of an Expression Grammar
EBNF limitations
EBNF repetition does not indicate associativity: a rule such as
<expr> → <term> {+ <term>} does not say whether + groups to
the left or to the right.
Recent Variations in EBNF
• Alternative RHSs are put on separate lines instead of
using a vertical bar
• Use of a colon instead of ->
• Use of opt for optional parts in place of square brackets.
• Use of oneof for choices