Lec 05

Uploaded by

Ambreen Raja

Theory of Programming Languages
Lecture # 5
Chapter # 2

Describing Syntax and Semantics


Introduction

 The study of programming languages, like the study of natural languages, can be divided into:
 syntax
 semantics
Introduction

 The syntax of a programming language is the form of its expressions, statements, and program units.
 Its semantics is the meaning of those expressions, statements, and program units.
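 For example, the syntax of a Java while statement is
while (boolean_expr) statement
 Its semantics: as long as boolean_expr evaluates to true, the embedded statement is executed repeatedly.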
Describing Syntax

 Language: a set of strings of characters from some alphabet.
 Sentences: the strings of a language are called sentences or statements.
 Syntax rules: the syntax rules of a language specify which strings of characters from the language's alphabet are in the language.
Describing Syntax

 Lexemes
 The lowest-level syntactic units of a language are called lexemes.
 The lexemes of a programming language include its
 numeric literals,
 operators, and
 special words
Describing Syntax

 Identifiers
 Lexemes are partitioned into groups. For example, the names of variables, methods, classes, and so forth form a group called identifiers.
 Token
 Each lexeme group is represented by a name, or token.
Describing Syntax

 For example, an identifier is a token that can have lexemes, or instances, such as sum and total.
 In some cases, a token has only a single possible lexeme.
 For example, the token for the arithmetic operator symbol + has just one possible lexeme.
Describing Syntax

 Consider the following Java statement:
index = 2 * count + 17;
 Its lexemes and tokens are:

Lexemes    Tokens
index      identifier
=          equal_sign
2          int_literal
*          mult_op
count      identifier
+          plus_op
17         int_literal
;          semicolon
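
The lexeme/token pairing for this statement can be reproduced with a small lexer. The following is a minimal sketch, not a real Java scanner: the token names come from the slide, while the regular expressions and the names TOKEN_SPEC, MASTER, and tokenize are assumptions made for illustration.

```python
import re

# Token names follow the slide; the patterns are illustrative assumptions.
TOKEN_SPEC = [
    ("int_literal", r"\d+"),
    ("identifier",  r"[A-Za-z_]\w*"),
    ("equal_sign",  r"="),
    ("mult_op",     r"\*"),
    ("plus_op",     r"\+"),
    ("semicolon",   r";"),
    ("skip",        r"\s+"),   # whitespace separates lexemes and is discarded
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (lexeme, token) pairs for the given source text."""
    pairs = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "skip":
            pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs


for lexeme, token in tokenize("index = 2 * count + 17;"):
    print(f"{lexeme:<8}{token}")
```

Note that identifier and int_literal each match many lexemes, whereas plus_op matches exactly one.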
Describing Syntax

 Languages can be formally defined in two distinct ways:
 by recognition
 a recognizer reads strings of characters from the alphabet and decides whether each one is in the language
 by generation
 a generator produces the sentences of the language
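
The two approaches can be contrasted on a toy language. The sketch below assumes a language that is not from the slides: the strings made of n copies of a followed by n copies of b, for n >= 1. The function names recognizes and generate are hypothetical.

```python
from itertools import count, islice


def recognizes(s):
    """Recognizer: decide whether the string s is a sentence of the language."""
    n = len(s) // 2
    return n >= 1 and s == "a" * n + "b" * n


def generate():
    """Generator: produce the sentences of the language one after another."""
    for n in count(1):
        yield "a" * n + "b" * n


print(list(islice(generate(), 3)))                # the first three sentences
print(recognizes("aabb"), recognizes("abab"))
```

Both functions define the same language: every string the generator produces is accepted by the recognizer, and nothing else is.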
Formal Methods of Describing Syntax

Backus-Naur Form and Context-Free Grammars


 In the middle to late 1950s, two men, Noam Chomsky and John Backus, in unrelated research efforts, developed the same syntax description formalism, which subsequently became the most widely used method for describing programming language syntax.
Grammars

 Two classes of grammars are useful for describing the syntax of programming languages:
 context-free and regular.
 Regular grammars describe the forms of the tokens of programming languages.
 Context-free grammars describe the syntax of whole programming languages.
BNF

 BNF is a natural notation for describing syntax.


 BNF is a metalanguage for programming languages.
 A metalanguage is a language that is used to describe another language.
 BNF uses abstractions for syntactic structures.
BNF

 A simple Java assignment statement, for example, might be represented by the abstraction <assign> and defined by the rule
<assign> → <var> = <expression>
 The text on the left side of the arrow, which is aptly called the left-hand side (LHS), is the abstraction being defined. The text to the right of the arrow is the definition of the LHS. It is called the right-hand side (RHS) and consists of some mixture of tokens, lexemes, and references to other abstractions.
 An example sentence whose syntactic structure is described by the rule is:
total = subtotal1 + subtotal2
BNF

 The abstractions in a BNF description, or grammar, are often called nonterminal symbols, or simply nonterminals.
 The lexemes and tokens of the rules are called terminal symbols, or simply terminals.
 A BNF description, or grammar, is a collection of rules.
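
As a rough illustration of this terminology, a rule can be stored as data and its right-hand side split into nonterminals and terminals. The representation below (a pair of LHS and RHS symbol list) and the helper is_nonterminal are assumptions; they rely only on the BNF convention that abstractions are written in angle brackets.

```python
# One BNF rule stored as (LHS, RHS); the RHS is a list of symbols.
RULE = ("<assign>", ["<var>", "=", "<expression>"])


def is_nonterminal(symbol):
    """Abstractions (nonterminals) are written in angle brackets."""
    return len(symbol) > 2 and symbol.startswith("<") and symbol.endswith(">")


lhs, rhs = RULE
nonterminals = [s for s in rhs if is_nonterminal(s)]
terminals = [s for s in rhs if not is_nonterminal(s)]

print(lhs, "->", " ".join(rhs))
print("nonterminals:", nonterminals)
print("terminals:", terminals)
```

A full grammar would then simply be a collection of such rules.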
A Grammar for a Small Language

<program> → begin <stmt_list> end
<stmt_list> → <stmt>
            | <stmt> ; <stmt_list>
<stmt> → <var> = <expression>
<var> → A | B | C
<expression> → <var> + <var>
             | <var> - <var>
             | <var>
A derivation of a program:

<program> => begin <stmt_list> end
=> begin <stmt> ; <stmt_list> end
=> begin <var> = <expression> ; <stmt_list> end
=> begin A = <expression> ; <stmt_list> end
=> begin A = <var> + <var> ; <stmt_list> end
=> begin A = B + <var> ; <stmt_list> end
=> begin A = B + C ; <stmt_list> end
=> begin A = B + C ; <stmt> end
=> begin A = B + C ; <var> = <expression> end
=> begin A = B + C ; B = <expression> end
=> begin A = B + C ; B = <var> end
=> begin A = B + C ; B = C end
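
Each step of this derivation replaces the leftmost nonterminal with one of its right-hand sides. The sketch below replays the derivation mechanically. The dictionary layout, the function leftmost_derivation, and the list CHOICES (0-based indices of the alternative picked at each step) are assumptions for illustration; the subtraction operator is written as an ASCII hyphen.

```python
# The small-language grammar: each nonterminal maps to its list of alternatives.
GRAMMAR = {
    "<program>":    [["begin", "<stmt_list>", "end"]],
    "<stmt_list>":  [["<stmt>"], ["<stmt>", ";", "<stmt_list>"]],
    "<stmt>":       [["<var>", "=", "<expression>"]],
    "<var>":        [["A"], ["B"], ["C"]],
    "<expression>": [["<var>", "+", "<var>"], ["<var>", "-", "<var>"], ["<var>"]],
}


def leftmost_derivation(start, choices):
    """Yield each sentential form, expanding the leftmost nonterminal every step."""
    form = [start]
    yield form
    for choice in choices:
        # Find the position of the leftmost nonterminal in the current form.
        i = next((k for k, sym in enumerate(form) if sym in GRAMMAR), None)
        if i is None:
            raise ValueError("no nonterminal left to expand")
        form = form[:i] + GRAMMAR[form[i]][choice] + form[i + 1:]
        yield form


# Alternative indices that reproduce the derivation of the program above.
CHOICES = [0, 1, 0, 0, 0, 1, 2, 0, 0, 1, 2, 2]
steps = [" ".join(form) for form in leftmost_derivation("<program>", CHOICES)]
print("\n=> ".join(steps))
```

A different choice list yields a different sentence of the same language, which is the sense in which the grammar generates the language.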
A Grammar for Simple Assignment Statements

<assign> → <id> = <expr>
<id> → A | B | C
<expr> → <id> + <expr>
       | <id> * <expr>
       | ( <expr> )
       | <id>
A Grammar for Simple Assignment Statements

 For example, the statement
A = B * ( A + C )
is generated by the leftmost derivation:
<assign> => <id> = <expr>
=> A = <expr>
=> A = <id> * <expr>
=> A = B * <expr>
=> A = B * ( <expr> )
=> A = B * ( <id> + <expr> )
=> A = B * ( A + <expr> )
=> A = B * ( A + <id> )
=> A = B * ( A + C )
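
The same grammar can also be used for recognition. The following is a minimal recursive-descent sketch, assuming every lexeme is a single character; the function recognize_assign and its helpers are hypothetical names. The three <expr> alternatives that begin with <id> are handled together by reading the <id> first and then checking for an operator.

```python
IDS = {"A", "B", "C"}


def recognize_assign(text):
    """Return True if text is a sentence generated by the <assign> grammar."""
    tokens = [ch for ch in text if not ch.isspace()]  # one character per lexeme
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance(allowed):
        # Consume the next token if it is one of the allowed symbols.
        nonlocal pos
        if peek() not in allowed:
            raise SyntaxError(f"expected one of {sorted(allowed)} at token {pos}")
        pos += 1

    def expr():
        # <expr> -> ( <expr> ) | <id> + <expr> | <id> * <expr> | <id>
        if peek() == "(":
            advance({"("})
            expr()
            advance({")"})
        else:
            advance(IDS)
            if peek() in ("+", "*"):
                advance({"+", "*"})
                expr()

    try:
        # <assign> -> <id> = <expr>
        advance(IDS)
        advance({"="})
        expr()
    except SyntaxError:
        return False
    return pos == len(tokens)  # the whole input must be consumed


print(recognize_assign("A = B * ( A + C )"))
print(recognize_assign("A = ( B + C ) * A"))
```

The second call is rejected because, in this grammar, a parenthesized expression cannot be followed by an operator.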
