Lecture21-22 Compiler Construction
(CS-636)
Outline
Semantic Analysis
Lecture: 21-22
Data Types & Type Checking
Type Expressions & Type Constructors
A programming language always contains a number of built-in types
These predefined types are either numeric types, such as int or double, or elementary types, such as boolean or char
Such types are called simple types because their values exhibit no explicit internal structure
An interesting predefined type in the C language is the void type
This type has no values, and so represents the empty set
Type Expressions & Type Constructors (Continued)
In some languages it is possible to define new simple types
Examples are subrange types in Pascal and enumerated types in C
In Pascal, the subrange of integers from 0 to 9 can be declared as
type Digit = 0..9;
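Both kinds of user-defined simple type can be sketched in C. C supports enumerated types directly but has no subrange types, so the sketch below only emulates the Pascal Digit subrange with a hand-written range check. The names Weekday, is_digit_value, and make_digit are hypothetical and chosen purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* An enumerated type: a new simple type whose values are listed explicitly. */
enum Weekday { MON, TUE, WED, THU, FRI, SAT, SUN };

/* C has no subrange types, so Digit is only an alias for int here;
   the 0..9 restriction has to be enforced by an explicit check. */
typedef int Digit;

/* Returns true when v lies in the subrange 0..9. */
bool is_digit_value(int v) {
    return v >= 0 && v <= 9;
}

/* Converts an int to a Digit; the assert documents the 0..9 precondition. */
Digit make_digit(int v) {
    assert(is_digit_value(v));
    return v;
}
```

In this sketch the range check is the programmer's responsibility, whereas a language with real subrange types lets the compiler know the permitted range.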
Type Expressions & Type Constructors (Continued)
Given a set of predefined types, new data types can be created using type constructors such as array and record (or struct)
Such constructors can be viewed as functions that take existing types as parameters and return new types whose structure depends on the constructor
Such types are called structured types
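The view of constructors as functions from types to types can be made concrete with a small sketch. Here a type expression is a C struct, and make_array and make_record are functions that take existing type expressions and return new ones. Everything in the block (struct Type, the TY_ kinds, the two-field limit on records) is an assumption made for illustration, not a representation prescribed by the lecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum TypeKind { TY_INT, TY_DOUBLE, TY_ARRAY, TY_RECORD };

/* A type expression: either a simple type or the result of a constructor. */
struct Type {
    enum TypeKind kind;
    size_t length;              /* TY_ARRAY only: number of elements */
    const struct Type *first;   /* TY_ARRAY: element type; TY_RECORD: first field */
    const struct Type *second;  /* TY_RECORD only: second field */
};

/* Simple (predefined) types have no internal structure. */
static const struct Type INT_TYPE = { TY_INT, 0, NULL, NULL };
static const struct Type DOUBLE_TYPE = { TY_DOUBLE, 0, NULL, NULL };

/* Shared allocation helper; returns NULL if memory is exhausted. */
static struct Type *new_type(enum TypeKind kind, size_t length,
                             const struct Type *first,
                             const struct Type *second) {
    struct Type *t = malloc(sizeof *t);
    if (t != NULL) {
        t->kind = kind;
        t->length = length;
        t->first = first;
        t->second = second;
    }
    return t;
}

/* Constructor "array": takes an element type, returns a new array type. */
struct Type *make_array(size_t length, const struct Type *elem) {
    assert(elem != NULL);
    return new_type(TY_ARRAY, length, elem, NULL);
}

/* Constructor "record", limited to two fields to keep the sketch short. */
struct Type *make_record(const struct Type *f1, const struct Type *f2) {
    assert(f1 != NULL && f2 != NULL);
    return new_type(TY_RECORD, 0, f1, f2);
}
```

Because constructors accept any type expression, they compose: make_array(10, make_record(&DOUBLE_TYPE, &INT_TYPE)) would describe an array of ten records.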
Type Names, Type Declarations, and Recursive Types
Languages with a rich set of type constructors usually also provide a mechanism for the programmer to assign names to type expressions
Such type declarations (sometimes called type definitions) can be written in C as follows
struct RealIntRec {
    double r;
    int I;
};
Type Names, Type Declarations, and Recursive Types (Continued)
Type declarations cause the declared type names to be entered into the symbol table, just as variable declarations cause variable names to be entered
Type names are associated with attributes in the symbol table in much the same way as variable names
These attributes include the scope and the type expression corresponding to the type name
Since type names can appear in type expressions, questions arise about the recursive use of type names
Type Names, Type Declarations, and Recursive Types (Continued)
In the C programming language, a recursive type cannot be declared directly, because at the time of declaration the amount of memory required for the structure would be unknown
Recursion is instead expressed indirectly through pointers, whose size is known:
struct intBST {
    int val;
    struct intBST *left, *right;
};
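The following sketch shows the pointer-based declaration in use. It repeats struct intBST from the slide and adds a few helper functions (bst_insert, bst_contains, bst_min, bst_free) that are hypothetical and not part of the lecture. Replacing the pointer members with `struct intBST left, right;` would be rejected by a C compiler, since the structure type is still incomplete at that point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct intBST {
    int val;
    struct intBST *left, *right;   /* pointers have a known size, so this is legal */
};

/* Inserts val and returns the root; duplicates are ignored.
   If allocation fails, the tree is returned unchanged. */
struct intBST *bst_insert(struct intBST *root, int val) {
    if (root == NULL) {
        struct intBST *node = malloc(sizeof *node);
        if (node != NULL) {
            node->val = val;
            node->left = node->right = NULL;
        }
        return node;
    }
    if (val < root->val)
        root->left = bst_insert(root->left, val);
    else if (val > root->val)
        root->right = bst_insert(root->right, val);
    return root;
}

/* Returns true if val is stored somewhere in the tree. */
bool bst_contains(const struct intBST *root, int val) {
    while (root != NULL && root->val != val)
        root = (val < root->val) ? root->left : root->right;
    return root != NULL;
}

/* Returns the smallest value; the tree must not be empty. */
int bst_min(const struct intBST *root) {
    assert(root != NULL);
    while (root->left != NULL)
        root = root->left;
    return root->val;
}

/* Releases every node of the tree. */
void bst_free(struct intBST *root) {
    if (root == NULL)
        return;
    bst_free(root->left);
    bst_free(root->right);
    free(root);
}
```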
Type Equivalence
Type Inference & Type Checking
Intermediate-Code Generation
Back-end of a Compiler
Where Are We Now?
Source code → Scanner → Tokens → Parser → Syntax Tree → Semantic Analyzer → Annotated Tree
Intermediate-Code Generation
Variants of Syntax Trees
Directed Acyclic Graphs for Expressions
A directed acyclic graph (DAG) is a directed graph with no directed cycles
Like the syntax tree for an expression, a DAG has leaves corresponding to atomic operands and interior nodes corresponding to operators
A node N in a DAG has more than one parent if N represents a common subexpression
A DAG not only represents expressions more succinctly, it also gives the compiler important clues for generating efficient code to evaluate the expression
Directed Acyclic Graphs for Expressions (Continued)
Create syntax trees and DAGs for the following expressions
a = a + 10
a + b + (a + b)
a + b + a + b
a + a * (b - c) + (b - c) * d
The Value-Number Method for Constructing DAGs
Often, the nodes of a syntax tree or DAG are stored in an array of records
Each row of the array represents one record, and therefore one node
The figure on the next slide shows a DAG for the expression i = i + 10 along with its array representation
The Value-Number Method for Constructing DAGs (Continued)
In the following figure, the first field of each record is an operation code
Leaves have one additional field, which holds the lexical value, and interior nodes have two additional fields indicating the left and right children
The Value-Number Method for Constructing DAGs (Continued)
In the array, we refer to a node by giving the integer index of its record within the array
This integer is called the value number for the node or for the expression represented by the node
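The array-of-records idea can be sketched as follows. Before creating a node, the function value_number searches the array for an existing record with the same operator and children and, if one is found, returns its index, so common subexpressions end up shared. The sketch makes several assumptions that are not from the slides: value numbers start at 0, a leaf stores its lexical value in the left field (a symbol-table index for id, the constant itself for num), NO_CHILD marks unused fields, and a linear search stands in for the hash table that a production compiler would more likely use.

```c
#include <assert.h>
#include <string.h>

#define MAX_NODES 64
#define NO_CHILD (-1)

/* One record (row) of the node array. */
struct Node {
    char op[4];   /* "id", "num", or an operator such as "+" or "=" */
    int left;     /* leaf: lexical value; interior: value number of left child */
    int right;    /* interior: value number of right child; NO_CHILD for leaves */
};

static struct Node nodes[MAX_NODES];
static int node_count = 0;

/* Returns the value number of the node <op, left, right>.
   An existing identical node is reused; otherwise a new record is appended. */
int value_number(const char *op, int left, int right) {
    assert(strlen(op) < sizeof nodes[0].op);
    for (int i = 0; i < node_count; i++) {
        if (strcmp(nodes[i].op, op) == 0 &&
            nodes[i].left == left && nodes[i].right == right)
            return i;                      /* node already exists: share it */
    }
    assert(node_count < MAX_NODES);
    strcpy(nodes[node_count].op, op);      /* length checked above */
    nodes[node_count].left = left;
    nodes[node_count].right = right;
    return node_count++;
}
```

For i = i + 10, calling value_number for the leaf i (symbol-table index 0 assumed), the leaf 10, the + node, and the = node fills four rows; asking again for the + node with the same children returns the existing index instead of adding a fifth row.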
Three-Address Code
Three-Address Code (Continued)
Exercise
Represent the following DAG as a sequence of three-address instructions
Addresses and Instructions
Quadruples & Triples
Quadruples
Triples
Static Single-Assignment Form
Static single-assignment form (SSA) is an intermediate representation that facilitates certain code optimizations
Two aspects distinguish SSA from three-address code
All assignments in SSA are to variables with distinct names
SSA uses a notational convention called the φ-function to combine two definitions of the same variable
if( flag ) x = -1; else x = 1;
y = x + a
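The fragment above can be rewritten in SSA style as x1 = -1 in one branch, x2 = 1 in the other, x3 = φ(x1, x2) at the join point, and y1 = x3 + a. C has no φ-function, so the sketch below only emulates it with a helper named phi that selects a value according to the branch taken. The subscripted names x1, x2, x3, y1 and all function names are assumptions for illustration. Both definitions are computed unconditionally here, which is harmless for constants; in real SSA each definition stays in its own branch.

```c
#include <assert.h>

/* Original fragment: x is assigned in two different places. */
int y_original(int flag, int a) {
    int x;
    if (flag) x = -1; else x = 1;
    return x + a;
}

/* Emulates the phi-function: yields the definition that reaches the join
   point, depending on which branch control came from. */
static int phi(int took_then_branch, int from_then, int from_else) {
    return took_then_branch ? from_then : from_else;
}

/* SSA-style version: every variable is assigned exactly once. */
int y_ssa(int flag, int a) {
    const int x1 = -1;                      /* definition in the then-branch */
    const int x2 = 1;                       /* definition in the else-branch */
    const int x3 = phi(flag != 0, x1, x2);  /* x3 = phi(x1, x2) */
    const int y1 = x3 + a;
    return y1;
}

/* Self-check: both versions agree for either value of flag. */
void check_ssa_equivalence(int a) {
    assert(y_ssa(0, a) == y_original(0, a));
    assert(y_ssa(1, a) == y_original(1, a));
}
```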
Summary
Any Questions?