#Chapter 1 - CD
Mattu University
Email:[email protected]
Compiler Design
Chapter One
Textbook:
Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman, “Compilers: Principles, Techniques, and Tools”, Addison-Wesley, 2007.
Objectives
At the end of this session students will be able to:
• Understand the basic concepts and principles of how a compiler works.
• Be familiar with the cousins of the compiler: interpreters and assemblers.
• Understand the need for studying compiler design and construction.
• Understand the phases and steps of compilation.
LANGUAGE PROCESSING SYSTEM IN COMPILER DESIGN
source program → preprocessor → modified source program (pure HLL) → compiler → assembler
CONT…
| Linker | Loader |
|---|---|
| The linker is part of the language toolchain and links in library files. | The loader is part of the operating system. |
| Performs the linking operation. | Loads the program for execution. |
| Combines object files into a single executable file. | Loads executable files into memory for execution. |
| Input: object files generated by the compiler. | Input: executable files generated by the linker. |
| Output: a single executable file. | Output: the program loaded into memory. |
| Assigns memory addresses to code and data sections. | Allocates memory for the program in the process space. |
| Does not execute the program. | Starts execution of the program once it is in memory. |
What is a compiler?
A compiler is software that converts a program written in a High Level Language (HLL), the source program, into an equivalent program in a target language.
More generally, it is a program that reads a program written in one language and translates it into another language.
Traditionally, compilers go from high-level languages to low-level languages.
Source Program → Compiler → Target Program
(the compiler also reports the errors it finds in the source program)
Cont…
The Source Program is normally a program written in a high-level language.
The Target Program is normally the equivalent program in machine code (a relocatable object file).
Cousins of Compiler
A. Assembler: a translator that converts programs written in assembly language into machine code. Its main tasks are:
• Translating mnemonic operation codes to their machine language equivalents.
• Assigning machine addresses to symbolic labels.
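The two assembler tasks above can be sketched as a small two-pass routine. This is only an illustration: the opcode table, the one-address-per-instruction layout, and the sample program are assumptions made for the example, not a real instruction set.

```python
# Hypothetical opcode table for an imaginary machine (not a real ISA).
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "JMP": 0x04, "HALT": 0xFF}

def assemble(source):
    """Two-pass assembly: collect label addresses, then encode instructions."""
    # Pass 1: assign a machine address to every symbolic label.
    # Assumption: every instruction occupies exactly one address.
    symbols = {}
    instructions = []
    for line in source.splitlines():
        line = line.split(";")[0].strip()  # drop comments and surrounding blanks
        if not line:
            continue
        if line.endswith(":"):
            symbols[line[:-1]] = len(instructions)  # address of the next instruction
        else:
            instructions.append(line.split())

    # Pass 2: translate mnemonics to opcodes and labels to addresses.
    code = []
    for mnemonic, *operands in instructions:
        operand = operands[0] if operands else "0"
        value = symbols[operand] if operand in symbols else int(operand)
        code.append((OPCODES[mnemonic], value))
    return code, symbols

program = """
start:
    LOAD 10     ; load from address 10
    ADD 11
    STORE 12
    JMP end
end:
    HALT
"""
machine_code, symbol_table = assemble(program)
print(symbol_table)
print(machine_code)
```

Two passes are used so that a jump to a label defined later in the program (here `end`) can still be resolved.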
Compiler vs. Interpreter
| Compiler | Interpreter |
|---|---|
| Takes the entire program as input. | Takes a single instruction as input. |
| Execution is faster. | Execution is slower. |
| Requires more memory because intermediate object code is generated. | Requires less memory because no intermediate object code is generated. |
| The program need not be compiled every time it runs. | The high-level program is converted into a lower-level program every time it runs. |
| Errors are displayed after the entire program is checked. | Errors are displayed for every instruction interpreted. |
| Debugging is comparatively hard. | Debugging is easy. |
| Ex: C, C++. | Ex: Python, Ruby, BASIC. |
Basic Compiler Design
A compiler is a large program that takes as input another program in the source language and gives as output an executable that we can run.
So that the code can be modified easily, a compiler is usually designed with a modular design (decomposition) methodology.
Two design strategies:
1. Write one “front end” of the compiler (i.e. the lexer, parser, semantic analyzer, and assembly tree generator), and write a separate back end for each platform that you want to support.
2. Write an efficient, highly optimized back end, and write a different front end for each of several languages, such as Fortran, C, C++, and Java.
Source code → Front End → Intermediate code → Back End → Target code
Major Parts of Compilers
There are two major parts of a compiler: Analysis and Synthesis.
Analysis (machine independent): front end
Synthesis (machine dependent): back end
In the analysis phase, an intermediate representation is created from the given source program. Analysis determines the operations implied by the source program, which are recorded in a tree structure. The Lexical Analyzer, Syntax Analyzer and Semantic Analyzer are the parts of this phase.
6. Code Generation: takes the tree structure and translates the operations into the target program.
Phases of Compiler
A compiler operates as a sequence of phases, each of which transforms the source program from one intermediate representation to another.
The phases communicate with the error handlers.
The phases communicate with the symbol table.
Source Program → Lexical Analyzer → Syntax Analyzer → Semantic Analyzer → Intermediate Code Generator → Code Optimizer → Code Generator → Target Program
Phase I: Lexical Analyzer (Scanner)
The Lexical Analyzer reads the source program character by character and returns the tokens of the source program.
A token describes a pattern of characters that has the same meaning in the source program (such as identifiers, operators, keywords, numbers, delimiters and so on).
Ex1: newval = oldval + 12 => tokens:
newval    identifier
=         assignment operator
oldval    identifier
+         add operator
12        a number
Example 2
Input: result = a + b * c / d
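A scanner for the two example inputs can be sketched with regular expressions. The token names follow the slide; the patterns themselves, and the names for `*` and `/`, are assumptions chosen for this illustration.

```python
import re

# Token classes and their patterns (illustrative assumptions, not a full language).
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("identifier", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("assignment operator", r"="),
    ("add operator", r"\+"),
    ("multiply operator", r"\*"),
    ("divide operator", r"/"),
]
PATTERNS = [(kind, re.compile(regex)) for kind, regex in TOKEN_SPEC]

def tokenize(text):
    """Scan text left to right and return a list of (lexeme, token kind) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():  # white space separates tokens and is discarded
            pos += 1
            continue
        for kind, pattern in PATTERNS:
            match = pattern.match(text, pos)
            if match:
                tokens.append((match.group(), kind))
                pos = match.end()
                break
        else:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
    return tokens

for lexeme, kind in tokenize("result = a + b * c / d"):
    print(lexeme, kind)
```

Calling `tokenize("newval = oldval + 12")` yields the five tokens listed in Ex1.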
Phase II: Syntax Analyzer
A Syntax Analyzer creates the syntactic structure (generally a parse tree) of the given program.
A syntax analyzer is also called a Parser.
A parse tree describes a syntactic structure.
It is constructed by repeated application of the rules of a Context Free Grammar (CFG).
Input: result = a + b * c / d
Grammar:
Assign ::= ID ‘=’ Exp
Exp ::= Exp ‘+’ Exp
      | Exp ‘*’ Exp
      | Exp ‘/’ Exp
      | ID
[Figure: parse tree rooted at Assign, expanding to ID ‘=’ Exp, then Exp ‘+’ Exp, down to ID leaves.]
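A parser for this grammar can be sketched by recursive descent. The grammar as written is ambiguous, so the sketch assumes the usual convention that ‘*’ and ‘/’ bind tighter than ‘+’ and that all operators associate to the left. The class name, the tuple-based tree, and the simplified tokenizer are choices made for the example.

```python
import re

ID_PATTERN = r"[A-Za-z_]\w*"

def tokenize(text):
    # Simplified scanner: identifiers and the operators = + * /.
    # Characters outside this set are silently skipped in this sketch.
    return re.findall(ID_PATTERN + r"|[=+*/]", text)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect_id(self):
        token = self.peek()
        if token is None or not re.fullmatch(ID_PATTERN, token):
            raise SyntaxError(f"expected identifier, found {token!r}")
        self.pos += 1
        return ("ID", token)

    def parse_assign(self):
        # Assign ::= ID '=' Exp
        target = self.expect_id()
        if self.peek() != "=":
            raise SyntaxError(f"expected '=', found {self.peek()!r}")
        self.pos += 1
        value = self.parse_exp()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return ("Assign", target, value)

    def parse_exp(self):
        # Exp '+' Exp, restricted to: Term ('+' Term)*
        node = self.parse_term()
        while self.peek() == "+":
            self.pos += 1
            node = ("+", node, self.parse_term())
        return node

    def parse_term(self):
        # Exp '*' Exp and Exp '/' Exp, restricted to: ID (('*' | '/') ID)*
        node = self.expect_id()
        while self.peek() in ("*", "/"):
            op = self.peek()
            self.pos += 1
            node = (op, node, self.expect_id())
        return node

tree = Parser(tokenize("result = a + b * c / d")).parse_assign()
print(tree)
```

Under these assumptions the input is grouped as result = a + ((b * c) / d).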
Syntax Analyzer (CFG)
The type of the identifier newval must match with type of the
Semantic Analysis
[Figure: Source language → Scanner (lexical analysis) → Parser (syntax analysis) → Semantic Analysis (IC generator) → Code Optimizer → Code Generator → Target language. The parser hands a syntactic structure to semantic analysis, which hands a syntactic/semantic structure to the later phases. The phases share the Symbol Table.]
• “Meaning”
• Type/Error Checking
• Intermediate Code Generation (abstract machine)
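The type/error checking step can be sketched as a walk over the syntax tree that consults the symbol table. The declared types are not given in the slides, so the sketch assumes that newval, oldval and fact are all declared as real. The nested-tuple tree and the `check` function are hypothetical names for this example.

```python
# Hypothetical symbol table: the slides do not state the declared types,
# so all three identifiers are assumed to be declared as real.
SYMBOLS = {"newval": "real", "oldval": "real", "fact": "real"}

def check(node):
    """Return (annotated_node, type), inserting inttoreal where int meets real."""
    if isinstance(node, int):  # integer constant
        return node, "int"
    if isinstance(node, str):  # identifier: look it up in the symbol table
        if node not in SYMBOLS:
            raise TypeError(f"undeclared identifier {node!r}")
        return node, SYMBOLS[node]

    op, left, right = node
    left, left_type = check(left)
    right, right_type = check(right)

    if op == ":=":
        if left_type == "int" and right_type == "real":
            raise TypeError("cannot assign a real value to an int variable")
        if left_type == "real" and right_type == "int":
            right = ("inttoreal", right)
        return (op, left, right), left_type

    # Arithmetic operator: widen the int operand when the types differ.
    if left_type == "int" and right_type == "real":
        left = ("inttoreal", left)
    elif left_type == "real" and right_type == "int":
        right = ("inttoreal", right)
    result_type = "real" if "real" in (left_type, right_type) else "int"
    return (op, left, right), result_type

tree = (":=", "newval", ("+", "oldval", ("*", "fact", 1)))
typed_tree, tree_type = check(tree)
print(typed_tree)
```

With these assumed types, the integer constant 1 is wrapped in an inttoreal conversion before it is multiplied by fact.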
Phase IV: Intermediate Code Generation
A compiler may produce explicit intermediate code representing the source program.
This intermediate code is generally machine independent (architecture independent).
A common form is TAC (Three Address Code).
Ex: newval := oldval + fact * 1
id1 := id2 + id3 * 1
temp1 := inttoreal(1)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3
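Three-address code for the example can be produced by a post-order walk of the syntax tree that introduces a fresh temporary for every operation. The following is a minimal sketch, assuming the tree is given as nested tuples with the inttoreal conversion already inserted. `generate_tac` and the tree encoding are hypothetical names for this example.

```python
from itertools import count

def generate_tac(tree):
    """Translate a nested-tuple syntax tree into a list of three-address statements."""
    code = []
    temp_ids = count(1)

    def new_temp():
        return f"temp{next(temp_ids)}"

    def visit(node):
        # Leaves (identifiers and constants) are used directly as operands.
        if not isinstance(node, tuple):
            return str(node)
        if node[0] == "inttoreal":
            operand = visit(node[1])
            temp = new_temp()
            code.append(f"{temp} := inttoreal({operand})")
            return temp
        op, left, right = node
        left_operand = visit(left)
        right_operand = visit(right)
        temp = new_temp()
        code.append(f"{temp} := {left_operand} {op} {right_operand}")
        return temp

    op, target, expression = tree
    if op != ":=":
        raise ValueError("expected an assignment at the root of the tree")
    result = visit(expression)
    code.append(f"{target} := {result}")
    return code

tree = (":=", "id1", ("+", "id2", ("*", "id3", ("inttoreal", 1))))
tac = generate_tac(tree)
print("\n".join(tac))
```

Each generated statement has at most one operator on its right-hand side, which is what makes the code "three address".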
Phase V: Code Optimizer
The code optimizer optimizes the code produced by the intermediate code generator in terms of time and space.
It improves efficiency (machine independent).
Ex:
temp1 := id3 * 1.0
id1 := id2 + temp1
Phase VI: Code Generator
Produces the target language for a specific architecture.
The target program is normally a relocatable object file containing the machine code.
Ex: (assume that we have an architecture with instructions in which at least one operand is a machine register)
MOVE id3,R1
MULT #1.0,R1
ADD id2,R1
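The last two phases can be sketched together on the running example. The optimizer below applies only two simple improvements (folding inttoreal of an integer constant, and removing the final copy), and the code generator assumes a single register and only the commutative operators + and *. Both functions are hypothetical illustrations, not a general algorithm. The sketch also emits a final store into id1, which the listing above leaves out.

```python
import re

def optimize(tac):
    """Fold inttoreal(<int constant>) and remove a trailing copy statement."""
    constants = {}
    kept = []
    for line in tac:
        target, expr = line.split(" := ")
        match = re.fullmatch(r"inttoreal\((\d+)\)", expr)
        if match:
            # Do the conversion at compile time and remember the real constant.
            constants[target] = str(float(match.group(1)))
            continue
        operands = [constants.get(token, token) for token in expr.split()]
        kept.append((target, " ".join(operands)))
    # "x := tempN" directly after "tempN := ..." becomes "x := ...".
    if len(kept) >= 2 and kept[-1][1] == kept[-2][0]:
        final_target = kept.pop()[0]
        kept[-1] = (final_target, kept[-1][1])
    return [f"{target} := {expr}" for target, expr in kept]

def operand(name):
    # Constants are written as immediate operands with a leading '#'.
    return f"#{name}" if name[0].isdigit() else name

def generate_code(tac, register="R1"):
    """Naive single-register code generation (an assumption of this sketch)."""
    mnemonics = {"+": "ADD", "*": "MULT"}
    asm = []
    in_register = None  # name of the value currently held in the register
    target = None
    for line in tac:
        target, expr = line.split(" := ")
        left, op, right = expr.split()
        if in_register == right:
            left, right = right, left  # + and * are commutative: reuse the register
        if in_register != left:
            asm.append(f"MOVE {operand(left)},{register}")
        asm.append(f"{mnemonics[op]} {operand(right)},{register}")
        in_register = target
    if target is not None:
        asm.append(f"MOVE {register},{target}")  # store the final result
    return asm

tac = [
    "temp1 := inttoreal(1)",
    "temp2 := id3 * temp1",
    "temp3 := id2 + temp2",
    "id1 := temp3",
]
optimized = optimize(tac)
assembly = generate_code(optimized)
print("\n".join(optimized))
print("\n".join(assembly))
```

The optimized code matches the Phase V example up to the numbering of the temporary (temp2 here, temp1 in the example).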
Summary of Phases of Compiler
Compiler-Construction Tools