UNIT 5
Q1. What is DAG? Explain its use in code generation. Generate DAG for:
T1 = A + B
T2 = C + D
T3 = E - T2
T4 = T1 - T3
Directed Acyclic Graph :
The Directed Acyclic Graph (DAG) is used to represent the structure of a basic block, to
visualize the flow of values between its statements, and to support optimization within the
basic block. To apply an optimization technique to a basic block, a DAG is constructed from
the three-address code produced during intermediate code generation.
• Directed acyclic graphs are a type of data structure and they are used to apply
transformations to basic blocks.
• The Directed Acyclic Graph (DAG) facilitates the transformation of basic blocks.
• DAG is an efficient method for identifying common sub-expressions.
• It demonstrates how the statement’s computed value is used in subsequent
statements.
Q3. What is DAG? Explain role of DAG in Code generation phase.
The Directed Acyclic Graph (DAG) is used to represent the structure of a basic block, to
visualize the flow of values between its statements, and to support optimization within the
basic block. To apply an optimization technique to a basic block, a DAG is constructed from
the three-address code produced during intermediate code generation.
1. In compiler design, a Directed Acyclic Graph (DAG) is commonly used to represent
the control flow and data dependencies of a program. This representation is often
used as an intermediate representation (IR) that facilitates program optimization and
transformation.
2. When a program is compiled, it goes through several stages, including lexical
analysis, parsing, semantic analysis, and code generation. During these stages, the
compiler constructs a DAG representation of the program that captures the control
flow and data dependencies between its components.
3. One common use of DAGs in compiler design is to represent expressions. An
expression can be represented as a DAG in which each node represents a sub-
expression, and each edge represents a dependency between sub-expressions. The use
of DAGs in this context enables the compiler to perform optimizations such as
common sub-expression elimination and constant folding, which can significantly
improve the performance of the generated code.
4. Another use of DAGs in compiler design is to represent the control flow of a
program. A control flow DAG is a representation of the program’s control flow in
which each node represents a basic block of code, and each edge represents a control
flow transfer between basic blocks. Control flow DAGs can be used to perform
optimizations such as dead code elimination and loop unrolling.
5. Overall, the use of DAGs in compiler design provides an efficient and effective
approach to representing the control flow and data dependencies of a program,
enabling the compiler to perform a variety of optimizations that can significantly
improve the performance of the generated code.
Q4. Explain : Issues in code generation.
Input to code generator – The input to the code generator is the intermediate code generated
by the front end, along with information in the symbol table that determines the run-time
addresses of the data objects denoted by the names in the intermediate representation.
Intermediate code may be represented as quadruples, triples, indirect triples, postfix
notation, syntax trees, DAGs, etc. The code generation phase proceeds on the assumption
that the input is free of syntactic and static semantic errors, that the necessary type checking
has taken place, and that type-conversion operators have been inserted wherever necessary.
Target program: The target program is the output of the code generator. The output may be
absolute machine language, relocatable machine language, or assembly language.
• Absolute machine language as output has the advantages that it can be placed in a fixed
memory location and can be immediately executed. For example, WATFIV is a
compiler that produces the absolute machine code as output.
• Relocatable machine language as an output allows subprograms and subroutines to be
compiled separately. Relocatable object modules can be linked together and loaded by a
linking loader. But there is added expense of linking and loading.
• Assembly language as output makes code generation easier. We can generate
symbolic instructions and use the macro facilities of assemblers in generating code.
However, an additional assembly step is needed after code generation.
• Memory Management – Mapping the names in the source program to the addresses of
data objects is done by the front end and the code generator. A name in the three
address statements refers to the symbol table entry for the name. Then from the symbol
table entry, a relative address can be determined for the name.
Instruction selection – Selecting the best instructions improves the efficiency of the
program. The instruction set should be complete and uniform; instruction speeds and
machine idioms also play a major role when efficiency is considered. If we do not care
about the efficiency of the target program, however, instruction selection is straightforward.
For example, the three-address statements below can be translated into the following code
sequence:
P:=Q+R
S:=P+T
MOV Q, R0
ADD R, R0
MOV R0, P
MOV P, R0
ADD T, R0
MOV R0, S
Here the fourth statement is redundant, as it reloads the value of P that was stored by the
immediately preceding statement, leading to an inefficient code sequence. A given
intermediate representation can be translated into many code sequences, with significant
cost differences between the different implementations. Prior knowledge of instruction
costs is needed in order to design good sequences, but accurate cost information is difficult
to predict.
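Using the fact that R0 still holds P after the store, the code generator can drop the
redundant fourth instruction. A plausible improved sequence, in the same notation as
above:

```
MOV Q, R0
ADD R, R0
MOV R0, P
ADD T, R0    ; R0 still holds P, so reloading P is unnecessary
MOV R0, S
```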
Register allocation issues – Use of registers makes computations faster in comparison to
memory, so efficient utilization of registers is important. The use of registers is subdivided
into two subproblems:
1. During Register allocation – we select only those sets of variables that will reside
in the registers at each point in the program.
2. During a subsequent Register assignment phase, the specific register is picked to
access the variable.
To understand the concept, consider the following three-address code sequence:
t:=a+b
t:=t*c
t:=t/d
Their efficient machine code sequence is as follows:
MOV a,R0
ADD b,R0
MUL c,R0
DIV d,R0
MOV R0,t
1. Evaluation order – The code generator decides the order in which the instruction
will be executed. The order of computations affects the efficiency of the target
code. Among many computational orders, some will require only fewer registers to
hold the intermediate results. However, picking the best order in the general case is
a difficult NP-complete problem.
2. Approaches to code generation issues: The code generator must always generate
correct code. This is essential because of the number of special cases a code
generator may face. Some of the design goals of a code generator are:
• Correct
• Easily maintainable
• Testable
• Efficient
Q5. Write all tree techniques used for the code generator-generator concept.
A code generator translates the intermediate representation of the source program into the
target program; in other words, it translates an abstract syntax tree into machine-dependent
executable code. Generating machine-dependent output from an abstract syntax tree
involves two steps: one for constructing the abstract syntax tree and another for generating
its corresponding machine code.
The first step constructs an Abstract Syntax Tree (AST) by parsing the input file(s). The
tree records the structure of every construct in the program as it is encountered during
parsing, and it is built at compile time as part of compilation.
Register Descriptor
Register descriptors are data structures that record what is currently in each register: for
every register, the name (or names) whose current value it holds, or the fact that it is empty.
The compiler consults this information when generating machine code, so it must be kept
up to date as instructions are emitted.
The code generator walks through the registers to determine which values are currently
available. If a register holds no live value, it can be reused for other purposes.
Address Descriptor
An address descriptor records the location (or locations) where the current value of a name
can be found at run time: a register, a stack position, a memory address, or some
combination of these. The information is typically stored in the symbol-table entry for the
name.
The code generator consults the address descriptor of each operand to decide whether a
load is needed before the operand is used, and it updates the descriptor whenever an
emitted instruction changes where a value resides. The getReg function (described below)
uses the register and address descriptors together to choose the register in which each
result is computed.
Code Generation Algorithm
The code generation algorithm is the core of the compiler's back end. It sets up the register
and address descriptors and then emits machine instructions for each three-address
statement in a basic block.
The algorithm can be viewed in four parts: register descriptor set-up, basic block
generation, instruction generation for operations on registers (e.g., addition), and ending
the basic block with a jump statement or return command.
Register Descriptor Set Up: At the start of a basic block, every register descriptor is
initialized to empty. As instructions are generated, each descriptor records which name's
value its register currently holds, so that subsequent statements can recognize and reuse
values that are already in registers.
Basic Block Generation: The intermediate code is partitioned into basic blocks, and the
control-flow edges between blocks are recorded, so the compiler can keep track of where
control may flow at any given moment during execution.
Instruction Generation For Operations On Registers: This step converts each three-address
statement into machine instructions. For a statement x := y op z, a register is chosen (via
getReg) to hold the result; if y is not already in that register, a load is emitted; then the
operation is applied and the descriptors are updated. This step can also be thought of as
register allocation, because it determines which registers are used for each operation and
how many are needed in total, using the information produced in the previous steps
together with rules about the register requirements of particular operations. For example, a
32-bit addition may require two registers: one to hold the value being added, and one for
the result of the operation.
Instruction Scheduling: This step reorders instructions so that they execute efficiently on a
particular CPU architecture. It uses information about the execution resources available on
the target to determine a good order, and it also considers whether enough registers are free
to hold intermediate values and whether a bottleneck exists elsewhere in the pipeline.
Design of the Function getReg
The getReg function returns the register in which the result of an instruction should be
computed. For a statement x := y op z, a typical design is:
1. If y is currently in a register that holds no other name, and y is not live after the
statement, return that register: the load is avoided and the register is simply reused for x.
2. Otherwise, if an empty register is available, return it.
3. Otherwise, select an occupied register, spill its value to memory if the value is not
already there, and return it.
getReg is a heuristic: a small amount of descriptor bookkeeping avoids a large number of
unnecessary loads and stores.
The output of this phase is a sequence of machine instructions that can be executed with
the help of a runtime system. The code generator produces assembly language or object
code for the target computer. It takes as input an intermediate representation (sometimes
called a compiler IR), which has been processed by the parser and type checker but not yet
lowered into machine code.
When the code generator produces object code, the output is usually in a format specific to
the target architecture, such as Intel 8086 or Motorola 68000.
The compiler front end parses source code and performs some initial analysis on it. It then
passes this data through several phases of compilation, which turn it into machine
instructions that can run on a computer processor.
Q6. Write short notes on
1) Peephole optimization
2) Code movement
Peephole optimization is a type of code optimization performed on a small part of the code,
i.e., on a very small set of instructions within a segment of code.
The small set of instructions, or small part of code, on which peephole optimization is
performed is known as the peephole or window.
It works by replacement: a part of the code is replaced by a shorter and faster equivalent
without any change in output. Peephole optimization is machine-dependent.
Objectives of Peephole Optimization:
The objective of peephole optimization is as follows:
1. To improve performance
2. To reduce memory footprint
3. To reduce code size
Peephole Optimization Techniques
A. Redundant load and store elimination: In this technique, redundancy is eliminated.
Initial code:
y = x + 5;
i = y;
z = i;
w = z * 3;
Optimized code:
y = x + 5;
w = y * 3; // i and z removed: their values were only copies of y
B. Constant folding: Expressions whose operands are all known at compile time are
evaluated by the compiler, and the run-time computation is replaced by its result, avoiding
additional computation.
Initial code:
x = 2 * 3;
Optimized code:
x = 6;
C. Strength Reduction: The operators that consume higher execution time are replaced by the
operators consuming less execution time.
Initial code:
y = x * 2;
Optimized code:
y = x + x; or y = x << 1;
Initial code:
y = x / 2;
Optimized code:
y = x >> 1;
D. Null sequences / Simplify algebraic expressions: Useless operations are deleted, for example:
a := a + 0;
a := a * 1;
a := a/1;
a := a - 0;
E. Combine operations: Several operations are replaced by a single equivalent operation;
for example, on a machine with an auto-increment addressing mode, a load followed by an
increment of the address register can be combined into one instruction.
F. Dead Code Elimination: Dead code refers to portions of the program that are never
executed or do not affect the program’s observable behavior. Eliminating dead code
improves the efficiency and performance of the compiled program by reducing unnecessary
computations and memory usage.
Initial code:
int Dead(void)
{
    int a = 10;
    int z = 50;
    int c;
    c = z * 5;
    printf("%d", c);
    a = 20;
    a = a * 10; // these two assignments are dead: a is never used again
    return 0;
}
Optimized code:
int Dead(void)
{
    int a = 10;
    int z = 50;
    int c;
    c = z * 5;
    printf("%d", c);
    return 0;
}
Code motion is a compiler optimization technique that improves program performance by
strategically relocating statements or expressions so that they are executed less frequently.
It minimizes redundant computations, improves cache utilization, and speeds up overall
program execution.
Types of Code Motion
Following are the types of Code Motion:
• Loop-Invariant Code Motion (LICM)
• Conditional Code Motion
Loop-Invariant Code Motion (LICM)
LICM focuses on recognizing expressions or statements within loops that remain unchanged
throughout the loop’s execution. These loop-invariant computations can be safely moved outside
the loop, significantly reducing the number of times they need to be evaluated. This optimization
proves especially effective when dealing with loops that encompass a substantial number of
iterations.
Conditional Code Motion
Conditional code motion emphasizes the relocation of conditional statements or computations
from loops whenever possible. By decreasing the frequency of condition evaluations, this
optimization minimizes the impact of branching instructions, ultimately resulting in improved
program performance.
Q7. Explain Dynamic programming algorithm for code generation.
This algorithm works for machines in which all computations are done in registers and
instructions consist of an operator applied to two registers, or to a register and a memory
location.
A dynamic programming algorithm is used to extend the class of machines for which
optimal code can be generated from expression trees in linear time. The algorithm works
for a broad class of register machines with complex instruction sets.
It is used to generate code for machines with r interchangeable registers R0, R1, ..., Rr-1
and load, store, and operation instructions.
Here we assume every instruction has a cost of one unit; the dynamic programming
algorithm also works when instructions have individual costs.
The algorithm has three phases (assume the machine has r registers):
1. First we compute bottom-up for each node n of the expression tree T an array C of costs
whereby its ith component C[i] is the optimal cost of computing the subtree S rooted
at n into a register, assuming i registers are available for the computation for 1 ≤ i ≤ r.
2. Traverse T using the cost vectors to determine subtrees of T that must be computed into
memory.
3. Traverse each tree using the cost vectors and associate instructions to generate the target
code. The code for the subtrees that are computed into memory locations is generated
first.
Q9. Explain with example: i) Basic blocks and flow graph
A graph representation of three-address statements, called a flow graph, is useful for
understanding code-generation algorithms, even if the graph is not explicitly constructed by a
code-generation algorithm. Nodes in the flow graph represent computations, and the edges
represent the flow of control. Flow graph of a program can be used as a vehicle to collect
information about the intermediate program. Some register-assignment algorithms use flow
graphs to find the inner loops where a program is expected to spend most of its time.
BASIC BLOCKS
A basic block is a sequence of consecutive statements in which flow of control
enters at the beginning and leaves at the end without halt or possibility of branching except at
the end. The following sequence of three-address statements forms a basic block:
t1 := a*a
t2 := a*b
t3 := 2*t2
t4 := t1+t3
t5 := b*b
t6 := t4+t5
A three-address statement x := y+z is said to define x and to use y or z. A name in a basic
block is said to live at a given point if its value is used after that point in the program,
perhaps in another basic block.
The following algorithm can be used to partition a sequence of three-address statements into
basic blocks.
Algorithm 1: Partition into basic blocks.
Input: A sequence of three-address statements.
Output: A list of basic blocks with each three-address statement in exactly one block.
Method:
1. We first determine the set of leaders, the first statements of basic blocks.
The rules we use are the following:
I) The first statement is a leader.
II) Any statement that is the target of a conditional or unconditional goto is a leader.
III) Any statement that immediately follows a goto or conditional goto statement is a
leader.
2. For each leader, its basic block consists of the leader and all statements up to but not
including the next leader or the end of the program.
Example 3: Consider the fragment of source code shown in fig. 7; it computes the dot
product of two vectors a and b of length 20. A list of three-address statements performing
this computation on our target machine is shown in fig. 8.
begin
    prod := 0;
    i := 1;
    do begin
        prod := prod + a[i] * b[i];
        i := i + 1;
    end
    while i <= 20
end
Let us apply Algorithm 1 to the three-address code in fig. 8 to determine its basic blocks.
Statement (1) is a leader by rule (I), and statement (3) is a leader by rule (II), since the last
statement can jump to it. By rule (III), the statement following (12) would be a leader, but
no statement follows it. Therefore, statements (1) and (2) form one basic block, and the
remainder of the program, beginning with statement (3), forms a second basic block.
(1) prod := 0
(2) i := 1
(3) t1 := 4*i
(4) t2 := a [ t1 ]
(5) t3 := 4*i
(6) t4 :=b [ t3 ]
(7) t5 := t2*t4
(8) t6 := prod +t5
(9) prod := t6
(10) t7 := i+1
(11) i := t7
(12) if i<=20 goto (3)
Q10. Explain simple code generation algorithm? Generate code for following
C Program
main ( )
{
    int i;
    int a[10];
    while (i < 10)
        a[i] = 0;
}
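The simple code generation algorithm processes each three-address statement in turn,
calling getReg to choose a register for the result, consulting the address descriptors to
decide whether operands must be loaded, and updating both descriptors after every emitted
instruction. As a hedged sketch only: assuming 4-byte integers, the MOV/ADD notation
used earlier, and CMP/JGE and the indexed addressing mode a(R1) as available target
idioms, the loop might translate roughly as below. (Note that the source program never
initializes or increments i, so as written the loop would not terminate; presumably i := 0
and i := i + 1 were intended.)

```
      MOV  i, R0        ; load i
L1:   CMP  R0, #10      ; while (i < 10)
      JGE  L2           ; exit loop when i >= 10
      MOV  R0, R1
      MUL  #4, R1       ; t1 = 4 * i  (byte offset, 4-byte ints assumed)
      MOV  #0, a(R1)    ; a[i] = 0
      JMP  L1
L2:   ...
```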