CO1
1. List out the phases of a compiler.
The phases of a compiler are:
1. Lexical Analysis
2. Syntax Analysis
3. Semantic Analysis
4. Intermediate Code Generation
5. Code Optimization
6. Code Generation
7. Error Handling
8. Symbol Table Management
2. Describe the two parts of a compilation.
The two main parts of compilation are:
1. Analysis Phase: This phase breaks the source code into parts and creates an
intermediate representation. It includes lexical, syntax, and semantic analysis.
2. Synthesis Phase: This phase takes the intermediate representation and generates
target code. It includes code optimization and code generation.
3. Differentiate between compiler and interpreter.
Feature          Compiler                                Interpreter
Execution        Translates the whole program at once    Translates and executes line by line
Speed            Faster execution after compilation      Slower due to line-by-line execution
Output           Generates object code                   Does not produce object code
Error detection  Detects errors after full compilation   Detects errors line by line
4. List various Compiler Construction tools.
Various compiler construction tools include:
1. Lex – for lexical analysis
2. Yacc – for syntax analysis
3. Parser generators
4. Syntax-directed translation engines
5. Data flow analysis engines
6. Code generation tools
5. Define Symbol Table.
A Symbol Table is a data structure used by a compiler to store information about the
occurrence of identifiers such as variable names, function names, objects, etc., in the
source code. It helps in semantic analysis and code generation.
6. What is an interpreter?
An interpreter is a program that reads and executes source code line by line. It does not
produce intermediate object code and immediately executes each instruction. It is
commonly used for scripting languages like Python and JavaScript.
7. Illustrate diagrammatically how a language is processed.
A language is processed through the following sequence of programs:

   Source Program
        |
   Preprocessor
        |
   Compiler
        |   (target assembly program)
   Assembler
        |   (relocatable machine code)
   Linker/Loader   <-- library files, relocatable object files
        |
   Absolute Machine Code
8. What is the role of lexical analysis phase?
The lexical analysis phase reads the source program and converts it into a sequence of
tokens. It removes whitespace and comments, detects lexical errors, and passes the
stream of tokens to the syntax analyzer. It also maintains the symbol table entries for
identifiers.
9. List the various compiler construction tools.
Various compiler construction tools include:
1. Lex – Lexical analyzer generator
2. Yacc – Parser generator
3. Syntax-directed translation engines
4. Data flow analysis tools
5. Intermediate code generators
6. Debuggers and profilers
7. Code optimizers
10. List the operations on languages.
Operations on languages include:
1. Union
2. Concatenation
3. Kleene Closure (Repetition)
4. Intersection
5. Complementation
6. Difference
7. Reversal
11. Discuss how you will group the phases of the compiler.
Compiler phases can be grouped into two main parts:
1. Analysis Phase: Includes Lexical Analysis, Syntax Analysis, Semantic Analysis,
and Intermediate Code Generation. It breaks down the source code and checks for
correctness.
2. Synthesis Phase: Includes Code Optimization and Code Generation. It produces
the target machine code from the intermediate representation.
12. Mention a few cousins of the compiler.
Some cousins of the compiler are:
1. Interpreter – Executes code line by line
2. Assembler – Converts assembly code to machine code
3. Preprocessor – Processes directives before compilation
4. Linker – Combines object files into a single executable
5. Loader – Loads the executable into memory for execution
CO2
1. Write about the Frontend Backend model of a compiler.
In the Frontend-Backend model of a compiler:
• Frontend includes lexical analysis, syntax analysis, and semantic analysis. It
checks the correctness of the source code and produces an intermediate
representation.
• Backend includes code optimization and code generation. It takes the
intermediate representation and produces optimized target code.
2. Construct a syntax tree for the Regular Expression ab(a|b)*abb.
A syntax tree for the regular expression ab(a|b)*abb, where · denotes concatenation (left-associative):
·
├── ·
│   ├── ·
│   │   ├── ·
│   │   │   ├── ·
│   │   │   │   ├── a
│   │   │   │   └── b
│   │   │   └── *
│   │   │       └── |
│   │   │           ├── a
│   │   │           └── b
│   │   └── a
│   └── b
└── b
3. Discuss the possible error recovery actions in lexical analyzer.
Error recovery actions in lexical analysis include:
1. Panic mode recovery – Skip input until a well-defined token is found.
2. Deleting extra characters – Remove characters causing errors.
3. Inserting missing characters – Insert a likely missing character.
4. Replacing characters – Replace an incorrect character with a likely correct one.
5. Transposing adjacent characters – Fix errors due to wrong ordering.
4. Define buffer pair. Why is buffering used in lexical analysis?
A buffer pair is a two-part buffer used to hold portions of the input source code during
lexical analysis.
Buffering is used to reduce the number of I/O operations and speed up scanning by
reading large chunks of input at once instead of character by character.
5. Write the Regular expressions for identifier and number.
• Identifier: [a-zA-Z_][a-zA-Z0-9_]*
• Number: [0-9]+(\.[0-9]+)? (for integers and floating-point numbers)
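These two patterns can be exercised directly with Python's re module (a quick sketch; the helper names are illustrative, not part of any standard scanner API):

```python
import re

# The patterns from the answer above; re.fullmatch requires the whole
# lexeme to match, mirroring how a scanner classifies a complete token.
IDENT = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")
NUMBER = re.compile(r"[0-9]+(\.[0-9]+)?")

def is_identifier(s):
    return IDENT.fullmatch(s) is not None

def is_number(s):
    return NUMBER.fullmatch(s) is not None

print(is_identifier("count_1"))  # True
print(is_identifier("1count"))   # False: cannot start with a digit
print(is_number("3.14"))         # True
print(is_number("3."))           # False: digits are required after the dot
```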
6. List the commonly used buffering methods.
Common buffering methods include:
1. Single Buffering
2. Double Buffering (Buffer Pair)
3. Sentinel Buffering
4. Circular Buffering
7. Show the advantage of having sentinels at the end of each buffer half in buffer
pairs.
Sentinels help in avoiding explicit end-of-buffer checks by placing a special sentinel
character (e.g., EOF) at the end of each buffer half. This improves performance by
reducing the need for repeated boundary checks during scanning.
8. Define lex and give its execution steps.
Lex is a lexical analyzer generator that converts regular expressions into a lexical analyzer
(scanner).
Execution steps:
1. Write Lex specifications (patterns and actions).
2. Run Lex to generate lex.yy.c.
3. Compile lex.yy.c using a C compiler.
4. Run the resulting executable to perform lexical analysis.
9. List the various parts in LEX program.
A LEX program has three parts:
1. Definitions – Declarations and macros.
2. Rules – Patterns and corresponding actions.
3. User Subroutines – Optional C functions used in actions.
Example structure:
%{
// Definitions
%}
%%
// Rules
%%
// User subroutines
10. State the role of lexical analyser. Identify the lexemes and their corresponding
tokens in the following statement: print("Total-%d\n",score);
• Role of lexical analyser:
The lexical analyser reads the source code, breaks it into tokens, removes
whitespace and comments, and passes tokens to the parser.
• Lexemes and tokens in the statement:
Lexeme Token Type Description
print IDENTIFIER Function/procedure name
( LEFT_PAREN Opening parenthesis
"Total-%d\n" STRING_LITERAL String constant with format specifier and newline
, COMMA Parameter separator
score IDENTIFIER Variable name
) RIGHT_PAREN Closing parenthesis
; SEMICOLON Statement terminator
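The token stream above can be reproduced with a minimal regex-based scanner sketch in Python (the token names follow the table; the pattern list is an assumption, not a real compiler's specification):

```python
import re

# Token patterns tried as alternatives; STRING_LITERAL comes first so
# quoted text is not broken into smaller tokens.
TOKEN_SPEC = [
    ("STRING_LITERAL", r'"[^"]*"'),
    ("IDENTIFIER",     r"[A-Za-z_][A-Za-z0-9_]*"),
    ("LEFT_PAREN",     r"\("),
    ("RIGHT_PAREN",    r"\)"),
    ("COMMA",          r","),
    ("SEMICOLON",      r";"),
    ("SKIP",           r"\s+"),        # whitespace is discarded
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def tokenize(src):
    tokens = []
    for m in MASTER.finditer(src):
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
    return tokens

stmt = 'print("Total-%d\\n",score);'
for tok in tokenize(stmt):
    print(tok)
```

Running this prints the seven (token, lexeme) pairs listed in the table, in order.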
11. Discuss about the possible error recovery actions in the Lexical phase of a
compiler.
Possible error recovery actions include:
1. Panic mode – Skip characters until a valid token is found.
2. Delete extra characters – Remove invalid characters causing errors.
3. Insert missing characters – Add likely missing characters.
4. Replace characters – Substitute incorrect characters with likely correct ones.
5. Transposition – Correct swapped adjacent characters.
12. Define buffer pair.
A buffer pair consists of two buffers used alternately during lexical analysis to hold parts
of the input source code, improving efficiency by reducing the frequency of input
operations.
13. Give the transition diagram for an identifier.
The transition diagram for an identifier (starting with a letter or underscore followed by
letters, digits, or underscores):
(Start) --[a-zA-Z_]--> (State 1)
(State 1) --[a-zA-Z0-9_]--> (State 1)
(State 1) --[other]--> (Accept Identifier)
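The transition diagram can be simulated directly (a sketch using ASCII letter/digit tests; state numbering follows the diagram above):

```python
# Direct simulation of the identifier transition diagram: state 0 is the
# start state; state 1 loops on letters, digits, and underscore and is
# the accepting state once at least one character has been consumed.
def accepts_identifier(s):
    state = 0
    for ch in s:
        if state == 0 and (ch.isalpha() or ch == "_"):
            state = 1
        elif state == 1 and (ch.isalnum() or ch == "_"):
            state = 1
        else:
            return False          # no valid transition: reject
    return state == 1

print(accepts_identifier("_tmp9"))  # True
print(accepts_identifier("9tmp"))   # False: digit cannot start an identifier
```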
14. Define tokens, patterns and lexemes.
• Token: A category of lexemes (e.g., keyword, identifier, operator).
• Pattern: A rule describing the set of lexemes that belong to a token (usually
specified by regular expressions).
• Lexeme: A sequence of characters in the source code that matches the pattern of
a token.
15. Mention the issues in lexical analyzer.
Issues in lexical analysis include:
1. Handling whitespace and comments.
2. Recognizing tokens correctly.
3. Dealing with ambiguous token definitions.
4. Error detection and recovery.
5. Efficient buffering and input management.
6. Symbol table management.
16. Define lex and give its execution steps.
Lex is a tool to generate lexical analyzers from regular expression specifications.
Execution steps:
1. Write Lex specifications (patterns and actions).
2. Run Lex to generate source code (usually lex.yy.c).
3. Compile the generated code with a C compiler.
4. Execute the compiled program to perform lexical analysis.
17. Describe the operations on languages.
Operations on languages include:
1. Union: Combining two languages.
2. Concatenation: Joining strings from two languages end-to-end.
3. Kleene Closure: Repetition of strings from a language zero or more times.
4. Intersection: Strings common to both languages.
5. Complement: Strings not in the language.
6. Difference: Strings in one language but not in another.
7. Reversal: Reversing the strings of a language.
18. Compare the features of NFA and DFA.
Feature                NFA (Nondeterministic Finite Automaton)          DFA (Deterministic Finite Automaton)
Number of transitions  Can have multiple transitions on the same input  Exactly one transition per input symbol from each state
ε-transitions          Allowed                                          Not allowed
State complexity       Generally fewer states                           May require more states than the equivalent NFA
Simulation             May require backtracking or parallel simulation  Direct simulation without backtracking
Design complexity      Easier to construct from regular expressions     More complex to construct directly
CO3
1. Define handle pruning:
Handle pruning is a bottom-up parsing technique where a handle (a substring matching
the RHS of a production) is repeatedly identified and replaced with the corresponding
non-terminal, reducing the string to the start symbol.
2. Obtain the left recursion for the grammar A → Ac | Aad | bd:
Left recursion form:
A → Aα | β
Given grammar has left recursion:
A → Ac | Aad | bd
Can be rewritten as:
A → bd A'
A' → c A' | ad A' | ε
3. Write the rule to eliminate left recursion in a grammar:
For a grammar: A → Aα | β
Left recursion is eliminated as:
A → βA'
A' → αA' | ε
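The rule can be sketched as a small Python function (the grammar encoding as lists of symbols and the "eps" marker for ε are conventions of this sketch, not from the text):

```python
# Eliminate immediate left recursion for one nonterminal, following the
# rule A -> beta A', A' -> alpha A' | eps. Each production is a list of
# symbols; "eps" stands for the empty string.
def eliminate_left_recursion(nonterm, productions):
    recursive = [p[1:] for p in productions if p[0] == nonterm]   # alpha parts
    others    = [p      for p in productions if p[0] != nonterm]  # beta parts
    if not recursive:
        return {nonterm: productions}
    new = nonterm + "'"
    return {
        nonterm: [beta + [new] for beta in others],
        new:     [alpha + [new] for alpha in recursive] + [["eps"]],
    }

# A -> Ac | Aad | bd, as in question 2 above
g = eliminate_left_recursion("A", [["A", "c"], ["A", "a", "d"], ["b", "d"]])
print(g)   # A -> bd A',  A' -> c A' | ad A' | eps
```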
4. Define an ambiguous grammar:
A grammar is ambiguous if there exists at least one string that can have more than one
distinct parse tree or leftmost derivation.
5. Define parse tree:
A parse tree is a hierarchical tree structure that represents the syntactic structure of a
string according to a grammar, showing how the start symbol derives the string.
6. Define left factoring:
Left factoring is a grammar transformation technique to remove common prefixes in
productions, used to make a grammar suitable for predictive (top-down) parsing.
Example:
A → αβ1 | αβ2 becomes A → αA', A' → β1 | β2
7. What is a dangling reference?
A dangling reference occurs when a program continues to use a memory location after
the object it refers to has been deleted or deallocated.
8. What is recursive descent parsing?
Recursive descent parsing is a top-down parsing technique that uses a set of recursive
procedures to process the input, with each non-terminal represented by a function.
9. Syntax tree for a := b * -c + b * -c
:=
├── a
└── +
    ├── *
    │   ├── b
    │   └── - (unary)
    │       └── c
    └── *
        ├── b
        └── - (unary)
            └── c
10. Parse tree & Syntax tree for 4 - 6 / 3 * 5 + 7
Assuming standard precedence (/ and * bind tighter than + and -, all operators
left-associative), the expression groups as (4 - ((6 / 3) * 5)) + 7.
Syntax Tree:
+
├── -
│   ├── 4
│   └── *
│       ├── /
│       │   ├── 6
│       │   └── 3
│       └── 5
└── 7
11. For what type of grammar, a recursive descent parser cannot be constructed?
A recursive descent parser cannot be constructed for left-recursive grammars.
Parsing "cad" with backtracking:
Grammar:
S → cAd
A → ab | a
Steps:
• Match c → matches.
• Try A → ab: match a; the next input symbol is d, but b is expected → fail.
• Backtrack and try A → a: match a; then match d → ACCEPT.
12. Top-down & Bottom-up Parse Tree for “abbcde”
Grammar:
S → aABe
A → Abc | b
B→d
Top-down parse:
• Start with S → aABe.
• Try A → b and B → d: this yields a b d e, which does not match the input "abbcde".
• Backtrack and try A → Abc with the inner A → b, so A ⇒ bbc.
• Now S ⇒ a b b c d e, which matches the input.
Parse Tree:
S
├── a
├── A
│   ├── A
│   │   └── b
│   ├── b
│   └── c
├── B
│   └── d
└── e
13. Convert ND Grammar to D Grammar:
Given:
S → iEtS | iEtSeS | a
E→b
Left factoring:
S → iEtSS' | a
S' → eS | ε
14. Problems with Top-down Parsing:
• Cannot handle left recursion
• Requires lookahead for decision making
• Backtracking is inefficient
• May not be suitable for ambiguous grammars
15. Rules to construct FIRST & FOLLOW:
FIRST(X):
• If X is terminal → FIRST(X) = {X}
• If X → ε, include ε
• If X → Y1Y2…Yn, add FIRST(Y1), and if FIRST(Y1) has ε, then add FIRST(Y2), etc.
FOLLOW(A):
• Add $ to FOLLOW(S) (start symbol)
• If A → αBβ, add FIRST(β) (excluding ε) to FOLLOW(B)
• If A → αB or FIRST(β) contains ε, add FOLLOW(A) to FOLLOW(B)
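The rules above can be sketched as a fixed-point computation (a simplified version that assumes no ε-productions, which suffices for the grammar in question 22 below; the dictionary encoding is a convention of this sketch):

```python
# FIRST and FOLLOW by iteration to a fixed point, for grammars without
# epsilon productions. Grammar: S -> AS | b, A -> SA | a (question 22).
GRAMMAR = {"S": [["A", "S"], ["b"]], "A": [["S", "A"], ["a"]]}
START = "S"
NONTERMS = set(GRAMMAR)

def compute_first(grammar):
    first = {n: set() for n in grammar}
    changed = True
    while changed:
        changed = False
        for head, prods in grammar.items():
            for prod in prods:
                x = prod[0]                       # no eps: FIRST of first symbol
                add = first[x] if x in NONTERMS else {x}
                if not add <= first[head]:
                    first[head] |= add
                    changed = True
    return first

def compute_follow(grammar, first):
    follow = {n: set() for n in grammar}
    follow[START].add("$")                        # rule 1: $ into FOLLOW(start)
    changed = True
    while changed:
        changed = False
        for head, prods in grammar.items():
            for prod in prods:
                for i, x in enumerate(prod):
                    if x not in NONTERMS:
                        continue
                    if i + 1 < len(prod):         # A -> alpha B beta
                        nxt = prod[i + 1]
                        add = first[nxt] if nxt in NONTERMS else {nxt}
                    else:                         # A -> alpha B
                        add = follow[head]
                    if not add <= follow[x]:
                        follow[x] |= add
                        changed = True
    return follow

FIRST = compute_first(GRAMMAR)
FOLLOW = compute_follow(GRAMMAR, FIRST)
print(FIRST)    # both S and A have FIRST = {a, b}
print(FOLLOW)
```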
16. LR Parser Properties & Structure:
Properties:
• Bottom-up parser
• Detects errors early
• Efficient & works for a large class of grammars (LR(k))
• Handles all deterministic context-free grammars
Structure:
• Input buffer
• Stack
• Parsing table
• Driver program
Techniques:
• SLR (Simple LR)
• CLR (Canonical LR)
• LALR (Lookahead LR)
17. LR Parsing Steps:
1. Initialize stack with start state
2. Read input symbol
3. Use parsing table (Action & Goto)
4. Shift or reduce based on table
5. Repeat until input is accepted or error is found
18. Predictive Parser:
A predictive parser is a top-down parser that uses lookahead and does not require
backtracking, typically uses a parse table built from FIRST and FOLLOW sets.
19. Operator Precedence Parser:
A bottom-up parser that handles expressions using precedence and associativity of
operators, suitable for expressions with binary operators (like +, *, /).
20. Role of Semantic Analyzer:
• Performs semantic checks (type checking, scope resolution)
• Ensures source code adheres to language rules
• Builds intermediate representations (AST with annotations)
21. Differences: Top-down vs Bottom-up Parser:
Feature            Top-down                       Bottom-up
Input scan         Left to right                  Left to right
Derivation         Leftmost derivation            Rightmost derivation in reverse
Tree construction  Root to leaves                 Leaves to root
Backtracking       May need it                    Not needed
Left recursion     Cannot handle                  Can handle
22. FIRST & FOLLOW for Grammar:
S → AS | b
A → SA | a
FIRST sets:
• FIRST(S) = {a, b}
• FIRST(A) = {a, b}
FOLLOW sets:
• FOLLOW(S) = { $, a, b }
• FOLLOW(A) = { a, b }
(A appears only before S in S → AS and at the end of A → SA, so FOLLOW(A)
receives FIRST(S) = {a, b} and nothing else.)
CO4
1. Mention the two rules for type checking:
• Type compatibility: The types of operands must match the expected type of the
operator.
• Type coercion/checking: Implicit or explicit conversions are applied when needed
to make types compatible.
2. What is static checking?
Static checking is the process of verifying program properties at compile time without
executing the code. It includes type checking, scope resolution, and syntax checking.
3. Give some examples of static checking:
• Type checking
• Scope resolution
• Variable declaration before use
• Checking for unreachable code
• Function parameter matching
4. What is a type expression?
A type expression represents the type of a language construct (like variable, function,
etc.) using basic types (int, float) and type constructors (arrays, pointers, functions).
5. What is an intermediate code?
Intermediate code is an abstract, machine-independent representation of a program
generated between the front-end and back-end of a compiler. It simplifies optimization
and translation.
6. Advantages of generating intermediate representation:
• Machine independence: Easier to retarget to multiple architectures.
• Simplifies optimization: Allows platform-independent optimizations.
• Separates concerns: Isolates syntax analysis from code generation.
7. Short note on declarations:
A declaration introduces names (variables, functions) and their attributes (type, scope)
into a program. The compiler uses this information for semantic analysis and storage
allocation.
8. 3-Address Code for a = b * -c + b * -c:
t1 = -c
t2 = b * t1
t3 = -c
t4 = b * t3
t5 = t2 + t4
a = t5
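The sequence above can be generated mechanically by walking a syntax tree bottom-up and emitting one instruction per interior node (a sketch; the tuple-based AST encoding and temporary naming are conventions of this example):

```python
import itertools

counter = itertools.count(1)          # fresh temporary names t1, t2, ...

# Generate three-address code from a nested-tuple AST. Leaves are
# variable names; ("neg", e) is unary minus; ("+", l, r) etc. are binary.
def gen(node, code):
    if isinstance(node, str):
        return node
    if node[0] == "neg":
        a = gen(node[1], code)
        t = f"t{next(counter)}"
        code.append(f"{t} = -{a}")
    else:
        op, l, r = node
        a, b = gen(l, code), gen(r, code)
        t = f"t{next(counter)}"
        code.append(f"{t} = {a} {op} {b}")
    return t

# a = b * -c + b * -c
expr = ("+", ("*", "b", ("neg", "c")), ("*", "b", ("neg", "c")))
code = []
code.append(f"a = {gen(expr, code)}")
print("\n".join(code))
```

The output matches the listing above, including the repeated t1/t3 computation of -c; a DAG-based generator would share that subexpression instead.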
9. Three functions of backpatching:
• Handles forward jumps (like in if-else, loops)
• Delays address filling in jump instructions until target is known
• Maintains lists of incomplete jump statements to be updated later
10. Define Type Checker:
A type checker is a component of the semantic analysis phase that ensures expressions
and operations are used with compatible types according to language rules.
11. State the type expressions:
Examples of type expressions:
• Basic types: int, float, char
• Constructed types:
◦ array(10, int) → array of 10 integers
◦ pointer(int) → pointer to int
◦ int → float (a function type mapping int to float)
12. Three address code for d = (a-b) + (a-c) + (a-c)
To break down the expression:
t1 = a - b
t2 = a - c
t3 = a - c
t4 = t1 + t2
t5 = t4 + t3
d = t5
13. Three address code for x = a + (b * -c) + (d * -e) (using Triples):
Using triples (position-based instead of labels):
(1) t1 = -c
(2) t2 = b * t1
(3) t3 = -e
(4) t4 = d * t3
(5) t5 = t2 + t4
(6) x = a + t5
14. Define Backpatching:
Backpatching is the technique of emitting jump instructions with their target
addresses left unfilled and filling them in later, once the targets become known.
This allows control-flow constructs (e.g., loops and conditionals) to be
translated correctly in a single pass over the intermediate code.
15. Various ways of representing intermediate languages:
• Three-address code (TAC)
• Postfix notation (Reverse Polish Notation)
• Abstract Syntax Trees (AST)
• Static Single Assignment (SSA) form
• Bytecode (used in virtual machines)
16. Significance of intermediate code:
• Machine independence: Intermediate code allows a compiler to be portable
across different hardware architectures.
• Optimization: Facilitates optimization before final code generation.
• Error checking: Allows detection of errors before target code generation.
17. Properties of intermediate language:
• Machine-independent: It abstracts away hardware-specific details.
• Efficient for optimizations: Allows easier application of optimization techniques.
• Simplifies code generation: Bridges the gap between high-level language and
machine code.
• Readable and analyzable: Facilitates debugging and code analysis.
18. Benefits of using machine-independent intermediate forms:
• Portability: Compiler can target different architectures without major modifications.
• Optimization opportunities: Enables optimization at a higher, architecture-
agnostic level.
• Simplified code generation: A clear intermediate form simplifies the final code
generation process.
19. Various ways of representing intermediate languages (again):
• Three-address code (TAC)
• Abstract Syntax Trees (AST)
• Bytecode
• Static Single Assignment (SSA) Form
• Postfix notation (Reverse Polish Notation)
20. Types of three-address statements:
• Assignment: x = y op z
• Conditional jump: if x relop y goto label
• Unconditional jump: goto label
• Procedure calls: call function, param1, param2
• Return: return x
21. Intermediate code representation for a or b and not c:
Using three-address code:
t1 = not c
t2 = b and t1
t3 = a or t2
CO5
1. What are the limitations of static allocation?
• Limited Flexibility: Static allocation involves fixed memory allocation at compile
time, making it less adaptable to changes in memory requirements during runtime.
• Memory Waste: If the allocated memory is more than required, unused memory
results in wastage.
• No Dynamic Memory Management: Static allocation cannot adapt to varying
memory needs at different execution points, limiting dynamic behavior.
2. Draw the DAG for the statement a = (ab + c) - (ab + c).
3. Define DAG.
A Directed Acyclic Graph (DAG) is a graph used in compilers to represent expressions in
a way that allows for the elimination of common subexpressions. The vertices represent
operations or operands, and edges represent dependencies. It ensures no cycles exist in
the graph.
4. What do you mean by binding of names?
Binding of names refers to the association of variables (or identifiers) with values, types,
or memory locations. It can happen at different stages of the program: during compile
time (static binding) or at runtime (dynamic binding).
5. What are the fields of activation record?
The activation record (or stack frame) typically includes:
1. Return Address: Address to return to after function call.
2. Saved Registers: Storage for registers that need to be restored after the function
call.
3. Local Variables: Space for local variables of the procedure.
4. Parameters: Space for parameters passed to the function.
5. Control Link: Pointer to the activation record of the caller.
6. Access Link: Pointer used for access to non-local variables.
6. What is the order of the calling sequence?
The order of the calling sequence involves the steps followed when a function is called
and returns:
1. Push parameters onto the stack.
2. Save the return address.
3. Allocate space for local variables (Activation record).
4. Transfer control to the called function.
5. Execute function and return the result.
6. Pop local variables and return control to the caller.
7. State how a task is divided between the calling and the called program for stack
updating.
In stack updating:
• Caller: Prepares the arguments, saves registers, and pushes them onto the stack.
• Callee: Allocates space for local variables and sets up the stack for execution. The
callee then executes, performs the task, and returns control to the caller.
• Caller updates the stack by removing the passed parameters and restoring the
stack pointer post-function call.
8. What are the functions and properties of Memory Manager?
A Memory Manager performs the following functions:
1. Allocation: Assigns memory to programs and data structures.
2. Deallocation: Frees memory when no longer needed.
3. Garbage Collection: Removes unused data to reclaim memory.
4. Memory Protection: Ensures that one process doesn't interfere with another.
5. Memory Sharing: Allows multiple processes to share memory.
Properties include efficiency, security, and support for dynamic memory allocation.
9. What is a Procedure?
A Procedure (or function) is a block of code designed to perform a specific task. It is a
self-contained unit that can accept parameters, execute a set of operations, and return a
result.
10. What is an Activation Tree?
An Activation Tree is a hierarchical structure that represents the calling relationships
between functions or procedures during program execution. The root corresponds to the
main program, and each node represents a procedure call, with child nodes indicating
subroutine calls made within the procedure.
11. What is the use of a control stack?
A control stack is used during program execution to manage function calls and
returns. It stores activation records containing return addresses, local variables, and
parameters, helping to maintain control flow during nested or recursive function calls.
12. What are the types of storage allocation strategies? (OR) List Dynamic Storage
allocation techniques.
Storage allocation strategies include:
1. Static Allocation – Memory is allocated at compile time.
2. Stack Allocation – Memory is allocated and deallocated in a last-in-first-out (LIFO)
order.
3. Heap (Dynamic) Allocation – Memory is allocated at runtime and managed via
pointers, supporting structures like linked lists or trees.
13. Define symbol table.
A symbol table is a data structure used by the compiler to store information about
identifiers such as variables, functions, objects, etc. It maintains attributes like name,
type, scope level, and memory location.
14. What are the various ways to pass a parameter in a function?
The common parameter passing methods are:
1. Call by Value – Only the value is passed; changes do not affect the original.
2. Call by Reference – Address is passed; changes reflect in the original variable.
3. Call by Name – Parameters are substituted as expressions in the function body
(like macro expansion).
4. Call by Result – The result is copied back after execution.
15. Give a short note about call-by-name.
Call-by-name delays evaluation of the argument until it is used in the function. The
expression is re-evaluated each time it's accessed, which may lead to multiple
evaluations but allows for more flexible behavior, similar to macro expansion.
16. What are the various data structures used for implementing the symbol table?
Symbol tables can be implemented using:
1. Linear List – Simple, but slow for large numbers of entries.
2. Hash Table – Offers fast lookup, insertion, and deletion.
3. Binary Search Tree (BST) – Maintains sorted order with reasonable search
performance.
4. Trie (Prefix Tree) – Efficient for strings and prefixes.
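A minimal sketch of the hash-table option with nested scopes (Python dicts are hash tables; the class and method names are illustrative, not from any particular compiler):

```python
# A hash-table symbol table with a stack of scopes: lookup searches from
# the innermost scope outward, so inner declarations shadow outer ones.
class SymbolTable:
    def __init__(self):
        self.scopes = [{}]                  # global scope at the bottom

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, **attrs):       # attrs: e.g. type, offset
        self.scopes[-1][name] = attrs

    def lookup(self, name):
        for scope in reversed(self.scopes): # innermost scope wins
            if name in scope:
                return scope[name]
        return None

st = SymbolTable()
st.declare("x", type="int")
st.enter_scope()
st.declare("x", type="float")
print(st.lookup("x"))   # inner declaration shadows the outer one
st.exit_scope()
print(st.lookup("x"))   # outer declaration is visible again
```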
17. Write a short note on declarations.
Declarations in programming introduce variables, constants, or functions to the compiler.
They specify identifier names, types, and sometimes storage class, allowing the
compiler to allocate memory and perform type checks.
18. What are basic blocks?
A basic block is a sequence of consecutive statements in a program with:
• One entry point (the first statement),
• One exit point (the last statement),
• No branches or jump targets inside the block.
They're used in optimization and code generation phases for simplifying control flow
analysis.
19. What is a flow graph?
A flow graph (control flow graph) is a directed graph that represents the flow of control in
a program.
• Nodes represent basic blocks, and
• Edges represent possible control flow between blocks (e.g., jumps, branches).
20. Mention the applications of DAGs. (Or) List the advantages of DAG.
Applications/Advantages of DAG:
1. Eliminates common subexpressions, optimizing code.
2. Helps in instruction reordering for better performance.
3. Minimizes temporary variables, reducing memory usage.
4. Assists in generating efficient target code by exposing parallelism.
21. What are the issues in the design of code generators?
Key issues in code generator design:
1. Correctness – Generated code must be semantically accurate.
2. Efficiency – Code should be optimized for speed and memory.
3. Target Machine Constraints – Must respect architecture (e.g., number of
registers).
4. Instruction Selection – Choose the best machine instructions.
5. Register Allocation – Efficient use of registers.
22. What is register descriptor and address descriptor?
• Register Descriptor: Shows which variables or values are currently stored in each
register.
• Address Descriptor: Lists the locations (register, memory) where the current
value of a variable can be found.
23. Define code generation.
Code generation is the compiler phase that translates intermediate code into machine
code or assembly code. It involves instruction selection, register allocation, and
addressing mode selection.
24. Write the steps to partition a sequence of 3-address statements into basic
blocks.
Steps:
1. Identify leaders (first statements of basic blocks):
◦ First statement is always a leader.
◦ Targets of jump/goto statements are leaders.
◦ Statements immediately following jumps are leaders.
2. Start a new basic block at each leader.
3. Include all statements after a leader up to the next leader or end of the sequence.
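The three leader rules can be sketched as a short Python function (the statement encoding, with 1-based "goto n" targets, is a convention of this example):

```python
# Partition three-address statements into basic blocks via leaders:
# rule 1: the first statement; rule 2: any jump target; rule 3: any
# statement immediately following a jump.
def partition(stmts):
    leaders = {0}                                    # rule 1
    for i, s in enumerate(stmts):
        if "goto" in s:
            target = int(s.split("goto")[1]) - 1     # 1-based target
            leaders.add(target)                      # rule 2
            if i + 1 < len(stmts):
                leaders.add(i + 1)                   # rule 3
    order = sorted(leaders)
    return [stmts[a:b] for a, b in zip(order, order[1:] + [len(stmts)])]

code = [
    "i = 1",                 # 1
    "t1 = i * 4",            # 2  <- target of the conditional jump
    "i = i + 1",             # 3
    "if i < 10 goto 2",      # 4
    "halt",                  # 5
]
for block in partition(code):
    print(block)
```

This yields three blocks: [stmt 1], [stmts 2-4], and [stmt 5].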
CO6
1. List out the examples of function preserving transformations.
Function-preserving transformations improve performance without changing program
behavior. Examples include:
• Common subexpression elimination
• Copy propagation
• Dead code elimination
• Constant folding
• Loop optimizations (e.g., loop invariant code motion)
2. What is peephole optimization?
Peephole optimization is a local optimization technique that examines a small "window"
(peephole) of target instructions to:
• Replace inefficient patterns with more efficient ones.
• Remove redundant code or unnecessary operations.
3. Identify and write down the optimizations that could be performed on a peephole.
Optimizations done in peephole:
• Redundant instruction elimination
• Algebraic simplifications (e.g., replacing multiplication by 2 with addition)
• Strength reduction (e.g., replacing expensive operations with cheaper ones)
• Unreachable code removal
• Jump-to-jump elimination
4. What do you mean by copy propagation?
Copy propagation is an optimization where assignments like x = y are used to replace
subsequent uses of x with y.
This helps simplify code and can expose further optimization opportunities.
5. Name the techniques in loop optimization.
Loop optimization techniques include:
• Loop invariant code motion
• Loop unrolling
• Loop fusion (combining loops)
• Loop fission (splitting loops)
• Strength reduction within loops
6. Applying basic block concepts, how would you represent dummy blocks (with no
statements) in global dataflow analysis?
Dummy blocks (no operations) in global dataflow analysis are still represented as nodes
in the flow graph, mainly to preserve control flow structure.
They serve as connectors or placeholders and are often kept to handle complex jumps
or merging paths.
7. Identify the constructs for optimization in the basic block.
Constructs to optimize in a basic block:
• Redundant computations
• Common subexpressions
• Constant expressions
• Unreachable code
• Copy and dead code
8. List out the properties of optimizing compilers.
Optimizing compilers:
• Preserve semantic correctness
• Improve runtime performance
• Reduce memory usage
• Apply target-specific optimizations
• Balance between compilation time and execution efficiency
9. What is a flow graph? State its role in the compilation process.
A flow graph represents control flow between basic blocks using nodes and directed
edges.
Role: It helps in performing data flow analysis, optimization, and code generation by
modeling program structure.
10. How is liveness of a variable calculated? Identify it.
A variable is live at a point if its value is used later before being redefined.
Liveness analysis involves:
• Working backward through the flow graph.
• Using use and def sets of variables to compute in and out sets of each basic
block.
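The backward computation can be sketched with the standard equations in[B] = use[B] ∪ (out[B] − def[B]) and out[B] = ∪ in[S] over successors S, iterated to a fixed point (the three-block example is hypothetical):

```python
# Liveness analysis over a small flow graph. blocks maps a block name to
# its (use, def) sets; succ maps a block to its successor list.
def liveness(blocks, succ):
    live_in = {b: set() for b in blocks}
    live_out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b, (use, defs) in blocks.items():
            out = set()
            for s in succ[b]:                  # out[B] = union of in[S]
                out |= live_in[s]
            inn = use | (out - defs)           # in[B] = use ∪ (out − def)
            if inn != live_in[b] or out != live_out[b]:
                live_in[b], live_out[b] = inn, out
                changed = True
    return live_in, live_out

# B1: a = 1   B2: b = a + 1   B3: print b   (straight-line flow B1->B2->B3)
blocks = {"B1": (set(), {"a"}), "B2": ({"a"}, {"b"}), "B3": ({"b"}, set())}
succ = {"B1": ["B2"], "B2": ["B3"], "B3": []}
live_in, live_out = liveness(blocks, succ)
print(live_in)   # a is live entering B2; b is live entering B3
```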
11. When do you call a variable to be syntactically live at a point?
A variable is syntactically live at a program point if its value is used after that point in
the control flow before being redefined.
It indicates that the variable's current value may affect the program's behavior later.
12. Give the main idea of dead code elimination and constant folding.
• Dead Code Elimination: Removes code that doesn't affect the program's output
(e.g., assignments to unused variables).
• Constant Folding: Evaluates constant expressions at compile time and replaces
them with their result (e.g., 2 + 3 → 5).
13. Define constant folding.
Constant folding is a compile-time optimization where constant expressions are
evaluated and replaced with their result to reduce runtime computation.
Example: x = 4 * 2 becomes x = 8.
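The x = 4 * 2 example can be folded mechanically with Python's ast module (a sketch; eval here runs only on constant subexpressions, and ast.unparse requires Python 3.9+):

```python
import ast

# Constant folding: whenever both operands of a binary operation are
# constants, evaluate the operation at "compile time" and splice the
# resulting constant back into the tree.
class Folder(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)                 # fold children first
        if isinstance(node.left, ast.Constant) and isinstance(node.right, ast.Constant):
            expr = ast.fix_missing_locations(ast.Expression(body=node))
            value = eval(compile(expr, "<fold>", "eval"))  # constants only
            return ast.copy_location(ast.Constant(value), node)
        return node

def fold(src):
    tree = ast.fix_missing_locations(Folder().visit(ast.parse(src)))
    return ast.unparse(tree)

print(fold("x = 4 * 2"))        # x = 8
print(fold("y = a + 2 * 3"))    # only the constant part 2 * 3 is folded
```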
14. What is code motion?
Code motion moves loop-invariant computations (those that produce the same result
in every iteration) outside the loop to avoid redundant execution and improve efficiency.
15. Define Local Transformation & Global Transformation.
• Local Transformation: Optimizations applied within a basic block, like algebraic
simplification or removing redundant instructions.
• Global Transformation: Optimizations across basic blocks or functions, like loop
optimizations or global common subexpression elimination.
16. What is meant by Common Sub-expressions?
A common sub-expression is an expression that is computed more than once with the
same operands and without any changes to them in between.
Example: a + b appearing multiple times can be computed once and reused.
17. What is meant by Dead Code? Or Define Live Variable?
• Dead Code: Code that does not affect the program output (e.g., a variable
assigned a value that is never used).
• Live Variable: A variable is live if its current value is used later in the program
before being redefined.
18. What is meant by Reduction in Strength?
Reduction in strength replaces expensive operations with cheaper equivalents.
Example: replacing a = a * 2 with a = a + a.
19. What is meant by loop invariant computation?
A loop invariant computation is an expression inside a loop that produces the same
result in every iteration.
It can be moved outside the loop to improve performance, a process known as code
motion.
20. What is the induction variable?
An induction variable is a variable that is regularly incremented or decremented in a
loop, often used to control loop iteration.
Example: In for (i = 0; i < n; i++), i is the induction variable.
21. Define use of machine idioms.
The use of machine idioms refers to replacing standard code sequences with efficient
machine-specific instructions or patterns.
This helps generate optimized code that takes advantage of the target architecture's
capabilities.
22. What are the structure preserving transformations on basic blocks?
Structure-preserving transformations maintain the program's control flow while
optimizing:
• Algebraic simplification
• Strength reduction
• Common subexpression elimination
• Copy propagation
These keep the basic block structure intact while improving efficiency.
23. Define data flow equations.
Data flow equations are mathematical representations used in data flow analysis to
compute information (like variable liveness or available expressions) by analyzing the flow
of data across basic blocks in a control flow graph.
24. When is a flow graph reducible?
A flow graph is reducible if:
• It can be broken into loops with a single entry point (a header node).
• It can be simplified to a single node through a series of reductions.
Reducible flow graphs support structured programming constructs like if-else, loops,
and function calls.