CD 1-8 Units Q&A by NovaSkillHub

The document provides a comprehensive overview of Compiler Design, focusing on the roles and phases of compilation, including lexical analysis and syntax analysis. It includes prepared questions and answers on topics such as tokens, the LEX tool, context-free grammar, and error handling. Additionally, it discusses the structure of compilers, token recognition, and the significance of grammar in programming languages.

Uploaded by Priyanshu Verma

Unit-wise Prepared Questions and Answers (2, 4 & 5 Marks)

Subject: Compiler Design (CD)

Prepared by: Ch Anil Kumar


Powered by: NovaSkillHub

UNIT-1: Overview of Compilation

2-Marks Questions and Answers

1. What is the role of a lexical analyzer?


The lexical analyzer is the first phase of a compiler.
It reads the source code and breaks it into small parts called tokens (like keywords,
identifiers, symbols).
It removes white spaces and comments, making the code clean for further analysis.

2. Define token and lexeme.

• Token: It is a meaningful unit in the program like a keyword, identifier, or operator.


Example: if, while, +.

• Lexeme: It is the actual character sequence in the source code that matches the pattern of a token.
Example: In if(x > 0),

o the keyword category is the token,

o and the actual text if is the lexeme.

3. What is LEX? Mention its use.


LEX is a tool used to generate lexical analyzers.
It takes patterns (regular expressions) together with actions and generates C code that can identify tokens in the input source code.
It is used to automate the lexical analysis phase of a compiler.

4. List the applications of compiler technology.

• Improving programming languages

• Developing better debugging tools

• Code optimization in software

• Creating interpreters, assemblers, and virtual machines

• Language translation tools like Python to Java converters

4-Marks Questions

1. Explain the structure of a compiler with a neat diagram.

The structure of a compiler consists of several phases that work together to translate source
code written in a high-level language into machine-level code. These phases are organized in
a sequence, and each phase performs a specific task.

• The compiler starts with the lexical analysis phase, where the source code is read and
broken into tokens. Tokens are meaningful words like keywords, identifiers,
operators, etc. This phase also removes white spaces and comments.

• Next is the syntax analysis phase, also called parsing. It checks whether the tokens
follow the grammar rules of the language. If any syntax error is found, it is reported
at this stage. A parse tree or syntax tree is generated which represents the structure
of the source code.

• After that, the semantic analysis phase verifies the meaning of the statements. It
checks things like variable declarations, type compatibility, and scope resolution.

• If everything is correct, the compiler moves to the intermediate code generation
phase. In this phase, the code is translated into an intermediate representation
which is neither source code nor machine code but a form that is easier to optimize.

• Then comes the code optimization phase, which improves the intermediate code by
removing unnecessary instructions or reordering them to make the program run
faster or use less memory.

• After optimization, the code generation phase takes place where the intermediate
code is converted into target machine code.

• The symbol table is used throughout the process to keep track of variable names,
types, and scopes.

• The compiler also has error handling in each phase and provides proper messages to
the programmer to fix issues in the code.
All these phases work in sequence to convert high-level language code into efficient machine
code. A neat block diagram showing these phases should be drawn along with the answer
for better clarity and marks.

2. Describe the process of token recognition with examples.

Token recognition is one of the key tasks performed during the lexical analysis phase of the
compiler. The main aim is to identify the smallest meaningful units in the source code,
known as tokens. A token represents categories such as keywords, identifiers, operators,
constants, delimiters, etc.

• The lexical analyzer reads the input program character by character from left to right
and groups them into tokens using patterns defined by regular expressions.

• Each pattern is associated with a token name, and the input that matches a particular
pattern is considered a valid token. These patterns are matched using tools or
manually written code.

• For example, in the line int x = 10;, the tokens identified are:

o int → Keyword

o x → Identifier

o = → Assignment operator

o 10 → Constant

o ; → Delimiter

• The process of recognizing tokens helps the next phase, i.e., the syntax analyzer, to
build a proper structure (parse tree) of the source code.

This whole process of token recognition plays a vital role in ensuring that the input is valid
and properly formatted before moving to further phases of compilation.

3. Differentiate between hand-written lexical analyzers and LEX.

Both hand-written lexical analyzers and LEX-generated analyzers are used for scanning
source code and producing tokens, but they differ in terms of implementation, ease of use,
and flexibility.

• A hand-written lexical analyzer is manually written in a programming language
like C or C++. It gives the programmer full control over the logic, flow, and
performance.

• On the other hand, LEX is a tool that automatically generates a lexical analyzer. It
takes input in the form of regular expressions and actions and produces the code for
scanning and token recognition.
• Hand-written analyzers are more flexible but require more time and effort.
Debugging and testing also take longer.

• LEX analyzers are faster to develop and easier to modify but might not be as efficient
as hand-written ones in complex scenarios.

In summary, LEX is used for quick and standard lexical analyzer development, while hand-
written analyzers are preferred when more control and customization are needed.

4. Write a simple LEX program to recognize keywords and identifiers.

LEX is a tool used to generate lexical analyzers. It uses regular expressions to define patterns
and matches tokens in the input code.

Below is a simple LEX program to recognize keywords like int, float, and identifiers (variable
names):

%{

#include <stdio.h>

%}

%%

int|float|if|else { printf("Keyword: %s\n", yytext); }

[a-zA-Z_][a-zA-Z0-9_]* { printf("Identifier: %s\n", yytext); }

[ \t\n]             { /* skip whitespace */ }

. { printf("Other: %s\n", yytext); }

%%

int main() {

    yylex();

    return 0;

}

int yywrap() { return 1; }  /* marks end of input so the program links without -lfl */

• In this program, keywords like int, float, etc., are matched and printed.

• Identifiers are matched using a regular expression for variable names.

• yytext stores the matched token text.

• This code can be compiled using lex and gcc to test and recognize input patterns.

This program clearly shows how LEX is used to automate the process of recognizing
keywords and identifiers in any source code.
5-Marks Questions

1. Explain the phases of a compiler in detail with a block diagram.

The compilation process is divided into several phases, each having a specific role in
translating the source code to machine code. These phases are executed in order and work
together to produce efficient executable programs.

• The first phase is Lexical Analysis, which reads the source code and converts it into
tokens. It removes comments, white spaces, and separates valid words like keywords
and identifiers.

• Next is the Syntax Analysis phase, also known as parsing. It checks if the token
sequence follows the correct grammar rules of the programming language. It
generates a parse tree or syntax tree as output.

• Then comes the Semantic Analysis phase, which checks for the meaning of the
statements. It verifies if variables are declared, types are matched, and functions are
called with correct parameters.

• After semantic checking, the Intermediate Code Generation phase generates an
intermediate form of the source code that is easy to optimize and not dependent on
any machine.

• In the Code Optimization phase, the intermediate code is improved by removing
unnecessary operations and applying better logic for efficiency.

• The Code Generation phase converts the optimized code into actual machine code
that can be executed on the hardware.

• Symbol Table Management is used throughout all these phases to store information
about variables, functions, types, and scope.

• Error Handling is also done at each phase to detect and report errors effectively.
2. Discuss in detail the working of LEX tool with an example.

LEX is a tool used to generate lexical analyzers in compiler design. It takes patterns defined
by regular expressions and produces a program in C that can identify and process tokens
from the input.

• The LEX file is divided into three sections: definitions, rules, and user code. In the
definitions section, header files are included. The rules section contains regular
expressions and actions. The user code section has the main function and other
logic.

• When a LEX file is compiled using lex command, it generates a file called lex.yy.c,
which is then compiled using gcc to create an executable.

• The tool reads the input program character by character, matches it with defined
patterns, and performs specified actions like printing token names.
Example:

%%

int { printf("Keyword: int\n"); }

[a-zA-Z]+ { printf("Identifier: %s\n", yytext); }

%%

• In the above example, if the input contains int or any variable name like count, it will
print appropriate messages.

• This tool is very helpful in automating lexical analysis, which saves time and reduces
the possibility of errors compared to writing it manually.

3. How does lexical analysis help in error handling and token generation?

Lexical analysis plays a very important role in the compiler design process, especially in the
initial stages of source code processing. It is responsible for scanning the input code and
breaking it down into meaningful tokens.

• During token generation, the lexical analyzer uses regular expressions to identify
valid words in the program such as keywords, operators, identifiers, and constants.
These tokens are then passed to the next phase, i.e., syntax analysis.

• Apart from token generation, lexical analysis also handles errors related to invalid
characters, illegal symbols, or unexpected sequences in the source code.

• If a character does not match any defined pattern, the lexical analyzer detects it as an
error and reports it. For example, if the programmer writes int @value;, the @
symbol will be flagged as an error.

• The lexical analyzer also performs error recovery, such as skipping invalid input or
using dummy tokens to allow the compilation process to continue.

• Proper error messages are given with line numbers and descriptions to help the
programmer fix them quickly. This improves the debugging process and code quality.

Hence, lexical analysis ensures that only valid and meaningful tokens are passed to the
parser, and errors in the source code are caught at the earliest stage possible.
UNIT-2: Introduction to Syntax Analysis

2-Marks Questions

1. What is the role of a parser?

• The parser is the second phase in the compiler, after lexical analysis.

• It takes input in the form of tokens from the lexical analyzer.

• It checks whether the token sequence follows the syntax rules of the programming
language, defined by grammar.

• If the structure is correct, it builds a parse tree; otherwise, it reports syntax errors.

2. Define Context-Free Grammar (CFG).

• A Context-Free Grammar is a set of rules used to describe the syntax of programming
languages.

• CFG consists of terminals, non-terminals, a start symbol, and production rules.

• Each production rule has a single non-terminal on the left side and a combination of
terminals and non-terminals on the right side.

• CFG is used by parsers to check the correctness of a program’s structure.

3. What is left recursion? Why is it removed?

• Left recursion happens when a non-terminal refers to itself as the first symbol in one
of its production rules.

• It can cause infinite recursion in top-down parsers like recursive descent parsers.

• To avoid this problem, left recursion is removed and replaced by right recursion or
iteration.

• Removing left recursion helps the parser work efficiently and correctly.

4. Define ambiguity in grammar with an example.

• A grammar is said to be ambiguous if a string can have more than one valid parse
tree.

• This causes confusion in interpreting the meaning of the program.

• For example, for the grammar:


E → E + E | E * E | id,
the expression id + id * id can be parsed in two ways — either as (id + id) * id or id +
(id * id).
• Such ambiguity must be removed for accurate parsing.

4-Marks Questions

1. Write a CFG for simple arithmetic expressions.

Here is a simple context-free grammar to represent arithmetic expressions involving
addition and multiplication:

E→E+T|T

T→T*F|F

F → (E) | id

• This grammar shows expressions (E) as a combination of terms (T) and factors (F).

• id represents identifiers or numbers.

• Brackets are also handled using (E) which means expressions can be nested.

• This grammar correctly handles operator precedence and associativity (multiplication
has higher precedence than addition).

2. Explain with an example how to eliminate left recursion.

Left recursion causes issues in top-down parsers and needs to be removed. Let’s take a
simple example to explain how to eliminate it.

Suppose we have this grammar:

A→Aα|β

• Here, A is left-recursive because it appears first on the right-hand side.

• To eliminate left recursion, we rewrite it as:

A → β A’

A’ → α A’ | ε

• This version removes the left recursion and can now be parsed using top-down
parsers.

• Example:
Expr → Expr + Term | Term
is left-recursive and can be rewritten as:

Expr → Term Expr’

Expr’ → + Term Expr’ | ε


3. What are non-context-free constructs? Give examples.

Some programming language features cannot be described using context-free grammars.
These are called non-context-free constructs.

• These constructs require checking conditions that CFGs cannot handle.

• One example is variable declarations and usage. In most languages, a variable must
be declared before use, and CFGs can’t track such dependencies.

• Another example is indentation rules in Python. The alignment of code blocks
depends on white space, which is not handled by CFG.

• Also, checking that function parameters match during a function call is beyond the
power of CFG.

4. Draw a parse tree for the expression a + b * c.

To draw the parse tree, we assume the grammar is:

E→E+T|T

T→T*F|F

F → id

For the expression a + b * c, the tree structure shows operator precedence:

            E
          / | \
         E  +  T
         |    /|\
         T   T * F
         |   |   |
         F   F   c
         |   |
         a   b

• This tree shows that b * c is evaluated first (due to higher precedence of *) and then
added to a.

5-Marks Questions

1. Explain how CFG is used in syntax specification of programming languages.

Context-Free Grammar (CFG) plays a vital role in defining the syntax rules of programming
languages. It helps compilers understand and validate the structure of source code.

• CFG consists of four parts: a set of terminals, non-terminals, a start symbol, and
production rules. Terminals are the basic symbols (like keywords, operators), and
non-terminals are placeholders for patterns of terminals.

• CFG is used to define how statements and expressions should be written. For
example, a grammar rule might define how an arithmetic expression or an if-else
statement should look.

• Parsers use CFG to check if a given program follows the syntax rules. If the code
matches the rules, it is considered syntactically correct.

• CFG helps in generating parse trees that represent the structure of code. These trees
are further used in semantic analysis and code generation.

• By designing an unambiguous and well-structured CFG, language designers ensure
that compilers can accurately understand and compile programs.

2. Discuss ambiguity in grammars with suitable examples and parse trees.

Ambiguity in grammar occurs when a single input string can be parsed in more than one
way. This leads to multiple parse trees and different interpretations of the same program.

• Ambiguous grammars are problematic because the compiler cannot decide which
structure to follow. This can cause confusion in code execution.

• Example:
Grammar:

• E → E + E | E * E | id

Input string: id + id * id

Two possible parse trees:

o Parse tree 1: (id + id) * id

o Parse tree 2: id + (id * id)

• This ambiguity makes it hard to define operator precedence and associativity.


• To resolve this, the grammar is rewritten to enforce correct precedence.
Multiplication should have higher precedence than addition.

Example of unambiguous grammar:

E→E+T|T

T→T*F|F

F → id

• Using this, the parse tree will always give id + (id * id) for the above input.

Ambiguity must be avoided in grammar design to ensure consistent interpretation of code.

3. Explain techniques for writing grammars for programming languages.

Writing grammars for programming languages is a crucial part of language design. Good
grammar ensures that the syntax rules are clear, unambiguous, and easy to parse.

• One technique is starting with simple rules and gradually building complex
constructs. Start from expressions and then move to statements and blocks.

• Use unambiguous grammar to avoid confusion. For example, use separate rules for
different operator precedence levels.

• Remove left recursion from grammar, especially for top-down parsers, to prevent
infinite loops.

• Use factoring techniques like left factoring to make the grammar suitable for
predictive parsing.

• Ensure the grammar handles all valid constructs of the language, including loops,
conditionals, functions, and declarations.

• Also include error handling rules in the grammar to help the parser detect and
report mistakes.

• Test the grammar with various inputs to check if it handles precedence, associativity,
and nesting properly.

A well-designed grammar helps in building reliable and efficient compilers.


UNIT-3: Top-down Parsing and LR Parsing

2-Marks Questions

1. Define FIRST and FOLLOW sets.

• FIRST set of a non-terminal contains all the terminals that can appear as the first
symbol in some string derived from that non-terminal.

• If the non-terminal can derive epsilon (ε), then epsilon is also included in its FIRST
set.

• FOLLOW set of a non-terminal contains all the terminals that can appear immediately
to the right of that non-terminal in some derivation.

• These sets are used in constructing predictive parsing tables for LL(1) parsers.

2. What is a predictive parser?

• A predictive parser is a type of top-down parser that does not use backtracking.

• It uses lookahead symbols and a parsing table to make decisions.

• It predicts the production to use based on the current input symbol and non-
terminal.

• LL(1) parsers are common examples of predictive parsers.

3. What is shift-reduce parsing?

• Shift-reduce parsing is a type of bottom-up parsing technique.

• It uses a stack and an input buffer to process the input symbols.

• The parser shifts input symbols onto the stack until it can reduce them to a non-
terminal using a grammar rule.

• It repeats shifting and reducing until it reduces the entire input to the start symbol.

4. What are viable prefixes?

• Viable prefixes are the prefixes of the right sentential forms that can appear on the
stack of a shift-reduce parser.

• They represent partial derivations that are valid during parsing.

• A viable prefix never extends past the right end of a handle on the stack.

• They help in constructing LR parsing tables and are recognized by LR(0) automata.
4-Marks Questions

1. Explain the LL(1) parsing table construction.

• LL(1) parsing uses a table-driven approach to parse input without backtracking.

• The table is constructed using the FIRST and FOLLOW sets of the grammar.

• For each production A → α, we do the following:

o For each terminal a in FIRST(α), add A → α in the table at M[A, a].

o If ε ∈ FIRST(α), then for each terminal b in FOLLOW(A), add A → α in M[A, b].

• The parser uses the table to decide which production to apply by looking at the
current input symbol and top of the stack.

• LL(1) table must not have any multiple entries; otherwise, the grammar is not LL(1).

2. Differentiate between SLR(1) and LALR(1) parsers.

• SLR(1) parser is a simple LR parser that uses FOLLOW sets to determine parsing
actions.

• LALR(1) parser uses lookaheads specific to items, making it more powerful and
precise.

• SLR(1) may fail for some grammars due to insufficient context in FOLLOW sets.

• LALR(1) combines states with same core items and merges lookahead information,
improving accuracy.

• LALR(1) parsers are widely used in practice (like in YACC), as they balance efficiency
and power.

• All SLR(1) grammars are LALR(1), but the reverse is not always true.

3. What is the role of YACC in parsing? Give examples.

• YACC (Yet Another Compiler Compiler) is a tool that generates parsers automatically.

• It is used for implementing bottom-up parsers based on LALR(1) technique.

• YACC takes grammar rules as input and generates C code for the parser.

• It simplifies parser development by handling shift-reduce conflicts and building
parse trees.

• Example: A simple calculator grammar written in YACC can parse expressions like a +
b * c.

• It works with a lexical analyzer like Lex to complete the front-end of a compiler.
4. Discuss error recovery in predictive parsing.

• Predictive parsers use a table-based approach, so errors can be caught early.

• When an unexpected symbol is found, the parser may stop and report a syntax error.

• One recovery method is Panic Mode, where symbols are discarded until a
synchronizing token (like ; or }) is found.

• Another method is Error Productions, where extra grammar rules are added to catch
specific errors.

• Error routines can also be written to suggest corrections, like missing brackets or
incorrect keywords.

• These techniques make the parser more user-friendly and helpful during compilation.

5-Marks Questions

1. Explain recursive descent parsing with a simple grammar example.

Recursive descent parsing is a top-down method of parsing where each non-terminal in the
grammar is represented by a function in the parser.

• It is simple to implement and works well with LL(1) grammars.

• The parser functions call each other recursively to match the input tokens.

• It does not require any parsing table.

• If the grammar has left recursion, it must be removed before using this method.

Example Grammar:

E → T E'

E' → + T E' | ε

T → id

• We create functions for E, E', and T.

• The E() function will call T() and E'().

• The parser reads tokens one by one and matches them with grammar rules.

• If all tokens are matched, parsing is successful.

Recursive descent is easy to write but limited to non-left-recursive grammars.


2. Describe the LR(0) automaton and its role in LR parsing.

LR(0) automaton is used to construct the canonical collection of LR(0) items, which helps in
building the LR parsing table.

• LR(0) items are grammar rules with a dot (•) showing the position of the parser.

• For example: A → α • β means the parser has seen α and expects β next.

• The automaton starts with an initial item and builds states by shifting the dot over
the symbols.

• Transitions between states represent shifts in parsing.

• The collection of all these states forms the LR(0) automaton.

• It is used in constructing ACTION and GOTO tables for the LR parser.

• These tables guide the parser during shift, reduce, and accept decisions.

LR(0) is the foundation of more powerful parsers like SLR, LR(1), and LALR(1).

3. Explain the shift-reduce parsing technique with an example.

Shift-reduce parsing is a bottom-up approach that uses a stack and input buffer.

• The parser keeps shifting input symbols onto the stack until it matches the right side
of a production rule.

• Then it reduces that set of symbols to the left-hand side non-terminal.

• This process continues until the stack contains only the start symbol.

Example Grammar:

E → E + id | id

Input: id + id

Steps:

• Shift id → Stack: id

• Reduce id to E (using E → id) → Stack: E

• Shift + → Stack: E +

• Shift id → Stack: E + id

• Reduce E + id to E (using E → E + id) → Stack: E

Input is accepted.

Shift-reduce parsing is efficient and widely used in LR parsers.

4. Discuss the LR parsing algorithm and compare SLR(1), LR(1), and LALR(1).

The LR parsing algorithm uses two tables — ACTION and GOTO — along with a stack to parse
input from left to right, constructing a rightmost derivation in reverse.

• The parser starts in an initial state and reads input symbols.

• It either shifts the input to the stack or reduces using a grammar rule.

• The ACTION table tells whether to shift, reduce, or accept.

• The GOTO table tells which state to go after a reduction.

• LR parsing can handle a large class of grammars and is very powerful.

Comparison of SLR(1), LR(1), and LALR(1):

• SLR(1) is the simplest and uses FOLLOW sets for lookahead, but may reject valid
grammars.

• LR(1) is the most powerful and uses specific lookaheads for each item, but creates
large tables.

• LALR(1) combines LR(1) states with the same core items and merges lookaheads,
giving power close to LR(1) with smaller tables.

• LALR(1) is widely used in practical parser generators like YACC.


UNIT-4: Syntax-Directed Definitions (Attribute Grammars)

2-Marks Questions

1. What is an attribute grammar?

• An attribute grammar is a formal way to define the meaning of programming
language constructs by associating attributes with grammar symbols.

• It combines syntax rules with rules for computing values (attributes).

• These attributes can carry semantic information like data types, values, or memory
locations.

• Attribute grammars are used to define Syntax-Directed Definitions (SDDs) in
compilers.

2. Define synthesized and inherited attributes.

• Synthesized attributes are computed from the attributes of a symbol’s children in the
parse tree and passed upwards.

• Inherited attributes are computed from the attributes of the symbol's parent or
siblings and passed downwards or sideways in the tree.

• Synthesized attributes are mostly used in bottom-up parsing.

• Inherited attributes are commonly used in top-down parsing or in more complex
semantic analysis.

3. What is a dependency graph?

• A dependency graph is a visual representation of how attributes in a parse tree
depend on each other.

• Nodes in the graph represent attributes, and edges show which attributes are
needed to compute others.

• It helps in determining a valid order to evaluate all the attributes.

• Cycles in a dependency graph indicate that evaluation may not be possible.

4. Differentiate between S-attributed and L-attributed definitions.

• S-attributed definitions use only synthesized attributes. These are easy to evaluate
using bottom-up parsers.

• L-attributed definitions may use both synthesized and inherited attributes, but
inherited ones are restricted to come from the left side.
• S-attributed definitions are a subset of L-attributed definitions.

• L-attributed definitions are more general and suitable for top-down parsing.

4-Marks Questions

1. Explain the concept of dependency graph with an example.

• A dependency graph shows how the evaluation of attributes depends on one
another during parsing.

• It helps determine the correct order of evaluation without violating dependencies.

• For every production rule in a grammar, each attribute involved becomes a node in
the graph.

• An edge is drawn from attribute A to attribute B if B depends on the value of A.

• Example:
For the production E → E1 + T, suppose we want to compute E.val = E1.val + T.val.
The dependency graph will have arrows from E1.val and T.val to E.val.

• By analyzing this graph, the compiler can evaluate attributes in a correct sequence.

2. Write an SDD to evaluate arithmetic expressions.

• Syntax-directed definitions (SDDs) can evaluate arithmetic expressions like addition
and multiplication.

• Consider a grammar:

• E→E+T

• E→T

• T→T*F

• T→F

• F → digit

• Attributes like val can be used to store the evaluated value.

• SDD:

• E → E1 + T { E.val = E1.val + T.val }

• E→T { E.val = T.val }

• T → T1 * F { T.val = T1.val * F.val }


• T→F { T.val = F.val }

• F → digit { F.val = digit.lexval }

• These rules evaluate the final result by computing values as the parser processes the
expression.

3. Describe how synthesized attributes are evaluated.

• Synthesized attributes are evaluated by using the values of attributes from child
nodes in the parse tree.

• The evaluation proceeds from the leaves of the tree towards the root — this is called
bottom-up evaluation.

• Each non-terminal collects the values of its children to compute its own synthesized
attribute.

• These attributes are commonly used in S-attributed grammars.

• For example, in evaluating an expression E → E1 + T, the attribute E.val is synthesized
from E1.val and T.val.

4. Explain implementation of SDD using LR parser.

• Syntax-directed definitions can be implemented during LR parsing by attaching
semantic actions to grammar rules.

• These semantic actions are performed when a reduction happens in the LR parsing
process.

• LR parsers naturally support S-attributed definitions because they evaluate attributes
in a bottom-up manner.

• Each grammar rule is associated with code that computes the synthesized attribute
during reduction.

• The values are stored in a semantic stack alongside the parsing stack.

• This approach integrates attribute evaluation into the standard LR parsing flow
efficiently.
5-Marks Questions

1. Discuss S-attributed and L-attributed definitions with examples.

• S-attributed definitions use only synthesized attributes and are well-suited for
bottom-up parsers like LR parsers.

• In S-attributed grammars, attributes are passed from child to parent in the parse
tree.

• Example:

• E → E1 + T { E.val = E1.val + T.val }

• T → digit { T.val = digit.lexval }

• L-attributed definitions include both synthesized and inherited attributes, but
inherited attributes must be passed from left to right.

• They are useful for top-down parsers like recursive-descent parsers.

• Example:

• A→BC

• C.i = B.val // inherited attribute passed from B to C

• A.val = C.val // synthesized attribute from C to A

• S-attributed grammars are simpler but less flexible, while L-attributed grammars are
more powerful and commonly used in semantic analysis.

2. Explain evaluation order of attributes in an SDD using dependency graphs.

• The evaluation order of attributes must follow the dependency rules between them
to ensure correctness.

• Dependency graphs help visualize which attributes depend on which others.

• The graph is constructed with nodes representing attributes and edges showing
dependencies.

• A topological sort of this graph gives the correct order in which attributes should be
evaluated.

• If there is a cycle in the graph, it means circular dependency and the attributes
cannot be evaluated.
• This process ensures that all required values are computed before using them in any
rule.

• For example, to compute E → E1 + T, we must compute E1.val and T.val before calculating E.val.
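The topological-sort step can be sketched as follows. This is a minimal illustration, assuming a dependency map where each attribute lists the attributes it needs; all names are made up for the example.

```python
from collections import deque

def topo_order(deps):
    """deps maps each attribute to the attributes it depends on.
    Returns an evaluation order in which dependencies come first."""
    indeg = {a: len(needs) for a, needs in deps.items()}
    users = {a: [] for a in deps}              # reverse edges
    for a, needs in deps.items():
        for d in needs:
            users[d].append(a)
    ready = deque(a for a, n in indeg.items() if n == 0)
    order = []
    while ready:
        d = ready.popleft()
        order.append(d)
        for a in users[d]:
            indeg[a] -= 1
            if indeg[a] == 0:
                ready.append(a)
    if len(order) != len(deps):                # leftover nodes form a cycle
        raise ValueError("circular attribute dependency")
    return order

# For E -> E1 + T: E.val depends on E1.val and T.val
deps = {"E1.val": [], "T.val": [], "E.val": ["E1.val", "T.val"]}
print(topo_order(deps))  # E.val comes last, after both of its dependencies
```

If the graph has a cycle, no valid order exists, which is exactly the circular-dependency case mentioned above.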

3. How can recursive-descent parsers implement L-attributed definitions?

• Recursive-descent parsers are top-down parsers where each non-terminal corresponds to a function.

• L-attributed definitions are ideal for these parsers because inherited attributes can
be passed as function parameters.

• Synthesized attributes are returned from functions.

• For example, if a rule is A → B C, and C has an inherited attribute, it can be passed like C(i).

• The function for C will then use the inherited value i to compute its own attributes.

• This method allows clear and modular implementation of attribute evaluation.

• The order of calling functions naturally follows the left-to-right flow required for L-attributed grammars.
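The parameter-passing idea can be sketched for the rule A → B C with C.i = B.val and A.val = C.val. All function names and the toy "grammar" here are invented for illustration; a real recursive-descent parser would also consume tokens and report errors.

```python
def parse_B(token):
    # B.val is synthesized from the input and returned
    return int(token)

def parse_C(inherited_i, token):
    # C receives its inherited attribute i as a parameter
    # and returns its synthesized value (here: i plus the next number)
    return inherited_i + int(token)

def parse_A(tokens):
    b_val = parse_B(tokens[0])         # parse B first (left to right)
    c_val = parse_C(b_val, tokens[1])  # C.i = B.val, passed as a parameter
    return c_val                       # A.val = C.val

print(parse_A(["3", "4"]))  # 7
```

Inherited attributes become function arguments and synthesized attributes become return values, so the call order of the functions enforces the left-to-right evaluation that L-attributed grammars require.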

Unit 5 – Semantic Analysis

2-Marks Questions

1. What is a symbol table?

A symbol table is like a dictionary used by the compiler to store information about
identifiers (like variables, functions, arrays, etc.).
It stores details like:

• Name of identifier

• Type (int, float, etc.)

• Scope (local/global)

• Memory location
It helps the compiler quickly check if a variable is declared, already used, or defined.
2. How is "scope" represented in semantic analysis?

Scope means where a variable or function is visible and can be used.


Semantic analysis uses a stack of symbol tables to represent scope:

• When entering a new block (like a function), a new table is pushed.

• When exiting the block, it is popped.


This way, it knows which variable is valid in which part of the code.
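A minimal sketch of this stack-of-tables idea, with invented class and method names:

```python
class ScopeStack:
    def __init__(self):
        self.tables = [{}]            # global scope at the bottom

    def enter_block(self):
        self.tables.append({})        # push a new table for the block

    def exit_block(self):
        self.tables.pop()             # pop it when the block ends

    def declare(self, name, info):
        self.tables[-1][name] = info  # declare in the current scope

    def lookup(self, name):
        # search from innermost to outermost scope
        for table in reversed(self.tables):
            if name in table:
                return table[name]
        return None                   # undeclared identifier

scopes = ScopeStack()
scopes.declare("x", "int (global)")
scopes.enter_block()
scopes.declare("x", "int (local)")
print(scopes.lookup("x"))            # int (local): local hides global
scopes.exit_block()
print(scopes.lookup("x"))            # int (global)
```

Lookup searches inner scopes first, which is why a local `x` shadows a global `x` until its block is exited.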

3. Define synthesized and inherited attributes in semantic analysis.

• Synthesized attributes: Values passed from children to parent in the parse tree.
→ Example: Calculating expression values.

• Inherited attributes: Values passed from parent to children or between siblings.


→ Example: Passing type info for declarations.

4. What is semantic error recovery?

Semantic error recovery means how the compiler handles errors in meaning, like:

• Using an undeclared variable

• Type mismatch in assignment


It uses strategies like:

• Error messages

• Guessing types

• Skipping code
so compilation can continue without stopping completely.

4-Marks Questions

1. Explain the different data structures used for symbol tables.

Common data structures:

• Linear list: Simple list, slow in lookup.

• Hash table: Fast access using hash function (most used).

• Binary search tree: Balanced trees give O(log n) lookup time.

• Stack-based tables: Used to manage scope levels.

Each structure helps manage identifiers efficiently during compilation.


2. Describe the semantic analysis of control-flow statements.

For control flow like if-else, loops, etc., semantic analysis:

• Checks types of conditions (should be boolean or int)

• Verifies if break/continue are used correctly

• Ensures that return matches function type


It ensures logical correctness, not just syntax.

3. How is the declaration of functions and variables handled in semantic analysis?

During semantic analysis:

• The compiler adds variables and functions to the symbol table.

• It checks for re-declarations (already declared).

• It validates data types and parameters of functions.

• It also sets the scope of the variable (local/global).

4. Explain S-attributed and L-attributed SDDs with examples related to arrays.

• S-attributed: Only uses synthesized attributes. → Example: A → B where type of A = type of B.

• L-attributed: Uses inherited + synthesized attributes. → Example: Passing array index or size from parent to child.

Array Example:
A → B[C]

• Inherited: pass type info of B to C (array size check)

• Synthesized: Final type of A is element type of B


5-Marks Questions

1. Explain how semantic analysis is performed using syntax-directed definitions for expressions and assignments.

Semantic analysis uses Syntax-Directed Definitions (SDDs) to attach meaning to grammar rules.
Example:

E → E1 + T

E.type = checkType(E1.type, T.type)

For assignments:

S → id = E

→ Check if id is declared

→ Match type of id and E

→ Add to symbol table if needed

So SDDs help validate type, operations, and correctness.
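The checks for S → id = E can be sketched as below. The `check_assignment` helper, the symbol table contents, and the error strings are all illustrative, not a fixed compiler API.

```python
# Hedged sketch: semantic checks for an assignment "id = E".
symbol_table = {"x": "int", "y": "float"}   # assumed prior declarations

def check_assignment(ident, expr_type):
    # 1. Check that id is declared
    if ident not in symbol_table:
        return f"error: '{ident}' is not declared"
    # 2. Match the type of id against the type of E
    if symbol_table[ident] != expr_type:
        return f"error: type mismatch ({symbol_table[ident]} = {expr_type})"
    return "ok"

print(check_assignment("x", "int"))    # ok
print(check_assignment("z", "int"))    # error: 'z' is not declared
print(check_assignment("y", "int"))    # error: type mismatch (float = int)
```

These two checks (declaration and type compatibility) are exactly the semantic actions an SDD would attach to the production S → id = E.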

2. Discuss the treatment of arrays and structures during semantic analysis using attribute
grammars.

Arrays:

• Check array declaration and type (int a[10];)

• Index must be an integer

• Use inherited attribute to pass size, base type

Structures:

• Check each member type

• Store fields and their types in the symbol table

• Access using struct.field — ensure field exists

Attribute grammars pass this info and help in proper validation.

3. Explain scope and symbol tables in detail with examples.


Scope is where a variable or function is visible.
Types:

• Global Scope

• Local Scope
Symbol tables are used to manage identifiers within each scope.

Example:

int x = 5; // Global scope

void func() {

    int x = 10; // Local scope (hides global x)
}

→ Two symbol tables: one for global, one for function.


On entering a block, a new symbol table is pushed. On exit, it is popped.

4. How is semantic error handling done? Give examples

Semantic error handling ensures meaningful code.


Common errors:

• Using undeclared variable


→ x = 5; without declaring x

• Type mismatch
→ int x = "abc";

• Function call mismatch


→ foo(5) but foo() defined with no parameters

Compiler gives:

• Clear error messages

• Line numbers

• Guesses (in IDEs)

Helps in fixing bugs before code generation.


UNIT-6: Intermediate Code Generation

2-Marks Questions

1. What is a quadruple in intermediate code?


A quadruple is a 4-field representation of intermediate code. It has the form:

(op, arg1, arg2, result)

Example: a = b + c becomes → (+, b, c, a)


Used in three-address code generation for easier optimization.

2. Define SSA form.


SSA (Static Single Assignment) form is a representation in which each variable is assigned
exactly once; every new assignment creates a new, uniquely numbered version of the variable.

Example:

x1 = 10

x2 = x1 + 5

This helps in better optimization and analysis.
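The renaming step for straight-line code can be sketched as follows. This is a simplified illustration: the (target, expression) instruction format is invented, the string replacement assumes single-letter variable names, and real SSA construction also needs phi functions at control-flow joins.

```python
def to_ssa(code):
    """Rename each assignment target to a fresh version (x -> x1, x2, ...)
    and rewrite uses to refer to the current version."""
    version = {}                      # current version number per variable
    out = []
    for target, expr in code:
        # rewrite uses to their current versions first
        # (naive string replace; fine for these single-letter names)
        for var, ver in version.items():
            expr = expr.replace(var, f"{var}{ver}")
        version[target] = version.get(target, 0) + 1
        out.append((f"{target}{version[target]}", expr))
    return out

code = [("x", "10"), ("x", "x + 5"), ("y", "x * 2")]
for t, e in to_ssa(code):
    print(t, "=", e)
# x1 = 10
# x2 = x1 + 5
# y1 = x2 * 2
```

After renaming, every use points at exactly one definition, which is what makes later optimization and analysis easier.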

3. What is backpatching?
Backpatching is the process of delaying the insertion of jump targets (like labels) until the
target is known.

Used in control-flow constructs like if, while, etc.

4. What is short-circuit evaluation?


Short-circuit evaluation is used in Boolean expressions where the result can be determined
without evaluating the entire expression.

Example:
In if (a || b) → If a is true, b is not checked.
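This behavior can be observed directly in Python, whose `or` operator short-circuits the same way. The `a`/`b` helper functions and the `calls` list are just instrumentation for this sketch.

```python
calls = []    # records which operands were actually evaluated

def a():
    calls.append("a")
    return True

def b():
    calls.append("b")
    return False

result = a() or b()    # a() is True, so b() is never evaluated
print(result, calls)   # True ['a']
```

If `a()` had returned False, `b()` would be evaluated and would appear in `calls` as well.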
4-Marks Questions

1. Compare quadruples, triples, and indirect triples.

Type             Format                      Advantage

Quadruple        (op, arg1, arg2, result)    Easy to reorder instructions

Triple           (op, arg1, arg2)            No need for temporary result names

Indirect Triple  Pointers to triples         Flexible for code modifications

Example for a = b + c * d

1. (*, c, d)

2. (+, b, result_of_1)

3. (=, result_of_2, a)

2. Explain translation of if-then-else statement in intermediate code.

Code:

if (a < b)

x = 1;

else

x = 2;

Intermediate Code:

if a < b goto L1

goto L2

L1: x = 1

goto L3

L2: x = 2

L3:

Here, labels manage flow of execution.


3. Write intermediate code for a simple expression involving arrays.

Code:

a[i] = b[j] + 5;

TAC (Three Address Code):

t1 = b[j]

t2 = t1 + 5

a[i] = t2

Quadruples:

(=, b[j], -, t1)

(+, t1, 5, t2)

(=, t2, -, a[i])

4. Explain the concept of flow graphs with an example.

A flow graph shows control flow between basic blocks in a program.

Code:

a = 5;

if (a < 10)

b = 1;

else

b = 2;

Flow Graph:

• B1: a = 5

• B2: b = 1

• B3: b = 2

• B4: End

Edges:

• B1 → B2 (if true)

• B1 → B3 (if false)

• B2, B3 → B4
5-Marks Questions

1. Explain various intermediate representations with diagrams and uses.

1. Three Address Code (TAC):

t1 = a + b

t2 = t1 * c

2. Quadruples:

(op, arg1, arg2, result)

3. Triples:

(index) (op, arg1, arg2)

4. Indirect Triples: Use pointers to triples for flexibility.

5. DAG (Directed Acyclic Graph):

• Represents expressions

• Eliminates common subexpressions

Use: All help in optimization, easy translation, and analysis.
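A tiny generator ties TAC and quadruples together. This is a hedged sketch: the AST is a nested tuple (op, left, right), and the temp-naming scheme is invented for illustration.

```python
quads = []     # emitted quadruples: (op, arg1, arg2, result)
counter = 0

def new_temp():
    global counter
    counter += 1
    return f"t{counter}"

def gen(node):
    """Walk the expression AST bottom-up, emitting one quadruple
    per operator and returning the name holding the result."""
    if isinstance(node, str):          # a variable or constant leaf
        return node
    op, left, right = node
    l, r = gen(left), gen(right)       # operands are generated first
    t = new_temp()
    quads.append((op, l, r, t))
    return t

# a + b * c  →  t1 = b * c ; t2 = a + t1
gen(("+", "a", ("*", "b", "c")))
for q in quads:
    print(q)
# ('*', 'b', 'c', 't1')
# ('+', 'a', 't1', 't2')
```

Dropping the result field from each tuple and referring to instructions by index would give the triple form of the same code.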

2. Describe the translation of control-flow constructs like while-do, switch, and Boolean
expressions with short-circuit code.

While Loop:

while (i < 5)

i = i + 1;

L1: if i >= 5 goto L2

i = i + 1

goto L1

L2:

Switch Statement:

switch(x) {

case 1: y = 1; break;

case 2: y = 2; break;

}

if x == 1 goto L1

if x == 2 goto L2

goto L3

L1: y = 1; goto L3

L2: y = 2

L3:

Short-Circuit Boolean:

if (a && b)

if a == false goto Lfalse

if b == false goto Lfalse

goto Ltrue

3. Explain backpatching with an example of Boolean expression.

Backpatching is used to fill in jump labels during code generation.

Code:

if (a > b || c < d)

Step-by-step:

1. if a > b goto ___ [Tlist]

2. if c < d goto ___ [Tlist]

3. goto ___ [Flist]

Maintain lists of true and false conditions. After generating labels, we backpatch them.
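The mechanism can be sketched in a few lines: emit jumps with a placeholder target, remember their positions in true/false lists, and patch them once the labels are known. The `___` placeholder and label names are invented for this sketch.

```python
code = []      # the emitted instructions, in order

def emit(instr):
    code.append(instr)
    return len(code) - 1              # index of the emitted instruction

def backpatch(indices, label):
    # fill in the placeholder target of each remembered jump
    for i in indices:
        code[i] = code[i].replace("___", label)

# if (a > b || c < d) with targets unknown while emitting:
truelist = [emit("if a > b goto ___")]
truelist.append(emit("if c < d goto ___"))
falselist = [emit("goto ___")]

# Once the label addresses are known, patch the holes:
backpatch(truelist, "Ltrue")
backpatch(falselist, "Lfalse")
print(code)
```

Note the short-circuit structure: if `a > b` succeeds, control jumps to Ltrue without ever testing `c < d`.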
4. Generate intermediate code for a sample program with expressions, arrays, and control
statements.

Code:

int a[10], b[10], i;

for(i = 0; i < 10; i++)

a[i] = b[i] + 2;

Intermediate Code:

i = 0

L1: if i >= 10 goto L2

t1 = b[i]

t2 = t1 + 2

a[i] = t2

i = i + 1

goto L1

L2:

Covers:

• Arrays

• Arithmetic

• Loop

• Condition Check
UNIT-7: Run-Time Environments (Simple & Easy for Exam)

2-MARKS QUESTIONS

1. What is an activation record?


It is a block of memory used to store all the information of a function when it is called.
It stores:

• Parameters

• Local variables

• Return address

• Temporary data

2. Define stack allocation.


Stack allocation means using a stack to store function calls.
When a function is called, a record is pushed, and when it ends, the record is popped.
This helps in handling recursive and nested calls easily.

3. What is the role of runtime environment?


It helps to manage memory during program execution.
It takes care of:

• Function calls and returns

• Variable storage

• Controlling flow during execution

4. How is non-local data accessed in nested procedures?


We use an access link to reach variables from outer functions.
This link connects the current function to the one in which it was defined (lexical parent).
4-MARKS QUESTIONS

1. Stack-based Allocation Strategy:

• When a function is called → push activation record to stack

• When it ends → pop the record

• This keeps data for each function separate

• Used for recursive functions

• Easy to manage memory in function calls

Example:
Calling fact(3) creates 3 records for fact(3), fact(2), fact(1).

2. Accessing Non-Local Variables (Without Nested Procedures):


If no nested functions are there,

• Non-local variables are usually global

• They are accessed using fixed memory locations or global symbol table

• No need for access links

Example:
In C language, we can access global variable int x; from any function.

3. Runtime Stack in Procedure Call/Return:

• Runtime stack holds activation records

• New function call → push to stack

• Function return → pop from stack

• Maintains correct flow of execution

• Handles recursion, local data, and return address

Think of stack as a call history that stores what’s needed for each function.
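The push-on-call, pop-on-return discipline can be simulated for a recursive factorial. This is an illustrative sketch: the dictionary "activation record" fields and the `trace` list are invented to make the stack growth visible.

```python
stack = []    # the runtime stack of activation records
trace = []    # stack depth observed at each call, for illustration

def fact(n):
    # push an activation record when the function is called
    stack.append({"function": "fact", "param_n": n})
    trace.append(len(stack))
    result = 1 if n == 0 else n * fact(n - 1)
    stack.pop()          # pop the record when the function returns
    return result

print(fact(3), trace)    # 6 [1, 2, 3, 4]
```

The trace shows the stack growing to depth 4 (fact(3) down to fact(0)) and the final empty stack confirms every record was popped on return.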
5-MARKS QUESTIONS

1. Runtime Environment + Stack Allocation (Easy Explanation):

• Runtime Environment manages how a program runs (especially memory).

• It handles variables, functions, calls, returns, etc.

• Stack allocation is used to store function data.

Memory Layout:

| Code Area |

| Global Variables |

| Heap (Dynamic) |

| Stack (Functions) |

Each function call pushes a new block (activation record) to stack.


When function ends, it pops that record.

This way, function calls are managed safely and efficiently.

2. Accessing Non-Local Data (With & Without Nested Procedures):

With Nested Procedures:

• Use access link to reach outer function’s variables

• Link connects current function’s record to its parent function

Example:
In Pascal:

procedure A

procedure B

// B can access A's variables using access link

Without Nested Procedures:

• Use global memory or symbol table

• Functions can access variables declared globally


3. Activation Record and Recursion:

When recursion happens:

• Each call gets its own activation record

• Keeps local variables separate for each call

• Stores return address, so the program knows where to go back

Example:

int fact(int n) {

if (n==0) return 1;

return n * fact(n-1);

}
Stack will be like:

| fact(3) |

| fact(2) |

| fact(1) |

| fact(0) |

Each level has its own record, so recursion works correctly.

UNIT-8: Machine Code Generation & Optimization (Easy for End Sem)

2-MARKS QUESTIONS

1. What is machine code generation?


It is the process of converting intermediate code into actual machine code that a computer
can run.
2. Define machine-independent code optimization.
These are code improvements done without depending on the target machine.
Example: removing unnecessary code, constant folding, etc.

3. Give one example of a peephole optimization.


Example:

MOV R1, R2

MOV R2, R1 → the second move can be removed (redundant copy back)
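A one-window peephole pass for exactly this pattern can be sketched as below. The instruction syntax and the `peephole` helper are invented for illustration; real peephole optimizers match many more patterns.

```python
def peephole(instrs):
    """Drop a MOV that immediately copies a value straight back."""
    out = []
    for ins in instrs:
        if out:
            prev = out[-1]
            # "MOV a, b" followed by "MOV b, a": the second is redundant
            if prev.startswith("MOV") and ins.startswith("MOV"):
                p = [x.strip() for x in prev[3:].split(",")]
                c = [x.strip() for x in ins[3:].split(",")]
                if p == c[::-1]:
                    continue          # skip the redundant move
        out.append(ins)
    return out

print(peephole(["MOV R1, R2", "MOV R2, R1", "ADD R3, R2"]))
# ['MOV R1, R2', 'ADD R3, R2']
```

The pass slides a two-instruction window over the code, which is where the name "peephole" comes from.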

4. What is the purpose of optimization in compilation?


To make the program faster, smaller, and more efficient by removing or improving parts of
the code.

4-MARKS QUESTIONS

1. Simple Machine Code Generation (with example):

Intermediate Code:

t1 = a + b

t2 = t1 * c

Machine Code (Assembly style):

LOAD a, R1

ADD b, R1

MUL c, R1

STORE R1, t2

The code is converted into steps that the machine understands using registers.
2. Types of Machine-Independent Optimizations:

1. Constant Folding:
Replace constant operations at compile-time.
x = 2 + 3 → x = 5

2. Dead Code Elimination:
Remove code that is never used.
if (false) { print("Hello"); } → removed

3. Common Subexpression Elimination:
Reuse repeated expressions.
a = b + c
d = b + c → use stored value of b + c

4. Strength Reduction:
Replace expensive operations with cheaper ones.
x = y * 2 → x = y + y
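Two of these rewrites, constant folding and strength reduction, can be sketched on a tiny expression tree. The (op, left, right) tuple shape and the rule set are invented for this sketch; a real optimizer works on full IR with many more rules.

```python
def optimize(node):
    """Fold constant subtrees and rewrite y * 2 as y + y."""
    if not isinstance(node, tuple):
        return node                    # a leaf: constant or variable name
    op = node[0]
    l, r = optimize(node[1]), optimize(node[2])   # optimize children first
    # constant folding: both operands known at compile time
    if isinstance(l, int) and isinstance(r, int):
        return {"+": l + r, "*": l * r}[op]
    # strength reduction: y * 2  →  y + y
    if op == "*" and r == 2:
        return ("+", l, l)
    return (op, l, r)

print(optimize(("+", 2, 3)))              # 5
print(optimize(("*", "y", 2)))            # ('+', 'y', 'y')
print(optimize(("*", "y", ("+", 1, 1))))  # ('+', 'y', 'y')
```

The last example shows the passes composing: the inner 1 + 1 folds to 2, which then triggers the strength reduction.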

3. Phases in Code Generation:

1. Instruction Selection:
Choose the correct machine instructions for operations.

2. Register Allocation:
Assign variables to CPU registers.

3. Instruction Scheduling:
Arrange instructions to avoid delays and improve speed.

4. Machine-Dependent vs Machine-Independent Optimization:

Feature       Machine-Dependent                Machine-Independent

Based on      Specific machine architecture    General intermediate code

Example       Register usage optimization      Dead code elimination

Portability   Not portable                     Portable


5-MARKS QUESTIONS

1. Code Optimization – Importance:

Code Optimization improves the quality of code by:

• Reducing execution time

• Reducing memory usage

• Improving speed and performance

• Making efficient use of resources

It is applied after intermediate code generation and before machine code generation.

2. Machine-Independent Optimization Techniques (with Examples):

Constant Folding:

x = 10 * 5 → x = 50

Dead Code Elimination:

int x = 5;

x = 10; // ‘x = 5’ is dead → remove it

Loop Invariant Code Motion:


Move code that doesn’t change in loops to outside.

for(i=0; i<n; i++) { x = a + b; } → move x = a + b outside loop

Strength Reduction:

x=y*2→x=y+y

Common Subexpression Elimination:

a = b + c;

d = b + c; → reuse b+c result


3. Machine Code Generation for Arithmetic & Control Flow Statements:

Arithmetic Statement:

Intermediate Code:

t1 = a + b

t2 = t1 * c

Machine Code:

LOAD a, R1

ADD b, R1

MUL c, R1

STORE R1, t2

Control Flow Statement (if condition):

Intermediate Code:

if (a > b) goto L1

goto L2

L1: x = 1

goto L3

L2: x = 0

L3:

Machine Code:

LOAD a, R1

SUB b, R1

JGT R1, L1

JMP L2

L1: MOV 1, x

JMP L3

L2: MOV 0, x

L3:

Code is translated step by step with proper instructions and labels.


Prepared & Shared by NovaSkillHub

Powered by Career Growth with Technology and Real Education

Dear Students,

The following questions for Units 1 to 8 of Compiler Design are carefully prepared to help
you cover all key topics for your semester preparation. Make sure to go through them
thoroughly; they are designed to support your success!

Let’s grow together with smart learning and real education.

With Best Regards,

Ch Anil Kumar

NovaSkillHub
