DLCA - Solved - Question Bank-1

Q1) Explain Micro-instruction Sequencing Organization.

Need for designing the micro-instruction sequencing technique:

The first purpose is to minimize the size of the control memory, because the control memory is present inside the processor.
The second purpose is to execute the micro-instructions as fast as possible, which means the address of the next micro-instruction should be calculated as fast as possible.
The factors responsible for reducing the size of the control memory are:
Degree of parallelism, i.e. how many micro-operations can be performed simultaneously.
Representation/encoding of the control information.
The way of specifying the address of the next micro-instruction.
The number of micro-operations executed in the processor depends upon the processor architecture, and encoding of the instructions keeps them short. But the major concern is calculating the address of the next micro-instruction.
The address of the next micro-instruction can be –
The address of the next micro-instruction in the sequence i.e. one after the other.
Branch address(which can be conditional or unconditional).
Calculated on the basis of the opcode of the instruction.
The address of the first micro-instruction can be calculated once from the opcode of the
instruction which is present in the instruction register, then that address is loaded into CMAR
(Control Memory Address Register). CMAR passes the address to the decoder. The decoder
identifies the corresponding micro-instructions from the Control Memory.
A micro-instruction has two fields: a control field and an address field.
Control field –
Determines which control signals are to be generated.
Address field –
Determines the address of the next micro-instruction.
This address is further loaded into CMAR to fetch the next micro-instruction.
As we know, micro-instructions are usually not executed sequentially for long; typically, a branch occurs after every 4 or 5 micro-instructions. Therefore, our main motive is to make the branching algorithm better so that the address of the next micro-instruction can be calculated efficiently.
Therefore, micro-instruction sequencing is the method of determining the flow of the
microprogram.
So there are techniques which are based on the number of addresses utilised for sequencing:
1. Dual address field (two address fields in each microinstruction).
2. Single address field (one address field in each microinstruction).
3. Variable format microinstructions.
1. Dual address field –

[Figure: Dual address field]

In this approach, micro-instructions are not executed in a sequential manner.


The instruction register (IR) gives the address of the first micro-instruction.
Thereafter, each micro-instruction gives the address of the next micro-instruction.
If it is a conditional micro-instruction, it will contain two address fields.
One for the condition to be true and the other for false. Hence, it is called dual address field.
The multiplexer will decide the address that will be loaded into the control memory address
register (CMAR) based on the status flags.
Here, much control memory is wasted, because at least one of the address fields is not required in many (i.e. sequential or unconditional) micro-instructions.
2. Single address field –
With some modifications and added logic, the number of address fields is reduced to one. Here, a new register called the microprogram counter is used. In this case, the next microinstruction address can be the next sequential address, an address generated from the op-code, or the address stored in the address field of the microinstruction.

[Figure: Single address field]


In this approach, micro-instructions are executed in a sequential manner.
The instruction register (IR) gives the address of the first micro-instruction into CMAR.
Thereafter, the address is simply incremented.
Hence, every micro-instruction need not carry the address of the next one.
This is true so long as the micro-program is executed in a sequential manner.
For an unconditional branch, the micro-instructions include the branch address. This address will
be loaded into CMAR.
For a conditional branch, the micro-instruction contains the branch address for true condition. If
the condition is false, the current address in CMAR will be simply incremented.
This means even in the worst case, the micro-instruction will carry only one address.
Hence, it is called single address field.
The multiplexer will decide the address that will be loaded into the control memory address
register (CMAR) based on the status flags.
This method is commonly used. But the space provided by the address field in each micro-instruction is wasted whenever the instructions execute sequentially.
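A minimal Python sketch of the next-address selection in a single address field organization; all names (kind, address, opcode_map) are illustrative, not from a real design:

    # Model of the multiplexer feeding CMAR in a single-address-field sequencer.
    def next_address(cmar, microinstruction, condition_flag, opcode_map, opcode=None):
        kind = microinstruction["kind"]
        if kind == "sequential":
            return cmar + 1                             # simple increment
        if kind == "unconditional_branch":
            return microinstruction["address"]          # branch address field
        if kind == "conditional_branch":
            # branch address if the condition is true, else fall through
            return microinstruction["address"] if condition_flag else cmar + 1
        if kind == "opcode_dispatch":
            return opcode_map[opcode]                   # address derived from the opcode
        raise ValueError(kind)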

3. Variable address format –


In this technique, two formats are used, and one bit in the microinstruction is needed to differentiate between a control microinstruction and a branch microinstruction. The first format provides the control microinstruction (i.e. the bits are used to generate control signals), while the second format provides the branch logic and address (the branch can be conditional or unconditional).
When a microinstruction in the first format contains control signals, the next microinstruction address is calculated either by using the op-code in the instruction register or as the address of the next microinstruction in sequence. In this approach, an extra cycle is needed for a branch microinstruction.

[Figure: Variable instruction format]


Q2) Explain Flynn's Classification.
1. Single Instruction, Single Data (SISD): This category represents the traditional von Neumann architecture, where a single instruction stream operates on a single data stream. This is the simplest and most common type of architecture found in most general-purpose computers.

2. Single Instruction, Multiple Data (SIMD): SIMD architectures have a single instruction stream that operates on multiple data streams simultaneously. In this type of architecture, a single instruction is broadcast to multiple processing units, and each unit operates on a different piece of data. SIMD architectures are well-suited for parallel processing tasks that can be broken down into identical computations on multiple data elements.
3. Multiple Instruction, Single Data (MISD): MISD architectures involve multiple
instruction streams operating on a single data stream simultaneously. This type of
architecture is less common in practical implementations and is typically used in
specialized scenarios such as fault-tolerant systems or redundant processing.
4. Multiple Instruction, Multiple Data (MIMD): MIMD architectures support multiple
instruction streams operating on multiple data streams concurrently. Each instruction
stream can be executing different instructions, and the data streams can be independent
or shared between the instruction streams. MIMD architectures are commonly found in
modern multi-core processors and distributed computing systems, where different tasks
can be executed independently on separate processing units.

Q3) Explain Different Types Of Distributed And Centralized Bus Arbitration Methods.

• Distributed Bus Arbitration:

In distributed bus arbitration, the responsibility for resolving bus conflicts is distributed among the devices connected to the bus. Each device has the ability to arbitrate for bus control independently. Some common distributed bus arbitration methods include:

1) Daisy Chain: In this method, devices are connected in a linear fashion, forming a daisy chain. When a device needs access to the bus, it sends a request signal to the next device in the chain. The request signal propagates through the chain until it reaches a device that can grant the access. This device then asserts a grant signal and gains control of the bus.

2) Token Passing: This method uses a special control token that circulates among the devices connected to the bus. Only the device holding the token has the right to access the bus. When a device wants to access the bus, it waits until the token arrives, gains control, performs its operation, and then passes the token to the next device.

3) Random Selection: In this method, devices contend for bus control randomly. Each device generates a random number and compares it with the numbers generated by other devices. The device with the lowest or highest number (depending on the protocol) gains control of the bus. Random selection provides fairness among devices, but it can also result in unpredictable bus access times.

• Centralized Bus Arbitration:

In centralized bus arbitration, there is a dedicated controller or arbiter responsible for granting bus access to devices. The arbiter receives requests from devices and decides which device should be granted access to the bus. Some common centralized bus arbitration methods include:

1) Priority-Based: In this method, each device is assigned a priority level. The arbiter grants bus access to the device with the highest priority among the requesting devices. Priority levels can be fixed or dynamically assigned based on factors such as device type or criticality.

2) Round Robin: In this method, the arbiter grants bus access to devices in a sequential manner. Each device gets a turn to access the bus, and the arbiter cycles through the devices in a fixed order. This method ensures fairness, as each device gets an equal opportunity to access the bus (see the sketch after this list).

3) Reservation-Based: In this method, devices request bus access in advance by reserving specific time slots. The arbiter allocates time slots to devices based on their requests.
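A minimal Python sketch of the round-robin grant decision described above (the device count, request flags, and function name are illustrative):

    def round_robin_grant(requests, last_granted, n_devices):
        # Grant the bus to the next requesting device after last_granted.
        for offset in range(1, n_devices + 1):
            candidate = (last_granted + offset) % n_devices
            if requests[candidate]:
                return candidate
        return None  # no device is requesting

    # Devices 0 and 2 request; device 1 was granted last time -> device 2 wins.
    print(round_robin_grant([True, False, True], 1, 3))  # -> 2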
Q4) Write A Short Note On Cache Coherency.
1. In a multiprocessor system, data inconsistency may occur among adjacent levels or within the same level of the memory hierarchy. In a shared memory multiprocessor with a separate cache memory for each processor, it is possible to have many copies of any one instruction operand: one copy in the main memory and one in each cache memory. When one copy of an operand is changed, the other copies of the operand must be changed also.
Example: The cache and the main memory may have inconsistent copies of the same object.
2. As multiple processors operate in parallel and independently, multiple caches may possess different copies of the same memory block; this creates the cache coherence problem. Cache coherence is the discipline that ensures that changes in the values of shared operands are propagated throughout the system in a timely fashion.

Q5) Explain Cache Memory in Computer Organization.


• Cache memory is a special, very high-speed memory. The cache is a smaller and faster memory that stores copies of the data from frequently used main memory locations. There are several independent caches in a CPU, which store instructions and data. The most important use of cache memory is to reduce the average time to access data from the main memory.
• Characteristics of Cache Memory:-
o Cache memory is an extremely fast memory type that acts as a buffer between RAM
and the CPU.
o Cache Memory holds frequently requested data and instructions so that they are
immediately available to the CPU when needed.
o Cache memory is costlier than main memory or disk memory but more economical than
CPU registers.
o Cache Memory is used to speed up and synchronize with a high-speed CPU.

Q6) Differences between Computer Architecture and Computer Organization.

Computer Architecture | Computer Organization
1. Architecture describes what the computer does. | Organization describes how it does it.
2. Computer Architecture deals with the functional behavior of computer systems. | Computer Organization deals with structural relationships.
3. It deals with high-level design issues. | It deals with low-level design issues.
4. Architecture indicates its hardware. | Whereas Organization indicates its performance.
5. As a programmer, you can view architecture as a series of instructions, addressing modes, and registers. | The implementation of the architecture is called organization.
Q7) Define Instruction Cycle.

1. Fetch: In this phase, the processor fetches the instruction from memory. The program
counter (PC) holds the
address of the next instruction to be fetched. The processor reads the instruction from memory
using the address provided by the PC and stores it in an instruction register (IR).
2. Decode: In this phase, the processor decodes the fetched instruction. It interprets the
opcode (operation code) portion of the instruction to determine the type of operation to
be performed and identifies the operands or
registers involved.
3. Execute: In this phase, the processor performs the operation specified by the
instruction. It may involve calculations, data manipulation, logical operations, or
control flow modifications based on the decoded instruction.
4. Store: In this phase, the result of the execution is stored in the appropriate location. It
could be a register, memory location, or an I/O device, depending on the instruction and
the architecture of the computer system.
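As a rough illustration of these four phases, here is a toy fetch-decode-execute loop in Python; the three-instruction accumulator machine (LOAD/ADD/HALT) is invented for this sketch:

    memory = [("LOAD", 5), ("ADD", 3), ("HALT", 0)]
    pc, acc, running = 0, 0, True

    while running:
        ir = memory[pc]           # fetch: read the instruction at PC into IR
        pc += 1
        opcode, operand = ir      # decode: split into opcode and operand
        if opcode == "LOAD":      # execute, then store the result in the accumulator
            acc = operand
        elif opcode == "ADD":
            acc += operand
        elif opcode == "HALT":
            running = False

    print(acc)  # -> 8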
Q8) Explain Different Addressing Modes.
1. Immediate addressing: The operand is a constant value or immediate data
directly embedded within the instruction itself. It is useful for operations that
involve constants or immediate values.
2. Register addressing: The operand is the content of a specific register. This mode allows
direct access to registers in the processor, which are typically fast storage locations.
3. Direct addressing: The operand is the actual memory address where the data is
stored. The processor directly accesses the memory location specified in the
instruction.
4. Indirect addressing: The operand is a memory address that contains the actual memory
address where the data is stored. The processor accesses the memory location indirectly
by first obtaining the address from the specified memory location.
5. Indexed addressing: The operand is calculated by adding a constant offset or index
value to a base address. It is commonly used in array or table access, where the index
determines the position of the element.
6. Relative addressing: The operand is a memory address calculated relative to the current
program counter (PC) or instruction pointer. It is often used in branch instructions to
specify the target address relative to the current instruction.
7. Stack addressing: The operand is implicitly specified from the top of the stack. It is commonly used in stack-based architectures, where operands are pushed onto and popped from the stack.
8. Base/Offset addressing: The operand is obtained by adding a constant offset to a
base address specified in a register or memory location. It is useful for accessing
data structures like arrays, records, or objects.
9. Indirect indexed addressing: This mode combines indirect and indexed addressing.
The operand is obtained by first obtaining a memory address indirectly and then
adding an index value to that address.
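A small Python sketch of how a few of these modes compute their operand; the memory contents, register file, and mode names are made up for illustration:

    memory = [0] * 16
    memory[4] = 99        # data stored at address 4
    memory[7] = 4         # a pointer (address of the data) stored at address 7
    registers = {"R1": 99}

    def operand(mode, value, index=0):
        if mode == "immediate": return value                  # the value itself
        if mode == "register":  return registers[value]       # content of a register
        if mode == "direct":    return memory[value]          # value is the data address
        if mode == "indirect":  return memory[memory[value]]  # value points to the address
        if mode == "indexed":   return memory[value + index]  # base + index
        raise ValueError(mode)

    print(operand("direct", 4))      # -> 99
    print(operand("indirect", 7))    # -> 99
    print(operand("indexed", 2, 2))  # -> 99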

Q9) Explain IEEE 754 Floating Point Representation.


• The IEEE Standard for Floating-Point Arithmetic (IEEE 754) is a technical standard for floating-point computation, established in 1985 by the Institute of Electrical and Electronics Engineers (IEEE). The standard addressed many problems found in the diverse floating-point implementations that made them difficult to use reliably and reduced their portability. IEEE Standard 754 floating point is the most common representation today for real numbers on computers, including Intel-based PCs, Macs, and most Unix platforms.
• IEEE 754 has 3 basic components:
1. The Sign of Mantissa –
This is as simple as the name. 0 represents a positive number while 1 represents a negative
number.
2. The Biased exponent –
The exponent field needs to represent both positive and negative exponents. A bias is added to the
actual exponent in order to get the stored exponent.
3. The Normalised Mantissa –
The mantissa is the part of a number in scientific notation, or of a floating-point number, consisting of its significant digits. Here we have only 2 digits, i.e. 0 and 1. So a normalised mantissa is one with only a single 1 to the left of the binary point.
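The three components can be inspected in Python with the standard struct module; the bit slicing below follows the single-precision 1 + 8 + 23 layout (the sample value -6.25 is arbitrary):

    import struct

    bits = struct.unpack(">I", struct.pack(">f", -6.25))[0]
    sign = bits >> 31               # 1 bit: sign of mantissa
    exponent = (bits >> 23) & 0xFF  # 8 bits: biased exponent (bias = 127)
    mantissa = bits & 0x7FFFFF      # 23 bits: fraction; hidden leading 1 is not stored
    print(sign, exponent - 127, hex(mantissa))  # -> 1 2 0x480000, i.e. -1.5625 x 2^2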
Q10) Explain JK Flip Flop and SR Flip Flop.
➢ JK Flip-Flop:
1. A JK flip-flop is a clocked sequential logic device that can store one bit of binary data. It
has two inputs: J (set) and K (reset), and two outputs: Q (output) and Q̅ (complement of
the output). The JK flip-flop operates based on the current state and the input values, as
well as the rising or falling edge of a clock signal. Here are the main characteristics of a
JK flip-flop:
2. When both J and K inputs are 0, the flip-flop remains in its current state (hold condition).
3. When J and K inputs are both 1, the flip-flop toggles, meaning the output switches to its
opposite state. If the output was 0, it becomes 1, and vice versa.
4. When J is 1 and K is 0, the flip-flop sets to 1 (output is forced to 1).
5. When J is 0 and K is 1, the flip-flop resets to 0 (output is forced to 0).
SR Flip-Flop:
1. An SR flip-flop (Set-Reset flip-flop) is another type of sequential logic circuit used
for storing and controlling binary data. It also has two inputs: S (set) and R (reset), and
two outputs: Q and Q̅ . Here are the key characteristics of an SR flip-flop:
2. When both S and R inputs are 0, the flip-flop remains in its current state (hold condition).
3. When S is 1 and R is 0, the flip-flop sets to 1 (output is forced to 1).
4. When S is 0 and R is 1, the flip-flop resets to 0 (output is forced to 0).
5. When both S and R inputs are 1, the flip-flop is in an indeterminate or forbidden state,
and its behavior is unpredictable. This situation is called a "race condition" and should
be avoided in practical designs.
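A behavioral Python sketch of the two truth tables above (a functional model, not a gate-level design; the function names are invented):

    def jk_next(q, j, k):
        if (j, k) == (0, 0): return q      # hold
        if (j, k) == (0, 1): return 0      # reset
        if (j, k) == (1, 0): return 1      # set
        return 1 - q                       # (1, 1): toggle

    def sr_next(q, s, r):
        if (s, r) == (1, 1):
            raise ValueError("forbidden state")  # race condition, avoid in practice
        if s: return 1                     # set
        if r: return 0                     # reset
        return q                           # hold

    q = 0
    for j, k in [(1, 0), (0, 0), (1, 1)]:
        q = jk_next(q, j, k)
        print(j, k, "->", q)               # set -> 1, hold -> 1, toggle -> 0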

Q11) Draw Flowchart Of Booth Algorithm For Multiplication.


• The Booth algorithm gives a procedure for multiplying binary integers in signed 2's complement representation efficiently, i.e., with fewer additions/subtractions required. It operates on the fact that strings of 0's in the multiplier require no addition but just shifting, and a string of 1's in the multiplier from bit weight 2^k down to weight 2^m can be treated as 2^(k+1) − 2^m. As in all multiplication schemes, the Booth algorithm requires examination of the multiplier bits and shifting of the partial product. Prior to the shifting, the multiplicand may be added to the partial product, subtracted from the partial product, or left unchanged.
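A Python sketch of the register-level steps just described, using the usual A, Q, and Q-1 registers (the 4-bit width and the function name are illustrative):

    def booth_multiply(multiplicand, multiplier, bits=4):
        # Booth's algorithm on `bits`-wide two's-complement operands.
        mask = (1 << bits) - 1
        sign = 1 << (bits - 1)
        M, neg_M = multiplicand & mask, (-multiplicand) & mask
        A, Q, q_1 = 0, multiplier & mask, 0
        for _ in range(bits):
            if (Q & 1, q_1) == (1, 0):    # start of a string of 1s: A = A - M
                A = (A + neg_M) & mask
            elif (Q & 1, q_1) == (0, 1):  # end of a string of 1s: A = A + M
                A = (A + M) & mask
            # arithmetic right shift of the combined register A.Q.q_1
            q_1 = Q & 1
            Q = ((Q >> 1) | ((A & 1) << (bits - 1))) & mask
            A = ((A >> 1) | (A & sign)) & mask  # replicate the sign bit
        product = (A << bits) | Q
        return product - (1 << 2 * bits) if product & (1 << (2 * bits - 1)) else product

    print(booth_multiply(3, -4))  # -> -12
    print(booth_multiply(-3, 5))  # -> -15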
Q12) SRAM VS DRAM

SRAM (Static RAM) stores each bit in a flip-flop of about six transistors; it needs no refreshing, is faster, but is costlier and less dense, so it is used for cache memory. DRAM (Dynamic RAM) stores each bit as a charge on a capacitor; it must be refreshed every few milliseconds, is slower, but is cheaper and denser, so it is used for main memory.
Q13) Explain the microprogrammed control unit versus the hardwired control unit, and their advantages and disadvantages.

A control unit whose binary control variables are stored in memory is known as a
microprogrammed control unit.
In Microprogrammed Control, the control information is stored in the control memory and
is programmed to initiate the required sequence of micro-operations.
By creating a definite collection of signals at each system clock beat, the controller generates the instructions to be executed. Each of these output signals causes a single micro-operation, such as a register transfer. As a result, defined micro-operations that can be preserved in memory are formed from the sets of control signals.
Each bit in the microinstruction is connected to a single control signal. The control signal is active when its bit is set and inactive when it is cleared. The internal control memory can store a sequence of these microinstructions. A microprogram-controlled computer's control unit is a computer within a computer.
The block diagram of a Microprogrammed Control Organization is shown below.

[Figure: Microprogrammed Control Organization]

The microprogrammed control performs the following steps:-


1.) To execute any instruction, the CPU divides it into a sequence of consecutive micro-operations. This set of operations is called a microinstruction, and control signals are required for the sequential micro-operations to complete.
2.) Control signals saved in the ROM are used to execute the instructions on the data path. These control signals can control the micro-operations associated with a microinstruction at any time step.
3.) The address of the following microinstruction is generated.
4.) The last two steps are repeated till all of the microinstructions associated with the
instruction in the set are executed.
The microprogram counter register supplies the address to the control ROM. The microprogram counter obtains its input from a multiplexer that selects among the output of an address ROM, the incremented current address, and the address saved in the next-address field of the current microinstruction.
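A tiny Python sketch of these steps, stepping a control store through CMAR; the microprogram, signal names, and next-address fields are all invented for illustration:

    CONTROL_MEMORY = {
        0: {"signals": ["PC_to_MAR", "Read"], "next": 1},
        1: {"signals": ["MDR_to_IR"],         "next": 2},
        2: {"signals": ["Decode"],            "next": 0},  # loop back to fetch
    }

    cmar = 0                          # control memory address register
    for _ in range(6):                # run a few clock beats
        micro = CONTROL_MEMORY[cmar]  # the decoder selects this microinstruction
        print("assert:", ", ".join(micro["signals"]))
        cmar = micro["next"]          # address field -> CMAR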
The advantages and disadvantages are as follows.
Control Unit:
The unit which directs the operation of the processor and is a part of the CPU is known as the Control Unit. It generates control signals for the operations of a computer.
Types of Control Unit:
There are two types of control units:
• Hardwired control unit
• Micro-programmed control unit
Hardwired control unit:
To interpret the instructions and generate control signals for them, this control unit uses fixed logic circuits. To generate signals, the fixed logic circuits use the contents of the control step counter, the Instruction Register (IR), condition code flags, and some external input signals such as interrupt signals. The figure below shows the architecture of the hardwired control unit.

[Figure: Typical Hardwired Control Unit]


The fixed logic circuit in the diagram is a combinational circuit made from decoders &
encoders. It generates the output based on the state of its input(s). The decoder decodes the
instruction loaded in IR (Instruction Register) & generates the signal that serves as an input
to the encoder. Also, external input & conditional codes act as an input to the encoder. The
encoder then accordingly generates the control signals based on the inputs. After the execution of each instruction, another signal, the end signal, is generated, which resets the state of the control step counter and makes it ready for the next instruction.
Advantages :
• Because of the use of combinational circuits to generate signals, Hardwired
Control Unit is fast.
• The delay that can occur in the generation of control signals depends on the number of gates.
• It can be optimized to produce the fast mode of operation.
• Faster than micro- programmed control unit.
• It does not require control memory.
Disadvantages

• The complexity of the design increases as more control signals need to be generated (requiring more encoders & decoders).
• Modifications in the control signals are very difficult because it requires
rearranging of wires in the hardware circuit.
• Adding a new feature is difficult & complex.
• Difficult to test & correct mistakes in the original design.
• It is Expensive.
Q14) Difference Between Encoder and Decoder
Combinational logic is the concept in which two or more input states define one or more output states. The Encoder and Decoder are combinational logic circuits, in which combinational logic is implemented with the help of Boolean algebra.

To encode something is to convert an unambiguous piece of information into a form of code that is not so clearly understood, and the device which performs this operation is termed an Encoder.

Encoder

An Encoder is a device that converts an active data signal into a coded message format, or a device that converts an analogue signal to digital signals. It is a combinational circuit that converts binary information in the form of 2^n input lines into n output lines, which represent an n-bit code for the input. When an input signal is applied to an encoder, the logic circuitry involved within it converts that particular input into a coded binary output.

To decode is to perform the reverse operation: converting a code back into an unambiguous form, and the device which performs this operation is termed a Decoder.

Decoder

A decoder is also a combinational circuit like an encoder, but its operation is exactly the reverse of the encoder's. A decoder is a device that generates the original signal as output from the coded input signal, converting n lines of input into 2^n lines of output. An AND gate can be used as the basic decoding element because it produces a high output only when all inputs are high.

[Figure: Decoder]

Encoder vs Decoder

ENCODER | DECODER
Encoder circuit basically converts the applied information signal into a coded digital bit stream. | Decoder performs the reverse operation and recovers the original information signal from the coded bits.
In case of an encoder, the applied signal is the active signal input. | Decoder accepts coded binary data as its input.
The number of inputs accepted by an encoder is 2^n. | The number of inputs accepted by a decoder is only n.
The number of output lines for an encoder is n. | The number of output lines of a decoder is 2^n.
The encoder generates coded data bits as its output. | The decoder generates an active output signal in response to the coded data bits.
The operation performed is simple. | The operation performed is complex.
The encoder circuit is installed at the transmitting end. | The decoder circuit is installed at the receiving side.
OR gate is the basic logic element used in it. | AND gate along with NOT gate is the basic logic element used in it.
It is used in E-mail, video encoders, etc. | It is used in microprocessors, memory chips, etc.
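A behavioral Python sketch of a 4-to-2 encoder and its inverse 2-to-4 decoder using one-hot lines (a simplified model that ignores invalid inputs):

    def encoder_4to2(lines):
        # lines is one-hot, e.g. [0, 0, 1, 0] -> code 2
        return lines.index(1)

    def decoder_2to4(code):
        # reverse operation: code 2 -> [0, 0, 1, 0]
        return [1 if i == code else 0 for i in range(4)]

    print(encoder_4to2([0, 0, 1, 0]))  # -> 2
    print(decoder_2to4(2))             # -> [0, 0, 1, 0]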
Q15) Von Neumann Model

Von Neumann computer architecture:

The Von Neumann computer architecture was proposed in 1945; the design was later known as the Von Neumann architecture.

Historically there have been 2 types of Computers:

1) Fixed Program Computers – Their function is very specific and they could not be reprogrammed, e.g. calculators.
2) Stored Program Computers – These can be programmed to carry out many different tasks; applications are stored on them, hence the name.

Modern computers are based on a stored-program concept introduced by John Von


Neumann. In this stored-program concept, programs and data are stored in the same
memory. This novel idea meant that a computer built with this architecture would be much
easier to reprogram.

The basic structure is like this,

It is also known as an ISA (Instruction Set Architecture) computer and has three basic units:

• The Central Processing Unit (CPU)
• The Main Memory Unit
• The Input/Output Device

Let us consider them in detail.

1. Central Processing Unit-

The central processing unit is an electronic circuit used for executing the instructions of a computer program.

It has the following major components:

1. Control Unit (CU)
2. Arithmetic and Logic Unit (ALU)
3. A variety of Registers

• Control Unit –
A control unit (CU) handles all processor control signals. It directs all input and
output flow, fetches code for instructions, and controls how data moves around
the system.

• Arithmetic and Logic Unit (ALU) –


The arithmetic logic unit is that part of the CPU that handles all the calculations
the CPU may need, e.g. Addition, Subtraction, Comparisons. It performs Logical
Operations, Bit Shifting Operations, and Arithmetic operations.
Figure – Basic CPU structure, illustrating ALU

1. Registers – Registers refer to high-speed storage areas in the CPU. The data
processed by the CPU are fetched from the registers. There are different types of
registers used in architecture :-

• Accumulator: Stores the results of calculations made by the ALU. It holds the intermediate results of arithmetic and logical operations, acting as a temporary storage location.

• Program Counter (PC): Keeps track of the memory location of the next
instructions to be dealt with. The PC then passes this next address to the
Memory Address Register (MAR).

• Memory Address Register (MAR): It stores the memory locations of


instructions that need to be fetched from memory or stored in memory.

• Memory Data Register (MDR): It stores instructions fetched from


memory or any data that is to be transferred to, and stored in, memory.

• Current Instruction Register (CIR): It stores the most recently fetched instruction while it is waiting to be decoded and executed.
• Instruction Buffer Register (IBR): The instruction that is not to be
executed immediately is placed in the instruction buffer register IBR.

2. Buses – Data is transmitted from one part of a computer to another, connecting all
major internal components to the CPU and memory, by the means of Buses.
Types:

• Data Bus: It carries data among the memory unit, the I/O devices, and the processor.
• Address Bus: It carries the address of data (not the actual data) between memory and processor.
• Control Bus: It carries control commands from the CPU (and status signals from other devices) in order to control and coordinate all the activities within the computer.

3. Input/Output Devices – Program or data is read into main memory from the input
device or secondary storage under the control of CPU input instruction. Output
devices are used to output information from a computer. If some results are
evaluated by the computer and it is stored in the computer, then with the help of
output devices, we can present them to the user.

Von Neumann bottleneck –


Whatever we do to enhance performance, we cannot get away from the fact that instructions can only be executed one at a time and can only be carried out sequentially. Both of these factors hold back the performance of the CPU. This is commonly referred to as the 'Von Neumann bottleneck'. We can provide a Von Neumann processor with more cache, more RAM, or faster components, but if real gains are to be made in CPU performance, then a fundamental re-examination of the CPU organization needs to take place.

This architecture is very important and is used in our PCs and even in Super Computers.

Q16) Memory Mapping and Its Types

The translation between the logical address space and the physical memory is known as Memory Mapping. The objectives of memory mapping are to translate logical addresses to physical addresses, to aid in memory protection, and to enable better management of memory resources.
During cache mapping, the block is not moved from the main memory; the main memory block is simply copied to the cache. Cache memory generally operates in one of the following configurations:

1. Direct mapping
2. Fully associative mapping
3. Set associative mapping

1) Direct Mapping

In a direct-mapped cache memory, each block is mapped to exactly one location in the cache.

The cache line to which a particular block of main memory maps is given by: Cache line number = (Block Address of Main Memory) modulo (Number of lines in Cache).

[Figure: Direct Mapping of Cache]

The direct-mapped cache is organized like rows in a table with three columns; the main memory address is divided into bits for the Offset, Index, and Tag fields. The size of each field depends on the capacity of the memory and the size of the blocks in the cache.
The least significant w bits identify a word within a block of main memory. The tag, corresponding to the remaining high-order bits, is used to determine the proper block of main memory. The line (index) field selects the cache line to be accessed, out of the total number of lines available according to the capacity of the cache.

Each cache row holds a data block (cache line) containing the actual data fetched and stored, a tag with all or part of the address of the data that was fetched, and a flag bit that indicates whether the row entry holds valid data.
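A small Python sketch of splitting a word address into tag, line, and word fields for a direct-mapped cache (the block and cache sizes are chosen arbitrarily):

    WORDS_PER_BLOCK = 4   # -> 2 offset bits
    CACHE_LINES = 8       # -> 3 index bits

    def split_address(addr):
        word = addr % WORDS_PER_BLOCK
        block = addr // WORDS_PER_BLOCK
        line = block % CACHE_LINES   # cache line = block address mod number of lines
        tag = block // CACHE_LINES   # remaining high-order bits
        return tag, line, word

    print(split_address(100))  # block 25 -> (tag 3, line 1, word 0)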

2) Associative Mapping

In this type of mapping, any main memory block can go into any line of the cache. So we have to use a proper replacement policy to replace a block in the cache if the required block of main memory is not present in the cache. Here, the memory address is divided into two fields: the word field identifies which word in the block is needed, and the tag field identifies the block. It is considered to be the fastest and the most flexible form of cache mapping.

[Figure: Associative Mapping of Cache]


3) Set-associative Mapping

In this mapping technique, blocks of cache are grouped to form a set and a block of main
memory can go into any block of a specific set.

[Figure: Set Associative Mapping of Cache]

This form of mapping removes the drawbacks of direct mapping. In set-associative mapping, each index address in the cache can hold two or more blocks of main memory that map to the same set. Set-associative cache mapping is a combination of the direct and associative cache mapping techniques.

This also reduces the searching overhead present in associative mapping: searching is restricted to the number of lines in a set instead of the total number of blocks.
Q17) Gray Code

Gray code, also known as reflected binary code, is a binary numeral system where two successive values differ in only one bit. Gray code is useful in minimizing errors in digital communications and is commonly used in analog-to-digital converters and error correction in digital systems.

For example, the 4-bit binary numbers 0011 (3) and 0100 (4) have the Gray codes 0010 and 0110 respectively; each pair of successive Gray codes differs in only one bit.
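The conversion can be computed as n XOR (n >> 1), since each Gray bit is the XOR of adjacent binary bits; a short Python check:

    def to_gray(n):
        return n ^ (n >> 1)

    for n in range(4):
        print(f"{n:04b} -> {to_gray(n):04b}")
    # 0000->0000, 0001->0001, 0010->0011, 0011->0010: one bit changes each step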
Q18) BCD (Binary-Coded Decimal)

BCD is a class of binary encodings of decimal numbers where each decimal digit
is represented by a fixed number of binary digits, usually four or eight. The most
common encoding is the 4-bit encoding, also known as 8421 encoding.

For example, 9 is encoded as 1001 and 2 as 0010.

Thus, the number 92 in BCD would be represented as 1001 0010.

Q19) Excess-3 Code

Excess-3 is a binary-coded decimal code that is derived from the natural BCD code by
adding 3 (0011 in binary) to each decimal digit and then encoding the result in binary.

For example, for the decimal digit 7 we add 3 to get 10, which is 1010 in binary. So the decimal number 7 in Excess-3 code is 1010.

These codes are widely used in various digital systems and applications to facilitate error
checking and digital communication.
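Both encodings work digit by digit, which a short Python sketch makes explicit (the function names are illustrative):

    def to_bcd(n):
        return " ".join(f"{int(d):04b}" for d in str(n))

    def to_excess3(n):
        return " ".join(f"{int(d) + 3:04b}" for d in str(n))

    print(to_bcd(92))     # -> 1001 0010
    print(to_excess3(7))  # -> 1010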
Q20) List and Explain Characteristics and Hierarchy of Memory
Memory Hierarchy is an enhancement to organize the memory such that it can minimize the
access time. The Memory Hierarchy was developed based on a program behavior known as
locality of references. The figure below clearly demonstrates the different levels of the
memory hierarchy

Memory Hierarchy is one of the most essential concepts in computer memory, as it helps in optimizing the memory available in the computer. There are multiple levels present in the memory, each having a different size, cost, etc. Some types of memory, like cache and main memory, are faster than other types but have smaller sizes and are more costly, whereas other kinds of memory have higher storage capacity but are slower. Access speed also differs across the types of memory: some have faster access, whereas some have slower access.
Types of Memory Hierarchy
This Memory Hierarchy Design is divided into 2 main types:
• External Memory or Secondary Memory: Comprising magnetic disk, optical disk, and magnetic tape, i.e. peripheral storage devices which are accessible by the processor via an I/O module.
• Internal Memory or Primary Memory: Comprising main memory, cache memory, and CPU registers. This is directly accessible by the processor.

[Figure: Memory Hierarchy Design]

1. Registers

Registers are small, high-speed memory units located in the CPU. They
are used to store the most frequently used data and instructions.
Registers have the fastest access time and the smallest storage capacity,
typically ranging from 16 to 64 bits.

2. Cache Memory

Cache memory is a small, fast memory unit located close to the CPU.
It stores frequently used data and instructions that have been recently
accessed from the main memory. Cache memory is designed to
minimize the time it takes to access data by providing the CPU with
quick access to frequently used data.
3. Main Memory
Main memory, also known as RAM (Random Access Memory), is the
primary memory of a computer system. It has a larger storage capacity
than cache memory, but it is slower. Main memory is used to store data
and instructions that are currently in use by the CPU.
Types of Main Memory
• Static RAM: Static RAM stores the binary information in flip-flops, and the information remains valid as long as power is supplied. It has a faster access time and is used in implementing cache memory.
• Dynamic RAM: It stores the binary information as a charge
on the capacitor. It requires refreshing circuitry to maintain
the charge on the capacitors after a few milliseconds. It
contains more memory cells per unit area as compared to
SRAM.

4. Secondary Storage

Secondary storage, such as hard disk drives (HDD) and solid-state


drives (SSD), is a non-volatile memory unit that has a larger storage
capacity than main memory. It is used to store data and instructions that
are not currently in use by the CPU. Secondary storage has the slowest
access time and is typically the least expensive type of memory in the
memory hierarchy.

5. Magnetic Disk

Magnetic disks are simply circular plates fabricated from metal or plastic and coated with a magnetizable material. The magnetic disks work at high speed inside the computer, and they are frequently used.

6. Magnetic Tape

Magnetic Tape is simply a magnetic recording device that is covered


with a plastic film. It is generally used for the backup of data. In the
case of a magnetic tape, the access time for a computer is a little slower
and therefore, it requires some amount of time for accessing the strip.
Characteristics of Memory Hierarchy
• Capacity: It is the global volume of information the
memory can store. As we move from top to bottom in the
Hierarchy, the capacity increases.
• Access Time: It is the time interval between the read/write
request and the availability of the data. As we move from top
to bottom in the Hierarchy, the access time increases.
• Performance: Earlier when the computer system was
designed without a Memory Hierarchy design, the speed gap
increased between the CPU registers and Main Memory due
to a large difference in access time. This results in lower
performance of the system and thus, enhancement was
required. This enhancement was made in the form of Memory
Hierarchy Design because of which the performance of the
system increases. One of the most significant ways to
increase system performance is minimizing how far down the
memory hierarchy one has to go to manipulate data.
• Cost Per Bit: As we move from bottom to top in the
Hierarchy, the cost per bit increases i.e. Internal Memory is
costlier than External Memory.
Advantages of Memory Hierarchy
• It helps in organizing and managing the memory in a better way.
• It helps in spreading the data all over the computer system.
• It saves the consumer’s price and time.
System-Supported Memory Standards

Level          | 1 (Register)            | 2 (Cache)            | 3 (Main Memory)          | 4 (Secondary Memory)
Size           | < 1 KB                  | < 16 MB              | < 16 GB                  | > 100 GB
Implementation | Multi-ported registers  | On-chip SRAM         | DRAM (capacitor memory)  | Magnetic
Access Time    | 0.25 ns to 0.5 ns       | 0.5 ns to 25 ns      | 80 ns to 250 ns          | 50 lakh ns
Bandwidth      | 20,000 to 1,00,000 MB/s | 5,000 to 15,000 MB/s | 1,000 to 5,000 MB/s      | 20 to 150 MB/s
Managed by     | Compiler                | Hardware             | Operating System         | Operating System
Backed by      | Cache                   | Main Memory          | Secondary Memory         | –
Q21) Pipeline Hazards and Dependencies
Dependencies and Data Hazards in a Pipeline in Computer Organization

In this section, we will learn about dependencies in a pipelined processor, which is described as
follows:

Dependencies in pipeline Processor

The pipeline processor usually has three types of dependencies, which are described as follows:

1. Structural dependencies
2. Data dependencies
3. Control dependencies

Because of these dependencies, the stalls will be introduced in a pipeline. A stall can be described
as a cycle without new input in the pipeline. In other words, we can say that the stall will happen
when the later instruction depends on the output of the earlier instruction.

Structural dependencies

Structural dependency usually arises because of a resource conflict in the pipeline. A resource conflict is a situation where, in the same cycle, more than one instruction tries to access the same resource, such as the ALU (arithmetic logic unit), memory, or a register.

Example:

Instructions / Cycle | 1       | 2       | 3       | 4       | 5
I1                   | IF(Mem) | ID      | EX      | Mem     |
I2                   |         | IF(Mem) | ID      | EX      |
I3                   |         |         | IF(Mem) | ID      | EX
I4                   |         |         |         | IF(Mem) | ID

The above table contains the four instructions I1, I2, I3, and I4, and five cycles. In cycle 4, there is a resource conflict because I1 and I4 try to access the same resource; in our case, the resource is memory. The solution to this problem is to keep the instruction waiting until the required resource becomes available. Because of this wait, a stall is introduced into the pipeline, like this:

Instructions / Cycle | 1       | 2       | 3       | 4       | 5       | 6       | 7       | 8
I1                   | IF(Mem) | ID      | EX      | Mem     | WB      |         |         |
I2                   |         | IF(Mem) | ID      | EX      | Mem     | WB      |         |
I3                   |         |         | IF(Mem) | ID      | EX      | Mem     | WB      |
I4                   |         |         |         | –       | –       | –       | IF(Mem) |

Solutions for Structural dependency

With the help of a hardware mechanism, we can minimize structural dependency stalls in a pipeline. The mechanism is known as renaming.

Renaming: In this mechanism, the memory is divided into two independent modules, known as Code Memory (CM) and Data Memory (DM). The CM contains all the instructions, while the DM contains all the operands required by the instructions.

Instructions / Cycle | 1      | 2      | 3      | 4      | 5      | 6      | 7
I1                   | IF(CM) | ID     | EX     | DM     | WB     |        |
I2                   |        | IF(CM) | ID     | EX     | DM     | WB     |
I3                   |        |        | IF(CM) | ID     | EX     | DM     | WB
I4                   |        |        |        | IF(CM) | ID     | EX     | DM
I5                   |        |        |        |        | IF(CM) | ID     | EX
I6                   |        |        |        |        |        | IF(CM) | ID
I7                   |        |        |        |        |        |        | IF(CM)

Control Dependency (Branch Hazards)

Control dependency occurs when control-transfer instructions are executed. These instructions can be JMP, CALL, BRANCH, and many more. On many instruction architectures, when the processor wants to add a new instruction to the pipeline, it does not yet know the target address of the branch. Because of this drawback, unwanted instructions may be inserted into the pipeline.

For example:

For this, we will assume a program and take the following sequence of instructions like this:

100: I1
101: I2
102: I3
.
.
250: BI1

Expected Output is described as follows:

I1 → I2 → BI1

Note: After the ID stage, the processor is able to know the target address of JMP instruction.

Instructions / Cycle | 1  | 2  | 3            | 4   | 5   | 6
I1                   | IF | ID | EX           | MEM | WB  |
I2                   |    | IF | ID (PC: 250) | EX  | MEM | WB
I3                   |    |    | IF           | ID  | EX  | MEM
BI1                  |    |    |              | IF  | ID  | EX

The output sequence is described as follows:

I1 → I2 → I3 → BI1

So the above example shows that the expected output and output sequence are not equal to each
other. It shows that the pipeline is not correctly implemented.

We can correct this problem by stopping instruction fetch until the target address of the branch instruction is known. For this, we introduce a delay slot until we get the target address, as described in the following table:
Instructions / Cycle | 1  | 2  | 3            | 4   | 5   | 6
I1                   | IF | ID | EX           | MEM | WB  |
I2                   |    | IF | ID (PC: 250) | EX  | MEM | WB
Delay                | –  | –  | –            | –   | –   | –
BI1                  |    |    |              | IF  | ID  | EX

The output sequence is described as follows:

I1 → I2 → Delay (Stall) → BI1

In the above example, we can see that no operation is performed in the delay slot, so the output sequence is now effectively the same as the expected output. But because of this slot, a stall is introduced into the pipeline.

Solution for Control Dependency

In the case of control dependency, we can eliminate the stalls in the pipeline with a method known as branch prediction. The prediction about which way the branch will go is made at the first stage, and a correct branch prediction incurs zero branch penalty.

Branch Penalty: The branch penalty is the number of stalls introduced during a branch operation in the pipeline.

Data Dependency (Data Hazards)

For this, we will assume an ADD instruction S and three registers, described as follows:

1. S: ADD R1, R2, R3
2. Addresses read by S = I(S) = {R2, R3}
3. Addresses written by S = O(S) = {R1}

Instruction S2 depends on instruction S1 when:

1. [I(S1) ∩ O(S2)] ∪ [O(S1) ∩ I(S2)] ∪ [O(S1) ∩ O(S2)] ≠ ∅

The above condition is known as the Bernstein condition. Three cases arise from it, described as follows:
Flow (data) dependence: O(S1) ∩ I(S2) ≠ ∅, S1 → S2. S1 writes a value that S2 reads afterwards, so S2 may read only after S1 has written.

Anti-dependence: I(S1) ∩ O(S2) ≠ ∅, S1 → S2. S1 must read its value before S2 overwrites it.

Output dependence: O(S1) ∩ O(S2) ≠ ∅, S1 → S2. Both S1 and S2 write to the same memory location.
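These set-based checks are easy to express in Python; the sketch below classifies the dependence between two instructions given their read and write sets (it uses the ADD/SUB pair from the example that follows):

    def classify(reads1, writes1, reads2, writes2):
        hazards = []
        if writes1 & reads2:  hazards.append("flow (RAW)")
        if reads1 & writes2:  hazards.append("anti (WAR)")
        if writes1 & writes2: hazards.append("output (WAW)")
        return hazards or ["independent"]

    # I1: ADD R1, R2, R3 reads {R2, R3}, writes {R1}
    # I2: SUB R4, R1, R2 reads {R1, R2}, writes {R4}
    print(classify({"R2", "R3"}, {"R1"}, {"R1", "R2"}, {"R4"}))  # -> ['flow (RAW)']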

For example: Here, we will assume that we have two instructions I1, and I2, like this:

I1: ADD R1, R2, R3


I2: SUB R4, R1, R2

The condition of data dependency occurs when the above instructions I1 and I2 are executed in a pipelined processor: I2 tries to read R1 before I1 writes it. As a result, instruction I2 incorrectly gets the old value, as described in the following table:

Instructions / Cycle | 1  | 2  | 3              | 4
I1                   | IF | ID | EX             | DM
I2                   |    | IF | ID (old value) | EX

Here we use operand forwarding to minimize the stalls caused by data dependency.

Operand Forwarding: In this forwarding, we will use the interface registers which exist
between the stages. These registers are used to contain the intermediate output. With the help of
intermediate registers, the dependent instruction is able to directly access the new value.

To explain this, we will take the same example:

I1: ADD R1, R2, R3


I2: SUB R4, R1, R2

Instructions / Cycle | 1  | 2  | 3  | 4
I1                   | IF | ID | EX | DM
I2                   |    | IF | ID | EX
Data Hazards

Data hazards occur due to data dependency. A data hazard arises when data is modified in different stages of a pipeline by instructions that exhibit data dependency. When instructions read or write registers that are also used by other instructions in flight, instruction hazards occur. Because of a data hazard, there will be a delay in the pipeline. Data hazards are basically of three types:

1. RAW
2. WAR
3. WAW

To understand these hazards, we will assume we have two instructions I1 and I2, in such a way
that I2 follows I1. The hazards are described as follows:

RAW:

A RAW hazard is read as 'Read After Write'. It is also known as flow/true data dependency. A RAW hazard occurs when a later instruction tries to read an operand before an earlier instruction writes it. The condition for detecting a RAW hazard is that O(n) and I(n+1) have at least one operand in common.

For example:

I1: add R1, R2, R3


I2: sub R5, R1, R4

There is a RAW hazard because the subtraction instruction reads the output of the addition. The hazard for instructions 'add R1, R2, R3' and 'sub R5, R1, R4' is described as follows:

Instructions / Cycle | 1  | 2  | 3  | 4   | 5   | 6
I1                   | IF | ID | EX | MEM | WB  |
I2                   |    | IF | ID | EX  | MEM | WB

The RAW hazard is very common.

WAR
A WAR hazard is read as 'Write After Read'. It is also known as anti-data dependency. A WAR hazard occurs when a later instruction tries to write an operand before an earlier instruction reads it. The condition for detecting a WAR hazard is that I(n) and O(n+1) have at least one operand in common.

For example:

The dependency is described as follows:

add R1, R2, R3


sub R2, R5, R4

Here the pair of instructions creates a WAR hazard because the subtraction instruction writes R2, which is read by the addition. In a reasonable (in-order) pipeline, the WAR hazard is very uncommon or impossible. The hazard for instructions 'add R1, R2, R3' and 'sub R2, R5, R4' is described as follows:

Instructions / Cycle | 1  | 2  | 3  | 4   | 5   | 6
I1                   | IF | ID | EX | MEM | WB  |
I2                   |    | IF | ID | EX  | MEM | WB

By the time an instruction enters the write-back stage of the pipeline, all the previous instructions in the program have already passed through the register-read stage and read their input values. The writing instruction can therefore write its destination register without causing any problem. WAR hazards cause fewer problems than WAW hazards because, in a pipeline, the register-read stage occurs before the write-back stage.

WAW

A WAW hazard is read as 'Write After Write'. It is also known as output data dependency. A WAW hazard occurs when a later instruction tries to write an operand before an earlier instruction writes it. The condition for detecting a WAW hazard is that O(n) and O(n+1) have at least one operand in common.

For example:

The dependency is described as follows:

add R1, R2, R3


sub R1, R2, R4
Here the pair of instructions creates a WAW hazard because both write to the same register. The hazard for instructions 'add R1, R2, R3' and 'sub R1, R2, R4' is described as follows:

Instructions / Cycle | 1  | 2  | 3  | 4   | 5    | 6    | 7
I1                   | IF | ID | EX | MEM | MEM2 | MEM3 | WB
I2                   |    | IF | ID | EX  | MEM  | WB   |

The output register of an instruction is written in the write-back stage of the pipeline. Instructions with a WAW hazard must enter the write-back stage in the same order in which they appear in the program, so that their results are written into the register in the right order. A processor gains performance over strictly sequential execution by allowing instructions to execute in different orders, but it must then preserve this write ordering.

Effects of WAR and WAW

The WAR and WAW hazards occur because the processor contains a finite number of registers. For this reason, these hazards are also known as name dependencies.

If the processor contained an infinite number of registers, it would use a different register for the output of each instruction, and there would be no chance of WAR or WAW hazards occurring.

The WAR and WAW hazards cause no delay if a processor uses the same pipeline for all the instructions and executes them in the same order in which they appear in the program; this follows from the way instructions flow through the pipeline.

Q22) Delayed Branch and Branch Prediction


In computer architecture, handling branches effectively is crucial for
maintaining the efficiency of instruction pipelines. Two key techniques used to
address the challenges posed by branch instructions are delayed branching and
branch prediction. Here’s a detailed comparison between these two approaches:
Delayed Branch
Concept:
- In delayed branching, the branch instruction's effect is delayed by a fixed
number of cycles, known as the delay slots. Instructions following the branch
are executed before the branch is taken.
- The idea is to fill these delay slots with useful instructions to avoid pipeline
stalls.
Advantages:
1. Simplicity: The hardware implementation is straightforward since it does not require complex prediction mechanisms.
2. Compiler Optimization: Compilers can optimize code by reordering instructions to fill the delay slots with useful work, potentially reducing the performance penalty of branches.

Disadvantages :
1. Limited Flexibility : Effectiveness heavily depends on the ability to find
suitable instructions to fill the delay slots, which isn't always possible.
2. Increased Compiler Complexity : Compilers must perform additional work
to identify and move instructions into delay slots, which can increase complexity
and compilation time.
3. Wasted Slots : If no useful instructions can be found for the delay slots,
these slots may be filled with NOPs (no-operations), leading to wasted cycles.

Branch Prediction
Concept :
- Branch prediction involves guessing the outcome of a branch instruction before
it is known for sure and speculatively executing subsequent instructions based
on the prediction.
- Modern processors use sophisticated branch prediction algorithms, including
static and dynamic techniques, to improve prediction accuracy.

Advantages :
1. Increased Parallelism : Enables the pipeline to stay full by speculatively
executing instructions, potentially leading to significant performance gains.
2. Adaptive to Workloads : Dynamic branch predictors can adapt to the
branching patterns of different workloads, improving accuracy over time.
3. Reduced Pipeline Stalls : By predicting the branch outcome and continuing
execution, branch prediction can minimize the number of pipeline stalls and
keep the CPU busy.
Disadvantages:
1. Complexity: Implementing accurate and efficient branch predictors adds significant complexity to the CPU design.
2. Misprediction Penalty: Incorrect predictions lead to flushing the pipeline and re-executing instructions, which can incur a significant performance penalty.
3. Power Consumption : Additional logic for branch prediction consumes more
power, which is a critical consideration for modern processors, especially in
mobile and embedded systems.

Comparison
Performance :
- Delayed Branch : Performance improvement is limited by the compiler's
ability to fill delay slots.
- Branch Prediction : Can lead to substantial performance improvements,
especially with accurate predictors and deep pipelines.

Implementation Complexity :
- Delayed Branch : Simpler hardware but requires sophisticated compiler
support.
- Branch Prediction : Complex hardware design but offers greater flexibility and
adaptability.

Efficiency :
- Delayed Branch : Efficiency depends on the presence of suitable instructions
for delay slots.
- Branch Prediction : Efficiency depends on the accuracy of the predictor and
the ability to minimize mis-prediction penalties.

Adaptability :
- Delayed Branch : Less adaptable to changing workloads and branching
patterns.
- Branch Prediction: Highly adaptable, especially with dynamic predictors that learn and adjust based on runtime behavior.

In summary, delayed branching is a simpler, compiler-dependent technique that


can be effective in some scenarios but is limited by the need for suitable
instructions to fill delay slots. Branch prediction, while more complex and
power-intensive, generally offers better performance and adaptability, making it
the preferred choice in modern high-performance processors.
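As a concrete instance of a dynamic scheme, here is a Python sketch of a 2-bit saturating-counter predictor, one common textbook design (the initial state and outcome sequence are arbitrary):

    def predict_and_update(counter, taken):
        # counter in 0..3; predict taken when counter >= 2, then update
        prediction = counter >= 2
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
        return prediction, counter

    c = 2  # start in the 'weakly taken' state
    for outcome in [True, True, False, True]:
        p, c = predict_and_update(c, outcome)
        print("predicted", p, "actual", outcome)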

Q23) Draw Four Stage Instruction Pipeline

• A program consists of several number of instructions.


• These instructions may be executed in the following two ways-

• Non-Pipelined Execution
• Pipelined Execution
1. Non-Pipelined Execution-

In non-pipelined architecture,
• All the instructions of a program are executed sequentially one after the other.
• A new instruction executes only after the previous instruction has executed completely.
• This style of executing the instructions is highly inefficient.

Example-

Consider a program consisting of three instructions.


In a non-pipelined architecture, these instructions execute one after the other as-

If time taken for executing one instruction = t, then-

Time taken for executing ‘n’ instructions = n x t

2. Pipelined Execution-
In pipelined architecture,
• Multiple instructions are executed in parallel.
• This style of executing the instructions is highly efficient.

Now, let us discuss instruction pipelining in detail.

Instruction Pipelining-

Instruction pipelining is a technique that implements a form of parallelism called instruction-level parallelism within a single processor.

• A pipelined processor does not wait until the previous instruction has executed completely.
• Rather, it fetches the next instruction and begins its execution.

Pipelined Architecture-

In pipelined architecture,
• The hardware of the CPU is split up into several functional units.
• Each functional unit performs a dedicated task.
• The number of functional units may vary from processor to processor.
• These functional units are called as stages of the pipeline.
• Control unit manages all the stages using control signals.
• There is a register associated with each stage that holds the data.
• There is a global clock that synchronizes the working of all the stages.
• At the beginning of each clock cycle, each stage takes the input from its register.
• Each stage then processes the data and feed its output to the register of the next stage.
Four-Stage Pipeline-

In four stage pipelined architecture, the execution of each instruction is completed in following
4 stages-

1. Instruction Fetch (IF)
2. Instruction Decode (ID)
3. Instruction Execute (IE)
4. Write Back (WB)

To implement four stage pipeline,


• The hardware of the CPU is divided into four functional units.
• Each functional unit performs a dedicated task.

Stage-01:

At stage-01,
o First functional unit performs instruction fetch.
o It fetches the instruction to be executed.

Stage-02:

At stage-02,
o Second functional unit performs instruction decode.
o It decodes the instruction to be executed.

Stage-03:

At stage-03,
o Third functional unit performs instruction execution.
o It executes the instruction.

Stage-04:

At stage-04,
o Fourth functional unit performs write back.
o It writes back the result obtained after executing the instruction.

Execution-

In pipelined architecture,
o Instructions of the program execute in parallel.
o When one instruction goes from the nth stage to the (n+1)th stage, another instruction goes from the (n-1)th stage to the nth stage.

Phase-Time Diagram-
o A phase-time diagram shows the execution of instructions in the pipelined architecture.
o The following diagram shows the execution of three instructions in a four-stage pipeline architecture.

Time taken to execute three instructions in four stage pipelined architecture = 6 clock cycles.

NOTE-

In non-pipelined architecture,
Time taken to execute three instructions would be

= 3 x Time taken to execute one instruction

= 3 x 4 clock cycles
= 12 clock cycles
Clearly, pipelined execution of instructions is far more efficient than non-pipelined
execution.
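The timing arithmetic above generalizes to k stages and n instructions: a k-stage pipeline needs k + (n - 1) cycles, versus k x n cycles without pipelining. A short Python check of the example:

    def pipelined_cycles(k, n):
        return k + (n - 1)

    def non_pipelined_cycles(k, n):
        return k * n

    print(pipelined_cycles(4, 3))      # -> 6 clock cycles
    print(non_pipelined_cycles(4, 3))  # -> 12 clock cycles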
