COMPUTER ORGANISATION
● Line size means block size!
● Performance gain = speedup = T_old / T_new (time on the original, slower system divided by time on the enhanced, faster one).
● RISC is preferable for pipelined CPU as simple instructions are there in RISC.
● Memory-mapped I/O: memory and I/O share a single address space, but some addresses are assigned to I/O devices.
● I/O-mapped (isolated) I/O: I/O devices have an address space separate from memory, and separate control signals are generated for I/O and memory accesses.
● Total time required in programmed I/O = time to read the status register + data transfer time.
● In interleaved mode of DMA transfer, the CPU hands control of the system buses to the DMA controller only when the CPU is idle, so the percentage of time the CPU is blocked is 0%.
● In burst mode, the percentage of time the CPU is blocked = Ty / (Tx + Ty), where Ty is the data transfer time and Tx is the data preparation time. Tx depends on the internal speed of the device; Ty depends on the speed of memory.
● In cycle-stealing mode, the next data is prepared in the I/O buffer in parallel with the ongoing transfer, so the percentage of time the CPU is blocked = Ty / Tx.
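The blocked-time fractions above can be checked with a small sketch (the Tx/Ty numbers below are assumptions for illustration, not from any particular question):

```python
# CPU blocked-time fraction under different DMA transfer modes.
# Tx = data preparation time (device speed), Ty = data transfer time (memory speed).
def burst_blocked(tx: float, ty: float) -> float:
    # In burst mode the CPU is blocked for Ty out of every Tx + Ty.
    return ty / (tx + ty)

def cycle_stealing_blocked(tx: float, ty: float) -> float:
    # In cycle stealing, preparation of the next datum overlaps the transfer,
    # so the CPU is blocked for Ty out of every Tx (assumes Ty <= Tx).
    return ty / tx

tx, ty = 40, 10  # assumed: 40 us preparation, 10 us transfer
print(f"burst:          {burst_blocked(tx, ty):.0%}")           # 20%
print(f"cycle stealing: {cycle_stealing_blocked(tx, ty):.0%}")  # 25%
```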
● A RAM chip has a capacity of 2048 words of 16 bits each (2K × 16). How many 3 × 8 decoders with enable lines are needed to construct a 64K × 32 RAM from 2K × 16 chips?
Here the decoders are used only for the vertical (row-selection) arrangement.
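A worked sketch of that count, assuming the standard construction (a two-level tree of 3 × 8 decoders used only for row selection, one decoder at the second level driving the enables of the first level):

```python
import math

# Build a 64K x 32 RAM from 2K x 16 chips; count the 3x8 decoders needed.
chip_words, chip_width = 2 * 1024, 16
target_words, target_width = 64 * 1024, 32

cols = target_width // chip_width      # chips per row: 32 / 16 = 2
rows = target_words // chip_words      # rows of chips: 64K / 2K = 32
row_addr_bits = int(math.log2(rows))   # 5 address lines select a row

leaf = rows // 8                       # 3x8 decoders at the leaf level: 4
enable = math.ceil(leaf / 8)           # one more 3x8 drives their enable lines
print(cols, rows, row_addr_bits, leaf + enable)  # 2 32 5 5
```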
● SRAM : Low Idle power consumption but high operational power consumption.
● DRAM : High idle power consumption but low operational power consumption.
● Throughput = number of tasks / total time = n / ((k + n − 1) · tp).
In the ideal case (n → ∞), throughput = 1 / tp.
● Instruction fetch: bring the instruction from memory into the instruction register inside the CPU.
● Instruction decode: decode which operation it is, and perform address calculation (for ALU operations the operand addresses are calculated; for branch instructions the target address is calculated).
● Operand fetch: copy the operands into the CPU registers.
● Execution: ALU operations, load and store operations, condition check and updating of the PC (in the case of branch-type instructions).
● Write back: result written back to the destination. (Write back is not needed for branch-type and store instructions; a load does write its result back to a register.)
● Efficiency of a pipeline = S / K,
where K is the number of pipeline stages and S is the speedup. We know that S_max = K, so
efficiency = speedup achieved / maximum possible speedup = S / K.
Throughput of a pipeline = number of tasks performed / total time taken.
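The three pipeline metrics above can be tied together in one small sketch (the n, k, tp values are assumptions for illustration):

```python
# Pipeline metrics for n tasks on a k-stage pipeline with stage time tp.
def pipeline_time(n: int, k: int, tp: float) -> float:
    # First task takes k stage-times; each subsequent task adds one more.
    return (k + n - 1) * tp

def throughput(n: int, k: int, tp: float) -> float:
    return n / pipeline_time(n, k, tp)

def speedup(n: int, k: int, tp: float) -> float:
    # Non-pipelined time is n * k * tp.
    return (n * k * tp) / pipeline_time(n, k, tp)

n, k, tp = 100, 4, 10e-9   # assumed: 100 tasks, 4 stages, 10 ns per stage
S = speedup(n, k, tp)
print(f"throughput = {throughput(n, k, tp):.3g} tasks/s")
print(f"speedup = {S:.3f}, efficiency = {S / k:.3f}")  # S approaches k as n grows
```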
● Structural hazards: two different instructions cannot execute in the same pipeline stage at the same time (two instructions require the same resource).
● Data hazards: RAW, WAR, WAW.
● Control hazards: stalls caused by branches and the like.
● Control bus: unidirectional or bidirectional? Why?
● Operand forwarding cannot handle all RAW hazards.
● Register renaming can eliminate ALL WAR and WAW hazards.
● Control-hazard penalties can be reduced by dynamic branch prediction, but the penalties cannot be completely eliminated.
● All pipeline concepts : https://gateoverflow.in/1391/gate-cse-2005-question-68
● Data hazards and data dependencies are two different things.
https://gateoverflow.in/342932/overflow-series-computer-organization-architecture-question
● The trap is a non-maskable interrupt, as it deals with the process currently running on the processor. A trap is initiated by the executing process itself, e.g. due to lack of data required for its completion; hence the trap is non-maskable.
● Remember: flag bits in the control word always need ⌈log₂ n⌉ bits,
irrespective of horizontal or vertical microprogramming.
● Pipeline latency is the time taken by the first instruction to complete all
the stages.
● If the question mentions nothing about which mode to use, then by default we consider burst mode. Moreover, if the question says "constantly transferring data to memory using DMA" or "the disk is actively transferring 100% of the time", then we also need to consider burst mode.
● Software interrupts are always vectored interrupts (every system call has a specific identifier), whereas hardware interrupts can be vectored or non-vectored.
Good questions from test series :
https://gateoverflow.in/380482/go-classes-test-series-2023-and-architecture-test-question
https://gateoverflow.in/380480/go-classes-test-series-2023-and-architecture-test-question
https://gateoverflow.in/381863/go-classes-test-series-2023-and-architecture-test-question
https://gateoverflow.in/381851/go-classes-test-series-2023-and-architecture-test-question
https://gateoverflow.in/381865/go-classes-test-series-2023-and-architecture-test-question
https://gateoverflow.in/381875/go-classes-test-series-2023-and-architecture-test-question
https://gateoverflow.in/342961/overflow-series-computer-organization-architecture-question
https://gateoverflow.in/342949/overflow-series-computer-organization-architecture-question
https://gateoverflow.in/342928/overflow-series-computer-organization-architecture-question
https://gateoverflow.in/342924/overflow-series-computer-organization-architecture-question
https://gateoverflow.in/342922/overflow-series-computer-organization-architecture-a
https://gateoverflow.in/342918/overflow-series-computer-organization-architecture-question
https://gateoverflow.in/342916/overflow-series-computer-organization-architecture-question
https://gateoverflow.in/342908/overflow-series-computer-organization-architecture-question
Question I could not understand at all:
GATE CSE 1988 | Question: 9iii - GATE Overflow for GATE CSE
I thought it would be base-register addressing:
(D) PC-relative addressing is the best option. For base-register addressing we would have to change the address in the base register, while with PC-relative addressing absolutely no change in the code is needed.
Good MSQ question:
Absolute addressing is another name for direct addressing mode; the answer is (B).
Was a little confused about option 2:
In auto-increment addressing mode, the base address is incremented after the operand fetch. This is useful for fetching elements of an array. But it has no effect on self-relocating code (code that can be loaded at any address), since that works on the basis of an initial base address.
An additional ALU is desirable for better execution, especially with pipelining, but never a necessity. The amount of the increment depends on the size of the data item accessed, as there is no need to fetch only part of the data. So the answer must be (C) only.
Theory:
Flags are tested during conditional calls, and they are not affected or changed by the jump!
Nice concept:
Because the CPU is much faster than memory, fetching instructions from memory would take a considerable amount of time. So prefetching the instructions to be executed can save a considerable amount of waiting time.
RISC systems commonly use the expanding-opcode technique to get fixed-size instructions.
Instruction size should be considered only while calculating cycles for the instruction fetch, not for execution:
GATE CSE 2004 | Question: 64 - GATE Overflow for GATE CSE
RFE (Return From Exception) is a privileged trap instruction that is executed when an exception occurs, and a further exception is not allowed to occur during its execution. (D) is the correct option.
A trap instruction is used to switch from user mode to kernel mode.
1. It must be a trap instruction - Definitely it must be a trap instruction as RFE is an
explicit privileged instruction causing a switch from kernel to user mode
2. It must be a privileged instruction - Yes, because RFE can be executed only in
supervisor/kernel mode
3. An exception cannot be allowed to occur during execution of an RFE instruction
- Yes, because as soon as a trap/interrupt starts being processed all other
traps/interrupts are disabled until the current instruction execution is complete.
Good question, try once !
https://gateoverflow.in/3476/Gate-it-2007-question-41
Little endian and Big Endian :
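The heading above has no body in these notes; as a quick refresher sketch of the two byte orders, using Python's struct module:

```python
import struct

# Byte layout of the 32-bit value 0x12345678 in both byte orders.
value = 0x12345678
little = struct.pack("<I", value)  # least-significant byte stored first
big = struct.pack(">I", value)     # most-significant byte stored first
print(little.hex())  # 78563412
print(big.hex())     # 12345678
```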
Knew the concept but still got it wrong :(
Indexed mode cannot be used for position-independent or relocatable code, because the address is inside the instruction itself!
Effective Address = Content of Index Register + Address part of the instruction
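A one-line check of the effective-address formula (the register and address-field values are assumed for illustration):

```python
# Indexed addressing: EA = content of index register + address part of instruction.
def effective_address(index_reg: int, addr_field: int) -> int:
    return index_reg + addr_field

# assumed: index register holds 0x0100, instruction's address field is 0x2000
print(hex(effective_address(0x0100, 0x2000)))  # 0x2100
```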
Forgot to take the next microinstruction address into consideration !
Remember: flag bits in the control word always need ⌈log₂ n⌉ bits, irrespective of horizontal or vertical microprogramming.
Nano-programmed control unit:
Time taken in a nano-programmed control unit = 2 × control-memory access time.
In order to execute an instruction, the processor fetches the instruction from program memory and places it in the instruction register (IR). The instruction is then decoded by the instruction decoder, a combinational circuit, which generates the corresponding control signals.
During the execution of a microprogram, the µPC is always incremented every
time a new microinstruction is fetched from the microprogram memory, except
in the following situations:
i. When an End instruction is encountered, the µPC is loaded with the address
of the first CW in the microprogram for the instruction fetch cycle.
ii. When a new instruction is loaded into the IR, the µPC is loaded with the
starting address of the microprogram for that instruction.
iii. When a branch microinstruction is encountered, and the branch condition
is satisfied, the µPC is loaded with the branch address.
A structural hazard is the condition in which multiple instructions already in the pipeline need the same resource at the same time; for that section of the pipeline the instructions then execute sequentially rather than in parallel. Structural hazards are sometimes also called resource hazards. Control hazards arise when the pipeline makes a wrong decision on a branch prediction.
On calculating the speedup, I got 4.5. Now the twist is which options should be chosen?
The ideal speedup is 4.5, and in all cases (including hazards, etc.) the speedup cannot exceed 4.5. Therefore you have to choose options (a) and (b).
Not able to solve it (I thought the question should have given some more information about the non-pipelined processor).
Clock frequency becomes low means the time period of clock becomes high. When this time
period increases beyond the time period in which the volatile memory contents must be
refreshed, we lose those contents. So, clock frequency can't go below this value.
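The refresh constraint above can be sketched numerically (every number below is an assumption for illustration, not from the question):

```python
# DRAM rows must each be refreshed within the refresh interval. If the
# processor clock drives the refresh logic, lowering the clock frequency
# stretches the time between refreshes; below some frequency a row's
# interval exceeds the maximum allowed and the contents are lost.
rows = 8192                # assumed number of DRAM rows
refresh_interval = 64e-3   # assumed: every row refreshed within 64 ms
cycles_per_refresh = 4     # assumed clock cycles to refresh one row

# Refreshing all rows needs rows * cycles_per_refresh cycles per interval:
min_frequency = rows * cycles_per_refresh / refresh_interval
print(f"minimum clock frequency = {min_frequency / 1e6:.3f} MHz")  # 0.512 MHz
```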
Theoretical pipelining question: GATE CSE 2015 Set 3 | Question: 47 - GATE Overflow for GATE CSE
Good concept :
Very good question For revising data path : GATE CSE 2001 | Question: 2.13 - GATE Overflow for
GATE CSE
Doubtful answer check once : GATE CSE 2005 | Question: 79 - GATE Overflow for GATE CSE
Know the concept but not able to solve before revising concepts :
GATE CSE 2000 | Question: 12 - GATE Overflow for GATE CSE
Know more about Hazards ;
1. Data hazard
2. Control hazard
3. Structural hazard as only one ALU is there
So, (D)
Must read the solution :
GATE CSE 2005 | Question: 68 - GATE Overflow for GATE CSE
Learn about delayed branch : GATE CSE 2008 | Question: 76 - GATE
Overflow for GATE CSE
GATE CSE 2013 | Question: 45 - GATE Overflow for GATE CSE
Here the WO of I4 and the IF of I9 can be overlapped (I missed this in this question).
(B) I and III
I is false: bypassing cannot handle all RAW hazards. Consider an instruction that depends on the result of a LOAD: the LOAD produces its value at the Memory Access (MA) stage, so the data is not available directly at the Execute stage.
II is true: register renaming can eliminate all WAR hazards.
III is false: dynamic branch prediction can reduce control-hazard penalties but cannot completely eliminate them.
Doubtful answer, Learn about split phase : GATE CSE 2015 Set 2 | Question:
44 - GATE Overflow for GATE CSE
I don’t know how to calculate throughput increase :
GATE CSE 2016 Set 1 | Question: 32 - GATE Overflow for GATE CSE
What is Minimum average latency ?
GATE CSE 2015 Set 3 | Question: 51 - GATE Overflow for GATE CSE
Not able to solve before revising :
GATE CSE 2016 Set 2 | Question: 33 - GATE Overflow for GATE CSE
Extra cycles pipelining question :
GATE CSE 2018 | Question: 50 - GATE Overflow for GATE CSE
Read up on hazards and their elimination methods!
Got confused with the MIPS formula:
Didn't know what to do with the 40% load and store instructions; I wrongly multiplied the cache-cycles formula by 0.4 :(
Total instruction fetches = n.
Apart from this, the 40% load and store instructions each need one extra memory access. Therefore total memory accesses = n + 0.4n = 1.4n.
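The accounting above in one line (the instruction count is an assumed example value):

```python
# Memory accesses for n instructions when a fraction of them are loads/stores:
# every instruction is fetched once; each load/store adds one data access.
def total_memory_accesses(n: float, ls_fraction: float = 0.4) -> float:
    return n + ls_fraction * n

print(total_memory_accesses(1000))  # 1400.0, i.e. 1.4n for n = 1000
```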
New formula:
If we have a pipeline of p stages, what is the maximum speedup that can be achieved?
Answer: p.
The maximum speedup obtainable from pipelining is bounded above by the number of stages. This is achievable only in theory; in practice there are many kinds of dependencies and hardware limitations.
Gain some knowledge about load and store instructions:
In a synchronous pipeline, cycle time = max(all stage delays) + buffer delay.
In an asynchronous pipeline, the stage delays are simply added.
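The two timing rules as a sketch (the stage delays below are assumed illustration values):

```python
# Timing for synchronous vs. asynchronous pipelines.
def synchronous_cycle(stage_delays, buffer_delay):
    # The common clock period must cover the slowest stage plus the latch delay.
    return max(stage_delays) + buffer_delay

def asynchronous_time(stage_delays):
    # No common clock: each stage hands off when done, so delays simply add.
    return sum(stage_delays)

stages = [5, 6, 11, 8]  # assumed stage delays in ns
print(synchronous_cycle(stages, 1))  # 12
print(asynchronous_time(stages))     # 30
```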
Had to count the dependencies of the store instruction!
That's how you have to count dependencies in the BNE branch condition:
GATE CSE 2001 | Question: 9 - GATE Overflow for GATE CSE
In the last part of the question: every block in a set has a tag field, and in our case every set has 4 blocks; therefore we multiply by 4.
A mux is not required in a direct-mapped cache because there is only one comparator (IF IT IS 2-WAY SET ASSOCIATIVE THEN THERE WILL BE 2 COMPARATORS AND WE NEED A 2-TO-1 MUX TO DECIDE HIT/MISS), so mux delay = 0.
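The comparator/mux rule generalizes to k-way associativity; a small sketch (the function name is mine, not from any question):

```python
import math

# Hit-detection hardware for a k-way set-associative cache:
# k comparators (one per block in the set) and, for k > 1, a k-to-1 mux
# (driven by the comparator outputs) to select the data of the hit way.
def hit_hardware(k: int):
    comparators = k
    mux_select_bits = math.ceil(math.log2(k)) if k > 1 else 0  # 0 means no mux
    return comparators, mux_select_bits

print(hit_hardware(1))  # (1, 0): direct mapped, no mux needed
print(hit_hardware(2))  # (2, 1): 2 comparators, 2-to-1 mux
```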
Check what 128 lines means here: are there 128 cache blocks or 128 sets?
GATE CSE 2007 | Question: 10 - GATE Overflow for GATE CSE
Something new that you haven't studied till now:
Please read this: https://gateoverflow.in/14480/formula-write-back-write-through-access-time-parallel-serial
Write allocate and no write allocate: https://www.youtube.com/watch?v=JETrCQrRtIU
Think about the significance of 1100H and why it is given in the question.
GATE CSE 2007 | Question: 81 - GATE Overflow for GATE CSE
Tough questions: GATE CSE 2008 | Question: 72 - GATE Overflow for GATE CSE
GATE CSE 2008 | Question: 73 - GATE Overflow for GATE CSE
What I did wrong here: since L1 is making the request, we need to consider the L1 block size, not the L2 block size. I mistakenly used the L2 block size.
GATE CSE 2010 | Question: 48 - GATE Overflow for GATE CSE
Forgot to multiply by k: GATE CSE 2013 | Question: 20 - GATE Overflow for GATE CSE
Something new : GATE CSE 2016 Set 2 | Question: 32 - GATE Overflow for GATE CSE
Conceptual questions:
Basic write-through concept: writes have no hit rate; memory is always accessed.
While calculating the effective hit rate in a write-through cache, only read operations are considered.
Note that the RAM capacity given here is in bits, not bytes.
Here I added the 32-byte overhead, i.e. bytes/sector = 512 + 32. But the sector's capacity does not grow; the question asks how much capacity is left after some bytes are used up in formatting, which means you have to subtract the 32, i.e. 512 − 32 is right.
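The subtract-not-add point in numbers (the sector count is an assumed example; 512 and 32 are from the note above):

```python
# Formatted capacity after per-sector overhead: subtract the overhead,
# don't add it (the raw sector size is fixed; formatting eats into it).
sectors = 1000    # assumed sector count for illustration
raw_bytes = 512
overhead = 32
usable = sectors * (raw_bytes - overhead)
print(usable)  # 480000 usable bytes, not 1000 * (512 + 32)
```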
Very good question :
See what is the significance of giving word size in the question.
Also pay close attention to the block size!!
T_block is taken as 8 × 80.
DMA I/O questions that I was not able to solve : GATE CSE 2004 | Question: 68 - GATE Overflow
for GATE CSE
GATE CSE 2005 | Question: 70 - GATE Overflow for GATE CSE
GATE CSE 2021 Set 2 | Question: 20 - GATE Overflow for GATE CSE
GATE IT 2004 | Question: 51 - GATE Overflow for GATE CSE
GATE CSE 1996 | Question: 25 - GATE Overflow for GATE CSE
INTR is a signal: if interrupts are enabled and the microprocessor receives a high INTR signal, it activates the INTA signal; another request cannot be accepted while the CPU is busy servicing the interrupt. Hence option (A) is correct.
GATE CSE 2005 | Question: 69 - GATE Overflow for GATE CSE
The minimum performance gain for interrupt mode happens for the smallest unit of data transfer, which here is 1 byte.
State whether the following statements are TRUE or FALSE
1)In a microprocessor-based system, if a bus (DMA) request and an interrupt request
arrive simultaneously, the microprocessor attends first to the bus request.
The HOLD input has a higher priority than the INTR or NMI interrupt inputs.
So the answer is true
2)Data transfer between a microprocessor and an I/O device is usually faster in
memory-mapped-I/O scheme than in I/O-mapped -I/O scheme.
True
It will take extra time in I/O-mapped I/O because of the separate control signals.
3)The data transfer between memory and I/O devices using programmed I/O is faster
than interrupt-driven I/O.
False, because in programmed I/O the CPU keeps checking the I/O device's status as directed by the program (busy waiting).
Answer is (B).
In synchronous I/O, the process performing the I/O operation is placed in the blocked state until the I/O operation completes. An ISR is invoked after the completion of the I/O operation and moves the process from the blocked state to the ready state.
In asynchronous I/O, a handler function is registered when the I/O operation is initiated. The process is not placed in the blocked state and continues to execute its remaining instructions. When the I/O operation completes, a signalling mechanism is used to notify the process that the data is available.
Look at the significance of the unsigned EVEN INTEGER in this question:
GATE IT 2006 | Question: 41 - GATE Overflow for GATE CSE
A stack pointer is a register that stores the address of the most recent entry on the stack.
A nested function (or nested procedure/subroutine) is a function defined within another function, the enclosing function. So if there is no stack-pointer register, no nested subroutine calls are possible; hence option (B) is correct.
Note: here they did not ask for the 'tag bits'; they asked for the number of tags!