2 Pipelining (II) Consider Two Pipelined Machines

The document discusses two pipelined machines, Machine I and Machine II, implementing the MIPS ISA with five pipeline stages. Machine I relies on the compiler for instruction ordering and does not implement hardware interlocking, while Machine II uses hardware data forwarding to resolve dependencies. The document includes a code segment and poses questions regarding instruction execution, code size, cycle count, and performance comparison between the two machines.

Uploaded by

yatharth.anand
Question: 2 Pipelining (II)

Consider two pipelined machines implementing MIPS ISA, Machine I and Machine II:

Both machines have the following five pipeline stages (very similarly to the basic 5-stage pipelined MIPS processor we discussed in lectures), and one ALU:

1. Fetch (one clock cycle)
2. Decode (one clock cycle)
3. Execute (one clock cycle)
4. Memory (one clock cycle)
5. Write-back (one clock cycle).

Machine I does not implement interlocking in hardware. It assumes all instructions are independent and relies on the compiler to order instructions such that there is sufficient distance between dependent instructions. The compiler either moves other independent instructions between two dependent instructions, if it can find such instructions, or otherwise, inserts nops. Assume internal register file forwarding (an instruction writes into a register in the first half of a cycle and another instruction can correctly access the same register in the next half of the cycle). Assume that the processor predicts all branches as always-taken.

Machine II implements data forwarding in hardware. On detection of a flow dependence, it forwards an operand from the memory stage or from the write-back stage to the execute stage. The load instruction (lw) can only be forwarded from the write-back stage because data becomes available in the memory stage but not in the execute stage like for the other instructions. Assume internal register file forwarding (an instruction writes into a register in the first half of a cycle and another instruction can access the same register in the next half of the cycle). The compiler does not reorder instructions. Assume that the processor predicts all branches as always-taken.

Consider the following code segment:

    Copy: lw   $2, 100($5)
          sw   $2, 200($6)
          addi $1, $1, 1
          bne  $1, $25, Copy

Initially, $5 = 0, $6 = 0, $1 = 0, and $25 = 25.

(a) When the given code segment is executed on Machine I, the compiler has to reorder instructions and insert nops if needed. Write the resulting code that has minimal modifications from the original. (25 points)

(b) When the given code segment is executed on Machine II, dependencies between instructions are resolved in hardware. Explain when data is forwarded and which instructions are stalled and when they are stalled. (10 points)

(c) Calculate the machine code size of the code segments executed on Machine I (part (a)) and Machine II (part (b)). Evaluate Machine Code Size in Bytes (assume 1 Word = 4 Bytes) (10 points)

(d) Calculate the number of cycles it takes to execute the code segment on Machine I and Machine II. (55 points)

(e) Which machine is faster for this code segment? Explain. (10 points)
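
The spacing rule that Machine I's compiler has to satisfy in part (a) can be checked mechanically. Below is a minimal Python sketch, not a graded solution. It assumes registers are read in Decode and written in Write-back, so with the stated register-file forwarding a consumer must be issued at least three slots after its producer (two instructions in between). The names MIN_DISTANCE, parse, insert_nops, and code_size_bytes are hypothetical helpers introduced for illustration. The sketch ignores branch handling and dependences carried around the loop, and it only pads a given order with nops; the reordering itself is supplied by hand.

```python
import re

# Assumption: a register written in Write-back (first half of the cycle) can be
# read in Decode (second half of the same cycle), so a consumer must be issued
# at least MIN_DISTANCE slots after its producer.
MIN_DISTANCE = 3
WORD_BYTES = 4  # from the problem statement: 1 word = 4 bytes


def parse(line):
    """Return (destination registers, source registers) for one instruction."""
    if line.strip() == "nop":
        return set(), set()
    op, rest = line.split(None, 1)
    regs = re.findall(r"\$\d+", rest)
    if op == "lw":               # lw rt, off(rs): writes rt, reads rs
        return {regs[0]}, {regs[1]}
    if op in ("sw", "bne"):      # both only read their register operands
        return set(), set(regs)
    # Fallback: ALU form such as "addi rt, rs, imm" (writes first, reads rest)
    return {regs[0]}, set(regs[1:])


def insert_nops(program):
    """Pad the given instruction order with nops so every read is far enough
    from the most recent write of the same register."""
    out = []
    last_write = {}  # register -> slot index of its most recent writer
    for line in program:
        dest, srcs = parse(line)
        earliest = max(
            (last_write[r] + MIN_DISTANCE for r in srcs if r in last_write),
            default=0,
        )
        while len(out) < earliest:
            out.append("nop")
        for r in dest:
            last_write[r] = len(out)
        out.append(line)
    return out


def code_size_bytes(program):
    """One machine word per instruction, nops included."""
    return WORD_BYTES * len(program)


original = ["lw $2, 100($5)", "sw $2, 200($6)",
            "addi $1, $1, 1", "bne $1, $25, Copy"]
# Hand-picked reordering: move the independent addi between lw and sw.
reordered = ["lw $2, 100($5)", "addi $1, $1, 1",
             "sw $2, 200($6)", "bne $1, $25, Copy"]

for name, prog in (("original order", original), ("reordered", reordered)):
    padded = insert_nops(prog)
    print(name, padded, code_size_bytes(padded), "bytes")
```

Under these assumptions the original order needs four nops, while moving addi between lw and sw needs only one. That gives five instructions (20 bytes) for Machine I against four instructions (16 bytes) for the unmodified code on Machine II.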
