Instruction Count and Cpi

Uploaded by

thuw1310

INSTRUCTION COUNT AND CPI

4.
Theory:
Instruction count for a program
• Determined by program, ISA and compiler

Average cycles per instruction (CPI)

• Determined by CPU hardware
• If different instructions have different CPI
o Average CPI affected by instruction mix
Formula:
Clock cycles = Σ (CPI_i × Instruction count_i)
CPI = Clock cycles / Instruction count

Solution: To solve this problem, we calculate the CPI (cycles per instruction) and the total clock cycles for each code sequence, and then determine which one is faster.
Given information:
Instruction class A has a CPI of 1.
Instruction class B has a CPI of 3.
Instruction class C has a CPI of 4.
Code sequence 1 requires 2 million instructions of class A, 1 million instructions of class B, and 2 million instructions of class C.
Code sequence 2 requires 4 million instructions of class A, 3 million instructions of class B, and 1 million instructions of class C.
a. Calculating the CPI for each code sequence:
Code sequence 1:
CPI = [(2 million × 1) + (1 million × 3) + (2 million × 4)] / (2 million + 1 million + 2 million)
CPI = (2 + 3 + 8) / 5 = 13 / 5 = 2.6
Code sequence 2:
CPI = [(4 million × 1) + (3 million × 3) + (1 million × 4)] / (4 million + 3 million + 1 million)
CPI = (4 + 9 + 4) / 8 = 17 / 8 = 2.125
b. Determining the faster code sequence:
A lower CPI alone does not make a sequence faster, because the two sequences execute different numbers of instructions. Assuming both run on the same processor at the same clock rate, execution time is proportional to the total number of clock cycles:
Clock cycles (code sequence 1) = 5 million × 2.6 = 13 million
Clock cycles (code sequence 2) = 8 million × 2.125 = 17 million
Speed ratio = 17 million / 13 million ≈ 1.31
Therefore, code sequence 1 is faster: although its CPI is higher (2.6 versus 2.125), it needs fewer clock cycles and runs about 1.31 times as fast as code sequence 2.
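The arithmetic above can be checked with a short Python sketch. The names (`CPI`, `SEQUENCES`, `summarize`) are illustrative and do not come from the original solution. The sketch assumes both sequences run on the same processor, and it reports total clock cycles alongside the average CPI, since at a fixed clock rate execution time equals clock cycles divided by clock rate.

```python
# CPI per instruction class, as given in the solution.
CPI = {"A": 1, "B": 3, "C": 4}

# Instruction counts in millions for each code sequence, as given in the solution.
SEQUENCES = {
    1: {"A": 2, "B": 1, "C": 2},
    2: {"A": 4, "B": 3, "C": 1},
}


def summarize(counts):
    """Return (instruction count, clock cycles, average CPI); counts are in millions."""
    instructions = sum(counts.values())
    cycles = sum(n * CPI[cls] for cls, n in counts.items())
    return instructions, cycles, cycles / instructions


for name, counts in SEQUENCES.items():
    instructions, cycles, avg_cpi = summarize(counts)
    print(f"Sequence {name}: {instructions}M instructions, "
          f"{cycles}M cycles, average CPI = {avg_cpi}")
```

Comparing the cycle totals, rather than the CPI values alone, shows which sequence finishes first when the clock rate is the same.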
5.
Theory:
Instruction count for a program
• Determined by program, ISA and compiler

Average cycles per instruction (CPI)

• Determined by CPU hardware
• If different instructions have different CPI
o Average CPI affected by instruction mix

Formula:
Execution time = Σ (Instruction count_i × CPI_i) / Clock rate
Solution:
A. Calculating the total execution time for the program on 1, 2, 4, and 8 processors (clock rate 2 GHz; CPI of 1 for arithmetic, 12 for load/store, and 5 for branch instructions):

Single processor (p = 1):

Arithmetic instructions per processor = 2.56 × 10^9
Load/store instructions per processor = 1.28 × 10^9
Branch instructions per processor = 256 × 10^6

Total execution time = [(2.56 × 10^9 × 1) + (1.28 × 10^9 × 12) + (256 × 10^6 × 5)] / (2 × 10^9)
Total execution time = 9.6 seconds
Relative speedup = 9.6 / 9.6 = 1

Dual processors (p = 2):

Arithmetic instructions per processor = (2.56 × 10^9) / (0.7 × 2) = 1.829 × 10^9
Load/store instructions per processor = (1.28 × 10^9) / (0.7 × 2) = 0.914 × 10^9
Branch instructions per processor = 256 × 10^6

Total execution time = [(1.829 × 10^9 × 1) + (0.914 × 10^9 × 12) + (256 × 10^6 × 5)] / (2 × 10^9)
Total execution time = 7.04 seconds
Relative speedup = 9.6 / 7.04 = 1.36

Quad processors (p = 4):

Arithmetic instructions per processor = (2.56 × 10^9) / (0.7 × 4) = 0.914 × 10^9
Load/store instructions per processor = (1.28 × 10^9) / (0.7 × 4) = 0.457 × 10^9
Branch instructions per processor = 256 × 10^6

Total execution time = [(0.914 × 10^9 × 1) + (0.457 × 10^9 × 12) + (256 × 10^6 × 5)] / (2 × 10^9)
Total execution time = 3.84 seconds
Relative speedup = 9.6 / 3.84 = 2.50

Octa processors (p = 8):

Arithmetic instructions per processor = (2.56 × 10^9) / (0.7 × 8) = 0.457 × 10^9
Load/store instructions per processor = (1.28 × 10^9) / (0.7 × 8) = 0.229 × 10^9
Branch instructions per processor = 256 × 10^6

Total execution time = [(0.457 × 10^9 × 1) + (0.229 × 10^9 × 12) + (256 × 10^6 × 5)] / (2 × 10^9)
Total execution time = 2.24 seconds
Relative speedup = 9.6 / 2.24 = 4.29
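The four cases in part A follow the same pattern, so they can be sketched as one Python function. The function and constant names are illustrative, not from the original. The sketch follows the solution in leaving the single-processor counts undivided and in dividing the arithmetic and load/store counts by 0.7 × p for p > 1. It keeps full precision throughout, so its results can differ in the second decimal place from hand calculations that round the per-processor counts first.

```python
CLOCK_RATE = 2e9                                    # Hz, as used in the solution
ARITH, LOAD_STORE, BRANCH = 2.56e9, 1.28e9, 256e6   # instruction counts
CPI_ARITH, CPI_LOAD_STORE, CPI_BRANCH = 1, 12, 5


def execution_time(p, cpi_arith=CPI_ARITH, cpi_load_store=CPI_LOAD_STORE):
    """Execution time in seconds on p processors."""
    # Assumption taken from the solution: no scaling for p = 1,
    # counts divided by 0.7 * p for p > 1; branches are never divided.
    scale = 1.0 if p == 1 else 0.7 * p
    cycles = ((ARITH / scale) * cpi_arith
              + (LOAD_STORE / scale) * cpi_load_store
              + BRANCH * CPI_BRANCH)
    return cycles / CLOCK_RATE


baseline = execution_time(1)
for p in (1, 2, 4, 8):
    t = execution_time(p)
    print(f"p = {p}: {t:.2f} s, relative speedup = {baseline / t:.2f}")
```

Because the branch term is not divided among processors, the speedup grows more slowly than p.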
B. If the CPI of arithmetic instructions is doubled to 2, only the arithmetic term of each sum doubles, so the total execution time increases in every processor configuration, but by less than a factor of two.
New total execution time (single processor) = [(2.56 × 10^9 × 2) + (1.28 × 10^9 × 12) + (256 × 10^6 × 5)] / (2 × 10^9)
New total execution time (single processor) = 10.88 seconds

New total execution time (dual processors) = [(1.829 × 10^9 × 2) + (0.914 × 10^9 × 12) + (256 × 10^6 × 5)] / (2 × 10^9)
New total execution time (dual processors) ≈ 7.95 seconds

New total execution time (quad processors) = [(0.914 × 10^9 × 2) + (0.457 × 10^9 × 12) + (256 × 10^6 × 5)] / (2 × 10^9)
New total execution time (quad processors) ≈ 4.30 seconds

New total execution time (octa processors) = [(0.457 × 10^9 × 2) + (0.229 × 10^9 × 12) + (256 × 10^6 × 5)] / (2 × 10^9)
New total execution time (octa processors) ≈ 2.47 seconds
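One way to see the effect of the doubled arithmetic CPI is to compute only the time that is added, which is the extra arithmetic cycles divided by the clock rate. This is a minimal sketch under the same assumptions as the solution (counts divided by 0.7 × p for p > 1, 2 GHz clock); the name `added_time` is hypothetical.

```python
CLOCK_RATE = 2e9   # Hz, as used in the solution
ARITH = 2.56e9     # arithmetic instruction count


def added_time(p, old_cpi=1, new_cpi=2):
    """Extra seconds caused by raising the arithmetic CPI on p processors."""
    # Assumption taken from the solution: no scaling for p = 1.
    scale = 1.0 if p == 1 else 0.7 * p
    extra_cycles = (ARITH / scale) * (new_cpi - old_cpi)
    return extra_cycles / CLOCK_RATE


for p in (1, 2, 4, 8):
    print(f"p = {p}: +{added_time(p):.3f} s")
```

The added time shrinks as p grows because fewer arithmetic instructions run on each processor, while the load/store and branch terms are unchanged.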
C. For 4 processors: when the program is parallelized to run over multiple cores, the number of arithmetic and load/store instructions per processor is divided by 0.7 × p, where p is the number of processors, and the number of branch instructions per processor remains the same. With p = 4:
Clock cycles = (2.56 × 10^9 × 1) / (0.7 × 4) + (1.28 × 10^9 × 12) / (0.7 × 4) + (256 × 10^6 × 5) = 7.68 × 10^9
Execution time = (7.68 × 10^9) / (2 × 10^9) = 3.84 seconds
Reducing the load/store CPI of a single processor to match the performance of 4 processors:
Let a be the new load/store CPI. The clock cycles on a single processor are:
Clock cycles = (2.56 × 10^9 × 1) + (1.28 × 10^9 × a) + (256 × 10^6 × 5) = 3.84 × 10^9 + 1.28 × 10^9 × a
The execution time on a single processor is:
Execution time = Clock cycles / Clock rate = (3.84 × 10^9 + 1.28 × 10^9 × a) / (2 × 10^9)
Therefore:
3.84 = 1.92 + 0.64 × a
a = 3.0
The ratio to the original CPI is:
a / Original CPI for load/store instructions = 3.0 / 12 = 0.25 (25%)
Thus, the load/store CPI must be reduced from 12 to 3, that is, to 25% of its original value.
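The same equation can be solved numerically. The sketch below is illustrative only (variable names are not from the original): it computes the 4-processor execution time with the original CPI values, then solves for the single-processor load/store CPI that gives the same time. It keeps full precision, so the result can differ slightly from a hand calculation that uses rounded intermediate values.

```python
CLOCK_RATE = 2e9                                    # Hz, as used in the solution
ARITH, LOAD_STORE, BRANCH = 2.56e9, 1.28e9, 256e6   # instruction counts
CPI_ARITH, CPI_BRANCH = 1, 5
ORIGINAL_CPI_LOAD_STORE = 12

# Target: execution time on 4 processors with the original CPI values.
p = 4
target_cycles = ((ARITH * CPI_ARITH + LOAD_STORE * ORIGINAL_CPI_LOAD_STORE) / (0.7 * p)
                 + BRANCH * CPI_BRANCH)
target_time = target_cycles / CLOCK_RATE

# Single processor: time = (fixed_cycles + LOAD_STORE * a) / CLOCK_RATE; solve for a.
fixed_cycles = ARITH * CPI_ARITH + BRANCH * CPI_BRANCH
required_cpi = (target_time * CLOCK_RATE - fixed_cycles) / LOAD_STORE

print(f"Target time on 4 processors: {target_time:.2f} s")
print(f"Required load/store CPI: {required_cpi:.2f}")
print(f"Fraction of original CPI: {required_cpi / ORIGINAL_CPI_LOAD_STORE:.2f}")
```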
6.
Theory:
Propagation of electronic signals:
o Electronic signals are assumed to travel at a constant speed of 300,000 km/s.
o The time it takes for an electronic signal to travel a certain distance is directly proportional to the distance.
Clock rate and period:
o The clock period is the inverse of the clock rate (frequency).
o The time for an electronic signal to travel from one edge of the chip to the other should be less than or equal to the clock period.
Practical limitations of chip design:
o Chip size is constrained by what current manufacturing technology can produce.
o There is a tradeoff between chip size and clock rate: higher clock rates require smaller chip dimensions.
Solution:
Limitation on the diameter for a 1 GHz clock rate:
The time it takes for an electronic signal to travel from one edge of the chip to the
other should be less than or equal to the period of the clock (1/clock rate).
Time for an electronic signal to travel from one edge to the other = Diameter /
(300,000 km/s)
Period of the clock = 1 / (1 GHz) = 1 × 10^-9 s
Diameter ≤ (300,000 km/s) × (1 × 10^-9 s) = 0.3 m = 30 cm
Therefore, the maximum diameter of the chip for a 1 GHz clock rate is 30 cm.

Limitation on the diameter for a 1 THz clock rate:

The time it takes for an electronic signal to travel from one edge of the chip to the
other should be less than or equal to the period of the clock (1/clock rate).
Time for an electronic signal to travel from one edge to the other = Diameter /
(300,000 km/s)
Period of the clock = 1 / (1 THz) = 1 × 10^-12 s
Diameter ≤ (300,000 km/s) × (1 × 10^-12 s) = 0.3 mm
Therefore, the maximum diameter of the chip for a 1 THz clock rate is 0.3 mm.
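Both limits come from the same relation, diameter ≤ signal speed × clock period, which can be sketched in a few lines of Python. The function name `max_diameter_m` is hypothetical, and the signal speed is the 300,000 km/s assumed in the theory section, expressed in metres per second.

```python
SIGNAL_SPEED = 3.0e8  # m/s, i.e. the assumed 300,000 km/s


def max_diameter_m(clock_rate_hz):
    """Largest chip diameter (in metres) a signal can cross in one clock period."""
    period = 1.0 / clock_rate_hz
    return SIGNAL_SPEED * period


for label, rate in (("1 GHz", 1e9), ("1 THz", 1e12)):
    d = max_diameter_m(rate)
    print(f"{label}: {d:.4f} m ({d * 1000:.1f} mm)")
```

Raising the clock rate by a factor of 1000 shrinks the allowed diameter by the same factor.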

Feasibility:

A chip with a diameter of 30 cm operating at 1 GHz is feasible, as the dimensions are within practical limits.
A chip with a diameter of 0.3 mm operating at 1 THz is not feasible, as the
dimensions are extremely small and would be challenging to manufacture with
current technology.
In conclusion, the maximum diameter of the chip is limited by the time it takes for
an electronic signal to travel from one edge of the chip to the other, relative to
the clock rate. While a 1 GHz chip with a 30 cm diameter is feasible, a 1 THz chip
with a 0.3 mm diameter is not feasible with current technology.
