Instruction Count and CPI
Solution: To solve this problem, we need to calculate the CPI (Cycles per
Instruction) for each code sequence and then determine which one is faster.
Given information:
Instruction class A has a CPI of 1.
Instruction class B has a CPI of 3.
Instruction class C has a CPI of 4.
Code sequence 1 requires 2 million instructions of class A, 1 million instructions of
class B, and 2 million instructions of class C.
Code sequence 2 requires 4 million instructions of class A, 3 million instructions of
class B, and 1 million instructions of class C.
a. Calculating the CPI for each code sequence:
Code sequence 1:
CPI = [(2 million × 1) + (1 million × 3) + (2 million × 4)] / (2 million + 1 million + 2
million)
CPI = (2 + 3 + 8) / 5 = 13 / 5 = 2.6
Code sequence 2:
CPI = [(4 million × 1) + (3 million × 3) + (1 million × 4)] / (4 million + 3 million + 1
million)
CPI = (4 + 9 + 4) / 8 = 17 / 8 = 2.125
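The weighted-average CPI calculation above can be checked with a short Python sketch. The names `CLASS_CPI`, `SEQUENCES`, `total_cycles`, and `average_cpi` are illustrative and not part of the problem statement. The sketch also reports total clock cycles, since at a fixed clock rate execution time is proportional to cycles rather than to CPI alone.

```python
# CPI per instruction class, as given in the problem statement.
CLASS_CPI = {"A": 1, "B": 3, "C": 4}

# Instruction counts in millions for each code sequence.
SEQUENCES = {
    1: {"A": 2, "B": 1, "C": 2},
    2: {"A": 4, "B": 3, "C": 1},
}


def total_cycles(counts):
    """Total clock cycles (in millions): sum of count_i * CPI_i."""
    return sum(n * CLASS_CPI[cls] for cls, n in counts.items())


def average_cpi(counts):
    """Weighted-average CPI: total cycles / total instructions."""
    return total_cycles(counts) / sum(counts.values())


for seq, counts in SEQUENCES.items():
    print(f"sequence {seq}: cycles = {total_cycles(counts)} million, "
          f"CPI = {average_cpi(counts):.3f}")
```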
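b. Determining the faster code sequence:
A lower CPI does not by itself mean faster execution. Assuming both sequences run
on the same processor at the same clock rate, execution time is proportional to the
total number of clock cycles (instruction count × CPI).
Clock cycles for code sequence 1 = 5 million × 2.6 = 13 million
Clock cycles for code sequence 2 = 8 million × 2.125 = 17 million
Code sequence 2 has the lower CPI (2.125 versus 2.6), but it executes more
instructions and needs more clock cycles, which means that code sequence 1 is faster.
To calculate the difference in speed:
Speedup = Clock cycles of code sequence 2 / Clock cycles of code sequence 1
Speedup = 17 / 13 ≈ 1.31
Therefore, code sequence 1 is about 1.31 times as fast as code sequence 2 (code
sequence 2 takes about 31% longer).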
5.
Theory:
Instruction Count for a program
Determined by program, ISA and compiler
Formula:
Execution time = [Sum over instruction classes of (instruction count × CPI)] / clock rate
Solution:
A. Calculating the total execution time for the program on 1, 2, 4, and 8
processors:
On p processors (p > 1), the arithmetic and load/store instruction counts per
processor are the single-processor counts divided by 0.7 × p (1.4, 2.8, and 5.6 for
p = 2, 4, and 8); the branch instruction count stays the same.
1 processor:
Total execution time = [(2.56 × 10^9 × 1) + (1.28 × 10^9 × 12) + (256 × 10^6 × 5)] /
(2 × 10^9)
Total execution time = 9.6 seconds
Relative speedup = 9.6 / 9.6 = 1
2 processors:
Total execution time = [(1.8286 × 10^9 × 1) + (0.9143 × 10^9 × 12) + (256 × 10^6 × 5)] /
(2 × 10^9)
Total execution time = 7.04 seconds
Relative speedup = 9.6 / 7.04 = 1.36
4 processors:
Total execution time = [(0.9143 × 10^9 × 1) + (0.4571 × 10^9 × 12) + (256 × 10^6 × 5)] /
(2 × 10^9)
Total execution time = 3.84 seconds
Relative speedup = 9.6 / 3.84 = 2.5
8 processors:
Total execution time = [(0.4571 × 10^9 × 1) + (0.2286 × 10^9 × 12) + (256 × 10^6 × 5)] /
(2 × 10^9)
Total execution time = 2.24 seconds
Relative speedup = 9.6 / 2.24 = 4.29
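The execution times and speedups above can be reproduced with a minimal Python sketch. It assumes the scaling rule stated in part C of this solution: for more than one processor, the arithmetic and load/store counts per processor are divided by 0.7 × p, and the branch count is unchanged. The names `COUNTS`, `CPI`, and `execution_time` are illustrative. Because the sketch carries full precision, its results may differ in the last digit from hand calculations that round the per-processor counts first.

```python
CLOCK_RATE = 2e9  # Hz, from the problem statement

# Single-processor instruction counts and per-class CPI.
COUNTS = {"arithmetic": 2.56e9, "load_store": 1.28e9, "branch": 256e6}
CPI = {"arithmetic": 1, "load_store": 12, "branch": 5}


def execution_time(p, cpi=CPI):
    """Execution time in seconds on p processors.

    Assumption (taken from part C): for p > 1, arithmetic and load/store
    counts per processor are divided by 0.7 * p; branches are not divided.
    """
    scale = 1.0 if p == 1 else 0.7 * p
    cycles = (COUNTS["arithmetic"] / scale * cpi["arithmetic"]
              + COUNTS["load_store"] / scale * cpi["load_store"]
              + COUNTS["branch"] * cpi["branch"])
    return cycles / CLOCK_RATE


baseline = execution_time(1)
for p in (1, 2, 4, 8):
    t = execution_time(p)
    print(f"p = {p}: time = {t:.2f} s, speedup = {baseline / t:.2f}")
```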
B. If the CPI of arithmetic instructions is doubled to 2, the total execution time for
each processor configuration increases, because every arithmetic instruction now
takes two clock cycles instead of one. The load/store and branch terms are unchanged.
New total execution time (single processor) = [(2.56 × 10^9 × 2) + (1.28 × 10^9 ×
12) + (256 × 10^6 × 5)] / (2 × 10^9)
New total execution time (single processor) = 10.88 seconds
New total execution time (dual processors) = [(1.8286 × 10^9 × 2) + (0.9143 × 10^9 ×
12) + (256 × 10^6 × 5)] / (2 × 10^9)
New total execution time (dual processors) = 7.954 seconds
New total execution time (quad processors) = [(0.9143 × 10^9 × 2) + (0.4571 × 10^9 ×
12) + (256 × 10^6 × 5)] / (2 × 10^9)
New total execution time (quad processors) = 4.297 seconds
New total execution time (octa processors) = [(0.4571 × 10^9 × 2) + (0.2286 × 10^9 ×
12) + (256 × 10^6 × 5)] / (2 × 10^9)
New total execution time (octa processors) = 2.469 seconds
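Since only the arithmetic term changes in part B, the added time can be isolated as (arithmetic instructions per processor) × (new CPI - old CPI) / clock rate. The sketch below is a hedged illustration of that idea; `added_time` is a hypothetical helper, and the 0.7 × p divisor for p > 1 is the scaling rule stated in part C of this solution.

```python
CLOCK_RATE = 2e9       # Hz
ARITH_COUNT = 2.56e9   # arithmetic instructions on a single processor
OLD_CPI, NEW_CPI = 1, 2


def added_time(p):
    """Extra seconds caused by raising the arithmetic CPI from 1 to 2.

    Assumption (from part C): for p > 1 the per-processor arithmetic
    count is the single-processor count divided by 0.7 * p.
    """
    scale = 1.0 if p == 1 else 0.7 * p
    return ARITH_COUNT / scale * (NEW_CPI - OLD_CPI) / CLOCK_RATE


for p in (1, 2, 4, 8):
    print(f"p = {p}: execution time grows by {added_time(p):.3f} s")
```

Adding each of these increments to the corresponding part A time gives the new total for that configuration.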
C. For 4 processors: When the program is parallelized to run over multiple cores,
the number of arithmetic and load/store instructions per processor is divided by
0.7 × p, where p is the number of processors, and the number of branch
instructions per processor remains the same. For p = 4:
Clock cycles = (2.56 × 10^9 × 1) / (0.7 × 4) + (1.28 × 10^9 × 12) / (0.7 × 4) +
(256 × 10^6 × 5) = 7.68 × 10^9
Now calculate the execution time:
Execution time = 7.68 × 10^9 / (2 × 10^9) = 3.84 seconds
Reducing the CPI of a single processor to match the performance of 4 processors:
Let a be the new CPI of load/store instructions. The clock cycles on a single
processor are:
Clock cycles = (2.56 × 10^9 × 1) + (1.28 × 10^9 × a) + (256 × 10^6 × 5)
Clock cycles = 3.84 × 10^9 + 1.28 × 10^9 × a
Now calculate the execution time:
Execution time for a processor = clock cycles / clock rate
Therefore:
CPU execution time = (3.84 × 10^9 + 1.28 × 10^9 × a) / (2 × 10^9)
3.84 = 1.92 + 0.64 × a
a = 3.0
The reduced CPI relative to the original is calculated as follows:
Reduced CPI / Original CPI for load/store instructions = 3.0 / 12 = 0.25 (or 25%)
Thus, the CPI of load/store instructions must be reduced to 3, which is 25% of its
original value (a 75% reduction).
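The steps in part C can be sketched in Python by computing the 4-processor time and then solving the single-processor equation for the load/store CPI. The names `time_on_processors` and `required_cpi` are illustrative. Carrying full precision gives a required CPI very close to 3; hand calculations that round intermediate values can land slightly higher.

```python
CLOCK_RATE = 2e9  # Hz
ARITH, LOAD_STORE, BRANCH = 2.56e9, 1.28e9, 256e6
CPI_ARITH, CPI_LOAD_STORE, CPI_BRANCH = 1, 12, 5


def time_on_processors(p):
    """Execution time in seconds, dividing arithmetic and load/store
    counts by 0.7 * p when p > 1 (the rule stated in part C)."""
    scale = 1.0 if p == 1 else 0.7 * p
    cycles = (ARITH / scale * CPI_ARITH
              + LOAD_STORE / scale * CPI_LOAD_STORE
              + BRANCH * CPI_BRANCH)
    return cycles / CLOCK_RATE


# Target: the single processor must match the 4-processor time.
target_time = time_on_processors(4)

# Cycles that do not depend on the load/store CPI.
fixed_cycles = ARITH * CPI_ARITH + BRANCH * CPI_BRANCH

# Solve target_time * CLOCK_RATE = fixed_cycles + LOAD_STORE * a for a.
required_cpi = (target_time * CLOCK_RATE - fixed_cycles) / LOAD_STORE

print(f"target time  = {target_time:.2f} s")
print(f"required CPI = {required_cpi:.2f}")
print(f"fraction of original CPI = {required_cpi / CPI_LOAD_STORE:.2f}")
```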
6.
Theory:
Propagation of electronic signals:
o Electronic signals are assumed to travel at a constant speed of
300,000 km/s.
o The time an electronic signal takes to travel a given distance is
directly proportional to that distance.
Clock rate and period:
o The clock period is the inverse of the clock rate (frequency).
o The time for an electronic signal to travel from one edge of the chip
to the other must be less than or equal to the clock period.
Practical limitations of chip design:
o Chip size is constrained by what current manufacturing technology
makes feasible.
o There is a tradeoff between chip size and clock rate: higher clock
rates require smaller chip dimensions.
Solution:
Limitation on the diameter for a 1 GHz clock rate:
The time it takes for an electronic signal to travel from one edge of the chip to the
other should be less than or equal to the period of the clock (1/clock rate).
Time for an electronic signal to travel from one edge to the other = Diameter /
(300,000 km/s)
Period of the clock = 1 / (1 GHz) = 1 × 10^-9 s
Diameter ≤ (300,000 km/s) × (1 × 10^-9 s) = 3 × 10^-4 km = 0.3 m = 30 cm
Therefore, the maximum diameter of the chip for a 1 GHz clock rate is 30 cm.
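The bound above is simply signal speed multiplied by the clock period. A minimal Python sketch follows, assuming the 300,000 km/s signal speed from the theory section; `max_diameter_m` is a hypothetical helper name.

```python
SIGNAL_SPEED_M_PER_S = 3.0e8  # 300,000 km/s expressed in m/s


def max_diameter_m(clock_rate_hz):
    """Largest chip diameter (in meters) a signal can cross in one period."""
    period_s = 1.0 / clock_rate_hz
    return SIGNAL_SPEED_M_PER_S * period_s


diameter_cm = max_diameter_m(1e9) * 100
print(f"max diameter at 1 GHz: {diameter_cm:.1f} cm")
```

Calling `max_diameter_m` with a higher clock rate shows how the bound shrinks as the clock period gets shorter.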
Feasibility: