Lecture 3

This document discusses various metrics for measuring computer performance, including response time, CPU execution time, clock rate, clock period, and CPU time. It explains that CPU execution time depends on the number of clock cycles and clock cycle time. Performance can be improved by reducing the number of clock cycles needed, decreasing the clock cycle time, or increasing the clock rate. Examples are provided to illustrate how to calculate performance improvements based on these metrics.

Uploaded by

Anam Ghaffar

Computer Performance

Execution time
• Time can be defined in different ways, depending on what
we are measuring:
– Response time: Total time to complete a task,
including time spent executing on the CPU, accessing
disk and memory, waiting for I/O and other processes,
and operating system overhead.
– CPU execution time: Total time a CPU spends
computing on a given task (excludes time for I/O or
running other programs). This is also referred to as
simply CPU time.

Advanced Computer Architecture


Clock
• Computer designers use clock ticks to measure how fast
the hardware can perform basic functions.
• Clock rate (frequency) = cycles per second.
– Measured in Hertz (1 Hz = 1 cycle/s).
• Clock period is the time between ticks of the clock and is
measured in seconds per cycle.
– Period = 1 / frequency
• Example: A 200 MHz (megahertz) clock has a clock period
of 1 / (200 × 10^6 Hz) = 5 nanoseconds.
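
The relationship between clock rate and clock period can be checked with a minimal Python sketch. The function name `clock_period_seconds` is an illustrative assumption, not something from the lecture.

```python
def clock_period_seconds(clock_rate_hz: float) -> float:
    """Return the clock period (seconds per cycle) for a clock rate in Hz."""
    if clock_rate_hz <= 0:
        raise ValueError("clock rate must be positive")
    return 1.0 / clock_rate_hz


# The 200 MHz clock from the example above.
period = clock_period_seconds(200e6)
print(f"{period * 1e9:.1f} ns")  # prints "5.0 ns"
```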

Time & Clock Metrics
• Determine the effect of a design change on performance:
– CPU execution time = CPU clock cycles × clock cycle time
– CPU execution time = CPU clock cycles / clock rate
• For some program running on machine X,
Performance_X = 1 / Execution time_X
• "X is n times faster than Y" means
Performance_X / Performance_Y = Execution time_Y / Execution time_X = n

Problem:
• Machine A runs a program in 20 seconds.
• Machine B runs the same program in 25 seconds.
• How many times faster is machine A than machine B?
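
Assuming the problem asks how many times faster machine A is than machine B, the performance ratio reduces to a ratio of execution times. The sketch below is one way to compute it. The helper name `speedup` is hypothetical.

```python
def speedup(time_slow_s: float, time_fast_s: float) -> float:
    """Return n such that the faster machine is n times faster.

    Performance is 1 / execution time, so the performance ratio
    equals the inverse ratio of the execution times.
    """
    return time_slow_s / time_fast_s


# Machine A: 20 s, machine B: 25 s (values from the problem).
n = speedup(time_slow_s=25.0, time_fast_s=20.0)
print(f"Machine A is {n:.2f} times faster than machine B")  # 1.25
```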
How to improve performance?

seconds / program = (cycles / program) × (seconds / cycle)

• So, to improve performance (everything else being equal)
you can either
– reduce the number of required cycles for a program, or
– decrease the clock cycle time or, said another way,
increase the clock rate.

execution time = number of cycles × clock cycle time

Example
• Our favorite program runs in 10 seconds on computer A,
which has a 400 MHz clock. We are trying to help a
computer designer build a new machine B that will run
this program in 6 seconds. The designer can use new (or
perhaps more expensive) technology to substantially
increase the clock rate, but has informed us that this
increase will affect the rest of the CPU design, causing
machine B to require 1.2 times as many clock cycles as
machine A for the same program. What clock rate
should we tell the designer to target?

Example
Let C = number of cycles on computer A
Execution time = C × clock cycle time = C / clock rate

On computer A,
C / 400 MHz = C / (400 × 10^6 Hz) = 10 seconds => C = 400 × 10^7 = 4 × 10^9 cycles

On computer B, number of cycles = 1.2 × C

What should B's clock rate be so that our favorite program runs in
6 seconds?

1.2 × C / clock rate = 6 seconds => clock rate = 1.2 × 400 × 10^7 / 6 = 800 × 10^6 Hz

I.e. clock rate = 800 MHz
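
The same calculation can be written as a short Python sketch. The function and parameter names are assumptions chosen for illustration. The numbers come from the example.

```python
def required_clock_rate_hz(
    time_a_s: float,
    clock_rate_a_hz: float,
    cycle_factor: float,
    target_time_s: float,
) -> float:
    """Clock rate machine B needs to hit the target execution time."""
    cycles_a = time_a_s * clock_rate_a_hz  # C = time × clock rate
    cycles_b = cycle_factor * cycles_a     # B needs 1.2 × C cycles
    return cycles_b / target_time_s        # clock rate = cycles / time


rate_b = required_clock_rate_hz(
    time_a_s=10.0, clock_rate_a_hz=400e6, cycle_factor=1.2, target_time_s=6.0
)
print(f"{rate_b / 1e6:.0f} MHz")  # prints "800 MHz"
```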

CPU performance equation
• An alternative to "number of clock cycles" is "number of
instructions executed", or Instruction Count (IC).
• Given both the number of clock cycles and the IC of a
program, the average Clocks Per Instruction (CPI) is given
by:

CPI = CPU clock cycles of a program / IC

CPU time = IC × CPI × clock cycle time = (IC × CPI) / clock rate
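
As a minimal sketch, the two formulas above can be expressed in Python. The function names and the sample numbers (2 × 10^9 instructions, 5 × 10^9 cycles, a 1 GHz clock) are hypothetical and do not come from the lecture.

```python
def cpi(cpu_clock_cycles: float, instruction_count: float) -> float:
    """Average clock cycles per instruction."""
    return cpu_clock_cycles / instruction_count


def cpu_time_seconds(
    instruction_count: float, cpi_value: float, clock_rate_hz: float
) -> float:
    """CPU time = IC × CPI / clock rate."""
    return instruction_count * cpi_value / clock_rate_hz


# Hypothetical program: 2e9 instructions, 5e9 cycles, 1 GHz clock.
program_cpi = cpi(cpu_clock_cycles=5e9, instruction_count=2e9)
elapsed = cpu_time_seconds(2e9, program_cpi, clock_rate_hz=1e9)
print(program_cpi, elapsed)  # prints "2.5 5.0"
```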

Example
• Suppose we have two implementations of the same
instruction set architecture. Machine A has a clock cycle
time of 1 ns (nanosecond) and a CPI of 2.0 for some
program, and machine B has a clock cycle time of 2 ns and
a CPI of 1.2 for the same program.
• Which machine is faster for this program, and by how much?

Solution
• We know that each machine executes the same number of
instructions for the same program; let's call this number I.
• CPU clock cycles_A = I × 2.0
• CPU clock cycles_B = I × 1.2
• CPU time_A = CPU clock cycles_A × clock cycle time_A
= I × 2.0 × 1 ns = 2 × I ns
• CPU time_B = I × 1.2 × 2 ns = 2.4 × I ns
• Machine A is faster. By how much?
Performance_A / Performance_B = CPU time_B / CPU time_A = (2.4 × I ns) / (2 × I ns) = 1.2
• Machine A is 1.2 times faster than machine B.
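
Because both machines execute the same instruction count I, it cancels in the ratio, so comparing time per instruction is enough. The sketch below assumes that simplification. The helper name `time_per_instruction_ns` is illustrative.

```python
def time_per_instruction_ns(cpi: float, clock_cycle_time_ns: float) -> float:
    """Average time per instruction = CPI × clock cycle time."""
    return cpi * clock_cycle_time_ns


time_a = time_per_instruction_ns(cpi=2.0, clock_cycle_time_ns=1.0)  # 2.0 ns
time_b = time_per_instruction_ns(cpi=1.2, clock_cycle_time_ns=2.0)  # 2.4 ns

# The instruction count I is identical on both machines, so it cancels.
ratio = time_b / time_a
print(f"Machine A is {ratio:.1f} times faster than machine B")  # 1.2
```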


Example
• A compiler designer is trying to decide between two code
sequences for a particular machine. Based on the hardware
implementation, there are three different classes of
instructions: Class A, Class B, and Class C, and they require
one, two, and three cycles, respectively.
• The first code sequence has 5 instructions: 2 of A, 1 of B,
and 2 of C.
• The second sequence has 6 instructions: 4 of A, 1 of B, and
1 of C.
• Which sequence will be faster? By how much?
• What is the CPI for each sequence?

Solution

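• Sequence 1: 2 × 1 + 1 × 2 + 2 × 3 = 10 cycles, so CPI = 10 / 5 = 2.0
• Sequence 2: 4 × 1 + 1 × 2 + 1 × 3 = 9 cycles, so CPI = 9 / 6 = 1.5
• Sequence 2 is faster by a factor of 10 / 9 ≈ 1.11, i.e. about 11% faster than sequence 1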
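
The cycle counts and CPI values behind this result can be worked out with a short Python sketch. The dictionary layout and function names are assumptions for illustration. The cycle costs and instruction mixes are taken from the example.

```python
# Cycles required by each instruction class (from the example).
CYCLES_PER_CLASS = {"A": 1, "B": 2, "C": 3}


def total_cycles(mix: dict) -> int:
    """Total clock cycles for an instruction mix {class: count}."""
    return sum(CYCLES_PER_CLASS[cls] * count for cls, count in mix.items())


def average_cpi(mix: dict) -> float:
    """Average CPI = total cycles / instruction count."""
    return total_cycles(mix) / sum(mix.values())


seq1 = {"A": 2, "B": 1, "C": 2}  # 5 instructions
seq2 = {"A": 4, "B": 1, "C": 1}  # 6 instructions

print(total_cycles(seq1), average_cpi(seq1))  # prints "10 2.0"
print(total_cycles(seq2), average_cpi(seq2))  # prints "9 1.5"
print(f"{total_cycles(seq1) / total_cycles(seq2):.2f}")  # prints "1.11"
```

Note that sequence 2 executes more instructions yet finishes sooner, because its average CPI is lower.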

Remember
• A given program will require
– some number of instructions (machine instructions)
– some number of cycles
– some number of seconds
• We have a vocabulary that relates these quantities:
– cycle time (seconds per cycle)
– clock rate (cycles per second)
– CPI (cycles per instruction)

CPU Performance
• For a given instruction set architecture, increases in CPU
performance can come from three sources:
1. Lowering the instruction count, or generating instructions with a
lower average CPI
2. Lowering the CPI
3. Increasing the clock rate

Does doubling the clock rate
double the performance?

Amdahl’s law
• The performance improvement to be gained from using
some faster mode of execution is limited by the fraction of
the time the faster mode can be used.
• This implies that the time consumed by events whose
performance is not improved limits the effect of any
improvement.
• Lowest performer restricts all others.
• Execution Time After Improvement = Execution Time
Unaffected + (Execution Time Affected / Amount of
Improvement)

Example
• Trip from point A to point B in two parts: A to C takes 20, and C to B takes 50, 20, 4, 1.7, or 0.3.

A-C Trip   C-B Trip   Total Time   C-B Speedup   Overall Speedup
20         50         70           1             1
20         20         40           2.5           1.75
20         4          24           12.5          2.9
20         1.7        21.7         29.4          3.2
20         0.3        20.3         166.67        3.4
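
The table can be regenerated from Amdahl's law with a minimal Python sketch. The function names are illustrative assumptions. The trip times are those in the table.

```python
def time_after_improvement(
    unaffected: float, affected: float, improvement: float
) -> float:
    """Execution time after improvement, per Amdahl's law."""
    return unaffected + affected / improvement


def overall_speedup(unaffected: float, affected: float, improvement: float) -> float:
    """Original total time divided by the improved total time."""
    original = unaffected + affected
    return original / time_after_improvement(unaffected, affected, improvement)


A_TO_C = 20.0       # unaffected part of the trip
C_TO_B_BASE = 50.0  # part of the trip that gets faster

for c_to_b in (50.0, 20.0, 4.0, 1.7, 0.3):
    leg_speedup = C_TO_B_BASE / c_to_b
    total = time_after_improvement(A_TO_C, C_TO_B_BASE, leg_speedup)
    overall = overall_speedup(A_TO_C, C_TO_B_BASE, leg_speedup)
    print(f"{c_to_b:5.1f} {total:6.1f} {leg_speedup:8.2f} {overall:6.2f}")
```

However large the C-B speedup becomes, the overall speedup stays below 70 / 20 = 3.5, because the A-C part of the trip is unchanged.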

Example

Example: "Suppose a program runs in 100 seconds on
a machine, with multiply responsible for 80 seconds
of this time. How much do we have to improve the
speed of multiplication if we want the program to run
4 times faster?"

Answer
• Execution time after improvement = (100 seconds - 80 seconds) +
(80 seconds / n)
• Since we want the performance to be 4 times faster, the
new execution time should be 100 / 4 = 25 seconds, giving:
25 seconds = (20 seconds) + (80 seconds / n)
5 seconds = (80 seconds / n)
n = 80 / 5 = 16 times
• What about a 5 times improvement?
The new execution time would have to be 100 / 5 = 20 seconds, giving:
20 seconds = (20 seconds) + (80 seconds / n)
0 = (80 seconds / n), which no finite n satisfies.
• There is no amount by which we can enhance multiply to
achieve a fivefold increase in performance if multiply
accounts for only 80% of the workload.
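
Both cases can be checked by solving Amdahl's law for n. This is a sketch under the assumptions of the example. The function name `required_improvement` is hypothetical, and returning `None` for an unreachable target is a design choice made here for illustration.

```python
from typing import Optional


def required_improvement(
    total_s: float, affected_s: float, target_speedup: float
) -> Optional[float]:
    """Solve target time = unaffected + affected / n for n.

    Returns None when the unaffected time alone already meets or
    exceeds the target time, so no finite n can reach the target.
    """
    unaffected_s = total_s - affected_s
    target_time_s = total_s / target_speedup
    remaining_s = target_time_s - unaffected_s  # time left for the improved part
    if remaining_s <= 0:
        return None
    return affected_s / remaining_s


print(required_improvement(100.0, 80.0, 4.0))  # prints "16.0"
print(required_improvement(100.0, 80.0, 5.0))  # prints "None"
```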
