+

William Stallings
Computer Organization
and Architecture
10th Edition

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
+ Chapter 2
Performance Issues

2.1 Designing for Performance
2.2 Multicore, MICs, and GPGPUs
2.3 Two Laws that Provide Insight: Amdahl’s Law and Little’s Law
2.4 Basic Measures of Computer Performance
2.5 Calculating the Mean
2.6 Benchmarks and SPEC
2.7 Key Terms, Review Questions, and Problems
+
Designing for Performance
 The cost of computer systems continues to drop dramatically, while the performance
and capacity of those systems continue to rise equally dramatically
 Today’s laptops have the computing power of an IBM mainframe from 10 or 15 years
ago
 Processors are so inexpensive that we now have microprocessors we throw away
 Desktop applications that require the great power of today’s microprocessor-based
systems include:
 Image processing
 Three-dimensional rendering
 Speech recognition
 Videoconferencing
 Multimedia authoring
 Voice and video annotation of files
 Simulation modeling

 Businesses are relying on increasingly powerful servers to handle transaction

and database processing and to support massive client/server networks that
have replaced the huge mainframe computer centers of yesteryear
 Cloud service providers use massive high-performance banks of servers to
satisfy high-volume, high-transaction-rate applications for a broad spectrum of
clients
+ Microprocessor Speed
Techniques built into contemporary processors include:
 Pipelining: Processor moves data or instructions into a conceptual pipe with all stages of the pipe processing simultaneously
 Branch prediction: Processor looks ahead in the instruction code fetched from memory and predicts which branches, or groups of instructions, are likely to be processed next
 Superscalar execution: This is the ability to issue more than one instruction in every processor clock cycle. (In effect, multiple parallel pipelines are used.)
 Data flow analysis: Processor analyzes which instructions are dependent on each other’s results, or data, to create an optimized schedule of instructions
 Speculative execution: Using branch prediction and data flow analysis, some processors speculatively execute instructions ahead of their actual appearance in the program execution, holding the results in temporary locations, keeping …
+
Microprocessor Speed
What is Pipelining?

 Implementation where multiple instructions are

simultaneously overlapped in execution
 Instruction processing has N different stages
 Overlap different instructions working on different stages

 Pipelining is not new

 Ford’s Model-T assembly line
 Laundry – Washer/Dryer
 IBM Stretch [1962]
 Since the ’70s nearly all computers have been pipelined


+
Speeding up through pipelining
 Laundry Example
 Ann, Brian, Cathy, Dave each have
one load of clothes to wash, dry,
and fold

 Washer takes 30 minutes

 Dryer takes 40 minutes
 “Folder” takes 20 minutes
 Source: [2], part 4.5, 4th edition
+
Sequential Laundry
+
Pipelined Laundry-Start work
ASAP
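The laundry example can be checked with a small simulation. This is only a sketch: it uses the stage times from the slide and assumes a finished load can wait between stages until the next machine is free. The function names are illustrative.

```python
# Stage durations from the slide, in minutes: wash, dry, fold.
STAGES = [30, 40, 20]
LOADS = 4  # Ann, Brian, Cathy, Dave


def sequential_time(stages, loads):
    """Each load finishes every stage before the next load starts."""
    return loads * sum(stages)


def pipelined_time(stages, loads):
    """A load enters a stage once it has left the previous stage
    and the previous load has vacated that stage."""
    stage_free_at = [0] * len(stages)  # time at which each stage becomes free
    finish = 0
    for _ in range(loads):
        ready = 0  # time at which this load is ready for its next stage
        for s, duration in enumerate(stages):
            start = max(ready, stage_free_at[s])
            ready = start + duration
            stage_free_at[s] = ready
        finish = ready
    return finish


seq = sequential_time(STAGES, LOADS)
pipe = pipelined_time(STAGES, LOADS)
print(f"sequential: {seq} min, pipelined: {pipe} min, speedup: {seq / pipe:.2f}")
```

With these numbers the dryer (the slowest stage) sets the pace of the pipeline, which is why the speedup stays well below the number of stages.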
+
Pipelining Lessons
+
Implementation

 First, let’s think about how different instructions get executed

Every instruction in this RISC subset can be implemented in at most 5 clock cycles. The 5 clock cycles are as …
+
Performance Balance
 Adjust the organization and architecture to compensate for the mismatch among the capabilities of the various components
 Architectural examples include:
 Increase the number of bits that are retrieved at one time by making DRAMs “wider” rather than “deeper” and by using wide bus data paths
 Change the DRAM interface to make it more efficient by including a cache or other buffering scheme on the DRAM chip
 Reduce the frequency of memory access by incorporating increasingly complex and efficient cache structures between the processor and main memory
 Increase the interconnect bandwidth between processors and memory by using higher speed buses and a hierarchy of buses to buffer and structure data flow
+
Improvements in Chip
Organization and Architecture
 Increase hardware speed of processor
 Fundamentally due to shrinking logic gate size
 More gates, packed more tightly, increasing clock rate
 Propagation time for signals reduced

 Increase size and speed of caches

 Dedicating part of processor chip
 Cache access times drop significantly

 Change processor organization and architecture

 Increase effective speed of instruction execution
 Parallelism


+
Problems with Clock Speed and
Logic Density
 Power
 Power density increases with density of logic and clock
speed
 Dissipating heat

 RC delay
 Speed at which electrons flow limited by resistance and
capacitance of metal wires connecting them
 Delay increases as the RC product increases
 As components on the chip decrease in size, the wire
interconnects become thinner, increasing resistance
 Also, the wires are closer together, increasing capacitance

 Memory latency
 Memory speeds lag processor speeds
+


Multicore
 The use of multiple processors on the same chip provides the potential to increase performance without increasing the clock rate
 Strategy is to use two simpler processors on the chip rather than one more complex processor
 With two processors larger caches are justified
 As caches became larger it made performance sense to create two and then three levels of cache on a chip

+
Many Integrated Core (MIC) and Graphics Processing Unit (GPU)
MIC
 Leap in performance as well as the challenges in developing software to exploit such a large number of cores
 The multicore and MIC strategy involves a homogeneous collection of general purpose processors on a single chip
GPU
 Core designed to perform parallel operations on graphics data
 Traditionally found on a plug-in graphics card, it is used to encode and render 2D and 3D graphics as well as process video
 Used as vector processors for a variety of applications that require repetitive computations
+
2.4 Basic Measures of Computer Performance

+
Performance [2]

 Defining Performance [2, part 1.6]

FIGURE 1.14 The capacity, range, and speed for a number of commercial airplanes. The last column shows the rate at which the airplane transports passengers, which is the capacity times the cruising speed (ignoring range and takeoff and landing times).
[2] Patterson, D.A., and Hennessy, J.L., “Computer Organization and Design: The Hardware/Software Interface”, Morgan Kaufmann Publishers, 5th edition, 2014
+
Performance

Example: Do the following changes to a computer system increase throughput, decrease response time, or both?
1. Replacing the processor in a computer with a faster version
2. Adding additional processors to a system that uses …
+
Performance
 Answer: Decreasing response time almost
always improves throughput. Hence, in case 1,
both response time and throughput are
improved. In case 2, no one task gets work
done faster, so only throughput increases.
 If, however, the demand for processing in the
second case was almost as large as the
throughput, the system might force requests to
queue up. In this case, increasing the
throughput could also improve response time,
since it would reduce the waiting time in the
queue. Thus, in many real computer systems,
changing either execution time or throughput
+

THROUGHPUT = the total amount of work done in a given time

RESPONSE TIME = the time between the start and completion of a task

+
Performance
 Throughput and Response Time

 If X is n times faster than Y, then the execution time on Y is n times longer than it is on X:
$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$
+
Performance [2], 1.6

 CPU Performance and Its Factors

 Instruction Performance
+
Performance- Exercise
 CPU Performance and Its Factors (self study)
 Consider three different processors P1, P2, and
P3 executing the same instruction set with the
clock rates and CPIs given in the following
table
+
Performance

 1.3.1 Which processor has the highest performance expressed in instructions per second?
 1.3.2 If the processors each execute a program in 10 seconds, find the number of cycles and the number of instructions
+
Measuring Performance
 CPU execution time (CPU time) is the time the CPU spends computing for this task; it does not include time spent waiting for I/O or running other programs
 CPU time = user CPU time + system CPU time
 User CPU time: the CPU time spent in the program
 System CPU time: the CPU time spent in the operating system performing tasks on behalf of the program
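A minimal sketch of how the split between user and system CPU time can be observed from a running program, using Python's standard `os.times()`. The workload `busy_work` is a hypothetical stand-in, and the timer resolution depends on the operating system.

```python
import os
import time


def measure(fn):
    """Run fn() and return (user_cpu, system_cpu, elapsed) in seconds."""
    t0 = os.times()
    w0 = time.perf_counter()
    fn()
    w1 = time.perf_counter()
    t1 = os.times()
    return t1.user - t0.user, t1.system - t0.system, w1 - w0


def busy_work():
    # Hypothetical CPU-bound workload, used only for illustration.
    return sum(i * i for i in range(200_000))


user, system, elapsed = measure(busy_work)
cpu_time = user + system  # CPU time = user CPU time + system CPU time
print(f"user={user:.3f}s system={system:.3f}s "
      f"cpu={cpu_time:.3f}s elapsed={elapsed:.3f}s")
```

Elapsed (wall-clock) time also includes waiting for I/O and time given to other programs, so it is normally at least as large as the CPU time for a single-threaded task.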
Table 2.1 Performance Factors and System Attributes


+
Measuring Performance

 clock cycle: Also called tick, clock tick, clock period, clock, cycle. The time for one clock period, usually of the processor clock, which runs at a constant rate.
 clock period: The length of each clock cycle.
 Ex: Designers refer to the length of a clock period both as the time for a complete clock cycle (e.g., 250 picoseconds, or 250 ps) and as the clock rate (e.g., 4 gigahertz, or 4 GHz), which is the inverse of the clock period
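The two figures in the example describe the same clock, since rate and period are reciprocals. A quick check, with illustrative helper names:

```python
def clock_rate_hz(period_s):
    """Clock rate is the inverse of the clock period."""
    return 1.0 / period_s


def clock_period_s(rate_hz):
    """Clock period is the inverse of the clock rate."""
    return 1.0 / rate_hz


rate = clock_rate_hz(250e-12)   # 250 ps period
period = clock_period_s(4e9)    # 4 GHz rate
print(f"{rate / 1e9:.2f} GHz, {period * 1e12:.0f} ps")
```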
+
Measuring Performance

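 $\text{CPU execution time} = \text{CPU clock cycles} \times \text{clock cycle time}$
 or
 $\text{CPU execution time} = \text{CPU clock cycles} / \text{clock rate}$

CPI = clock cycles per instruction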

+
Measuring Performance
 Improving Performance: example
 Our favorite program runs in 10 seconds on
computer A, which has a 2 GHz clock. We are
trying to help a computer designer build a
computer, B, which will run this program in 6
seconds. The designer has determined that a
substantial increase in the clock rate is
possible, but this increase will affect the rest of
the CPU design, causing computer B to require
1.2 times as many clock cycles as computer A
for this program. What clock rate should we
tell the designer to target?
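One way to work this example is to find the cycle count on computer A first, then scale it for B. The sketch below uses only the numbers stated in the problem; the variable names are illustrative.

```python
# Given for computer A
time_a = 10.0          # seconds
rate_a = 2e9           # 2 GHz

# Clock cycles on A = CPU time x clock rate
cycles_a = time_a * rate_a

# Computer B needs 1.2 times as many cycles and must finish in 6 seconds
cycles_b = 1.2 * cycles_a
time_b = 6.0

# Clock rate on B = clock cycles / CPU time
rate_b = cycles_b / time_b
print(f"cycles on A: {cycles_a:.3g}, target clock rate for B: {rate_b / 1e9:.1f} GHz")
```

Computer B must therefore run at twice the clock rate of A to hit the 6-second target.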
+
Measuring Performance

 The Classic CPU Performance Equation
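The equation is usually written as CPU time = instruction count × CPI × clock cycle time, or equivalently instruction count × CPI / clock rate. A small sketch of that relation follows; the sample program figures are hypothetical and chosen only to exercise the function.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time = instruction count x CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


# Hypothetical program: 1 billion instructions, average CPI of 2.0, 4 GHz clock.
t = cpu_time(1_000_000_000, 2.0, 4e9)
print(f"CPU time: {t:.3f} s")
```

The three factors are not independent in practice: a change that lowers the instruction count can raise the CPI, which is why they have to be considered together.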

+
Measuring Performance:
Comparing Code Segments
 A compiler designer is trying to decide between two code sequences for a particular computer. The hardware designers have supplied the following facts:

 For a particular high-level language statement, the compiler writer is considering two code sequences that require the following instruction counts:

 Which code sequence executes the most instructions? Which will be faster? What is the CPI for each sequence?
+

 We can use the equation for CPU clock cycles based on instruction count and CPI to find the total number of clock cycles for each sequence
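The slide's tables are not reproduced in this text, so the sketch below uses assumed values: three instruction classes A, B, C with CPIs of 1, 2, and 3, and two sequences with the instruction counts shown. Only the method (clock cycles = sum of CPI × count per class) is the point.

```python
# Assumed values for illustration; the slide's own tables are not available here.
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}
SEQUENCES = {
    1: {"A": 2, "B": 1, "C": 2},
    2: {"A": 4, "B": 1, "C": 1},
}


def summarize(counts):
    """Return (instruction count, clock cycles, average CPI) for one sequence."""
    instructions = sum(counts.values())
    cycles = sum(CPI_BY_CLASS[cls] * n for cls, n in counts.items())
    return instructions, cycles, cycles / instructions


results = {seq: summarize(counts) for seq, counts in SEQUENCES.items()}
for seq, (instructions, cycles, cpi) in results.items():
    print(f"sequence {seq}: {instructions} instructions, {cycles} cycles, CPI {cpi:.2f}")
```

Under these assumed numbers the sequence with more instructions uses fewer clock cycles, so instruction count alone does not decide which sequence is faster.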
+
Performance
 1.3.3[10] <1.4> We are trying to reduce
the time by 30% but this leads to an
increase of 20% in the CPI. What clock rate
should we have to get this time reduction?
 Forproblems below, use the information in
the following table.
+
Performance

 1.3.4 Find the IPC (instructions per cycle) for each processor.
 1.3.5 Find the clock rate for P2 that reduces its execution time to that of P1.
 1.3.6 Find the number of instructions for P2 that reduces its execution time to that of P3.
+
Performance
 [2]
part 1.8 million instructions per
second (MIPS)
A measurement of program execution
speed based on the number of millions
of instructions. MIPS is computed as the
instruction count divided by the
product of the execution time and 106.
+

A benchmark program is run on a 40 MHz processor. The executed program consists of 100,000 instruction executions, with the following instruction mix and clock cycle count
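The instruction mix table is not reproduced in this text. The sketch below assumes a mix that sums to the stated 100,000 instructions, to show how effective CPI, MIPS rate, and execution time follow from it; the per-class counts and cycle counts are assumptions, not taken from the slide.

```python
CLOCK_RATE_HZ = 40e6  # 40 MHz, from the problem statement

# (instruction type, instruction count, cycles per instruction): assumed mix
MIX = [
    ("integer arithmetic", 45_000, 1),
    ("data transfer", 32_000, 2),
    ("floating point", 15_000, 2),
    ("control transfer", 8_000, 2),
]

instruction_count = sum(count for _, count, _ in MIX)
total_cycles = sum(count * cycles for _, count, cycles in MIX)

effective_cpi = total_cycles / instruction_count
execution_time = total_cycles / CLOCK_RATE_HZ            # seconds
mips_rate = instruction_count / (execution_time * 1e6)   # MIPS definition

print(f"CPI={effective_cpi:.2f}  time={execution_time * 1e3:.3f} ms  MIPS={mips_rate:.1f}")
```

The same MIPS figure also falls out of clock rate / (CPI × 10^6), which is a useful cross-check.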


+
Performance
                     Machine A   Machine B   Machine C
Program 1 (secs)             1          10          20
Program 2 (secs)          1000         100          20
Total time (secs)         1001         110          40

 Which is better?
 By how much?
 Are the programs equally important?
+
Performance
MIPS
 Easy to understand
 Cannot compare computers with different instruction sets (one CISC instruction may do the work of many RISC instructions)
 MIPS varies between programs on the same computer
 Finally, and most importantly, if a new program executes more instructions but each instruction is faster, MIPS can vary independently from performance
+
Performance
 [3] part 2.5 PERFORMANCE ASSESSMENT
 MFLOPS :
 millions of floating-point operations per
second

 BENCHMARKS:
 Measures such as MIPS and MFLOPS have
proven inadequate for evaluating the
performance of processors. Because of
differences in instruction sets, the instruction
execution rate is not a valid means of
comparing the performance of different
+
Performance
 Some Aggregate Job Mix Options
 Arithmetic mean: To obtain a reliable comparison of the performance of various computers, it is preferable to run a number of different benchmark programs on each machine and then average the results. For example, if there are m different benchmark programs, then a simple arithmetic mean can be calculated as follows:
$R_A = \frac{1}{m} \sum_{i=1}^{m} R_i$
where $R_i$ is the high-level language instruction execution rate for the ith benchmark program.
+
Performance
 30 mph for the first 10 miles
 90 mph for the next 10 miles
 Average speed? (30 + 90)/2 = 60 mph. WRONG!
 Average speed = total distance / total time = 20/(10/30 + 10/90) = 45 mph
+
Amdahl’s Law
 Gene Amdahl
 Deals with the potential speedup of a program using multiple processors compared to a single processor
 Illustrates the problems facing industry in the development of multi-core machines
 Software must be adapted to a highly parallel execution environment to exploit the power of parallel processing
 Can be generalized to evaluate and design technical improvement in a computer system
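The slide does not state the formula. In its usual form, if a fraction f of the execution time can be parallelized perfectly across N processors, speedup = 1 / ((1 − f) + f/N). A sketch with an assumed f of 0.9:

```python
def amdahl_speedup(parallel_fraction, processors):
    """Speedup = 1 / ((1 - f) + f / N), assuming no scheduling overhead."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


f = 0.9  # assumed parallelizable fraction, for illustration only
for n in (1, 2, 8, 64, 1024):
    print(f"N={n:5d}  speedup={amdahl_speedup(f, n):.2f}")

# As N grows, the speedup approaches 1 / (1 - f).
upper_bound = 1.0 / (1.0 - f)
print(f"upper bound: {upper_bound:.1f}")
```

Even with 90% of the work parallelized, the serial 10% caps the speedup near 10, which is the point the slide makes about adapting software for multicore machines.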

+
Little’s Law
 Fundamental and simple relation with broad applications
 Can be applied to almost any system that is statistically
in steady state, and in which there is no leakage
 Queuing system
 If server is idle an item is served immediately, otherwise an
arriving item joins a queue
 There can be a single queue for a single server or for multiple
servers, or multiple queues with one being for each of
multiple servers

 Average number of items in a queuing system equals the average rate at which items arrive multiplied by the average time that an item spends in the system
 Relationship requires very few assumptions
 Because of its simplicity and generality it is extremely useful
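Written as L = λW, the relation can be rearranged to find whichever quantity is unknown. The sketch below uses hypothetical numbers; the function names are illustrative.

```python
def items_in_system(arrival_rate, time_in_system):
    """Little's Law: L = lambda * W."""
    return arrival_rate * time_in_system


def time_in_system(items, arrival_rate):
    """Rearranged: W = L / lambda."""
    return items / arrival_rate


# Hypothetical server: 10 requests arrive per second, each spends 0.5 s in the system.
L = items_in_system(arrival_rate=10.0, time_in_system=0.5)
W = time_in_system(items=L, arrival_rate=10.0)
print(f"average items in system: {L}, average time in system: {W} s")
```

The law holds for long-run averages in steady state and says nothing about the distribution of arrivals or service times.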
+
Calculating the Mean
 The use of benchmarks to compare systems involves calculating the mean value of a set of data points related to execution time
 The three common formulas used for calculating a mean are:
• Arithmetic
• Geometric
• Harmonic
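The three means can be compared side by side with Python's standard `statistics` module (Python 3.8 or later). The data values are hypothetical.

```python
from statistics import fmean, geometric_mean, harmonic_mean

# Hypothetical measurements, for illustration only.
values = [2.0, 4.0, 8.0]

am = fmean(values)            # (x1 + ... + xn) / n
gm = geometric_mean(values)   # (x1 * ... * xn) ** (1 / n)
hm = harmonic_mean(values)    # n / (1/x1 + ... + 1/xn)

print(f"arithmetic={am:.3f}  geometric={gm:.3f}  harmonic={hm:.3f}")
```

For positive values the ordering is always arithmetic ≥ geometric ≥ harmonic, with equality only when all values are the same.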
+
Arithmetic Mean
 An Arithmetic Mean (AM) is an appropriate measure if the sum of all the measurements is a meaningful and interesting value
 The AM is a good candidate for comparing the execution time performance of several systems
 For example, suppose we were interested in using a system for large-scale simulation studies and wanted to evaluate several alternative products. On each system we could run the simulation multiple times with different input values for each run, and then take the average execution time across all runs. The use of multiple runs with different inputs should ensure that the results are not heavily biased by some unusual feature of a given input set. The AM of all the runs is a good measure of the system’s performance on simulations, and a good number to use for system comparison.
 The AM used for a time-based variable, such as program execution time, has the important property that it is directly proportional to the total time
 If the total time doubles, the mean value doubles
Table 2.2 A Comparison of Arithmetic and Harmonic Means for Rates

 Desirable characteristics of a benchmark program:
1. It is written in a high-level language, making it portable across different machines
2. It is representative of a particular kind of programming domain or paradigm, such as systems programming, numerical programming, or commercial programming
3. It can be measured easily
4. It has wide distribution
+
Standard Performance Evaluation
Corporation (SPEC)
 Benchmark suite
 A collection of programs, defined in a high-level language
 Together attempt to provide a representative test of a
computer in a particular application or system
programming area

 SPEC
 An industry consortium
 Defines and maintains the best known collection of
benchmark suites aimed at evaluating computer systems
 Performance measurements are widely used for comparison
and research purposes

+
SPEC CPU2006
 Best known SPEC benchmark suite
 Industry standard suite for processor intensive applications
 Appropriate for measuring performance for applications that spend most of their time doing computation rather than I/O
 Consists of 17 floating point programs written in C, C++, and Fortran and 12 integer programs written in C and C++
 Suite contains over 3 million lines of code
 Fifth generation of processor intensive suites from SPEC

Table 2.5 SPEC CPU2006 Integer Benchmarks
(Table can be found on page 69 in the textbook.)

SPEC CPU2006 Floating-Point Benchmarks
(Table can be found on page 70 in the textbook.)
+
Terms Used in SPEC
Documentation
 Benchmark
 A program written in a high-level language that can be compiled and executed on any computer that implements the compiler
 System under test
 This is the system to be evaluated
 Reference machine
 This is a system used by SPEC to establish a baseline performance for all benchmarks
 Each benchmark is run and measured on this machine to establish a reference time for that benchmark
 Base metric
 These are required for all reported results and have strict guidelines for compilation
 Peak metric
 This enables users to attempt to optimize system performance by optimizing the compiler output
 Speed metric
 This is simply a measurement of the time it takes to execute a compiled benchmark
 Used for comparing the ability of a computer to complete single tasks
 Rate metric
 This is a measurement of how many tasks a computer can accomplish in a certain amount of time
 This is called a throughput, capacity, or rate measure
 Allows the system under test to execute simultaneous tasks to take advantage of multiple processors
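A sketch of how a speed metric can be formed from reference times. For each benchmark the ratio is reference time divided by the time on the system under test, and the per-benchmark ratios are combined with a geometric mean, which is the convention SPEC CPU2006 uses. The benchmark names and all timings below are hypothetical.

```python
from statistics import geometric_mean

# (benchmark name, reference time in s, time on system under test in s)
# All entries are hypothetical and only illustrate the calculation.
RUNS = [
    ("bench_a", 9770.0, 500.0),
    ("bench_b", 9650.0, 400.0),
    ("bench_c", 8050.0, 350.0),
]

# Ratio > 1 means the system under test is faster than the reference machine.
ratios = [ref_time / sut_time for _, ref_time, sut_time in RUNS]
overall = geometric_mean(ratios)

for (name, _, _), ratio in zip(RUNS, ratios):
    print(f"{name}: ratio {ratio:.2f}")
print(f"overall speed metric (geometric mean): {overall:.2f}")
```

Because the ratios are normalized values, the geometric mean gives the same ranking of systems regardless of which machine is chosen as the reference.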
+
Summary: Performance Issues (Chapter 2)
 Designing for performance
 Microprocessor speed
 Performance balance
 Improvements in chip organization and architecture
 Multicore
 MICs
 GPGPUs
 Amdahl’s Law
 Little’s Law
 Basic measures of computer performance
 Clock speed
 Instruction execution rate
 Calculating the mean
 Arithmetic mean
 Harmonic mean
 Geometric mean
 Benchmark principles
 SPEC benchmarks
