Digital Design &
Computer Organization
BSC302
Pradeep H K
Dept. of ISE,
JNNCE, Shivamogga
Text Book
Computer Organization, Carl Hamacher, Zvonko Vranesic and
Safwat Zaky, 5th Edition, McGraw-Hill
Module 3: Basic Structure of Computers: Functional Units, Basic Operational Concepts, Bus Structures, Performance: Processor
Clock, Basic Performance Equation, Clock Rate, Performance Measurement. Machine Instructions and Programs: Memory
Locations and Addresses, Memory Operations, Instructions and Instruction Sequencing, Addressing Modes.
1.2, 1.3, 1.4, 1.6, 2.2, 2.3, 2.4, 2.5
Module 4: Input/output Organization: Accessing I/O Devices, Interrupts–Interrupt Hardware, Enabling and Disabling Interrupts,
Handling Multiple Devices, Direct Memory Access: Bus Arbitration, Speed, size and Cost of memory systems. Cache Memories–
Mapping Functions.
4.1, 4.2.1, 4.2.2, 4.2.3, 4.4, 5.4, 5.5.1
Module 5: Basic processing unit , Pipeline
7.1, 7.2, 8.1
Contents
- Introduction to Computers
- Various Application of Computers
- Generation of computers
- Classes of Computing Applications
- Understanding Program Performance
- High-Level Language to the Language of Hardware
- Inside the Computer
- Communication with Other Computers
- Technologies for Building Processors and Memories
- Real Stuff : Manufacturing Pentium 4 Chips
- Basic Operational Concepts
Definition of a Computer
What is it?
A Computer is a machine capable of manipulating data in
complex, programmable ways.
Generations
• Generation Zero, 1642-1940, defined by Mechanical computers
• First generation Computer, 1940-1956, defined by Vacuum
tubes
• Second generation Computer, 1956-1963, defined by Transistors
• Third generation Computer, 1964-1971, defined by Integrated
Circuits (IC)
• Fourth generation computer, 1971-today, defined by
Microprocessors
• Fifth generation computer, Present and Beyond, defined by
Artificial Intelligence (AI), Cloud computing…
Generation 0 :
Mechanical Computers (1642-1945)
Abacus
Pascal’s Mechanical Calculator
Leibniz Calculator
Programmable devices
Jacquard’s loom
Babbage’s Difference Engine
First Generation (1940-1956)
Stored-Program Concept (John von Neumann)
Vacuum tubes
Second Generation (1956-1963)
Transistor
Development of programming languages (COBOL,
FORTRAN)
Third Generation (1964-1973)
IC technology
Microprogramming, parallelism, pipelining
Cache memory, Virtual memory
Fourth Generation (1973-1985)
VLSI technology
Microprocessors
LAN, WAN, Internet
Beyond the Fourth Generation(1985-??)
AI
Parallel processing
Networking
Client – Server
Distributed computing
Cloud computing
Classes of Computing Applications
Desktop Computers
Workstations
Mainframes
Supercomputers
Minicomputers
Servers
Embedded computers
Desktop Computer
A Computer designed for use by an individual, usually
incorporating a graphics display, keyboard and mouse
Servers
A computer used for running larger programs for multiple
users often simultaneously and typically accessed only
via network
Supercomputers
A class of computers with the highest performance and
cost; they are configured as servers and typically cost
millions of dollars
Tsubame
Earth simulator
Param
Supercomputer
Param 8000(1991)
Embedded computer
A computer inside another device used for running one
predetermined application or collection of software
Handheld/Pocket PC PDA Tablet PC
Desktop Laptop
Workstation
Mainframe
Supercomputer
Computer Architecture | Computer Organization
Concerned with the structure and behaviour of the computer system as seen by the user. | Concerned with how the hardware components are connected together to form a computer system.
Blueprint for design. | It is decided after the architecture.
It involves logical components such as Instruction Set, Addressing Modes, etc. | It involves physical units such as circuit design, adders, signals, peripherals, etc.
Describes how a computer system is designed. | Describes how a computer system works.
Interface between hardware and software. | Interconnection of components.
Deals with high-level design issues. | Deals with low-level design issues.
Logical aspects of a computer system. | Defines the physical aspects of the computer system.
Functional behaviour of the computer system. | Deals with the various structural relationships.
Also called an instruction set architecture (ISA). | Also called microarchitecture.
Concerned with - What to do? (Instruction Set) | Concerned with - How to do? (implementation of the architecture)
Understanding Program Performance
The performance of a program depends on a combination
of the effectiveness of the algorithms used in the program,
the software systems used to create and translate the
program into machine instructions, and the effectiveness
of the computer in executing those instructions
Computer System
Application software
System software
Hardware
System Software
Software that provides services that are commonly useful,
including operating system, compilers and assemblers
Operating System
Supervising program that manages the resources of a
computer for the benefit of the programs that run on that
machine
Compiler
A Program that translates high-level language statements
into assembly language statements
Specific to hardware
Specific to OS
Not portable
Cross compilers – Compile in one H/W and OS, execute
in different H/W and OS
High-Level Language to the Language of
Hardware
Assembler – A program that translates a symbolic version
of instructions into the binary version
Assembly language – A symbolic representation of machine
instructions
High-Level Language to the Language of
Hardware
High-level programming language – A portable language such as
C, Fortran, or Java, composed of words and algebraic notation,
that can be translated by a compiler into assembly language
Under the covers
Five Classic components of computer
- Input
- Output
- Memory
- Datapath
- Control
(The datapath and control together form the processor.)
Input
A mechanism through which the computer is fed
information, such as the keyboard, mouse, scanners,
etc…
Output Device
A Mechanism that conveys the result of a computation to
a user or another computer
Opening the Box
Motherboard – A plastic board containing packages of
ICs, including processor, cache memory, and connectors
for I/O devices such as networks and disks
IC (Integrated Circuit) – Also called a chip. A device
combining dozens to millions of transistors
Opening the Box
Memory – The storage area in which programs are kept when they
are running and that contains the data needed by the running
programs
Central Processor Unit (CPU)
Also called processor. The active part of the computer, which
contains the datapath and control and which adds numbers, tests
numbers, signals I/O devices to activate, and so on..
Datapath – The component of the processor that performs
arithmetic operations
Control – The component of the processor that commands
the datapath, memory, and I/O devices according to the
instructions of the program
DRAM – Memory built as an integrated circuit, it provides
random access to any location
Cache Memory – A small, fast memory that acts as a buffer
for a slower, larger memory
A safe place for data
Volatile Memory – Storage, such as DRAM, that retains
data only if it is receiving power
Nonvolatile memory – A form of memory that retains data
even in the absence of a power source and that is used to
store programs between runs. Magnetic disk is nonvolatile
and DRAM is not.
Primary Memory : Also called main memory. Volatile
memory used to hold programs while they are running;
typically consists of DRAM in today’s computers
Secondary Memory : Nonvolatile memory used to store
programs and data between runs; typically consists of
magnetic disks in today's computers
Magnetic disk ( Also called Hard disk )
A form of nonvolatile secondary memory composed of
rotating platters coated with a magnetic recording material
Communicating with other computers
Network
LAN
WAN
Advantages
- Communication
- Resource Sharing
- Nonlocal access
Technologies for Building processors and
Memories
Silicon – A natural element which is a semiconductor
Semiconductor – A substance that does not conduct
electricity well
Real Stuff: Manufacturing Pentium 4 Chips
Silicon ingot
Slicer
Blank wafers
Patterned wafers
Wafer tester
Dicer
Tested dies
Moore’s law in Microprocessors
[Figure: Transistor count in millions (MT, log scale from 0.001 to 1000) versus year (1970 to 2010) for Intel processors: 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® proc, P6. Annotation: 2X growth in 1.96 years.]
Transistors on Lead Microprocessors double every 2 years
Courtesy, Intel
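The doubling rate quoted on the chart can be turned into a rough projection. The sketch below assumes idealized exponential growth with the 1.96-year doubling period from the chart; the function name and the starting count of 1 million transistors are hypothetical, chosen only for illustration.

```python
def projected_transistors(initial_count, years, doubling_period=1.96):
    """Project a transistor count, assuming it doubles every `doubling_period` years."""
    if doubling_period <= 0:
        raise ValueError("doubling period must be positive")
    return initial_count * 2 ** (years / doubling_period)


# Hypothetical starting point: 1 million transistors.
# After ten doubling periods (19.6 years) the count has grown by about 2**10.
growth = projected_transistors(1_000_000, 10 * 1.96) / 1_000_000
print(f"growth factor after 19.6 years: {growth:.0f}x")
```

Real processors only follow this curve approximately, so the result is an order-of-magnitude estimate at best.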
Metrics to evaluate a Computer
Metrics
Speed – delay, frequency
Power Dissipation
Energy to perform a function
Cost
Scalability
Reliability
Functional Units
[Figure: Basic functional units of a computer: Input, Output, Memory, ALU, and Control units, connected by an interconnection network.]
Information Handled by a Computer
Instructions/machine instructions
Govern the transfer of information within a computer as well as between the computer
and its I/O devices
Specify the arithmetic and logic operations to be performed
Program
Data
Used as operands by the instructions
Source program
Encoded in binary code – 0 and 1
Memory Unit
Store programs and data
Two classes of storage
Primary storage
Fast
Programs must be stored in memory while they are being executed
Large number of semiconductor storage cells
Processed in words
Address
RAM and memory access time
Memory hierarchy – cache, main memory
Secondary storage –
Larger
Cheaper
No power to hold data
Memory Hierarchy
[Figure: Computer memory hierarchy, ranging from fast, expensive, volatile storage to slow, cheap, non-volatile storage.]
Number of bits in each word is called word length of the computer.
Programs must reside in the memory during execution. Instructions and data
can be written into the memory or read out under the control of processor.
Memory in which any location can be reached in a short and fixed amount of
time after specifying its address is called random-access memory (RAM).
The time required to access one word is called the Memory Access Time.
Memory which is only readable by the user and contents of which can’t be
altered is called read only memory (ROM)
Caches are small, fast RAM units that sit between the CPU and main memory.
Arithmetic and Logic Unit (ALU)
Most computer operations are executed in the ALU of the
processor.
Load the operands into memory
Bring them to the processor
Perform operation in ALU
Store the result back to memory or retain in the processor.
Registers
Fast control of ALU
Control Unit
All computer operations are controlled by the control unit.
The timing signals that govern the I/O transfers are also generated by
the control unit.
Control unit is usually distributed throughout the machine instead of
standing alone.
Operations of a computer:
Accept information in the form of programs and data through an input unit and store it in
the memory
Fetch the information stored in the memory, under program control, into an ALU, where
the information is processed
Output the processed information through an output unit
Control all activities inside the machine through a control unit
Control Unit
FETCH-DECODE-EXECUTE
Central Processing Unit
CPU and Clock cycles
The operations of a computer
Accept information in the form of programs and data through an input
unit and store it in the memory
Fetch the instruction stored in the memory, under program control,
decode it in the IR, and process it in the ALU
Output the processed information through an output unit
All activities pertaining to processing and data movement inside the
computer are governed by the Control Unit.
Basic Operational Concepts
To perform a given task, an appropriate program consisting
of a list of instructions is stored in the memory. Individual
instructions are brought from the memory into the
processor, which executes the specified operations.
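Add LOCA, R0
Adds the operand at memory location LOCA to the contents of
register R0 and places the sum in R0. The same effect can be
achieved with two instructions:
Load LOCA, R1
Add R1, R0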
[Figure: Connections between the processor and the memory. The processor contains the MAR, MDR, Control, PC, IR, ALU, and n general purpose registers R0 to Rn-1; the MAR and MDR connect it to the Memory.]
Terms used
IR - instruction currently being executed
PC - memory address of next instruction to be
fetched and executed
MAR - address of location to be accessed
MDR or MBR - data to be written into or read out of the
addressed location
Interrupts and ISRs
Functions…
Instruction Register: contains the instruction that is
being executed. Its output is available to the control
circuits, which generate the timing signals that control
the actual processing circuits needed to execute the
instruction.
Program Counter: a register that contains the
memory address of the next instruction to be fetched and
executed. During the execution of the current instruction,
the contents of the program counter are updated to
correspond to the address of the next instruction.
Memory Address Register (MAR) : holds the address of
the memory location to or from which data is to be
transferred.
Memory Data Register (MDR): contains the data to be
written into or read-out of the addressed memory location.
General- Purpose Registers(GPR) : are used for
holding data, intermediate results of operations. They are
also known as scratch-pad registers.
Steps involving instruction fetch and
execution
INSTRUCTION FETCH
Execution of a program starts by setting the PC to point to
the first instruction of the program.
The contents of PC are transferred to the MAR and a
Read control signal is sent to the memory
The addressed word (here it is the first instruction of the
program) is read out of memory and loaded into the MDR
The contents of MDR are transferred to the IR for
instruction decoding
INSTRUCTION EXECUTION
The operation field of the instruction in IR is examined to
determine the type of operation to be performed by the
ALU
The specified operation is performed by obtaining the
operand(s) from the memory locations or from GP
registers.
Fetching the operands from the memory requires sending the
memory location address to the MAR and initiating a Read cycle.
The operand is read from the memory into the MDR and then
from MDR to the ALU.
The ALU performs the desired operation on one or more
operands fetched in this manner and sends the result either to
memory location or to a GP register.
The result is sent to MDR and the address of the location where
the result is to be stored is sent to MAR and Write cycle is
initiated.
Thus, the execute cycle ends for the current instruction and the
PC is incremented to point to the next instruction for a new fetch
cycle.
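The fetch and execute steps above can be sketched as a toy simulator. This is a minimal illustration, not the textbook's machine: the tuple encoding of instructions, the function name run, and the Store instruction are assumptions made for the example. Only Load and Add come from the instruction sequence shown earlier.

```python
def run(program, data, registers):
    """Run a toy program and return (registers, memory) after execution.

    Hypothetical encoding: each instruction is a tuple (operation, source, destination).
    Instructions occupy addresses 0, 1, 2, ...; data words live at named locations.
    """
    memory = dict(enumerate(program))
    memory.update(data)
    regs = list(registers)

    pc = 0                                # PC points to the first instruction
    while pc < len(program):
        # --- Instruction fetch ---
        mar = pc                          # contents of PC -> MAR
        mdr = memory[mar]                 # Read: addressed word -> MDR
        ir = mdr                          # MDR -> IR for decoding

        # --- Instruction execution ---
        operation, source, destination = ir
        if operation == "Load":           # Load LOC, Ri : Ri <- [LOC]
            mar = source                  # operand address -> MAR, Read cycle
            mdr = memory[mar]
            regs[destination] = mdr
        elif operation == "Add":          # Add Ri, Rj : Rj <- [Ri] + [Rj]
            regs[destination] = regs[source] + regs[destination]
        elif operation == "Store":        # Store Ri, LOC : LOC <- [Ri]
            mar = destination             # result address -> MAR
            mdr = regs[source]            # result -> MDR
            memory[mar] = mdr             # Write cycle
        else:
            raise ValueError(f"unknown operation: {operation}")

        pc += 1                           # point to the next instruction
    return regs, memory


# Load LOCA, R1 followed by Add R1, R0, with LOCA = 5 and R0 = 10 initially.
program = [("Load", "LOCA", 1), ("Add", 1, 0)]
regs, memory = run(program, data={"LOCA": 5}, registers=[10, 0])
print(regs)  # [15, 5]
```

The MAR and MDR variables are kept explicit so that every memory access visibly passes through them, mirroring the register transfers listed in the steps.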
Bus Structures
BUS – Group of lines (wires) that serves as a connecting
path for several devices
Single-bus structure
Data Bus: It is used for transmission of data. The number
of data lines corresponds to the number of bits in a word.
Address Bus: It carries the address of the main memory
location from where the data can be accessed.
Control Bus: it is used to indicate the direction of data
transfer and to coordinate the timing of events during the
transfer
Bus Structures
Single-bus structure
Two-bus structure
[Figure: Single-bus structure. The Input, Output, Memory, and Processor units are all connected to one common bus.]
• Only two units can actively use the bus at any given time
• Devices connected to bus vary in speed
Advantages of Single-Bus Structure
Low Cost
Flexibility for attaching peripheral devices
Drawback
Low operating speed
Found in small computers such as minicomputers and
microcomputers.
TWO – BUS STRUCTURE
[Figure: Two-bus structure showing the Input, Output, Processor, and Memory units and the I/O bus.]
The bus structure performs two distinct functions: connecting
the I/O units with memory and connecting the processor unit
with memory. The processor interacts with the memory
through a memory bus and handles input/output functions
over the I/O bus.
The main advantage of this structure is good operating speed,
but it comes at a higher cost.
Performance
Performance - measure of how quickly the computer can
execute programs
Speed – Design of Hardware and its machine Language
Best Performance – Design of compiler, Machine
instruction set, and the hardware in a coordinated way
• Execution depends on all units in a computer system
• Processor Time depends on the hardware involved in the execution of
individual machine Instruction
[Figure: Main memory, cache memory, and processor connected by a bus.]
Processor Clock
Processor circuits are controlled by a timing signal called the
clock
Clock cycle – regular time interval defined by the clock
P – length of one clock cycle; it affects processor
performance
Clock rate R = 1/P, measured in cycles per second
Hertz (Hz) – cycles per second
500 million cycles per second – 500 MHz
1250 million cycles per second – 1.25 GHz
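The clock period P and the clock rate R are inverses of each other (P = 1/R). A minimal sketch, using the two clock rates listed above; the function name clock_period_ns is an assumption made for illustration.

```python
def clock_period_ns(clock_rate_hz):
    """Return the length of one clock cycle in nanoseconds for a clock rate in Hz."""
    if clock_rate_hz <= 0:
        raise ValueError("clock rate must be positive")
    return 1e9 / clock_rate_hz


for label, rate_hz in [("500 MHz", 500e6), ("1.25 GHz", 1.25e9)]:
    print(f"{label}: {clock_period_ns(rate_hz):.1f} ns per cycle")
```

This gives 2.0 ns per cycle at 500 MHz and 0.8 ns per cycle at 1.25 GHz.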
The System clock speed and instruction Cycle
Clock Rate
Increase clock rate
Improve the integrated-circuit (IC) technology to make the
circuits faster
Reduce the amount of processing done in one basic step
(however, this may increase the number of basic steps
needed)
Basic Performance Equation
T = (N x S) / R
T – processor time(program execution time)
N – number of instruction executions
S – avg. no. of basic steps to execute 1 machine instruction
R – clock rate
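The basic performance equation can be evaluated directly. The sketch below is illustrative only: the function name and the sample values (1 million instruction executions, 4 basic steps each, 500 MHz clock) are hypothetical and not taken from the text.

```python
def processor_time(n, s, r):
    """Return T = (N x S) / R in seconds.

    n: number of instruction executions
    s: average number of basic steps (clock cycles) per instruction
    r: clock rate in cycles per second
    """
    if r <= 0:
        raise ValueError("clock rate must be positive")
    return n * s / r


# Hypothetical program: 1,000,000 instructions, 4 steps each, 500 MHz clock.
t = processor_time(n=1_000_000, s=4, r=500e6)
print(f"T = {t * 1e3:.1f} ms")
```

The equation shows the three levers for reducing T: fewer instruction executions (N), fewer steps per instruction (S), or a higher clock rate (R).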
Performance Measurement
T is difficult to compute.
Measure computer performance using benchmark programs.
System Performance Evaluation Corporation (SPEC) selects and publishes representative
application programs for different application domains, together with test results for many
commercially available computers.
Compile and run (no simulation)
Reference computer: SPARCstation 10/40 (40 MHz SuperSPARC with no L2 cache)
Performance Assessment
• If you were running a program on two different processors, we
would say that the faster is the one that gets the job done first.
• Execution time: The total time required for the computer to complete
a task
• includes disk accesses, memory accesses, I/O activities, operating system overhead, CPU execution
Example
• If computer A runs a program in 10 seconds and computer B runs the
same program in 15 seconds, how much faster is A than B?
• Execution time of A = 10 s
• Execution time of B = 15 s
• Performance A / Performance B = Execution time B / Execution time A = 15 s / 10 s = 1.5
• Computer A is therefore 1.5 times faster than B.
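The comparison above rests on performance being the inverse of execution time. A minimal sketch of that ratio; the function name is an assumption made for illustration.

```python
def relative_performance(time_a, time_b):
    """Return how many times faster A is than B, given execution times in seconds."""
    if time_a <= 0 or time_b <= 0:
        raise ValueError("execution times must be positive")
    return time_b / time_a


print(relative_performance(10, 15))  # 1.5
```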
Performance Assessment
• All computers are governed by a clock that determines when events take
place in the hardware.
• These discrete time intervals are called clock cycles.
• The rate of clock pulses is known as the clock rate, or clock speed (in Hertz),
which is the inverse of the clock period. For example, a 1 GHz processor
receives 1 billion pulses per second.
CPU Performance and Its Factor
Example
• Our favorite program runs in 10 seconds on computer A, which has a
2 GHz clock. We are trying to help a computer designer build a
computer, B, which will run this program in 6 seconds. The designer
has determined that a substantial increase in the clock rate is
possible, but this increase will affect the rest of the CPU design,
causing computer B to require 1.2 times as many clock cycles as
computer A for this program. What clock rate should we tell the
designer to target?
CPU execution time for a program = ( CPU clock cycles for a program ) / ( Clock rate )
Example…
Find the number of clock cycles required for the program on A
10 sec = ( CPU clock cycles of A ) / ( 2 x 10^9 cycles/second )
CPU clock cycles of A = 10 sec x 2 x 10^9 cycles/second
= 20 x 10^9 cycles
Find the clock rate for B
CPU time of B = ( CPU clock cycles of B ) / ( Clock rate of B )
6 sec = ( 1.2 x 20 x 10^9 cycles ) / ( Clock rate of B )
Clock rate of B = ( 1.2 x 20 x 10^9 cycles ) / ( 6 sec )
= 4 x 10^9 cycles/second = 4 GHz
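The worked example can be checked numerically. The sketch below follows the same two steps (cycles on A, then the clock rate B needs); the function and parameter names are assumptions made for illustration.

```python
def required_clock_rate(time_a, rate_a, cycle_factor, target_time_b):
    """Return the clock rate (Hz) that computer B needs.

    time_a: execution time on A in seconds
    rate_a: clock rate of A in Hz
    cycle_factor: how many times more clock cycles B needs than A
    target_time_b: desired execution time on B in seconds
    """
    cycles_a = time_a * rate_a            # CPU clock cycles of A
    cycles_b = cycle_factor * cycles_a    # CPU clock cycles of B
    return cycles_b / target_time_b       # clock rate = cycles / time


rate_b = required_clock_rate(time_a=10, rate_a=2e9, cycle_factor=1.2, target_time_b=6)
print(f"Clock rate of B = {rate_b / 1e9:.1f} GHz")
```

With the numbers from the example this gives 4.0 GHz, twice the clock rate of A.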
Performance Assessment
• The term clock cycles per instruction, which is the average number of
cycles each instruction takes to execute, is often abbreviated as CPI.
Performance Assessment –
Millions of Instructions per Second (MIPS)
Rate
A common measure of performance for a processor is the rate at which
instructions are executed, expressed in millions of instructions per
second (MIPS) and referred to as the MIPS rate.
MIPS rate = f / (CPI x 10^6), where f is the clock rate in Hz
Problem -1
A benchmark program runs on a system having clock rate of 40MHz. The
program consists of 100000 executable instructions with following
instruction mix and clock cycle count for each instruction type.
Instruction Type | Instruction Count (IC) | Cycles Per Instruction (CPI)
Integer Arithmetic | 45000 | 1
Data Transfer | 32000 | 2
Floating Point | 15000 | 2
Control Transfer | 8000 | 2
Determine the effective CPI, MIPS and execution time for the program
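Solution -1
• To calculate the effective CPI
• CPI = (45000 x 1 + 32000 x 2 + 15000 x 2 + 8000 x 2) / 100000
= 155000 / 100000 = 1.55
• To calculate MIPS
• MIPS = (40 x 10^6) / (1.55 x 10^6) = 25.8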
Solution -1
• To calculate Execution Time
• Execution Time = Instruction Count x CPI x Cycle Time
= Ic x CPI x (1/f)
= (100000 x 1.55) / (40 x 10^6)
= 0.003875 s
= 3.875 ms
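The three quantities in Problem 1 can be computed from the instruction mix in a few lines. This is a sketch under the problem's stated figures; the function names are assumptions made for illustration.

```python
def effective_cpi(mix):
    """Return the average cycles per instruction for a list of (count, cpi) pairs."""
    total_instructions = sum(count for count, _ in mix)
    total_cycles = sum(count * cpi for count, cpi in mix)
    return total_cycles / total_instructions


def mips_rate(clock_hz, cpi):
    """Return the MIPS rate: clock rate / (CPI x 10^6)."""
    return clock_hz / (cpi * 1e6)


def execution_time(mix, clock_hz):
    """Return the execution time in seconds: total cycles / clock rate."""
    return sum(count * cpi for count, cpi in mix) / clock_hz


# Instruction mix from Problem 1: (instruction count, cycles per instruction).
mix = [
    (45000, 1),  # Integer Arithmetic
    (32000, 2),  # Data Transfer
    (15000, 2),  # Floating Point
    (8000, 2),   # Control Transfer
]
clock_hz = 40e6

cpi = effective_cpi(mix)
print(f"CPI  = {cpi:.2f}")
print(f"MIPS = {mips_rate(clock_hz, cpi):.1f}")
print(f"Time = {execution_time(mix, clock_hz) * 1e3:.3f} ms")
```

The printed values are a CPI of 1.55, a MIPS rate of 25.8, and an execution time of 3.875 ms.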
Problem - 2
Netcom Systems developed two computer systems, C1 and C2, where C1 has
machine instructions for floating-point (FP) operations as part of its processor ISA
and C2 does NOT have floating-point instructions as part of its processor ISA. Since
C2 does not have floating-point instructions, all floating-point operations will be
implemented at the software level with non-FP instructions. You can assume that both
systems are operating at a clock speed of 300 MHz. We are trying to run the SAME
program on both systems, which has the following proportion of instructions:
Problem – 2…
1. Find the MIPS for both C1 and C2.
2. Assume that there are 9000 instructions in the program that is
getting executed on C1 and C2. What will be the CPU program
execution time on each system C1 and C2 ?
3. For the two systems to have the fastest speed and at the same time have
equal speed, what would be the possible mixture of the instructions that
would be required in the program? WHY?
Solution-2
a) Find the MIPS for both C1 and C2.
For C1:
• CPI = 0.16*6 + 0.1*8 + 0.08*10 + 0.66*3 = 4.54
• MIPS = 300 * 10^6 / (4.54 * 10^6) = 66.08
For C2:
• CPI = 0.16*20 + 0.1*32 + 0.08*66 + 0.66*3 = 13.66
• MIPS = 300 * 10^6 / (13.66 * 10^6) = 21.96
Solution-2
b)Assume that there are 9000 instructions in the program that is getting
executed on C1 and C2. What will be the CPU program execution time on
each system C1 and C2 ?
• CPU time for C1 of the program execution
= No of instructions * CPI / Clock rate
= 9000 * 4.54 / (300*10^6)
= 0.136 ms
• CPU time for C2 of the program execution
= No of instructions * CPI / Clock rate
= 9000 * 13.66/ (300 * 10^6)
= 0.41 ms
Solution-2
c)For the two systems to have the fastest speed and at the same time
have equal speed, what would be the possible mixture of the instructions
that would be required in the program? WHY?
• For C1 and C2 to be equally fast, the program must NOT contain any
floating-point instructions, because the CPI for non-floating-point
instructions is the same on C1 and C2.
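All three parts of Solution-2 can be reproduced from the fractions and CPI values used in the calculations above. The sketch below assumes those figures; the instruction-type names are not given in the solution, so the rows are left unnamed, and the function names are hypothetical.

```python
def weighted_cpi(fractions, cpis):
    """Return the average CPI for instruction-mix fractions and per-type CPI values."""
    return sum(fraction * cpi for fraction, cpi in zip(fractions, cpis))


def mips_rate(clock_hz, cpi):
    """Return the MIPS rate: clock rate / (CPI x 10^6)."""
    return clock_hz / (cpi * 1e6)


def cpu_time_ms(instruction_count, cpi, clock_hz):
    """Return the CPU time in milliseconds."""
    return instruction_count * cpi / clock_hz * 1e3


clock_hz = 300e6
fractions = [0.16, 0.10, 0.08, 0.66]   # last entry: non-FP instructions
cpi_c1 = [6, 8, 10, 3]                 # per-type CPI on C1
cpi_c2 = [20, 32, 66, 3]               # per-type CPI on C2 (FP done in software)

for name, cpis in [("C1", cpi_c1), ("C2", cpi_c2)]:
    cpi = weighted_cpi(fractions, cpis)
    print(f"{name}: CPI = {cpi:.2f}, MIPS = {mips_rate(clock_hz, cpi):.2f}, "
          f"time for 9000 instructions = {cpu_time_ms(9000, cpi, clock_hz):.3f} ms")

# Part (c): with no floating-point instructions, both systems have the same CPI.
no_fp = [0.0, 0.0, 0.0, 1.0]
print(weighted_cpi(no_fp, cpi_c1) == weighted_cpi(no_fp, cpi_c2))  # True
```

The last line illustrates part (c): only the non-FP row has the same CPI on both machines, so a program made up entirely of such instructions runs equally fast on C1 and C2.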
Instruction Set
RISC ( Reduced Instruction Set Computers)
CISC ( Complex Instruction Set Computers)
Multiprocessors and Multicomputers
Shared-memory multiprocessor systems
Message-passing multicomputer