Computer
Organization
and Architecture
Chapter 2
Computer Evolution and Performance
Lecture 03
The Second generation:
Transistors
2nd Generation
The invention of the transistor was one of the most important
developments leading to the personal computer revolution. The
transistor was invented in 1947 and announced in 1948 by Bell
Laboratories engineers.
Transistors
• Replaced vacuum tubes
• Smaller
• Cheaper
• Less heat dissipation
• Solid State device
• Made from Silicon (Sand)
• A computer system built with transistors was also much smaller,
faster, and more efficient than one built with vacuum tubes.
The second generation also saw the introduction of:
• More complex arithmetic and logic units and control units
• High-level programming languages
• System software provided with the computer
The system software provided the ability to:
• Load programs
• Move data to peripherals
• Use libraries to perform common computations
Mini Computers
• DEC (Digital Equipment Corporation)- 1957
• Produced PDP-1
• The idea of mini computers began
IBM 7094 Configuration
Data Channels: A data channel is an independent I/O module
with its own processor and instruction set. The CPU does not
execute detailed I/O instructions.
These instructions are stored in main memory and are executed by
the processor in the data channel.
The CPU initiates an I/O transfer by sending a control signal to the
data channel. After performing its task, the data channel signals
the CPU.
This relieves the CPU of much of the I/O processing burden.
Multiplexer: The multiplexer is the central termination point for
the data channels, the CPU and memory.
It schedules access to memory from the CPU and the data channels.
BRIEF HISTORY OF
COMPUTERS
The Third Generation: Integrated
Circuits
Microelectronics
• Literally - “small electronics”
• A computer is made up of gates, memory cells and
interconnections
• These can be manufactured on a semiconductor
• e.g. silicon wafer
The basic functions of a digital computer are storage,
movement, processing and control.
Only two fundamental types of components are required:
gates and memory cells.
Gate: a device that implements a simple Boolean or
logical function, such as IF A AND B ARE TRUE THEN C
IS TRUE.
Memory Cell: a device that can store one bit of
data.
By interconnecting large numbers of these devices, we
can construct a computer.
These gates and memory cells are, in turn, constructed of
simple digital electronic components.
Data Storage: provided by memory cells
Data Processing: provided by gates
Data Movement: the paths among components are used to
move data from memory to memory and from memory
through gates to memory.
Control: the paths among components can carry control
signals.
For example: a gate will have one or two data inputs plus a
control signal input that activates the gate. When the control
signal is ON, the gate performs its function on the data inputs
and produces an output.
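The gate example above can be sketched in a few lines of Python. This is only an illustrative model, not a circuit description: the function name `controlled_and_gate` and the choice to return `None` when the control signal is OFF are assumptions made for this sketch.

```python
from typing import Optional


def controlled_and_gate(a: bool, b: bool, control: bool) -> Optional[bool]:
    """Model an AND gate with two data inputs and one control input.

    When control is ON the gate computes a AND b.
    When control is OFF the gate is not activated; this sketch
    models that as "no output" (None), which is an assumption.
    """
    if not control:
        return None
    return a and b


# Truth table with the control signal ON.
for a in (False, True):
    for b in (False, True):
        print(a, b, controlled_and_gate(a, b, control=True))
```

With the control signal ON, only the input pair A = TRUE, B = TRUE produces TRUE, matching the "if A AND B are true then C is true" description of a gate.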
ICs Fabrication
The integrated circuit exploits the fact that
digital electronic components such as transistors, resistors and
conductors can be fabricated from a semiconductor such as
silicon.
The entire circuit is fabricated in a tiny piece of silicon, rather than
assembling discrete components made from separate pieces of
silicon into the same circuit.
Fig 2.7 Relationship among
Wafer, Chip and Gate
A thin wafer of silicon is divided into a matrix of small areas,
each a few millimeters square. The identical circuit pattern
is fabricated in each area and the wafer is broken up into
chips.
Each chip consists of many gates and memory cells and a
number of input and output attachment points.
The chip is packaged in a housing that protects it and
provides pins for attachment to devices beyond the chip.
These packages can then be interconnected on a printed
circuit board to produce larger and more complex circuits.
Moore’s Law
Increased density of components on chip
Gordon Moore – co-founder of Intel
Number of transistors on a chip will double every year
Since the 1970s, development has slowed a little
Number of transistors doubles every 18 months
Cost of a chip has remained almost unchanged
Higher packing density means shorter electrical paths,
giving higher performance
Smaller size gives increased flexibility
Reduced power and cooling requirements
Fewer interconnections increase reliability
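The two doubling rates quoted for Moore's Law can be compared with a simple compound-growth calculation. This is a rough sketch: the function name `projected_transistors` and the starting count of 1,000 transistors are hypothetical values chosen for illustration, not historical data.

```python
def projected_transistors(start_count: float, years: float,
                          doubling_period_years: float) -> float:
    """Project a transistor count that doubles every doubling_period_years."""
    return start_count * 2 ** (years / doubling_period_years)


# Hypothetical chip with 1,000 transistors, projected 3 years ahead.
yearly = projected_transistors(1000, 3, 1.0)       # doubling every year
eighteen_months = projected_transistors(1000, 3, 1.5)  # doubling every 18 months

print(yearly)           # 8000.0
print(eighteen_months)  # 4000.0
```

Over the same three years, doubling every year gives three doublings, while doubling every 18 months gives only two.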
Growth in CPU Transistor
Count
IBM 360 series
• 1964
• Replaced (& not compatible with) 7000 series
• First planned “family” of computers
• Similar or identical instruction sets
• Similar or identical O/S
• Increasing speed
• Increasing number of I/O ports (i.e. more terminals)
• Increased memory size
• Increased cost
How could such a family
concept be implemented?
Based on three factors: basic speed, size and degree of
simultaneity
For example, greater speed in the execution of a given
instruction would be gained by the use of more complex
circuitry in the ALU, allowing sub-operations to be carried out
in parallel.
Another way of increasing speed was to increase the width of
data path between main memory and the CPU.
DEC PDP-8
• 1964
• First minicomputer
• Did not need air conditioned room
• Small enough to sit on a lab bench
• Could not do everything the mainframe could
• $16,000 cost- cheap enough
• $100k+ for IBM 360
• BUS STRUCTURE
DEC - PDP-8 Bus Structure
Generations of Computer
Vacuum tube - 1946-1957
Transistor - 1958-1964
Small scale integration - 1965 on
Up to 100 devices on a chip
Medium scale integration - to 1971
100-3,000 devices on a chip
Large scale integration - 1971-1977
3,000 - 100,000 devices on a chip
Very large scale integration - 1978 -1991
100,000 - 100,000,000 devices on a chip
Ultra large scale integration – 1991 -
Over 100,000,000 devices on a chip
Memory
• In the 1950s and 1960s, computer memory was constructed from
tiny rings of ferromagnetic material.
• Each ring was called a core.
• It was expensive, bulky and used destructive readout: the simple
act of reading a core erased the data stored in it. So circuits
were installed to restore the data as soon as it had been
extracted.
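Destructive readout and the restore step can be modeled as follows. This is a toy sketch, assuming that a read leaves the core holding 0; the class name `CoreMemory` and its methods are hypothetical and do not describe any real core memory controller.

```python
class CoreMemory:
    """Toy model of magnetic core memory with destructive readout."""

    def __init__(self, size: int) -> None:
        self.cores = [0] * size  # one bit per core

    def write(self, address: int, bit: int) -> None:
        self.cores[address] = bit & 1

    def _destructive_read(self, address: int) -> int:
        """Sense the core; sensing erases it (assumed to leave a 0)."""
        bit = self.cores[address]
        self.cores[address] = 0
        return bit

    def read(self, address: int) -> int:
        """Read a bit, then immediately write it back to restore it."""
        bit = self._destructive_read(address)
        self.write(address, bit)
        return bit


memory = CoreMemory(4)
memory.write(2, 1)
print(memory.read(2), memory.cores[2])  # value is still stored after the read
```

The `read` method plays the role of the restore circuitry: every read is followed by a write of the same value.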
Semiconductor Memory
• In 1970, Fairchild produced the first relatively capacious
semiconductor memory.
• Chip was about the size of a single core
• i.e. 1 bit of magnetic core storage
• Held 256 bits
• Non-destructive read
• Much faster than core
• Capacity approximately doubles each year
Intel
• 1971 - 4004
• First microprocessor
• All CPU components on a single chip
• 4 bit – It can add two 4-bit numbers and can multiply only by
repeated addition
• Followed in 1972 by 8008
• 8 bit
• Both designed for specific applications
• 1974 - 8080
• Intel’s first general purpose microprocessor
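Multiplication by repeated addition on a 4-bit adder, as mentioned for the 4004, can be sketched like this. The code is a conceptual illustration only and is not the 4004's actual instruction sequence; the helper names `add4` and `multiply_by_repeated_addition` are hypothetical.

```python
MASK_4BIT = 0xF  # a 4-bit value lies in the range 0..15


def add4(a: int, b: int) -> tuple:
    """Add two 4-bit values; return (4-bit sum, carry-out)."""
    total = (a & MASK_4BIT) + (b & MASK_4BIT)
    return total & MASK_4BIT, total >> 4


def multiply_by_repeated_addition(a: int, b: int) -> int:
    """Multiply two 4-bit values using only 4-bit additions.

    The product is accumulated in two 4-bit halves (low and high),
    because the largest product, 15 * 15 = 225, needs 8 bits.
    """
    low, high = 0, 0
    for _ in range(b & MASK_4BIT):
        low, carry = add4(low, a)     # add the multiplicand to the low half
        high, _ = add4(high, carry)   # propagate the carry into the high half
    return (high << 4) | low


print(multiply_by_repeated_addition(7, 6))    # 42
print(multiply_by_repeated_addition(15, 15))  # 225
```

The number of additions grows with the multiplier, which is why multiplication on such a processor is slow compared with a hardware multiplier.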
Microprocessor Speed-
Speeding it up
• Pipelining- with pipelining, a processor can work on
multiple instructions simultaneously.
For example, while one instruction is being executed,
the processor is decoding the next instruction.
• Branch prediction- the processor looks ahead in the
instruction code fetched from memory and predicts
which branches, or groups of instructions, are likely to
be processed next. If the prediction is usually correct, the
processor can prefetch the right instructions and buffer
them, so that it is kept busy.
• Data flow analysis- the processor analyzes which
instructions are dependent on each other’s result or
data, to create an optimized schedule of instructions.
This prevents unnecessary delay.
• Speculative execution- using branch prediction and
data flow analysis, some processors speculatively
execute instructions ahead of their actual appearance
in the program execution, holding the results in
temporary locations. This enables the processor to
keep its execution engines as busy as possible.
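The slides do not say how a processor makes its branch predictions. One common textbook scheme, shown here purely as an assumed example, is a 2-bit saturating counter per branch: the counter moves toward "taken" or "not taken" with each outcome, so a single surprise does not flip the prediction. The names `TwoBitPredictor` and `count_correct` are hypothetical.

```python
class TwoBitPredictor:
    """2-bit saturating counter: 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self) -> None:
        self.counter = 2  # start in the "weakly taken" state (arbitrary choice)

    def predict(self) -> bool:
        return self.counter >= 2

    def update(self, taken: bool) -> None:
        """Move the counter toward the actual outcome, saturating at 0 and 3."""
        if taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)


def count_correct(outcomes: list) -> int:
    """Count how many branch outcomes the predictor guesses correctly."""
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct


# A loop branch that is taken 9 times and then falls through once.
loop_outcomes = [True] * 9 + [False]
print(count_correct(loop_outcomes), "of", len(loop_outcomes))  # 9 of 10
```

Only the final loop exit is mispredicted, so the processor could keep prefetching the loop body on almost every iteration.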
Performance Balance
• Processor speed increased
• Memory capacity increased
• Memory speed lags behind processor speed
Solutions
• Increase number of bits retrieved at one time.
• Change DRAM interface by introducing Cache.
• Reduce frequency of memory access
• More complex cache and cache on processor chip or cache close
to processor chip
• Increase the interconnection bandwidth between processor
and memory by using high speed buses.
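The benefit of a cache in closing the processor-memory speed gap can be estimated with the standard average-access-time formula: hit time plus miss rate times miss penalty. The formula is a common textbook model, not something stated in the slides, and the timing numbers below are hypothetical.

```python
def average_access_time(hit_time_ns: float, miss_rate: float,
                        miss_penalty_ns: float) -> float:
    """Average memory access time = hit time + miss rate * miss penalty."""
    return hit_time_ns + miss_rate * miss_penalty_ns


# Hypothetical numbers: 1 ns cache hit, 100 ns main-memory access,
# and 5% of accesses missing in the cache.
with_cache = average_access_time(hit_time_ns=1.0, miss_rate=0.05,
                                 miss_penalty_ns=100.0)
without_cache = 100.0  # every access goes to main memory

print(f"with cache: {with_cache:.1f} ns, without cache: {without_cache:.1f} ns")
```

Under these assumed numbers, the average access is much closer to the cache speed than to the main-memory speed, which is why reducing the frequency of memory access is so effective.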
Another Area of Design Focus-I/O
Devices
• Peripherals with intensive I/O demands
• Large data throughput demands (throughput is a measure of
how many units of information a system can process in a
given amount of time)
• Processors can handle the data pumped out by
these devices
• Problem is moving data between processor and
peripheral.
• Solutions:
• Caching
• Buffering
• Higher-speed interconnection buses
• Multiple-processor configurations
Typical I/O Device Data Rates
Key is Balance
• Processor components
• Main memory
• I/O devices
• Interconnection structures
Improvements in Chip
Organization and
Architecture
• Increase hardware speed of processor
• Fundamentally due to shrinking logic gate size
• More gates, packed more tightly, increasing clock rate
• Propagation time for signals reduced
• Increase size and speed of caches
• Dedicating part of processor chip
• Cache access times drop significantly
• Change processor organization and architecture
• Increase effective speed of execution
• Parallelism
Problems with Clock Speed
and Logic Density
Power
Power density increases with density of logic and clock speed
Dissipating heat
RC delay
Speed at which electrons flow between transistors is limited by the
resistance and capacitance of the metal wires connecting them
Delay increases as RC product increases
Wire interconnects thinner, increasing resistance
Wires closer together, increasing capacitance
Memory latency
Memory speeds lag processor speeds
Solution:
More emphasis on organizational and architectural approaches
Intel Microprocessor
Performance
Increased Cache Capacity
• Typically two or three levels of cache between processor and
main memory
• Chip density increased
• More cache memory on chip
• Faster cache access
• Pentium chip devoted about 10% of chip area to cache
• Pentium 4 devotes about 50%
More Complex Execution
Logic
• Enable parallel execution of instructions
• Pipeline works like assembly line
• Different stages of execution of different instructions at same
time along pipeline
• Superscalar allows multiple pipelines within single processor
• Instructions that do not depend on one another can be executed
in parallel
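The assembly-line effect of pipelining, and the extra gain from superscalar execution, can be estimated with an idealized cycle count. This sketch assumes every stage takes one clock cycle, there are no stalls or branch mispredictions, and all instructions are independent; the function names are hypothetical.

```python
import math


def cycles_unpipelined(n_instructions: int, n_stages: int) -> int:
    """Each instruction passes through all stages before the next starts."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions: int, n_stages: int) -> int:
    """The first instruction fills the pipeline; then one completes per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


def cycles_superscalar(n_instructions: int, n_stages: int,
                       n_pipelines: int) -> int:
    """Idealized: independent instructions are spread evenly over the pipelines."""
    if n_instructions == 0:
        return 0
    return n_stages + math.ceil(n_instructions / n_pipelines) - 1


print(cycles_unpipelined(100, 5))     # 500
print(cycles_pipelined(100, 5))       # 104
print(cycles_superscalar(100, 5, 2))  # 54
```

Real programs fall short of these figures because dependent instructions cannot be issued in parallel, which is the restriction noted in the last bullet above.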
Diminishing Returns
• Internal organization of processors complex
• Can get a great deal of parallelism
• Further significant increases likely to be relatively modest
• Benefits from cache are reaching limit
• Increasing clock rate runs into power dissipation problem
• Some fundamental physical limits are being reached
New Approach – Multiple
Cores
Multiple processors on single chip
Large shared cache
Within a processor, increase in performance
proportional to square root of increase in complexity
If software can use multiple processors, doubling
number of processors almost doubles performance
So, use two simpler processors on the chip rather than
one more complex processor
With two processors, larger caches are justified
Power consumption of memory logic less than processing logic
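The argument for two simpler processors can be checked with a small calculation based on the square-root rule stated above. This is a sketch under stated assumptions: `parallel_efficiency` is a hypothetical parameter standing in for "almost doubles", and the value 0.9 is chosen only for illustration.

```python
import math


def single_core_speedup(complexity_ratio: float) -> float:
    """Performance grows with the square root of the increase in complexity."""
    return math.sqrt(complexity_ratio)


def multi_core_speedup(n_cores: int, parallel_efficiency: float = 1.0) -> float:
    """Speedup from n simple cores when software can use all of them.

    parallel_efficiency (0..1) is an assumed factor for the overhead
    that keeps the gain slightly below n.
    """
    return n_cores * parallel_efficiency


# Spend twice the logic on one complex core, or on two simple cores.
print(f"one core, 2x complexity: {single_core_speedup(2):.2f}x")
print(f"two simple cores:        {multi_core_speedup(2, 0.9):.2f}x")
```

With the same doubling of logic, the single complex core gains about 1.41x, while two simple cores approach 2x provided the software can use both.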
x86 Evolution (1)
8080
first general purpose microprocessor
8 bit data path
Used in first personal computer – Altair
8086 – 5MHz – 29,000 transistors
much more powerful
16 bit
instruction cache, prefetches a few instructions
8088 (8 bit external bus) used in first IBM PC
80286
16 Mbyte memory addressable
up from 1 Mbyte
80386
32 bit
Support for multitasking
80486
sophisticated powerful cache and instruction pipelining
built in maths co-processor
x86 Evolution (2)
Pentium
Superscalar
Multiple instructions executed in parallel
Pentium Pro
Increased superscalar organization
Aggressive register renaming
branch prediction
data flow analysis
speculative execution
Pentium II
MMX technology
graphics, video & audio processing
Pentium III
Additional floating point instructions for 3D graphics
x86 Evolution (3)
Pentium 4
Note Arabic rather than Roman numerals
Further floating point and multimedia enhancements
Core
First x86 with dual core
Core 2
64 bit architecture
Core 2 Quad – 3GHz – 820 million transistors
Four processors on chip
x86 architecture dominant outside embedded systems
Organization and technology changed dramatically
Instruction set architecture evolved with backwards compatibility
~1 instruction per month added
500 instructions available
See Intel web pages for detailed information on processors