
Arm Programming

The address bus is used to designate the source or destination of the data on the bus itself. The address bus carries the memory address or location from where the data needs to be fetched or to where the data needs to be stored. It identifies the particular location in memory. Hence, the address bus is used to designate the source or destination of the data. The data bus is used for transmitting data between CPU and memory/I/O devices. The control bus transfers control signals. The system bus is a generic term and does not precisely indicate the specific function. Therefore, the correct option is 3. Address bus.

Uploaded by

mnmn
Copyright
© © All Rights Reserved

Unit 1-Lecture 1-8.6.2023
ARM Processor and Programming
(22ES4CCAPP)
Credits : 3-0-1 Hours:40
Text books:
1. Computer Organization and Architecture, Carl Hamacher, Zvonko
Vranesic, McGraw-Hill,20012.
2. ARM System Developer’s Guide, Sloss, Symes, Wright, Morgan
Kaufmann Publishers, Elsevier, 2005
3. ARM Assembly Language- Fundamentals and Techniques, William
Hohl, CRC Press, Taylor and Francis, 2009
Course Outcomes
CO1: Ability to understand and explain the functional blocks of a computer
and peripherals, performance of a processor, memory and I/O systems, ARM
processor, interrupts and exceptions, stacks and subroutines (PO: --)
CO2: Ability to apply the knowledge of assembly/C code to develop
assembly/embedded C programs to perform a specific task (PO1)
CO3: Ability to analyse/debug the given code to perform a specific task (PO2)
CO4: Ability to design and develop the logic to interface memory, I/O and
peripherals to ARM controller (PO3)
CO5: Ability to conduct experiments by simulating assembly and Embedded C
code using IDE and interface the hardware modules to ARM development
board and develop codes for specific applications (PO5, PO9)
CO6: Ability to implement a mini-project to develop solutions to
the given problem using simulation tools (PO8, PO10, PO12)
Unit 1
• Overview of computing systems:
• Basic structure of computers
- function units of a computer
- bus structure
- performance of the processor
- memory location and addresses
- memory and I/O systems
- basic processing unit
- pipelining
- computer peripherals
What You Will Learn in this course
• How programs are translated into the machine language
-And how the hardware executes them
• The hardware/software interface
• What determines program performance
-And how it can be improved
• How hardware designers improve performance
Understanding Performance
Algorithm
• Determines number of operations executed
Programming language, compiler, architecture
• Determine number of machine instructions executed per operation
Processor and memory system
• Determine how fast instructions are executed
I/O system (including OS)
• Determines how fast I/O operations are executed
Functional units of a computer
• A computer consists of 5 functionally independent main parts:
• Input unit
• Memory unit
• Arithmetic and logical unit
• Output unit
• Control unit
Unit 1-Lecture 2-9.6.2023

Recap
• Introduction to the course
• Functional block diagram of a computer
Information Handled by a Computer

Instructions/machine instructions
• Govern the transfer of information within a computer as well as between the
computer and its I/O devices
• Specify the arithmetic and logic operations to be performed
• Program
Data
• Used as operands by the instructions
• Source program
Encoded in binary code – 0 and 1s
Programs and data in memory

Two classes of storage

Primary storage
• Fast
• Programs must be stored in memory while they are being executed
• Large number of semiconductor storage cells
• Processed in words
• Memory hierarchy – cache, main memory
Secondary storage – larger and cheaper
Memory

• Like the pages of a notebook with space for a fixed number of binary
numbers on each line.
• The pages are made of semiconductor materials
• Each line is an 8-bit register that can store 8 binary bits, and several
of these registers arranged in a sequence are called memory
• Memory stores instructions and data in binary form and provides
this information to the microprocessor (µP) whenever necessary.
Memory contd..
• To execute the programs, processor reads the instructions and data
from memory and performs the computing operations in its ALU
section
• Results are then transferred to the o/p section for display or stored
in memory for later use
• The memory block has 2 sections
1. ROM-stores the programs and data that do not need alterations.
Ex: monitor program of a single board microcomputer
2. RAM
Input unit
• Computers accept coded information through input units which read
the data.
• The most well known input device is the keyboard.
• Whenever a key is pressed, the corresponding letter or digit is
automatically translated into its corresponding binary code and
transmitted over a cable to either memory or the processor.
• Many other kinds of input devices are available including joysticks,
trackballs and mice.
Input unit contd..
• The user can enter the instructions and data into memory through devices
such as a keyboard or simple switches (input devices)
• The processor reads the instructions from the memory and processes the
data according to those instructions
• The results can be displayed on devices such as seven segment display ,
LED or printed by a printer.
• These devices are called output devices
• The input section transfers the data & instructions in binary form from
outside world to the processor
Output unit
• The O/P section transfers the data from the computer to the output
devices such as LEDs, CRT, printer etc. (i.e., its function is to send
processor results to the outside world or user).
• The most familiar output device is the printer.
• Some units, such as graphic displays, provide both an output function
and an input function.
Arithmetic and logic unit

• Most computer operations are executed in the Arithmetic and Logical
unit (ALU) of the processor.
• Ex: Suppose two numbers located in the memory are to be added.
• They are brought into the processor, and the actual addition is
carried out by the ALU.
• The sum may then be stored in the memory or retained in the
processor for immediate use.
• When operands are brought into the processor, they are stored in the
high-speed storage elements called “registers”.
• Access times to registers are somewhat faster than access times to
the fastest cache unit in the memory hierarchy.
Control Unit
• The memory, arithmetic and logic, input and output units store and
process information and perform input and output operations.
• The control unit coordinates the operation of these units.
• The I/O transfers consisting of input and output operations are
controlled by the instructions of I/O programs that identify the devices
involved and the information to be transferred.
• The actual timing signals that govern the transfers are generated by
control circuits.
• Timing signals are the signals that determine when a given action is to
take place.
• Data transfers between the processor and memory are also controlled by
the control unit through timing signals.
• All computer operations are controlled by the control unit.
• Usually distributed throughout the machine instead of standing alone.
Control Unit contd..

Operations of a computer
• Accepts information in the form of programs and data through an
input unit and stores it in the memory
• Fetches the information stored in the memory, under program
control, into an ALU, where the information is processed
• Outputs the processed information through an output unit
• Controls all activities inside the machine through a control unit
Unit 1-Lecture 3-10.6.2023

Recap
• Input and Output unit
• ALU
• Control Unit
Bus Structure
• The simplest and most common way of interconnecting various parts of
the computer
• A group of lines that serves as a connecting path for several devices is called
a bus.
• The buses carry data, address and control signals
Bus structure contd..
Data Bus
• Used for transmitting the data / instruction
from CPU to memory/IO and vice-versa.
• It is bi-directional
Address Bus
• Used to carry address from CPU to
memory/IO devices.
• It is used to identify the particular location
in memory.
• It carries the source or destination
address of data i.e. where to store or from
where to retrieve the data.
• It is uni-directional.
Control Bus
• Used to transfer the control and timing signals
from one component to the other component.
• The CPU uses control bus to communicate with
the devices that are connected to the
computer system.
• The CPU transmits different types of control
signals to the system components.
• It is bi-directional. Ex: memory R/W, I/O R/W
Bus structure contd..
• All units are connected to this bus.
• Because the bus can be used for only one transfer at a time, only two
units can actively use the bus at any given time.
• Bus control lines are used to arbitrate multiple requests for use of the
bus.
• The main virtue of the single-bus structure is its low cost and its
flexibility for attaching peripheral devices.
• Systems that contain multiple buses achieve more concurrency in
operations by allowing two or more transfers to be carried out at the
same time.
• This leads to better performance but at an increased cost.
• Which of the following buses is used to designate the source or
destination of the data on the bus itself?
1.Control bus
2.Data bus
3.Address bus
4.System bus
• The bus which is used to transfer data from main memory to
peripheral device is-
1.Data bus
2.Input bus
3.DMA bus
4.Output bus
Performance

• The most important measure of a computer's performance is how quickly it
can execute programs
Three factors affect the performance of a computer
• Hardware design
• Instruction set
• Compiler
Performance contd..
• As the programs are written in a higher level language, performance is
also affected by the compiler that translates programs into machine
language.
• For best performance, the compilers, the machine instruction set,
and the hardware must be designed in a coordinated way.
• Elapsed time: Total time required to execute the program.
-This is a measure of the performance of the entire computer system.
- It is affected by the speed of the processor, the disk and the printer.
- Depends on all units in a computer system
Performance contd..
Processor time
• Sum of the time periods during which the processor is active
• Depends on the hardware involved in the execution of individual machine
instructions.
• This hardware comprises the processor and the memory, which are usually
connected by a bus as shown below
• Processor and a cache memory can be integrated into a single chip (Fig. B)

[Figures: Fig. A and Fig. B]
Performance contd..
Processor time

• CPU time is the time for which the CPU was busy executing the task.
• It does not take into account the time spent in waiting for I/O (disk I/O or network I/O).
• Since I/O operations, such as reading files from disk, are performed by the OS, these operations
may involve a noticeable amount of time in waiting for I/O subsystems to complete their
operations.
• This waiting time will be included in the elapsed time, but not CPU time.
• Hence CPU time is usually less than the elapsed time.
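
The difference between elapsed time and CPU time can be observed directly. The following is a minimal sketch in Python, not part of the original slides: `time.perf_counter` measures elapsed (wall-clock) time, `time.process_time` measures the CPU time of the current process, and `time.sleep` is used here only as a stand-in for waiting on an I/O device. The names `measure` and `waiting_task` are illustrative.

```python
import time


def measure(task):
    """Return (elapsed_seconds, cpu_seconds) for one call of task()."""
    wall_start = time.perf_counter()   # elapsed (wall-clock) time
    cpu_start = time.process_time()    # CPU time used by this process
    task()
    cpu = time.process_time() - cpu_start
    elapsed = time.perf_counter() - wall_start
    return elapsed, cpu


def waiting_task():
    # Assumption: sleeping stands in for waiting on a disk or network transfer.
    time.sleep(0.2)


elapsed, cpu = measure(waiting_task)
print(f"elapsed = {elapsed:.3f} s, CPU = {cpu:.3f} s")
```

Because the process is idle while it waits, the reported CPU time should be much smaller than the elapsed time.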
Performance contd..
• Fig shows the cache memory as part of the processor unit.
• At the start of execution, all program instructions and the required
data are stored in the main memory.
• As execution proceeds, instructions are fetched one by one over the
bus into the processor, and a copy is placed in the cache.
• When the execution of an instruction calls for data located in the
main memory, the data are fetched and a copy is placed in the cache.
• Later, if the same instruction or data item is needed a second time, it
is read directly from the cache.
• A program will be executed faster, if the movement of instructions
and data between the main memory and the processor is minimized
which is achieved by using the cache.
Example
• Suppose a number of instructions are executed repeatedly over a
short period of time as happens in a program loop.
• If these instructions are available in the cache memory, they can be
fetched quickly during period of repeated use.
• The same applies to data that are used repeatedly.
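
The effect of a loop on cache behaviour can be sketched with a small simulation. This is an illustrative model only: the function `run_with_cache`, the first-in-first-out replacement rule, and the instruction addresses are all assumptions chosen for the example, not details from the slides.

```python
def run_with_cache(addresses, cache_size):
    """Count hits and misses for a sequence of memory addresses.

    Hypothetical model: the cache holds up to cache_size addresses and
    evicts the oldest entry when it is full (FIFO replacement).
    """
    cache = []
    hits = 0
    misses = 0
    for addr in addresses:
        if addr in cache:
            hits += 1          # item is read directly from the cache
        else:
            misses += 1        # item is fetched from main memory
            if len(cache) == cache_size:
                cache.pop(0)   # make room by evicting the oldest entry
            cache.append(addr)  # a copy is placed in the cache
    return hits, misses


# A loop of four instructions executed ten times.
loop_body = [100, 104, 108, 112]
trace = loop_body * 10
hits, misses = run_with_cache(trace, cache_size=8)
print(hits, misses)
```

Only the first pass through the loop goes to main memory; every later pass is served from the cache.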
Unit 1-Lecture 4-12.6.2023

Recap
• Bus Structure
• Performance
Processor clock
• Processor circuits are controlled by a timing signal called clock.
• The clock defines the regular time intervals called clock cycles.
• To execute a machine instruction, the processor divides the action to be
performed into a sequence of basic steps such that each step can be
completed in one clock cycle.
• The length P of one clock cycle affects the processor performance.
• Its inverse is the clock rate. R=1/P which is measured in cycles per second.
• Processors used in today’s personal computers and workstations have clock
rates that range from a few hundred million to over a billion cycles per
second.
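
The relation R = 1/P can be checked with a short calculation. This is a sketch with assumed numbers: the 2 ns clock period and the 4-step instruction are illustrative values, and `clock_rate` and `execution_time` are hypothetical helper names.

```python
def clock_rate(period_seconds):
    """Clock rate R (cycles per second) for a clock period P (seconds)."""
    return 1.0 / period_seconds


def execution_time(steps, period_seconds):
    """Time for an instruction that needs `steps` basic steps of one cycle each."""
    return steps * period_seconds


P = 2e-9                    # assumed clock period: 2 ns
R = clock_rate(P)           # about 500 million cycles per second
T = execution_time(4, P)    # assumed instruction that takes 4 clock cycles
print(f"R = {R:.3e} cycles/s, T = {T:.3e} s")
```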
Memory location and addresses
• Number and character operands as well as instructions are stored in
the memory of a computer.
• The memory consists of many millions of storage cells, each of which
can store a bit of information having the value 0 or 1
• Because a single bit represents a small amount of information, bits are
seldom handled individually.
• The usual approach is to deal with them in groups of fixed size.
• For this purpose, the memory is organized so that a group of n bits
can be stored or retrieved in a single, basic operation.
• Each group of n bits is referred to as a word of information and n is
called the word length.
Memory location and addresses contd…
• The memory of a computer can be schematically represented as a collection of words as
shown below

• Modern computers have word lengths that typically range from 16 to 64 bits
Memory location and addresses contd…
• If the word length of a computer is 32 bits, a single word can store a 32 bit 2’s
complement number or four ASCII characters, each occupying 8 bits, as shown
below.
• A unit of 8 bits is called a byte.
• Machine instructions may require one or more words for their representation.
Memory location and addresses contd…
• Accessing the memory to store or retrieve either a word or a byte
requires an address for each location.
• The addresses of successive locations in the memory are given by the
numbers 0 through 2^k - 1, where k is a positive integer.
• The 2^k addresses constitute the address space of the computer, and
the memory can have up to 2^k addressable locations.
Example
• A 24-bit address generates an address space of 2^24 (16,777,216)
locations = 16M, where 1M is 2^20 (1,048,576).
• A 32-bit address creates an address space of 2^32 or 4G locations
- where 1G is 2^30 and 1T (Tera) is 2^40
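
The address-space figures in the example follow directly from the number of address bits. A minimal sketch (the helper name `address_space` is illustrative):

```python
K = 2 ** 10   # 1K = 1,024
M = 2 ** 20   # 1M = 1,048,576
G = 2 ** 30   # 1G = 1,073,741,824


def address_space(k):
    """Number of distinct addresses that k address bits can generate."""
    return 2 ** k


print(address_space(16) // K, "K locations")   # 16-bit address
print(address_space(24) // M, "M locations")   # 24-bit address
print(address_space(32) // G, "G locations")   # 32-bit address
```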
Unit 1-Lecture 5-12.6.2023

Recap
Byte addressability
• Information can be handled as bits, bytes, or words.
• A byte is always 8 bits, but the word length typically ranges from 16 to 64 bits.
• It is impractical to assign distinct addresses to individual bit locations in the
memory.
• The most practical assignment is to have addresses refer to successive byte
locations in the memory.
• This is the assignment used in most modern computers.
• The term byte addressability is used for this assignment.
• Byte locations have addresses 0, 1, 2,… .
• Thus if the word length of the machine is 32 bits, successive words are located
at addresses 0, 4, 8, … with each word consisting of 4 bytes.
Big-Endian and little Endian Assignments
• There are two ways that the byte addresses can be assigned across
words as shown below

• The name big-endian is used when the lower byte addresses are used for
the most significant bytes of a word.
• The name little-endian is used when the lower byte addresses are used for
the least significant bytes of a word.
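
The two byte orderings can be illustrated by laying out one 32-bit word as four bytes. This sketch uses Python's built-in `int.to_bytes`; the value 0x12345678 is an arbitrary example, and index 0 of each result corresponds to the lowest byte address.

```python
word = 0x12345678

# Index 0 is the lowest byte address of the word.
big = word.to_bytes(4, "big")        # most significant byte at the lowest address
little = word.to_bytes(4, "little")  # least significant byte at the lowest address

print([hex(b) for b in big])
print([hex(b) for b in little])

# Reading the bytes back with the matching convention recovers the word.
restored = int.from_bytes(little, "little")
```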
Word alignment
• In case of a 32 bit word length, natural word boundaries occur at
addresses 0, 4, 8 .. as shown in the fig.
• Words are said to be aligned in memory, if they begin at a byte
address that is a multiple of the number of bytes in the word.
• 16 bit word: Word address 0, 2, 4…
• 32 bit word: Word address 0, 4, 8, 12..
• 64 bit word: Word address 0, 8, 16, 24..
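
The alignment rule above amounts to a divisibility test on the byte address. A small sketch, with `is_aligned` and `word_addresses` as illustrative helper names:

```python
def is_aligned(address, word_bytes):
    """True if a word of word_bytes bytes starting at address is aligned."""
    return address % word_bytes == 0


def word_addresses(word_bytes, count):
    """First `count` aligned word addresses for the given word size in bytes."""
    return [i * word_bytes for i in range(count)]


print(word_addresses(2, 4))   # 16-bit words
print(word_addresses(4, 4))   # 32-bit words
print(word_addresses(8, 4))   # 64-bit words
print(is_aligned(6, 4))       # address 6 is not a 32-bit word boundary
```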
Accessing numbers, characters, and character strings
• A number usually occupies one word.
• It can be accessed in the memory by specifying its word address.
• Similarly individual characters can be accessed by their byte address.
• In many applications, it is necessary to handle character strings of
variable length.
• The beginning of the string is indicated by giving the address of the byte
containing its first character. Ex: the string ‘college’ stored starting at address 0004h
• Successive byte locations contain successive characters of the string.
• There are 2 ways to indicate the length of the string.
• A special control character with the meaning ‘end of string’ can be used
as the last character of the string (for example, $).
• Or a separate memory word location or processor register can contain a
number indicating the length of the string in bytes.
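
Both ways of marking the string length can be sketched with a byte array standing in for memory. This is an illustration only: the 16-byte memory, the helper names `read_terminated` and `read_counted`, and the use of `$` as the terminator follow the example on this slide but are otherwise assumptions.

```python
def read_terminated(memory, start, terminator=ord("$")):
    """Read characters from `start` up to, but not including, the terminator."""
    end = memory.index(terminator, start)
    return memory[start:end].decode("ascii")


def read_counted(memory, start, length):
    """Read `length` characters from `start`; the length is kept separately."""
    return memory[start:start + length].decode("ascii")


memory = bytearray(16)          # 16 byte locations, addresses 0 to 15
memory[4:12] = b"college$"      # string begins at byte address 0004h

print(read_terminated(memory, 4))
print(read_counted(memory, 4, 7))
```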
Unit 1-Lecture 6-19.6.2023

Recap
• Byte addressability
• Big Endian and Little Endian assignments
• Word alignment
• Accessing numbers, characters, and character strings
Memory systems
• Programs and the data they operate on are held in the memory of the
computer.
• The execution speed of programs is highly dependent on the speed
with which instructions and data can be transferred between the
processor and the memory.
• It is required to have a large memory to facilitate execution of
programs that are large and deal with large amount of data.
• Ideally, memory would be fast, large and inexpensive.
• But it is not possible to meet all three requirements
simultaneously.
• Increased speed and size are achieved at increased cost.
Basic concepts
• Maximum size of the memory that can be used in any computer is
determined by the addressing scheme.
For example
• A 16-bit computer that is capable of generating 16-bit addresses is capable of
addressing up to 2^16 = 64K memory locations.
• The number of memory locations represents the size of the address space of
a computer.
• Most modern computers are byte addressable.
• Fig. (Next slide) shows the address assignments for a byte-addressable 32
bit computer.
• The little Endian arrangement is used in Intel processors.
• ARM can be configured to use either little Endian or big Endian arrangement.
Basic concepts contd..
Basic concepts contd..
• The memory is usually designed to store and retrieve data in word-length
quantities.
• For example,
• A byte-addressable computer whose instructions generate 32-bit
addresses.
• When a 32-bit address is sent from the processor to the memory unit, the
high order 30 bits determine which word will be accessed.
• If a byte quantity is specified, the low-order 2 bits of the address specify
which byte location is involved.
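
The split of a 32-bit byte address into a word part and a byte part can be written with a shift and a mask. A minimal sketch, assuming 4-byte words as in the example; `split_address` and the address 0x1006 are illustrative.

```python
def split_address(address):
    """Split a 32-bit byte address into (word_index, byte_offset).

    The high-order 30 bits select the word; the low-order 2 bits select
    the byte within that word.
    """
    word_index = address >> 2      # drop the 2 low-order bits
    byte_offset = address & 0b11   # keep only the 2 low-order bits
    return word_index, byte_offset


word_index, byte_offset = split_address(0x1006)
word_start = word_index << 2       # byte address where that word begins
print(hex(word_index), byte_offset, hex(word_start))
```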
Unit 1-Lecture 7 20.6.2023

Recap
• Memory Systems
• Basic concepts
Basic concepts contd..
• The connection between the processor and its memory consists of
address, data, and control lines
• The processor uses the address lines to specify the memory location
involved in a data transfer operation, and uses the data lines to transfer
the data
• At the same time, the control lines carry the command indicating a
Read or a Write operation and whether a byte or a word is to be
transferred.
• The control lines also provide the necessary timing information and are
used by the memory to indicate when it has completed the requested
operation.
• When the processor-memory interface receives the memory’s response,
it asserts the MFC (Memory Function Completed) signal shown in Figure
Basic concepts contd..

• This is the processor’s internal control signal that indicates that the
requested memory operation has been completed.
• When asserted, the processor proceeds to the next step in its
execution sequence
• Data transfer between the processor and memory takes place
through two processor registers Memory Data Register (MDR) and
Memory Address Register (MAR).
Basic concepts contd..
• Memory access time: A useful measure of the speed of memory units
is the time that elapses between the initiation of an operation to
transfer a word of data and the completion of that operation.
• Memory cycle time: Another important measure which is the
minimum time delay required between the initiation of two
successive memory operations
For example, the time between two successive Read operations.
• The cycle time is usually slightly longer than the access time,
depending on the implementation details of the memory unit.
• A memory unit is called a random-access memory (RAM) if the access
time to any location is the same, independent of the location’s
address.
Cache memory
• The processor of a computer can usually process instructions and data
faster than they can be fetched from the main memory.
• Hence, the memory access time is the bottleneck in the system.
• One way to reduce the memory access time is to use a cache
memory.
• This is a small, fast memory inserted between the larger, slower main
memory and the processor.
• It holds the currently active portions of a program and their data.
Virtual memory

• Virtual memory is another important concept related to memory
organization.
• Only the active portions of a program are stored in the main memory,
and the remainder is stored on the much larger secondary storage
device.
• Sections of the program are transferred back and forth between the
main memory and the secondary storage device in a manner that is
transparent to the application program.
• As a result, the application program sees a memory that is much larger
than the computer’s physical main memory
Virtual memory contd..
• Data move frequently between the main memory and the cache and
between the main memory and the disk.
• These transfers do not occur one word at a time.
• Data are always transferred in contiguous blocks involving tens,
hundreds, or thousands of words.
• Data transfers between the main memory and high-speed devices
such as a graphic display or an Ethernet interface also involve large
blocks of data.
• Hence, a critical parameter for the performance of the main memory
is its ability to read or write blocks of data at high speed.
Semiconductor RAM memories
•Semiconductor random-access memories (RAMs) are available in a wide range of
speeds.
•Their cycle times range from 100 ns to less than 10 ns.
ROM
•Both static and dynamic RAM chips are volatile, which means that they retain
information only while power is turned on.
•There are many applications requiring memory devices that retain the stored
information when power is turned off.
• For example, the need to store a small program in such a memory, to be used to
start the bootstrap process of loading the operating system from a hard disk into
the main memory.
•The embedded applications are another important example.
• Many embedded applications do not use a hard disk and require non-volatile
memories to store their software.
Non Volatile memories
• The contents of non-volatile memories can be read in the same way as
the volatile memories.
• But, a special writing process is needed to place the information into a
non-volatile memory.
• Since its normal operation involves only reading the stored data, a
memory of this type is called a Read-only Memory (ROM).
• A memory is called a read-only memory, or ROM, when information can
be written into it only once at the time of manufacture
PROM
• Some ROM designs allow the data to be loaded by the user, thus
providing a programmable ROM (PROM).
• Before it is programmed, the memory contains all 0s.
• The user can insert 1s at the required locations by burning out the
fuses at these locations using high-current pulses.
• This process is irreversible.
• The cost of preparing the masks needed for storing a particular
information pattern makes ROMs cost effective only in large volumes.
• The alternative technology of PROMs provides a more convenient and
considerably less expensive approach, because memory chips can be
programmed directly by the user.

Unit 1, Lecture 8-21.6.2023
Recap
• Basic concepts contd..
• Cache memory
• Virtual memory
• Semiconductor RAM memories
• Non Volatile memories
EPROM
• Another type of ROM chip provides an even higher level of convenience.
• It allows the stored data to be erased and new data to be written into it.
• Such an erasable, reprogrammable ROM is called an EPROM.
• It provides considerable flexibility during the development phase of digital
systems.
• Since EPROMs are capable of retaining stored information for a long time,
they can be used in place of ROMs or PROMs while software is being
developed.
• In this way, memory changes and updates can be easily made.
EEPROM

• An EPROM must be physically removed from the circuit for
reprogramming.
• Also, the stored information cannot be erased selectively.
• The entire contents of the chip are erased when exposed to ultraviolet
light.
• Another type of erasable PROM can be programmed, erased, and
reprogrammed electrically.
• Such a chip is called an electrically erasable PROM, or EEPROM.
• It does not have to be removed for erasure.
• Moreover, it is possible to erase the cell contents selectively.
EEPROM contd..
Disadvantage
• Different voltages are needed for erasing, writing, and reading the
stored data, which increases circuit complexity.

• However, this disadvantage is outweighed by the many advantages of
EEPROMs.
Flash Memory

• An approach similar to EEPROM technology


• A flash cell is based on a single transistor controlled by trapped charge,
much like an EEPROM cell.
• Also like an EEPROM, it is possible to read the contents of a single cell.
• The key difference is that, in a flash device, it is only possible to write an
entire block of cells.
• Prior to writing, the previous contents of the block are erased.
• Flash devices have greater density, which leads to higher capacity and a
lower cost per bit.
• They require a single power supply voltage, and consume less power in
their operation.
• The low power consumption of flash memories makes them attractive for
use in portable, battery-powered equipment
Applications
• Hand-held computers, cell phones, digital cameras, and MP3 music
players.
• In hand-held computers and cell phones, a flash memory holds the
software needed to operate the equipment, thus obviating the need for
a disk drive.
• A flash memory is used in digital cameras to store picture data.
• In MP3 players, flash memories store the data that represent sound.
• Cell phones, digital cameras, and MP3 players are good examples of
embedded systems.
• Single flash chips may not provide sufficient storage capacity for the
applications.
Memory Hierarchy

• The access time for main memory is about 10 times longer than the
access time for L1 cache
Accessing I/O- Devices
• A single bus-structure can be used for connecting I/O-devices to a computer
• Each I/O device is assigned a unique set of addresses.
• Bus consists of 3 sets of lines to carry address, data & control signals.
• When processor places an address on address lines, the intended-device
responds to the command.
• The processor requests either a read or write operation.
• The requested-data are transferred over the data lines.
Accessing I/O- Devices contd..

• There are 2 ways to deal with I/O-devices: 1) Memory-mapped I/O & 2) I/O-
mapped I/O.
1) Memory-Mapped I/O
• Memory and I/O-devices share the same address-space & hence the name.
• Used in most computers
• With memory-mapped I/O, any machine instruction that can access memory
can be used to transfer data to or from an I/O device.
• For example, if DATAIN is the address of a register in an input device, the
instruction
• Load R2, DATAIN reads the data from the DATAIN register and loads them into
processor register R2.
• Similarly, the instruction Store R2, DATAOUT sends the contents of register R2
to location DATAOUT, which is a register in an output device.
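
The idea that ordinary load and store operations reach device registers can be sketched with a toy model. Everything here is hypothetical: the `Bus` class, the addresses chosen for DATAIN and DATAOUT, and the variable `R2` standing in for a processor register are assumptions made for illustration, not details of any real machine.

```python
DATAIN = 0x4000    # hypothetical address of the input device register
DATAOUT = 0x4004   # hypothetical address of the output device register


class Bus:
    """Toy model of a shared address space for memory and I/O devices."""

    def __init__(self):
        self.ram = {}
        self.input_register = 0
        self.output_register = 0

    def load(self, address):
        # The same operation reads either a device register or memory.
        if address == DATAIN:
            return self.input_register
        return self.ram.get(address, 0)

    def store(self, address, value):
        # The same operation writes either a device register or memory.
        if address == DATAOUT:
            self.output_register = value
        else:
            self.ram[address] = value


bus = Bus()
bus.input_register = 0x41    # the input device supplies the code for 'A'
R2 = bus.load(DATAIN)        # corresponds to: Load R2, DATAIN
bus.store(DATAOUT, R2)       # corresponds to: Store R2, DATAOUT
print(hex(bus.output_register))
```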
Accessing I/O- Devices contd..

2) I/O-Mapped I/O
• Memory and I/O address-spaces are different.
• Special instructions named IN and OUT are used for data-transfer.
• Advantage of separate I/O space: I/O-devices deal with fewer
address-lines
Unit 1, Lecture 9-26.6.2023

Recap
• EPROM, EEPROM, FLASH memory
• Accessing I/O device
Basic processing unit
Some fundamental concepts
• A typical computing task consists of a series of operations specified by a
sequence of machine-language instructions that constitute a program.
• The processor fetches one instruction at a time and performs the
operations specified
• Instructions are fetched from successive memory locations until a branch
or jump instruction is encountered
• Processor keeps track of the address of the memory location containing
the next instruction to be fetched using the Program Counter (PC)
• After fetching an instruction, the contents of the PC are updated to point
to the next instruction in the sequence
• A branch instruction may cause a different value to be loaded into the PC.
• When an instruction is fetched, it is placed in the instruction register,
IR, from where it is interpreted, or decoded, by the processor’s
control circuitry.
• The IR holds the instruction until its execution is completed.
Basic processing unit contd..
• Consider a 32-bit computer in which each instruction is contained in one
word in the memory, as in RISC-style instruction set architecture.
• To execute an instruction, processor has to perform following 3 steps:
1) Fetch the contents of the memory location pointed to by the PC. (The content of this
location is an instruction to be executed.)
The instruction is loaded into the IR:
IR ← [[PC]]
2) Assuming that the memory is byte addressable, increment the PC by 4:
PC ← [PC] + 4
3) Carry out the actions specified by instruction in the IR.
• Instruction fetch phase : Fetching an instruction and loading it into the IR.
• Instruction execution phase: Performing the operation specified in the
instruction
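
The fetch and execution phases can be sketched as a loop. This is a toy model, not a real instruction set: the operations LOADI, ADD, BRANCH and HALT, the two registers, and the dictionary used as memory are all invented for the illustration. Each instruction occupies one 4-byte word, so the PC advances by 4.

```python
def run(memory, pc=0):
    """Run a toy program; return the final registers and PC."""
    registers = {"R1": 0, "R2": 0}
    while True:
        ir = memory[pc]      # fetch: IR <- [[PC]]
        pc = pc + 4          # PC <- [PC] + 4
        op = ir[0]           # execute: carry out the action named in the IR
        if op == "LOADI":
            registers[ir[1]] = ir[2]
        elif op == "ADD":
            registers[ir[1]] = registers[ir[2]] + registers[ir[3]]
        elif op == "BRANCH":
            pc = ir[1]       # a branch loads a different value into the PC
        elif op == "HALT":
            return registers, pc


program = {
    0: ("LOADI", "R1", 5),
    4: ("LOADI", "R2", 7),
    8: ("BRANCH", 16),
    12: ("LOADI", "R1", 99),       # skipped because of the branch
    16: ("ADD", "R1", "R1", "R2"),
    20: ("HALT",),
}

registers, final_pc = run(program)
print(registers, final_pc)
```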
Basic processing unit contd..

The operation specified by an instruction can be carried out by performing one
or more of the following actions:
1) Read the contents of a given memory-location and load them into a register.
2) Read data from one or more registers.
3) Perform an arithmetic or logic operation and place the result into a register.
4) Store data from a register into a given memory-location.
Basic processing unit contd..

Main hardware components of a processor


The hardware-components needed to perform the actions are shown in Figure.
Basic processing unit contd..
• The processor communicates with the memory through the processor-
memory interface, which transfers data from and to the memory during
Read and Write operations.
• The instruction address generator updates the contents of the PC after
every instruction is fetched.
• The register file is a memory unit whose storage locations are organized
to form the processor’s general-purpose registers.
• During execution, the contents of the registers named in an instruction
that performs an arithmetic or logic operation are sent to the arithmetic
and logic unit (ALU), which performs the required computation.
• The results of the computation are stored in a register in the register file.
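The path from the register file through the ALU and back can be sketched as follows. This is a minimal model under assumptions: the class and function names, the 16-register size, and the four operations are chosen for illustration and are not taken from the slides.

```python
MASK32 = 0xFFFFFFFF  # results are truncated to 32 bits, as in a 32-bit processor


class RegisterFile:
    """Small memory unit whose locations act as general-purpose registers."""

    def __init__(self, count=16):       # 16 registers is an assumption
        self._regs = [0] * count

    def read(self, index):
        return self._regs[index]

    def write(self, index, value):
        self._regs[index] = value & MASK32


def alu(operation, a, b):
    """Combinational arithmetic and logic unit for a few sample operations."""
    if operation == "ADD":
        result = a + b
    elif operation == "SUB":
        result = a - b
    elif operation == "AND":
        result = a & b
    elif operation == "OR":
        result = a | b
    else:
        raise ValueError(f"unsupported operation {operation!r}")
    return result & MASK32


def execute(rf, operation, rd, rs1, rs2):
    """Read two source registers, compute in the ALU, store the result in rd."""
    result = alu(operation, rf.read(rs1), rf.read(rs2))
    rf.write(rd, result)
    return result


rf = RegisterFile()
rf.write(1, 10)
rf.write(2, 3)
difference = execute(rf, "SUB", 3, 1, 2)    # R3 <- R1 - R2
wrapped = execute(rf, "SUB", 4, 2, 1)       # R4 <- R2 - R1, wraps around in 32 bits
print(difference, hex(wrapped))
```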
Data Processing Hardware
• A typical computation operates on data stored in registers.
• These data are processed by combinational circuits, such as adders,
and the results are placed into a register.
Data Processing Hardware contd..
• A clock signal is used to control the timing of data transfers.
• The registers comprise edge-triggered flip-flops into which new data
are loaded at the active edge of the clock.
• The clock period, which is the time between two successive rising
edges, must be long enough to allow the combinational circuit to
produce the correct result.
• The operation performed by the combinational block may be quite
complex, but it can often be broken down into several simpler steps,
where each step is performed by a subcircuit of the original circuit.
• These subcircuits can then be cascaded into a multistage structure.
A hardware structure with multiple stages

• If n stages are used, the operation will be completed in n clock cycles.
• Since these combinational subcircuits are smaller, they can complete
their operation in less time, and hence a shorter clock period can be used.
• A key advantage of the multistage structure is that it is suitable for
pipelined operation.
• Such a structure is particularly useful for implementing processors that
have a RISC-style instruction set.
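The effect of splitting a combinational block into stages can be checked with a short calculation. The numbers below are invented for illustration, and the model assumes the logic divides evenly across the stages and that every stage register adds a fixed overhead; neither assumption comes from the slides.

```python
def clock_period_ps(total_logic_delay_ps, stages, register_overhead_ps):
    """Minimum clock period when the logic is split evenly across the stages."""
    # Ceiling division: the clock must cover the slowest (largest) subcircuit.
    stage_delay = -(-total_logic_delay_ps // stages)
    return stage_delay + register_overhead_ps


LOGIC_DELAY_PS = 2000       # assumed delay of the whole combinational block
REGISTER_OVERHEAD_PS = 100  # assumed flip-flop setup plus propagation time
STAGES = 5

single = clock_period_ps(LOGIC_DELAY_PS, 1, REGISTER_OVERHEAD_PS)
multi = clock_period_ps(LOGIC_DELAY_PS, STAGES, REGISTER_OVERHEAD_PS)
latency = STAGES * multi    # n stages take n clock cycles for one operation

print(f"single-stage period: {single} ps")
print(f"{STAGES}-stage period: {multi} ps, latency of one operation: {latency} ps")
```

With these figures the clock period drops from 2100 ps to 500 ps, while one operation still needs 2500 ps to pass through all five stages. The benefit appears only when a new operation enters the first stage every cycle, which is the pipelined mode of operation.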
Pipelining
• Basic Concept—The Ideal Case
• The speed of execution of programs is influenced by many factors.
• One way to improve performance is to use faster circuit technology to
implement the processor and the main memory.
• Another possibility is to arrange the hardware so that more than one
operation can be performed at the same time.
• In this way, the number of operations performed per second is increased,
even though the time needed to perform any one operation is not changed.
• Pipelining is a particularly effective way of organizing concurrent activity in a
computer system.
• Pipelining is commonly known as an assembly-line operation.
Pipelining concepts
Pipelining contd..
• Example: Assembly line used in automobile manufacturing
• The first station in an assembly line may prepare the automobile
chassis, the next station adds the body, the next one installs the
engine, and so on.
• While one group of workers is installing the engine on one
automobile, another group is fitting a body on the chassis of a second
automobile, and yet another group is preparing a new chassis for a
third automobile.
• Although it may take hours or days to complete one automobile, the
assembly-line operation makes it possible to have a new automobile
rolling off the end of the assembly line every few minutes.
Pipelining contd..
• Instruction I_j is fetched in the first cycle and moves through the remaining
stages in the following cycles.
• In the second cycle, instruction I_(j+1) is fetched while instruction I_j is in the
Decode stage, where its operands are also read from the register file.
• In the third cycle, instruction I_(j+2) is fetched while instruction I_(j+1) is in the
Decode stage and instruction I_j is in the Compute stage, where an arithmetic
or logic operation is performed on its operands.
• Ideally, this overlapping pattern of execution would be possible for all
instructions.
• Although any one instruction takes five cycles to complete its execution,
instructions are completed at the rate of one per cycle.
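The overlap described above can be tabulated with a short sketch of the ideal case. The slides name only the Fetch, Decode and Compute stages; the names Memory and Write for the last two stages are an assumption here, following the usual five-stage organization in the Hamacher textbook, and the function names are hypothetical.

```python
# Stage names: the last two are assumed, the slides name only the first three.
STAGES = ["Fetch", "Decode", "Compute", "Memory", "Write"]


def schedule(num_instructions, stages=STAGES):
    """Return {cycle: {stage name: instruction index}} for an ideal pipeline."""
    table = {}
    for i in range(num_instructions):
        for s, name in enumerate(stages):
            cycle = i + s + 1       # instruction i reaches stage s in cycle i + s + 1
            table.setdefault(cycle, {})[name] = i
    return table


def total_cycles(num_instructions, num_stages=len(STAGES)):
    """Cycles to finish all instructions with no stalls: fill time plus one per extra instruction."""
    return num_stages + num_instructions - 1


table = schedule(3)
for cycle in sorted(table):
    print(cycle, table[cycle])
print(total_cycles(100))
```

Instruction index 0 stands for I_j, index 1 for I_(j+1), and so on. In the ideal case k instructions finish in 5 + k - 1 cycles, so for large k the completion rate approaches one instruction per cycle.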
Pipelining contd..

• As other instructions are fetched, execution proceeds through successive
stages.
• At any given time, each stage of the pipeline is processing a different
instruction.
• Information such as register addresses, immediate data, and the operations to
be performed must be carried through the pipeline as each instruction
proceeds from one stage to the next.
• This information is held in interstage buffers.
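The role of the interstage buffers can be sketched as a chain of registers that shift their contents forward on every clock edge. This is a simplified model under assumptions: a five-stage pipeline with four buffers, and a dictionary per instruction whose field names (op, rd, rs1, rs2, imm) are hypothetical.

```python
def clock_tick(buffers, fetched):
    """Advance every buffer's contents by one stage at a clock edge.

    buffers[0] sits after the first stage; the last buffer feeds the final
    stage. Returns the new buffer contents and the record leaving the chain.
    """
    leaving = buffers[-1]
    return [fetched] + buffers[:-1], leaving


# Five stages need four interstage buffers (assumed pipeline depth).
buffers = [None] * 4

# Each record carries what later stages need: operation, register addresses,
# and immediate data. The field names are illustrative only.
program = [
    {"op": "ADD", "rd": 2, "rs1": 0, "rs2": 1},
    {"op": "SUB", "rd": 3, "rs1": 2, "imm": 4},
]

completed = []
# Feed the program, then empty slots so the pipeline drains.
for record in program + [None] * 4:
    buffers, leaving = clock_tick(buffers, record)
    if leaving is not None:
        completed.append(leaving)

print(completed)
```

Each record travels with its instruction, so the final stage still knows the destination register even though later instructions have already been fetched behind it.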