Arm Programming
2023
ARM Processor and Programming
(22ES4CCAPP)
Credits : 3-0-1 Hours:40
Text books:
1. Computer Organization and Architecture, Carl Hamacher, Zvonko
Vranesic, McGraw-Hill,20012.
2. ARM System Developer’s Guide, Sloss, Symes, Wright, Morgan
Kaufmann Publishers, Elsevier, 2005
3. ARM Assembly Language: Fundamentals and Techniques, William
Hohl, CRC Press, Taylor and Francis, 2009
Course Outcomes
CO1: Ability to understand and explain the functional blocks of a computer
and its peripherals, the performance of a processor, memory and I/O systems,
the ARM processor, interrupts and exceptions, and stacks and subroutines
Recap
• Introduction to the course
• Functional block diagram of a computer
Information Handled by a Computer
Instructions/machine instructions
• Govern the transfer of information within a computer as well as between the
computer and its I/O devices
• Specify the arithmetic and logic operations to be performed
• Program
Data
• Used as operands by the instructions
• Source program
Encoded in binary code – 0 and 1s
Programs and data in memory
Primary storage
• Fast
• Programs must be stored in memory while they are being executed
• Large number of semiconductor storage cells
• Processed in words
• Memory hierarchy – cache, main memory
Secondary storage – larger and cheaper
Memory
• Like the pages of a notebook, with space for a fixed number of binary
numbers on each line.
• The pages are made of semiconductor materials
• Each line is an 8 bit register that can store 8 binary bits and several
of these registers are arranged in a sequence called memory
• Memory stores instructions and data in binary form and provides
this information to the µp whenever necessary.
Memory contd..
• To execute the programs, processor reads the instructions and data
from memory and performs the computing operations in its ALU
section
• Results are then transferred to the o/p section for display or stored
in memory for later use
• The memory block has 2 sections
1. ROM-stores the programs and data that do not need alterations.
Ex: monitor program of a single board microcomputer
2. RAM
Input unit
• Computers accept coded information through input units which read
the data.
• The most well known input device is the keyboard.
• Whenever a key is pressed, the corresponding letter or digit is
automatically translated into its corresponding binary code and
transmitted over a cable to either memory or the processor.
• Many other kinds of input devices are available, including joysticks,
trackballs and mice.
Input unit contd..
• The user can enter the instructions and data into memory through devices
such as a keyboard or simple switches (input devices)
• The processor reads the instructions from the memory and processes the
data according to those instructions
• The results can be displayed on devices such as a seven-segment display or
LEDs, or printed by a printer.
• These devices are called output devices
• The input section transfers the data & instructions in binary form from
the outside world to the processor
Output unit
• The O/P section transfers the data from the computer to the output
devices such as LEDs, CRT, printer etc. (i.e., its function is to send
processor results to the outside world or user).
• The most familiar output device is the printer.
• Some units, such as graphic displays, provide both an output function
and an input function.
Arithmetic and logic unit
Operations of a computer
• Accepts information in the form of programs and data through an
input unit and stores it in the memory
• Fetches the information stored in the memory, under program
control, into an ALU, where the information is processed
• Outputs the processed information through an output unit
• Controls all activities inside the machine through a control unit
Unit 1-Lecture 3-10.6.2023
Recap
• Input and Output unit
• ALU
• Control Unit
Bus Structure
• The simplest and most common way of interconnecting various parts of
the computer
• A group of lines that serves as a connecting path for several devices is called
a bus.
• The buses carry data, address and control signals
Bus structure contd..
Data Bus
• Used for transmitting the data / instruction from CPU to memory/IO and vice-versa.
• It is bi-directional.
Address Bus
• Used to carry address from CPU to memory/IO devices.
• It is used to identify the particular location in memory.
• It carries the source or destination address of data, i.e. where to store or from
where to retrieve the data.
• It is uni-directional.
Control Bus
• Used to transfer the control and timing signals from one component to the other
component.
• The CPU uses control bus to communicate with the devices that are connected to the
computer system.
• The CPU transmits different types of control signals to the system components.
• It is bi-directional. Ex: memory R/W, I/O R/W
Bus structure contd..
• All units are connected to this bus.
• Because the bus can be used for only one transfer at a time, only two
units can actively use the bus at any given time.
• Bus control lines are used to arbitrate multiple requests for use of the
bus.
• The main virtue of the single-bus structure is its low cost and its
flexibility for attaching peripheral devices.
• Systems that contain multiple buses achieve more concurrency in
operations by allowing two or more transfers to be carried out at the
same time.
• This leads to better performance but at an increased cost.
• Which of the following system buses is used to designate the source or
destination of the data on the bus itself?
1.Control bus
2.Data bus
3.Address bus
4.System bus
• The bus which is used to transfer data from main memory to
peripheral device is-
1.Data bus
2.Input bus
3.DMA bus
4.Output bus
Performance
[Fig. A, Fig. B]
Performance contd..
Processor time
• CPU time is the time for which the CPU was busy executing the task.
• It does not take into account the time spent in waiting for I/O (disk I/O or network I/O).
• Since I/O operations, such as reading files from disk, are performed by the OS, these operations
may involve a noticeable amount of time in waiting for I/O subsystems to complete their
operations.
• This waiting time will be included in the elapsed time, but not CPU time.
• Hence CPU time is usually less than the elapsed time.
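The difference between CPU time and elapsed time can be observed with a small measurement. The sketch below is only illustrative: `measure` and `wait_for_io` are hypothetical helper names, and a 0.2 s sleep is assumed as a stand-in for waiting on an I/O subsystem.

```python
import time

def measure(task):
    """Run task() and return (cpu_seconds, elapsed_seconds)."""
    cpu_start = time.process_time()      # CPU time consumed by this process
    wall_start = time.perf_counter()     # wall-clock (elapsed) time
    task()
    cpu = time.process_time() - cpu_start
    elapsed = time.perf_counter() - wall_start
    return cpu, elapsed

def wait_for_io():
    # Assumed stand-in for an I/O wait: the process sleeps and uses almost no CPU.
    time.sleep(0.2)

cpu, elapsed = measure(wait_for_io)
print(f"CPU time: {cpu:.3f} s, elapsed time: {elapsed:.3f} s")
```

On a typical system the reported CPU time should be close to zero while the elapsed time is about 0.2 s, since the waiting period counts only toward elapsed time.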
Performance contd..
• The figure shows the cache memory as part of the processor unit.
• At the start of execution, all program instructions and the required
data are stored in the main memory.
• As execution proceeds, instructions are fetched one by one over the
bus into the processor, and a copy is placed in the cache.
• When the execution of an instruction calls for data located in the
main memory, the data are fetched and a copy is placed in the cache.
• Later, if the same instruction or data item is needed a second time, it
is read directly from the cache.
• A program will be executed faster, if the movement of instructions
and data between the main memory and the processor is minimized
which is achieved by using the cache.
Example
• Suppose a number of instructions are executed repeatedly over a
short period of time as happens in a program loop.
• If these instructions are available in the cache memory, they can be
fetched quickly during period of repeated use.
• The same applies to data that are used repeatedly.
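The effect of a cache on a program loop can be sketched with a toy model. This is not how a real cache is organized: it assumes unlimited cache capacity, no replacement policy, and hypothetical instruction addresses, and it only counts where each fetch is served from.

```python
def run_loop(addresses, iterations):
    """Fetch every address once per iteration through a simple cache.

    Returns (main_memory_reads, cache_hits).
    """
    cache = set()            # addresses whose contents are already cached
    memory_reads = 0
    cache_hits = 0
    for _ in range(iterations):
        for addr in addresses:
            if addr in cache:
                cache_hits += 1      # fast path: served from the cache
            else:
                memory_reads += 1    # slow path: fetched over the bus
                cache.add(addr)      # keep a copy for later use
    return memory_reads, cache_hits

# A loop body of 4 one-word instructions executed 100 times.
loop_body = [0x100, 0x104, 0x108, 0x10C]
print(run_loop(loop_body, 100))  # (4, 396)
```

Only the first pass through the loop goes to main memory; the remaining 396 fetches are served from the cache.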
Unit 1-Lecture 4-12.6.2023
Recap
• Bus Structure
• Performance
Processor clock
• Processor circuits are controlled by a timing signal called clock.
• The clock defines the regular time intervals called clock cycles.
• To execute a machine instruction, the processor divides the action to be
performed into a sequence of basic steps such that each step can be
completed in one clock cycle.
• The length P of one clock cycle affects the processor performance.
• Its inverse is the clock rate. R=1/P which is measured in cycles per second.
• Processors used in today’s personal computers and workstations have clock
rates that range from a few hundred million to over a billion cycles per
second.
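The relation R = 1/P can be checked numerically. The sketch below assumes a cycle length of 2 ns and a hypothetical instruction that needs 4 basic steps; both numbers are chosen only for illustration.

```python
def clock_rate(period_s):
    """Clock rate R in cycles per second for a cycle length P in seconds."""
    return 1.0 / period_s

def execution_time(steps, period_s):
    """Time for an instruction that takes `steps` basic steps, one per cycle."""
    return steps * period_s

P = 2e-9                      # assumed cycle length: 2 ns
print(clock_rate(P))          # about 5e8 cycles per second (500 MHz)
print(execution_time(4, P))   # about 8e-9 s for a 4-step instruction
```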
Memory location and addresses
• Number and character operands as well as instructions are stored in
the memory of a computer.
• The memory consists of many millions of storage cells, each of which
can store a bit of information having the value 0 or 1
• Because a single bit represents a small amount of information, bits are
seldom handled individually.
• The usual approach is to deal with them in groups of fixed size.
• For this purpose, the memory is organized so that a group of n bits
can be stored or retrieved in a single, basic operation.
• Each group of n bits is referred to as a word of information and n is
called the word length.
Memory location and addresses contd…
• The memory of a computer can be schematically represented as a collection of words as
shown below
• Modern computers have word lengths that typically range from 16 to 64 bits
Memory location and addresses contd…
• If the word length of a computer is 32 bits, a single word can store a 32 bit 2’s
complement number or four ASCII characters, each occupying 8 bits, as shown
below.
• A unit of 8 bits is called a byte.
• Machine instructions may require one or more words for their representation.
Memory location and addresses contd…
• Accessing the memory to store or retrieve either a word or a byte
requires an address for each location.
• The addresses of successive locations in the memory are given by the
numbers 0 through 2^k - 1, where k is a positive integer.
• The 2^k addresses constitute the address space of the computer, and
the memory can have up to 2^k addressable locations.
Example
• A 24-bit address generates an address space of 2^24 (16,777,216)
locations = 16M, where 1M is 2^20 (1,048,576).
• A 32-bit address creates an address space of 2^32 or 4G locations,
where 1G is 2^30 and 1T (tera) is 2^40.
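The address-space sizes in the example can be reproduced with a few lines of arithmetic. The helper name `address_space` is hypothetical and used only for this sketch.

```python
def address_space(k):
    """Number of distinct addresses generated by a k-bit address."""
    return 2 ** k

K = 2 ** 10   # 1,024
M = 2 ** 20   # 1,048,576
G = 2 ** 30   # 1,073,741,824

print(address_space(24), address_space(24) // M)  # 16777216 16
print(address_space(32) // G)                     # 4
print(address_space(16) // K)                     # 64
```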
Unit 1-Lecture 5-12.6.2023
Recap
Byte addressability
• Information can be handled as bits, bytes or words.
• A byte is always 8 bits, but the word length typically ranges from 16 to 64 bits.
• It is impractical to assign distinct addresses to individual bit locations in the
memory.
• The most practical assignment is to have addresses refer to successive byte
locations in the memory.
• This is the assignment used in most modern computers.
• The term byte addressability is used for this assignment.
• Byte locations have addresses 0, 1, 2,… .
• Thus if the word length of the machine is 32 bits, successive words are located
at addresses 0, 4, 8, … with each word consisting of 4 bytes.
Big-Endian and little Endian Assignments
• There are two ways that the byte addresses can be assigned across
words as shown below
• The name big-endian is used when the lower byte addresses are used for
the more significant bytes of a word.
• The name little-endian is used when the lower byte addresses are used for
the less significant bytes of a word.
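The two byte orderings can be compared by laying out one 32-bit word in both ways. The value 0x12345678 is an arbitrary example, and `show` is a hypothetical helper that lists each byte next to its byte address within the word.

```python
value = 0x12345678            # example 32-bit word (arbitrary value)

big = value.to_bytes(4, byteorder="big")        # most significant byte first
little = value.to_bytes(4, byteorder="little")  # least significant byte first

def show(layout):
    # One "address: byte" pair per byte, starting at byte address 0.
    return [f"{addr}: {byte:02X}" for addr, byte in enumerate(layout)]

print(show(big))     # ['0: 12', '1: 34', '2: 56', '3: 78']
print(show(little))  # ['0: 78', '1: 56', '2: 34', '3: 12']
```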
Word alignment
• In case of a 32 bit word length, natural word boundaries occur at
addresses 0, 4, 8 .. as shown in the fig.
• Words are said to be aligned in memory, if they begin at a byte
address that is a multiple of the number of bytes in the word.
• 16 bit word: Word address 0, 2, 4…
• 32 bit word: Word address 0, 4, 8, 12..
• 64 bit word: Word address 0, 8, 16, 24..
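The alignment rule amounts to a remainder check on the byte address. A minimal sketch, with hypothetical helper names, assuming word lengths that are a multiple of 8 bits:

```python
def is_aligned(byte_address, word_bits):
    """True if a word of word_bits bits starting at byte_address is aligned."""
    bytes_per_word = word_bits // 8
    return byte_address % bytes_per_word == 0

def aligned_addresses(word_bits, count):
    """First `count` aligned word addresses for the given word length."""
    step = word_bits // 8
    return [i * step for i in range(count)]

print(aligned_addresses(16, 4))  # [0, 2, 4, 6]
print(aligned_addresses(32, 4))  # [0, 4, 8, 12]
print(aligned_addresses(64, 4))  # [0, 8, 16, 24]
print(is_aligned(6, 32))         # False
```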
Accessing numbers, characters, and character strings
• A number usually occupies one word.
• It can be accessed in the memory by specifying its word address.
• Similarly individual characters can be accessed by their byte address.
• In many applications, it is necessary to handle character strings of
variable length.
• The beginning of the string is indicated by giving the address of the byte
containing its first character. Ex: ‘college’ address is 0004h
• Successive byte locations contain successive characters of the string.
• There are 2 ways to indicate the length of the string.
• A special control character with the meaning ‘end of string’ can be used
as the last character of the string (for example, $).
• Or a separate memory word location or processor register can contain a
number indicating the length of the string in bytes.
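Both ways of marking the string length can be tried on a small simulated memory. The sketch assumes a 32-byte memory, the start address 0004h from the example, and '$' as the end-of-string character; the function names are hypothetical.

```python
memory = bytearray(32)        # small byte-addressable memory, all zeros
START = 0x0004                # address of the first character
TERMINATOR = ord("$")         # assumed end-of-string control character

# Store 'college' followed by the terminator in successive byte locations.
text = b"college"
memory[START:START + len(text)] = text
memory[START + len(text)] = TERMINATOR

def read_terminated(mem, start):
    """Method 1: read bytes until the end-of-string character is found."""
    end = start
    while mem[end] != TERMINATOR:
        end += 1
    return bytes(mem[start:end]).decode("ascii")

def read_counted(mem, start, length):
    """Method 2: the length in bytes is held separately (word or register)."""
    return bytes(mem[start:start + length]).decode("ascii")

print(read_terminated(memory, START))   # college
print(read_counted(memory, START, 7))   # college
```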
Unit 1-Lecture 6-19.6.2023
Recap
• Byte addressability
• Big Endian and Little Endian assignments
• Word alignment
• Accessing numbers, characters, and character strings
Memory systems
• Programs and the data they operate on are held in the memory of the
computer.
• The execution speed of programs is highly dependent on the speed
with which instructions and data can be transferred between the
processor and the memory.
• It is required to have a large memory to facilitate execution of
programs that are large and deal with large amount of data.
• Ideally, memory would be fast, large and inexpensive.
• But it is not possible to meet all three requirements
simultaneously.
• Increased speed and size are achieved at increased cost.
Basic concepts
• Maximum size of the memory that can be used in any computer is
determined by the addressing scheme.
For example
• A 16-bit computer that is capable of generating 16-bit addresses is capable of
addressing up to 2^16 = 64K memory locations.
• The number of memory locations represents the size of the address space of
a computer.
• Most modern computers are byte addressable.
• Fig. (Next slide) shows the address assignments for a byte-addressable 32
bit computer.
• The little Endian arrangement is used in Intel processors.
• ARM can be configured to use either little Endian or big Endian arrangement.
Basic concepts contd..
• The memory is usually designed to store and retrieve data in word-length
quantities.
• For example, consider a byte-addressable computer whose instructions
generate 32-bit addresses.
• When a 32-bit address is sent from the processor to the memory unit, the
high order 30 bits determine which word will be accessed.
• If a byte quantity is specified, the low-order 2 bits of the address specify
which byte location is involved.
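The split of a 32-bit byte address into a word selector and a byte selector can be written with a shift and a mask. The function name and the sample addresses below are assumptions made for this sketch.

```python
def split_address(address):
    """Split a 32-bit byte address into (word_index, byte_in_word)."""
    word_index = address >> 2      # high-order 30 bits select the word
    byte_in_word = address & 0b11  # low-order 2 bits select the byte
    return word_index, byte_in_word

print(split_address(0x00000004))  # (1, 0)
print(split_address(0x00000007))  # (1, 3)
print(split_address(0x0000100A))  # (1026, 2)
```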
Unit 1-Lecture 7 20.6.2023
Recap
• Memory Systems
• Basic concepts
Basic concepts contd..
• The connection between the processor and its memory consists of
address, data, and control lines
• The processor uses the address lines to specify the memory location
involved in a data transfer operation, and uses the data lines to transfer
the data
• At the same time, the control lines carry the command indicating a
Read or a Write operation and whether a byte or a word is to be
transferred.
• The control lines also provide the necessary timing information and are
used by the memory to indicate when it has completed the requested
operation.
• When the processor-memory interface receives the memory’s response,
it asserts the MFC (Memory Function Completed) signal shown in Figure
Basic concepts contd..
• This is the processor’s internal control signal that indicates that the
requested memory operation has been completed.
• When asserted, the processor proceeds to the next step in its
execution sequence
• Data transfer between the processor and memory takes place
through two processor registers: the Memory Data Register (MDR) and
the Memory Address Register (MAR).
Basic concepts contd..
• Memory access time: A useful measure of the speed of memory units
is the time that elapses between the initiation of an operation to
transfer a word of data and the completion of that operation.
• Memory cycle time: Another important measure which is the
minimum time delay required between the initiation of two
successive memory operations
For example, the time between two successive Read operations.
• The cycle time is usually slightly longer than the access time,
depending on the implementation details of the memory unit.
• A memory unit is called a random-access memory (RAM) if the access
time to any location is the same, independent of the location’s
address.
Cache memory
• The processor of a computer can usually process instructions and data
faster than they can be fetched from the main memory.
• Hence, the memory access time is the bottleneck in the system.
• One way to reduce the memory access time is to use a cache
memory.
• This is a small, fast memory inserted between the larger, slower main
memory and the processor.
• It holds the currently active portions of a program and their data.
Virtual memory
• The access time for main memory is about 10 times longer than the
access time for L1 cache
Accessing I/O- Devices
• A single bus-structure can be used for connecting I/O-devices to a computer
• Each I/O device is assigned a unique set of addresses.
• Bus consists of 3 sets of lines to carry address, data & control signals.
• When processor places an address on address lines, the intended-device
responds to the command.
• The processor requests either a read or write operation.
• The requested-data are transferred over the data lines.
Accessing I/O- Devices contd..
• There are 2 ways to deal with I/O-devices: 1) Memory-mapped I/O & 2) I/O-
mapped I/O.
1) Memory-Mapped I/O
• Memory and I/O-devices share the same address-space & hence the name.
• Used in most computers
• With memory-mapped I/O, any machine instruction that can access memory
can be used to transfer data to or from an I/O device.
• For example, if DATAIN is the address of a register in an input device, the
instruction Load R2, DATAIN reads the data from the DATAIN register and loads
them into processor register R2.
• Similarly, the instruction Store R2, DATAOUT sends the contents of register R2
to location DATAOUT, which is a register in an output device.
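The idea that one Load or Store operation serves both memory and devices can be sketched with a toy bus model. Everything here is an assumption for illustration: the `Bus` class, the addresses chosen for DATAIN and DATAOUT, and the keyboard and display stand-ins do not describe any particular machine.

```python
DATAIN = 0x4000    # hypothetical address of an input-device register
DATAOUT = 0x4004   # hypothetical address of an output-device register

class Bus:
    """One address space shared by memory and device registers."""

    def __init__(self, keyboard_data):
        self.memory = {}                  # ordinary memory locations
        self.keyboard = list(keyboard_data)
        self.display = []                 # values written to the output device

    def load(self, address):
        # The same Load operation serves memory and I/O.
        if address == DATAIN:
            return self.keyboard.pop(0)   # read the input device register
        return self.memory.get(address, 0)

    def store(self, address, value):
        # The same Store operation serves memory and I/O.
        if address == DATAOUT:
            self.display.append(value)    # write the output device register
        else:
            self.memory[address] = value

bus = Bus(keyboard_data=[ord("A")])
R2 = bus.load(DATAIN)      # Load R2, DATAIN
bus.store(DATAOUT, R2)     # Store R2, DATAOUT
print(bus.display)         # [65]
```

Only the address decides whether an access reaches memory or a device, which is why no special I/O instructions are needed in this scheme.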
Accessing I/O- Devices contd..
2) I/O-Mapped I/O
• Memory and I/O address-spaces are different.
• Special instructions named IN and OUT are used for data-transfer.
• Advantage of separate I/O space: I/O-devices deal with fewer
address-lines
Unit 1, Lecture 9-26.6.2023
Recap
• EPROM, EEPROM, FLASH memory
• Accessing I/O device
Basic processing unit
Some fundamental concepts
• A typical computing task consists of a series of operations specified by a
sequence of machine-language instructions that constitute a program.
• The processor fetches one instruction at a time and performs the
operations specified
• Instructions are fetched from successive memory locations until a branch
or jump instruction is encountered
• Processor keeps track of the address of the memory location containing
the next instruction to be fetched using the Program Counter (PC)
• After fetching an instruction, the contents of the PC are updated to point
to the next instruction in the sequence
• A branch instruction may cause a different value to be loaded into the PC.
• When an instruction is fetched, it is placed in the instruction register,
IR, from where it is interpreted, or decoded, by the processor’s
control circuitry.
• The IR holds the instruction until its execution is completed.
Basic processing unit contd..
• Consider a 32-bit computer in which each instruction is contained in one
word in the memory, as in RISC-style instruction set architecture.
• To execute an instruction, processor has to perform following 3 steps:
1) Fetch the contents of the memory location pointed to by the PC. (The content of
this location is the instruction to be executed.)
The instruction is loaded into the IR:
IR ← [[PC]]
2) Assuming that the memory is byte addressable, increment the PC by 4:
PC ← [PC] + 4
3) Carry out the actions specified by instruction in the IR.
• Instruction fetch phase : Fetching an instruction and loading it into the IR.
• Instruction execution phase: Performing the operation specified in the
instruction
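The three steps can be traced with a very small simulator. This is a sketch under stated assumptions, not a model of any real processor: the two-opcode instruction set (ADD, HALT), the accumulator register `acc`, and the dictionary used as memory are all hypothetical.

```python
memory = {            # byte address -> one 32-bit instruction per word
    0: ("ADD", 5),
    4: ("ADD", 7),
    8: ("HALT", 0),
}

def run(memory):
    pc = 0            # Program Counter
    acc = 0           # hypothetical accumulator register
    while True:
        ir = memory[pc]        # 1) IR <- [[PC]]  (instruction fetch)
        pc = pc + 4            # 2) PC <- [PC] + 4
        opcode, operand = ir   # 3) decode and execute the instruction in IR
        if opcode == "ADD":
            acc += operand
        elif opcode == "HALT":
            return acc, pc

print(run(memory))  # (12, 12)
```

Steps 1 and 2 form the instruction fetch phase, and step 3 is the instruction execution phase. The PC already points to the next word by the time the current instruction executes.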
Basic processing unit contd..