EL203 - Embedded

The document outlines the course EL203-Embedded Hardware Design for Winter 2025, detailing its credit structure and course contents, which include topics such as custom processors, microcontrollers, and real-time interfacing. It provides a list of suggested textbooks and references for further reading, as well as an overview of embedded systems and their applications across various industries. Additionally, it discusses design challenges, metrics for optimization, and the architecture of embedded systems, particularly focusing on ARM processors.

Uploaded by

HARSH VITHLANI
Copyright
© © All Rights Reserved

Welcome to

EL203-Embedded Hardware Design !

Credit Structure (L-T-P-Cr): 3-0-2-4


Semester: Winter 2025
Course Contents

1. Introduction
2. Custom Single-Purpose Processors
3. Microcontroller Based on Cortex-M Class
Processors
4. Real-Time Interfacing (CAN)
5. Implementation of Embedded Systems

2
Suggested Textbook/References:
Text Book:
• Frank Vahid and Tony Givargis, Embedded System Design: A Unified Hardware/Software
Introduction, Wiley, Student edition, 2006.

Reference Book:
• Sarah Harris and David Harris, Digital Design and Computer Architecture: ARM Edition,
Morgan Kaufmann Publishers Inc., United States, May 2015.
• Jonathan Walker Valvano, Embedded Systems: Introduction to Arm® Cortex™-M
Microcontrollers, 5th Edition, CreateSpace Independent Publishing Platform, 2011.
• Samir Palnitkar, Digital Design using Verilog HDL, Prentice Hall; 2nd Edition, 2003.
• Andrew N. Sloss, Dominic Symes, and Chris Wright, ARM System Developer’s Guide: Designing and Optimizing System Software, 1st Edition, Morgan Kaufmann Publishers Inc., 2011.
• Peter R. Wilson, Design Recipes for FPGAs, Elsevier.

3
What is an Embedded System ?

• An embedded system is an electronic/electro-mechanical system designed to perform a specific function

• An embedded system is a combination of both hardware and software (firmware)
4
iPhone 7

5
iPhone 7 PCB

6
iPhone 4 PCB

7
First iPhone PCB

8
Block Diagram of a Mobile Computing System

9
iPhone XS, 2018

ASIC Accelerators

Neural
GPU
Engine
CPU

TechInsights.com Apple iPhone XS teardown

10
Intel FPGA

11
Architecture

12
Open Computing Language (OpenCL)
• Optimize use of all computational resources in the system
• CPUs, GPUs and other processors as peers
• Efficient parallel programming model
• Based on C99
• Data- and task-parallel computational model
• Abstract the specifics of underlying hardware
• Specify accuracy of floating-point computations
• Desktop and Handheld Profiles
13
OpenGL: A cross-language, cross-platform application
programming interface for rendering 2D and 3D vector graphics.
14
How many sensors are in a cell phone?
Light
Proximity
2+ Cameras
3+ Microphones
Touch I/F
GPS
Wi-Fi, Cellular, NFC,
Bluetooth
Accelerometer
Magnetometer
Gyroscope
Barometric
Temperature
Humidity
Fingerprint
… and many more
iPhone Accelerometer Sensor
[Figure: accelerometer axes ax, ay, az drawn from origin O]
a = (ax, ay, az)
az : yaw, ay : pitch, ax : roll

Install “Physics Toolbox Sensor Suite” on Your Smartphone

16
Gyro sensor

Gyroscope
17
MEMS Gyroscopes

Image: ST Microelectronics
Draper Lab comb drive tuning fork gyroscope

18
Magnetometers

Magnetic compass

Image: Henrik Mouritsen 19


iPhone Magnetometer Sensor
[Figure: magnetic field vector B drawn against the axes xB, yB, zB]
B = B (cos θ, 0, sin θ)
Rotation, R = Rz (yaw) Ry (pitch) Rx (roll)

20
Self-Driving Car

21
List of Embedded Systems
• Anti-lock brakes • Modems
• Auto-focus cameras • MPEG decoders
• Automatic teller machines • Network cards
• Automatic toll systems • Network switches/routers
• Automatic transmission • On-board navigation
• Avionic systems • Pagers
• Battery chargers • Photocopiers
• Camcorders • Point-of-sale systems
• Cell phones • Portable video games
• Cell-phone base stations • Printers
• Cordless phones • Satellite phones
• Cruise control • Scanners
• Curbside check-in systems • Smart ovens/dishwashers
• Digital cameras • Speech recognizers
• Disk drives • Stereo systems
• Electronic card readers • Teleconferencing systems
• Electronic instruments • Televisions
• Electronic toys/games • Temperature controllers
• Factory control • Theft tracking systems
• Fax machines • TV set-top boxes
• Fingerprint identifiers • VCR’s, DVD players
• Home security systems • Video game consoles
• Life-support systems • Video phones
• Medical testing systems • Washers and dryers

And the list goes on and on 22


Design Challenge – Optimizing Design Metrics
Obvious Design Goal:
• Construct an implementation with desired
functionality

Key Design Challenge:


• Simultaneously optimize numerous design metrics

Design Metric
• A measurable feature of a system’s implementation
• Optimizing design metrics is a key challenge
23
Common Metrics
• Unit Cost: the monetary cost of manufacturing each copy of the system,
excluding NRE cost
• NRE Cost (Non-Recurring Engineering Cost): The one-time monetary cost
of designing the system
• Size: the physical space required by the system
• Performance: the execution time or throughput of the system
• Power: the amount of power consumed by the system
• Flexibility: the ability to change the functionality of the system without
incurring heavy NRE cost
• Time-to-prototype: the time needed to build a working version of the
system
• Time-to-market: the time required to develop a system to the point that it
can be released and sold to customers
• Maintainability: the ability to modify the system after its initial release
• Correctness, safety, many more
24
Design Metric Competition

Power

Performance Size

NRE cost

Improving one may worsen others 25


Time-to-Market: A Demanding Design Metric
Time required to develop a product to the point it can be sold to customers

Revenues (Rs.)

Time (months)
• Market window: Period during which the product would have highest sales
• Average time-to-market constraint is about 8 months
• Delays can be costly 26
Losses Due to Delayed Market Entry
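[Figure: simplified revenue model, revenues (Rs) against time. The on-time entry triangle starts at time 0, rises during the market rise to the peak revenue at W, and falls during the market fall to zero at 2W. The delayed entry triangle starts at D and reaches a lower peak revenue (peak revenue from delayed entry) at W.]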


• Product life = 2W, peak at W
• Time of market entry defines a triangle, representing market penetration
• Triangle area equals revenue
Loss: The difference between the on-time and delayed triangle areas 27
Examples: Losses due to delayed market entry
• Area = 1/2 * base * height
• On-time = 1/2 * 2W * W
• Delayed = 1/2 * (W-D+W) * (W-D)
• Percentage revenue loss = (D(3W-D) / (2W^2)) * 100%

• Example-1
• Lifetime 2W = 52 wks, delay D = 4 wks
• (4*(3*26-4) / (2*26^2)) * 100% = 22%

• Example-2
• Lifetime 2W = 52 wks, delay D = 10 wks
• (10*(3*26-10) / (2*26^2)) * 100% = 50%

Delays are costly! 28
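The triangle model above can be checked numerically. The following is a minimal sketch in Python; the function name `revenue_loss_percent` and its parameters are illustrative choices, not part of the course material. It computes the loss from the two triangle areas rather than from the closed-form expression, so the two can be compared.

```python
def revenue_loss_percent(lifetime_weeks: float, delay_weeks: float) -> float:
    """Percentage revenue loss under the simplified triangle revenue model."""
    w = lifetime_weeks / 2  # product life is 2W, market peak at W
    d = delay_weeks
    if not 0 <= d <= w:
        raise ValueError("delay must lie between 0 and W")
    on_time = 0.5 * (2 * w) * w            # base 2W, height W
    delayed = 0.5 * (w - d + w) * (w - d)  # base W-D+W, height W-D
    return (on_time - delayed) / on_time * 100


example_1 = revenue_loss_percent(52, 4)   # about 21.9, rounds to 22
example_2 = revenue_loss_percent(52, 10)  # about 50.3, rounds to 50
print(round(example_1), round(example_2))
```

Both results agree with D(3W-D) / (2W^2) evaluated for W = 26.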


NRE and Unit Cost Metrics
Costs:
• Unit cost: the monetary cost of manufacturing each copy of the system,
excluding NRE cost
• NRE cost (Non-Recurring Engineering cost): The one-time monetary cost of
designing the system
• total cost = NRE cost + unit cost * # of units
• per-product cost = total cost / # of units
= (NRE cost / # of units) + unit cost

Example
– NRE=Rs2000, unit=Rs100
– For 10 units
– total cost = Rs2000 + 10*Rs100 = Rs3000
– per-product cost = Rs2000/10 + Rs100 = Rs300

Amortizing NRE cost over the units results in an additional Rs200 per unit
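The cost formulas above can be written as two small functions. This is a sketch in Python with illustrative names (`total_cost`, `per_product_cost`); amounts are in rupees as in the example.

```python
def total_cost(nre_cost: float, unit_cost: float, units: int) -> float:
    """total cost = NRE cost + unit cost * number of units"""
    return nre_cost + unit_cost * units


def per_product_cost(nre_cost: float, unit_cost: float, units: int) -> float:
    """per-product cost = (NRE cost / number of units) + unit cost"""
    return nre_cost / units + unit_cost


print(total_cost(2000, 100, 10))        # 3000
print(per_product_cost(2000, 100, 10))  # 300.0
```

Raising the volume shrinks the amortized NRE share: at 1000 units the same design costs Rs102 per product.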
29
The Performance Design Metric
• Widely used measure of a system, and widely abused
Clock frequency and instructions per second are not good measures
Digital camera example: a user cares about how fast it processes
images, not clock speed or instructions per second

• Latency (response time)
Time between task start and end
e.g., a camera processes an image in 0.25 seconds

• Throughput
Tasks per second, e.g., a camera processes 4 images per second

30
Processor Technology

[Figure: three processor technologies, each drawn as a controller plus a datapath, all implementing "total = 0; for i = 1 to …".
General-purpose (“software”): controller with control logic, state register, IR and PC; datapath with register file and general ALU; program memory holding the assembly code; data memory.
Application-specific: controller with control logic, state register, IR and PC; datapath with registers and custom ALU; program memory holding the assembly code; data memory.
Single-purpose (“hardware”): controller with control logic and state register; datapath with index and total registers and an adder (+); data memory.]

• Intel Processor
• AMD Processor
• ARM Processor 31
ARM Processors Based Microcontrollers

32


Embedded System
Industry
Medical
Cars

Military
Phone Computer
Space
Consumer House
Electronics
34
5 billion Hidden
• Computer A: Cell Phone
• X86 M: Microcontroller
• ARM
• AMD R: Real-Time
• Memory
• I/O interface
• Hardware
• Software
• Electrical, Mechanical, Chemical etc.
35
Consideration
• Test
Market share
• Profit
Cost
• Power
• Size
Correct answer
• Time
Right time

36
Components of an Embedded System

Electrical
Mechanical

Devices
Computer Chemical
Biological
Optical

37
Inside a Computer
Memory
Processor
RAM Volatile
Data

Memory
Code ROM I/O
Flash

Non-Volatile Bus 38
Von Neumann
Memory
Processor
RAM Volatile
Data

Memory
Code ROM I/O
Flash

Non-Volatile Bus 39
Harvard

Processor RAM

ROM I/O

System Bus
ICode
STM32F407VGT6 40
Focus I/O
ICode System Bus

ROM Processor RAM

PortA PortC
PortB PortD
PortE PortF
41
Microcontroller
ICode System Bus

ROM Processor RAM

PortA PortC
PortB PortD
PortE PortF
42
Types I/O
• Parallel
• Serial (1 bit at a time)
Communication
• Analog (measure and generate
signals)
• Time (measure and Output signals)

43
5 billion Hidden
• Computer A: Cell Phone
• X86 R: Real-Time
• ARM
• AMD M: Microcontroller
• Memory
• I/O interface
• Hardware
• Software
• Electrical, Mechanical, Chemical etc.
44
ARM Processor Family
ARM family ARM Architecture

ARM7 ARMv4

ARM9 ARMv5
ARM11 ARMv6
Cortex-A ARMv7-A
Cortex-R ARMv7-R
Cortex-M ARMv7-M
45
Cortex-A
Smartphone
Cortex-A Tablets
Servers
Desktop Processors

32-bit Cortex-A
Cortex-A5, Cortex-A7, Cortex-A8
Cortex-A9, Cortex-A12, Cortex-A15

64-bit Cortex-A
Cortex-A53, Cortex-A55, Cortex-A57
Cortex-A72, Cortex-A75, Cortex-A73
46
Cortex-R

5G Modem
Cortex-R
Automobile

Cortex-R52, Cortex-R5, Cortex-R4


Cortex-R8, Cortex-R7

47
Cortex-M
• Low-power Microcontroller
Cortex-M0, Cortex-M0+, Cortex-M1
Cortex-M3, Cortex-M4, Cortex-M7

• Modern IoT Devices


Cortex-M23, Cortex-M33

48
Microcontroller
ICode System Bus

ROM Processor RAM

PortA PortC
PortB PortD
PortE PortF
49
Microcontroller Pins

• MCU pins are grouped into ports, e.g. PortA, PortB etc.
• PA1: Pin1 of PortA 50
Input/Output

51
General Purpose I/O Special Purpose I/O
GPIO SPIO
LED I2C UART

LCD SSI TIMER

KEYPAD CAN

SWITCH PWM

7-SEGMENT ADC/DAC
DC MOTOR

52
Buses

Advanced Peripheral Bus (APB)


• Minimum of 2 clock cycles
access to peripherals

Advanced High-Performance Bus (AHB)


• 1 clock cycle access to peripherals

53
Contents
• Cortex-M Architecture
• Universal Asynchronous Receiver and
Transmitter (UART)
• TIMER
• Serial Peripheral Interface (SPI)
• Inter-Integrated Circuit (I2C) Protocol
• Analog-to-Digital Converter (ADC)
• Pulse-Width Modulation(PWM)
54
Assembly Tools

55
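[Figure: ARM tool flow. Assembly source code (.S) is translated by the ARM assembler, and C source code is likewise translated, into ELF/DWARF object files (.o). The linker combines the .o files, together with libraries built by armar, into an ARM ELF/DWARF image (.axf). fromelf produces a disassembly or a ROM format image.]
ELF: Executable and Linkable Format
DWARF: Debugging With Attributed Record Formats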
Software Development Process
Editor
Source Test
Simulator
Code
Compiler

Object Real Target


Test
Hardware

57
What You Will Learn

• The ARM instruction set architecture (ISA) and the Thumb instruction set.

• Creating data structures such as FIFO and LIFO at the assembly level.

• Creating finite state machines (FSMs) such as Moore and Mealy machines.

• Programming complex algorithms in assembly and C.

• Writing hardware drivers to configure peripherals such as GPIO, ADC, UART, TIMER, etc.

58
ARM Assembly Language

ADDRESS

PROCESSOR Memory

59
Computing Device

ADDRESS
CODE
PROCESSOR
DATA

60
COMPUTER
APPLICATIONS/OS                       Windows, Android, mbedOS

PROGRAMMING LANGUAGE                  C, C++, Java

INSTRUCTION SET ARCHITECTURE (ISA)    MOV, R0, R1, LDR, BEQ

MICROARCHITECTURE

GATES

TRANSISTORS

61
Number Systems

62
BASE 10 Counting

0 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18 19

BASE 2 Counting
0 1
10 11
100 101
63
DECIMAL BINARY
0 0000
1 0001
2 0010
3 0011
4 0100
5 0101
6 0110
7 0111
8 1000
9 1001
10 1010
11 1011
12 1100
13 1101
14 1110
15 1111 64
Binary System
110111 (base 2)

2^5  2^4  2^3  2^2  2^1  2^0

 1    1    0    1    1    1
(1x2^5) + (1x2^4) + (0x2^3) + (1x2^2) + (1x2^1) + (1x2^0)

= 55 (base 10)
65
BASE 16 Counting

0 1 2 3 4 5 6 7 8 9 A B C D E F

66
DECIMAL BINARY HEXADECIMAL
0 0000 0
1 0001 1
2 0010 2
3 0011 3
4 0100 4
5 0101 5
6 0110 6
7 0111 7
8 1000 8
9 1001 9
10 1010 A
11 1011 B
12 1100 C
13 1101 D
14 1110 E
15 1111 F 67
Hexadecimal System
0xA5E9

16^3  16^2  16^1  16^0

 A     5     E     9
(10x16^3) + (5x16^2) + (14x16^1) + (9x16^0)

= 42473 (base 10)

68
Hexadecimal System
0xA5E9

A 5 E 9
0b 1010 0101 1110 1001
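The positional expansions on the number-system slides can be reproduced with a short routine. This is a sketch in Python; `to_decimal` and `hex_to_binary_nibbles` are illustrative names. It uses the fact that each hexadecimal digit maps to exactly four binary digits.

```python
def to_decimal(digits: str, base: int) -> int:
    """Evaluate a digit string positionally: each step multiplies by the base."""
    value = 0
    for ch in digits:
        value = value * base + int(ch, base)
    return value


def hex_to_binary_nibbles(hex_digits: str) -> str:
    """Expand each hexadecimal digit into its 4-bit binary group."""
    return " ".join(format(int(ch, 16), "04b") for ch in hex_digits)


print(to_decimal("110111", 2))        # 55
print(to_decimal("A5E9", 16))         # 42473
print(hex_to_binary_nibbles("A5E9"))  # 1010 0101 1110 1001
```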

69
Bits-to-Commands

70
mnemonic

0xE0CC31B0    STRH sum, [pointer], #16


0x1AFFFFF1    BNE  loop_one
0xE3A0D008    MOV  count, #8

71
0xE3A0D008

MOV count, #8

31 28 27 26 25 24 21 20 19 16 15 12 11 0

cond 0 0 I opcode S Rn Rd shifter_operand

ADD for add


STRH for store half word
B for branch 72
Assembly Syntax

73
Instructions have 4 fields separated by spaces or tabs.

(1)            (2)     (3)        (4)

Label          opcode  operand    ; comment

store_routine  STR     R0, [R1]   ; Store the value of R0 at the memory address held in R1

74
• Label: Optional. Used to find the position of the current instruction in memory.

• Opcode: Specifies the processor operation to perform.

• Operand: Specifies the source/destination of the data to be processed by the opcode.

• Comment: Optional. Used to explain the meaning of the code.
75
• ARM Design Philosophy
• RISC Architecture

76
ARM as a Company

ARM does not manufacture processors. It designs processor cores and licenses the designs to chip manufacturers.

77
RISC – Reduced Instruction Set Computer
CISC – Complex Instruction Set Computer
E.g. – Intel Processor

78
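RISC Design Is Based on Four Philosophies

• Instructions – Reduced number of instructions.
• Pipelines – Instruction processing is split into stages so that several instructions are executed in parallel.
• Registers – Large general-purpose register set.
• Load-Store – The processor operates on data held in registers; separate load and store instructions move data between registers and memory.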

79
ARM Design Philosophy

RISC High Code


Density

ARM

Power
Efficiency

80
RISC vs. CISC

[Figure: RISC and CISC both go from compiler through code generation to processor. In RISC the greater complexity lies in the compiler; in CISC the greater complexity lies in the processor.]

81
ARM Processor vs. Intel Processor

ARM INTEL
• RISC CISC
• Little, Big Endian Little Endian

82
Embedded System with ARM Processor

Controller
ARM
Processor
Peripherals
Bus

83
ARM Based Microcontroller
ROM
SRAM
DRAM
ARM AHB-APB bridge FLASH
Processor
External
AHB-external bridge

Interrupt controller

AHB arbiter
AHB-APB bridge

Ethernet
RTC
Timers
USART

84
ARM Buses

• ARM Bus Technology

• AMBA Bus Protocol

85
ARM Bus Technology
Two classes of devices

• Bus master – ARM processor core


• Bus slave – Peripherals

86
Physical level
-Electrical Characteristics
-Bus width e.g. 16-bit, 32-bit etc.

Two bus Architecture levels

Protocol level
-Communication rules

87
AMBA Bus Protocol

AMBA – Advanced Microcontroller


Bus Architecture

ASB – Advanced System Bus

APB – Advanced Peripheral Bus

AHB – Advanced High-performance Bus

88
Memory

Performance Cache

Main
Memory

1 MB 1 GB

Memory Size

89
Memory Width

• Number of bits memory returns on each access.


E.g.
16-bit
32-bit
64-bit

Instruction  Size    8-bit memory  16-bit memory  32-bit memory
ARM          32-bit  4 cycles      2 cycles       1 cycle
Thumb        16-bit  2 cycles      1 cycle        1 cycle
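The cycle counts in the table follow from dividing the instruction size by the memory width and rounding up. The sketch below is in Python and assumes one memory access per cycle with no wait states, which is an assumption made for illustration; `fetch_cycles` is a hypothetical helper name.

```python
def fetch_cycles(instruction_bits: int, memory_width_bits: int) -> int:
    """Memory accesses needed to fetch one instruction (ceiling division)."""
    return -(-instruction_bits // memory_width_bits)


for name, size in (("ARM", 32), ("Thumb", 16)):
    counts = [fetch_cycles(size, width) for width in (8, 16, 32)]
    print(name, counts)
```

On narrow (8-bit or 16-bit) memory a Thumb instruction needs half as many fetch cycles as an ARM instruction, which is one reason Thumb code suits low-cost systems.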

90
Memory Type
• ROM (Read-Only Memory) – Cannot be reprogrammed

• Flash ROM – Can be written and read

• DRAM (Dynamic Random Access Memory) – Lowest cost per megabyte compared to other RAMs

• SRAM (Static Random Access Memory) – Faster than DRAM

• SDRAM (Synchronous Dynamic Random Access Memory) – Runs at a much higher clock rate

• EEPROM – Can be written and read


91
ARM Memory Space Allocation
32-bit ?

• Internal registers in the register bank


are 32-bit

• Data paths are 32 bits

• Bus Interfaces are 32-bit

92
ARM Memory Bit Size
The bit size determines how much memory the CPU can address for an
individual process.

An X-bit address can reach 2^X bytes of memory

A higher bit size generally allows higher performance.

A 32-bit address can reach

2^32 = 4294967296 bytes
= 4 GiB (about 4.29 GB)

93
ARM Memory Space
Width: 8-bit

Marker  Region   Address range
4GB     SRF      up to 0xFFFF FFFF
3GB              0xC000 0000
2GB              0x8000 0000
1GB     SRAM     0x4000 0000 – 0x4000 7FFF
        EEPROM   0x0010 0000 – 0x0010 1000
0       FLASH    0x0000 0000 – 0x0007 FFFF
94
Peripherals

ARM peripherals are memory-mapped

• Memory Controller: Connects different types of memory to processor


• Interrupt Controller: Determines which peripherals can access the
processor at specific times

95
Computer Architecture
Von Neumann Architecture (single bus)

ALU

CPU Memory

Control Unit

Memory Input/Output Input/Output

Data
Control Today

Proposed in 1947

96
Computer Architecture
Harvard Architecture (two or more buses)

Instruction Memory CPU Data Memory

Input/Output
Simplified Harvard Architecture:
Data Bus
CPU MEMORY
Instruction Bus

97
Cache and Tightly Coupled Memory
ARM Core

Unified Cache

Load and Control

AMBA Bus Interface Unit


Main Memory

On-Chip AMBA Bus

Von Neumann architecture with cache

98
Cache and Tightly Coupled Memory
ARM Core

Load and Control


D I

Data Instruction D I
TCM TCM

AMBA Bus Interface Unit


Main Memory
D+I
On-Chip AMBA Bus

Harvard architecture with Tightly Coupled Memory (TCM)

99
Cache and Tightly Coupled Memory
ARM Core

Load and Control


D I D I

Data Instruction Data Instruction


TCM TCM cache cache
D I
AMBA Bus Interface Unit
Main Memory
D+I
On-Chip AMBA Bus

Harvard architecture with Cache and Tightly Coupled Memory (TCM)

100
Memory Management

FLASH MPU Region 1

Peripherals MPU Region 2

SRAM MPU Region 3

MPU: Memory Protection Unit


MMU: Memory Management Unit
101
Coprocessor Extensions

ARM
Interface Coprocessor
Core

102
Multiple Coprocessor

FPU
Interface

ARM Core

Interface
Coprocessor
15

103
ARM Programmer’s Model

104
Data Types

105
Byte or 8-bits

Halfword or 16-bits

Word or 32-bits

106
Processor Models

107
ARM7TDMI

108
User

System

Undef

Abort

IRQ

FIQ

Supervisor
(SVC)
ARM7TDMI

109
User Unprivileged
Mode
System

Undef

Abort

IRQ Privileged
Mode

FIQ

Supervisor
(SVC)
ARM7TDMI

110
Mode              Description                                              Privilege
User              Normal application run mode                              Unprivileged
System            Privileged mode using the same registers as User mode    Privileged
Undef             Undefined instructions                                   Privileged
Abort             Memory access violation                                  Privileged
IRQ               Entered when a normal interrupt is raised                Privileged
FIQ               Entered when a fast (high-priority) interrupt is raised  Privileged
Supervisor (SVC)  Entered on reset and when a software interrupt occurs    Privileged

ARM7TDMI

111
Cortex-M4 Processor Models

112
               Privileged           User
Handler mode   Exception handling
Thread mode    Applications         Applications

CORTEX-M4

113
Registers
• A register is a storage location inside the processor core.

• ARM processors have a number of registers inside the processor core to perform data processing and control.

• Most of these registers are grouped in a unit called the register bank.
ARM7TDMI Processor

• 30 General-Purpose Registers
• 6 Status Registers
• 1 Program counter Register

115
ARM7TDMI Partially Shared Registers
User/System Supervisor Abort Undefined Interrupt Fast Interrupt
R0 R0 R0 R0 R0 R0
R1 R1 R1 R1 R1 R1
R2 R2 R2 R2 R2 R2
R3 R3 R3 R3 R3 R3
R4 R4 R4 R4 R4 R4
R5 R5 R5 R5 R5 R5
R6 R6 R6 R6 R6 R6
R7 R7 R7 R7 R7 R7
R8 R8 R8 R8 R8 R8_FIQ
R9 R9 R9 R9 R9 R9_FIQ
R10 R10 R10 R10 R10 R10_FIQ
R11 R11 R11 R11 R11 R11_FIQ
R12 R12 R12 R12 R12 R12_FIQ
R13 R13_SVC R13_ABORT R13_UNDEF R13_IRQ R13_FIQ
R14 R14_SVC R14_ABORT R14_UNDEF R14_IRQ R14_FIQ
PC PC PC PC PC PC
CPSR CPSR CPSR CPSR CPSR CPSR
- SPSR_SVC SPSR_ABORT SPSR_UNDEF SPSR_IRQ SPSR_FIQ
116
• The general-purpose registers (R0 – R12) contain data or addresses.

• R13 is known as the stack pointer (SP) and it points to the top element of the stack.

• R14 is known as the link register (LR) and it is used to store the return address for function calls.

• R15 is known as the program counter (PC). It is readable and writeable.

• Reading the PC returns the current instruction address plus 8 in ARM state (plus 4 in Thumb state)

• Writing to the PC causes a branch operation
117
CPSR: Current Program Status Register
Bit:    31  30  29  28  ...  7   6   5   4   3   2   1   0
Field:  N   Z   C   V        I   F   T   M4  M3  M2  M1  M0

118
CPSR Mode Bits
CPSR[4:0] MODE

10000 User mode


10001 FIQ mode
10010 IRQ Mode
10011 Supervisor mode
10111 Abort mode
11011 Undefined mode
11111 System mode
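The flag positions and the mode encodings listed above can be combined into a small decoder. This is a sketch in Python for illustration only; `decode_cpsr` and the example value 0x600000D3 are hypothetical choices, while the bit positions and mode encodings are taken from the slides.

```python
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_cpsr(cpsr: int) -> dict:
    """Split a 32-bit CPSR value into its flag bits and mode name."""
    def bit(position: int) -> int:
        return (cpsr >> position) & 1

    return {
        "N": bit(31), "Z": bit(30), "C": bit(29), "V": bit(28),
        "I": bit(7), "F": bit(6), "T": bit(5),
        "mode": MODES.get(cpsr & 0x1F, "reserved"),
    }


print(decode_cpsr(0x600000D3))
```

For the example value, Z and C are set, both interrupt masks I and F are set, T is clear, and the mode field 10011 selects Supervisor mode.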

119
ARM7TDMI Vector Table

Exception Type MODE Vector Address


• Reset SVC 0x0000 0000
• Undefined instruction UNDEF 0x0000 0004
• Software Interrupt(SVC) SVC 0x0000 0008
• Prefetch Abort ABORT 0x0000 000C
• Data Abort ABORT 0x0000 0010
• IRQ IRQ 0x0000 0018
• FIQ FIQ 0x0000 001C

120
Cortex-M Processor

• 17 General-Purpose Registers
• 1 Status Register
• 3 Interrupt Mask Registers

121
Cortex-M Registers
General-purpose registers: R0 – R12
R13: Stack Pointer (SP), banked as MSP and PSP
R14: Link Register (LR)
R15: Program Counter (PC)

Special registers: xPSR, PRIMASK, FAULTMASK, BASEPRI, CONTROL

122
Cortex-M Registers

Special Registers
xPSR
PRIMASK
FAULTMASK
BASEPRI
CONTROL

MRS ; Read special register

MSR ; Write into special register

123
Cortex-M Registers
xPSR

APSR EPSR IPSR


Application Execution Interrupt
PSR PSR PSR

Register  Fields (bit positions)
APSR      N [31], Z [30], C [29], V [28], Q [27], GE [19:16]
EPSR      ICI/IT [26:25], T [24], ICI/IT [15:10]
IPSR      ISRNUM [8:0]

124
PSR bit fields
Bit Description
• N                  Negative flag
• Z                  Zero flag
• C                  Carry flag
• V                  Overflow flag
• Q                  Sticky saturation flag
• GE[3:0]            Greater-than-or-equal flags
• ICI/IT             Interruptible-Continuable Instruction bits / IF-THEN bits
• T                  Thumb bit
• Exception Number   Indicates which exception is being handled
125
Cortex-M Vector Table
Exception Type Exception No. Vector Address
• Top of stack - 0x0000 0000
• Reset 1 0x0000 0004
• NMI 2 0x0000 0008
• Hard fault 3 0x0000 000C
• Memory management fault 4 0x0000 0010
• Bus fault 5 0x0000 0014
• Usage fault 6 0x0000 0018
• SVCall 11 0x0000 002C
• Debug Monitor 12 0x0000 0030
• PendSV 14 0x0000 0038
• SysTick 15 0x0000 003C
• Interrupt 16 and above 0x0000 0040 and above
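Every row of the Cortex-M table follows one rule: each vector is one 32-bit word, so the vector address is four times the exception number. The sketch below is in Python and assumes the vector table sits at address 0x0000 0000 as on the slide; `vector_address` is an illustrative name.

```python
VECTOR_TABLE_BASE = 0x00000000  # assumed table location, as shown on the slide


def vector_address(exception_number: int) -> int:
    """Address of the 4-byte vector table entry for an exception number."""
    return VECTOR_TABLE_BASE + 4 * exception_number


for name, number in (("Reset", 1), ("SVCall", 11), ("SysTick", 15), ("IRQ0", 16)):
    print(f"{name:8s} 0x{vector_address(number):08X}")
```

Entry 0 is not an exception vector; it holds the initial top-of-stack value.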
126
ARM Core Data Flow Model

127
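[Figure: ARM core data flow. Instructions and data arrive on the Data bus and pass through the instruction decoder and the sign extend unit into the register file (R0 – R15). Source operands Rn and Rm leave the register file on buses A and B; Rm passes through the barrel shifter to give N. The ALU and the MAC unit (with accumulator input Acc) produce the result, which is written back to Rd. The address register, fed by R15 and the incrementer, drives the Address bus.]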
ARM Pipeline

Fetch: loads an instruction from memory
Decode: identifies the instruction
Execute: processes the instruction and writes the result back to a register

141
ARM Pipeline

Time      Fetch  Decode  Execute
Cycle 1   ADD
Cycle 2   SUB    ADD
Cycle 3   CMP    SUB     ADD
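The cycle-by-cycle filling of the three-stage pipeline can be modelled in a few lines. This is a simplified sketch in Python that assumes every stage takes one cycle and that there are no stalls or branches; `pipeline_schedule` is a hypothetical helper.

```python
STAGES = ("Fetch", "Decode", "Execute")


def pipeline_schedule(instructions, cycles):
    """Return, per cycle, which instruction occupies each pipeline stage."""
    schedule = []
    for cycle in range(1, cycles + 1):
        row = {}
        for depth, stage in enumerate(STAGES):
            index = cycle - 1 - depth  # an instruction enters Fetch at its own cycle
            in_range = 0 <= index < len(instructions)
            row[stage] = instructions[index] if in_range else None
        schedule.append(row)
    return schedule


for number, row in enumerate(pipeline_schedule(["ADD", "SUB", "CMP"], 3), start=1):
    print("Cycle", number, row)
```

Once the pipeline is full (cycle 3 here), one instruction completes per cycle even though each instruction spends three cycles in flight.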

142
Increased Pipeline

Fetch Decode Execute Memory Write

143
0x8000 LDR pc, [pc, #0]
0x8004 NOP
0x8008 DCD jumpaddress

Fetch  Decode  Execute

DCD    NOP     LDR

While the LDR at 0x8000 executes, pc reads as pc + 8 (0x8000 + 8 = 0x8008), which is the address of the DCD word.
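The effective address used by `LDR pc, [pc, #0]` can be worked out from the pipeline lead. The sketch below is in Python; `pc_relative_address` is an illustrative name, and the lead of 8 bytes applies to ARM state with the three-stage pipeline shown above.

```python
ARM_STATE_PC_LEAD = 8  # pc reads two 4-byte instructions ahead in ARM state


def pc_relative_address(instruction_address: int, offset: int) -> int:
    """Effective address of a pc-relative access in ARM state."""
    return instruction_address + ARM_STATE_PC_LEAD + offset


target = pc_relative_address(0x8000, 0)
print(f"0x{target:04X}")  # 0x8008, where DCD jumpaddress is stored
```

The NOP at 0x8004 therefore acts only as padding so that the jump address lands exactly where the pc-relative load looks for it.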

144
Assembler Rules
and Directives

145
Structure of
an Assembly Module

146
        AREA    FFT, CODE, READONLY    ; Name this block of code FFT

        ENTRY                          ; Mark first instruction to execute

Start   MOV     R0, #10                ; Set up parameters

        MOV     R1, #3
        ADD     R0, R0, R1             ; R0 = R0 + R1

Stop    B       Stop                   ; Infinite loop

        END                            ; Mark end of file

147
        AREA    FFT, CODE, READONLY    ; Name this block of code FFT

        EXPORT  Start                  ; Make the label Start visible to other files

Start   MOV     R0, #10                ; Set up parameters

        MOV     R1, #3
        ADD     R0, R0, R1             ; R0 = R0 + R1

Stop    B       Stop                   ; Infinite loop

        END                            ; Mark end of file

148
ARM, Thumb, and Thumb-2 Instructions

149
• ARM instructions are 32 bits wide

• Used in the ARM7TDMI, ARM9, ARM10, and ARM11 processors

• Thumb instructions are a subset of the ARM instructions

• Thumb instructions are 16 bits wide but still process 32-bit data

150
Examples:

ARM instruction: ADD R0, R0, R2

Thumb instructions: ADD R0, R2

151
Instruction set       Processors
ARM instruction       ARM7TDMI, ARM9, ARM10, ARM11
Thumb instruction     ARM7TDMI, ARM9, ARM10, Cortex-A, Cortex-R
Thumb-2 instruction   Cortex-M

152
Predefined Register Names

153
r0 – r15 or R0 – R15
a1 – a4 (argument, result, or scratch registers, same as r0 to r3)

sp or SP (stack pointer)
lr or LR (link register)
pc or PC (program counter)
cpsr or CPSR
spsr or SPSR
apsr or APSR

154
Directives ?

• Assist and control assembly process.

• Are also called pseudo-ops.

• Not part of the instruction set.

• They change the way code is


assembled.
155
Keil GNU Assembler
AREA .sect
RN .asg
EQU .equ
DCB, DCW, DCD .byte, .half, .word
ALIGN .align
SPACE .space
END .end
156
• THUMB – Placed at the top of the file to specify that the code is generated with Thumb instructions.

• CODE – Denotes the section for machine instructions (ROM).

• DATA – Denotes the section for global variables (RAM).

• AREA – Instructs the assembler to assemble a new code or data section.

• SPACE – Reserves a block of memory and fills it with zeros.

• ALIGN – Used to ensure the next object is aligned properly.

157
• EXPORT – Makes an object accessible from another file.

• GLOBAL – Same as EXPORT.

• IMPORT – Accesses an “exported” object.

• END – Placed at the end of each file.

• DCB – Places a byte (8-bit) sized constant in memory.

• DCW – Places a half-word (16-bit) sized constant in memory.

• DCD – Places a word (32-bit) sized constant in memory.

• EQU – Gives a symbolic name to a numeric constant.


158

159
Load-Store Instructions
and Addressing

160
Memory Demarcations

161
Width: 8-bit

Marker  Region   Address range
4GB     SRF      up to 0xFFFF FFFF
3GB              0xC000 0000
2GB              0x8000 0000
1GB     SRAM     0x4000 0000 – 0x4000 7FFF
        EEPROM   0x0010 0000 – 0x0010 1000
0       FLASH    0x0000 0000 – 0x0007 FFFF
162
Bit Size
The bit size determines how much memory the CPU can address for an
individual process.

An X-bit address can reach 2^X bytes of memory

A higher bit size generally allows higher performance.

A 32-bit address can reach

2^32 = 4294967296 bytes
= 4 GiB (about 4.29 GB)

163
Texas Instruments TM4C123xx memory map

164
165
166
Frequently used Load/Store Instructions

167
LOAD    STORE   Size and Type
LDR     STR     32-bits
LDRB    STRB    8-bits (byte)
LDRH    STRH    16-bits (Halfword)
LDRSB           Signed byte
LDRSH           Signed Halfword
LDM     STM     Multiple words

168
Load: Take value from memory and write to register

Store: Read value from register and write to memory

LDR {size} {cond} <Rd>, <addressing_mode>


STR {size} {cond} <Rd>, <addressing_mode>

Size: e.g. byte, halfword, word etc.


Rd: source or destination register
169
Frequently used Load/Store Instructions

170
Frequently used Load/Store Instructions

LOAD    STORE   Size and Type        Register bits
LDR     STR     32-bits              31:0
LDRB    STRB    8-bits (byte)        7:0
LDRH    STRH    16-bits (Halfword)   15:0
LDRSB           Signed byte
LDRSH           Signed Halfword
LDM     STM     Multiple words

171
Example: LDR R11, [R0]

Address Memory

0x8000 0xEE
0x8001 0xFF
0x8002 0x90
0x8003 0xA7

before LDR
R0 0x8000 R11 0x12345678

after LDR
R11 0xA790FFEE

172
Example: LDRH R11, [R0]

Address Memory

0x8000 0xEE
0x8001 0xFF
0x8002 0x90
0x8003 0xA7

before LDRH
R0 0x8000 R11 0x12345678

after LDRH
R11 0x0000FFEE

173
Example: LDRSH R11, [R0]

Address Memory

0x8000 0xEE
0x8001 0x8C
0x8002 0x90
0x8003 0xA7

before LDRSH
R0 0x8000 R11 0x12345678

after LDRSH
R11 0xFFFF8CEE
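The three load examples above differ only in access size and in whether the value is zero-extended or sign-extended to 32 bits. The sketch below models them in Python for a little-endian memory; the `load` helper and the dictionary-based memory are illustrative, not how the hardware is organised.

```python
def load(memory, address, size, signed=False):
    """Read `size` bytes (little-endian) and extend the value to 32 bits."""
    raw = bytes(memory[address + i] for i in range(size))
    value = int.from_bytes(raw, byteorder="little", signed=signed)
    return value & 0xFFFFFFFF  # register contents as an unsigned 32-bit pattern


memory = {0x8000: 0xEE, 0x8001: 0xFF, 0x8002: 0x90, 0x8003: 0xA7}
ldr = load(memory, 0x8000, 4)                 # LDR   R11, [R0]
ldrh = load(memory, 0x8000, 2)                # LDRH  R11, [R0]

memory[0x8001] = 0x8C                         # memory contents of the LDRSH slide
ldrsh = load(memory, 0x8000, 2, signed=True)  # LDRSH R11, [R0]

print(f"0x{ldr:08X} 0x{ldrh:08X} 0x{ldrsh:08X}")
```

LDRSH copies bit 15 of the halfword (set in 0x8CEE) into the upper 16 bits, which is why the result begins with FFFF.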

174
Example: STR R3, [R8]

Address Memory

0x8000 0xBE
0x8001 0xBA
0x8002 0xED
0x8003 0xFE

before STR
R3 0xFEEDBABE R8 0x00008000

after STR
R8 0x00008000

175
Example: STR R3, [R8], #4

Address Memory

0x8000 0xBE
0x8001 0xBA
0x8002 0xED
0x8003 0xFE

before STR
R3 0xFEEDBABE R8 0x00008000

after STR
R8 0x00008004

176
More examples:

LDR   R5, [R3]              ; load R5 with data from ea<R3>
STRB  R0, [R9]              ; store byte in R0 to ea<R9>
STR   R3, [R9, R5, LSL #3]  ; store data in R3 to ea<R9 + (R5<<3)>
LDR   R1, [R0, #4]!         ; load R1 from ea<R0+4>, R0=R0+4
STRB  R7, [R6, #-1]!        ; store byte to ea<R6-1>, R6=R6-1
LDR   R3, [R9], #4          ; load R3 from ea<R9>, R9=R9+4
STR   R1, [R5], #8          ; store word to ea<R5>, R5=R5+8

177
Addressing Modes

• Pre-indexed addressing
• Post-indexed addressing

178
Pre-indexed addressing

LDR {size} {cond} <Rd>, [<Rn>, <offset>] {!}

STR {size} {cond} <Rd>, [<Rn>, <offset>] {!}

179
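Example: STR R0, [R1,#12]

R0 (source register) = 0x5
R1 (base register) = 0x200
Offset = 12, so the effective address is 0x200 + 12 = 0x20C
Memory at 0x20C receives 0x5; R1 remains 0x200

180
Example: STR R0, [R1,#12]!

Same store to 0x20C, but the "!" suffix writes the effective address back to the base register, so R1 becomes 0x20C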


181
Post-indexed addressing

LDR {size} {cond} <Rd>, [<Rn>], <offset>

STR {size} {cond} <Rd>, [<Rn>], <offset>

182
Example: STR R0, [R1],#12

R0 (source register) = 0x5
R1 (base register) = 0x200 before the store
Memory at 0x200 receives 0x5
R1 (updated base register) = 0x20C after the store
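
The difference between offset, pre-indexed, and post-indexed stores can be sketched in C. This is only a model under stated assumptions: the `Machine` type and the function names are hypothetical, and memory is a small word array indexed by byte address divided by 4.

```c
#include <stdint.h>

/* A base register (byte address) plus a small word-addressed memory. */
typedef struct {
    uint32_t base;      /* plays the role of Rn */
    uint32_t mem[256];  /* mem[i] holds the word at byte address 4*i */
} Machine;

/* STR Rd, [Rn, #off] : store at Rn+off, Rn unchanged */
void str_offset(Machine *m, uint32_t rd, uint32_t off)
{
    m->mem[(m->base + off) / 4] = rd;
}

/* STR Rd, [Rn, #off]! : Rn = Rn+off first, then store at the new Rn */
void str_pre(Machine *m, uint32_t rd, uint32_t off)
{
    m->base += off;
    m->mem[m->base / 4] = rd;
}

/* STR Rd, [Rn], #off : store at Rn, then Rn = Rn+off */
void str_post(Machine *m, uint32_t rd, uint32_t off)
{
    m->mem[m->base / 4] = rd;
    m->base += off;
}
```

Starting from a base of 0x200 and an offset of 12, the first two forms write to 0x20C and the last writes to 0x200; only the first leaves the base register at 0x200.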

183
ENDIANNESS
32-bit register

R3 0x0A0B0C0D

Store

Memory 0x400 - 0x403

• Little Endian
• Big Endian
184
Little Endian

0x0D 0x0C 0x0B 0x0A


0x400 0x401 0x402 0x403 0x404

Big Endian

0x0A 0x0B 0x0C 0x0D


0x400 0x401 0x402 0x403 0x404
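
A short C sketch of the two byte orders follows. The helper names are illustrative; each function writes the four bytes of a word to consecutive addresses, as in the 0x400 to 0x403 picture above.

```c
#include <stdint.h>

/* Little endian: least significant byte at the lowest address. */
void store_le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v);
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Big endian: most significant byte at the lowest address. */
void store_be(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)(v);
}
```

Storing 0x0A0B0C0D yields 0D 0C 0B 0A with `store_le` and 0A 0B 0C 0D with `store_be`.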

185
Defining Memory Areas
table DCB 0xFE, 0xF9, 0x12, 0x34
DCB 0x11, 0x22, 0x33, 0x44

Address Data Value


0x4000 0xFE
0x4001 0xF9
0x4002 0x12
0x4003 0x34
0x4004 0x11
0x4005 0x22
0x4006 0x33
0x4007 0x44
186
table DCD 0xFEF91234
DCD 0x11223344

Address Data Value (little endian)


0x4000 0x34
0x4001 0x12
0x4002 0xF9
0x4003 0xFE
0x4004 0x44
0x4005 0x33
0x4006 0x22
0x4007 0x11
187
Arithmetic and Logic Instructions

188
Flags

189
ARM7TDMI CPSR

Bit:   31 30 29 28 ... 7  6  5  4  3  2  1  0
Field: N  Z  C  V  ... I  F  T  M4 M3 M2 M1 M0

190
Cortex-M PSR

Register  Fields (bit positions)
APSR      N (31), Z (30), C (29), V (28), Q (27), GE (19:16)
EPSR      ICI/IT (26:25), T (24), ICI/IT (15:10)
IPSR      ISRNUM (8:0)

191
The N flag

• Indicates a negative result (bit 31 of the result is 1).


Adding -1 to -2

  FFFFFFFF
+ FFFFFFFE
  FFFFFFFD (carry out of bit 31 is discarded)
1111 1111 1111 1111 1111 1111 1111 1101

192
Set N bit in PSR

MOV R3, #-1


MOV R4, #-2
ADDS R3, R4, R3

193
Example

7B000000
+ 30000000
AB000000
1010 1011 0000 0000 0000 0000 0000 0000

194
The V flag
• Indicates signed overflow: the result does not fit in 32-bit two's complement.
• Overflow occurs when:
the result of an addition, subtraction, or compare is
greater than 2^31 - 1 or less than -2^31

  A1234567
+ B0000000
1 51234567
195
Set V bit in PSR

LDR R3, =0x7B000000


LDR R4, =0x30000000
ADDS R3, R4, R3

196
The Z flag
• Set when the result is zero.

MOVT R7, #0xF4 ;set upper halfword: counter = 0xF40000 if the lower halfword of R7 is 0


delay
SUBS R7, R7, #1 ;pseudo delay
BNE delay
197
The C flag
• Set when an unsigned result does not fit in 32 bits (carry out of bit 31, i.e. result ≥ 2^32)

LDR R3, =0x7B000000


LDR R7, =0xF0000000
ADDS R4, R7, R3 ;value exceeds 32 bits, generated C out

198
Comparison Instructions

199
CMP – Compare
Subtracts a register or an immediate value from a
register value and updates the condition codes

CMN – Compare negative


Adds a register or an immediate value to another
register and updates the condition codes

200
TST – Test
Logically ANDs an arithmetic value with a register
value and updates the condition codes without
affecting V flag

TEQ – Test equivalence


Logically exclusive ORs an arithmetic value with a
register value and updates the condition codes
without affecting V flag

201
MRS –
Move PSR to general-purpose register

MSR –
Move general-purpose register to PSR

e.g. ARM7TDMI or Raspberry Pi


MRS R0, CPSR
MRS R1, SPSR
202
Example:

CMP R8, #0 ;R8 == 0 ?


BEQ ROUTINE ;yes, then go to my routine
TST R4, R3 ;R3 = 0xC0000000 to test bit 31, 30
TEQ R9, R4, LSL #3

203
Example Cortex-M:

MRS R3, APSR ;read flag information into R3


MSR APSR, R2 ;write to just the flags
MSR PSR, R7 ;write all status information from R7 to the PSR

204
What sets N, Z, C, V flags?

1. Comparison and test instructions, e.g. TST or CMP


2. Explicit flag setting instructions i.e. instructions ending with S
e.g. ADDS,SUBS
3. Direct write to PSR to explicitly set or clear flags
4. 16-bit Thumb ALU instruction
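
How ADDS derives the four flags can be sketched in C. This is a model for illustration only; the `Flags` type and `adds_flags` name are hypothetical. Carry is the bit that falls out of a 33-bit sum, and overflow occurs when both operands share a sign that the result does not.

```c
#include <stdint.h>

typedef struct {
    unsigned n, z, c, v;
} Flags;

/* Model of ADDS: returns the flags and writes the 32-bit result. */
Flags adds_flags(uint32_t a, uint32_t b, uint32_t *result)
{
    uint64_t wide = (uint64_t)a + b;   /* keep the carry out of bit 31 */
    uint32_t r = (uint32_t)wide;
    Flags f;

    f.n = r >> 31;                     /* N: bit 31 of the result */
    f.z = (r == 0);                    /* Z: result is zero */
    f.c = (unsigned)(wide >> 32);      /* C: unsigned carry out */
    /* V: operands have the same sign and the result sign differs */
    f.v = (~(a ^ b) & (a ^ r)) >> 31;

    *result = r;
    return f;
}
```

For 0x7B000000 + 0x30000000 the model reports N = 1 and V = 1; for 0x7B000000 + 0xF0000000 it reports C = 1 with V = 0.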

205
Boolean Operations

206
Examples:

MVN R5, #0 ; R5 = -1 in two's complement


AND R1, R2 , R3 ; R1 = R2 AND R3
ORR R1, R2 , R3 ; R1 = R2 OR R3
EOR R1, R2 , R3 ; R1 = R2 exclusive OR R3
BIC R1, R2 , R3 ; R1 = R2 AND NOT R3

207
Shift and Rotation

208
Operand Operand
1 2

Barrel
shifter

ALU

Result 209
LSL: Logical shift left by n bits

Multiplication by 2^n
C ....... 0

MSB

LSB
7 6 5 4 3 2 1 0
00010111 = 23 0 0 0 1 0 1 1 1

LSL
0 0 1 0 1 1 1 0 0
00101110 = 46
212
LSR: Logical shift right by n bits

Unsigned division by 2^n
0 ....... C

MSB

LSB
7 6 5 4 3 2 1 0
00010111 = 23 0 0 0 1 0 1 1 1

LSR
00001011 = 11 0 0 0 0 0 1 0 1 1
213
ASR: Arithmetic shift right by n bits

Signed division by 2^n
....... C

MSB

LSB
7 6 5 4 3 2 1 0
0 0 0 1 0 1 1 1

0 0 0 0 1 0 1 1
214
ROR: Rotate right by n bits
32-bit rotate

....... C
MSB

LSB
7 6 5 4 3 2 1 0
0 0 0 1 0 1 1 1

1 0 0 0 1 0 1 1 (8-bit illustration of a rotate right by one bit)
215
RRX: Rotate right extended by one bit

33-bit rotate, 33rd bit is carry flag

....... C

216
Simple shift and rotate examples:

LSL R4, R6, #4 ; R4 = R6 << 4 bits


LSL R4, R6, R3 ; R4 = R6 << # specified in R3
ROR R4, R6, #12 ; R4 = R6 rotated right 12 bits

217
Moving data with shift:

LSR R0, R2, #24 ; extract top byte from R2 into R0


ORR R3, R0, R3, LSL #8 ;shift up R3 and insert R0

218
Addition and Subtraction
Operations

219
Addition and Subtraction Instructions

ADD R1, R2, R3 ; R1 = R2 + R3


ADC R1, R2, R3 ; R1 = R2 + R3 + C
SUB R1, R2, R3 ; R1 = R2 – R3
SBC R1, R2, R3 ; R1 = R2 – R3 + C – 1
RSB R1, R2, R3 ; R1 = R3 – R2
RSC R1, R2, R3 ; R1 = R3 – R2 + C – 1
220
Adding 64-bit integers

ADDS R4, R0, R2 ; adding the least significant words


ADC R5, R1, R3 ; adding the most significant words
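
The ADDS/ADC pair can be sketched in C using only 32-bit arithmetic. The function name and parameter names are illustrative; the register roles follow the slide, with R0 and R2 as the low words, R1 and R3 as the high words, and R4 and R5 receiving the result.

```c
#include <stdint.h>

/* 64-bit addition built from two 32-bit additions. */
void add64(uint32_t a_lo, uint32_t a_hi,
           uint32_t b_lo, uint32_t b_hi,
           uint32_t *r_lo, uint32_t *r_hi)
{
    uint32_t lo = a_lo + b_lo;       /* ADDS: low words, may wrap */
    uint32_t carry = (lo < a_lo);    /* wrap-around means a carry out */

    *r_lo = lo;
    *r_hi = a_hi + b_hi + carry;     /* ADC: high words plus carry */
}
```

Adding 1 to 0x00000000FFFFFFFF this way produces 0x0000000100000000, which shows the carry moving from the low word into the high word.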

221
Subtraction and addition with a shifted operand

SUB R0, R0, R2, LSL #2 ; R0 = R0 – (R2 << 2)


ADD R1, R1, R3, LSR #3 ; R1 = R1 + (R3 >> 3)

222
Saturated Math Operations

223
Original signal represented by 16-bit signed integers
0x00007FFF

0xFFFF8000

Signal exceeding bound


0x00007FFF

0xFFFF8000
224
Signal with saturation

0x00007FFF

0xFFFF8000

225
Saturated math instructions

QADD, QADD8, QADD16, UQADD8, etc

226
Saturating a 32-bit signed value into a 16-bit signed value

R3 0x00030000

SSAT R4, #16, R3

Result:
R4 0x00007FFF

Q - bit set
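
Signed saturation as performed by SSAT can be modelled in C. This is a sketch, assuming `bits` is between 1 and 32; the function name and the `q` out-parameter are illustrative. The Q flag is sticky, so the model only ever sets it.

```c
#include <stdint.h>

/* Clamp x to the range of a signed 'bits'-bit integer.
   *q is set to 1 when clamping happens and is never cleared here. */
int32_t ssat(int32_t x, unsigned bits, int *q)
{
    int32_t max = (int32_t)((1u << (bits - 1)) - 1u);
    int32_t min = -max - 1;

    if (x > max) {
        *q = 1;
        return max;
    }
    if (x < min) {
        *q = 1;
        return min;
    }
    return x;
}
```

With x = 0x00030000 and bits = 16 the result is 0x00007FFF and the flag is set, matching the slide.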
227
Multiplication Operations

228
Simple multiplications

MUL R4, R2, R1 ; R4 = R2 * R1


MULS R4, R2, R1 ; R4 = R2 * R1, then set the flags
MLA R7, R8, R9, R3 ; R7 = R8 * R9 + R3

229
Multiplication with SMULL and UMULL

SMULL R4, R8, R2, R3 ; {R8, R4} = R2*R3 (signed): R4 = bits 31 - 0, R8 = bits 63 - 32


UMULL R6, R8, R0, R1 ; {R8, R6} = R0 * R1

230
Multiplication with SMLAL and UMLAL

SMLAL R4, R8, R2, R3 ; {R8, R4} = R2*R3 + {R8, R4}


UMLAL R5, R8, R0, R1 ; {R8, R5} = R0*R1 + {R8, R5}

231
Multiplying with a constant

LSL R1, R0, #2 ; R1 = R0*4


ADD R0, R1, R1, LSL #2 ; R0 = R1 + R1*4

232
Division Operations

233
Division

MOV R1, #0xFF000000


MOV R2, #0x98
UDIV R3, R1, R2 ; R3 = R1/R2

234
DSP Instructions

235
LDR R3, =0xFFFE6487
LDR R2, =0x80008F71
LDR R4, =0xFFFF0003 ;accumulator
SMMLAR R9, R2, R3, R4

236
Bit Manipulation

BFI : Bit Field Insert


UBFX : Unsigned Bit Field Extract
SBFX : Signed Bit Field Extract
BFC : Bit Field Clear
RBIT : Reverse Bit Order
237
Example
R0 0xABCDDCBA
Before Instruction

R1 0xFFFFFFFF

BFI R1, R0, #8, #8


R0 0xABCDDCBA
After Instruction
R1 0xFFFFBAFF
238
Example

Before Instruction

R0 0x000000DD

BFC R0, #4, #4


After Instruction
R0 0x0000000D
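
The bit-field operations can be expressed with masks in C. This is a minimal sketch with hypothetical helper names; `lsb` is the lowest bit of the field and `width` its size, with lsb + width assumed to be at most 32. Clearing a field, as in the 0xDD to 0x0D example, is what `bfc` does.

```c
#include <stdint.h>

/* Mask with 'width' ones starting at bit 'lsb'. */
static uint32_t field_mask(unsigned lsb, unsigned width)
{
    uint32_t ones = (width >= 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return ones << lsb;
}

/* BFI: copy the low 'width' bits of rn into rd at position 'lsb'. */
uint32_t bfi(uint32_t rd, uint32_t rn, unsigned lsb, unsigned width)
{
    uint32_t m = field_mask(lsb, width);
    return (rd & ~m) | ((rn << lsb) & m);
}

/* BFC: clear 'width' bits of rd starting at 'lsb'. */
uint32_t bfc(uint32_t rd, unsigned lsb, unsigned width)
{
    return rd & ~field_mask(lsb, width);
}

/* UBFX: extract 'width' bits of rn starting at 'lsb', zero-extended. */
uint32_t ubfx(uint32_t rn, unsigned lsb, unsigned width)
{
    return (rn >> lsb) & field_mask(0, width);
}
```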
239
Fractional Notation

240
0xF320ABCD

Signed: -215,962,675     Unsigned: 4,079,004,621

241
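Integer part: weights 2^3 2^2 2^1 2^0     Fraction part: weights 2^-1 2^-2 2^-3 2^-4

The same bit pattern 1 0 1 1 with the binary point in different positions:

1011.  = 8 + 2 + 1 = 11
101.1  = 4 + 1 + 0.5 = 5.5
10.11  = 2 + 0.5 + 0.25 = 2.75
1.011  = 1 + 0.25 + 0.125 = 1.375
0.1011 = 0.5 + 0.125 + 0.0625 = 0.6875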
242
2^-1 = 0.5
2^-2 = 0.25
2^-3 = 0.125

Two fraction bits with weights 2^-1 and 2^-2:

0 0 : 0 x 2^-1 + 0 x 2^-2 = 0
0 1 : 0 x 2^-1 + 1 x 2^-2 = 0.25
1 1 : 1 x 2^-1 + 1 x 2^-2 = 0.75
243
10110010 (binary)
= -2^7 + 2^5 + 2^4 + 2^1
= -78

244
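10110010 (binary)
= -2^7 + 2^5 + 2^4 + 2^1 = -78

An m-bit two's-complement integer x read as a fraction n:

$n = \frac{x}{2^{m-1}}, \qquad -1 \le n \le 1 - 2^{-(m-1)}$

With m = 8:

$n = \frac{-2^7 + 2^5 + 2^4 + 2^1}{2^7} = -\frac{2^7}{2^7} + \frac{2^5}{2^7} + \frac{2^4}{2^7} + \frac{2^1}{2^7}$

$= -1 + 2^{-2} + 2^{-3} + 2^{-6}$

$= -0.609375$
245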
Representing 2.71828 with 16-bits

e = 2.71828….
1 bit for sign
2 bits for integer part
13 bits left for fractional part, i.e. Q13

Qn + Qn = Qn
Qn x Qm = Q(n + m)

246
Convert e to Q13 notation

e x 2^13 = 22268.1647
Integer portion converted to Hex

0x56FC

0 10 1011011111100

Binary point
Sign bit
247
Examples

LDR R3, =0x56FC ; e in Q13 notation


LDR R2, =0x2D41 ; sqrt(2) in Q13 notation
MUL R5, R2, R3 ; product is in Q26 (Q13 x Q13)
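
The Q13 multiplication can be sketched in C. Because Qn x Qm = Q(n + m), the raw product of two Q13 values is in Q26 and needs a right shift by 13 to return to Q13. The function names are illustrative, and the shift of a negative value is assumed to be arithmetic, which the C standard leaves implementation-defined.

```c
#include <stdint.h>

#define Q13_SHIFT 13

/* Raw product of two Q13 values: the result is in Q26. */
int32_t q13_mul_raw(int16_t a, int16_t b)
{
    return (int32_t)a * b;
}

/* Product rescaled to Q13 by dropping 13 fraction bits (truncation).
   The caller must ensure the true product fits the Q13 range. */
int16_t q13_mul(int16_t a, int16_t b)
{
    return (int16_t)(((int32_t)a * b) >> Q13_SHIFT);
}
```

For e (0x56FC) times sqrt(2) (0x2D41) the raw product is 257,974,780 in Q26, and the rescaled value is 31491, which is about 3.844 in Q13.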

248
Branching

249
ARM Pipeline

Fetch Decode Execute

Load an instruction Identifies the Process instruction and


from memory instruction writes result back to
register

250
ARM Pipeline
Fetch Decode Execute

Cycle 1 ADD

Time
Cycle 2 SUB ADD

Cycle 3 CMP SUB ADD

251
ARM Cortex-M Branch Instructions
B - Branch ; Simple branching

BX - Branch Indirect ; Branching using a register value

BL - Branch and Link


BLX - Branch Indirect and Link
CBZ, CBNZ - Compare and Branch if Zero
IT block - IF-THEN 252
B and BL Instruction

31 28 27 25 24 23 0

cond 1 0 1 L

Link bit 0 = Branch (B)


1 = Branch and Link

253
Mnemonic  Condition flags      Meaning            Code
EQ        Z set                Equal              0000
NE        Z clear              Not equal          0001
CS/HS     C set                Unsigned ≥         0010
CC/LO     C clear              Unsigned <         0011
MI        N set                Negative           0100
PL        N clear              Positive or zero   0101
VS        V set                Overflow           0110
VC        V clear              No overflow        0111
HI        C set and Z clear    Unsigned >         1000
LS        C clear or Z set     Unsigned ≤         1001
GE        N = V                Signed ≥           1010
LT        N ≠ V                Signed <           1011
GT        Z clear and N = V    Signed >           1100
LE        Z set or N ≠ V       Signed ≤           1101
AL        Always               Default            1110
254
Example:

CMP R0, R1
BLT LEVEL12 ; jump to level LEVEL12

255
Example:

B DELAY ; immediate address


BX R9 ; address contained in a register

256
Example:

BEQ.W DELAY ; width


BX R9 ; move the value held in R9 into PC

257
Compare and Branch

CMP R2, #0 ; compare content of R2 and zero


BEQ DELAY ; branch to delay if content of R2 equal to zero

CBZ R2, DELAY; branch to delay if content of R2 equal to zero

258
LOOPS

259
while loop

j=100
while (j!=0){
//do something
j --;
}

260
while loop: example 1
j=100
while (j!=0){
//do something
j --;
}

MOV R3, #0x64 ; j = 100


B Test
Loop .
; do something
SUBS R3, R3, #1 ; j--
Test CMP R3, #0 ; evaluate condition: j != 0 ?
BNE Loop
261
while loop: example 2
j=100
while (j!=0){
//do something
j --;
}

MOV R3, #0x64


Loop CBZ R3, Exit ; leave the loop when j == 0
; do something
SUBS R3, R3, #1 ; j--
B Loop
Exit
262
for loop

for (j=0; j<10; j++)


{
//do something
}

263
for loop: example 1
for (j=0; j<10; j++)
{
//do something
}

MOV R1, #0 ; j = 0
Loop CMP R1, #10 ; j < 10 ?
BGE DONE ; j >= 10, finish
.
; do something
.
ADDS R1, R1, #1 ; j ++
B Loop ; back to the test
DONE 264
for loop: example 2
for (j=0; j<10; j++)
{
//do something
}

MOV R1, #10 ; j = 10, count down instead of up


Loop .
.
; do something
.
SUBS R1, R1, #1 ; j = j - 1, sets Z when j reaches 0
BNE Loop ; repeat until j = 0
DONE 265
Conditional Execution

266
Testing a string for ‘!’ and ‘?’

if (char == '!' || char == '?')


found ++ ;

267
Testing a string for ‘!’ and ‘?’

if (char == '!' || char == '?')


found ++ ;

TEQ R0, # ’!’


TEQNE R0, # ’?’
ADDEQ R1, R1, #1

268
Greatest common divisor (GCD)
while (a != b) {
if (a > b) a = a – b;
else b = b – a;
}

269
GCD: Example 1
while (a != b) {
if (a > b) a = a – b;
else b = b – a;
}

gcd CMP R0, R1 ; a > b ?


BEQ end ; if a = b we're done
BLT less ; a < b branches
SUBS R0, R0, R1 ; a = a - b
B gcd ; loop again
less SUBS R1, R1, R0 ; b = b - a
B gcd
end
270
GCD: Example 2
while (a != b) {
if (a > b) a = a – b;
else b = b – a;
}

gcd CMP R0, R1


SUBGT R0, R0, R1
SUBLT R1, R1, R0
BNE gcd
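
For reference, the subtraction-based algorithm that both assembly versions implement can be written as a C function. The name `gcd_sub` is illustrative, and both inputs are assumed to be non-zero, since the loop does not terminate if exactly one of them is 0.

```c
#include <stdint.h>

/* Greatest common divisor by repeated subtraction.
   Precondition: a > 0 and b > 0. */
uint32_t gcd_sub(uint32_t a, uint32_t b)
{
    while (a != b) {
        if (a > b)
            a = a - b;   /* corresponds to SUBGT R0, R0, R1 */
        else
            b = b - a;   /* corresponds to SUBLT R1, R1, R0 */
    }
    return a;
}
```

The conditionally executed version maps one-to-one onto this loop: each pass performs one compare and at most one subtraction.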

271
IF-THEN(IT) Block

272
ITxyz condition

xyz = T for Then


xyz = E for Else

273
IT Condition
if (R3 < R8) {
R3 = R3 + R8;
R4 = 0;
}
else
R3 = 0;

274
IT Condition: example 1
if (R3 < R8) {
R3 = R3 + R8;
R4 = 0;
}
else
R3 = 0;

ITTE LT
ADDLT R3, R3, R8
MOVLT R4, #0
SUBGE R3, R3, R3 275
276
Lookup Tables

277
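Integer Lookup Tables
R5 = 0x8000 (base address of the table), R4 = offset

Address Memory
0x8000 element n
0x8004 element n + 1
0x8008 element n + 2
0x800C element n + 3
. element n + 4
. .

LDR R6, [R5, R4] ; R4 holds a byte offset
LDR R6, [R5, R4, LSL #2] ; R4 holds a word index, scaled by 4
LDRH R6, [R5, R4] ; halfword load, R4 holds a byte offset
278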


The Stack

279
• Last-In-First Out (LIFO)

PUSH
POP

280
LDM and STM

LDM <address-mode> {<cond>} <Rn> {!}, <reg-list> {^}

optional

Base register for load operation

281
Cortex-M

LDM <address-mode> {<cond>} <Rn> {!}, <reg-list>

282
Example

LDMIA R9, {R0-R3} is equivalent to:

LDR R0, [R9]
LDR R1, [R9, #4]
LDR R2, [R9, #8]
LDR R3, [R9, #12]

283
LDM and STM

STM <address-mode> {<cond>} <Rn> {!}, <reg-list> {^}

optional

Base register for store operation

284
Instructions

IA : Increment After
IB : Increment Before
DA : Decrement After
DB : Decrement Before

285
PUSH and POP Instructions

PUSH {<cond>} <reg-list>

POP {<cond>} <reg-list>

optional

286
Floating Point Arithmetic

287
Integer Datatypes

• Byte, or 8 bits
• Halfword, or 16 bits
• Word, or 32 bits

288
Word, or 32 bits
Unsigned = 0 to 4,294,967,295 (2^32 - 1)

Signed = -2,147,483,648 to 2,147,483,647


or -2^31 to 2^31 - 1

289
1037 = 0000 0000 0000 0000 0000 0100 0000 1101

1036 = 0000 0000 0000 0000 0000 0100 0000 1100

1038 = 0000 0000 0000 0000 0000 0100 0000 1110

290
Floating-Point Data Types

16-bit : half-precision
32-bit : single-precision
64-bit : double-precision
128-bit : quad-precision

291
Half-precision format
Bit 15: Sign    Bits 14-10: Exponent    Bits 9-0: Fraction

Single-precision format
Bit 31: Sign    Bits 30-23: Exponent    Bits 22-0: Fraction

Double-precision format
Bit 63: Sign    Bits 62-52: Exponent    Bits 51-0: Fraction
292
Floating point representation

$F = (-1)^{s} \times 2^{(exp - bias)} \times 1.f$

The significand 1.f lies in [1.0, 2.0)

Where:
s is the sign
• 0 for positive
• 1 for negative
exp is the exponent
• The bias is a constant specified in the format
• The purpose is to create a positive exponent
f is the fraction
• Sometimes referred to as mantissa
293
Example:
Find the single-precision representation of 6.5
Solution
$F = (-1)^{s} \times 2^{(exp - bias)} \times 1.f$

The sign is positive, so the sign bit will be 0.

The power of 2 that will result in a significand between 1.0 and almost 2.0 is 4.0 (2^2)

$6.5 = (-1)^{0} \times 2^{2} \times 1.625$
294
Find the significand
6.5 = 4 x
x = 1.625

Biased exponent = 2 + 127 = 129 = 1000 0001; fraction = 0.625 = .101 in binary

Bit 31 (sign): 0
Bits 30-23 (exponent): 1000 0001
Bits 22-0 (fraction): 101 0000 0000 0000 0000 0000

Hex: 0x40D00000
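
The encoding of 6.5 can be checked in C by copying the bytes of a float into an integer. This sketch assumes the platform's `float` is IEEE 754 single precision, which holds on Cortex-M and on common desktop compilers; the helper names are illustrative.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit pattern of a single-precision value. */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);   /* well-defined way to reinterpret bytes */
    return u;
}

/* Field extraction: sign (1 bit), exponent (8 bits), fraction (23 bits). */
unsigned float_sign(uint32_t bits)     { return bits >> 31; }
unsigned float_exponent(uint32_t bits) { return (bits >> 23) & 0xFFu; }
uint32_t float_fraction(uint32_t bits) { return bits & 0x7FFFFFu; }
```

For 6.5f the pattern is 0x40D00000: sign 0, biased exponent 129, and fraction 0x500000, which is .101 in binary followed by zeros.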

295
Half- Single- Double-
Precision Precision Precision
Format width in bits 16 32 64
Exponent width in bits 5 8 11
Fraction bits 10 23 52
Exp maximum +15 +127 +1023
Exp minimum -14 -126 -1022
Exponent bias 15 127 1023

296
Floating Point Instruction

V <operation>{cond}.F32 {<dest>},<src1>,<src2>

297
Non-Arithmetic Instruction

298
Absolute Value

VABS {cond}.F32 <Sd>,<Sm>

299
Negate

VNEG {cond}.F32 <Sd>,<Sm>

300
Addition/Subtraction

VADD {cond}.F32 <Sd>,<Sn>,<Sm>

VSUB {cond}.F32 <Sd>,<Sn>,<Sm>

301
Multiplication/Multiply-Accumulate

VMUL {cond}.F32 <Sd>,<Sn>,<Sm>

VNMLA {cond}.F32 <Sd>,<Sn>,<Sm>

302
Data Structures

303
Introduction to FIFO

Fifo_Put Fifo_Get
FIFO
Producer Consumer

304
Finite State Machine (FSM)

305
Abstract Principles of FSM

• Inputs
• Outputs
• States
• State transitions

306
Types of FSM

• Output depends only on the state


Moore FSM • Next state depends on input and current state

• Output depends on both input and state


Mealy FSM • Next state depends on input and current state

307
Moore FSM Case Study

• The robot has two drive wheels and third free turning wheel.
• Let's assume that the motors are connected to S1 and S2.

1. If both motors are on i.e., S1 =1, S2 =1 ;the robot goes straight


2. If just the left motor is on i.e., S1 =1, S2 =0 ;the robot will turn right
3. If just the right motor is on i.e., S1 =0, S2 =1 ;the robot will turn left
308
1. If S3 =0, S4 =0 ;robot is lost
2. If S3 =1, S4 =0 ;robot is just a little bit to the right
3. If S3 =0, S4 =1 ;robot is just a little bit to the left
4. If S3 =1, S4 =1 ;robot is on the line
309
(i) Left
Off center to the left

(ii) Center
Center to the road

(iii) Right
Off center to the right

310
• Go straight if robot is on the line
• Turn right if robot is off to the left
• Turn left if robot is off to the right

311
State Motor Delay 00 01 10 11
Center 1,1 1 Right Right Left Center
Left 1,0 1 Left Right Center Center
Right 0,1 1 Right Center Left Center

312
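// Link data to structure
struct State {
  unsigned char out;      // motor outputs S1, S2 (field names are assumed)
  unsigned long delay;    // time to stay in this state
  unsigned char next[4];  // next state for inputs 00, 01, 10, 11
};
typedef struct State StateType;

#define Center 0
#define Left 1
#define Right 2

StateType fsm[3] = {
  {0x03, 1, { Right, Left, Right, Center}}, //Center
  {0x02, 1, { Left, Center, Right, Center}}, //Left
  {0x01, 1, { Right, Left, Center, Center}} //Right
};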
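
A table-driven Moore FSM is usually run by a small engine that outputs from the current state and then indexes the next-state array with the input. The sketch below is self-contained and repeats the transition table from the slide; the type name, the enum constants, and the two helper functions are hypothetical, and reading the sensors or driving the motors is left out.

```c
#include <stdint.h>

enum { CENTER = 0, LEFT = 1, RIGHT = 2 };

typedef struct {
    uint8_t out;       /* motor outputs for this state */
    uint32_t delay;    /* time to stay in this state */
    uint8_t next[4];   /* next state indexed by the 2-bit sensor input */
} State;

static const State fsm[3] = {
    {0x03, 1, {RIGHT, LEFT,   RIGHT,  CENTER}},  /* Center */
    {0x02, 1, {LEFT,  CENTER, RIGHT,  CENTER}},  /* Left   */
    {0x01, 1, {RIGHT, LEFT,   CENTER, CENTER}}   /* Right  */
};

/* Moore output: depends on the state only. */
uint8_t fsm_output(uint8_t state)
{
    return fsm[state].out;
}

/* Transition: depends on the current state and the input. */
uint8_t fsm_step(uint8_t state, uint8_t input)
{
    return fsm[state].next[input & 0x3u];
}
```

A main loop would repeat four steps: write `fsm_output(state)` to the motors, wait for the state's delay, read the two sensors, and set `state = fsm_step(state, input)`.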
313
Timers

• System Tick Timer


• General Purpose Timer

314
SysTick Timer

System Tick Timer

24-bit down counter that runs at bus clock frequency

315
SysTick Registers

• SysTick Control and Status Register


• SysTick Reload Value Register
• SysTick Current Value Register

316
Example: Causing an action to occur every 1 sec

SysTick -> LOAD = 16000000 - 1

Clock speed = 16 MHz

16 MHz = 16,000,000 cycles per second, so 16,000,000 cycles = 1 second

317
How about 1 millisecond ?

16, 000, 000 cycles = 1 second


1 second = 1000 millisecond

Therefore 1 millisecond:

16,000,000 / 1000 = 16,000 cycles

SysTick -> LOAD = 16000 - 1
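
The reload calculation generalises to any clock and period. The sketch below is a plain helper, not a vendor API: the function name and parameters are hypothetical, and the only hardware fact it relies on is that the SysTick counter is 24 bits wide. The counter runs from the reload value down to 0, so a period of N cycles needs a reload of N - 1.

```c
#include <stdint.h>

#define SYSTICK_MAX_RELOAD 0x00FFFFFFu   /* 24-bit counter */

/* Compute the reload value for a period given in microseconds.
   Returns 1 and writes *reload on success, 0 if the period is
   shorter than one cycle or does not fit in 24 bits. */
int systick_reload(uint32_t clock_hz, uint32_t period_us, uint32_t *reload)
{
    uint64_t cycles = (uint64_t)clock_hz * period_us / 1000000u;

    if (cycles == 0 || cycles - 1 > SYSTICK_MAX_RELOAD)
        return 0;

    *reload = (uint32_t)(cycles - 1);
    return 1;
}
```

At 16 MHz a 1 second period gives 15,999,999 and a 1 millisecond period gives 15,999, while a 2 second period is rejected because 31,999,999 exceeds the 24-bit limit.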


318
General Purpose Timer

TIMER vs. COUNTER

Internal clock source External clock source


E.g. PLL, XTAL, RC E.g. Clock fed to CPU

319
Timer Uses

• Creating delay
• Counting events
• Measuring time between events

320
Types of Timers
(1) One-shot vs. Periodic
One-shot: timer stops counting after timeout
Periodic: timer continues counting after timeout

(2) Down-Counter vs. Up-Counter
Down-counter: timer counts from a set value to zero
Up-counter: timer counts from zero to a set value

321
Calculated Values

16-bit timer at 16 MHz:
2^16 = 65,536 counts
65,536 x (1 / 16 MHz) = 65,536 x (6.25 x 10^-8 s) = 4.096 x 10^-3 s = 4.096 milliseconds

32-bit timer at 16 MHz:
2^32 = 4,294,967,296 counts
4,294,967,296 x (1 / 16 MHz) = 4,294,967,296 x (6.25 x 10^-8 s) = 268.435 seconds

322
Introduction to Interrupts

323
Several Devices in Single Microcontroller

Interrupts vs. Polling

Interrupts:
int main()
{
while (1) {
. . . .
}
}
OnSwitch_ISR{
getData()
}

Polling:
int main()
{
while (1) {
if(switch == on){
getData(); }
. . . .
}
}

324
Several Devices in Single Microcontroller

NMI
IRQ[0]

IRQ[1]
Peripherals Cortex-M
IRQ[2] NVIC
Core
IRQ[N]
Cortex-M

Nested Vectored Interrupt Controller (NVIC)

325
Interrupt Priority

Interrupt# Interrupt Priority

1 RESET -3 Highest

2 NMI -2
3 Hard Fault -1

PRIORITY FIXED BY ARM

326
Interfacing

327
Outline
• Interfacing basics

• Microprocessor interfacing
• I/O Addressing

• Protocols
• Serial
• Parallel
• Wireless

328
A simple bus
• Wires
• Uni-directional or bi-directional
• One line may represent multiple wires

• Bus
• Set of wires with a single function
• Address bus, data bus
• Or, entire collection of wires
• Address, data and control
• Associated protocol: rules for communication

329
Bus Structure

rd'/wr

enable
Processor Memory
addr[0-11]

data[0-7]

bus
330
Ports
• Conducting device on periphery
• Connects bus to processor or memory
• Often referred to as a pin
• Actual pins on periphery of IC package that plug
into socket on printed-circuit board
• Sometimes metallic balls instead of pins
• Today, metal “pads” connecting processors and
memories within single IC
• Single wire or set of wires with single function
• E.g., 12-wire address port

331
port
rd'/wr

enable
Processor Memory
addr[0-11]

data[0-7]

bus
332
Timing Diagrams
• Most common method for describing a communication
protocol

• Time proceeds to the right on x-axis

• Control signal: low or high

• Data signal: not valid or valid

• Protocol may have sub-protocols

333
Read Example
• rd'/wr set low
• address placed on addr for at least t_setup time before enable asserted
• enable triggers memory to place data on data wires by time t_read

[Timing diagram, read protocol: rd'/wr, enable, addr, data; intervals t_setup and t_read]

[Timing diagram, write protocol: rd'/wr, enable, addr, data; intervals t_setup and t_write]
334
Basic protocol concepts
• Actor: master initiates, servant (slave) responds
• Direction: sender, receiver
• Addresses: special kind of data
• Specifies a location in memory, a peripheral, or a register within a peripheral
• Time multiplexing
• Share a single set of wires for multiple pieces of data
• Saves wires at expense of time

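[Figure: data serializing. Master and Servant share req and an 8-bit data bus; the 16-bit data(15:0) is multiplexed and sent as two transfers, bits 15:8 and then bits 7:0]
[Figure: address/data muxing. Master and Servant share req and a single addr/data bus; the address is sent first and the data second]
335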
Control Method
1. Master asserts req to receive data
2. Servant puts data on bus within time t_access
3. Master receives data and deasserts req
4. Servant ready for next request

[Timing diagram: req and data lines between Master and Servant, with events 1 to 4 marked and the interval t_access]
336
Interfacing: I/O addressing
A microprocessor communicates with other devices using
some of its pins

• Port-based I/O (parallel I/O)


• Processor has one or more N-bit ports
• Processor’s software reads and writes a port just like a register
• E.g., P0 = 0xFF; v = P1.2; -- P0 and P1 are 8-bit ports

• Bus-based I/O
• Processor has address, data and control ports that form a single bus
• Communication protocol is built into the processor
• A single instruction carries out the read or write protocol on the bus

337
Compromises/extensions
• Parallel I/O peripheral
• When processor only supports bus-based I/O but parallel I/O needed
• Each port on peripheral connected to a register within peripheral that is read/written by the processor

[Figure: Parallel I/O. Processor and Memory on the system bus, with a parallel I/O peripheral providing Port A, Port B, Port C]

• Extended parallel I/O
• When processor supports port-based I/O but more ports needed
• One or more processor ports interface with parallel I/O peripheral extending total number of ports available for I/O
• e.g., extending 4 ports to 6 ports in figure

[Figure: Extended parallel I/O. Processor with Port 0 to Port 3 and a parallel I/O peripheral providing Port A, Port B, Port C]

338
Multilevel Bus Architectures
• Don't want one bus for all communication
– Peripherals would need high-speed, processor-specific bus interface
• excess gates, power consumption, and cost; less portable
– Too many peripherals slows down bus

• Processor-local bus
• High speed, wide, most frequent communication
• Connects microprocessor, cache, memory controllers, etc.
• Peripheral bus
• Lower speed, narrower, less frequent communication
• Typically industry standard bus (ISA, PCI) for portability
• Bridge
– Single-purpose processor converts communication between busses

[Figure: microprocessor, cache, memory controller, and DMA controller on the processor-local bus; a bridge connects it to the peripheral bus, which serves the peripherals]
339
Advanced communication principles
• Layering
• Break complexity of communication protocol into pieces easier to design and
understand
• Lower levels provide services to higher level
• Lower level might work with bits while higher level might work with
packets of data
• Physical layer
• Lowest level in hierarchy
• Medium to carry data from one actor (device or node) to another
• Parallel communication
• Physical layer capable of transporting multiple bits of data
• Serial communication
• Physical layer transports one bit of data at a time
• Wireless communication
• No physical connection needed for transport at physical layer

340
Parallel communication
• Multiple data, control, and possibly power wires
• One bit per wire

• High data throughput with short distances

• Typically used when connecting devices on same IC or same


circuit board
• Bus must be kept short
• Long parallel wires result in high capacitance values which
require more time to charge/discharge
• Data misalignment between wires increases as length increases

• Higher cost, bulky


341
Serial communication
• Single data wire, possibly also control and power wires
• Words transmitted one bit at a time
• Higher data throughput with long distances
• Less average capacitance, so more bits per unit of time
• Cheaper, less bulky
• More complex interfacing logic and communication protocol
• Sender needs to decompose word into bits
• Receiver needs to recompose bits into word
• Control signals often sent on same wire as data increasing protocol
complexity

342
Wireless communication
• Infrared (IR)
• Electromagnetic wave frequencies just below visible light spectrum
• Diode emits infrared light to generate signal
• Infrared transistor detects signal, conducts when exposed to infrared light
• Cheap to build
• Need line of sight, limited range

• Radio frequency (RF)


• Electromagnetic wave frequencies in radio spectrum
• Analog circuitry and antenna needed on both sides of transmission
• Line of sight not needed, transmitter power determines range

343
Serial protocols: I2C
I2C (Inter-IC)
• Two-wire serial bus protocol developed by Philips Semiconductors in the early 1980s

• Enables peripheral ICs to communicate using simple communication hardware

• Data transfer rates up to 100 kbits/s and 7-bit addressing possible in normal mode

• 3.4 Mbits/s and 10-bit addressing in high-speed mode

Common devices capable of interfacing to I2C bus:

• EEPROMs, Flash, and some RAM memory, real-time clocks,
watchdog timers, and microcontrollers
344
I2C bus structure
[Figure: SCL and SDA lines shared by a microcontroller (master), an EEPROM (servant, Addr=0x01), a temperature sensor (servant, Addr=0x02), and an LCD controller (servant, Addr=0x03)]
[Figure: SDA and SCL waveforms for the start condition, sending 0, sending 1, and the stop condition]

Typical read/write cycle:
START, address bits A6 ... A0, R/W, ACK (from servant), data bits, ACK (from receiver), STOP
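
The first byte of the cycle combines the 7-bit address with the R/W bit. A minimal C sketch follows, assuming 7-bit addressing and the standard I2C convention that R/W = 1 means read and 0 means write; the function name is illustrative.

```c
#include <stdint.h>

/* Build the byte sent after START: address in bits 7:1, R/W in bit 0. */
uint8_t i2c_address_byte(uint8_t addr7, int read)
{
    return (uint8_t)(((addr7 & 0x7Fu) << 1) | (read ? 1u : 0u));
}
```

Reading from the temperature sensor at address 0x02 would therefore start with the byte 0x05, and writing to the LCD controller at 0x03 with the byte 0x06.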


345
Serial protocols: CAN
CAN (Controller area network)
• Protocol for real-time applications
• Developed by Robert Bosch GmbH
• Originally for communication among components of cars
• Applications now using CAN include:
• elevator controllers, copiers, telescopes, production-line control systems, and
medical instruments
• Data transfer rates up to 1 Mbit/s and 11-bit addressing

• Common devices interfacing with CAN:


• 8051-compatible 8592 processor and standalone CAN controllers

• Actual physical design of CAN bus not specified in protocol


• Requires devices to transmit/detect dominant and recessive signals to/from bus
• e.g., '0' = dominant, '1' = recessive if single data wire used
• Bus guarantees dominant signal prevails over recessive signal if asserted simultaneously
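
Because a dominant bit overrides a recessive one, the bus can resolve contention during the identifier field without destroying the winning frame. The C sketch below simulates this bit by bit. It assumes the usual CAN convention that dominant is logical 0, 11-bit identifiers sent most significant bit first, and distinct identifiers; the function name and the node limit are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

#define CAN_MAX_NODES 16

/* Simulate arbitration among n nodes that start transmitting together.
   Returns the index of the winning node, or -1 on bad input. */
int can_arbitrate(const uint16_t *ids, size_t n)
{
    int active[CAN_MAX_NODES];
    size_t i;
    int bit;

    if (n == 0 || n > CAN_MAX_NODES)
        return -1;
    for (i = 0; i < n; i++)
        active[i] = 1;

    for (bit = 10; bit >= 0; bit--) {
        unsigned bus = 1u;   /* recessive unless some node drives 0 */

        /* The bus behaves like a wired AND of all transmitted bits. */
        for (i = 0; i < n; i++)
            if (active[i])
                bus &= (unsigned)(ids[i] >> bit) & 1u;

        /* A node that sent recessive but reads dominant backs off. */
        for (i = 0; i < n; i++)
            if (active[i] && (((unsigned)(ids[i] >> bit) & 1u) != bus))
                active[i] = 0;
    }

    for (i = 0; i < n; i++)
        if (active[i])
            return (int)i;
    return -1;
}
```

Under these assumptions the node with the numerically lowest identifier wins, which is how identifier values double as message priorities.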

346
Serial Protocols: FireWire
• FireWire (a.k.a. I-Link, Lynx, IEEE 1394)
• High-performance serial bus developed by Apple Computer Inc.

• Designed for interfacing independent electronic components


• e.g., Desktop, scanner

• Data transfer rates from 12.5 to 400 Mbits/s, 64-bit addressing

• Plug-and-play capabilities

• Packet-based layered design structure

• Applications using FireWire include:


• disk drives, printers, scanners, cameras

• Capable of supporting a LAN similar to Ethernet


• 64-bit address:
• 10 bits for network ids, 1023 subnetworks
• 6 bits for node ids, each subnetwork can have 63 nodes
• 48 bits for memory address, each node can have 281 terabytes of distinct locations

347
Serial protocols: USB
• USB (Universal Serial Bus)
• Easier connection between PC and monitors, printers, digital speakers, modems,
scanners, digital cameras, joysticks, multimedia game equipment

• 2 data rates:
• 12 Mbps for increased bandwidth devices
• 1.5 Mbps for lower-speed devices (joysticks, game pads)

• Tiered star topology can be used


• One USB device (hub) connected to PC
• hub can be embedded in devices like monitor, printer, or keyboard or can be standalone
• Multiple USB devices can be connected to hub
• Up to 127 devices can be connected like this

• USB host controller


• Manages and controls bandwidth and driver software required by each peripheral
• Dynamically allocates power downstream according to devices connected/disconnected

348
Parallel protocols: PCI Bus
• PCI Bus (Peripheral Component Interconnect)

• High performance bus originated at Intel in the early 1990’s

• Standard adopted by industry and administered by PCISIG (PCI Special Interest Group)

• Interconnects chips, expansion boards, processor memory subsystems

• Data transfer rates of 127.2 to 508.6 Mbits/s and 32-bit addressing


• Later extended to 64-bit while maintaining compatibility with 32-bit schemes

• Synchronous bus architecture

• Multiplexed data/address lines

349
Parallel protocols: ARM Bus
• Designed and used internally by ARM Corporation

• Interfaces with ARM line of processors

• Many IC design companies have own bus protocol

• Data transfer rate is a function of clock speed


• If clock speed of bus is X, transfer rate = 16 x X bits/s

• 32-bit addressing

350
Wireless protocols: IrDA
• Protocol suite that supports short-range point-to-point infrared data
transmission

• Created and promoted by the Infrared Data Association (IrDA)

• Data transfer rates between 9.6 kbps and 4 Mbps

• IrDA hardware deployed in notebook computers, printers, PDAs, digital


cameras, public phones, cell phones

• Lack of suitable drivers has slowed use by applications

• Windows 2000/98 now include support

• Becoming available on popular embedded OS’s

351
Wireless protocols: Bluetooth
• New, global standard for wireless connectivity

• Based on low-cost, short-range radio link

• Connection established when within 10 meters of each other

• No line-of-sight required
• e.g., Connect to printer in another room

352
Wireless Protocols: IEEE 802.11
• IEEE 802.11

• Proposed standard for wireless LANs

• Specifies parameters for PHY and MAC layers of network


• PHY layer
• physical layer
• handles transmission of data between nodes
• provisions for data transfer rates of 1 or 2 Mbps
• operates in 2.4 to 2.4835 GHz frequency band (RF)
• or 300 to 428,000 GHz (IR)

• MAC (Media Access Control) layer


• medium access control layer
• protocol responsible for maintaining order in shared medium
• collision avoidance/detection

353
Self-Driving Car

354
Components
• Antiskid brakes
• Inflatable restraints
• Collision warning and avoidance
• Blind-zone vehicle detection (especially for large trucks)
• Infrared night vision systems
• Heads-up displays
• Automatic accident notification
• Rear-view cameras
355
Communications and entertainment
• AM/FM radio
• Digital audio broadcasting
• CD/DVD player
• Cellular phone
• Computer/e-mail
• Satellite radio
Convenience
• Electronic GPS navigation
• Personalized seat/mirror/radio settings
• Electronic door locks 356
Emissions, performance, and fuel economy
• Vehicle instrumentation
• Electronic ignition
• Tire inflation sensors
• Computerized performance evaluation and
maintenance scheduling
• Adaptable suspension systems
Alternative propulsion systems
• Electric vehicles
• Advanced batteries
• Hybrid vehicles 357
Sensors

358
Type of sensor
• 1-D Range Finders
Infrared linear distance sensor that can be used to make low-cost embedded system

• 2-D Range Finders


Sensors that can measure the distance on a 2-D plane, used for navigation

• 3-D Sensors
3D distance measurement such as Intel’s RealSense, Microsoft’s Kinect, ASUS’s Xtion

• Audio/Speech Recognition
Currently there are few voice-recognition-related parts, but more seem to be added continuously

• Cameras
• Camera driver used for object recognition, face recognition, character recognition, etc.
• Various application packages such as OpenCV

• Sensor Interfaces
• Very few sensors support USB and web protocols
• There are still many sensors that can acquire data from a microprocessor
• These sensors can be used with UART in MCU, or ROS in mini PC. 359
Sensor Interface
Physical Electrical

Physical ARM
Sensor Cortex-M4
World

360
Cyber Physical System

Components (Cyber)
• Microprocessor
• Communication system
• Sensors
• Actuators
D/A
A/D
Physics
• Environment condition

361
Analog to Digital (A/D) Conversion
Resolution  Number of steps  Step size (considering Vref as 5 V)
8-bit       256              5 V / 256 = 19.53 mV
10-bit      1024             5 V / 1024 = 4.88 mV
12-bit      4096             5 V / 4096 = 1.22 mV
16-bit      65,536           5 V / 65,536 = 0.076 mV
362
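
The step size in the table is Vref / 2^n, so an ADC code converts back to a voltage by multiplying with that step. The C sketch below uses integer microvolts to avoid floating point; the function name and parameters are illustrative, and `bits` is assumed to be less than 32.

```c
#include <stdint.h>

/* Convert an n-bit ADC code to microvolts: code * Vref / 2^n.
   The 64-bit intermediate avoids overflow; the result is truncated. */
uint32_t adc_code_to_microvolts(uint32_t code, unsigned bits, uint32_t vref_uv)
{
    return (uint32_t)(((uint64_t)code * vref_uv) >> bits);
}
```

With Vref = 5,000,000 microvolts, a code of 1 corresponds to one step: 19,531 microvolts at 8 bits and 1,220 microvolts at 12 bits.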


Flash ADC Converter
[Figure: a chain of eight equal resistors R divides the 5 V reference; seven comparators compare Vin (3.0 V in the figure) with the taps and produce Y6 ... Y0, which digital logic encodes into Dout]

Y6 Y5 Y4 Y3 Y2 Y1 Y0   Dout
1  1  1  1  1  1  1    111
0  1  1  1  1  1  1    110
0  0  1  1  1  1  1    101
0  0  0  1  1  1  1    100
0  0  0  0  1  1  1    011
0  0  0  0  0  1  1    010
0  0  0  0  0  0  1    001
0  0  0  0  0  0  0    000
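
The digital logic in a flash ADC turns the comparator outputs, a thermometer code, into a binary number. For a valid thermometer code that number is simply the count of comparators that output 1. The sketch below assumes Y0 is bit 0 and Y6 is bit 6 of the input word; the function name is illustrative.

```c
/* Encode seven comparator outputs (bit i = Yi) into a 3-bit value
   by counting how many comparators are high. */
unsigned flash_adc_encode(unsigned y)
{
    unsigned count = 0;
    int i;

    for (i = 0; i < 7; i++)
        count += (y >> i) & 1u;
    return count;
}
```

For example, Y6..Y0 = 0001111 encodes to 4, which is Dout = 100 in the table.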
363
Digital to Analog (D/A) Conversion

364
Digital to Analog (D/A) Conversion

365
Sensors

366
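Thin-Film Pressure Sensor

[Figure: conductor with radius r and cross-sectional area A]

Resistance: $R = \rho \frac{L}{A} = \rho \frac{L}{\pi r^2}$

ρ is the resistivity. For a rectangular shape, $A = kD^2$, so $R = \rho \frac{L}{kD^2}$.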

367
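Pressure Sensor Model
$dR = \frac{1}{kD^2}\left(\rho\,dL + L\,d\rho - 2\rho L\,\frac{dD}{D}\right)$

$\frac{dR/R}{dL/L} = 1 + \frac{d\rho/\rho}{dL/L} - 2\,\frac{dD/D}{dL/L}$

$\frac{dL}{L} = \varepsilon_x$: strain along the x axis

$\frac{dD}{D} = \varepsilon_y$: strain along the y axis

$\mu = -\frac{\varepsilon_y}{\varepsilon_x}$: Poisson ratio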
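368
$\frac{dR/R}{\varepsilon_x} = 1 + \frac{d\rho/\rho}{\varepsilon_x} - 2\,\frac{\varepsilon_y}{\varepsilon_x}$

For a small change in structure:

$\Delta R = G R \varepsilon_x$

where $G = 1 + 2\mu + \frac{d\rho/\rho}{\varepsilon_x}$ is the right-hand side above, rewritten with $\mu = -\varepsilon_y/\varepsilon_x$.

The sensor behaves as a variable resistance.

R is the resistance without pressure.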
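As a numeric illustration of ΔR = G R ε_x, the sketch below evaluates G = 1 + 2μ + (dρ/ρ)/ε_x, which is the right-hand side of the strain equation with μ = −ε_y/ε_x substituted. The function names and the example values (350 Ω, G = 2, 0.1% strain) are assumptions chosen for illustration, not values from the slides.

```python
def gauge_factor(poisson_ratio: float, resistivity_term: float = 0.0) -> float:
    """G = 1 + 2*mu + (d_rho/rho)/eps_x.

    resistivity_term is the (d_rho/rho)/eps_x contribution; it is taken
    as 0 here unless the caller supplies a value.
    """
    return 1.0 + 2.0 * poisson_ratio + resistivity_term


def resistance_change(g: float, r_unloaded: float, strain_x: float) -> float:
    """Delta R = G * R * eps_x for a small strain."""
    return g * r_unloaded * strain_x


if __name__ == "__main__":
    g = gauge_factor(poisson_ratio=0.3)        # geometric part only
    print("G =", g)
    print("dR =", resistance_change(2.0, 350.0, 1e-3), "ohm")
```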

369
Recommended Circuit

Variable Rs or ΔRs

370
Inertial Measurement Unit (IMU)

371
Inertial Measurement Unit (IMU)

• Accelerometer
• Gyro Sensor
• Magnetometer

372
• Accelerometer
• Gyro Sensor
• Magnetometer

373
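Capacitor

[Figure: parallel-plate capacitor with plate length L, plate width W, and plate separation d]

$C = \frac{\varepsilon A}{d}$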

374
Capacitor

[Figure: parallel-plate capacitor with plate separation x]

$C_x = \frac{\varepsilon A}{x}$

375
Accelerometer

ADXL345

376
MEMS Accelerometer

377
Two Capacitors Model

C1
C2

378
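Voltage Model

[Figure: C1 and C2 connected in series between +Vs and -Vs, with the output Vo taken at their common node]

KVL:
$V_o = -V_s + \frac{C_1}{C_1 + C_2}\,(2V_s)$

Output voltage:
$V_o = \frac{C_1 - C_2}{C_1 + C_2}\,V_s$

A differential capacitor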
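The output-voltage formula can be checked numerically. The sketch below assumes a proof mass displaced by x between two fixed plates, so that the gaps become d − x and d + x, and uses C = εA/gap for each side. Under that assumption the output reduces to Vo = (x/d) Vs, that is, linear in displacement. All names and dimensions are illustrative.

```python
EPS0 = 8.854e-12  # permittivity of free space in F/m


def plate_capacitance(area: float, gap: float, eps: float = EPS0) -> float:
    """Parallel-plate capacitance C = eps * A / gap."""
    return eps * area / gap


def differential_output(c1: float, c2: float, vs: float) -> float:
    """Vo = (C1 - C2) / (C1 + C2) * Vs for the +Vs / -Vs divider."""
    return (c1 - c2) / (c1 + c2) * vs


def output_for_displacement(x: float, d: float, area: float, vs: float) -> float:
    """Assumed geometry: displacement x narrows one gap and widens the other."""
    c1 = plate_capacitance(area, d - x)
    c2 = plate_capacitance(area, d + x)
    return differential_output(c1, c2, vs)


if __name__ == "__main__":
    # 1 um nominal gap, 0.1 um displacement, 3.3 V drive (illustrative values)
    print(output_for_displacement(0.1e-6, 1.0e-6, 1.0e-6, 3.3))
```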

379
Accelerometer Circuit (Analog Device)

380
Accelerometer Circuit (Analog Device)

381
iPhone Accelerometer Sensor
[Figure: device axes ax, ay, az with origin O]
a = (ax, ay, az)

az: Yaw, ay: Pitch, ax: Roll

Install “Physics Toolbox Sensor Suite” on your smartphone

382
• Accelerometer
• Gyro Sensor
• Magnetometer

383
Gyroscope
384
MEMS Gyroscopes

Image: ST Microelectronics
Draper Lab comb drive tuning fork gyroscope

385
MPU6050 Accelerometer and Gyroscope Sensor

386
• Accelerometer
• Gyro Sensor
• Magnetometer

387
Magnetometers

Magnetic compass

Image: Henrik Mouritsen


388
Magnetometer: Hall Effect Sensor

Geographical east

B or H: Magnetic field

Geographical down

389
Hall Sensor Circuit

390
Electronic Compass: GPS

391
iPhone Magnetometer Sensor
[Figure: magnetic field vector B in the axes xB, yB, zB]

B = B (cosθ, 0, sinθ)

Rotation matrix, R = Rz(yaw) Ry(pitch) Rx(roll)
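The composition R = Rz(yaw) Ry(pitch) Rx(roll) can be written out with elementary rotation matrices. The sketch below is one common convention (right-handed axes, counter-clockwise positive angles); the slide does not state its sign convention, so treat this as an assumption. The helper names are hypothetical.

```python
import math


def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """3x3 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotate(m, v):
    """Apply a 3x3 matrix to a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def rotation(yaw, pitch, roll):
    """R = Rz(yaw) * Ry(pitch) * Rx(roll)."""
    return matmul(rot_z(yaw), matmul(rot_y(pitch), rot_x(roll)))


def field_vector(magnitude, theta):
    """B = B (cos(theta), 0, sin(theta))."""
    return [magnitude * math.cos(theta), 0.0, magnitude * math.sin(theta)]


if __name__ == "__main__":
    b = field_vector(50.0, math.radians(60.0))   # illustrative magnitude and angle
    print(rotate(rotation(math.radians(30.0), 0.0, 0.0), b))
```

A rotation never changes the length of B, so the magnitude of the rotated vector is a quick sanity check on any implementation.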

392
Image Sensor (Camera)

393
Optical Line Camera and Line Following
• A vision system is a key component in any autonomous car
• An optical camera projects an image onto a surface composed of light-sensitive pixels

• Charge-Coupled Device (CCD) image sensor:
• An array of light-sensitive pixels fabricated on a silicon chip, used to detect projected images
• A 2-D array is an essential component in many digital cameras

• Line or edge following can be implemented using a 1-D CCD array and a simplified algorithm

Images: Wikipedia
394
Source: ARM educational materials
Optical Line Camera and Line Following
Recommended line camera: TAOS TSL1401CL
• 128 x 1 linear optical sensor array
• 3 – 5 V Vdd power supply

Datasheet: TSL1401CL
395
Self-Driving Car

[Figure: self-driving car components]
• Lidar sensor
• Microcontroller board (algorithm)
• Line/track sensors
• Velocity sensor
• Servo motors
• Motors
• Power supply

396
Power Management Circuit

397
Power Supply for Autonomous Cars
[Figure: a 7.2 V battery with series resistance Rb drives the motor M directly and feeds three separate 5 V power supplies, one each for the servo, the MCU, and the sensors]

398
Power Supply 1: Linear Voltage Regulator
e.g. LM2940CT-5.0/NOPB
• 5V output
• 0V to 26V input
• 1A max output
• 500mV dropout

[Figure: linear voltage regulator block between the input voltage and the output voltage (e.g. 5 V)]
399
Power Supply II: Boost Converter

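[Figure: boost converter with input Vi, inductor L (current iL, voltage VL), switch sw, diode, output capacitor C, load RL, and output voltage Vo]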

400
Entire Power Supply System
Two-component stable power supply:
• Boost converter to ensure that the input voltage to the linear regulator is always > 5 V + Vdropout, e.g. 12 V
• Linear regulator to provide a stable 5 V supply
[Figure: 7.2 V battery (series resistance Rb) → DC-DC boost converter (12 V) → linear voltage regulator (5 V) → Vout]
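The headroom requirement Vin > 5 V + Vdropout can be checked with a few lines of Python. The sketch below uses the 500 mV dropout and 1 A limit quoted for the regulator. The duty-cycle relation Vo = Vi / (1 − D) is the textbook ideal (lossless, continuous-conduction) boost formula and is not taken from the slides. Function names are illustrative.

```python
def regulator_in_regulation(v_in: float, v_out: float = 5.0,
                            v_dropout: float = 0.5) -> bool:
    """True if the linear regulator has enough input headroom."""
    return v_in > v_out + v_dropout


def regulator_dissipation(v_in: float, v_out: float, i_load: float) -> float:
    """Power burned in a linear regulator: (Vin - Vout) * Iload, in watts."""
    return (v_in - v_out) * i_load


def ideal_boost_duty(v_in: float, v_out: float) -> float:
    """Duty cycle of an ideal boost converter, from Vo = Vi / (1 - D)."""
    if v_out < v_in:
        raise ValueError("a boost converter cannot step the voltage down")
    return 1.0 - v_in / v_out


if __name__ == "__main__":
    print("12 V input in regulation:", regulator_in_regulation(12.0))
    print("Duty cycle for 7.2 V -> 12 V:", ideal_boost_duty(7.2, 12.0))
    print("Loss at 1 A:", regulator_dissipation(12.0, 5.0, 1.0), "W")
```

The dissipation figure shows the trade-off in this arrangement: a large boosted voltage guarantees regulation but is turned into heat in the linear stage.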

401
Electronic Components

Optocoupler Power MOSFET • SIEMENS


• TOSHIBA UMOS

402
IR Sensor

403
Driver Circuit

404
Motors

405
406
Motor: Electrical Equivalent Circuit

[Figure: DC motor symbol M with terminal voltage Vm and motor current im]

407
Motor: Electrical Equivalent Circuit

[Figure: battery Vb connected through a switch to the motor M, with terminal voltage Vm]

408
Motor Controllers

MCU
Power MOSFET
switch

+
+ Vb M Vm
Vb M Vm -
-

409
Motor Controllers:
Power MOSFETS as Switches

Vbat +
-

MCU

410
Motor Controllers:
Power MOSFETS as Switches

MOSFET Vbat +

Driver -
MCU

411
H-Bridge Motor Driver

The H-bridge motor driver is designed using HiSIM-HV and Diode-CMC models with a PWM controller.

412
Function of H-Bridge Motor Driver

413
Function of H-Bridge Motor Driver

414
Servo Speed

servo

415
Servo Damping

416
PWM Control Method

417
Servo Motor Equivalent Circuit Topology

418
Motor Model

R : Resistance
L : Inductance
J : Inertia
D : Rotational friction
τ : Motor torque
ω : Angular velocity
I : Motor current
Km : Model parameter
Kt : Model parameter

419
Control Systems

420
Control System
• Control physical system’s output
• By setting physical system’s input
• Tracking
• E.g.
• Cruise control
• Thermostat control
• Disk drive control
• Aircraft altitude control
• Difficulty due to
• Disturbance: wind, road, tire, brake; opening/closing door…
• Human interface: feel good, feel right…

421
Tracking

422
Open-Loop Control Systems
• Plant
• Physical system to be controlled
• Car, plane, disk, heater,…
• Actuator
• Device to control the plant
• Throttle, wing flap, disk motor,…
• Controller
• Designed product to control the plant

423
Open-Loop Control Systems
• Output
• The aspect of the physical system we are interested in
• Speed, disk location, temperature
• Reference
• The value we want to see at output
• Desired speed, desired location, desired temperature

• Disturbance
• Uncontrollable input to the plant imposed by environment
• Wind, bumping the disk drive, door opening

424
Other Characteristics of Open-Loop Control
• Feed-forward control
• Delay in actual change of the output
• Controller doesn’t know how well things are going
• Simple
• Best used for predictable systems

425
Closed-Loop Control Systems
• Sensor
• Measures the plant output
• Error detector
• Detects the tracking error
• Feedback control systems
• Minimize tracking error

426
Designing Open Loop Control System
• Develop a model of the plant
• Develop a controller
• Analyze the controller
• Consider Disturbance
• Determine Performance
• Example: Open Loop Cruise Control System

427
Model of the Plant
• May not be necessary
• Can be done through experimenting and tuning
• But,
• Can make it easier to design
• May be useful for deriving the controller

• Example: throttle that goes from 0 to 45 degrees
• On a flat surface at 50 mph, open the throttle to 40 degrees
• Wait 1 “time unit”
• Measure the speed, let’s say 55 mph
• Then the following equation satisfies the above scenario
• $v_{t+1} = 0.7 v_t + 0.5 u_t$
• $55 = 0.7 \times 50 + 0.5 \times 40$
• If the equation holds for all other scenarios
• Then we have a model of the plant
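The plant model is a one-line difference equation, so it is easy to check against the measurement and to iterate. The following Python sketch does both; the function names `plant_step` and `simulate_plant` are illustrative.

```python
def plant_step(v: float, u: float) -> float:
    """One time unit of the plant model: v[t+1] = 0.7*v[t] + 0.5*u[t]."""
    return 0.7 * v + 0.5 * u


def simulate_plant(v0: float, throttle: float, steps: int):
    """Speed trajectory for a constant throttle angle (degrees)."""
    speeds = [v0]
    for _ in range(steps):
        speeds.append(plant_step(speeds[-1], throttle))
    return speeds


if __name__ == "__main__":
    # Reproduces the measured scenario: 50 mph, throttle at 40 degrees.
    print(plant_step(50.0, 40.0))
    print(simulate_plant(50.0, 40.0, 5))
```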

428
Designing the Controller
• Assuming we want to use a simple linear function
• $u_t = F(r_t) = P \cdot r_t$
• $r_t$ is the desired speed

• Linear proportional controller
• $v_{t+1} = 0.7 v_t + 0.5 u_t = 0.7 v_t + 0.5 P r_t$
• Let $v_{t+1} = v_t = v_{ss}$ at steady state
• $v_{ss} = 0.7 v_{ss} + 0.5 P r_t$
• At steady state, we want $v_{ss} = r_t$
• $P = 0.6$
• i.e. $u_t = 0.6 r_t$

429
Analyzing the Controller
• Let $v_0 = 20$ mph, $r_0 = 50$ mph

• $v_{t+1} = 0.7 v_t + 0.5(0.6) r_t = 0.7 v_t + 0.3 \times 50 = 0.7 v_t + 15$

• Throttle position is $u_t = 0.6 \times 50 = 30$ degrees

430
Considering the Disturbance
• Assume road grade can affect the speed
• Disturbance $w$ from -5 mph to +5 mph
• $w = +5$: $v_{t+1} = 0.7 v_t + 10$
• $w = -5$: $v_{t+1} = 0.7 v_t + 20$
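To see how much a constant disturbance shifts the open-loop result, the model can simply be iterated until it settles. The sketch below assumes the disturbance enters as v[t+1] = 0.7 v[t] + 0.5 u[t] − w, which matches the two equations above for w = +5 and w = −5. The function name is hypothetical.

```python
def simulate_open_loop(v0: float = 20.0, r: float = 50.0, p: float = 0.6,
                       w: float = 0.0, steps: int = 60) -> float:
    """Final speed of the open-loop cruise controller u = P * r.

    The throttle is fixed by the reference alone, so the controller
    cannot react to the disturbance w.
    """
    u = p * r
    v = v0
    for _ in range(steps):
        v = 0.7 * v + 0.5 * u - w
    return v


if __name__ == "__main__":
    for w in (0.0, 5.0, -5.0):
        print(f"w = {w:+.0f} mph -> steady-state speed {simulate_open_loop(w=w):.2f} mph")
```

Because the open-loop coefficient is 0.7, a disturbance of 5 mph per time unit is amplified to 5 / 0.3, roughly 16.7 mph of steady-state error.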

431
Determining Performance
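• $v_{t+1} = 0.7 v_t + 0.5 P r_0 - w_0$
• $v_1 = 0.7 v_0 + 0.5 P r_0 - w_0$
• $v_2 = 0.7(0.7 v_0 + 0.5 P r_0 - w_0) + 0.5 P r_0 - w_0 = 0.7^2 v_0 + (0.7 + 1.0)(0.5 P r_0) - (0.7 + 1.0) w_0$
• $v_t = 0.7^t v_0 + (0.7^{t-1} + 0.7^{t-2} + \dots + 0.7 + 1.0)(0.5 P r_0 - w_0)$
• The coefficient of $v_t$ (here 0.7) determines the rate of decay of the $v_0$ term
• If it is > 1 or < -1, $v_t$ will grow without bound
• If it is < 0, $v_t$ will oscillate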

432
Designing Closed-Loop Control System

433
Stability
• $u_t = P (r_t - v_t)$
• $v_{t+1} = 0.7 v_t + 0.5 u_t - w_t = 0.7 v_t + 0.5 P (r_t - v_t) - w_t = (0.7 - 0.5P) v_t + 0.5 P r_t - w_t$
• $v_t = (0.7 - 0.5P)^t v_0 + ((0.7 - 0.5P)^{t-1} + (0.7 - 0.5P)^{t-2} + \dots + (0.7 - 0.5P) + 1.0)(0.5 P r_0 - w_0)$

• Stability constraint (i.e. convergence) requires
$|0.7 - 0.5P| < 1$
$-1 < 0.7 - 0.5P < 1$
$-0.6 < P < 3.4$
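The stability condition depends only on the coefficient 0.7 − 0.5P, so a candidate gain can be screened with a small helper. The sketch below also flags a negative coefficient, which makes the response alternate in sign around its final value. Names and the wording of the labels are illustrative.

```python
def closed_loop_coefficient(p: float) -> float:
    """Coefficient of v[t] in the closed-loop model: 0.7 - 0.5*P."""
    return 0.7 - 0.5 * p


def classify_gain(p: float) -> str:
    """Classify a proportional gain for the cruise-control example."""
    a = closed_loop_coefficient(p)
    if abs(a) >= 1.0:
        return "not convergent"
    if a < 0.0:
        return "stable, oscillating"
    return "stable, no oscillation"


if __name__ == "__main__":
    for p in (0.6, 1.0, 3.3, 3.5):
        print(f"P = {p}: coefficient {closed_loop_coefficient(p):+.2f}, {classify_gain(p)}")
```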

434
Reducing Effect of v0
• $u_t = P (r_t - v_t)$
• $v_{t+1} = 0.7 v_t + 0.5 u_t - w_t = 0.7 v_t + 0.5 P (r_t - v_t) - w_t = (0.7 - 0.5P) v_t + 0.5 P r_t - w_t$
• $v_t = (0.7 - 0.5P)^t v_0 + ((0.7 - 0.5P)^{t-1} + (0.7 - 0.5P)^{t-2} + \dots + (0.7 - 0.5P) + 1.0)(0.5 P r_0 - w_0)$

• To reduce the effect of the initial condition
• Make $|0.7 - 0.5P|$ as small as possible
• $P = 1.4$

435
Avoid Oscillation
• $u_t = P (r_t - v_t)$
• $v_{t+1} = 0.7 v_t + 0.5 u_t - w_t = 0.7 v_t + 0.5 P (r_t - v_t) - w_t = (0.7 - 0.5P) v_t + 0.5 P r_t - w_t$
• $v_t = (0.7 - 0.5P)^t v_0 + ((0.7 - 0.5P)^{t-1} + (0.7 - 0.5P)^{t-2} + \dots + (0.7 - 0.5P) + 1.0)(0.5 P r_0 - w_0)$

• To avoid oscillation
• $0.7 - 0.5P \ge 0$
• $P \le 1.4$

436
Perfect Tracking
• $u_t = P (r_t - v_t)$
• $v_{t+1} = 0.7 v_t + 0.5 u_t - w_t = 0.7 v_t + 0.5 P (r_t - v_t) - w_t = (0.7 - 0.5P) v_t + 0.5 P r_t - w_t$
• $v_{ss} = (0.7 - 0.5P) v_{ss} + 0.5 P r_0 - w_0$
$(1 - 0.7 + 0.5P) v_{ss} = 0.5 P r_0 - w_0$
$v_{ss} = \frac{0.5P}{0.3 + 0.5P} r_0 - \frac{1.0}{0.3 + 0.5P} w_0$

• To make $v_{ss}$ as close to $r_0$ as possible
• P should be as large as possible
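Evaluating the steady-state formula for a few gains shows why a large P tracks better and rejects disturbance better. This is a minimal sketch of the closed-form expression above; the function name `steady_state_speed` is illustrative.

```python
def steady_state_speed(p: float, r0: float, w0: float = 0.0) -> float:
    """Closed-loop steady state: v_ss = (0.5*P*r0 - w0) / (0.3 + 0.5*P)."""
    return (0.5 * p * r0 - w0) / (0.3 + 0.5 * p)


if __name__ == "__main__":
    for p in (1.0, 1.4, 3.3):
        nominal = steady_state_speed(p, 50.0)
        uphill = steady_state_speed(p, 50.0, w0=5.0)
        print(f"P = {p}: v_ss = {nominal:.2f} mph, with w = +5: {uphill:.2f} mph")
```

Even at P = 3.3 the speed settles below the 50 mph reference, because a pure proportional controller needs a nonzero error to hold the throttle open.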

437
Closed-Loop Design
• $u_t = P (r_t - v_t)$

• Finally, setting $P = 3.3$
• Stable, tracks well, some oscillation
• $u_t = 3.3 (r_t - v_t)$

438
Analyze the controller

• $v_0 = 20$ mph, $r_0 = 50$ mph, $w = 0$
• $v_{t+1} = 0.7 v_t + 0.5 P (r_t - v_t) - w = 0.7 v_t + 0.5 \times 3.3 \times (50 - v_t)$
• $u_t = P (r_t - v_t) = 3.3 \times (50 - v_t)$

• But $u_t$ ranges only from 0 to 45
• Controller saturates

439
Analyze the controller

• $v_0 = 20$ mph, $r_0 = 50$ mph, $w = 0$

• $v_{t+1} = 0.7 v_t + 0.5 u_t$

• $u_t = 3.3 \times (50 - v_t)$
• Saturates at 0 and 45

• Oscillation!
• “feel bad”
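The saturating controller is easy to simulate: compute the proportional command, clamp it to the 0 to 45 degree throttle range, and apply it to the plant. The sketch below prints the first few steps so the oscillation is visible; function names are illustrative.

```python
def clamp(x: float, lo: float, hi: float) -> float:
    """Limit x to the interval [lo, hi]."""
    return max(lo, min(hi, x))


def simulate_closed_loop(p: float = 3.3, v0: float = 20.0, r: float = 50.0,
                         w: float = 0.0, u_min: float = 0.0,
                         u_max: float = 45.0, steps: int = 40):
    """Closed-loop cruise control with a throttle limited to [u_min, u_max]."""
    v = v0
    speeds, throttles = [v], []
    for _ in range(steps):
        u = clamp(p * (r - v), u_min, u_max)   # actuator saturation
        v = 0.7 * v + 0.5 * u - w              # plant model
        throttles.append(u)
        speeds.append(v)
    return speeds, throttles


if __name__ == "__main__":
    speeds, throttles = simulate_closed_loop()
    for t in range(6):
        print(f"t={t}: v = {speeds[t]:.2f} mph, u = {throttles[t]:.2f} deg")
```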

440
Analyze the controller

• Set P=1.0 to avoid oscillation
• Terrible steady-state (SS) performance

441
Analyzing the Controller

442
Minimize the effect of disturbance

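• $v_{t+1} = 0.7 v_t + 0.5 \times 3.3 (r_t - v_t) - w$
• $w = -5$ or $+5$

• Steady-state speed of 39.74 mph (with $w = +5$)
• Close to the undisturbed 42.31 mph
• Better than the open-loop values
• 33
• 66
• Cost
• Steady-state error
• Oscillation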

443
General Control System
• Objective
• Causing output to track a reference even in the presence of
• Measurement noise
• Model error
• Disturbances

• Metrics
• Stability
• Output remains bounded
• Performance
• How well an output tracks the reference
• Disturbance rejection
• Robustness
• Ability to tolerate modeling error of the plant

444
Performance
• Rise time
• Time it takes to go from 10% to 90% of the final value

• Peak time

• Overshoot
• Percentage by which the peak exceeds the final value

• Settling time
• Time it takes to settle within 1% of the final value

445
Plant Modeling
• May need to be done first

• Plant is usually in continuous time
• Not discrete time
• E.g. car speed continuously reacts to throttle position, not at discrete intervals
• Sampling period must be chosen carefully
• To make sure “nothing interesting” happens in between
• I.e. small enough

• Plant is usually non-linear
• E.g. shock absorber response may need an 8th-order differential equation

• Iterative development of the plant model and controller
• Have a plant model that is “good enough”
446
Controller Design: P
• Proportional controller
• A controller that multiplies the tracking error by a constant
• $u_t = P (r_t - v_t)$
• Closed-loop model with a linear plant
• E.g. $v_{t+1} = (0.7 - 0.5P) v_t + 0.5 P r_t - w_t$
• P affects
• Transient response
• Stability, oscillation
• Steady-state tracking
• As large as possible
• Disturbance rejection
• As large as possible

447
Controller Design: PD
• Proportional and Derivative control
• $u_t = P (r_t - v_t) + D ((r_t - v_t) - (r_{t-1} - v_{t-1})) = P e_t + D (e_t - e_{t-1})$

• Considers how the error changes over time

• Intuitively
• Want to “push” more if the error is not reducing fast enough
• Want to “push” less if the error is reducing really fast

448
PD Controller
• Need to keep track of the error derivative

• E.g. cruise controller example
• $v_{t+1} = 0.7 v_t + 0.5 u_t - w_t$
• Let $u_t = P e_t + D (e_t - e_{t-1})$, $e_t = r_t - v_t$
• $v_{t+1} = 0.7 v_t + 0.5 (P (r_t - v_t) + D ((r_t - v_t) - (r_{t-1} - v_{t-1}))) - w_t$
• $v_{t+1} = (0.7 - 0.5 (P + D)) v_t + 0.5 D v_{t-1} + 0.5 (P + D) r_t - 0.5 D r_{t-1} - w_t$
• Assuming the reference input and disturbance are constant, the steady-state speed is
• $v_{ss} = \frac{0.5P}{1 - 0.7 + 0.5P} r - \frac{1.0}{1 - 0.7 + 0.5P} w$
• Does not depend on D

• P can be set for best tracking and disturbance control

• Then D is set to control oscillation/overshoot/rate of convergence
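The claim that D does not affect the steady state can be checked by simulating the cruise-control plant with a PD controller. The sketch below ignores actuator saturation and initializes the previous error to the first error so that the derivative term starts at zero; both are simplifying assumptions. The gains P = 1.0 and D = 0.3 are illustrative values chosen to keep this particular model stable.

```python
def simulate_pd(p: float, d: float, v0: float = 20.0, r: float = 50.0,
                w: float = 0.0, steps: int = 200):
    """Speed trajectory under u = P*e[t] + D*(e[t] - e[t-1])."""
    v = v0
    e_prev = r - v0          # assumption: no derivative contribution at t = 0
    speeds = [v]
    for _ in range(steps):
        e = r - v
        u = p * e + d * (e - e_prev)
        e_prev = e
        v = 0.7 * v + 0.5 * u - w   # plant model, saturation ignored
        speeds.append(v)
    return speeds


if __name__ == "__main__":
    for d in (0.0, 0.3):
        speeds = simulate_pd(1.0, d)
        print(f"D = {d}: first steps {[round(s, 2) for s in speeds[:4]]}, "
              f"final {speeds[-1]:.2f} mph")
```

Both runs settle at the same speed, while the early samples differ: D shapes the transient only.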


449
PD Control Example

450
PI Control
• Proportional plus integral control
• $u_t = P e_t + I (e_0 + e_1 + \dots + e_t)$
• Sums up the error over time
• Ensures reaching the desired output, eventually
• $v_{ss}$ will not be reached until $e_{ss} = 0$
• Use P to control disturbance
• Use I to ensure steady-state convergence and convergence rate

451
PID Controller
• Combines proportional, integral, and derivative control
• $u_t = P e_t + I (e_0 + e_1 + \dots + e_t) + D (e_t - e_{t-1})$

• Available off-the-shelf

452
Software Coding
• Main function loops forever, during each iteration
• Read plant output sensor
• May require A2D
• Read current desired reference input
• Call PidUpdate, to determine actuator value
• Set actuator value
• May require D2A

453
Software Coding (continued)

• Pgain, Dgain, Igain are constants


• sensor_value_previous
• For D control
• error_sum
• For I control

454
Computation

• $u_t = P e_t + I (e_0 + e_1 + \dots + e_t) + D (e_t - e_{t-1})$
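A possible shape for the PidUpdate routine and its main loop is sketched below in Python. It keeps the state named on the slides (`error_sum` for I control, `sensor_value_previous` for D control); the class name `PidData`, the `run` helper, and the gain values are hypothetical. The D term uses the change in the sensor value, which equals −(e_t − e_{t−1}) only when the reference is constant, and the plant call stands in for the A2D read and D2A write of a real system.

```python
from dataclasses import dataclass


@dataclass
class PidData:
    p_gain: float
    i_gain: float
    d_gain: float
    sensor_value_previous: float = 0.0   # state for D control
    error_sum: float = 0.0               # state for I control


def pid_update(pid: PidData, sensor_value: float, reference_value: float) -> float:
    """Return the actuator value for one control iteration."""
    error = reference_value - sensor_value
    pid.error_sum += error
    p_term = pid.p_gain * error
    i_term = pid.i_gain * pid.error_sum
    # With a constant reference, e[t] - e[t-1] = sensor[t-1] - sensor[t].
    d_term = pid.d_gain * (pid.sensor_value_previous - sensor_value)
    pid.sensor_value_previous = sensor_value
    return p_term + i_term + d_term


def run(pid: PidData, v0: float = 20.0, r: float = 50.0,
        w: float = 0.0, steps: int = 300) -> float:
    """Main loop against the cruise-control plant model (no saturation)."""
    v = v0
    pid.sensor_value_previous = v0
    for _ in range(steps):
        u = pid_update(pid, v, r)        # read sensor and reference, compute u
        v = 0.7 * v + 0.5 * u - w        # set actuator; plant responds
    return v


if __name__ == "__main__":
    print(run(PidData(p_gain=1.0, i_gain=0.2, d_gain=0.0), w=5.0))
```

With a nonzero I gain the speed converges to the reference even under a constant disturbance, which the P-only controller could not achieve.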

455
PID Tuning

• Analytically deriving P, I, D may not be possible
• E.g. a plant model is not available, or too costly to obtain

• Ad hoc method for getting “reasonable” P, I, D
• Start with a small P, I=D=0
• Increase D, until seeing oscillation
• Reduce D a bit
• Increase P, until seeing oscillation
• Reduce P a bit
• Increase I, until seeing oscillation

• Iterate until nothing can be changed without excessive oscillation

456
Practical Issues with Computer-Based Control

• Quantization

• Overflow

• Aliasing

• Computation Delay

457
Quantization & Overflow
• Quantization
• Can’t store 0.36 as a 4-bit fractional number
• Can only store 0.75, 0.50, 0.25, 0.00, -0.25, -0.50, -0.75, -1.00
• Choose 0.25
• Results in a quantization error of 0.11

• Sources of quantization error
• Operations, e.g. 0.50*0.25=0.125
• Can use more bits until input/output to the environment/memory
• A2D converters

• Overflow
• Can’t store 0.75+0.50 = 1.25 as a 4-bit fractional number

• Solutions:
• Use fixed-point representation/operations carefully
• Time-consuming
• Use a floating-point co-processor
• Costly
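The quantization and overflow examples can be reproduced with a small fixed-point model. The sketch below uses the value grid listed on the slide (steps of 0.25 from −1.00 to 0.75) and rounds to the nearest representable value. Saturating on overflow is one possible policy, shown here as an assumption; real hardware may wrap around instead. The names are illustrative.

```python
STEP = 0.25     # spacing of the representable values listed on the slide
LOWEST = -1.00
HIGHEST = 0.75


def quantize(x: float) -> float:
    """Round x to the nearest multiple of STEP (no range check)."""
    return round(x / STEP) * STEP


def is_representable(x: float) -> bool:
    """True if x lies on the grid and inside [LOWEST, HIGHEST]."""
    return LOWEST <= x <= HIGHEST and quantize(x) == x


def saturating_add(a: float, b: float) -> float:
    """Add two grid values, clamping the result on overflow (assumed policy)."""
    return min(HIGHEST, max(LOWEST, a + b))


if __name__ == "__main__":
    q = quantize(0.36)
    print("0.36 is stored as", q, "with error", round(0.36 - q, 2))
    print("0.50 * 0.25 representable:", is_representable(0.50 * 0.25))
    print("0.75 + 0.50 saturates to", saturating_add(0.75, 0.50))
```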

458
Aliasing
• Quantization/overflow
• Due to discrete nature of computer data

• Aliasing
• Due to discrete nature of sampling

459
Aliasing Example
• Sampling at 2.5 Hz (period of 0.4 s), the following are indistinguishable
• y(t)=1.0*sin(6πt), frequency of 3 Hz
• y(t)=1.0*sin(πt), frequency of 0.5 Hz

• In fact, with a sampling frequency of 2.5 Hz
• Can only correctly sample signals below the Nyquist frequency 2.5/2 = 1.25 Hz
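The aliasing example can be verified by sampling both sine waves at 2.5 Hz and comparing the samples. At t = 0.4n the 3 Hz signal has phase 2.4πn, which differs from the 0.5 Hz phase 0.4πn by a whole number of turns (2πn), so the samples coincide. The sketch below is a minimal check; the function names are illustrative.

```python
import math

SAMPLE_RATE_HZ = 2.5


def sample(freq_hz: float, n_samples: int, rate_hz: float = SAMPLE_RATE_HZ):
    """Samples of y(t) = sin(2*pi*f*t) taken at t = n / rate."""
    return [math.sin(2.0 * math.pi * freq_hz * n / rate_hz)
            for n in range(n_samples)]


def nyquist_frequency(rate_hz: float) -> float:
    """Highest frequency that can be sampled without aliasing."""
    return rate_hz / 2.0


if __name__ == "__main__":
    fast = sample(3.0, 10)    # sin(6*pi*t)
    slow = sample(0.5, 10)    # sin(pi*t)
    worst = max(abs(a - b) for a, b in zip(fast, slow))
    print("largest sample difference:", worst)
    print("Nyquist frequency:", nyquist_frequency(SAMPLE_RATE_HZ), "Hz")
```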

460
Computation Delay
• Inherent delay in processing
• Actuation occurs later than expected

• Need to characterize implementation delay to make sure it is negligible

• Hardware delay is usually easy to characterize
• Synchronous design

• Software delay is harder to predict
• Should organize code carefully, so delay is predictable and minimized
• Write software with predictable timing behavior (be like hardware)
• Time-triggered architecture
• Synchronous software language
461
Benefit of Computer Control
• Cost:
• Expensive to make analog control immune to
• Age, temperature, manufacturing error

• Computer control replaces complex analog hardware with complex code

• Programmability:
• Computer control can be “upgraded”
• Changes in control mode and gain are easy to make

• Computer control can adapt to changes in the plant
• Due to age, temperature, etc.

• “Future-proof”
• Easily adapts to changes in standards, etc.
462
Summary of EL203-EHD

463
Contents
• Introduction to EHD
• Embedded system and microcontroller
• Microcontroller based on ARM Processor, history of ARM
• ARM assembly language, number systems, bits-to-commands
• ARM design philosophy, RISC architecture
• Memory management
• ARM co-processor
• ARM programmer’s model, data type, processor model
• Assembler rules and directives
• Load-store instructions, addressing, and memory demarcations
• Arithmetic and logic instructions, shift and rotation, addition and subtraction, multiplication and
division, saturated math, DSP instructions
• ARM pipeline, branching, loop
• Loops, IT blocks, lookup tables, the stack
• Floating point arithmetic and non-arithmetic Instructions
• Finite state machine (FSM) and application in car control
• System-tick-timer, general-purpose-timer, interrupts
• Interfacing, GPIO, Port, I2C, PCI, CAN, Bluetooth, FireWire, UART
• Car components, analog-to-digital converter (ADC), digital-to-analog converter (DAC)
• Sensors: pressure sensor , accelerometer, gyro-sensor, IMU, GPS, camera
• Power management circuit, voltage regulator, boost converter, driver circuit, DC motor, PWM control
• Controller, PID controller
• Summary of EHD
464
Microcontroller
ICode System Bus

ROM Processor RAM

PortA PortC
PortB PortD
PortE PortF
465
Embedded System
Industry
Medical
Cars

Military
Phone Computer
Space
Consumer House
Electronics
466
Self-Driving Car

[Figure: self-driving car components]
• Lidar sensor
• Microcontroller board (algorithm)
• Line/track sensors
• Velocity sensor
• Servo motors
• Motors
• Power supply

467
