Network on Chip
Presenter: Bilal Ahmed
Date: 04 Feb, 2025
Table of Contents
• Motivation & Background
• What are NOCs?
• Evolution of Communication Systems Architecture
• Where to find NOCs
• Why NOCs?
• Generic NoC Architecture
• NOC Topologies
• Routing Algorithms
• Switching Schemes
• Flow Control
• NoCs at different levels
Motivation
Intel® Pentium® 4 Processor vs. Intel® Core™ i9 processor 14900K
                                            Pentium® 4      Core™ i9 14900K
Launched                                    Q3'02           Q3'23
Total Cores                                 1               24 (8 Performance-cores + 16 Efficient-cores)
Processor Base Frequency                    2.80 GHz        3.2 GHz (Performance-core), 2.4 GHz (Efficient-core)
Turbo Boost Max Technology 3.0 Frequency    -               5.8 GHz
Cache                                       512 KB L2       32 MB total L2
Bus Speed                                   533 MHz         -
FSB Parity                                  No              -
Lithography                                 130 nm          Intel 7 (10 nm class)
TDP                                         68.4 W          125 W
Maximum Turbo Power                         -               253 W
Background
• Shift from single-processor to many-core processors
• Transistor counts keep increasing (smaller process nodes)
Processor performance is hindered by several walls:
• Frequency wall (clock frequency has not increased significantly since 2010)
• Power wall (power density, in mW/mm², is limited; in fact it is decreasing)
• ILP wall (performance per cycle)
• Memory wall (rate of memory requests vs. memory access time)
What are NOCs?
• A packet-switched on-chip communication network.
• An attempt to scale down the concepts of large-scale networks, and
apply them to the embedded system-on-chip (SoC) domain.
• A chip-wide network
• Processing Elements (PEs) inter-connected via a packet-based network.
• Many of today’s systems-on-chip are too complex to utilize a traditional
hierarchical bus or crossbar interconnect approach. Yesterday’s village
traffic has turned into today’s congested freeways.
NoC Architecture
Network interface: Connects endpoints (cores) to the network. Decouples computation from communication.
Switch/Router: Connects a fixed number of input channels to a fixed number of output channels.
Channel: A single logical connection between routers/switches.
Node: A network endpoint connected to a router/switch.
Message: Unit of transfer for clients (e.g., cores, memory).
Packet: Unit of transfer for the network.
Flit: Flow control digit, the unit on which flow control operates within the on-chip network.
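The message, packet, and flit hierarchy above can be sketched in a few lines of Python. This is only an illustration: the names `Flit`, `packetize`, and `depacketize`, the 16-byte packet and 4-byte flit sizes, and the head/body/tail flit labels are assumptions made for the example, not part of any specific NoC.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Flit:
    kind: str        # "head", "body", "tail", or "head_tail" (assumed labels)
    packet_id: int
    payload: bytes
    dest: int = -1   # destination node; carried only by the first flit


def packetize(message: bytes, dest: int,
              packet_bytes: int = 16, flit_bytes: int = 4) -> List[List[Flit]]:
    """Split a client message into packets, and each packet into flits."""
    packets = []
    for pid, start in enumerate(range(0, len(message), packet_bytes)):
        chunk = message[start:start + packet_bytes]
        pieces = [chunk[i:i + flit_bytes] for i in range(0, len(chunk), flit_bytes)]
        flits = []
        for i, piece in enumerate(pieces):
            if len(pieces) == 1:
                kind = "head_tail"
            elif i == 0:
                kind = "head"
            elif i == len(pieces) - 1:
                kind = "tail"
            else:
                kind = "body"
            # Only the first flit of a packet carries the routing information.
            flits.append(Flit(kind, pid, piece, dest if i == 0 else -1))
        packets.append(flits)
    return packets


def depacketize(packets: List[List[Flit]]) -> bytes:
    """Reassemble the original message at the receiving network interface."""
    return b"".join(f.payload for flits in packets for f in flits)
```

With these assumed sizes, a 40-byte message becomes three packets (16, 16, and 8 bytes), and `depacketize(packetize(msg, dest=5))` returns the original message.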
A NoC-Based Modern SoC
NoC Evolution
[Figure: timeline of on-chip communication architectures, 1990 to 2010: Custom, Shared Bus, Hierarchical Bus, Bus Matrix, NoC]
Pasricha, Sudeep, and Nikil Dutt. On-chip communication architectures: system on chip interconnect. Morgan Kaufmann, 2010.
Efficient Communication System/Network
• Wires scale more slowly than transistors/gates
• The communication architecture can consume up to 50% of total on-chip power.
• Communication, not computation, is now the bottleneck
• Memory and interconnect are the bottlenecks
• Reduce manufacturing difficulties
• Transfer delay
• Wiring issues
• Power consumption
• Reliability
Buses VS NoCs
• Architectural paradigm shift: Replace wire spaghetti by network
• Usage paradigm shift: Pack everything in packets
• Organizational paradigm shift
• Confiscate communications from logic designers
Buses VS NoCs
Why NoC?
▪ Regular geometry that is
scalable
▪ Flexible QoS guarantees
▪ Higher bandwidth
▪ Reusable components
• Buffers, arbiters, routers
▪ No long global wires (or global
clock tree)
• No problematic global
synchronization
▪ Reliable and predictable
electrical and physical properties
Why NoCs?
FlexNoC by Arteris
• Physically aware IP.
• Optimizing interconnects
• Reducing development
time
• Lowering power
consumption
• Minimizing die size
• Automates design and
verification tasks
• Supports tailored
topologies
• Enhances system
responsiveness.
Generic NoC Router Architecture
• Processing Elements (PE) arranged in a mesh-like grid
• Each PE is connected to a local router through a Network
Interface Controller (NIC).
• NIC module packetizes/de-packetizes the data into/from the
underlying interconnection network.
• The PE together with its NIC forms a network node.
• Nodes communicate with each other by injecting data packets
into the network.
• Packets traverse the network toward their destination, based on
various routing algorithms and flow control mechanisms.
Generic NoC Router Architecture
• Input Buffers
• VC buffers
• Route Computation
• Arbiter (Switch Allocator)
• Crossbar Switch
Generic Router Pipeline
Network Interconnect Design Parameters
• The number of terminals
• The peak bandwidth of each terminal
• The average bandwidth of each terminal
• The required latency
• The message size or a distribution of message sizes
• The traffic pattern(s) expected
• The required quality of service
• Reliability
Network On Chip Design
Topology: Specifies how the switches are wired together (structure/layout).
• Direct
  • Mesh, Torus
• Indirect
  • Fat tree
Routing: How a message gets from source to destination (the path).
• Static (fixed paths)
• Dynamic (routing decisions are made according to the current state of the network)
• XY routing
• Source routing
• Shortest path
• Adaptive routing
Switching: How data is forwarded through routers.
• Circuit switching and packet switching
• Switching techniques
  • Store and Forward (SAF)
  • Wormhole (WH)
  • Virtual Cut Through (VCT)
Flow Control: Storing messages and managing buffer space and data flow between routers.
• On/Off scheme
• Credit scheme
• ACK/NACK flow control
Topology
NoC topology is the arrangement of channels and nodes in an on-chip network, similar to a map of roads.
• Switch/Router (Node) Degree: number of links at a node; used to estimate cost.
• Higher degree -> more links/ports at each router.
• Hop Count: number of hops a message takes from source to destination. Good
for estimating latency.
• Network Diameter: the largest minimum hop count over all source-destination pairs in the on-chip network.
• Path Diversity: the existence of multiple shortest paths between a source-destination pair.
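These metrics can be computed directly for a small 2D mesh. The sketch below is only illustrative and rests on a few assumptions: nodes are addressed as (x, y) coordinates, the degree counts only router-to-router links (not the local port to the PE), and the function names are made up for the example.

```python
from itertools import product
from math import comb


def degree(x: int, y: int, cols: int, rows: int) -> int:
    """Number of neighbouring routers of node (x, y) in a cols x rows mesh."""
    neighbours = ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1))
    return sum(0 <= nx < cols and 0 <= ny < rows for nx, ny in neighbours)


def hop_count(src, dst) -> int:
    """Minimum number of hops between two mesh nodes (Manhattan distance)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def diameter(cols: int, rows: int) -> int:
    """Largest minimum hop count over all source-destination pairs."""
    nodes = list(product(range(cols), range(rows)))
    return max(hop_count(a, b) for a in nodes for b in nodes)


def shortest_path_count(src, dst) -> int:
    """Path diversity: number of distinct minimal paths between two nodes."""
    dx, dy = abs(src[0] - dst[0]), abs(src[1] - dst[1])
    return comb(dx + dy, dx)
```

For a 4 x 4 mesh, a corner router has degree 2, an interior router has degree 4, the diameter is 6 hops (corner to opposite corner), and there are 20 distinct shortest paths between opposite corners.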
Topology
The 2D mesh is the most popular topology
▪ All links have the same length
• eases physical design
▪ Chip area grows linearly with the number of
nodes
▪ Must be designed in such a way as to
avoid traffic accumulating in the
center of the mesh
Routing
Routing is responsible for correctly and efficiently delivering
packets from the source to the destination.
• The choice of a routing algorithm depends on
trade-offs between several potentially conflicting
metrics:
▪ Minimizing power and logic hardware (routing
tables) for a lower area footprint
▪ Higher performance through lower delay and
maximum traffic utilization
▪ Improving robustness to better adapt to
changing traffic needs
• Routing schemes are classified based on several categories
Routing
Deterministic routing: Always chooses the
same route between a given source and destination.
Oblivious routing: Chooses a path between a pair of tiles A and B
from a set of predetermined paths, without
considering the current state of the network.
Source routing: The route is
embedded in the packet header, so
the source determines the path, including the turn taken at
each intermediate junction.
Adaptive routing: Chooses the route at run time
according to the current state of the network, typically
among routes that lead the packet closer to the destination.
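As a concrete instance of deterministic routing, XY routing on a 2D mesh first moves the packet along the X dimension until the column matches, then along the Y dimension. The sketch below assumes (x, y) node coordinates; the function names are illustrative, not taken from any particular router implementation.

```python
def xy_next_hop(cur, dst):
    """Return the next node on the XY route from cur toward dst."""
    x, y = cur
    dx, dy = dst
    if x != dx:                       # first correct the X coordinate
        return (x + (1 if dx > x else -1), y)
    if y != dy:                       # then correct the Y coordinate
        return (x, y + (1 if dy > y else -1))
    return cur                        # already at the destination


def xy_route(src, dst):
    """List every node visited from src to dst, inclusive."""
    path = [src]
    while path[-1] != dst:
        path.append(xy_next_hop(path[-1], dst))
    return path
```

For example, `xy_route((0, 0), (2, 1))` visits (0, 0), (1, 0), (2, 0), (2, 1). Because the decision depends only on the current and destination coordinates, the same source-destination pair always takes the same path.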
Circuit and Packet Switching
Packet Switching
Packets are transmitted from the source and make their way independently to the destination,
possibly along different routes and with different delays.
▪ Zero start-up time, followed by a variable delay due to contention in routers along the path
▪ QoS guarantees are harder to make in packet switching than in circuit switching.
Three main packet switching schemes:
• SAF (Store and Forward) Switching
• Virtual Cut Through Switching
• Wormhole Switching
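The difference between these schemes shows up in a simple zero-load latency model. The sketch below is a textbook-style approximation under stated assumptions: no contention, one flit crosses a link per cycle, and every router adds a fixed `router_delay` in cycles. The function and parameter names are made up for the example.

```python
def saf_latency(hops: int, router_delay: int, packet_flits: int) -> int:
    """Store and forward: every hop waits for the whole packet to arrive."""
    return hops * (router_delay + packet_flits)


def cut_through_latency(hops: int, router_delay: int, packet_flits: int) -> int:
    """Virtual cut-through and wormhole at zero load: the head flit is
    forwarded as soon as it is routed, so the packet length is paid once."""
    return hops * router_delay + packet_flits
```

With 4 hops, a 2-cycle router, and an 8-flit packet, store and forward takes 40 cycles while cut-through or wormhole takes 16 cycles. Virtual cut-through and wormhole differ mainly in buffering (per packet vs. per flit), which this contention-free model does not capture.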
Wormhole packet switching
Flow Control Schemes
The objective is to allocate network resources for packets traversing the
NoC.
• It can also be viewed as a problem of resolving contention during
packet traversal.
• Manages congestion.
• Provides an error recovery mechanism for retransmission in the network.
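One way to picture buffer allocation between two routers is the credit scheme: the sender keeps a count of free slots in the downstream buffer and sends a flit only while that count is positive. The sketch below is a simplified model, not a real router interface: the `CreditLink` class is hypothetical, and credits are assumed to return to the sender instantly, whereas real links have a round-trip delay.

```python
from collections import deque


class CreditLink:
    """One upstream sender and one downstream input buffer."""

    def __init__(self, buffer_slots: int):
        self.capacity = buffer_slots
        self.credits = buffer_slots   # sender's view of free downstream slots
        self.buffer = deque()         # downstream input buffer

    def send(self, flit) -> bool:
        """Try to send a flit; stall (return False) when no credit is left."""
        if self.credits == 0:
            return False
        self.credits -= 1
        self.buffer.append(flit)
        return True

    def drain(self):
        """Downstream router forwards one flit and returns a credit."""
        if not self.buffer:
            return None
        flit = self.buffer.popleft()
        self.credits += 1
        return flit
```

With two buffer slots, the third consecutive `send` stalls until a `drain` frees a slot, so the downstream buffer can never overflow.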
Flow Control Schemes
Virtual Channel Flow Control
Deadlock, Livelock, and Starvation
Deadlock: A packet never reaches its destination because it is blocked
indefinitely at some intermediate resource.
Livelock: A packet never reaches its destination because it enters a
cyclic path and keeps circulating.
Starvation: A packet never reaches its destination because some
resource never grants it access (while granting access to other packets).
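Starvation is usually an arbitration problem. A round-robin arbiter is one common way to avoid it, because the most recently served requester gets the lowest priority in the next round. The sketch below is a hypothetical illustration of that idea and is not taken from the designs discussed in these slides.

```python
class RoundRobinArbiter:
    """Grants one of n requesters per call, rotating priority after each grant."""

    def __init__(self, n: int):
        self.n = n
        self.last = n - 1             # so that requester 0 is checked first

    def grant(self, requests):
        """requests is a list of n booleans; returns the granted index or None."""
        for offset in range(1, self.n + 1):
            i = (self.last + offset) % self.n
            if requests[i]:
                self.last = i         # the winner gets lowest priority next time
                return i
        return None
```

If all three inputs of a 3-input arbiter request continuously, the grants rotate 0, 1, 2, 0, and so on, so every requester is served within n grants. A fixed-priority arbiter, by contrast, could keep serving input 0 forever.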
NoC-based CPU SoCs
Rapid Silicon’s Gemini SoC
Eminent CPU based on NoC
Where to find NOCs?
• On-chip NoCs interconnect
components (e.g., CPU cores,
GPU cores, caches, accelerators,
and memory controllers) within
a single chip (System-on-Chip,
SoC).
• Die-to-die (D2D) NoCs interconnect multiple
chiplets within a single package
using a high-speed interconnect
fabric.
• Chip-to-chip (C2C) NoCs connect multiple
chips across a PCB or system for
cohesive operation (e.g.,
multi-SoC systems or
heterogeneous computing
platforms).
NoCs at different levels
Questions