【AWS】Dive deep on AWS networking infrastructure

The document discusses the evolution of AWS networking infrastructure from consuming existing hardware and software to innovating new solutions. It covers phases from using industry standards to creating custom hardware and focusing on customer benefits. Key aspects covered include hardware design, software control, scalability, availability and performance.

Dive deep on AWS networking infrastructure
Colin Whittaker (he/him)
Principal Engineer
AWS

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
AWS networking

Infrastructure networking: Routers/switches, Copper/optical cables, Data centers, Inter-Region backbone, Internet peering/transit

Amazon EC2 networking: Virtual private cloud (VPC), Elastic network interface, AWS Hyperplane, Elastic Fabric Adapter (EFA), Placement groups

Edge networking: Amazon Route 53, AWS Global Accelerator, Amazon CloudFront, AWS Direct Connect, AWS Cloud WAN

Agenda
Choose your own adventure

Option A: Hardware innovation

How and why we design our hardware for routing, encryption and transport

Option B: Software innovation

Distributed vs centralized control, evolving out of self contained devices

Tenets

Secure Available

Scalable Performant

Phases of evolution

Consume Create Innovate

Consume
Industry hardware and software

Basic automation

Pushed beyond design intentions

Large chassis backplane/midplane

CC Patrick Finnegan

Core concepts into create
Embrace Moore’s law

Own our destiny

Use repeatable design patterns

Limit effect boundaries

Constantly iterate and evolve

Chassis platforms

[Diagram: chassis-based platform. Labels: Control plane CPU (x2); Line card with Ports and Forwarding ASIC(s) (x5); Switch fabric (x4); PSU / FAN (x5)]

How we do it

“A complex system that works is invariably found to have evolved from a simple system that worked. The inverse proposition also appears to be true: A complex system designed from scratch never works and cannot be made to work. You have to start over, beginning with a working simple system.”

John Gall, General Systemantics: An essay on how systems work, and especially how they fail, 1975

Single chip–based platforms

[Diagram: single chip network device. Labels: Control plane CPU; Forwarding ASIC; Ports; PSU / FAN (x2)]

Create
TOPOLOGY AND HARDWARE

Clos fabric
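
As a rough illustration of how a Clos fabric built from small single-chip devices can take the place of a large chassis, the sketch below wires up a hypothetical two-tier leaf/spine fabric and counts the equal-cost paths between two leaves. The radix, the half-up/half-down port split, and every name in the code are assumptions made for illustration, not a description of the actual AWS design.

```python
from itertools import product

def build_leaf_spine(radix: int) -> dict:
    """Two-tier folded Clos built from identical switches with `radix` ports.

    Assumption: each leaf uses half its ports as uplinks (one per spine)
    and half as host-facing ports, so the fabric is non-blocking.
    """
    if radix % 2:
        raise ValueError("radix must be even")
    spines = radix // 2   # one uplink from every leaf to every spine
    leaves = radix        # every spine port connects to a distinct leaf
    links = {
        (f"leaf{l}", f"spine{s}")
        for l, s in product(range(leaves), range(spines))
    }
    return {
        "leaves": leaves,
        "spines": spines,
        "links": links,
        "host_ports": leaves * (radix // 2),
    }

def paths_between(fabric: dict, src: str, dst: str) -> int:
    """Count leaf -> spine -> leaf paths between two distinct leaves."""
    return sum(
        1
        for s in range(fabric["spines"])
        if (src, f"spine{s}") in fabric["links"]
        and (dst, f"spine{s}") in fabric["links"]
    )

fabric = build_leaf_spine(32)
print(fabric["leaves"], fabric["spines"], fabric["host_ports"])  # 32 16 512
print(paths_between(fabric, "leaf0", "leaf1"))                   # 16
```

Losing one spine in this toy fabric removes one of 16 paths rather than a whole chassis, which is one way to read "limit effect boundaries".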

Phases of evolution

Consume Create Innovate

Innovate
PRESENT DAY

Freedom to examine trade-offs

Custom hardware

Multi-domain applications

Focus on the benefit

How we do it
WE’VE COME A LONG WAY

How we do it

[Diagram, built up over several slides: block diagram of a single device. Labels: Network; Ethernet; CPU; system on a chip; hw “bmc”; Console; Power; i2c bus; Module access offload; QSFP (x4 shown); 32 optical modules]

Innovate
PRESENT DAY

ASIC & Optics Board

CPU & Memory Card

Hardware BMC & Storage Board

I2C Offload Module

Power Delivery

How we do it – 102.4T rack
16+16 32x400G Devices

1024x400G ports total

256x400G ports for Consumers

Max 30.8kVA per rack

Direct-attach copper (DAC) cabling

100G 6.7mm OD at 2.5m

400G 11mm OD at 2.5m

Active DAC with retimers
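
The port counts on this slide follow from simple arithmetic, sketched below. The inputs are the figures from the slide. Reading "16+16" as two groups of 16 devices, and supposing that one group gives half of its ports to consumers, are assumptions that happen to reproduce the 256-port figure. The cable comparison assumes bulk scales with cross-sectional area.

```python
DEVICES = 16 + 16        # "16+16" devices in the rack (from the slide)
PORTS_PER_DEVICE = 32    # each device is 32x400G
PORT_GBPS = 400
CONSUMER_PORTS = 256     # from the slide

total_ports = DEVICES * PORTS_PER_DEVICE
consumer_tbps = CONSUMER_PORTS * PORT_GBPS / 1000
internal_ports = total_ports - CONSUMER_PORTS

# Assumption: 16 devices each dedicate half of their ports to consumers.
assumed_consumer_ports = 16 * (PORTS_PER_DEVICE // 2)

# Assumption: cable bulk scales with the square of the outer diameter (OD).
area_ratio = (11.0 / 6.7) ** 2   # 400G DAC versus 100G DAC at 2.5 m

print(total_ports)             # 1024
print(consumer_tbps)           # 102.4
print(internal_ports)          # 768
print(assumed_consumer_ports)  # 256
print(round(area_ratio, 2))    # 2.7
```

On these numbers, three quarters of the ports in the rack are used for cabling inside the fabric, which is where short, thick DAC cables matter most.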


How we do it – Short reach

How we do it – SN connector

How we do it
MEDIUM HAUL

Data center interconnect (DCI)

OIF 400G ZR

400G – ZR+ 400km

Integrated routing, DWDM, encryption

How we do it – 51.2T rack
MEDIUM HAUL

8x 12.8Tbit/s T2 Devices
8x 12.8Tbit/s DWDM Switches
8x16x400G ZR(+) Ports
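
The rack figure can be checked from the port counts on the slide. This is arithmetic only. The comparison with per-device capacity is an observation, not something the slide states.

```latex
\begin{align*}
\text{ZR(+) capacity} &= 8 \times 16 \times 400\ \text{Gbit/s} = 51\,200\ \text{Gbit/s} = 51.2\ \text{Tbit/s} \\
\text{ZR(+) capacity per device} &= 16 \times 400\ \text{Gbit/s} = 6.4\ \text{Tbit/s} = \tfrac{1}{2} \times 12.8\ \text{Tbit/s}
\end{align*}
```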

AWS DWDM Platform

Four optical sleds


2x400G QSFP-DD to DWDM

Firmware upgradeable: fine-tune link quality

Layer 1 encryption: AES-256

How we do it
MANAGEMENT NETWORKS

Out-of-band switch Console server

Thank you!
Colin Whittaker Giacomo Bernardi Giorgio Bonfiglio

AWS backbone

[Diagram. Labels: ARN RR; OSL RR; *** RR; CPH Inet RR; Inet Route Controller; bulleted items: Netflow, TE (Local), TE (Global), Overrides, CDN Data; Route Programmer; Full Table Router; CPH Hot Route RR; CPH Hot Routes; 99% Traffic; Consumer]

Phases of evolution

Consume Create Innovate

Create
NETWORK OPERATING SYSTEM

Linux-based

Multi-sourced manufacturing

Multi-ASIC

[Diagram, built up over two slides: software stack. Labels: Management; Routing protocols; Telemetry; OSPF/BGP ++; Linux kernel; SDK; Network ASIC]

Create
Config generation

Deployment coordination

Active telemetry

Auto-remediation

NOC-less

Phases of evolution

Consume Create Innovate

Metal boxes and a lot of cables
Small number of rack variations

Rack and cable switches for burn-in

Collect inventory and compare with bill of materials

Reprogram with AWS controlled binaries
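
The "collect inventory and compare with bill of materials" step can be pictured as a multiset comparison. The sketch below is a minimal illustration under that assumption. The part names, quantities, and the `compare_inventory` helper are hypothetical and do not come from the talk.

```python
from collections import Counter

def compare_inventory(bill_of_materials: dict, discovered: dict):
    """Return parts that are missing from, or unexpected in, a rack."""
    expected = Counter(bill_of_materials)
    actual = Counter(discovered)
    missing = expected - actual      # expected but not found
    unexpected = actual - expected   # found but not expected
    return dict(missing), dict(unexpected)

# Hypothetical rack definition and burn-in discovery result.
bom = {"switch-32x400g": 32, "psu": 64, "dac-400g-2.5m": 512}
found = {"switch-32x400g": 32, "psu": 63, "dac-400g-2.5m": 512, "dac-100g-2.5m": 1}

missing, unexpected = compare_inventory(bom, found)
print(missing)     # {'psu': 1}
print(unexpected)  # {'dac-100g-2.5m': 1}
```

With only a small number of rack variations, a check like this stays a simple comparison against a short list of known-good definitions.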

Which way do I go?

AWS network

Open Shortest Path First (OSPF)?

Border Gateway Protocol (BGP)?

Which way do I go?
ISSUES WITH OFF-THE-SHELF PROTOCOLS

Last link standing

Cross-domain imbalance

[Diagram: example topology. Labels: X, C, X, Y, Z; numbers 8 and 4]

Which way do I go?
DISTRIBUTED VERSUS CENTRAL

Distributed (classical): Statically stable, Low scope of impact

Centralized (SDN): High visibility, Deterministic

Which way do I go?
BEST OF BOTH WORLDS

Hybrid: Statically stable, Deterministic, Highly visible, Low scope of impact

So many paths!
TRADITIONAL TCP BEHAVIOR

So many paths!
ELASTIC FABRIC ADAPTER (EFA) AND SCALABLE RELIABLE DATAGRAM

Dave Brown’s Keynote


Session: NET211-L

Monday Night Live with Peter DeSantis – 2018

Scaling HPC Applications on EC2 – 2018
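
To make the contrast between the two "So many paths!" slides concrete: with conventional equal-cost multipath (ECMP) routing, a TCP flow is typically pinned to one path by a hash of its 5-tuple, whereas Scalable Reliable Datagram (SRD) has been publicly described as spreading a flow's packets across many paths. The toy model below illustrates only that difference. The path count, the hash, and the round-robin spraying are assumptions, not how EFA or SRD is implemented.

```python
import hashlib
from collections import Counter

PATHS = 8  # hypothetical number of equal-cost paths

def ecmp_path(flow: tuple) -> int:
    """Pick a path from a hash of the 5-tuple; every packet of the flow follows it."""
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return int.from_bytes(digest[:4], "big") % PATHS

def sprayed_path(packet_seq: int) -> int:
    """Toy per-packet spraying: rotate across all paths (round-robin assumption)."""
    return packet_seq % PATHS

flow = ("10.0.0.1", "10.0.0.2", 6, 40000, 443)  # src, dst, protocol, sport, dport
pinned = Counter(ecmp_path(flow) for _ in range(1000))
sprayed = Counter(sprayed_path(seq) for seq in range(1000))

print(len(pinned))            # 1: the whole flow shares one path
print(len(sprayed))           # 8: packets use every path
print(max(sprayed.values()))  # 125
```

In the pinned case a single congested or failed path affects the entire flow. In the sprayed case each path carries one eighth of the packets, at the cost of the receiver having to tolerate reordering.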

Doctor, why does it hurt?
PASSIVE MONITORING

>7B observations per minute

Counters

Sensors

Events

Doctor, why does it hurt?
ACTIVE MONITORING

>1.5B probes per minute

Doctor, why does it hurt?
ACTIVE MONITORING

Internet

Doctor, why does it hurt?
CLEAR SIGNAL

Doctor, why does it hurt?
CORRELATION

Doctor, why does it hurt?
TRIANGULATION

---------
---------
------X--
---------
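
One way to read the triangulation slide: when some probes fail and others succeed, the fault most likely sits on a link shared by all failing probes and used by none of the successful ones. The sketch below is a minimal illustration under that assumption. The link names, probe paths, and the `triangulate` function are hypothetical.

```python
def triangulate(probe_results):
    """Return links on every failing probe path and on no successful one.

    probe_results: list of (links_on_path, succeeded) tuples.
    """
    failing = [set(links) for links, ok in probe_results if not ok]
    if not failing:
        return set()
    suspects = set.intersection(*failing)
    for links, ok in probe_results:
        if ok:
            suspects -= set(links)   # a working probe clears its links
    return suspects

# Hypothetical probe paths through a small network.
probes = [
    (["a-b", "b-c", "c-d"], False),
    (["e-b", "b-c", "c-f"], False),
    (["a-b", "b-g", "g-d"], True),
    (["e-h", "h-c", "c-f"], True),
]
print(triangulate(probes))  # {'b-c'}
```

A single faulty link is the simplest case. With several simultaneous faults the intersection can come back empty, so a real system would need a more forgiving scoring scheme.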

Ahhh . . . That’s better
AUTO-REMEDIATION

Layered control
Local for speed

Central for optimization

Hierarchical abstractions

Future: Intentful management
Expected behaviors

Hierarchical

Multi-domain

Closed loop

[Diagram: intent hierarchy. Labels: Planning; Regional; Intent; Zonal; Device]
