Data Center Interconnects

Ivan Pepelnjak ([email protected])


NIL Data Communications

Agenda

Data Center Interconnects
• Design considerations
• Layer-3 DCI: external routing, load balancing, path isolation, inter-DC vMotion
• Layer-2 DCI: drivers, challenges, technologies

Design Considerations
Inputs:
• Business needs
• Generic technology requirements
• High-availability requirements
• Application structure and HA implementation
• Transport options

Design decisions:
• External routing
• Firewalling (covered in the Layer-3 DCI section)
• Load balancing options
• Storage connectivity
• DCI type

Design Considerations – Inputs

Business Needs
Disaster Recovery Site
• Secondary (cold) infrastructure activated during disaster recovery

Disaster Avoidance
• Migrate the workload before an anticipated disaster
• Data centers are concurrently active for a limited amount of time
• Target data center might already run other application loads

Active/Active Data Center


• Multiple data centers are active at the same time
• Optimum use of resources
• The same application runs concurrently in multiple data centers

Design Considerations – Inputs

Technology Requirements
Disaster Recovery Site
• Storage replication to the DR site
• WAN connectivity at the DR site
• Manual or orchestrated switchover / DR startup

Disaster Avoidance
+ Application load adjustment or live VM migration
+ WAN connectivity adjustments
+ Load balancing adjustments

Active/Active data centers
+ Global load balancing

Design Considerations – Inputs

High-Availability Requirements
Disaster recovery
• Duplicated storage
• Servers (or VMs) are started on DR site after primary site failure
• Downtime: minutes or hours

Disaster avoidance
• Minimum downtime
• Local and global load balancing facilitates seamless failover (preferred)
• Live VMs moved to the secondary site
• Stretched cluster between sites, temporary or permanent (avoid)

Design Considerations – Inputs

Transport Options

Dark fiber, SONET/SDH or DWDM
• L1 connectivity, supports any L2/L3 technology or protocol

Pseudowires
• L1.5 connectivity
• Point-to-point links implemented with Ethernet (rarely ATM or FR)
• Dictates packet framing (802.3)
• End-to-end L2 control protocols (STP, LACP, LLDP)

VPLS (avoid)
• L2 connectivity
• Emulates a switched LAN
• Provider switches are visible to the end user
• No end-to-end LACP or STP

IP (MPLS/VPN, IPsec ...)
• L3 connectivity
• Non-IP traffic must be tunneled

Design Considerations – Decisions

Storage Connectivity

Long-distance block storage protocols: iSCSI or FC
• FCoE does not work due to PFC limitations
• Use checksums with iSCSI

Synchronous replication: the write is acknowledged to the server only after it has been committed to both the local and the remote storage array
Asynchronous replication: the write is acknowledged as soon as the local array commits it; the remote copy is updated afterwards

Transport      FC      iSCSI
DWDM/fiber     yes     yes
Pseudowires    FCIP    yes
VPLS           FCIP    yes
IP             FCIP    yes

Alternative: distributed file system with NFS

Design Considerations – Decisions

LUN-based Replication Drawbacks


SCSI timeouts
• Synchronous replication limits maximum distance due to tight SCSI timeouts
Buffer-to-buffer credits
• FC uses hop-by-hop buffer credits to ensure lossless transport
• Long round-trip times reduce throughput
• FCIP can use TCP flow control (only with VE_port model)
• iSCSI has no need for B2B credits
• Additional performance improvements with FCIP I/O Acceleration (Cisco)
Write to primary copy
• Read requests served from secondary LUN copy
• Write requests always sent to the primary LUN instance
Alternative: distributed file system with NFS

Design Considerations – Decisions

DCI Network-Side Design Options


Layer-3 (IP) interconnect
• Traditional IP routing design

Layer-3 interconnect with path separation (preferred)
• Multiple parallel isolated L3 interconnects
• DMZ, application, database, management, storage ... segments strictly separated
• Implementation: Multi-VRF and P2P VLANs, or MPLS/VPN

Layer-2 interconnect (avoid)
• Stretched VLANs (bridging across the WAN)
• Business requirements: stretched cluster, VM mobility
• Implementation: depends on the available transport technology

Layer-3 DCI Overview
Classic Enterprise IP routing design
• Routing & bridging within each data center
• External connectivity from all data centers
• IP routing between the data center cores

Design considerations
• Workload distribution
• External routing
• Load balancing and NAT
• Firewalling
• High-availability and disaster avoidance procedures

Layer-3 DCI

Application and Workload Distribution


Common design choices:
A) Each DC runs its own set of applications
B) The same application runs in both data centers

External connectivity options:
A) Both DCs are accessible through both external links
B) Each DC has its own primary external connectivity

[Figure: two data centers with their own Internet links, connected with a DCI link and sharing distributed storage]

Layer-3 DCI

External Connectivity: One Common Prefix


• Both data centers advertise the same IP prefix (e.g., fd00:1::/32) into global BGP
• Internal routing distributes the traffic to the target servers

Results:
• DCI failure causes traffic black holes
• Heavy DCI utilization for inbound traffic
• Outbound traffic is optimal
• Asymmetric traffic flows
• Stateful firewalls are not very useful

Layer-3 DCI

External Connectivity: Specific+Summary Prefix


• Each data center advertises its own prefix (fd00:1::/32, fd00:2::/32)
• Both data centers advertise a summary prefix (fd00::/30) for backup purposes

Results:
• Traffic flows are optimal
• DCI is heavily loaded during external connectivity failures
• Use DNS-based load balancing
• Stateful firewalls will break TCP sessions after external link failure/recovery
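A minimal IOS-style sketch of the specific+summary option for DC A, assuming the addressing from the figure (fd00:1::/32 as the DC-A prefix, fd00::/30 as the shared summary) and a hypothetical upstream neighbor address; DC B would advertise fd00:2::/32 plus the same summary:

  router bgp 65001
   neighbor 2001:db8:ffff::1 remote-as 64510
   address-family ipv6
    ! DC-A specific prefix (assumed to be in the routing table, e.g. from the IGP)
    network fd00:1::/32
    ! shared summary advertised by both data centers for backup
    network fd00::/30
    neighbor 2001:db8:ffff::1 activate
  !
  ! static anchor so BGP can originate the summary
  ipv6 route fd00::/30 Null0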

Layer-3 DCI

External Connectivity: Independent Prefixes


• Each data center advertises its own prefix (fd00:1::/32, fd00:2::/32)
• No backup connectivity
• DCI used only for inter-DC traffic

Results:
• Traffic flows are optimal
• Must use DNS-based load balancing
• Somewhat reduced high availability
• Stateful firewalls work as expected

Layer-3 DCI

Scalability/High-Availability Options
Scale up (bigger servers) or scale out (load balancing)?
• “Scale out” requires multiple parallel application copies
• Easy for web-based applications
• Oracle RAC supports active/active clusters
• MySQL Cluster 7.0 supports local active/active clusters and row-based replication

Load balancing
• Within a data center (ACE/BIG-IP LTM)
• Between data centers (LB with source NAT)
• Globally with DNS-based load balancing (GSS/BIG-IP GTM)

Layer-3 DCI

Local Load Balancing


• Load balancing between servers in the same data center
• VIP belongs to the DC public address space
• Load balancer can be in the data path

Combine with DNS-based load balancing for applications running in multiple data centers

Layer-3 DCI

Inter-DC Load Balancing


• Load balancing between servers in multiple data centers
• Public VIP per application in each data center
• Requires source NAT
• DNS-based global load balancing
• Enables transparent workload migration or disaster avoidance
• F5: proximity-based load balancing with EtherIP

Layer-3 DCI

Disaster Avoidance With Load Balancing


Prerequisites
• Public VIP per application in each data center
• DNS-based global load balancing
• Synchronization between global and local load balancing

Process
• Graceful shutdown of servers in DC A
• Start new servers in DC B
• Load balancers gradually shift the load toward DC B (A:B ratio going from 2:2 through 1:3 to 0:4)
• No Layer-2 DCI or vMotion required

Layer-3 DCI

Layer-3 DCI with Path Isolation

[Figure: PE-DC-A and PE-DC-B connect the DC cores of Data Center A and Data Center B]

Requirements
• Layer-3 connectivity between data centers (reduce the VLAN/STP domain size)
• Maintain separation between security zones (DMZ, applications, database, storage)

Solutions
• Multi-VRF + multiple (routed) VLANs across the WAN link (simple)
• Single-hop MPLS/VPN across an L2 interconnect link (technology-independent)
• Private MPLS/VPN backbone (multiple DCs)

Layer-3 DCI with Path Isolation

Layer-3 Path Isolation Implementation Options


Transport technology      Multi-VRF                                      MPLS/VPN
Dark fiber or DWDM        Ethernet VLANs                                 MPLS over Ethernet
SONET/SDH                 VLANs on emulated Ethernet                     MPLS over PPP
Ethernet pseudowires      VLANs (routed at the WAN edge, one per VRF)    MPLS over Ethernet
VPLS                      VLANs (as above)                               MPLS over Ethernet
L3 MPLS/VPN               GRE tunnels (P2P or mGRE, one tunnel per VRF)  MPLS over GRE (Carrier's Carrier architecture)
Generic IP connectivity   GRE tunnels (as above)                         MPLS over GRE

Layer-3 DCI with Path Isolation

Multi-VRF with L1/L2 WAN Link

[Figure: the two DC edge routers connect their APP, DB and MGMT VRFs over an 802.1Q trunk]

• Ethernet encapsulation on the DCI link
• One VLAN per VRF + a global VLAN
• Each VLAN is routed in the corresponding DC edge router VRF
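A minimal IOS-style sketch of the VLAN-per-VRF idea on one DC edge router; the VRF names, VLAN numbers and transfer-subnet addresses are illustrative assumptions:

  ip vrf APP
   rd 65000:1
  ip vrf DB
   rd 65000:2
  !
  ! 802.1Q trunk toward the other data center: one subinterface per VRF
  interface GigabitEthernet0/1.101
   encapsulation dot1Q 101
   ip vrf forwarding APP
   ip address 192.168.101.1 255.255.255.252
  !
  interface GigabitEthernet0/1.102
   encapsulation dot1Q 102
   ip vrf forwarding DB
   ip address 192.168.102.1 255.255.255.252
  !
  ! additional subinterface for the global routing table
  interface GigabitEthernet0/1.100
   encapsulation dot1Q 100
   ip address 192.168.100.1 255.255.255.252

A routing protocol (or static routes) would then run independently in each VRF across its VLAN.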

Layer-3 DCI with Path Isolation

Multi-VRF DCI Across IP Infrastructure

[Figure: the two DC edge routers connect their APP, DB and MGMT VRFs across an IP WAN]

• GRE encapsulation across the IP WAN
• One GRE tunnel per VRF + global IP routing
• Each tunnel is routed in the corresponding DC edge router VRF
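A minimal IOS-style sketch of the tunnel-per-VRF variant, reusing the hypothetical VRF names from the previous example; tunnel endpoint addresses are assumptions:

  ! one GRE tunnel per VRF; the tunnel endpoints are reachable through the global routing table
  interface Tunnel101
   ip vrf forwarding APP
   ip address 192.168.201.1 255.255.255.252
   tunnel source Loopback0
   tunnel destination 203.0.113.2
  !
  interface Tunnel102
   ip vrf forwarding DB
   ip address 192.168.202.1 255.255.255.252
   tunnel source Loopback0
   tunnel destination 203.0.113.2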

Layer-3 DCI with Path Isolation

MPLS/VPN Across L2 DCI Transport

[Figure: MPLS/VPN-enabled DC edge routers exchange global IP traffic and labeled (MPLS) VRF traffic over the L2 DCI link]

• Ethernet encapsulation on the DCI link
• Global IP traffic is encapsulated into L2 frames
• VRF traffic is encapsulated with an MPLS shim header inside L2 frames

Layer-3 DCI with Path Isolation

Inter-DC MPLS/VPN Across IP Infrastructure

[Figure: MPLS/VPN-enabled DC edge routers connected with a GRE tunnel across the IP WAN]

• Global IP traffic is routed across the WAN IP infrastructure
• MPLS (VPN) traffic is GRE-encapsulated
• A single GRE tunnel is required between a pair of DC edge routers
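A minimal IOS-style sketch of one DC edge router in this design; the addresses, AS number and VRF definition are illustrative assumptions:

  ! single GRE tunnel toward the other DC edge router, with MPLS (LDP) enabled on it
  interface Tunnel0
   ip address 192.168.255.1 255.255.255.252
   mpls ip
   tunnel source Loopback0
   tunnel destination 203.0.113.2
  !
  ip vrf APP
   rd 65000:1
   route-target both 65000:1
  !
  ! MP-BGP carries the VPNv4 routes between the two DC edge routers
  router bgp 65000
   neighbor 10.255.255.2 remote-as 65000
   neighbor 10.255.255.2 update-source Loopback0
   address-family vpnv4
    neighbor 10.255.255.2 activate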

LISP in Layer-3 DCI

Introduction to LISP
[Figure: ITR and ETR at the edges of the IP backbone; MR and MS reached through the alternative topology (ALT); RLOCs live in the backbone, EIDs in the sites]

LISP = Locator/ID Separation Protocol
• Maps a host IP prefix (EID) into a transport IP address (RLOC)
• EID is fixed, RLOC can change
• Host-to-host traffic is UDP-encapsulated between ITR and ETR
• Global EID-to-RLOC mapping service

LISP in Layer-3 DCI

LISP Terminology
ITR: Ingress Tunnel Router
ETR: Egress Tunnel Router
MR: Map Resolver (performs EID-to-RLOC mapping lookups on behalf of the ITR)
MS: Map Server (the ETR registers its EID-to-RLOC mappings with the MS)
ALT: Alternative topology (BGP over GRE) that propagates EID-to-RLOC mapping information

LISP in Layer-3 DCI

A Day in Life of a LISP Packet


1. The host sends an IP packet toward the ITR
2. The ITR performs an EID-to-RLOC lookup in its local cache
3. The ITR encapsulates the IP packet into a LISP+UDP+IP envelope
4. The ITR sends the packet, addressed to the ETR RLOC, into the IP backbone
5. The ETR receives the LISP packet, decapsulates it and performs an EID lookup
6. The ETR forwards the original IP packet toward the target EID

LISP in Layer-3 DCI

EID-to-RLOC Mapping Service


Topology-driven actions
• The ETR registers its EID-to-RLOC mapping with the MS
• The mapping is propagated throughout the ALT backbone

Data-driven actions
• The ITR receives an IP packet addressed to an unknown EID
• The ITR sends a Map-Request to its local MR
• The MR forwards the Map-Request onto the ALT topology
• The Map-Request reaches the ETR
• The ETR responds with a Map-Reply (which can be based on ITR location)
• The Map-Reply reaches the ITR
• The ITR installs the reply into its local LISP EID-to-RLOC mapping cache
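A minimal IOS-style xTR (combined ITR/ETR) sketch tying the registration and resolution steps above to configuration; the EID prefix, RLOC, map-server/map-resolver addresses and authentication key are all assumptions:

  router lisp
   ! register the local EID prefix and its RLOC with the map server
   database-mapping 192.168.10.0/24 203.0.113.1 priority 1 weight 100
   ipv4 etr map-server 198.51.100.10 key LISP-KEY
   ipv4 etr
   ! resolve remote EIDs through the map resolver
   ipv4 itr map-resolver 198.51.100.10
   ipv4 itr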

LISP in Layer-3 DCI

LISP Proxy Services


[Figure: a PITR sits between the non-LISP IP backbone and the LISP backbone, with the ETR at the site edge]

• LISP will reach its full potential with global deployment (every CE-router is an ITR)
• Local LISP deployment relies on proxy services
• The PITR advertises EID prefixes into the non-LISP IP backbone to attract traffic
• The PITR performs IP-to-LISP translation
• Return traffic can flow through the PITR, a dedicated PETR, or directly
• LISP and non-LISP IP traffic can use the same IP backbone

LISP in Layer-3 DCI

LISP in the Data Center


Nexus 1000V = ETR, DC edge router = ITR (PITR toward the Internet)

• The Nexus 1000V registers all VM IP addresses with the MS
• The Nexus 1000V changes the LISP mappings after a vMotion event
• L3 (LISP) transport between the PITR and the Nexus 1000V
• L2 DCI is no longer required

LISP in Layer-3 DCI

DC LISP Caveats
Traffic flow issues
• LISP with a DC PITR does not solve the ingress traffic trombone problem
• A remote ITR is required to get optimal ingress routing
• Outbound traffic flow is optimal

Scalability
• EID prefix = host route (VM IP address)
• PITR EID-to-RLOC cache entries must expire soon after a vMotion event
• A low TTL must be set on LISP mappings
• High volume of Map-Requests from PITRs
• Potential TCAM overflow on the PITR

L2 DCI Drivers
VM load distribution
• Requirement: Migrate running VMs from overloaded data center
• Probable result: Overloaded WAN links due to traffic trombones

Disaster avoidance scenarios


• Requirement: Running virtual machines are migrated prior to DC shutdown
• Better alternative: use load balancers

Stretched clusters
• Requirement: Cluster members spread across multiple data centers
• Perfect recipe for disaster: DCI WAN link becomes the weakest link
• Split-brain problems

L2 DCI Caveats
Stretched cluster issues
• Lost cluster quorum and split brain disasters
• Long-distance flooding
• Asymmetric traffic flows and traffic trombones
Long-distance vMotion issues
• Traffic trombones
Bridging-related problems
• Broadcast storms propagated over WAN
• Widespread spanning tree-related outages

Bridging over WAN has never worked well. Why should it work in a Data Center?

L2 DCI Caveats

Stretched Clusters
High Availability Clusters (typical implementation)
• Multiple servers offering the same service
• Active/standby configuration
• Peer failure detection through network heartbeat, LUN locking or
shared file system
• Example: Windows Server Failover Clustering (WSFC)

Stretched clusters
• Members of the same cluster in different data centers
• Often requires L2 connectivity between cluster members

L2 DCI Caveats

Example: Stretched Cluster


• Each node has a public and an interconnect interface
• Requires distributed storage
• Services can run on any node in the cluster
• L2 interconnect is usually required for IP address redundancy

Issues
• Unpredictable or suboptimal traffic flows
L2 DCI Caveats

Example: Stretched Firewall

[Figure: a firewall cluster stretched between the primary and alternate DC sites over a pseudowire or VPLS service; the firewalls are HSRP peers in the same IP subnet, sharing one IP address toward the Internet]


L2 DCI Caveats

Stretched Cluster Problems


Technology challenges
• Usually requires L2 DCI
• Traffic flows across the DCI are unpredictable
• Long-distance flooding in suboptimal cluster implementations
(Microsoft NLB)

Data Center Interconnect becomes the weakest link


• DCI failure splits the cluster (half of the resources are gone)
• DCI failure can cause hard-to-recover split-brain problems

L2 DCI Caveats

Cluster Split After DCI Failure


• DCI failure splits the cluster
• Half of the nodes lose quorum and are shut down
• All services running on those nodes have to be restarted
• Potential traffic black holes (same as with L3 DCI)
L2 DCI Caveats

Split Brain Disaster After DCI Failure


Failure scenario:
• DCI failure splits the cluster; both disk arrays become active
• Cluster quorum exists in both data centers
• Duplicate copies of the same services run in both data centers
• Data gets out of sync

Extremely hard to recover: requires extensive rollback or restore
L2 DCI Caveats

Split-Brain Disaster: Stretched Firewalls

[Figure: the stretched firewall pair after a DCI failure; both HSRP peers in the primary and alternate DC sites become active on the shared IP address in the same subnet]

DCI failure brings down both data centers


L2 DCI Caveats

Long-Distance Flooding in Stretched Clusters

[Figure: a Microsoft NLB cluster stretched across the WAN floods its traffic into both data centers]

• Load balancers exist for a good reason
• Mission-critical problems require high-quality solutions

L2 DCI Caveats

Example: Traffic Trombones After vMotion

[Figure: after a VM is moved between hypervisors in different data centers, its traffic still enters through the original site and trombones across the WAN link to reach the VM]
L2 DCI Caveats

What Causes Traffic Trombones?


Servers are communicating with numerous layer-3 addresses
• Default gateway (fix: FHRP filters)
• Load balancers and firewalls (hard to fix)
• Database servers (impossible to fix)

Inbound traffic flow does not shift after vMotion event


• IP prefix is still advertised through the same WAN connection
• Load balancers in the “old” site still receive external traffic
• LISP can help only when SPs perform ITR encapsulation

Proper fix: scalable application architecture and L3 DCI with load balancing

L2 DCI Caveats

Fixing the Outbound Traffic Trombone

[Figure: the same FHRP address (10.0.0.1) is active in both data centers; the HSRP virtual MAC address is blocked on the L2 DCI link]

• The same FHRP address is active in both data centers
• FHRP virtual MAC addresses are blocked on the L2 DCI link
• Outbound traffic goes through the closer active FHRP router
• Breaks any stateful device in the path (load balancers, firewalls)
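One commonly documented way to keep an active FHRP gateway in each site is to drop HSRP hellos (and keep the virtual MAC local) at the DCI edge; a rough IOS-style sketch, assuming HSRPv1 (UDP port 1985 toward 224.0.0.2), VLAN 100, and a dedicated DCI edge switch that bridges only DCI-bound traffic:

  ip access-list extended HSRP-HELLOS
   permit udp any host 224.0.0.2 eq 1985
  !
  vlan access-map FHRP-ISOLATION 10
   match ip address HSRP-HELLOS
   action drop
  vlan access-map FHRP-ISOLATION 20
   action forward
  !
  vlan filter FHRP-ISOLATION vlan-list 100

The exact filtering point matters: applied on the wrong switch the same filter also drops the hellos exchanged between the local FHRP routers.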

L2 DCI Caveats

Symmetric Optimal Traffic Flow

[Figure: load balancers in both data centers advertise host routes (e.g. 10.1.1.1/32) for the local VMs into the WAN core; the HSRP and load-balancer MAC addresses are blocked on the L2 DCI link]

• The same FHRP address is active in both data centers
• Load balancers can reach only the local VMs (LB MAC address filtered on the DCI link)
• Load balancers insert host routes for active VMs
• Breaks existing TCP sessions
• Same results (with more complexity) as L3 DCI

L2 DCI Caveats

Broadcast Storms
[Figure: a broadcast frame originated in Data Center A is flooded across the WAN into Data Center B]

Every server can start a high-volume broadcast, multicast or unknown-unicast flood
• Traffic is flooded across both data centers (as far as the server VLAN is stretched)
• Inter-switch bandwidth is wasted
• All servers in the same VLAN receive all the traffic (congested server links)
• ESX/ESXi hosts usually use promiscuous NIC mode: they have to process all the traffic

L2 DCI Caveats

Limiting Broadcast Storms


Switches can block ports with a high broadcast/multicast traffic rate
• Apply per-server or on the WAN edge

Dangerous to use in a high-availability environment
• Cluster members or VMware physical servers can become isolated
• Cluster member isolation would cause a service restart on a different member
• The VMware HA feature might restart the offending VM on another server

Dangerous to use on the DCI link
• A single server (or VM) can bring down the DCI link
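On Cisco switches this is the storm-control feature; a minimal sketch on a server-facing port, with the thresholds, interface name and the non-disruptive trap action chosen as illustrative assumptions (per the warnings above, the shutdown action is risky in HA environments):

  interface GigabitEthernet1/0/10
   description server-facing port
   storm-control broadcast level 1.00
   storm-control multicast level 2.00
   ! send an SNMP trap instead of err-disabling the port
   storm-control action trap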

L2 DCI Caveats

Spanning Tree Problems

[Figure: a single STP domain stretched across the WAN, with the STP root bridge in Data Center A]

Spanning Tree Protocol must be used in most bridged environments to prevent loops
• Half of the DCI bandwidth is wasted
• Failures close to the root bridge affect both data centers
• Every DCI failure causes a topology change and massive flooding in DC B

L2 DCI Caveats – Spanning Tree

STP Solution#1: MST Regions


• One MST region per DC
• Multiple MST instances in each DC
• VLANs mapped to non-default MST instances
• IST = Internal Spanning Tree, CIST = Common and Internal Spanning Tree

• Half of the DCI bandwidth is still wasted
• Failures in one DC do not propagate to the other DC (they stay hidden inside the MST region)
• DCI failures cause a CIST topology change, but not an MSTI topology change
• DCI failures affect only inter-DC traffic
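A minimal IOS-style sketch for the switches in one data center; the region name, revision, VLAN ranges and priority are assumptions, and the other DC would use a different region name so that the region boundary falls on the DCI link:

  spanning-tree mode mst
  !
  spanning-tree mst configuration
   name DC-A
   revision 1
   instance 1 vlan 100-199
   instance 2 vlan 200-299
  !
  ! make this switch the root for the local MST instances
  spanning-tree mst 1-2 priority 4096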

L2 DCI Caveats – Spanning Tree

STP Solution#2: Multi-Chassis Link Aggregation


• Parallel inter-DC links are bundled into a LAG
• STP is disabled on the inter-DC LAG bundle
• Independent STP instance in each DC

• Full DCI bandwidth can be used
• Intra-DC failures do not propagate across the DCI
• Partial DCI failure does not affect traffic forwarding
• Potential black hole without reliable link-failure detection (use aggressive UDLD)
• Applicable only to networks with two data centers

L2 DCI Caveats – Spanning Tree

STP Solution#3: Non-STP Bridging

Non-STP bridging is used in the DCI WAN core:
• VPLS, TRILL, 802.1aq or FabricPath
• STP is not used outside of the data centers
• Mandatory: a single forwarding device between each data center and the WAN for every VLAN

Layer-2 DCI

Layer-2 DCI non-IP Implementation Options


Dark fiber or DWDM / Ethernet pseudowires
• Classic Ethernet bridging: disable spanning tree across the WAN core, use multi-chassis link aggregation for redundancy
• TRILL, 802.1aq or FabricPath

SONET/SDH
• As above (on emulated Ethernet)
• Bridging over PPP
• VPLS over MPLS over PPP

VPLS
• MSTP with regions (assuming STP is available with the VPLS service)
• Customer-side VPLS-over-MPLS across the SP VPLS service
• TRILL
• Any IP-based implementation

L2 DCI Technology Options

Multi-chassis LAG with DWDM or Pseudowires

[Figure: Data Center A and Data Center B connected with two pseudowires across the service provider network, bundled into a multi-chassis LAG]

• Each pseudowire is a point-to-point Ethernet link
• Best practice: merge the links with VSS/vPC and disable STP between locations
• Warning: does not work with more than two locations
• Warning: Q-in-Q in the SP network requires unique customer-side MAC addresses
• Caveats: check pseudowire transparency, MTU size and duplicate MAC address handling
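A rough NX-OS-style sketch of the vPC (MLAG) side of this design; the domain and interface numbers are assumptions, and the vPC peer-link configuration is omitted:

  feature vpc
  feature lacp
  !
  vpc domain 1
   peer-keepalive destination 10.0.0.2 source 10.0.0.1
  !
  ! DCI port channel bundling the two pseudowires
  interface port-channel10
   switchport mode trunk
   vpc 10
   ! keep STP from running across the DCI
   spanning-tree bpdufilter enable
  !
  interface Ethernet1/1
   description pseudowire toward the remote DC
   switchport mode trunk
   channel-group 10 mode active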

L2 DCI Technology Options

TRILL/SPB with DWDM or Pseudowires


• Non-STP bridging (TRILL/SPB) is used in the WAN core
• Any core topology can be used
• Platforms: Nexus 7000 with FabricPath

Caveats:
• Per-VLAN dedicated forwarder (exception: vPC+ with FabricPath on Nexus 7000)
• Check the integration with L3 forwarding (keep L3 separate)
• Check pseudowire transparency and MTU size
• SP pseudowire aggregation technology (Q-in-Q or MAC-in-MAC) might interfere with SPB

L2 DCI Technology Options

Using VPLS Services

[Figure: Data Center A and Data Center B connected through a service provider VPLS service]

VPLS service behaves like a switched LAN
• Not transparent
• LACP (LAG) is no longer an option
• The VPLS service might not offer STP

Redundant designs
• End-to-end STP (dangerous)
• Customer-side VPLS-over-MPLS across the emulated LAN
• TRILL

L2 DCI Technology Options

Customer-side VPLS over Provider VPLS


[Figure: the DC edge switches build their own VPLS instance on top of the provider VPLS service, which is treated as an IP subnet running MPLS/LDP]

Use the provider VPLS service as an IP subnet
• Run MPLS+LDP between the DC edge switches
• Native IP transport for the L3 interconnect
• VPLS transport for the L2 interconnect
• Platform: Cisco 6500/7600

L2 DCI – VPLS overview

Introduction to VPLS
Prerequisites: IP or IP+MPLS core

Concept:
• A full mesh of pseudowires between PE-routers emulates a LAN
• Separate full mesh per VLAN

Transport: AToM or L2TPv3
Signaling: directed LDP or BGP
Autodiscovery: BGP
Scalability: H-VPLS (full mesh of trees + Q-in-Q)
Loop prevention: split-horizon bridging (no STP)
IP-only backbone: L2TPv3 or MPLS-over-GRE
Redundancy: ICCP (future)
Advanced scenarios: partial mesh for partial connectivity, per-PW STP

L2 DCI – VPLS overview

VPLS in L2 DCI Scenarios


Platform support:
• Catalyst 6500/Cisco 7600 only
• ES or SIP linecard on the WAN side
• No MPLS support in NX-OS

Core transport options:
• Manual configuration or BGP-based autodiscovery
• MPLS or MPLS-over-GRE (no L2TPv3)
• LDP-based signaling

Redundancy solutions:
• Spanning tree on “loopback” pseudowires
• EEM-based loopback interface tracking
• A-VPLS with VSS

L2 DCI – VPLS overview

Redundant VPLS Designs


Challenge:
• A redundant L2 DCI design without end-to-end STP requires a per-VLAN dedicated forwarder
• VPLS does not provide this functionality

Solutions:
• Run STP on a pseudowire between the redundant switches
• Enable/disable pseudowires on the backup device based on reachability of a loopback interface on the primary device (EEM-based)
• VSS (A-VPLS)

L2 DCI – VPLS overview

What is A-VPLS
A-VPLS = VPLS + Cisco enhancements

VSS support
• Solves the redundant design issues
• VSS appears as a single PE-device
• NSF/SSO for pseudowires

Enhanced load balancing
• Multiple LSPs/GRE tunnels per pseudowire
• Port-channel-like load balancing between parallel LSPs
• Extra “flow” label enables intra-LSP load balancing in the MPLS core

Reduced configuration complexity
• interface virtual-ethernet behaves like a trunking LAN interface
• Parallel per-VLAN pseudowires are established automatically

L2 DCI – VPLS overview

WAN Core Load Balancing with A-VPLS


Load balancing in IP-only cores
• Based on IP source and destination addresses
• No load balancing on the contents of GRE packets
• Multiple LSPs over parallel GRE tunnels

Load balancing in MPLS cores
• Traffic toward the same IP destination (LDP-based LSP) is load-balanced based on the bottom MPLS label
• The bottom MPLS label in A-VPLS is a flow label (hash of the L2/L3/L4 payload)
• Payload of a single LSP can therefore be load-balanced

Label stack: L2 header | LDP label | PW label | flow label | L2 payload

Layer-2 DCI

Layer-2 DCI – IP-based Implementations


L3 MPLS/VPN with Carrier’s Carrier support
• Customer-side VPLS or A-VPLS

Generic IP connectivity
• Pseudowires: EoMPLS (AToM) over MPLS over GRE, or L2TPv3
• VPLS or A-VPLS with MPLS-over-GRE
• OTV (Cisco)
• EtherIP (F5)
• MAC VPN (Juniper)

Layer-2 DCI – IP core

EoMPLSoGRE (or L2TPv3)


[Figure: a port channel from Data Center A to Data Center B is carried as an Ethernet pseudowire (AToM) over MPLS over GRE; the core IP backbone simply forwards the GRE payload]

Layering: Ethernet pseudowire over AToM, MPLS over GRE, IP forwarding of the GRE payload
Platforms: Nexus 7000 + ASR
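A minimal IOS-style sketch of the pseudowire side: MPLS runs over a GRE tunnel across the IP core, and the DC-facing port is placed into a port-mode EoMPLS (AToM) pseudowire; the addresses, VC ID and interface names are assumptions:

  interface Tunnel0
   ip address 192.168.255.1 255.255.255.252
   mpls ip
   tunnel source Loopback0
   tunnel destination 203.0.113.2
  !
  ! port-mode Ethernet pseudowire toward the remote PE loopback (VC ID 100)
  interface GigabitEthernet0/1
   xconnect 10.255.255.2 100 encapsulation mpls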

Layer-2 DCI – IP core

VPLS over GRE

[Figure: VPLS over MPLS over GRE between the DC edge switches of Data Center A and Data Center B across the IP core]

• MPLS over GRE is configured on the edge L3 switches
• A VPLS Virtual Forwarding Instance (VFI) is created on these switches
• VLANs are bridged through the VFI instance or a virtual-ethernet interface
• Platform: Cisco 6500 with SIP-400 or ES+ linecards
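A minimal Catalyst 6500-style sketch of the VFI for one extended VLAN, assuming the MPLS-over-GRE tunnel from the previous example already provides label-switched reachability to the remote PE loopback; the names, VLAN and VPN IDs are assumptions:

  l2 vfi DCI-VLAN100 manual
   vpn id 100
   neighbor 10.255.255.2 encapsulation mpls
  !
  ! bridge the extended VLAN into the VFI
  interface Vlan100
   xconnect vfi DCI-VLAN100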

Layer-2 DCI – IP core

Overlay Transport Virtualization


[Figure: OTV edge devices in Data Center A and Data Center B bridge selected VLANs across the IP core]

• Bridging directly over IP (no MPLS or pseudowire glue)
• Control-plane MAC address reachability exchange with IS-IS
• Flooding with IP multicast (if available)
• STP is limited to individual sites (the OTV cloud provides global split horizon)
• Automatic multihoming (per-VLAN designated forwarding device)

Layer-2 DCI – IP core

OTV Terminology
Edge device: device performing the Ethernet-to-IP encapsulation
Internal interface: DC-facing interface on the edge device
• Regular L2 interface
Join interface: WAN-facing uplink interface on the edge device
• Routed interface
• The edge device is an IP host on the core
Overlay interface: virtual interface carrying the OTV configuration
• Logical multi-access, multicast-capable interface
• No spanning tree on the overlay interface
ARP ND cache: ARP snooping reduces inter-site ARP traffic
Site VLAN: VLAN used for edge device discovery
• Must be configured on the internal interface(s)
Authoritative Edge Device (AED)
• Edge device performing internal-to-overlay forwarding for a VLAN
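A minimal NX-OS-style OTV sketch mapping the terminology above to configuration (multicast mode); the site VLAN, interface numbers, multicast groups and VLAN range are assumptions:

  feature otv
  !
  ! site VLAN used for local edge-device discovery
  otv site-vlan 99
  !
  interface Overlay1
   ! join interface: routed uplink toward the IP core
   otv join-interface Ethernet1/1
   ! multicast groups for the OTV control plane and for flooded data
   otv control-group 239.1.1.1
   otv data-group 232.1.1.0/28
   ! VLANs extended across the overlay
   otv extend-vlan 100-150
   no shutdown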

Layer-2 DCI – IP core

OTV Benefits and Limitations


Benefits
• Pure MAC-over-IP transport
• Extremely simple configuration
• Leverages IP multicast for neighbor discovery and flooding
• Automatic AED (Authoritative Edge Device) election
• Built-in multihoming
• Proactive MAC address reachability propagation with IS-IS
• No unknown unicast flooding
• Load balancing with vPC+
• ARP ND cache (ARP reply snooping)

Drawbacks (NX-OS release 5.1)
• Requires IP multicast (unicast mode in NX-OS 5.2)
• Must use physical IP interfaces (loopbacks in a future release)
• Redundancy only through port channels
• No interoperability with SVIs
• Separate VDC recommended for OTV

Maximum limits (NX-OS release 5.1)
• 3 sites, 6 edge devices (2 per site)

Layer-2 DCI – IP core

BGP MPLS Based MAC VPN: Overview


MAC VPN (draft-raggarwa-mac-vpn) = VPLS done right
(developed by Juniper and Alcatel-Lucent, not yet available)

True Layer-2 MPLS/VPN
• RD identifies unique MAC-VPN instances
• Layer-2 VPNs built with route targets
• VLAN and MAC address learning with BGP
• Automatic assignment of VPN labels with BGP
• No pseudowires; MPLS forwarding toward the BGP next hop
• Flooding via ingress replication or P2MP LSPs
• Fast MAC moves supported by BGP withdrawals

Solves most L2 DCI issues
• Automatic per-VLAN Designated Forwarder election with split-horizon switching
• Flooding of unknown unicasts is optional
• Support for MLAG via the Ethernet Segment Identifier
• Automatic load balancing toward MLAG members
Layer-2 DCI – IP core

EtherIP Between Load Balancers (F5)


Load-balancer-based vMotion solution
• Redundant LB pair in each DC
• EtherIP tunnel between the load balancers (the active LB member forwards the traffic)
• vMotion and Storage vMotion traffic is routed, VM traffic is bridged
• Load balancing based on VM proximity (each LB prefers the local VM)
• Perfect integration with LB functionality

EtherIP caveats:
• No STP across the EtherIP tunnel
• The network design must ensure a loop-free topology

Conclusions
DCI is primarily a design problem
• Start with application architecture and business requirements
• Prefer L3 DCI (consider path isolation) and heavy use of local and
global load balancing
• Try to avoid live VM migration between data centers

Layer-3 DCI
• Use separate IP prefix for each data center
• Use DNS-based load balancing for application migration
• Use Multi-VRF (simple) or MPLS/VPN (scalable) for path isolation

Conclusions
Layer-2 DCI has numerous challenges
• Split subnets and split clusters after DCI failure
• Traffic trombones
• Broadcast storms
• Spanning tree issues (avoid long-distance STP)

Technology options for layer-2 DCI


• Point-to-point links or pseudowires with MLAG or TRILL/SPB
• VPLS across MPLS or IP core (includes SP VPLS service)
• BGP MPLS Based MAC VPN (future)
• Overlay Transport Virtualization (OTV – Cisco)
• EtherIP (F5)

Questions?

