
BGP Scaling Strategies and Solutions

This document discusses solutions for scaling BGP in large networks. It describes challenges to scaling from a growing number of prefixes, more services using BGP, and larger networks. Memory utilization is a key scaling issue, which can be addressed through route aggregation and filtering; soft reconfiguration inbound increases memory use and should be enabled only when needed. Full iBGP meshes do not scale and are replaced by confederations or route reflectors to reduce the number of iBGP sessions.


Scaling BGP

Luc De Ghein – Technical Leader Services


BRKRST-3321
Agenda
• Introduction

• Goal

• Scale Challenges

• Memory Utilization

• Full mesh iBGP

• Update Groups

• Slow Peer

• RR Problems & Solutions

• Deployment

• Multi-Session

• MPLS VPN

• OS Enhancements

• Conclusion

4
Goal of This Session
• Present causes of scale challenges
• Present solutions for scaling
• This session gives no absolute scaling numbers
• What you can control the most:
• Buy a bigger box
• Design the network properly

5
Scale Challenges
• BGP is robust, simple, and well-known
• We need to overcome the following:
• Newer services: new AFs
• More prefixes
• Larger scale: more (BGP) routers
• More multipath
• More resilience
• PIC Edge and Best External lead to more prefixes/paths
6
Growth of BGP

(chart: growth from 1994 to 2015 — the BGP IPv4 table grows from ~100k to ~600k prefixes, BGP ASNs from ~10k to ~50k, and the BGP IPv6 table reaches ~25k prefixes; the Internet sees ~6 updates per second)

Source: [Link]
7
More Services Using BGP

(timeline, 1990–2015, of services carried in BGP)
• Inter-domain routing: IPv4 IDR (1990), IPv6 IDR, IPv4 enterprise
• VPN services: MPLS VPN, 6PE, 6VPE, Inter-AS MPLS VPN, DMVPN, Unified MPLS
• Multicast: multicast, mVPN, mVPN Auto Discovery, C-signalling
• Layer 2 signalling: BGP PW Signalling (VPLS), BGP MAC Signaling (EVPN)
• Path diversity / convergence: BGP FC, PIC, AIGP
• Protocol services: BGP flowspec, BGP monitoring protocol, BGP-LS
8
Service Address Families (For Your Reference)

• IPv4: ipv4 unicast, ipv4 multicast, ipv4 MVPN, ipv4 MDT, ipv4 tunnel
• IPv6: ipv6 unicast, ipv6 multicast, ipv6 MVPN
• VPN: vpnv4 unicast, vpnv4 multicast, vpnv6 unicast, vpnv6 multicast
• Layer 2: l2vpn vpls, l2vpn evpn, l2vpn mspw
• Flowspec: ipv4 flowspec, ipv6 flowspec, vpnv4 flowspec, vpnv6 flowspec
• Other: nsap unicast, rtfilter unicast, linkstate
9
Memory Utilization

10
High Memory Utilization - Solutions

Memory is consumed by prefixes, paths (possibly many per prefix), and attributes (possibly many per path):
• # Prefixes: aggregate, filter prefixes, carry only a partial routing table
• # Paths: reduce the number of BGP peers, do not use an iBGP full mesh
• # Attributes: filter (extended) communities, limit your own attributes
11
High Memory Utilization

Route refresh:
• The inbound filter is applied as updates arrive; filtered prefixes are dropped
• Support is needed on the peer, but route refresh is a very old, widely supported feature
• Changed filter: the router sends a route refresh request to the peer to get the full table from the peer again

Soft reconfiguration inbound:
• A pre-filter copy of the BGP table is kept; filtered prefixes are stored, so much more memory is used
• Support is needed only on the router itself
• Changed filter: the router re-applies the policy to the stored, unfiltered table

Recommendation: use soft reconfiguration inbound only on an eBGP peering when you need to know what the peer previously advertised, including what you filtered out
12
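The recommendation above can be sketched as follows; the neighbor address and route-map name are hypothetical:

```
router bgp 10
 address-family ipv4
  ! keep a pre-filter copy of what this eBGP peer sends (costs memory)
  neighbor 192.0.2.1 soft-reconfiguration inbound
  neighbor 192.0.2.1 route-map FROM-PEER in
```

After a policy change, `clear ip bgp 192.0.2.1 soft in` re-applies the policy to the stored table without tearing down the session, and `show ip bgp neighbors 192.0.2.1 received-routes` shows what the peer sent before filtering.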
Full Mesh iBGP

13
Is Full Mesh iBGP Scalable?
• Per the BGP standard, iBGP needs to be a full mesh
• Total iBGP sessions = n * (n-1) / 2 — for example, 100 iBGP speakers already require 4,950 sessions
• Sessions per BGP speaker = n - 1

• Two solutions
1. Confederations
2. Route reflectors
14
Confederations

(figure: AS 100 divided into subAS 65001–65004; full mesh iBGP inside each subAS, confed eBGP between the subASes, regular eBGP to the outside)

• Create a number of subASes inside the larger confederation
• The confederation AS looks like a normal AS to the outside
• Full mesh iBGP is still needed inside each subAS
• No full mesh is needed between subASes (it's eBGP)
• Every BGP peer needs to be in a subAS
• Each subAS can run a different IGP, with next-hop-self at the subAS border
• Flexible confed eBGP peerings
• No connectivity is needed between all subASes, and a full mesh between subASes is not needed
• Trade-off: redundancy needed vs. increased memory/CPU
15
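A minimal confederation configuration on a router in subAS 65001 might look like this sketch (AS numbers taken from the figure; the neighbor address is hypothetical):

```
router bgp 65001
 ! AS number presented to the outside world
 bgp confederation identifier 100
 ! sub-ASes that are members of the same confederation
 bgp confederation peers 65002 65003 65004
 ! confed eBGP session towards a router in subAS 65002
 neighbor 192.0.2.2 remote-as 65002
```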
Route Reflectors
• A route reflector is an iBGP speaker that reflects routes learned from iBGP peers to other iBGP peers, called RR clients
• The iBGP full mesh is turned into hub-and-spoke
• The RR is the hub in the hub-and-spoke design

RR clients are regular iBGP peers (client-to-client reflection on):

router bgp 65002
 neighbor R1 route-reflector-client
 neighbor R2 route-reflector-client

OR, when the RR clients are interconnected, client-to-client reflection can be turned off:

router bgp 65002
 no bgp client-to-client reflection
 neighbor R1 route-reflector-client
 neighbor R2 route-reflector-client
16
Route Reflector - What's Possible?

(figure: AS 100 with two clusters, each an RR with clients R1–R4; R5 is a non-client; eBGP peerings to AS 101)

• Any router can peer: clients, non-clients, and eBGP neighbors can attach anywhere in the topology
17
Route Reflector - Cluster

(figure: non-client R5 at the top, two clusters each with two RRs, clients R1–R4 below)

• Redundancy is needed: a minimum of 2 RRs per cluster
• Full mesh between the RRs (and non-clients)
• Cluster = an RR and its clients; clusters should be kept small
• A client can peer with RRs in multiple clusters
18
Hierarchical Route Reflectors
• Chain RRs to keep the full mesh between RRs and non-clients small
• Make RRs clients of other RRs: such an RR is an RR and an RR client at the same time

• The iBGP topology should follow the physical topology
• This prevents suboptimal routing, blackholing, and routing loops

• RRs in the top tier need to be fully meshed
• There is no limit to the number of tiers

(figure: three tiers of RRs; Tier 1 is fully meshed, Tier 2 RRs are clients of Tier 1, Tier 3 RRs are clients of Tier 2)
19
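On a Tier-2 RR the chaining is plain configuration: it is a client on the Tier-1 RR and an RR for its own clients. A sketch with hypothetical loopback addresses:

```
! on the Tier-1 RR: the Tier-2 RR is just another client
router bgp 65002
 neighbor 10.0.0.2 route-reflector-client
!
! on the Tier-2 RR: ordinary iBGP up to Tier 1, reflection down to its clients
router bgp 65002
 neighbor 10.0.0.1 remote-as 65002
 neighbor 10.0.0.3 route-reflector-client
```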
Route Reflector – Same Cluster-ID or Not?

(figure: RR1 and RR2 in one set, both peering with clients RRC1 and RRC2)

• If RR1 and RR2 have the same cluster-ID, RR1 keeps only one path for routes from RRC2 (reflected paths carrying its own cluster-ID are rejected)
• If one RR-to-client link fails, the iBGP session remains up: it runs between loopback IP addresses
• If the cluster-IDs are different:
  – RR1 also stores the path reflected by RR2
  – RR1 uses additional CPU and memory, potentially for many routes

• So, same or different cluster-ID for RRs in one set?
  – Different cluster-ID: additional memory and processor overhead on the RR
  – Same cluster-ID: fewer redundant paths
20
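The choice is configured with the cluster-ID, set globally per RR; a sketch, with a hypothetical cluster-ID value:

```
! identical on RR1 and RR2 for the "same cluster-ID" option;
! omitted (defaults to the router ID) or different for the other option
router bgp 65002
 bgp cluster-id 0.0.0.1
```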
Picking RRs - How many? Where? Which kind?

• Redundancy: sets of two (primary + backup); to scale, deploy sets per service or per group of address families
• Location: geo / datacenter / region
• Dedicated RR: no forwarding (no FIB), only RIB and BGP/IGP; size resources (memory, CPU) to the services needed (e.g. 7200, ASR1K)
• Virtual RR (e.g. CSR1000V, ASR9Kv vRR):
  • Mobility and manageability
  • Same BGP implementation and software version as deployed on the edge (XE/XR)
  • Reduced physical footprint (power/cooling/cabling)
  • Performance (multi-core) / memory (64-bit)
21
BGP RR Scale - Selective RIB Download
• Blocks some or all of the BGP prefixes from the RIB (and FIB)
• For AFs IPv4/IPv6; not needed for AFs vpnv4/vpnv6
• Only for an RR which is not in the forwarding path
• Benefit: saves memory and CPU; ASR testing indicated 300% RR-client session scaling (in the order of 1000s)
• Implemented as a filter extension to the table-map command

configuration (IOS):
router bgp 1
 address-family ipv4
  table-map block-into-fib filter
route-map block-into-fib deny 10

configuration (IOS-XR):
route-policy block-into-fib
 if destination in (...) then
  drop
 else
  pass
 endif
end-policy
router bgp 1
 address-family ipv4 unicast
  table-policy block-into-fib

Verification: no BGP prefixes in the RIB or FIB —
RR1#show ip route bgp
RR1#
RR1#show ip cef
RR1#
22
Multi-Cluster ID

configuration:
router bgp 1
 no bgp client-to-client reflection intra-cluster cluster-id [Link]
 no bgp client-to-client reflection intra-cluster cluster-id [Link]

(figure: RR1 with clients PE1/PE2 in cluster ID 1 and PE3/PE4 in cluster ID 2, intra-cluster reflection disabled in both)

• An RR can belong to multiple clusters
• On the iBGP neighbors of the RR, cluster IDs are set on a per-neighbor basis; the global cluster ID is still there
• Intra-cluster client-to-client reflection can be disabled when the clients are meshed, for all clusters or per cluster
• More work (sending more updates) for the RR clients; less work (sending fewer updates) for the RRs
• Each set of peers in a cluster ID has its own update group
• The loop-prevention mechanism is modified to take multiple cluster IDs into account
23
Comparing Confederation vs. RR (For Your Reference)

(Confederation | Route Reflector)
• Loop prevention: AS Confederation Set | Originator/Cluster ID
• Break up a single AS: subASes (possibly with separate IGPs) | Clusters
• Redundancy: multiple connections between subASes | client connects to several reflectors
• External connections: anywhere in the network | anywhere in the network
• Multilevel hierarchy: RRs within subASes | clusters within clusters
• Policy control: along the outside borders and between subASes (dampening possible on confed eBGP) | along the outside border
• Scalability: medium, still requires full iBGP within each subAS | very high
• Migration: very difficult (impossible in some situations), but easy when merging two companies | moderately easy
• Deployment of (new) features: decentralized | central on RR

Note: route reflectors can exist inside a subAS
24
BGP Route Server

• Alternative to an eBGP full mesh
• Used by IX (Internet eXchange) providers
• Operational simplicity
• Reduces CPU/memory/configuration
• Context policy can be used per client

configuration:
router bgp 999
 route-server-context rs-context
  address-family ipv4 unicast
   import-map rs-import-map
 !
 neighbor [Link] remote-as 100
 !
 address-family ipv4
  neighbor [Link] route-server-client context rs-context
!
ip as-path access-list 100 permit ^200$
!
route-map rs-import-map permit 10
 match as-path 100

(figure: left, an eBGP full mesh between R1–R6 in AS 100–600; right, all routers peer only with route server R5 in AS 999)
• The route server AS is transparent (clients configure no bgp enforce-first-as) and the next hop is preserved

25
Update Groups

26
Grouping of BGP Neighbors: Optimization

Configuration/administration:
• peer groups
• templates: session-groups, af-groups, neighbor-groups
• CLI convenience only

Performance/scalability: update groups
• Dynamic grouping of BGP peers according to common outbound policy
• BGP neighbors with the same outbound policy are put in the same update group, regardless of whether peer-groups are defined, templates are defined, or the neighbors are individually defined
• BGP formats each update message once and then replicates it to all members of the update group: replication instead of per-neighbor formatting = efficiency
• Prefixes that share the same best-path attributes are grouped into the same message, improving packing efficiency
• Dynamic = when policy changes, update group membership changes
• AF independent: a peer can belong to different update groups in different address families
27
Update Group on RR
• Update groups are very useful on all BGP speakers
– but mostly on RRs, due to
• the number of peers
• equal outbound policy

• iBGP typically has no outbound policy
– so RRs have a large number of iBGP peers in one update group

RR#show bgp ipv4 unicast update-group 2


BGP version 4 update-group 2, internal, Address Family: IPv4 Unicast
BGP Update version : 3201/0, messages 0
Route-Reflector Client RR
Topology: global, highest version: 3201, tail marker: 3201
Format state: Current working (OK, last not in list)
Refresh blocked (not in list, last not in list)
Update messages formatted 2013, replicated 24210, current 0, refresh 0,limit 2000

Number of NLRIs in the update sent: max 812, min 0


Minimum time between advertisement runs is 0 seconds
Has 101 members:
[Link] [Link] [Link] [Link]
[Link] [Link] [Link] [Link]
...

29
Update Group Replication (IOS)

(figure: the RR formats one BGP update and replicates it n times to all peers sharing the same outbound policy)

RR#show ip bgp replication 2
                                        Current  Next
Index Members Leader MsgFmt MsgRepl Csize  Version  Version
2     101     [Link] 2013   24210   0/2000 3201/0

• Index: update group 2 • Members: total number of members • MsgFmt: number of messages formatted according to the leader's policy • MsgRepl: number of replications • Csize: size of the cache
30
Adaptive Message Cache Size (IOS)
• Cache = the place to store formatted BGP messages before they are sent
• The update message cache size throttles update groups during update generation and controls transient memory usage
• The cache size is now adaptive
• The queue depth varies over time from 100 to 5000, based on:
  • the number of peers in the update group
  • the installed system memory
  • the type of address family
  • the type of peers in the update group
• Benefits
  • Update groups with a large number of peers get a larger update cache
  • Routers with more system memory get an appropriately bigger cache and can queue more update messages
  • vpnv4 iBGP update groups have a larger cache size
  • The old cache sizing scheme could not take advantage of the expanded memory available on newer platforms
• Results in faster convergence

31
Parallel Processing of Route-refresh (and New Peers)

• IOS serves all peers for which a route-refresh is needed at once, via a refresh update group that re-announces the table (from version 0 up to version X), while the original update group and its members continue to advertise transient updates without getting blocked
• The refreshing peers are tracked: a copy of each neighbor instance that needs to re-announce the table is maintained
• Tracking uses S, P, and E flags:
  – E flag: refresh end marker; the peer is scheduled for or participating in a refresh
  – S flag: refresh has started; if not present, the peer is waiting for its refresh to start, either because another refresh for the same group is in progress or because the net prepend is not yet complete
  – P flag: refresh is paused, waiting for other refresh members that started later to catch up

• IOS-XR: refresh sub-group

32
Update Groups in IOS XR

The hierarchy is: update group → sub-groups → filter groups (one per set of RTs) → neighbors; refresh sub-groups handle route-refresh.

RP/0/6/CPU0:router#show bgp vpnv4 unicast update-group

Update group for VPNv4 Unicast, index 0.2:
  Attributes:
   Internal
   Common admin
   First neighbor AS: 1
   Send communities
   Send extended communities
   Route Reflector Client
   4-byte AS capable
   Send AIGP
   Minimum advertisement interval: 0 secs
  Update group desynchronized: 0
  Sub-groups merged: 5
  Number of refresh subgroups: 0
  Messages formatted: 36, replicated: 68
  All neighbors are assigned to sub-group(s)
    Neighbors in sub-group: 0.2, Filter-Groups num:3
     Neighbors in filter-group: 0.3(RT num: 3)
      [Link]
     Neighbors in filter-group: 0.1(RT num: 3)
      [Link]
     Neighbors in filter-group: 0.2(RT num: 3)
      [Link]
33
Slow Peer

34
Slow Peer (IOS)

• A slow peer is a peer that cannot keep up with the rate at which update messages are generated, over a prolonged period of time (order of minutes)
• When the message cache fills up, one slow peer blocks all peers in the update group
• Possible causes: high CPU on the peer; transport issues (packet loss, loaded links, TCP)

• Phases:
  • detection phase: track the peer queue
  • protection phase: move the peer to a "slow" update group
  • recovery phase: the slow update group is no longer slow and the peer is moved back

%BGP-5-SLOWPEER_DETECT: Neighbor IPv4 Unicast [Link] has been detected as a slow peer
%BGP-5-SLOWPEER_RECOVER: Slow peer IPv4 Unicast [Link] has recovered

• This allows fast and slow peers to proceed at their own speed, preserving the convergence speed of the healthy update group
35
Slow Peer CLI

• detection: configurable per AF, per VRF, per peer, and per peer policy template
• protection:
  • static: per AF, per peer(-group), per peer policy template
  • dynamic: per VRF, per peer, per peer policy template
  • optional keyword permanent = the peer is not moved back automatically to the original update group
• show commands: show bgp ... slow
• clear commands: clear bgp ... slow — a forced clear of the slow-peer status; the peer is moved back to the original update group
36
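A sketch of the CLI above on IOS; the threshold value and neighbor address are hypothetical:

```
router bgp 1
 address-family ipv4
  ! detection: declare a peer slow after its queue lags for 360 seconds
  bgp slow-peer detection threshold 360
  ! dynamic protection: move detected slow peers to their own update group
  bgp slow-peer split-update-group dynamic
  ! or per neighbor, with "permanent" so the peer is not moved back
  neighbor 10.1.1.1 slow-peer detection threshold 360
  neighbor 10.1.1.1 slow-peer split-update-group dynamic permanent
```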
Old Slow Peer Solution
Solution before this feature: manual movement

• Create a different outbound policy for the slow peer

• The policy must be different from any other peer's policy

• You do not want the slow peer to move into another already existing update group

• Use something that does not affect the actual policy

• For example: change the minimum advertisement interval (MRAI) of the peer (under the AF)
• Also avoid anything that would trigger a full update (the equivalent of a route-refresh)

router bgp 1
address-family vpnv4
neighbor [Link] advertisement-interval 1

37
Slow Peer Mechanism Details (For Your Reference)
• Identifying a slow peer

RR#show bgp ipv4 unicast update-group 1 summary
Summary for Update-group 1, Address Family IPv4 Unicast
BGP router identifier [Link], local AS number 1
BGP table version is 500001, main routing table version 500001
100000 network entries using 14400000 bytes of memory
BGP using 24373520 total bytes of memory
BGP activity 115574/15574 prefixes, 300000/200000 paths, scan interval 60 secs

Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
[Link] 4 1 1257 67368 402061 0 2000 [Link] 0
[Link] 4 1 1219 23362 402061 0 0 [Link] 0
[Link] 4 1 1257 23398 402061 0 0 [Link] 0
[Link] 4 1 10002 1891 402061 0 0 [Link] 100000

• Convergence is achieved when all peers are at the table version
• A persistently non-empty output queue (OutQ 2000 for the first neighbor above) identifies the slow peer
38
Slow Peer Mechanism Details For Your
Reference

• Identifying Slow Peer


RR#show bgp ipv4 unicast replication 1
Current Next
Index Members Leader MsgFmt MsgRepl Csize Version Version
1 4 [Link] 78114 144314 2000/2000 402061/500001

RR#show bgp ipv4 unicast update-group 1


BGP version 4 update-group 1, internal, Address Family: IPv4 Unicast
BGP Update version : 402061/500001, messages 2000
Topology: global, highest version: 500001, tail marker: 402061 (pending)
Format state: Current blocked (no message space, last no message space)
Refresh blocked (not in list, last not in list)
Update messages formatted 78115, replicated 144318, current 2000, refresh 0, limit 2000
Minimum time between advertisement runs is 0 seconds
Has 4 members:
[Link] [Link] [Link] [Link]

39
RR Problems & Solutions

40
Best Path Selection & Route Advertisement on RR

(figure: CE1 advertises prefix Z to PE1 and PE2; the RR has Path 1 (NH PE1, best) and Path 2 (NH PE2), but reflects only the best path, so ingress PE3 does not learn the 2nd path)

• The BGP4 protocol specifies the selection and propagation of a single best path for each prefix
• If RRs are used, only the best path is propagated from the RRs to the ingress BGP speakers
• Multipath on the RR does not solve the issue of the RR only sending the best path
• This behavior results in a number of disadvantages for newer applications and services
41
Why Having Multiple Paths?
• Convergence
• BGP Fast Convergence (multiple paths in local BGP table)
• BGP PIC Edge (backup paths ready in forwarding plane)

• Multipath load balancing


• ECMP

• Allow hot potato routing

• = use the optimal exit; the optimal route is not always known on the border routers

• Prevent oscillation
• The additional info on backup paths leads to local recovery, as opposed to relying on iBGP
• Stop persistent route oscillations caused by the comparison of paths based on MED in topologies where route reflectors or the confederation structure hide some paths

42
Diverse BGP Path Distribution
Overview

• VPN unique RD (Route Distinguisher)


• BGP Best External
• BGP shadow RR / session
• BGP Add-Path

43
Unique RD for MPLS VPN

(figure: CE1's prefix Z is advertised by PE1 as Z/RD1 and by PE2 as Z/RD2; RR1 reflects both Z/RD1 (NH PE1) and Z/RD2 (NH PE2) to PE3)

• Unique RD per VRF per PE
• One IPv4 prefix in one VRF becomes a unique vpnv4 prefix per VPN per PE
• The RR advertises all paths
• Available since the beginning of MPLS VPN, but only for MPLS VPN
44
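A sketch of the unique-RD scheme: the same VRF gets a different RD on each PE, so the vpnv4 prefixes differ and the RR keeps both. The VRF name and RD values are hypothetical:

```
! on PE1
vrf definition CUST-A
 rd 1:1
!
! on PE2 — same VRF, different RD
vrf definition CUST-A
 rd 1:2
```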
Shadow Route Reflector (aka RR Topologies)

(figure: RR1 reflects the best path (NH PE1) to PE3; the shadow RR2 reflects the 2nd best path (NH PE2), so PE3 ends up with both paths)

configuration on the shadow RR (RR2):
router bgp 1
 address-family ipv4
  bgp additional-paths select backup
  neighbor [Link] advertise diverse-path backup

• Easy deployment: one additional "shadow" RR per cluster
• RR2 announces the 2nd best path, which differs from the primary best path on RR1 by next hop
45
Shadow Route Reflector

Note: the primary RRs do not need diverse-path code.

• RR and shadow RR co-located (same VLAN, equal IGP metric towards the prefix): both compute the same best path (NH PE1), and RR2 selects and advertises the 2nd best path (NH PE2). The primary and shadow RRs do not need to turn off the IGP metric check.

• RR and shadow RR not co-located (all links have the same IGP cost, but the distances differ): RR2 may compute a different best path (Path 2, NH PE2, best; Path 1, NH PE1, 2nd best), so the diverse path RR2 advertises is the same path RR1 already advertises. The primary and shadow RRs then need to turn the IGP metric check off, so that all RRs calculate the same best path and do not advertise the same path:

solution: RR(config-router-af)#bgp bestpath igp-metric ignore
46
Shadow Session

(figure: one RR with two iBGP sessions to RR client PE3; the second session carries the diverse-path command and announces the 2nd best path (NH PE2))

• Easy deployment: only the RR needs diverse-path code, plus one new iBGP session per extra path (CLI knob on the RR)
• The shadow iBGP session announces the 2nd best path
• A 2nd session between a pair of routers is no issue (use different loopback interfaces)
48
BGP PIC (Prefix Independent Convergence) Edge

Problem
• Convergence in a flat FIB is prefix dependent
• More prefixes -> more convergence time
• Classical convergence (flat FIB): routing protocols react, update the RIB, then update the CEF table for the affected prefixes
• Time is proportional to the number of prefixes

Solution - the idea of PIC
• In both SW and HW:
  • Pre-install a backup path in the RIB
  • Pre-install a backup path in the FIB
  • Pre-install a backup path in the LFIB

Result
• Improved convergence
• Reduced packet loss
• The same convergence time for all BGP prefixes (prefix independence)
49
MPLS VPN Dual-Homed CE - No PIC Edge

(figure: CE1 dual-homed to PE1 and PE2; ingress PE3 has Path 1 (NH PE1, best) and Path 2 (NH PE2))

Steps in convergence:
1. Egress PE (PE1) goes down
2. The IGP notifies the ingress PE in sub-second time

Steps in convergence on the ingress PE (time scales with the number of prefixes):
1. Ingress PE recomputes the BGP bestpath
2. Ingress PE installs the new BGP bestpath in the RIB
3. Ingress PE installs the new BGP bestpath in the FIB
4. Ingress PE reprograms the hardware
50
MPLS VPN Dual-Homed CE - PIC Edge

(figure: ingress PE3 has Path 1 (NH PE1, best) and Path 2 (NH PE2) pre-installed as a backup/repair path)

router bgp 1
 address-family vpnv4
  bgp additional-paths install

Steps in convergence:
1. Egress PE goes down
2. The IGP notifies the ingress PE in sub-second time

Steps in convergence on the ingress PE:
1. Switch to the repair path with the new next hop
2. Ingress PE reprograms the hardware

We eliminate the convergence dependence on:
• Scanning of the BGP table
• Bestpath calculation (there is a pre-computed backup/repair path)
• Time to generate and propagate updates (PE and RR)
• Updating the FIB (with PIC the FIB update is prefix independent; without PIC it scales with the number of prefixes)
51
No BGP Best External – Default BGP Policy

(figure: full mesh iBGP, all BGP policies default; CE1 advertises prefix Z to PE1 and PE2)

• PE1: Path 1: NH CE1, localpref 100, external, best
• PE2: Path 1: NH CE1, localpref 100, external, best
• PE3: Path 1: NH PE1, internal, localpref 100, best; Path 2: NH PE2, internal, localpref 100, backup/repair

With default policy, each egress PE selects its own external path as best and advertises it, so all paths are known in the network.
52
No BGP Best External - Changed BGP Policy

(figure: PE1 sets local preference 200 on the path from CE1)

• PE1: Path 1: NH CE1, localpref 200, external, best
• PE2: Path 1: NH CE1, localpref 100, external; Path 2: NH PE1, localpref 200, internal, best
• PE3: Path 1: NH PE1, internal, localpref 200, best — no backup/repair path

• Even with a full iBGP mesh, policy can prevent an egress PE from learning all paths
• If the default policy is changed, one egress PE can have the iBGP path via the other egress PE as its best path, rather than its own external BGP path, and then stops advertising its external path
53
BGP Best External - Changed BGP Policy

(figure: as before, PE1 sets local preference 200 on the path from CE1)

• PE1: Path 1: NH CE1, external, best; Path 2: NH PE2, localpref 100, internal, backup/repair
• PE2: Path 1: NH PE1, localpref 200, internal, best; Path 2: NH CE1, external, backup/repair, advertise-best-external
• PE3: Path 1: NH PE1, internal, localpref 200, best; Path 2: NH PE2, localpref 100, internal, backup/repair

• With Best External, the backup PE (PE2) still propagates its own best external path to the RRs or iBGP peers
• PE1 and PE3 learn 2 paths

router bgp 1
 address-family vpnv4
  bgp additional-paths install
  bgp additional-paths select best-external
  neighbor x.x.x.x advertise best-external
54
ADD Path

(figure: the RR sends both Path 1 (NH PE1, best) and Path 2 (NH PE2, best2) to PE3, which installs Path 2 as a backup/repair path)

configuration on the RR:
router bgp 1
 address-family ipv4
  bgp additional-paths select best 2
  bgp additional-paths send
  neighbor PE3 advertise additional-paths best 2

configuration on the PE:
router bgp 1
 address-family ipv4
  bgp additional-paths receive
  bgp additional-paths install

• PE routers need to run newer code in order to understand the second path
• A path identifier is used to distinguish the different paths
55
Add Path - Possibilities

add-all-path
• The RR does the first best path computation and then sends all paths to the border routers
• Pros: all paths are available on the border routers
• Cons: all paths are stored; more BGP info is exchanged
• Use case: ECMP, hot potato routing
• CLI: bgp additional-paths select all

add-n-path
• The RR does best path computation for up to n paths and sends n paths to the border routers (n is limited to 3 (IOS) or 2 (IOS-XR), to preserve CPU power)
• This is the only mandatory selection mode
• Pros: less storage used for paths; less BGP info exchanged
• Cons: more best path computation
• Use case: primary + n-1 backup scenario = fast convergence
• CLI: bgp additional-paths select best<N>

multipath (IOS-XR only)
• The RR does the first best path computation and then sends all multipaths to the border routers
• Use case: load balancing and primary + backup scenario
56
Add-Path - IOS-XR example config (For Your Reference)

• Path selection is configured in a route-policy
• A global command, per address family, turns on add-path in BGP
• Configuration in the VPNv4 AF applies to all VRF IPv4-unicast AF modes unless overridden at individual VRFs

router bgp 1
 address-family vpnv4
  additional-paths install backup   (deprecated)
  additional-paths advertise
  additional-paths receive
  additional-paths selection route-policy apx

example RPL configs:

route-policy ap1                         (add-n-path)
 if community matches-any (1:1) then
  set path-selection backup 1 install
 elseif destination in ([Link]/16, [Link]/16) then
  set path-selection backup 1 advertise install
 endif

route-policy ap2                         (add-all-path)
 set path-selection all advertise

route-policy ap3                         (multipath)
 set path-selection multipath advertise

route-policy ap4
 set path-selection backup 1 install multipath-protect advertise
 (multipath-protect is needed to have a non-multipath path as the backup path)
57
Hot Potato Routing - No RR
• Hot potato routing = packets are passed on (to the next AS) as soon as possible after being received
• The shortest path through the own AS must be used
• In a transit AS, the same prefix can be announced many times from many eBGP peers

(figure: prefix Z is learned over eBGP at PE1, PE2, and PE3; with a full mesh, PE4 has all three paths and selects Path 3 (NH PE3), its closest exit, as best)
58
Hot Potato Routing - With RR
• Introducing RRs breaks hot potato routing
• Solutions: unique RD for MPLS VPN, or Add Path

(figure: the RR selects Path 1 (NH PE1) as best and reflects only that path; PE4 now exits via PE1 instead of its closer exit PE3)
59
Hot Potato Routing in Large Transit SP

(figure: several regions, each with border routers (BR) in hub-and-spoke to a regional RR; the regional RRs are fully meshed)

• Large transit ISPs: full mesh iBGP between the regional RRs, hub/spoke between the local BRs and their RR
• The full mesh enables global hot potato routing
• add-all-path could be deployed between centralized and regional RRs
• Also possible: remove the need for regional RRs if all BR routers support add-path

BR = Border Router
60
Deployment

61
BGP Selective Download
• The access router RIB holds the full Internet routing table, but fewer routes are in the FIB
• Example platforms: ME switches, ASR900
• The FIB holds a default route and selected more-specific routes
• Enterprise CPE devices still receive the full Internet routes through their BGP peering with the access router(s)

(figure: the ISP ASBRs have full Internet routes in RIB and FIB; the access router has the full table in the BGP table/RIB but only default & filtered routes in the FIB; the enterprise CPE peers eBGP with the access router and receives the full table)

configuration:
router bgp 1
 address-family ipv4
  table-map filter-into-fib filter
route-map filter-into-fib deny 10
 match community 100
ip community-list 100 permit 65510:100
62
Path MTU Discovery (PMTUD)
• MSS (Max Segment Size) = the limit on the largest segment that can traverse a TCP session
• Anything larger must be fragmented & re-assembled at the TCP layer
• The MSS is 536 bytes by default for BGP without PMTUD
• Enable PMTUD for BGP with:
  • Older command: ip tcp path-mtu-discovery
  • Newer command: bgp transport path-mtu-discovery (PMTUD is now on by default)

• 536 bytes is inefficient for Ethernet (MTU of 1500) or POS (MTU of 4470) networks
• TCP is forced to break large segments into 536-byte chunks
• This adds overhead, slows BGP convergence, and reduces scalability

• The TCP MSS can also be set per neighbor (IOS-XR 5.4)
63
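A sketch of enabling (or re-enabling) PMTUD for all BGP sessions with the newer command; on recent IOS this is the default and is shown here only for illustration:

```
router bgp 1
 ! negotiate the largest usable segment size instead of the 536-byte default
 bgp transport path-mtu-discovery
```

`show ip bgp neighbors` then reports the negotiated maximum data segment size for each session, which should match the path MTU rather than 536 bytes.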
Session/Timers
• Timers = keepalive and holdtime
• The default is OK; the smallest is 3/9 seconds for keepalive/holdtime
• Scaling and small timers do not mix
• Use BFD instead: it is built for speed; when a failure occurs, BFD notifies its clients (OSPF, IS-IS, EIGRP, BGP) in tens of msecs

• Do not use Fast Session Deactivation (FSD) — neighbor x.x.x.x fall-over
– It tracks the route to the BGP peer
– A temporary loss of the IGP route will kill off the iBGP sessions; the IGP may not have a route to a peer for a split second, and FSD would tear down the BGP session
– Very dangerous for iBGP peers
– It is off by default
– Next Hop Tracking (NHT), enabled by default, does the job fine
64
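Fast failure detection is then a BFD job rather than a BGP timer job; a sketch with hypothetical interface, timers, and neighbor address:

```
interface GigabitEthernet0/0
 ! 50 ms hellos, declare the session down after 3 missed packets
 bfd interval 50 min_rx 50 multiplier 3
!
router bgp 1
 ! register BGP as a BFD client for this neighbor
 neighbor 192.0.2.1 fall-over bfd
```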
IOS
Dynamic Neighbors (IOS)
• Remote peers are defined by an IP address range
• Less configuration for defining neighbors
• The remote peers initiate the BGP session
• Use cases: enterprise networks (DMVPN, ...); iBGP and limited eBGP (a limited number of ASNs)

(figure: hub R1 accepts iBGP sessions from many DMVPN spokes and eBGP sessions from a range of remote peers)

configuration:
configuration
router bgp 1
bgp listen range [Link]/16 peer-group 192-16 1 n
bgp listen range [Link]/24 peer-group 10-24
bgp listen limit 1000
neighbor 10-24 peer-group
neighbor 10-24 remote-as 1
neighbor 192-16 peer-group
neighbor 192-16 remote-as 2 alternate-as 3 4 5 6 7
neighbor 192-16 ebgp-multihop 2
neighbor 192-16 update-source Loopback0

65
IOS-XR
BGP Attribute Download
• With "bgp attribute-download", BGP attributes (originating AS, communities, extended communities, AS-path) are downloaded to the RIB & FIB
• NetFlow is a consumer of this information
• Related show commands:
  show bgp process performance-statistics detail
  show bgp attribute-key
  show rib attributes summary
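A minimal IOS-XR sketch of enabling attribute download for one address family:

```
router bgp 1
 address-family ipv4 unicast
  ! download BGP attributes (AS-path, communities, ...) to RIB/FIB
  bgp attribute-download
```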
Multisession
IOS
Multisession
• BGP Multisession = multiple BGP (TCP) sessions between 2 BGP speakers, instead of a single session carrying all AFs
• Even if there is only one BGP neighbor statement defined between the BGP speakers in the configuration
• Introduced with Multi Topology Routing (MTR): one session per topology
• Now: possibility to have one session per AF/group of AFs
• Good for incremental deployment of AFs
  • Avoids a BGP reset
  • But multisession needs to be enabled beforehand
• Good for troubleshooting
• Good for issues when a BGP session resets
  • For example "malformed update"
• Not so good for scalability
• IOS only and not enabled by default
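Multisession is off by default; a minimal IOS sketch of turning it on for a peer (hypothetical neighbor address):

```
router bgp 1
 neighbor 192.0.2.1 remote-as 1
 ! establish one TCP session per AF/group of AFs with this peer
 neighbor 192.0.2.1 transport multi-session
```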
IOS
Multisession (For Your Reference)

The OPEN message carries the multisession capability:

BGP: [Link] passive rcvd OPEN w/ optional parameter type 2 (Capability) len 3
BGP: [Link] passive OPEN has CAPABILITY code: 131, length 1
BGP: [Link] passive OPEN has MULTISESSION capability, without grouping

With MTR: one session per topology:

R2#show bgp ipv4 unicast neighbors
BGP neighbor is [Link], remote AS 1, internal link
  BGP multisession with 3 sessions (3 established), first up for [Link]
  Neighbor sessions:
    3 active, is multisession capable
    Session: [Link] session 1
     Topology IPv4 Unicast
    Session: [Link] session 2
     Topology IPv4 Unicast voice
    Session: [Link] session 3
     Topology IPv4 Unicast video

Without MTR: one session per address family:

R2#show ip bgp neighbors [Link] | include session|address family
  BGP multisession with 3 sessions (3 established), first up for [Link]
  Neighbor sessions:
    3 active, is multisession capable
    Session: [Link] session 1
    Session: [Link] session 2
    Session: [Link] session 3
  Route refresh: advertised and received(new) on session 1, 2, 3
  Multisession Capability: advertised and received
  For address family: IPv4 Unicast
    Session: [Link] session 1
  For address family: IPv6 Unicast
    Session: [Link] session 2
  For address family: VPNv4 Unicast
    Session: [Link] session 3
IOS
Multisession Conclusion
• Increases # of TCP sessions
• Not really needed
• Current default behavior = multisession is off
• Can be turned on by "neighbor x.x.x.x transport multi-session"
• Makes sense to have IPv4 and IPv6 on separate TCP sessions
  • IPv6 over IPv4 (or IPv4 over IPv6) can be done, but next hop mediation is needed
MPLS VPN
RR-groups
• Use one RR (or set of RRs) for a subset of prefixes
  • By carving up the range of RTs
• Only for vpnv4/6
• The RR only stores and advertises its specific range of prefixes
• Less storage on the RR, but more RRs needed + more peerings
• Example: PE1 and PE2 peer for vpnv4/6 with both rr-group 1 (RR1, RR2) and rr-group 2 (RR1, RR2)
RR-groups Configuration Example
• Dividing of RTs done by a simple extended community list (1-99) or an extended community list with a regular expression (100-500)
• PEs still send all vpnv4/6 prefixes to the RR, but the RR filters them
• Dividing RTs = more work
  • PEs are not involved, only RRs

RRs in rr-group 1:

address-family vpnv4
 bgp rr-group 100
address-family vpnv6
 bgp rr-group 100
!
ip extcommunity-list 100 permit RT:1:(1|3|5)....

RRs in rr-group 2:

address-family vpnv4
 bgp rr-group 100
address-family vpnv6
 bgp rr-group 100
!
ip extcommunity-list 100 permit RT:1:(2|4|6)....

Debug output when the RR denies a prefix:

BGP(4): [Link] rcvd UPDATE w/ attr: nexthop [Link], origin ?, localpref 100, metric 0, extended community RT:1:10001
BGP(4): [Link] rcvd 1:10001:[Link]/32, label 22 -- DENIED due to: extended community not supported;
Route Target Constraint (RTC)
• Current behavior:
  • RR sends all vpnv4/6 routes to the PE
  • PE router drops vpnv4/6 routes for which there is no importing VRF
• RTC behavior: RR sends only "wanted" vpnv4/6 routes to the PE
  • "wanted": PE has a VRF importing the Route Targets for the specific routes
  • RFC 4684
  • New AF "rtfilter"

• Received RT filters from neighbors are translated into outbound filtering policies for vpnv4/6 prefixes

• The RT filtering information is obtained from the VPN RT import list

• Result: RR does not send unnecessary vpnv4/6 prefixes to the PE routers
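RTC is enabled by activating the rtfilter address family on both PE and RR; a minimal IOS sketch (hypothetical neighbor address):

```
router bgp 1
 neighbor 10.0.0.1 remote-as 1
 ! the rtfilter AF carries the RT membership (filter) information
 address-family rtfilter unicast
  neighbor 10.0.0.1 activate
```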
Route Target Constraint (RTC)
• BGP capability exchange (OPEN message): capability 1/132 (RTFilter) for vpnv4 & vpnv6
• AF RTFilter exchange (MP_REACH_NLRI):
  • PE1 advertises its imported Route Targets (e.g. RT 1:1 and RT 1:2) to the RR
  • The RR installs the RT filter (RT:1:1 & RT:1:2) for PE1, implicitly denying all else
  • PE1 installs a default RT filter for the RR
• AF vpnv4/6 prefixes exchange:
  • The PEs send all their vpnv4/6 prefixes to the RR
  • The RR sends only the vpnv4/6 prefixes matching PE1's RT filter to PE1
Route Target Constraint (RTC)
• Results
  • Eliminates the waste of processing power on the PE and the waste of bandwidth
  • Number of vpnv4 formatted messages is reduced by 75%
  • BGP convergence time is reduced by 20 - 50%
  • The more sparse the VPNs (few common VPNs on PEs), the more performance gain

• Note: PE and RR need to support RTC
  • Incremental deployment is possible (per PE)
  • Behavior towards non-RT Constraint peers is not changed

• Note
  • RTC clients of a RR with different sets of importing RTs will be in the same update group on the RR
  • In IOS-XR, they are in different filter groups under the same subgroup
Legacy PE RT Filtering
• Problem: if one PE does not support RTC (legacy PE), then all RRs in one cluster must store and advertise all vpn prefixes to that PE
• Solution: the legacy PE sends special prefixes to mimic RTC behavior, without RTC code

Legacy PE:
• Collects its import RTs
• Creates a route-filter VRF (same RD for all these VRFs across all PEs)
• Originates special route-filter route(s) with:
  • the import RTs attached
  • one of 4 route-filter communities
  • NO-ADVERTISE community

RR:
• The presence of the community triggers the RR to extract the RTs and build RT membership information
• The RR only advertises wanted vpn prefixes towards the legacy PE

4 route-filter communities:
0xFFFF0002 ROUTE_FILTER_TRANSLATED_v4
0xFFFF0003 ROUTE_FILTER_v4
0xFFFF0004 ROUTE_FILTER_TRANSLATED_v6
0xFFFF0005 ROUTE_FILTER_v6
Legacy PE RT Filtering
• Topology: PE1 (RTC capable) and a legacy PE2 (no RTC) peer with the RR; the legacy PE2 imports RT 1:1 and RT 1:3
• The legacy PE sends route-filter route(s) with a unique RD, a route-filter community and its importing RTs, e.g.:
  [Link][Link]/32 (RD:prefix)
  Community: 4294901762 (one of the 4 route-filter communities)
  Extended Community: RT:[Link]:1 RT:[Link]:3 (all import RTs of the legacy PE)
  NO-EXPORT and NO-ADVERTISE communities
• The RR builds RT membership information from this vpnv4/6 update
• AF vpnv4/6 prefixes exchange: PE1 sends all its vpnv4/6 prefixes to the RR; the RR sends only the wanted vpnv4/6 prefixes to PE2
Legacy PE RT Filtering - Configuration (For Your Reference)

Legacy PE (old code) config — imports RT 1:1 and RT 1:3; the route-filter VRF exports these RTs via the export map:

ip vrf route-filter
 rd 9999:9999
 export map SET_RT

router bgp 1
 address-family vpnv4
  neighbor [Link] route-map legacy_PE out
 address-family ipv4 vrf route-filter
  network [Link] mask [Link]

ip route vrf route-filter [Link] [Link] Null0

ip prefix-list match_RT_1 seq 5 permit [Link]/32

route-map SET_RT permit 10
 match ip address prefix-list match_RT_1
 ! 4294901762 = 0xFFFF0002 (ROUTE_FILTER_TRANSLATED_v4)
 set community 4294901762
 set extcommunity rt [Link]:1 [Link]:3 additive

route-map legacy_PE permit 10
 match ip address prefix-list match_RT_1
 set community no-export no-advertise additive

RR (new code) config:

router bgp 1
 address-family vpnv4
  neighbor [Link] route-reflector-client
  neighbor [Link] accept-route-legacy-rt
Full Internet in a VRF?
• Why? Because design dictates it
• Unique RD, so that the RR can advertise 2 paths?

PRO:
• Remove the Internet routing table from the P routers
• Security: move Internet into a VPN, out of the global table
• Added flexibility
• More flexible DDoS mitigation

CON:
• Increased memory and bandwidth consumption
• Platform must support enough MPLS labels
  • Label allocation is per-prefix by default
  • Perhaps per-CE or per-VRF label allocation is wanted here
  • Now also per-CE and per-VRF label allocation for 6PE (in IOS-XR)
Full Internet in a VRF?
• Considerations
  • Two Internet gateways for redundancy
  • RRs are present: unique RDs needed
    • Then double the # of vpn prefixes
    • ADD-PATHS increases the number of paths too
  • Example: PE1 (RD 1:1) and PE2 (RD 1:2) hold the Internet peerings; PE3 and PE4 learn both paths via the RR
Per-CE Label
• One unique label per prefix is always the default
• Per-CE: one MPLS label per next-hop (so per connected CE router); 2 CEs = 2 labels
• No IP lookup needed after the label lookup
• Caveats
  • No granular load balancing, because the bottom label is the same for all prefixes from one CE, if the platform load balances on the bottom label
  • eBGP load balancing & BGP PIC are not supported (they make use of label diversity), unless resilient per-CE label is used
  • Only single hop eBGP supported, no multihop
• The number of prefixes (n) is much larger than the number of CE routers (x) per VPN, so the number of MPLS labels used is very low
Per-VRF Label
• Per-VRF: one MPLS label per VRF (all CE routers in the VRF)
  - Con: IP lookup needed after the label lookup
  - Con: no granular load balancing, because the bottom label is the same for all prefixes, if the platform load balances on the bottom label
  - Potential forwarding loop during local traffic diversion to support PIC
  - No support for eiBGP multipath
• Number of MPLS labels used per VRF is 1!
• IOS-XR can do selective label mode (prefix | CE | VRF) with RPL
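A minimal IOS-XR sketch of selective label mode via RPL (policy and VRF names are hypothetical):

```
route-policy SELECT-LABEL-MODE
  ! hypothetical policy: one label for the whole VRF
  set label-mode per-vrf
end-policy
!
router bgp 1
 vrf RED
  address-family ipv4 unicast
   label mode route-policy SELECT-LABEL-MODE
```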
IOS-XR
Selective VRF Download (SVD)
• Download to a line card only those prefixes and labels from a VRF that are actively required to forward traffic through that line card
• In IOS-XR 4.2.0 and enabled by default

Linecard role — which routes are present?
• Core facing: routes for all VRFs, but only the local routes
• Customer facing: routes only for the VRFs which the LC is interested in (local and remote routes)
• Standard: all routes are present
OS Enhancements
ASR9K: Scaling Enhancement
• BGP RIB scale enhancement in 5.1.1
  • Only for RSP440-SE
  • Reload is needed
• Get more virtual address space for the BGP process: from 2 GB to 2.5 GB

RP/0/RSP1/CPU0:router(admin-config)#hw-module profile scale ?
  default  Default scale profile
  l3       L3 scale profile
  l3xl     L3 XL scale profile

Profile            Layer 3 (Prefixes)      Layer 2 (MAC Table)
default            Small (512k)            Large (512k)
l3                 Large (1,000k)          Small (128k)
l3xl               Extra large (1,300k)    Minimal
l3xl (5.1.1 RSP3)  Extra large (2,500k)    Minimal
Multi-Instance BGP
• A new IOS-XR BGP architecture to support multiple BGP instances
• Each BGP instance is a separate process running on the same or a different RP/DRP node
• Different prefix tables
• Multiple ASNs are possible
• Solves the 32-bit OS virtual memory limit
• Different BGP routers: isolate services/AFs on a common infrastructure
• Achieve higher prefix scale (especially on a RR) by having different instances carrying different BGP tables
• Achieve higher session scale by distributing the overall peering sessions between instances
• Example: RR1 runs multi-instance BGP (one instance for vpnv4, one for IPv4, with different peerings); RR2 runs single-instance BGP
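A minimal IOS-XR sketch of two BGP instances (the instance names are hypothetical), each running as a separate process with its own prefix tables:

```
router bgp 1 instance vpn-rr
 ! instance dedicated to the vpnv4 service
 address-family vpnv4 unicast
!
router bgp 1 instance inet-rr
 ! instance dedicated to Internet routes
 address-family ipv4 unicast
```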
OS Scaling Enhancements for BGP (For Your Reference)

OS releases:

• BGP Keepalive Enhancements (IOS)
  • Priority queues for reading/writing Keepalive/Update messages
  • Result = avoid neighbor flaps / ability to support small keepalive values in a scaled setup
• BGP Generic Scale Enhancements (IOS)
  • Parceling of BGP processes; created new BGP process: "BGP Task"
  • Result = optimized update generation / faster convergence
• BGP PE-CE Scale Enhancements (IOS)
  • Modified internal data structures and optimized internal algorithms for VRF based update generation
  • Result = faster convergence / greater VRF and PE-CE session scaling
• BGP PE Enhancements (IOS-XR)
  • Optimized BGP processing of the label on the PE router
  • Result = reduced CPU usage
• BGP PE Enhancements (IOS-XR)
  • Modified BGP import processing on the PE router
  • Result = reduced CPU usage
• BGP PE Scale Enhancements (IOS)
  • Modified internal data structures for VRFs
  • Result = considerable memory savings / greater prefix scalability
• BGP RIB Scale Enhancement (IOS-XR)
  • Only for ASR9K
  • Result = more prefixes
When is the Boat Not Big Enough? (For Your Reference)

Convergence — measure prefix instability, traffic drops, table versions, timestamps:
• IOS: show bgp convergence, show bgp all summary
• IOS-XR: show bgp convergence detail, show bgp table, show bgp process performance-statistics detail

Memory:
• IOS: show bgp all summary, show processes memory sorted
• IOS-XR: show bgp table, show process memory <job-id> location <>, show watchdog memory-state, show memory compare start | end | report
• NX-OS: show bgp internal mem-stats detail (look for "Grand total", "Private memory", "Shared memory"), show system resource
• show bgp scale

CPU:
• IOS: show processes cpu history, show processes cpu | include BGP
• IOS-XR: show processes cpu, show processes bgp, show processes cpu | include bgp
• NX-OS: show processes cpu history, show processes cpu | include bgp, show process cpu detailed <bgp pid>
Call to Action
• Visit the World of Solutions for
  • Cisco Campus
  • Walk-in Labs
  • Technical Solution Clinics
• Meet the Engineer
• Lunch and Learn Topics
• DevNet zone related sessions

Complete Your Online Session Evaluation
• Please complete your online session evaluations after each session. Complete 4 session evaluations & the Overall Conference Evaluation (available from Thursday) to receive your Cisco Live T-shirt.
• All surveys can be completed via the Cisco Live Mobile App or the Communication Stations
Thank you
Slow Peer - Configuration (For Your Reference)

Detection:
• per AF/per VRF: bgp slow-peer detection [threshold <seconds>] (default threshold is 5 min)
• per peer: neighbor {<nbr-addr>|<peer-grp-name>} slow-peer detection [threshold <seconds>]
           neighbor {<nbr-addr>|<peer-grp-name>} slow-peer detection disable
• per peer policy template: slow-peer detection [threshold <seconds>]

Protection:
• static per neighbor/per peer-group: neighbor {<nbr-addr>|<peer-grp-name>} slow-peer split-update-group static
• static via peer policy template: slow-peer split-update-group static
• automatic per AF/per VRF: bgp slow-peer split-update-group dynamic [permanent]
• automatic per neighbor: neighbor {<nbr-addr>|<peer-grp-name>} slow-peer split-update-group dynamic [permanent]
• automatic per peer policy template: slow-peer split-update-group dynamic [permanent]

permanent = the peer is not moved back automatically to the update group
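Combining the commands above, a minimal IOS sketch of detection plus automatic protection (the threshold value is hypothetical):

```
router bgp 1
 address-family ipv4
  ! flag peers more than 300 seconds behind as slow
  bgp slow-peer detection threshold 300
  ! automatically move detected slow peers to their own update group
  bgp slow-peer split-update-group dynamic
```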
Slow Peers: Displaying/Clearing (For Your Reference)

• CLI to display slow peers
  • Applicable to all address families
  • show bgp all summary slow
  • show bgp ipv4 unicast neighbors slow
  • show bgp ipv4 unicast update-group summary slow

• CLI to clear slow peers
  • This is a forced clear of the slow-peer status; the peer is moved to the original update group
  • Needed when the permanent keyword is configured
  • clear bgp AF {unicast|multicast} * slow
  • clear bgp AF {unicast|multicast} <AS number> slow
  • clear bgp AF {unicast|multicast} peer-group <group-name> slow
  • clear bgp AF {unicast|multicast} <neighbor-address> slow
Route-Refresh Update Group: When is a Route Refresh Request Sent? (For Your Reference)
• A route refresh request is sent when:
  • a user types clear ip bgp [AF] {*|peer} in
  • a user types clear ip bgp [AF] {*|peer} soft in
  • adding or changing the inbound filtering on the BGP neighbor (via route-map)
  • configuring allowas-in for the BGP neighbor
  • configuring soft-reconfiguration inbound on the BGP neighbor
  • in MPLS VPN (for AFI/SAFI 1/128): a user adds a route-target import to a VRF
  • in 6VPE (for AFI/SAFI 2/128): a user adds a route-target import to a VRF
Route Reflector Loop Prevention (For Your Reference)
• Because we have RRs and the same prefix can be advertised multiple times within the iBGP cloud, loop prevention is needed in the iBGP cloud
• Two BGP attributes, two ways
  • Originator ID
    • Set to the router ID of the router injecting the route into the AS
    • Set by the RR
  • Cluster List
    • Each route reflector the route passes through adds its cluster-ID to this list
    • Cluster-ID = Router ID by default
    • "bgp cluster-id x.x.x.x" command to set the cluster-ID

• A router discards a route if:
  • the ORIGINATOR-ID equals its own ROUTER-ID
  • the CLUSTER_LIST contains its own CLUSTER-ID
Route Reflector Loop Prevention (For Your Reference)
• Example: RR1 and RR2 have different cluster-IDs
  • RRC1 advertises a prefix with AS Path {65000}
  • RR1 reflects it with Originator-ID: RRC1 and Cluster-List: {RR1}
  • RR2 reflects it further with Originator-ID: RRC1 and Cluster-List: {RR1, RR2}
  • When the update reaches RR1 again, RR1 finds its own cluster-ID in the Cluster-List: loop detected, route discarded
Route Reflector Route Advertisement
• Prefix coming from an eBGP peer: the RR sends the prefix to clients and non-clients (and sends it to other eBGP peers)
• Prefix coming from an RR client: the RR reflects the prefix to clients and sends it to non-clients (and sends it to eBGP peers)
• Prefix coming from a non-client: the RR reflects the prefix to clients (and sends it to eBGP peers)
IOS-XR
Update Groups in IOS XR

The hierarchy is: address family > update groups > sub-groups / refresh sub-groups > filter groups > neighbors. ("show bgp ... replication" is the IOS equivalent.)

RP/0/6/CPU0:router#show bgp vpnv4 unicast update out update-group 0.2
VRF "default", Address-family "VPNv4 Unicast"
  Update-group 0.2
    Flags: 0x0010418b
    Sub-groups: 1 (0 throttled)
    Refresh sub-groups: 0 (0 throttled)
    Filter-groups: 3
    Neighbors: 3 (0 leaving)
    Update OutQ: 0 bytes (0 messages)
    Update generation recovery pending ? [No]
    Last update timer start: Apr 3 [Link].425
    Last update timer stop: ---
    Last update timer expiry: Apr 3 [Link].435 (1w4d ago)
    Update timer running ? [No] (0.000 sec remaining; last started for 0.010 sec)
    History:
      Update OutQ Hi: 3600 bytes (5 messages)
      Update OutQ Cumulative: 38700 bytes (54 messages)
      Update OutQ Discarded: 0 bytes (0 messages)
      Update OutQ Cleared: 0 bytes (0 messages)
      Last discarded from OutQ: --- (never)
      Last cleared from OutQ: --- (never)
      Update generation throttled 0 times, last event --- (never)
      Update generation recovered 0 times, last event --- (never)
      Update generation mem alloc failed 0 times, last event --- (never)
IOS-XR
Update Groups in IOS XR

RP/0/6/CPU0:router#show bgp vpnv4 unicast update-group 0.2 performance-statistics
Update group for VPNv4 Unicast, index 0.2:
  Attributes:
    Internal
    Common admin
    First neighbor AS: 1
    Send communities
    Send extended communities
    Route Reflector Client
    4-byte AS capable
    Minimum advertisement interval: 0 secs
  Update group desynchronized: 0
  Sub-groups merged: 5
  Number of refresh subgroups: 0
  Messages formatted: 36, replicated: 68
  All neighbors are assigned to sub-group(s)
  Neighbors in sub-group: 0.2, Filter-Groups num:3
    Neighbors in filter-group: 0.3(RT num: 3)
      [Link]
    Neighbors in filter-group: 0.1(RT num: 3)
      [Link]
    Neighbors in filter-group: 0.2(RT num: 3)
      [Link]
  Updates generated for 0 prefixes in 26 calls(best-external:0) (time spent: 0.002 secs)
  Update timer last started: Apr 3 [Link].425
