BGP
This lesson will be interesting! BGP (Border Gateway Protocol) is the routing protocol
that glues the Internet together. I’m going to explain in which situations we need BGP
and how it works.
Before you continue reading, I should tell you to “forget” everything you know about
routing protocols like RIP, OSPF and EIGRP so far. Those three routing protocols have
one thing in common: they are all IGPs (Interior Gateway Protocols). We only use them
within our autonomous system because they don’t scale to a network as large as
the Internet.
RIP, OSPF and EIGRP are all different, but they share one goal: they want
to find the shortest path to the destination. On the Internet we don’t care
as much about finding the shortest path; being able to manipulate traffic paths is far more
important. There is only one routing protocol we currently use on the Internet, and that is
BGP.
Nowadays almost everything is connected to the Internet. In the picture above we have
a customer network connected to an ISP (Internet Service Provider). Our ISP is making
sure we have Internet access. Our ISP has given us a single public IP address we can
use to access the Internet. To make sure everyone on our LAN at the customer side
can access the Internet we are using NAT/PAT (Network/Port Address Translation) to
translate our internal private IP addresses to this single public IP address. This scenario
is excellent when you only have clients that need Internet access. On our customer LAN
we only need a default route pointing to the ISP router and we are done. For this
scenario we don’t need BGP…
Maybe the customer has a couple of servers that need to be reachable from the
Internet…perhaps a mail- or webserver. We could use port forwarding and forward the
correct ports to these servers so we still only need a single IP address. Another option
would be to get more public IP addresses from our ISP and use these to configure the
different servers. For this scenario we still don’t need BGP…
What if I want a bit more redundancy? Having a single point of failure isn’t a good idea.
We could add another router at the customer side and connect it to the ISP. You can
use the primary link for all traffic and have another link as the backup. We still don’t
require BGP in this situation; it can be solved with default routing:
Advertise a default route in your IGP on the primary customer router with a low metric.
Advertise a default route in your IGP on the secondary customer router with a high metric.
This will make sure that your IGP sends all traffic using the primary link. Once the link
fails your IGP will make sure all traffic is sent down the backup link. Let me ask you
something to think about…can we do any load balancing across those two links? It’ll be
difficult, right?
Your IGP will send all traffic down the primary link and nothing down the backup link
unless there is a failure. You could advertise a default route with the same metric on both routers, but
then you’d get something like a 50/50 load share. What if I wanted to send 80% of
the outgoing traffic on the primary link and 20% down the backup link? That’s not going
to happen here but with BGP it’s possible.
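The primary/backup behaviour described above can be sketched in a few lines of Python. This is only an illustration: the function name, the router labels and the metric values are made up, and a real IGP does far more than compare two numbers.

```python
def active_default_routes(routes):
    """Return the next hops that would be used for the default route.

    routes: list of (next_hop, metric, link_up) tuples. Only routes whose
    link is up are candidates. The lowest metric wins, and routes that tie
    on the lowest metric are all kept (equal-cost load sharing).
    """
    candidates = [(hop, metric) for hop, metric, up in routes if up]
    if not candidates:
        return []
    best = min(metric for _, metric in candidates)
    return [hop for hop, metric in candidates if metric == best]


# Hypothetical metrics: the primary router advertises 10, the backup 100.
print(active_default_routes([("primary", 10, True), ("backup", 100, True)]))   # ['primary']
print(active_default_routes([("primary", 10, False), ("backup", 100, True)]))  # ['backup']
print(active_default_routes([("primary", 10, True), ("backup", 10, True)]))    # ['primary', 'backup']
```

Notice that the only outcomes are "all traffic on one link" or "an even split". There is no knob here for something like 80/20.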
This scenario is a bit more interesting. Instead of being connected to a single ISP we
now have two different ISPs. For redundancy reasons it’s important to have two
different ISPs: in case one fails you will always have a backup ISP to use. What about
our Customer network? We still have two servers that need to be reachable from the
Internet.
In my previous examples we got public IP addresses from our ISP. Now I’m connected
to two different ISPs so what public IP addresses should I use? From ISP1 or ISP2? If
we use public IP addresses from ISP1 (or ISP2) then these servers will be unreachable
once the ISP has connectivity issues.
Instead of using public IP addresses from the ISP we will get our own public IP
addresses. The IP address space is maintained by IANA (Internet Assigned Numbers
Authority – http://www.iana.org/). IANA assigns IP address space to a number of
large Regional Internet Registries like RIPE or ARIN. Each of these assigns IP address
space to ISPs or large organizations.
When we receive our public IP address space then we will advertise this to our ISPs.
Advertising is done with a routing protocol and that will be BGP.
If you are interested here’s an overview of the IPv4 space that has been allocated by
IANA:
IANA IPv4 address space
Autonomous Systems
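Besides getting public IP address space we also have to think about an AS
(Autonomous System). An AS is a collection of networks under the control of a single
organization that shares a common routing policy.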
For routing between the different autonomous systems we use an EGP (external
gateway protocol). The only EGP we use nowadays is BGP.
How do we get an autonomous system number? Just like public IP address space you’ll
need to register one.
Autonomous system numbers were originally 16-bit, which gives values from 0 to 65535
(32-bit AS numbers also exist nowadays). Just like private and public IP addresses, we have a range of public and
private AS numbers.
In the 16-bit space, range 1 – 64511 holds the globally unique AS numbers and range 64512 – 65534 holds the private
autonomous system numbers; 0 and 65535 are reserved.
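As a quick illustration of the 16-bit ranges, here is a small Python sketch. It treats 0 and 65535 as reserved and ignores 32-bit AS numbers and special-purpose values inside the public range; the function name is made up for this example.

```python
def classify_asn(asn):
    """Classify a 16-bit AS number as public, private or reserved.

    Simplified: special-purpose values inside the public range
    (for example the documentation range) are not handled.
    """
    if not 0 <= asn <= 65535:
        raise ValueError("not a 16-bit AS number")
    if asn in (0, 65535):
        return "reserved"
    if 64512 <= asn <= 65534:
        return "private"
    return "public"


print(classify_asn(15169))  # public
print(classify_asn(64512))  # private
print(classify_asn(65535))  # reserved
```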
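If you are interested, see if you can find the AS number of your ISP:
CIDR Report
BGP has two flavors:
External BGP (EBGP): used between autonomous systems.
Internal BGP (IBGP): used within a single autonomous system.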
On the internet there are a number of looking glass servers. These are routers that have
public view access and you can use them to look at the Internet routing table. If you
want to see what it looks like check out:
Above in our picture our customer network has an autonomous system number (AS 1)
and some IP address space (10.0.0.0 /8), let’s pretend that these are public IP
addresses. We are connected to two different ISPs and you can see their AS number
(AS2 and AS3) and IP address space (20.0.0.0/8 and 30.0.0.0/8). We can reach the
rest of the internet through both ISPs.
We can use BGP to advertise our address space to the ISPs, but what are the ISPs
going to advertise to our customer through BGP? There are a number of options:
Default Route
Receiving a default route requires the fewest resources on your routers since you only
have a single entry to reach any external network. The customer router will advertise its
10.0.0.0 /8 network to both ISPs which will advertise it to any other AS they are
connected to and we will use a default route to reach anything on the Internet. The
downside of this configuration is that our customer network doesn’t know what is behind
ISP1 and ISP2. We have connectivity because of the default routes but this can lead to
sub-optimal routing. If we only have the default routes then we will send all traffic to one
of the ISPs.
Here’s what could happen if you only use default routes:
Our customer network only received a default route from both ISPs and we have
chosen to use the default route of ISP1 to send all our outgoing traffic to. This means
that whenever we send traffic meant for 30.0.0.0 /8 (ISP2) it’s going to be sent to ISP1
and then to ISP2. It’s not a problem but it’s not optimal.
We can also receive a partial routing table plus a default route. This partial update
might include all the IP address space that the ISPs have assigned to their customers.
Just like in real life…the more you know the better off you are (unless you believe
ignorance is bliss). In the world of routing having more routing information means you
can make better routing decisions. We’ll have fewer sub-optimal routing problems than
when we only have the default route.
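To see why a partial table helps, here is a minimal Python sketch of a longest-prefix-match lookup. The two tables are assumptions based on the example addresses used in this lesson (a default route via ISP1, plus 30.0.0.0/8 learned from ISP2), and the `lookup` helper is hypothetical.

```python
import ipaddress


def lookup(table, destination):
    """Return the exit for the most specific route that matches destination."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, exit_isp) for net, exit_isp in table if dest in net]
    if not matches:
        return None
    # Longest prefix (most specific route) wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Assumed tables: default route only, versus default plus one specific prefix.
default_only = [(ipaddress.ip_network("0.0.0.0/0"), "ISP1")]
partial = default_only + [(ipaddress.ip_network("30.0.0.0/8"), "ISP2")]

print(lookup(default_only, "30.1.2.3"))  # ISP1 (sub-optimal: goes via ISP1 to reach ISP2)
print(lookup(partial, "30.1.2.3"))       # ISP2 (the specific route is preferred)
print(lookup(partial, "8.8.8.8"))        # ISP1 (everything else follows the default)
```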
The last option that we have is that we receive the full Internet routing table from both
ISPs. This requires more resources but we’ll be able to make the best routing decisions.
Path Vector
BGP is called a path vector routing protocol. What does this mean? Take a look at this
image:
We have 4 autonomous systems and we are running BGP to exchange routing
information. In AS 1 we have network 1.1.1.0 /24 and this is advertised to AS 2, AS 3
and AS 4.
If we look at the BGP table of the router in AS 4 then we will see network
1.1.1.0 /24. The router stores not only the prefix but also the path of autonomous
systems it has to cross in order to get to 1.1.1.0 /24. Here’s an
example of a real BGP router:
route-views.optus.net.au>show ip bgp
BGP table version is 128380331, local router ID is 203.202.125.6
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external
Origin codes: i - IGP, e - EGP, ? - incomplete
The output above is from one of the BGP looking glass servers.
By using the show ip bgp command I can look at the BGP table and we see this router
knows about network 1.0.0.0 /24. The next-hop IP address is 202.160.242.71. At the
end of the line you see path with the numbers 7473 15169. These are the autonomous
systems we have to get through in order to get to this network.
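The way such a path is built up can be sketched in Python. This is a simplified model, not how a router implements it: each AS puts its own number at the front of the path when it advertises the prefix to another AS, so the leftmost number is the nearest AS and the rightmost is the origin. The loop check is an addition to what the lesson describes, and the route AS 1 → AS 2 → AS 4 is an assumption about the picture.

```python
def advertise(as_path, local_as):
    """Return the AS path sent to an external neighbor: our AS goes in front."""
    return [local_as] + as_path


def accept(as_path, local_as):
    """Reject a route whose path already contains our own AS (a routing loop)."""
    return local_as not in as_path


path = advertise([], 1)     # AS 1 originates 1.1.1.0/24
path = advertise(path, 2)   # AS 2 passes it on (assumed route towards AS 4)
print(path)                 # [2, 1]
print(accept(path, 4))      # True: AS 4 is not in the path yet
print(accept(path, 1))      # False: AS 1 would see its own number and drop it
```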
BGP Route Selection
What all IGPs have in common is that all of them want to find the shortest path to the
destination. BGP works differently. Since autonomous systems belong to different ISPs
or organizations, we want to be able to selectively influence our routing. Take a look at
this example:
BGP allows us to use routing policies at the autonomous system level. In the picture
above I have 9 autonomous systems and in AS 9 we have network 192.168.9.0 /24. If
we look at AS 1 then we have a lot of different paths we can take to reach network
192.168.9.0 /24 in AS 9.
Does this mean the network administrator at AS 1 can choose the path we are going to
use? Not really because of the following reasons:
You can choose the exit path…AS1 can send traffic to AS 2 or AS4 but you don’t make routing
decisions for other autonomous systems.
Each autonomous system will only advertise the best path to your autonomous system. AS 1
will only learn about the best path from AS 2 and AS 4 unless their best path fails…only then
will you learn about the second best path.
BGP uses a set of BGP attributes to select a path; these are covered in other lessons.
Conclusion
Hopefully this lesson has been helpful to understand the basics of BGP and why we use
it. In other lessons we will take a closer look at the configuration of external and internal
BGP and also how the BGP path selection works.
When talking about ISPs, BGP, and connections, you will sometimes hear terminology
like “single homed”, “dual homed”, “single multi-homed” or “dual multi-homed”. These
are different design topologies that describe how a customer is connected (using
BGP) to one or more ISPs.
Single Homed
The single homed design means you have a single connection to a single ISP. With this
design, you don’t need BGP since there is only one exit path in your network. You might
as well just use a static default route that points to the ISP.
The advantage of a single-homed link is that it’s cost effective; the disadvantage is that
you don’t have any redundancy. Your link is a single point of failure, but so is using a
single ISP.
Dual Homed
The dual homed connection adds some redundancy. You are still only connected to a
single ISP, but you use two links instead of one. There are some variations for this
design. Here’s the first one:
With this design, we use a single router on both ends, but we do have redundant links.
In the example above, the ISP has a second router. We also could have used a second
router at the customer’s side and a single router at the ISP. For even more redundancy,
add a second router at both sides:
The example above offers the most redundancy when you are connected to a single
ISP. We have two links and two routers on both ends. One disadvantage of this design
is that we are still using a single ISP.
Single Multi-homed
Multihomed means we are connected to at least two different ISPs. The simplest
design looks like this:
Above you see that we have a single router at the customer, connected to two different
ISPs. The single point of failure in this design is that you only have one router at the
customer. When it fails, you won’t be able to connect to any ISP. We can improve this
by adding a second router:
This is a pretty good design: we only use single links, but we are connected to two
different ISPs using different routers.
Dual Multihomed
The dual multihomed design means we are connected to two different ISPs, and we
use redundant links. There are some variations, here’s the first one:
Above you can see that we are connected to two different ISPs, using one router and
two links to each ISP. We have redundant ISPs and links, but the router is still a single
point of failure. We can improve this by adding a second router:
The design above is better; it has two customer routers. One disadvantage, however, is
that once one of your routers fails, you will lose the connection to one of the ISPs. Using
the same number of routers and links, the following design might be better:
This design has redundant ISPs, routers, and links. Both customer routers are
connected to both ISPs. This design does offer the highest redundancy but it’s also an
expensive option.
Conclusion
You have now learned what the different (BGP) connection options to an ISP are:
Single homed: you are connected to a single ISP using a single link.
Dual homed: you are connected to a single ISP using dual links.
Single multi-homed: you are connected to two ISPs using single links.
Dual multi-homed: you are connected to two ISPs using dual links.
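The four terms boil down to two questions: how many ISPs, and how many links to each ISP. A small Python sketch of that naming rule follows. The function name is made up, and anything above one is simply treated as “multi” or “dual”.

```python
def connection_design(isps, links_per_isp):
    """Name the design from the number of ISPs and the links to each ISP."""
    if isps < 1 or links_per_isp < 1:
        raise ValueError("need at least one ISP and one link")
    links = "single" if links_per_isp == 1 else "dual"
    homing = "homed" if isps == 1 else "multi-homed"
    return f"{links} {homing}"


print(connection_design(1, 1))  # single homed
print(connection_design(1, 2))  # dual homed
print(connection_design(2, 1))  # single multi-homed
print(connection_design(2, 2))  # dual multi-homed
```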
In this lesson I will show you how to configure EBGP (External BGP) and how to
advertise networks. I will be using the following topology:
Let’s start with a simple topology. Just two routers and two autonomous systems. Each
router has a network on a loopback interface which we are going to advertise in BGP.
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 remote-as 2
R2(config)#router bgp 2
R2(config-router)#neighbor 192.168.12.1 remote-as 1
Use the router bgp command with the AS number to start BGP. Neighbors are not
configured automatically; this is something you’ll have to do yourself with the neighbor
x.x.x.x remote-as command. This is how we configure external BGP.
R1# %BGP-5-ADJCHANGE: neighbor 192.168.12.2 Up
R2# %BGP-5-ADJCHANGE: neighbor 192.168.12.1 Up
If everything goes ok you should see a message that we have a new BGP neighbor
adjacency.
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 password MYPASS
R2(config)#router bgp 2
R2(config-router)#neighbor 192.168.12.1 password MYPASS
Show ip bgp summary is an excellent command to check if you have BGP neighbors.
You also see how many prefixes you received from each neighbor.
R1(config)#router bgp 1
R1(config-router)#network 1.1.1.0 mask 255.255.255.0
R2(config)#router bgp 2
R2(config-router)#network 2.2.2.0 mask 255.255.255.0
   Network          Next Hop            Metric LocPrf Weight Path
*> 1.1.1.0/24       0.0.0.0                  0         32768 i
*> 2.2.2.0/24       192.168.12.2             0             0 2 i
Use show ip bgp to look at the BGP database. You can see that R1 has learned about
network 2.2.2.0 /24 and the next hop IP address is 192.168.12.2. It also shows
the path information. You can see that network 2.2.2.0 /24 is from AS 2.
R2#show ip bgp
BGP table version is 3, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
In the routing table we can find an entry for BGP with an administrative distance of 20
for external BGP.
EBGP Multihop
eBGP (external BGP) by default requires two Cisco IOS routers to be directly connected
to each other in order to establish a neighbor adjacency. This is because eBGP routers
use a TTL of one for their BGP packets. When the BGP neighbor is more than one hop
away, the TTL will decrement to 0 and it will be discarded.
When these two routers are not directly connected then we can still make it work but
we’ll have to use multihop. This requirement does not apply to internal BGP.
Here’s an example:
Above we will try to configure eBGP between R1 and R3. Since R2 is in the middle,
these routers are more than one hop away from each other. Let’s take a look at the
configuration:
First I will create some static routes so that R1 and R3 are able to reach each other.
Now we can configure eBGP:
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.23.3 remote-as 3
R3(config)#router bgp 3
R3(config-router)#neighbor 192.168.12.1 remote-as 1
Even though this configuration is correct, BGP will not even try to establish an eBGP
neighbor adjacency. BGP knows that since these routers are on different subnets, they
are not directly connected. We can verify this with the following command:
Just for fun, let’s disable this check so that R1 and R3 try to become eBGP neighbors.
We can do that like this:
Our routers will now try to become eBGP neighbors even though they are not directly
connected. Here’s what happens now:
The Wireshark capture above shows us that R1 is trying to connect to R3. As you can
see the TTL is 1. Once R2 receives this packet it will decrement the TTL by 1 and drop
it:
Above you can see that R2 is dropping this packet since the TTL is exceeded. It will
send an ICMP time-to-live exceeded message to R1. Our BGP routers will show a
message like this:
R1#
BGP: 192.168.23.3 open failed: Connection timed out; remote host
not responding, open active delayed 27593ms (35000ms max, 28%
jitter)
This is R1 telling us that it couldn’t connect to R3. To fix this issue, we’ll tell eBGP to
increase the TTL. First let’s enable the directly connected check again:
R1 and R3 both agree that the BGP neighbor could be 2 hops away. Here’s what the
BGP packet looks like in Wireshark:
This capture shows us the TTL of 2. After a few seconds, our routers will become eBGP
neighbors:
R1#
%BGP-5-ADJCHANGE: neighbor 192.168.23.3 Up
R3#
%BGP-5-ADJCHANGE: neighbor 192.168.12.1 Up
Even though R1 and R3 are now neighbors, having a non-BGP router in between R1 and R3 is a
bad idea. R1 and R3 might exchange prefixes through BGP but once packets reach R2, it will have
no clue where to forward these packets to…
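The TTL behaviour seen in this example can be summarized with a small Python sketch. It is a simplified model with a made-up function name: every router between the two BGP speakers decrements the TTL and drops the packet when it reaches 0.

```python
def reaches_neighbor(ttl, hops):
    """Return True if a packet sent with this TTL arrives at the neighbor.

    hops is the number of router hops to the neighbor: 1 when directly
    connected, 2 when one router sits in between, and so on.
    """
    for _ in range(hops - 1):   # only the routers in between forward the packet
        ttl -= 1
        if ttl <= 0:
            return False        # dropped, an ICMP time exceeded goes back to the sender
    return ttl >= 1


print(reaches_neighbor(ttl=1, hops=1))  # True: default eBGP, directly connected
print(reaches_neighbor(ttl=1, hops=2))  # False: R2 in the middle drops the packet
print(reaches_neighbor(ttl=2, hops=2))  # True: a TTL of 2 survives one router in between
```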
Now that you understand how eBGP multihop works, let’s take a look at a more useful
scenario:
Above we have two routers…R1 and R2. They are directly connected but we have two
links in between them and we would like to use these for load balancing. Instead of
using the IP addresses on these FastEthernet interfaces for the eBGP neighbor
adjacency we will use the IP addresses on the loopback interfaces for this. Let’s take a
look at the configuration:
On each router we will configure two static routes, this allows us to use load balancing
to reach the loopback interfaces. Now we can configure eBGP:
R1(config)#router bgp 1
R1(config-router)#neighbor 2.2.2.2 remote-as 2
R1(config-router)#neighbor 2.2.2.2 update-source loopback 0
R1(config-router)#neighbor 2.2.2.2 ebgp-multihop 2
R2(config)#router bgp 2
R2(config-router)#neighbor 1.1.1.1 remote-as 1
R2(config-router)#neighbor 1.1.1.1 update-source loopback 0
R2(config-router)#neighbor 1.1.1.1 ebgp-multihop 2
Besides configuring the TTL to 2 with the ebgp-multihop command we also have to use
the update-source command to tell the routers to use the IP address on their loopback
interface as the source IP address for the eBGP neighbor adjacency. After a few
seconds, these routers will become neighbors:
R1#
%BGP-5-ADJCHANGE: neighbor 2.2.2.2 Up
R2#
%BGP-5-ADJCHANGE: neighbor 1.1.1.1 Up
Thanks to our static routes, we will use load balancing between the two routers:
In this tutorial we’ll take a look at IBGP (Internal BGP). Students who are new to BGP
often wonder why we have “external” and “internal” BGP. I’m not going to show you just
a couple of quick commands but we’ll take a close look at IBGP and its configuration.
Let’s start with an example topology and I’ll explain a couple of things:
Above you see 3 autonomous systems and 5 routers. When AS1 wants to reach AS3
we have to cross AS2, this makes AS2 our transit AS. This is a typical scenario where
AS1 and AS3 are customers and AS2 is the ISP.
In our scenario AS1 has a loopback interface with network 1.1.1.0 /24 and AS3 wants to
reach this network. This means we’ll have to advertise this network through BGP.
Here’s what it looks like:
So what is going on here? Let me explain it step-by-step:
1. We need EBGP between AS1 and AS2 because these are two different autonomous systems.
This allows us to advertise a prefix on R1 in BGP so that AS2 can learn it.
2. We also need EBGP between AS2 and AS3 so that R5 can learn prefixes through BGP.
3. We need to get the prefix that R2 learned from R1 somehow to R5. We do this by configuring
IBGP between R2 and R4, this allows R4 to advertise it to R5.
So that’s the first reason why we need IBGP…so you can advertise a prefix from one
autonomous system to another. You might have a few questions after reading this:
1. Why don’t we use OSPF (or EIGRP) on AS2 instead and redistribute the prefix on R2 from BGP
into OSPF and on R4 from OSPF back into BGP?
2. Doesn’t IBGP have to be directly connected?
3. How are R2 and R4 able to reach each other through IBGP if we don’t have any routing protocol
within AS2?
4. What about R3? Do we need IBGP?
These are some of the questions I get all the time from students who are learning BGP,
here are the answers:
1. Technically this is possible…we can run OSPF (or EIGRP) within AS2 and use redistribution
between BGP and OSPF. In my example R1 will only have a single prefix so it’s no problem but
what if R1 had a full Internet routing table (over 500,000 prefixes since 2014)? IGPs like OSPF
or EIGRP are not able to handle that many prefixes so you’ll need BGP for this.
2. IBGP does not have to be directly connected, this might be a little confusing when you only
know about OSPF or EIGRP since they always form adjacencies on directly connected links.
3. They are not! This is why we need an IGP within the AS. Since R2 and R4 are not directly
connected we’ll configure an IGP so that they can reach each other.
4. I’ll give you the answer to this question in a bit…I want to show you what will go wrong if we
Configuration
First we’ll configure R1 and R2. I am also advertising a prefix (on a loopback interface)
in BGP:
That’s easy enough, just a few commands. Our next step will be to configure IBGP
between R2 and R4…what IP addresses are we going to use for this? Let’s look at our
options:
I can use any of these IP addresses but we need connectivity. That’s why we need an
IGP like we talked about earlier. So which IP addresses will we select? In this particular
scenario it really doesn’t matter since there is only 1 path between R2 and R4. What if
we had multiple paths between R2 and R4?
R2(config)#interface loopback 0
R2(config-if)#ip address 2.2.2.2 255.255.255.0
R4(config)#interface loopback 0
R4(config-if)#ip address 4.4.4.4 255.255.255.0
That takes care of the loopback interfaces, now we can enable OSPF:
R2(config)#router ospf 1
R2(config-router)#network 192.168.23.0 0.0.0.255 area 0
R2(config-router)#network 2.2.2.0 0.0.0.255 area 0
R3(config)#router ospf 1
R3(config-router)#network 192.168.23.0 0.0.0.255 area 0
R3(config-router)#network 192.168.34.0 0.0.0.255 area 0
R4(config)#router ospf 1
R4(config-router)#network 192.168.34.0 0.0.0.255 area 0
R4(config-router)#network 4.4.4.0 0.0.0.255 area 0
Excellent, R2 and R4 will now be able to reach each other’s loopback interfaces. It’s not
a bad idea to test this though:
Alright we are now prepared for IBGP between R2 and R4. Here’s what it looks like:
R2(config)#router bgp 2
R2(config-router)#neighbor 4.4.4.4 remote-as 2
R2(config-router)#neighbor 4.4.4.4 update-source loopback 0
R4(config)#router bgp 2
R4(config-router)#neighbor 2.2.2.2 remote-as 2
R4(config-router)#neighbor 2.2.2.2 update-source loopback 0
This takes care of our IBGP session. Note that we have to use the update-source
command to specify that we will use the loopback interfaces as the source for
the IBGP session.
Last but not least, let’s configure EBGP between R4 and R5:
R4(config)#router bgp 2
R4(config-router)#neighbor 192.168.45.5 remote-as 3
R5(config)#router bgp 3
R5(config-router)#neighbor 192.168.45.4 remote-as 2
Great, that takes care of that. Whenever you configure BGP you will see a message on
the console that shows you that the neighbor adjacency has been established. You can
also check it with the show ip bgp summary command.
Verification
If everything went OK, all routers should have learned about the 1.1.1.0 /24 prefix that I
advertised on R1. Let’s see if that is true:
R1#show ip bgp
BGP table version is 2, local router ID is 1.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
You can see that it is in the BGP table. This means that I successfully used the network
command to advertise into BGP. The next hop is 0.0.0.0 since it originated on this
router. If you don’t see anything here then normally there are two reasons for this:
You can’t advertise something in BGP that is not in your routing table, make sure the interface is
up/up.
You typed an incorrect subnet mask when you used the network command (has to be exact
match!).
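Both reasons come down to the same check: the prefix and mask in the network command must match an entry in the routing table exactly. Here is a minimal Python sketch of that rule. The function name is made up, and the routing table for R1 is an assumption based on this topology.

```python
import ipaddress


def can_advertise(network, mask, routing_table):
    """True if the network statement exactly matches a routing table entry."""
    wanted = ipaddress.ip_network(f"{network}/{mask}")
    return wanted in {ipaddress.ip_network(route) for route in routing_table}


# Assumed routing table of R1: its loopback and the link to R2.
r1_table = ["1.1.1.0/24", "192.168.12.0/24"]

print(can_advertise("1.1.1.0", "255.255.255.0", r1_table))    # True: exact match
print(can_advertise("1.1.1.0", "255.255.255.128", r1_table))  # False: wrong mask
print(can_advertise("3.3.3.0", "255.255.255.0", r1_table))    # False: not in the table
```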
Let’s see what R2 thinks about this:
R2#show ip bgp
BGP table version is 2, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
That’s looking good too. R2 knows about our prefix, you can see that the next hop is the
IP address of R1. If you take a closer look you can see the > symbol in front of the
prefix, this means that the router selected this entry as the best one and that it installed
it in the routing table. Let’s check R4, it should receive this information from R2:
R4#show ip bgp
BGP table version is 1, local router ID is 4.4.4.4
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
R4 learned about the prefix but there’s something going on here…there is no > symbol
before the prefix so R4 didn’t install this in the routing table. Can you tell why this is
happening? Take a close look at the next hop…I’ll give you the answer in a sec, let’s
check R5 first:
R5#show ip bgp
Does R4 have any idea how to reach the next hop? BGP doesn’t change the next hop
IP address by default so this can cause some issues. Let’s verify if R4 knows how to
reach the next hop:
R4#show ip route 192.168.12.1
% Network not in table
No next hop, so we can’t install the prefix from BGP into the routing table…how are we
going to fix this? As always there are multiple options:
R2(config)#router bgp 2
R2(config-router)#neighbor 4.4.4.4 next-hop-self
R4(config)#router bgp 2
R4(config-router)#neighbor 2.2.2.2 next-hop-self
I’m doing this on both R2 and R4. For this scenario I don’t have to do it on R4, but if I
advertised something on R5 then R2 would have the same problem as R4. Take another look
at R4 to see the changes:
R4#show ip bgp
BGP table version is 2, local router ID is 4.4.4.4
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
Excellent…two important changes here. First of all you see the > symbol which means
R4 was able to install this prefix in the routing table. Secondly, the next hop IP address
has been changed to something R4 knows (the loopback interface of R2).
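The check that R4 performs on the next hop can be sketched in Python. The routing table below is an assumption: the connected links of R4 plus what it would learn through OSPF in this topology, with no default route. The function name is hypothetical.

```python
import ipaddress


def next_hop_reachable(next_hop, routing_table):
    """True if any route in the table covers the BGP next hop address."""
    hop = ipaddress.ip_address(next_hop)
    return any(hop in ipaddress.ip_network(prefix) for prefix in routing_table)


# Assumed routing table of R4 (connected links and OSPF routes).
r4_table = ["192.168.34.0/24", "192.168.45.0/24", "192.168.23.0/24", "2.2.2.2/32"]

print(next_hop_reachable("192.168.12.1", r4_table))  # False: original next hop (R1)
print(next_hop_reachable("2.2.2.2", r4_table))       # True: after next-hop-self on R2
```

A prefix whose next hop fails this check stays in the BGP table but is not selected as best, which is why the > symbol was missing earlier.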
Since R4 is now able to install it in the routing table, it can advertise the prefix to R5:
R5#show ip bgp
BGP table version is 2, local router ID is 5.5.5.5
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
R5 has learned about the prefix…so far so good, you can see that it’s in the routing
table:
That’s looking good. So are we done? Is there connectivity? Let’s find out:
R5#ping 1.1.1.1
Uh-oh…something went wrong. This is often a very frustrating moment for many BGP
students: they see something in the routing table but it doesn’t work. What is going on
here?
Let’s do a quick trace from R5 to see how far we can get to R1:
R5#traceroute 1.1.1.1
So our IP packet reaches R4 but after that it went somewhere into oblivion. R4 is not
the problem so we’ll have to check the next device in the path towards R1, that’s R3.
R3 is an interesting router since it doesn’t run BGP, only OSPF. Let’s check R3:
There’s our problem, R3 receives an IP packet with destination 1.1.1.1 but has no clue
where to send it so it will be dropped. How do we fix this?
Once again, you could redistribute BGP into OSPF but that’s a bad idea…1 prefix could
work but an entire internet routing table…not gonna happen!
This is why you need IBGP on all your routers in your transit AS. We need to configure
IBGP on R3 so it learns about our 1.1.1.0 /24 prefix and it will know how to reach the
destination.
Just like R2 and R4, I’ll use a loopback interface on R3 as the source of our IBGP
session.
I will configure IBGP between R2/R3 and R3/R4. Let’s create a loopback, advertise it in
OSPF and configure BGP:
R3(config)#interface loopback 0
R3(config-if)#ip address 3.3.3.3 255.255.255.0
R3(config)#router ospf 1
R3(config-router)#network 3.3.3.0 0.0.0.255 area 0
R3(config)#router bgp 2
R3(config-router)#neighbor 2.2.2.2 remote-as 2
R3(config-router)#neighbor 2.2.2.2 update-source loopback 0
R3(config-router)#neighbor 4.4.4.4 remote-as 2
R3(config-router)#neighbor 4.4.4.4 update-source loopback 0
That takes care of R3, now we’ll configure R2 and R4 to peer with R3:
R2(config)#router bgp 2
R2(config-router)#neighbor 3.3.3.3 remote-as 2
R2(config-router)#neighbor 3.3.3.3 update-source loopback 0
R2(config-router)#neighbor 3.3.3.3 next-hop-self
R4(config)#router bgp 2
R4(config-router)#neighbor 3.3.3.3 remote-as 2
R4(config-router)#neighbor 3.3.3.3 update-source loopback 0
R4(config-router)#neighbor 3.3.3.3 next-hop-self
This will establish IBGP between R2/R3 and R3/R4. Take a look at the BGP table of R3:
R3#show ip bgp
BGP table version is 2, local router ID is 3.3.3.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
Very nice…R3 now knows how to reach the 1.1.1.0 /24 network so it’s no longer the
problem. Can R5 finally reach R1? Let’s find out:
R5#ping 1.1.1.1
It still doesn’t work, this is where the frustration turns into a BGP hate rage (just kidding
hehe). I’ll show you what the problem is here…
It’s a good idea to check some of the routers that are closer to R1, see if they are able
to ping 1.1.1.1. Let’s start with R2:
R2#ping 1.1.1.1
R3#ping 1.1.1.1
R1#show ip route
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route
This is all that R1 has in its routing table. What happens is that R1 receives an IP
packet from R3 that looks like this:
R1#debug ip packet
IP packet debugging is on
This will show us what happens when R1 receives the IP packet. Don’t do this on a
production router as it will produce way too much debug information:
R1#
IP: s=1.1.1.1 (local), d=192.168.23.3, len 100, unroutable
R1 says it’s unroutable, the destination is unknown. To fix this problem we have to
advertise some additional networks. I don’t really care about R3 being able to reach R1
but I do want R5 to reach R1.
What we’ll do is advertise the 192.168.45.0 /24 prefix into BGP, we can do this on R4 or
R5:
R5(config)#router bgp 3
R5(config-router)#network 192.168.45.0 mask 255.255.255.0
R1#show ip bgp
BGP table version is 3, local router ID is 1.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
R5#ping 1.1.1.1
Finally! It’s working! If you also want to ping R1 from any of the other routers then you
need to make sure R1 knows where to send the return traffic.
Are we done now? Almost…there’s one more thing I want to teach you about IBGP neighbor
adjacencies…
IBGP Neighbor Adjacencies
Right now our routers within AS2 are configured like this:
This is called full-mesh IBGP. All routers within AS 2 are neighbors with each other. Do
we really need the IBGP peering between R2 and R4? Let’s find out what happens
when I remove it…
R2(config)#router bgp 2
R2(config-router)#no neighbor 4.4.4.4
R4(config)#router bgp 2
R4(config-router)#no neighbor 2.2.2.2
R3#show ip bgp
BGP table version is 3, local router ID is 3.3.3.3
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
R3 learned about 1.1.1.0 /24 from R2 and 192.168.45.0 /24 from R4. This is good,
these are prefixes that we advertised before. Now look at R2:
R2#show ip bgp
BGP table version is 4, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
R2 only knows about 1.1.1.0 /24, it didn’t learn about 192.168.45.0 /24 from R3. What
about R4?
R4#show ip bgp
BGP table version is 5, local router ID is 4.4.4.4
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
R4 only learned about 192.168.45.0 /24 from R5, we don’t see 1.1.1.0 /24 here.
The problem here is that IBGP does not advertise prefixes from one IBGP neighbor to
another IBGP neighbor. This is called BGP split horizon.
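To make the rule a bit more tangible, here is a small Python sketch that floods the two prefixes from this lesson through the topology. It is only a toy model: the function name `propagate`, the way peerings are written down and the "first copy that arrives wins" shortcut are my own assumptions, real BGP does a lot more than this.

```python
from collections import deque

# AS numbers as used in this lesson: R1 in AS 1, R2/R3/R4 in AS 2, R5 in AS 3.
AS_OF = {"R1": 1, "R2": 2, "R3": 2, "R4": 2, "R5": 3}


def propagate(peerings, originated):
    """Return {router: set of prefixes} after all updates have been flooded.

    peerings:   list of (router_a, router_b) BGP sessions (hypothetical format)
    originated: {router: [prefixes injected with the network command]}
    """
    neighbors = {router: set() for router in AS_OF}
    for a, b in peerings:
        neighbors[a].add(b)
        neighbors[b].add(a)

    tables = {router: set() for router in AS_OF}
    # Queue entries: (router, prefix, peer it was learned from or None)
    queue = deque(
        (router, prefix, None)
        for router, prefixes in originated.items()
        for prefix in prefixes
    )
    while queue:
        router, prefix, learned_from = queue.popleft()
        if prefix in tables[router]:
            continue  # simplification: the first copy that arrives wins
        tables[router].add(prefix)
        via_ibgp = learned_from is not None and AS_OF[learned_from] == AS_OF[router]
        for peer in sorted(neighbors[router]):
            if peer == learned_from:
                continue
            # BGP split horizon: a prefix learned from an IBGP neighbor
            # is never passed on to another IBGP neighbor.
            if via_ibgp and AS_OF[peer] == AS_OF[router]:
                continue
            queue.append((peer, prefix, router))
    return tables


ORIGINATED = {"R1": ["1.1.1.0/24"], "R5": ["192.168.45.0/24"]}
CHAIN = [("R1", "R2"), ("R2", "R3"), ("R3", "R4"), ("R4", "R5")]

partial = propagate(CHAIN, ORIGINATED)                 # R2-R4 peering removed
full = propagate(CHAIN + [("R2", "R4")], ORIGINATED)   # full-mesh IBGP in AS 2

for router in sorted(partial):
    print(router, "partial:", sorted(partial[router]), "full:", sorted(full[router]))
```

Without the R2/R4 peering the sketch leaves R2 with only 1.1.1.0 /24 and R4 with only 192.168.45.0 /24, while R3 has both, which is the same picture as in the BGP tables above.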
There is a good reason why IBGP works like this…
Between different ASes, BGP uses the AS_PATH attribute to avoid routing loops. A
prefix will not be accepted by a BGP router if it sees its own AS number in it…plain and
simple. However, within the autonomous system the AS number does not change so we
can’t use this loop prevention mechanism.
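The AS_PATH check itself is simple enough to sketch in a few lines of Python. The function names below are hypothetical, and the sketch ignores everything else a router looks at when it receives an update:

```python
def advertise_to_ebgp_peer(local_as, as_path):
    """Prepend our own AS number before sending the prefix to an EBGP neighbor."""
    return [local_as] + list(as_path)


def accept_update(local_as, as_path):
    """Reject the update when our own AS number is already in the AS path."""
    return local_as not in as_path


# A prefix that originates in AS 1 and travels through AS 2 and AS 3.
path_seen_by_as2 = advertise_to_ebgp_peer(1, [])
path_seen_by_as3 = advertise_to_ebgp_peer(2, path_seen_by_as2)
path_back_to_as1 = advertise_to_ebgp_peer(3, path_seen_by_as3)

print(path_back_to_as1, accept_update(1, path_back_to_as1))
```

Inside a single AS nothing is prepended, so every IBGP router would see the same AS path and this check has nothing to work with.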
R1 could receive an update about a prefix that it originated itself…not a good idea. With
BGP split horizon this can’t occur:
R2 will never forward the IBGP prefixes that it learns from R1 towards R3. This means
that all your IBGP routers have to become neighbors with all other IBGP routers
(full-mesh!). If you have a lot of IBGP routers then this can be a lot of work. The number of
required adjacencies is:
X*(X-1)/2
So with 10 IBGP routers you will need to configure 45 IBGP neighbor adjacencies.
There are two techniques to reduce this number: route reflectors and confederations.
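If you want to see how quickly the full-mesh requirement grows, the formula is easy to check in Python. The function name is just something I made up for this sketch:

```python
def full_mesh_sessions(routers):
    """Number of IBGP adjacencies needed for a full mesh: X * (X - 1) / 2."""
    return routers * (routers - 1) // 2


for count in (3, 4, 10, 50):
    print(f"{count} IBGP routers -> {full_mesh_sessions(count)} adjacencies")
```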
All prefixes that BGP learns are stored in the BGP table. In this lesson we’ll take a look
at this table and you will learn how to read it. We’ll start with a simple topology and
finish with a quick peek at a full Internet routing table.
Configuration
Here’s the topology we will use. 4 routers, each in a different autonomous system:
neighbor 192.168.13.3 remote-as 3
no auto-summary
R2#show run | section bgp
router bgp 2
no synchronization
bgp log-neighbor-changes
neighbor 192.168.12.1 remote-as 1
neighbor 192.168.24.4 remote-as 4
no auto-summary
R3#show run | section bgp
router bgp 3
no synchronization
bgp log-neighbor-changes
neighbor 192.168.13.1 remote-as 1
neighbor 192.168.34.4 remote-as 4
no auto-summary
R4#show run | section bgp
router bgp 4
no synchronization
bgp log-neighbor-changes
network 4.4.4.4 mask 255.255.255.255
neighbor 192.168.24.2 remote-as 2
neighbor 192.168.34.3 remote-as 3
no auto-summary
The BGP configurations are pretty straight-forward, we are using eBGP here. Note that
R4 has advertised a network (loopback interface) in BGP.
R4#show ip bgp
BGP table version is 2, local router ID is 192.168.34.4
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
Ok so what do we see here? Let’s start with the items I highlighted in red first. This
router has network 4.4.4.4/32 in its BGP table and in front of the network there’s
the *> symbol:
The * means that this is a valid route and that BGP is able to use it.
The > means that this entry has been selected as the best path.
The next hop is 0.0.0.0. The next hop of 0.0.0.0 means that this network originated on
this router, that makes sense since I used the network command on R4 to advertise this
network into BGP.
Further to the right you see metric, local preference and weight. These are the BGP
attributes that are used to select the best path.
Path will show the AS path, there’s nothing there since this network was advertised in
BGP on this router. On the other routers you’ll see something here.
The ‘i’ is the origin code and indicates that this network was advertised into BGP using
the network command, the table says it refers to IGP but it doesn’t have anything to do
with “interior gateway protocols”. When you redistribute something into BGP it will show
up with the ? symbol. You will never see the ‘e’ symbol, this refers to EGP (Exterior
Gateway Protocol) which is the predecessor of BGP.
Some of the other things you see here are the BGP table version, every time the best
path changes this number will increase. You can see the BGP router ID of this router
and there are some other status codes:
suppressed: BGP knows the network but won’t advertise it, this can occur when the network is
part of a summary.
damped: BGP doesn’t advertise this network because it was flapping too often (network
appears, disappears, appears, etc.) so it got a penalty.
history: BGP learned this network but doesn’t have a valid route at the moment.
RIB-failure: BGP learned this network but didn’t install it in the routing table. This occurs when
another routing protocol with a lower administrative distance also learned it.
stale: this is used for non-stop forwarding, this entry has to be refreshed when the remote BGP
neighbor has returned.
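When you are staring at a long BGP table it can help to translate the status column automatically. Here is a minimal Python sketch that uses the legend from the show ip bgp output. The function name is my own, and it expects only the status column (for example `*>`), not the whole line:

```python
# Legend taken from the "Status codes" line of show ip bgp.
STATUS_CODES = {
    "s": "suppressed",
    "d": "damped",
    "h": "history",
    "*": "valid",
    ">": "best",
    "i": "internal",
    "r": "RIB-failure",
    "S": "Stale",
}


def decode_status(flags):
    """Translate the status column of a BGP table entry into words."""
    return [STATUS_CODES[char] for char in flags if char in STATUS_CODES]


print(decode_status("*>"))   # the entry for 4.4.4.4/32 on R4
print(decode_status("d"))
```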
Let’s look at the BGP tables of the other routers, we’ll continue with R2:
R2#show ip bgp
BGP table version is 2, local router ID is 192.168.24.2
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
The output of R2 is similar to what we have seen on R4 but there are two important
differences. The first one is the next hop, R2 learned about this network from
192.168.24.4. The second thing is the AS path, it’s showing AS 4.
R1#show ip bgp
BGP table version is 2, local router ID is 192.168.13.1
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
This output is even more interesting, this router has learned about our network from R2
and R3. Both entries are valid, they have * in front of them. BGP selected the path
through R2 as the best path, you can see the > in front of this entry. You can also see
the AS paths to reach this network.
If you want you can take a closer look at one of the entries in the BGP table, this is
useful when you have a lot of networks:
Origin IGP, localpref 100, valid, external, best
The information you see above tells us that we have two paths for this network, the
second one has been selected as the best path. Last but not least, let’s take a look at
R3:
R3#show ip bgp
BGP table version is 2, local router ID is 192.168.34.3
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
Above you can see that R3 has two entries, it can use R1 or R4 to reach 4.4.4.4/32.
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
192.65.89.161 4 7474 1318568 910036 244799121 0 0 41w2d 21006
202.139.124.130 4 7474 1306375 910020 244799121 0 0 41w2d 21024
202.160.242.71 4 7473 0 0 1 0 0 never Active
203.13.132.29 4 7474 2107424 910019 244799121 0 0 41w2d 20649
203.13.132.35 4 7474 59948648 910162 244799121 0 0 41w2d 538696
203.13.132.37 4 7474 1571073 601855 244799121 0 0 27w2d 20651
203.13.132.41 4 7474 2254818 910043 244799121 0 0 41w2d 20649
203.13.132.47 4 7474 46416 19778 244799121 0 0 6d07h 20649
203.13.132.49 4 7474 2260238 910030 244799121 0 0 41w2d 20649
203.13.132.51 4 7474 2296993 910146 244799121 0 0 41w2d 20649
203.13.132.53 4 7474 59909540 910088 244799121 0 0 41w2d 538696
203.202.143.3 4 7474 0 0 1 0 0 never Idle (Admin)
203.202.143.33 4 7474 34662511 910049 244799121 0 0 41w2d 539054
203.202.143.34 4 7474 33523616 910040 244799121 0 0 41w2d 539065
This router has over 500,000 networks and knows about more than 2,000,000 paths for
these networks. It is connected to 14 neighbors (2 are down) and here’s what the BGP
table looks like:
route-views.optus.net.au>show ip bgp
BGP table version is 244797821, local router ID is 203.202.125.6
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale, m multipath, b backup-path, x
best-external
Origin codes: i - IGP, e - EGP, ? - incomplete
* 1.0.0.0/24 203.13.132.47 0 7474 15169 i
* 203.13.132.37 0 7474 15169 i
* 192.65.89.161 1 0 7474 15169 i
* 203.202.143.34 0 7474 15169 i
*> 203.202.143.33 0 7474 15169 i
* 203.13.132.49 0 7474 15169 i
* 202.139.124.130 1 0 7474 15169 i
* 203.13.132.51 0 7474 15169 i
* 203.13.132.53 0 7474 15169 i
* 203.13.132.41 0 7474 15169 i
* 203.13.132.35 0 7474 15169 i
* 203.13.132.29 0 7474 15169 i
* 1.0.4.0/24 203.13.132.47 10 0 7474 4826 56203 i
* 203.13.132.37 10 0 7474 4826 56203 i
* 203.202.143.34 0 7474 4826 56203 i
*> 203.202.143.33 0 7474 4826 56203 i
* 203.13.132.49 10 0 7474 4826 56203 i
* 192.65.89.161 1 0 7474 4826 56203 i
* 203.13.132.51 1 0 7474 4826 56203 i
* 202.139.124.130 1 0 7474 4826 56203 i
* 203.13.132.53 1 0 7474 4826 56203 i
* 203.13.132.41 1 0 7474 4826 56203 i
* 203.13.132.35 1 0 7474 4826 56203 i
* 203.13.132.29 1 0 7474 4826 56203 i
You can keep pressing enter, this BGP table is very long. I’m only showing the first two
networks. As you can see this router has network 1.0.0.0/24 in its BGP table and knows
about 12 different paths to get there. It decided to use 203.202.143.33 as the next hop.
It also learned about network 1.0.4.0/24 and is using 203.202.143.33 as the next hop.
This router probably knows about a couple of networks with issues, a fun way to find
these is by searching in the BGP table and excluding everything that starts with a *. This
removes all the valid networks from the BGP table:
I’m using the exclude command to filter out every line that has a * in it. I have to use the
\ symbol in front of it since the * has a special meaning in regular expressions.
Above you can see two networks with issues. The first one (2.93.235.0/24) has a d in
front of it which means it’s dampened. The second network (31.131.7.0/24) has an h in
front of it which indicates that it’s a history entry.
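The same escaping idea can be tried out in Python. Keep in mind that this is Python's `re` module and not the regular expression engine of the router, and that the lines below are shortened to the status code and the prefix (the rest of each line is left out on purpose), so treat it as an illustration only:

```python
import re

# Shortened BGP table lines: status code(s) followed by the prefix.
lines = [
    "*> 1.0.0.0/24",
    "*  1.0.4.0/24",
    "d  2.93.235.0/24",
    "h  31.131.7.0/24",
]

# The backslash makes the * a literal character instead of a repetition operator.
literal_star = re.compile(r"\*")


def exclude(pattern, table_lines):
    """Keep only the lines that do NOT match the pattern."""
    return [line for line in table_lines if not pattern.search(line)]


for line in exclude(literal_star, lines):
    print(line)
```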
In this lesson we’ll take a look at how you can advertise networks in BGP. There are two
methods to do this:
Network command
Redistribution
Just like our IGPs we can use the network command to advertise something or we can
redistribute networks into BGP. There’s one big difference though, the network
command for BGP behaves differently.
When you use any of the IGPs (RIP, OSPF or EIGRP) then the network command is
used to activate the IGP on all interfaces that fall within the range of the network
command.
BGP doesn’t care about interfaces, it doesn’t even look at them. When we use the
network command in BGP then BGP will only look at the routing table. When it finds the
network that matches the network command, it will install it in the BGP table.
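Here is a rough Python sketch of that lookup. The function name and the prefixes in the example routing table are made up for illustration, and the sketch assumes the default behavior where BGP wants an exact match on both network and mask:

```python
import ipaddress


def network_statement_matches(statement, routing_table):
    """Return True when the routing table holds exactly this network and mask."""
    wanted = ipaddress.ip_network(statement)
    return any(wanted == ipaddress.ip_network(route) for route in routing_table)


# Illustrative routing table: one /24 and one /32.
routing_table = ["10.1.1.0/24", "10.2.2.2/32"]

print(network_statement_matches("10.1.1.0/24", routing_table))  # exact match
print(network_statement_matches("10.2.2.0/24", routing_table))  # only a /32 exists
print(network_statement_matches("10.2.2.2/32", routing_table))  # exact match
```

Notice that interfaces don't show up anywhere in this check, only the routing table does.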
Let me show you some examples to explain what I’m talking about. We will use the
following two routers:
R1 and R2 are in different autonomous systems so we use eBGP. Here is the BGP
configuration:
R1#show running-config | section bgp
router bgp 1
bgp log-neighbor-changes
neighbor 192.168.12.2 remote-as 2
R2#show running-config | section bgp
router bgp 2
bgp log-neighbor-changes
neighbor 192.168.12.1 remote-as 1
Nothing special here, just plain eBGP between R1 and R2. Let’s advertise some
networks in BGP…
Network Command
Let’s create a loopback interface with a network and advertise it in BGP:
R1(config)#interface loopback 1
R1(config-if)#ip address 1.1.1.1 255.255.255.0
R1(config)#router bgp 1
R1(config-router)#network 1.1.1.0 mask 255.255.255.0
Above we have created a loopback interface with network 1.1.1.0 /24, this is what we
will advertise in BGP. Since we created a loopback interface, this network will be
directly connected for R1:
Since it’s in the routing table, BGP will be able to install this network in the BGP table:
R1#show ip bgp
BGP table version is 2, local router ID is 192.168.12.1
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale, m multipath, b backup-path, x
best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
That’s all there is to it. Just use the network command to put the networks you want in
the BGP table. One thing you have to be aware of is that you have to use the exact
network and subnet mask for the network command. Let me give you an example:
R1(config)#interface loopback 2
R1(config-if)#ip address 11.11.11.11 255.255.255.255
R1(config)#router bgp 1
R1(config-router)#network 11.11.11.0 mask 255.255.255.0
I created a loopback interface with address 11.11.11.11 /32, but the network command
tells BGP to advertise 11.11.11.0 /24. This network will never be placed in the BGP
table since the subnet mask doesn’t match:
Be aware of this. Make sure you type the exact network address and subnet mask when
advertising something in BGP. Let’s fix this:
R1(config)#router bgp 1
R1(config-router)#no network 11.11.11.0 mask 255.255.255.0
R1(config-router)#network 11.11.11.11 mask 255.255.255.255
With the correct network command, BGP will be able to install this network in the
BGP table:
R1#show ip bgp 11.11.11.11
BGP routing table entry for 11.11.11.11/32, version 5
Paths: (1 available, best #1, table default)
Advertised to update-groups:
1
Local
0.0.0.0 from 0.0.0.0 (192.168.12.1)
Origin IGP, metric 0, localpref 100, weight 32768, valid,
sourced, local, best
And because R1 has it in its BGP table, R2 will be able to learn it:
Alright so far so good. What if we want to advertise a network that we don’t have? Let’s
say that I want to advertise network 1.0.0.0 /8 in BGP. We won’t be able to advertise
this network in BGP if it’s not in the routing table. To achieve this, we’ll put this network
in our routing table:
This can be done with a static route that points to the null interface, everything you
send to the null interface will be discarded. Using a static route like this is also called
a discard route.
Network 1.0.0.0 /8 is now in the routing table:
R1(config)#router bgp 1
R1(config-router)#network 1.0.0.0 mask 255.0.0.0
Take a look at the BGP table of R1 and R2:
R1 was able to install network 1.0.0.0 /8 in its BGP table and advertises it to R2.
Redistribution
Instead of using the network command we can also redistribute something into BGP.
To demonstrate this I will create a new loopback interface, advertise it in OSPF and
then redistribute it into BGP:
R1(config)#interface loopback 3
R1(config-if)#ip address 111.111.111.111 255.255.255.0
R1(config-if)#exit
R1(config)#router ospf 1
R1(config-router)#network 111.111.111.0 0.0.0.255 area 0
R1(config)#router bgp 1
R1(config-router)#redistribute ospf 1
*> 1.0.0.0 0.0.0.0 0 32768 i
*> 1.1.1.0/24 0.0.0.0 0 32768 i
*> 11.11.11.11/32 0.0.0.0 0 32768 i
*> 111.111.111.0/24 0.0.0.0 0 32768 ?
R2#show ip bgp | begin Network
Network Next Hop Metric LocPrf Weight Path
*> 1.0.0.0 192.168.12.1 0 0 1 i
*> 1.1.1.0/24 192.168.12.1 0 0 1 i
*> 11.11.11.11/32 192.168.12.1 0 0 1 i
*> 111.111.111.0/24 192.168.12.1 0 0 1 ?
There we go, R1 placed the network in its BGP table and was able to advertise it to R2.
One potential issue with iBGP is that it doesn’t change the next hop IP address.
Sometimes this can cause reachability issues. Let’s look at an example:
Once R1 learns about prefix 3.3.3.0 /24 then the next hop IP address will remain
192.168.23.3. When R1 doesn’t know how to reach this IP address then it will fail to
install 3.3.3.0 /24 in its routing table.
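The check that R1 performs can be sketched in Python like this. I'm assuming here that R1 only has its directly connected 192.168.12.0 /24 link towards R2 in the routing table, and the function name is my own:

```python
import ipaddress


def next_hop_reachable(next_hop, routing_table):
    """Return True when some route in the routing table covers the next hop."""
    address = ipaddress.ip_address(next_hop)
    return any(address in ipaddress.ip_network(route) for route in routing_table)


# Assumption: R1 only knows its directly connected link to R2.
r1_routes = ["192.168.12.0/24"]

print(next_hop_reachable("192.168.23.3", r1_routes))  # next hop as learned over iBGP
print(next_hop_reachable("192.168.12.2", r1_routes))  # R2's address on the shared link
```

The first lookup fails, which is why 3.3.3.0 /24 stays out of the routing table. Either R1 has to learn a route that covers 192.168.23.3 or the next hop has to change to something R1 can already reach.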
Let’s take a look at the configuration, I’ll show you two methods how we can deal with
this issue.
Configuration
Here’s the BGP configuration that we will use:
R1(config)#router bgp 12
R1(config-router)#neighbor 192.168.12.2 remote-as 12
R2(config)#router bgp 12
R2(config-router)#neighbor 192.168.12.1 remote-as 12
R2(config-router)#neighbor 192.168.23.3 remote-as 3
R3(config)#router bgp 3
R3(config-router)#neighbor 192.168.23.2 remote-as 12
R3(config-router)#network 3.3.3.0 mask 255.255.255.0
The configuration is pretty straight forward. We use iBGP between R1/R2 and eBGP
between R2/R3. On R3 we advertised 3.3.3.0 /24 in BGP. Let’s take a look at the BGP
tables:
R2#show ip bgp
BGP table version is 2, local router ID is 192.168.23.2
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
R2 has installed 3.3.3.0 /24 in its BGP table and it is a valid route, the next hop is
192.168.23.3. Let’s check R1:
R1#show ip bgp
BGP table version is 1, local router ID is 192.168.12.1
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
R1 learns the prefix but it’s unable to install it in the routing table:
The problem here is that the next hop IP address is 192.168.23.3. Does R1 have any
clue how to reach this address?
R1#show ip route
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter
area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type
2
E1 - OSPF external type 1, E2 - OSPF external type 2
i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-
IS level-2
ia - IS-IS inter area, * - candidate default, U - per-user
static route
o - ODR, P - periodic downloaded static route
R1 doesn’t know so it’s impossible to install 3.3.3.0 /24 in the routing table. How can we
fix this? I’ll show you two different methods.
Advertise Network
The first solution is simple, we can advertise the network in iBGP (or an IGP if you use
one) so that R1 is able to reach the next hop. Let’s advertise 192.168.23.0 /24 in BGP:
R2(config)#router bgp 12
R2(config-router)#network 192.168.23.0 mask 255.255.255.0
R1#show ip bgp
BGP table version is 3, local router ID is 192.168.12.1
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
R1 learns about 192.168.23.0 /24 so now it knows how to reach the next hop for 3.3.3.0
/24. It can now install this network in the routing table:
Next Hop Self
The second method is to let R2 change the next hop to its own IP address for everything it
advertises to R1:
R2(config)#router bgp 12
R2(config-router)#neighbor 192.168.12.1 next-hop-self
From now on, when R2 advertises something to R1 then it will include its own IP
address as the next hop. Let’s verify this:
R1#show ip bgp
BGP table version is 6, local router ID is 192.168.12.1
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
Above you can see that R1 learns 3.3.3.0 /24 with 192.168.12.2 as the next hop. Since
this is directly connected, we can use this information:
BGP Auto-Summary
In a previous lesson I explained how the BGP network command works. When we
enable auto-summary for BGP, the way the network command works changes slightly.
Normally when you advertise a network in BGP you have to type in the exact network
and subnet mask that you want to advertise or it won’t be placed in the BGP table.
With auto-summary enabled, you can advertise a classful network and you don’t have to
add the mask parameter. BGP will automatically advertise the classful network if you
have the classful network or a subnet of this network in your routing table. Let me give
you an example to explain what I’m talking about. I’ll use these two routers:
These routers are configured for eBGP, there’s a loopback interface on R1 with network
1.1.1.1 /32. Here’s the configuration:
The configuration is straight-forward, we only configured eBGP, no networks have been
advertised and auto-summary is disabled. Let’s see if we can advertise classful network
1.0.0.0/8:
R1(config)#router bgp 1
R1(config-router)#network 1.0.0.0
Note that I didn’t specify a subnet mask with the mask parameter. Take a look at the
BGP table now:
As expected there is nothing in the BGP table since we require the exact network and
subnet mask. Let’s enable auto-summary now so you can see the difference:
R1(config)#router bgp 1
R1(config-router)#auto-summary
After enabling auto-summary things will change. Take a look at the BGP table of R1:
R1 now has an entry for classful network 1.0.0.0/8. It was able to install this in its BGP
table because auto-summary is enabled and we have 1.1.1.1/32 in our routing table.
This network will also be advertised to R2:
R2#show ip bgp
BGP table version is 2, local router ID is 192.168.12.2
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale, m multipath, b backup-path, x
best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
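The difference that auto-summary makes for a network command without the mask parameter can be sketched in Python. The function names are hypothetical and the routing table is my assumption of what R1 has (its 1.1.1.1 /32 loopback and the link to R2):

```python
import ipaddress


def classful_network(address):
    """Return the class A, B or C network that an address belongs to."""
    first_octet = int(str(address).split(".")[0])
    if first_octet < 128:
        prefix_length = 8     # class A
    elif first_octet < 192:
        prefix_length = 16    # class B
    elif first_octet < 224:
        prefix_length = 24    # class C
    else:
        raise ValueError("class D/E addresses have no classful network")
    return ipaddress.ip_network(f"{address}/{prefix_length}", strict=False)


def network_without_mask_is_advertised(network, routing_table, auto_summary):
    """Model 'network x.x.x.x' (no mask) with auto-summary on or off."""
    classful = classful_network(network)
    routes = [ipaddress.ip_network(route) for route in routing_table]
    if auto_summary:
        # The classful network itself or any subnet of it is good enough.
        return any(route.subnet_of(classful) for route in routes)
    # Without auto-summary the exact classful network has to be present.
    return classful in routes


r1_routes = ["1.1.1.1/32", "192.168.12.0/24"]  # assumed routing table of R1

print(network_without_mask_is_advertised("1.0.0.0", r1_routes, auto_summary=False))
print(network_without_mask_is_advertised("1.0.0.0", r1_routes, auto_summary=True))
```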
Just like OSPF or EIGRP, BGP establishes a neighbor adjacency with other BGP
routers before they exchange any routing information. Unlike other routing protocols
however, BGP does not use broadcast or multicast to “discover” other BGP neighbors.
1. Idle: This is the first state where BGP waits for a “start event”. The start event occurs when
someone configures a new BGP neighbor or when we reset an established BGP peering. After
the start event, BGP initializes some resources, resets the ConnectRetry timer and initiates a
TCP connection to the remote BGP neighbor. It will also start listening for a connection in case
the remote BGP neighbor tries to establish a connection. When successful, BGP moves to the
Connect state. When it fails, it will remain in the Idle state.
2. Connect: BGP is waiting for the TCP three-way handshake to complete. When it is successful, it
will continue to the OpenSent state. In case it fails, we continue to the Active state. If the
ConnectRetry timer expires then we will remain in this state. The ConnectRetry timer will be
reset and BGP will try a new TCP three-way handshake. If anything else happens (for example
resetting BGP) then we move back to the Idle state.
3. Active: BGP will try another TCP three-way handshake to establish a connection with the
remote BGP neighbor. If it is successful, it will move to the OpenSent state. If the ConnectRetry
timer expires then we move back to the Connect state. BGP will also keep listening for incoming
connections in case the remote BGP neighbor tries to establish a connection. Other events can
cause the router to go back to the Idle state (resetting BGP for example).
4. OpenSent: In this state BGP will be waiting for an Open message from the remote BGP
neighbor. The Open message will be checked for errors, if something is wrong (incorrect version
numbers, wrong AS number, etc.) then BGP will respond with a Notification message and jump
back to the Idle state. This is also the moment where BGP decides whether we use EBGP or
IBGP (since we check the AS number). If everything is OK then BGP starts sending keepalive
messages, resets its keepalive timer and moves to the OpenConfirm state. At this moment, the
hold time is negotiated (lowest value is picked) between the two BGP routers. In case the TCP
session fails, BGP will jump back to the Active state. When any other errors occur (expiration of
hold timer), BGP will send a notification message with the error code and jump back to the Idle
state. In case someone resets the BGP process, we also jump back to the Idle state.
5. OpenConfirm: BGP waits for a keepalive message from the remote BGP neighbor. When we
receive the keepalive, we can move to the established state and the neighbor adjacency will be
completed. When this occurs, it will reset the hold timer. If we receive a notification message
from the remote BGP neighbor then we fall back to the Idle state. BGP will keep sending
keepalive messages.
6. Established: The BGP neighbor adjacency is complete and the BGP routers will send update
packets to exchange routing information. Every time we receive a keepalive or update message,
the hold timer will be reset. In case we receive a notification message we will jump back to
the Idle state.
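The six states above can also be written down as a small transition table. This Python sketch is a simplification of what the list describes: the event names are invented for this example, and every event that isn't listed simply drops the session back to Idle:

```python
# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    ("Idle", "start"): "Connect",
    ("Connect", "tcp_established"): "OpenSent",
    ("Connect", "tcp_failed"): "Active",
    ("Connect", "connect_retry_expired"): "Connect",
    ("Active", "tcp_established"): "OpenSent",
    ("Active", "connect_retry_expired"): "Connect",
    ("OpenSent", "open_received_ok"): "OpenConfirm",
    ("OpenSent", "tcp_failed"): "Active",
    ("OpenConfirm", "keepalive_received"): "Established",
    ("Established", "keepalive_received"): "Established",
    ("Established", "update_received"): "Established",
}


def next_state(state, event):
    """Anything that is not listed (errors, notifications, resets) leads to Idle."""
    return TRANSITIONS.get((state, event), "Idle")


def run(events, state="Idle"):
    """Feed a list of events to the state machine and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state


happy_path = ["start", "tcp_established", "open_received_ok", "keepalive_received"]
print(run(happy_path))
print(run(["start", "tcp_failed", "connect_retry_expired"]))
```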
This whole process of becoming BGP neighbors can be visualized, which might be a bit
easier than just reading about it. The official name for a “diagram” that shows the
different states and how we can move from one state to another is an FSM (Finite State
Machine). For BGP, it looks like this:
Now you know about the different states, let’s take a look at some Cisco BGP routers to
see what it actually looks like on two routers. I’ll use the following topology for this:
Just two routers in two different autonomous systems. Before I configure BGP, let’s
enable a debug:
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 remote-as 2
R1#
BGP: 192.168.12.2 active went from Idle to Active
BGP: 192.168.12.2 open active, local address 192.168.12.1
BGP: 192.168.12.2 open failed: Connection refused by remote host
BGP: 192.168.12.2 Active open failed - tcb is not available, open
active delayed 9216ms (35000ms max, 60% jitter)
BGP: ses global 192.168.12.2 (0x4B43F3FC:0) act Reset (Active open
failed).
BGP: 192.168.12.2 active went from Active to Idle
BGP: nbr global 192.168.12.2 Active open failed - open timer
running
As soon as I configure BGP on R1 it will try to connect to R2. You can see the debug
says that the state moves from Idle to Active (it doesn’t show the Connect state in the
debug). When it fails, it falls back to the Idle state. Now let’s configure BGP on R2 as
well so we can see a successful progress through the states:
R2(config)#router bgp 2
R2(config-router)#neighbor 192.168.12.1 remote-as 1
Above you can see that the BGP state moves from Idle to Active and then to OpenSent.
Some Open messages are sent and received, the BGP routers are exchanging some of
their capabilities. From there we move to the OpenConfirm and Established state.
Finally you see the BGP neighbor as up. On R2 we see something similar:
R2#
BGP: 192.168.12.1 passive open to 192.168.12.2
BGP: 192.168.12.1 passive went from Idle to Connect
BGP: ses global 192.168.12.1 (0x4B269374:0) pas Setting open delay
timer to 60 seconds.
BGP: ses global 192.168.12.1 (0x4B269374:0) pas read request no-op
BGP: 192.168.12.1 passive rcv message type 1, length (excl. header)
34
BGP: ses global 192.168.12.1 (0x4B269374:0) pas Receive OPEN
BGP: 192.168.12.1 passive rcv OPEN, version 4, holdtime 180 seconds
BGP: 192.168.12.1 passive rcv OPEN w/ OPTION parameter len: 24
BGP: 192.168.12.1 passive rcvd OPEN w/ optional parameter type 2
(Capability) len 6
BGP: 192.168.12.1 passive OPEN has CAPABILITY code: 1, length 4
BGP: 192.168.12.1 passive OPEN has MP_EXT CAP for afi/safi: 1/1
BGP: 192.168.12.1 passive rcvd OPEN w/ optional parameter type 2
(Capability) len 2
BGP: 192.168.12.1 passive OPEN has CAPABILITY code: 128, length 0
BGP: 192.168.12.1 passive OPEN has ROUTE-REFRESH capability(old)
for all address-families
BGP: 192.168.12.1 passive rcvd OPEN w/ optional parameter type 2
(Capability) len 2
BGP: 192.168.12.1 passive OPEN has CAPABILITY code: 2, length 0
BGP: 192.168.12.1 passive OPEN has ROUTE-REFRESH capability(new)
for all address-families
BGP: 192.168.12.1 passive rcvd OPEN w/ optional parameter type 2
(Capability) len 6
BGP: 192.168.12.1 passive OPEN has CAPABILITY code: 65, length 4
BGP: 192.168.12.1 passive OPEN has 4-byte ASN CAP for: 1
BGP: nbr global 192.168.12.1 neighbor does not have IPv4 MDT
topology activated
BGP: 192.168.12.1 passive rcvd OPEN w/ remote AS 1, 4-byte remote
AS 1
BGP: ses global 192.168.12.1 (0x4B269374:0) pas Adding topology
IPv4 Unicast:base
BGP: ses global 192.168.12.1 (0x4B269374:0) pas Send OPEN
BGP: 192.168.12.1 passive went from Connect to OpenSent
BGP: 192.168.12.1 passive sending OPEN, version 4, my as: 2,
holdtime 180 seconds, ID C0A80C02
BGP: 192.168.12.1 passive went from OpenSent to OpenConfirm
BGP: 192.168.12.1 passive went from OpenConfirm to Established
BGP: ses global 192.168.12.1 (0x4B269374:1) pas Assigned ID
BGP: nbr global 192.168.12.1 Stop Active Open timer as all
topologies are allocated
BGP: ses global 192.168.12.1 (0x4B269374:1) Up
%BGP-5-ADJCHANGE: neighbor 192.168.12.1 Up
The output of these debug messages is nice and easy to read. If for some reason your
neighbor adjacency doesn’t appear, these debugs can be helpful to solve the problem.
BGP Messages
BGP uses a variety of messages for establishing the connection, exchanging routing
information, checking if the remote BGP neighbor is still there and/or notifying the
remote side if any errors occur.
Open Message
Update Message
Keepalive Message
Notification Message
All of these BGP messages use a fixed-size header, it includes a type field that
indicates what type of message it is.
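To give you an idea of what that header looks like on the wire, here is a small Python sketch that unpacks it. The layout (16-byte marker, 2-byte length, 1-byte type) and the type numbers come from RFC 4271, the function name and the returned dictionary are my own choices:

```python
import struct

MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}

# Network byte order: marker (16 bytes), length (2 bytes), type (1 byte).
HEADER_FORMAT = "!16sHB"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 19 bytes


def parse_header(data):
    """Decode the fixed-size header at the start of a BGP message."""
    marker, length, message_type = struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE])
    return {
        "marker_all_ones": marker == b"\xff" * 16,
        "length": length,
        "type": MESSAGE_TYPES.get(message_type, f"unknown ({message_type})"),
    }


# A keepalive is nothing more than the header itself.
keepalive = b"\xff" * 16 + struct.pack("!HB", HEADER_SIZE, 4)
print(parse_header(keepalive))
```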
To explain these BGP messages I will show you some Wireshark captures. I will use the
following topology for this:
Open Message
Once two BGP routers have completed a TCP 3-way handshake they will attempt
to establish a BGP session, this is done using open messages. In the open message you
will find some information about the BGP router, these have to be negotiated and
accepted by both routers before we can exchange any routing information. Here are
some of the items you will find in the open message:
Version: this includes the BGP version that the router is using. The current version of BGP is
version 4 which is described in RFC 4271. Two BGP routers will try to negotiate a compatible
version, when there is a mismatch then there will be no BGP session.
My AS: this includes the AS number of the BGP router, the routers will have to agree on the AS
number(s) and it also defines if they will be running iBGP or eBGP.
Hold Time: if BGP doesn’t receive any keepalive or update messages from the other side for the
duration of the hold time then it will declare the other side ‘dead’ and it will tear down the BGP
session. By default the hold time is set to 180 seconds on Cisco IOS routers, the keepalive
message is sent every 60 seconds. Each router proposes its own hold time and the lowest value
is used. If a router finds the proposed hold time unacceptable there won’t be a BGP session.
BGP Identifier: this is the local BGP router ID which is elected just like OSPF does:
o Use the router-ID that was configured manually with the bgp router-id command.
o Use the highest IP address on a loopback interface.
o Use the highest IP address on a physical interface.
Optional Parameters: here you will find some optional capabilities of the BGP router. This
field has been added so that new features could be added to BGP without having to create a new
version. Things you might find here are:
o support for MP-BGP (Multi Protocol BGP).
o support for Route Refresh.
o support for 4-octet AS numbers.
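The fixed part of the open message is easy to decode by hand. The Python sketch below uses the field sizes from RFC 4271 and, as sample input, the values that R2 sends in the debug output (version 4, AS 2, hold time 180, ID C0A80C02). The function names are hypothetical and the optional parameters are left out of the sample:

```python
import ipaddress
import struct

# version (1 byte), my AS (2), hold time (2), BGP identifier (4), opt. params length (1)
OPEN_FORMAT = "!BHH4sB"
OPEN_FIXED_SIZE = struct.calcsize(OPEN_FORMAT)  # 10 bytes


def parse_open(body):
    """Decode the fixed fields that follow the BGP header in an open message."""
    version, my_as, hold_time, identifier, opt_length = struct.unpack(
        OPEN_FORMAT, body[:OPEN_FIXED_SIZE]
    )
    return {
        "version": version,
        "my_as": my_as,
        "hold_time": hold_time,
        "bgp_identifier": str(ipaddress.IPv4Address(identifier)),
        "optional_parameters_length": opt_length,
    }


def negotiated_hold_time(local, remote):
    """Both sides end up using the lowest of the two proposed values."""
    return min(local, remote)


def session_type(local_as, remote_as):
    """Same AS number on both sides means iBGP, otherwise eBGP."""
    return "iBGP" if local_as == remote_as else "eBGP"


sample = struct.pack(OPEN_FORMAT, 4, 2, 180, bytes.fromhex("C0A80C02"), 0)
fields = parse_open(sample)
print(fields)
print(session_type(1, fields["my_as"]), negotiated_hold_time(180, fields["hold_time"]))
```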
Here’s an example of a wireshark capture of an open message between R1 and R2:
Above you can see the open message from R1 to R2. You can see the things that we
discussed, the BGP version, AS number, hold time, BGP ID and the optional
parameters (MP-BGP and route refresh). The marker field on top is kept for compatibility
with older BGP versions, where it could carry authentication data. RFC 4271 requires it to
be filled with 1’s, so that is what you will see whether or not MD5 authentication (which is
done at the TCP level) is used.
Update Message
Once two routers have become BGP neighbors, they can start exchanging routing
information. This is done with the update message. In the update message you will find
information about the prefixes that are advertised. In “BGP language” a prefix is referred
to as NLRI (Network Layer Reachability Information). Here are some of the things you
will find in an update message:
Withdrawn Route Length: this field shows the length of the Withdrawn Routes field in bytes.
When it is set to 0, there are no routes withdrawn and the Withdrawn Routes field will not show
up.
Withdrawn Routes: this field shows all the prefixes that should be removed from the BGP table.
Total Path Attribute Length: here you will find the total length of the Path Attributes field.
Path Attributes: the BGP attributes for the prefix are stored here, for example: origin, as_path,
next_hop, med, local preference, etc. These path attributes are stored in TLV-format (Type,
Length, Value).
Each of the BGP attributes also has an attribute flag that tells the BGP router how to
treat the attribute. Here are the different bit flags:
Optional: when the attribute is well-known this bit is set to 0, when it’s optional it is set to 1.
Transitive: when an optional attribute is non-transitive this bit is set to 0, when it is transitive it is
set to 1.
Partial: when an optional attribute is complete this bit is set to 0, when it’s partial it is set to 1.
Extended Length: when the attribute length is 1 octet it is set to 0, for 2 octets it is set to 1. This
extended length flag may only be used if the length of the attribute value is greater than 255
octets.
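The four flag bits sit in the high-order half of the attribute flags byte. Here is a small Python sketch that decodes them, assuming the bit positions from RFC 4271 (optional 0x80, transitive 0x40, partial 0x20, extended length 0x10). The function names are my own:

```python
FLAG_OPTIONAL = 0x80         # 0 = well-known, 1 = optional
FLAG_TRANSITIVE = 0x40       # 0 = non-transitive, 1 = transitive
FLAG_PARTIAL = 0x20          # 0 = complete, 1 = partial
FLAG_EXTENDED_LENGTH = 0x10  # 0 = 1-octet length, 1 = 2-octet length


def decode_flags(flags):
    """Translate the attribute flags byte into readable booleans."""
    return {
        "optional": bool(flags & FLAG_OPTIONAL),
        "transitive": bool(flags & FLAG_TRANSITIVE),
        "partial": bool(flags & FLAG_PARTIAL),
        "extended_length": bool(flags & FLAG_EXTENDED_LENGTH),
    }


def length_field_size(flags):
    """Number of octets used for the attribute length field."""
    return 2 if flags & FLAG_EXTENDED_LENGTH else 1


# 0x40 is what a well-known transitive attribute such as AS_PATH carries,
# 0x80 is an optional non-transitive attribute such as MULTI_EXIT_DISC.
print(decode_flags(0x40))
print(decode_flags(0x80))
```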
Let’s take a look at an update message from R1:
R1(config)#router bgp 1
R1(config-router)#network 1.1.1.1 mask 255.255.255.255
Above you can see an update message from R1. No routes are withdrawn and there are
a couple of BGP attributes. You can see the ORIGIN, AS_PATH and
MULTI_EXIT_DISC (MED). I also highlighted some of the flags. The AS_PATH attribute
is transitive while MULTI_EXIT_DISC is optional. At the bottom you can find the NLRI
information with our prefix.
Let’s shut down the loopback interface on R1 so that the prefix disappears and we can see
a withdrawn route in the update message:
R1(config)#interface loopback 0
R1(config-if)#shutdown
Here you can see the withdrawn routes length which is 5 bytes. In the Withdrawn
Routes field we see our 1.1.1.1 /32 prefix that should be removed.
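Where do those 5 bytes come from? A prefix is encoded as one byte for the prefix length followed by only as many address bytes as the prefix needs. Here is a minimal Python sketch of that encoding. The function name encode_prefix is my own:

```python
import ipaddress


def encode_prefix(prefix):
    """Encode a prefix as <length in bits> followed by the significant octets."""
    network = ipaddress.ip_network(prefix)
    octets = (network.prefixlen + 7) // 8   # round up to whole octets
    return bytes([network.prefixlen]) + network.network_address.packed[:octets]


withdrawn = encode_prefix("1.1.1.1/32")
print(len(withdrawn), withdrawn.hex())
```

A /32 needs all four address bytes plus the length byte, which gives us 5 bytes. A /24 would only need 4 bytes in total.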
Keepalive Message
When there are no routes to be advertised or withdrawn, there’s not much our BGP
neighbors have to share with each other. To make sure the other side is “still there” we
use periodic keepalive messages. By default, BGP sends 19-byte keepalive
messages every 60 seconds. When a router hasn’t received a keepalive or update message from
its neighbor for three keepalive intervals (3 x 60 = 180 seconds, the value of the hold time)
it will tear down the session and flush the routes it learned from that
neighbor.
The keepalive message is really simple, it’s just a basic header with the length (19
bytes) and the type.
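Here is a short Python sketch of the keepalive message and the hold timer check. The function names are my own, and the timer logic is simplified to a single comparison of timestamps in seconds:

```python
import struct

MARKER = b"\xff" * 16
TYPE_KEEPALIVE = 4   # message type code for KEEPALIVE


def build_keepalive():
    """A keepalive is only the BGP header: marker, length (19) and type."""
    return MARKER + struct.pack("!HB", 19, TYPE_KEEPALIVE)


def hold_timer_expired(last_heard, now, hold_time=180):
    """True when nothing was received from the neighbor for hold_time seconds.

    A hold time of 0 means the timer is not used at all.
    """
    if hold_time == 0:
        return False
    return now - last_heard >= hold_time


keepalive = build_keepalive()
print(len(keepalive))
```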
Notification Message
The notification message is used when an error occurs which will result in termination of
the BGP neighbor adjacency. When something goes wrong, the notification message
will be sent and the session will be terminated.
The TCP session will be cleared, all entries from this BGP neighbor will be removed
from the BGP table and update messages with route withdrawals will be sent to other
BGP neighbors.
There is a list with BGP error codes and each error code has a sub-type. Here are some
examples:
By changing the AS number on one of the routers we will have a mismatch. Here’s the
Wireshark capture:
R1 is sending R2 a notification message with a major error “open message error” and
the minor error code (subtype) is bad peer AS.
BGP is a complex routing protocol and there are quite a few things that could
possibly go wrong. Besides being complex it’s also completely different compared to our
IGPs (OSPF and EIGRP). In this lesson we’ll start with troubleshooting BGP neighbor
adjacencies. Once the neighbor adjacency is working, you can focus on troubleshooting
missing route advertisements.
We have two BGP routers which are connected and configured for EBGP. Unfortunately we are
seeing this when we check the BGP neighbor adjacency:
R1#show ip bgp summary
BGP router identifier 192.168.12.1, local AS number 1
BGP table version is 1, main routing table version 1
When two EBGP routers that are directly connected do not form a working BGP
neighbor adjacency there could be a number of things that are wrong:
R1#ping 192.168.12.2
I can do a quick ping and I’ll see that I’m unable to reach the other side. Since layer 3
isn’t working, let’s take a look at layer 1 and 2:
Interface IP-Address OK? Method Status Protocol
FastEthernet0/0 192.168.12.2 YES manual administratively down down
We’ll check the interfaces and find out that someone left a shutdown command on the
interface…let’s fix it:
Same topology but another issue. The goal in this scenario is to establish the EBGP
neighbor adjacency between the IP addresses on the loopback interfaces.
router bgp 2
no synchronization
bgp log-neighbor-changes
neighbor 1.1.1.1 remote-as 1
no auto-summary
Here’s the BGP configuration, you can see that we are using the loopback interfaces to
establish a BGP neighbor adjacency. There’s no BGP neighbor adjacency:
Both routers show their BGP neighbor as idle. There are a number of things we have to
check here:
Is the IP address of the BGP neighbor reachable? We are not using the directly connected links
so we might have routing issues.
The TTL of IP packets that we use for external BGP is 1. This works for directly connected
networks but if it’s not directly connected we need to change this behavior.
By default BGP will source its updates from the IP address that is closest to the BGP neighbor.
In our example that’s the FastEthernet interface. This is something we’ll have to change.
Let’s check if the IP address of the remote neighbor is reachable, take a look at the
routing tables:
R1#show ip route
C 192.168.12.0/24 is directly connected, FastEthernet0/0
Both routers only know about their directly connected networks. In order to reach each
other’s loopback interfaces we’ll use static routing:
Sending a ping to IP address 2.2.2.2 and sourcing it from my own loopback interface
proves that both routers know how to reach each other’s loopback interface. Since we
don’t use the directly connected interfaces for the peering, we also have to increase the
TTL:
The ebgp-multihop command changes the TTL to 2. Now take a look at a debug:
R2#debug ip bgp
BGP debugging is on for address family: IPv4 Unicast
We can enable a debug to see the progress. You can clearly see that R2 is using IP
address 192.168.12.2 and that R1 is refusing the connection. This is because we use
the wrong source IP address. We have to tell BGP to use another IP address:
Use the update-source command to change the source IP address for the BGP
updates. After making these changes, the problem should be fixed:
R1#
%BGP-5-ADJCHANGE: neighbor 2.2.2.2 Up
R2#
%BGP-5-ADJCHANGE: neighbor 1.1.1.1 Up
Lesson learned: BGP routers don’t have to establish a neighbor adjacency using the
directly connected interfaces. Make sure the BGP routers can reach each other, that
BGP packets are sourced from the correct interface and in case of EBGP don’t forget
to use the multihop command.
BGP TCP Port Filtering
Let’s take a look at an IBGP issue:
no auto-summary
R2#show run | section bgp
router bgp 1
no synchronization
bgp log-neighbor-changes
neighbor 192.168.12.1 remote-as 1
no auto-summary
Plain and simple. The routers use the directly connected IP addresses for the BGP
neighbor adjacency. Let’s see if we have neighbors:
Too bad…we are not becoming neighbors. What could possibly be wrong? We are
using the directly connected interfaces so there’s not that much that could go wrong
except for L2/L3 issues. Let’s try a simple ping:
R1#ping 192.168.12.2
Sending a ping from one router to the other proves that L2 and L3 are working fine.
What about L4? We could have issues with the transport layer. Let’s give it a try:
R1#telnet 192.168.12.2 179
Trying 192.168.12.2, 179 ...
% Destination unreachable; gateway or host down
R2#telnet 192.168.12.1 179
Trying 192.168.12.1, 179 ...
I’m unable to connect to TCP port 179 from either router. This should ring a bell, maybe
something is blocking BGP?
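If you want to run the same layer 4 check from a management host instead of the router, a few lines of Python do what the telnet test does. This is only a sketch: the function name tcp_port_open is my own, and it only tells you whether a TCP connection to port 179 can be set up, not whether BGP itself is healthy:

```python
import socket


def tcp_port_open(host, port=179, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
```

Calling tcp_port_open("192.168.12.2") from a host that can reach R2 should return False as long as something is filtering TCP port 179.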
R2#show ip access-lists
Extended IP access list 100
10 deny tcp any eq bgp any (293 matches)
15 deny tcp any any eq bgp (153 matches)
20 permit ip any any (109 matches)
Someone decided it was a good idea to “secure” BGP and block it with an access-list.
Let’s get rid of it:
Next IBGP issue. This one is similar to the EBGP situation I showed you before…we
are using the loopback interfaces to establish the BGP neighbor adjacency, here are the
configurations:
Nothing special, IBGP and we are using the loopback interfaces for the neighbor
adjacency. There are no neighbors though:
Let’s first check if the routers can reach each other’s loopback interfaces:
R1#show ip route
A quick look at the routing table shows us that this is not the case. We could fix this with
a static route or an IGP. Normally we use an IGP for IBGP to advertise the loopback
interfaces, let’s use OSPF:
R1(config)#router ospf 1
R1(config-router)#network 1.1.1.0 0.0.0.255 area 0
R1(config-router)#network 192.168.12.0 0.0.0.255 area 0
R2(config)#router ospf 1
R2(config-router)#network 192.168.12.0 0.0.0.255 area 0
R2(config-router)#network 2.2.2.0 0.0.0.255 area 0
Smashing in the correct OSPF commands should do the job! Let’s try a quick ping:
1.1.1.1 4 1 0 0 0 0 0 never Active
R1#debug ip bgp
BGP debugging is on for address family: IPv4 Unicast
BGP: 2.2.2.2 open active, local address 192.168.12.1
BGP: 2.2.2.2 open failed: Connection refused by remote host, open
active delayed 32957ms (35000ms max, 28% jitter)
R2#debug ip bgp
BGP debugging is on for address family: IPv4 Unicast
BGP: 1.1.1.1 open active, local address 192.168.12.2
BGP: 1.1.1.1 open failed: Connection refused by remote host, open
active delayed 32957ms (35000ms max, 28% jitter)
The debug shows that the connection is refused and it also shows us the local IP
address that is used for BGP. It seems someone forgot to add the update-source
command so let’s fix it!
R1(config)#router bgp 1
R1(config-router)#neighbor 2.2.2.2 update-source loopback 0
R2(config)#router bgp 1
R2(config-router)#neighbor 1.1.1.1 update-source loopback 0
Just like EBGP we have to set the correct source for our BGP packets. After a few
seconds you’ll see this:
Problem solved! The only difference with EBGP is that we don’t have to change the TTL
with the ebgp-multihop command.
Lesson learned: It’s common practice to configure IBGP between loopback interfaces.
Make sure these loopbacks are reachable and that the BGP updates are sourced from
the loopback interface.
These are the most common issues why BGP doesn’t form a neighbor adjacency. The
routers are now up and running so we can continue to troubleshoot (missing) route
advertisements.
Troubleshooting BGP Route Advertisement
Once your BGP neighbor adjacency is up and running then you can try to troubleshoot
issues with route advertisements. In a previous lesson I explained how to fix BGP
neighbor adjacencies, this time we’ll focus on route advertisements. Let’s look at some
examples!
BGP Network Command
Let’s start with an EBGP scenario:
At first sight there seems to be nothing wrong here. Let’s see if R2 learned anything:
192.168.12.1 4 1 4 4 1 0 0 00:01:26 0
However R2 didn’t learn any prefixes from R1. Perhaps we have a filter?
Maybe there’s a distribute-list but that’s not the case here. Let’s check the network
commands on R1:
The problem is the network command, it works differently for BGP vs our IGPs. If we
configure a network command for BGP it has to be an exact match. In this case I forgot
to add the subnet mask…let’s fix it:
R1(config)#router bgp 1
R1(config-router)#network 1.1.1.0 mask 255.255.255.0
I have to make sure I type the correct subnet mask. Now check R2 again:
Now you can see we learned the prefix and R2 installs it in the routing table…problem
solved!
Lesson learned: Type in the exact correct subnet mask…BGP is picky!
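The exact-match rule can be sketched in a few lines of Python. The routing table contents below are example values, and I’m assuming that a network statement without a mask is interpreted with the classful mask (so 1.0.0.0 /8 for this address):

```python
import ipaddress


def network_statement_matches(statement, routing_table):
    """A BGP network statement only works when the routing table holds
    exactly the same prefix AND the same mask."""
    wanted = ipaddress.ip_network(statement)
    return any(ipaddress.ip_network(route) == wanted for route in routing_table)


# Example routing table for R1 (assumed values for illustration).
r1_routes = ["1.1.1.0/24", "192.168.12.0/24"]

print(network_statement_matches("1.0.0.0/8", r1_routes))   # no mask typed, classful /8 assumed
print(network_statement_matches("1.1.1.0/24", r1_routes))  # exact prefix and mask
```

Only the second statement matches an entry in the routing table, so only that one results in an advertisement.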
BGP Summarization
Let’s move onto the next scenario.
The network engineer from AS1 wants to advertise a summary to AS 2. The network
engineer from AS 2 is complaining however that he’s not receiving anything…let’s find
out what is going wrong! Here are the configurations:
You can see the aggregate-address command on R1 for network 172.16.0.0 /16. Did
R2 receive anything?
Too bad…no prefixes have been received by R2. There are two things I could check
here:
See if a distribute-list is blocking prefixes inbound like I did in the previous example.
See what R1 has in its routing table (can’t advertise what I don’t have!).
Let’s start with the routing table of R1 since I think by now you know what a
distribute-list looks like.
R1#show ip route
There’s nothing here that looks even close to 172.16.0.0 /16. If we want to advertise a
summary we have to put something in the routing table of R1 first. Let me show you the
different options:
R1(config)#interface loopback 0
R1(config-if)#ip address 172.16.0.1 255.255.255.0
R1(config-if)#exit
R1(config)#router bgp 1
R1(config-router)#network 172.16.0.0 mask 255.255.255.0
This is option 1: I’ll create a loopback interface and configure an IP address that falls
within the range of the aggregate-address command. The summary can now be
advertised to R2:
Now we see the summary in the routing table of R2. By default R1 will still advertise the
more specific prefixes as well. If you don’t want this you need to add the summary-only
keyword to the aggregate-address command!
Let me show you option 2 of advertising the summary:
First we’ll put the 172.16.0.0 /16 network in the routing table by creating a static route
and pointing it to the null0 interface. Secondly I’ll use a network command for BGP to
advertise this network. The result will be this:
Lesson learned: You can’t advertise what you don’t have. Create a static route and
point it to the null0 interface or create a loopback interface that has a prefix that falls
within the summary address range.
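The rule behind option 1 can be written down as a small Python check: the aggregate is only generated when the BGP table holds at least one more specific prefix that falls within it. This is a simplified sketch with a function name of my own, and the BGP table contents are example values:

```python
import ipaddress


def aggregate_is_advertised(aggregate, bgp_table):
    """True when at least one more specific prefix inside the aggregate exists."""
    summary = ipaddress.ip_network(aggregate)
    for entry in bgp_table:
        prefix = ipaddress.ip_network(entry)
        if prefix != summary and prefix.subnet_of(summary):
            return True
    return False


print(aggregate_is_advertised("172.16.0.0/16", []))                 # nothing to summarize
print(aggregate_is_advertised("172.16.0.0/16", ["172.16.0.0/24"]))  # loopback prefix added
```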
BGP Auto-Summary
Next problem coming up, this is the topology:
You are working as a network engineer for AS 1 and one day
you get a phone call from the network engineer at AS 2 asking you why you are
advertising a summary for 1.0.0.0 /8. You have no idea what the hell he is talking about
so you decide to do some research. Here’s what we see on R2:
This is what the network engineer on R2 is seeing. Let’s check why R1 is advertising
this:
Advertised to update-groups:
1
Local
0.0.0.0 from 0.0.0.0 (1.1.1.1)
Origin incomplete, metric 0, localpref 100, weight 32768,
valid, sourced, best
We can see that we have network 1.0.0.0 /8 in the BGP table of R1. Let’s check its
routing table:
Network 1.1.1.0 /24 is configured on the loopback interface but it’s in the BGP table as
1.0.0.0 /8. This could mean only one thing: summarization. Take a look below:
R1#show ip protocols
Routing Protocol is "bgp 1"
Outgoing update filter list for all interfaces is not set
Incoming update filter list for all interfaces is not set
IGP synchronization is disabled
Automatic route summarization is enabled
A quick look at show ip protocols reveals that automatic summarization is enabled. Let’s
disable it:
R1(config)#router bgp 1
R1(config-router)#no auto-summary
Lesson learned: If you see classful networks in your BGP table you might have auto-
summary enabled.
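To see why 1.1.1.0 /24 turns into 1.0.0.0 /8, here is a Python sketch of the classful boundary that auto-summary falls back to. The function name is my own and it only covers class A, B and C addresses:

```python
import ipaddress


def classful_network(prefix):
    """Return the classful network (A = /8, B = /16, C = /24) for a prefix."""
    network = ipaddress.ip_network(prefix)
    first_octet = network.network_address.packed[0]
    if first_octet < 128:
        length = 8       # class A
    elif first_octet < 192:
        length = 16      # class B
    elif first_octet < 224:
        length = 24      # class C
    else:
        raise ValueError("class D/E addresses have no classful network")
    classful = ipaddress.ip_network(
        f"{network.network_address}/{length}", strict=False
    )
    return str(classful)


print(classful_network("1.1.1.0/24"))
```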
Some of the problems I’ve been showing you could be resolved easily by just looking at and/or
comparing the output of a “show run”. This might be true but keep in mind that you don’t always
have access to all BGP routers in the network so maybe there’s no way to compare configurations.
There could also be a switch or another router in between the devices you are troubleshooting
that is causing issues. Using the appropriate show and debug commands will show you exactly what
your router is doing and what it is advertising to other routers.
BGP Route-Maps
Same topology, different problem:
The people from AS 2 are complaining that they are not receiving anything from AS 1.
To keep it interesting I’m not going to show you the configurations…
For starters, we can see that R2 is not receiving any prefixes. Do we have any filters?
I can also verify that R1 doesn’t have any distribute-lists. Let’s check if R1 has 1.1.1.0
/24 in its BGP table:
Not advertised to any peer
Local
0.0.0.0 from 0.0.0.0 (1.1.1.1)
Origin incomplete, metric 0, localpref 100, weight 32768,
valid, sourced, best
I can confirm that R1 does have network 1.1.1.0 /24 in its BGP table so why is it not
advertising this to R2?
Let’s see if R1 has configured anything special for its neighbor R2:
Used as bestpath: n/a 0
Used as multipath: n/a 0
R1#show ip prefix-list
ip prefix-list PREFIXES: 1 entries
seq 5 deny 1.1.1.0/24
There’s our troublemaker…it’s denying network 1.1.1.0 /24! Let’s get rid of this
route-map:
R1(config)#router bgp 1
R1(config-router)#no neighbor 192.168.12.2 route-map NEIGHBORS out
Lesson learned: Make sure there are no route-maps blocking the advertisement of
prefixes.
IBGP Split Horizon
Here’s a new topology:
R1 is advertising network 1.1.1.0 /24 but R3 is not learning this prefix. Here are the
configurations:
The neighbor adjacencies have been configured, R1 is advertising network 1.1.1.0 /24.
Let’s see if R2 and/or R3 have learned about it:
We can see network 1.1.1.0 /24 in the routing table of R2 but it’s not showing up on R3.
Technically there is no problem. If you look closely at the BGP configuration of all three
routers you can see there is only a BGP neighbor adjacency between R1 & R2 and
between R2 & R3. Because of IBGP split horizon R2 does not forward network
1.1.1.0 /24 towards R3. In order to fix this we need to configure R1 and R3 to become
neighbors. To accomplish this, R1 and R3 should be able to reach each other. I’ll keep it
simple and use a static route for this:
If I’m going to configure the BGP neighbor adjacency between R1 and R3 I’ll need to
make sure they can reach each other. I can use a static route or an IGP…to keep things
easy I’ll use a static route this time. Now let’s configure IBGP between R1 and R3:
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.23.3 remote-as 1
R3(config)#router bgp 1
R3(config-router)#neighbor 192.168.12.1 remote-as 1
Lesson learned: IBGP neighbor adjacencies have to be full mesh! Another solution
would be by using a route-reflector or confederation.
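IBGP split horizon is easy to model: a router never passes a prefix it learned from one IBGP neighbor on to another IBGP neighbor, so only the direct IBGP neighbors of the originating router learn it. Here is a minimal Python sketch under that assumption, without route-reflectors or confederations. The function name is my own:

```python
def routers_that_learn(origin, ibgp_sessions):
    """Return the routers that end up with a prefix originated by `origin`.

    Only the direct IBGP neighbors of the origin learn the prefix, because
    they are not allowed to re-advertise it to other IBGP neighbors.
    """
    learned = {origin}
    for a, b in ibgp_sessions:
        if a == origin:
            learned.add(b)
        elif b == origin:
            learned.add(a)
    return learned


chain = [("R1", "R2"), ("R2", "R3")]
full_mesh = chain + [("R1", "R3")]

print(sorted(routers_that_learn("R1", chain)))
print(sorted(routers_that_learn("R1", full_mesh)))
```

With only the R1-R2 and R2-R3 sessions R3 is left out. Once the R1-R3 session is added all three routers have the prefix.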
BGP Next Hop
R3 is advertising network 3.3.3.0 /24 through EBGP and R2 installs it in the routing
table. R1 however doesn’t have this network in its routing table. Here are the
configurations:
Here are the configurations. To keep things easy I’m using the physical interface IP
addresses to configure the BGP neighbor adjacencies. Let’s see if R2 learns about
3.3.3.0 /24:
We can verify that network 3.3.3.0 /24 is in the routing table of R2. What about R1?
There’s nothing in the routing table of R1 however. The first thing we should check is if
it’s in the BGP table or not:
R1#show ip bgp
BGP table version is 1, local router ID is 192.168.12.1
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
We can see it’s in the BGP table and the * indicates that this is a valid route. However I
don’t see the > symbol which indicates the best path. For some reason BGP is unable
to install this entry in the routing table. Take a close look at the next hop IP address
(192.168.23.3). Is this IP address reachable?
R1 has no idea how to reach 192.168.23.3 so our next hop is unreachable. There are 2
ways how we can deal with this issue:
Use a static route or routing protocol to make this next hop IP address reachable.
Change the next hop IP address.
We’ll change the next hop IP address since I think you’ve seen enough static routes and
routing protocols so far:
R2(config)#router bgp 1
R2(config-router)#neighbor 192.168.12.1 next-hop-self
This command will change the next hop IP address to the IP address of R2. Check out
the changes on R1:
R1#show ip bgp
BGP table version is 2, local router ID is 192.168.12.1
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
You can see the > symbol that indicates that this path has been selected as the best
one. The next hop IP address is now 192.168.12.2.
Hooray! It’s in the routing table now. Are we done now? If my goal was to make this
show up in the routing table then we are now finished…there’s another issue however:
R1#ping 3.3.3.3
My ping is unsuccessful. R1 and R2 both have network 3.3.3.0 /24 in their routing table
so we know that they know where to forward the IP packets to. Let’s take a look at R3:
R3#show ip route
R3 will receive an IP packet with destination 3.3.3.3 and source 192.168.12.1. You can
see in the routing table that it has no idea where to send IP packets meant for
192.168.12.1. Let’s change that:
R2(config)#router bgp 1
R2(config-router)#network 192.168.12.0 mask 255.255.255.0
We’ll advertise network 192.168.12.0 /24 on R2. Now R3 knows how to reach it:
R1#ping 3.3.3.3
Problem solved!
Lesson learned: Make sure the next hop IP address is reachable so routes can be
installed in the routing table and that all required networks are reachable.
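The next hop check is nothing more than a routing table lookup for the next hop IP address. Here is a Python sketch with a longest-prefix match. The function names are my own and the routing table of R1 is an assumed example that only contains the directly connected network:

```python
import ipaddress


def longest_match(routing_table, address):
    """Return the most specific route that covers the address, or None."""
    target = ipaddress.ip_address(address)
    candidates = [
        network
        for network in map(ipaddress.ip_network, routing_table)
        if target in network
    ]
    if not candidates:
        return None
    return str(max(candidates, key=lambda network: network.prefixlen))


def next_hop_reachable(routing_table, next_hop):
    """A BGP path can only become best when its next hop resolves."""
    return longest_match(routing_table, next_hop) is not None


r1_routes = ["192.168.12.0/24"]   # assumed: only the connected network

print(next_hop_reachable(r1_routes, "192.168.23.3"))  # original next hop (R3)
print(next_hop_reachable(r1_routes, "192.168.12.2"))  # after next-hop-self on R2
```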
IGPs select the path with the lowest metric. For example:
Network Next Hop Metric LocPrf Weight Path
* 1.0.0.0/24 203.202.143.34 0 7474 4826 13335 i
* 192.65.89.161 1 0 7474 4826 13335 i
* 202.139.124.130 1 0 7474 4826 13335 i
* 203.13.132.7 10 0 7474 4826 13335 i
*> 203.202.143.33 0 7474 4826 13335 i
This BGP router has 5 paths for network 1.0.0.0/24. Look at the > symbol at the bottom
left. The > symbol means that BGP has selected this path as the best path. This path
will be installed in the routing table.
Out of all those 5 paths, why did BGP select this path as the best path?
Attributes
This path was selected based on the following attributes:
Priority Attribute
1 Weight
2 Local Preference
3 Originate
4 AS path length
5 Origin code
6 MED
7 eBGP path over iBGP path
8 Shortest IGP path to BGP next hop
9 Oldest path
10 Router ID
11 Neighbor IP address
Let me give you a quick overview of each attribute. We will cover these in other lessons
in detail.
Weight
Prefer the path with the highest weight. This is a value that is local to the router and
it’s Cisco proprietary. The default value is 0 for all routes that are not originated by the
local router. You can learn how it works in the BGP weight attribute lesson.
Local Preference
The local preference is used within an autonomous system and exchanged between
iBGP routers. We prefer the path with the highest local preference. The default value
is 100. To learn more, take a look at the BGP local preference attribute lesson.
Originate
Prefer the path that the local router originated. In the BGP table, you will see next hop
0.0.0.0. You can get a path in the BGP table through the BGP network command,
aggregation, or redistribution. A BGP router will prefer routes that it installed into BGP
itself over a route that another router installed in BGP.
AS path length
Prefer the path with the shortest AS path length. For example, AS path 1 2 3 is
preferred over AS path 1 2 3 4 5. You can learn more about AS path length here.
Origin code
Prefer the path with the lowest origin code. IGP is lower than EGP and EGP is lower than
INCOMPLETE. You can learn how it works in the origin code lesson.
MED
Prefer the path with the lowest MED. The MED is exchanged between autonomous
systems. For a detailed explanation, take a look at the MED lesson.
eBGP path over iBGP path
Prefer eBGP (external BGP) paths over iBGP (internal BGP) paths.
Shortest IGP path to BGP next hop
Prefer the path within the autonomous system with the lowest IGP metric to the BGP
next hop.
Oldest Path
Prefer the path that we received first, in other words, the oldest path.
Router ID
Prefer the path with the lowest BGP neighbor router ID. The router ID is based on the
highest IP address. If you have a loopback interface, then the IP address on the
loopback will be used. The router ID can also be manually configured.
Neighbor IP address
Prefer the path with the lowest neighbor IP address. If you have two eBGP routers
and two links in between then the router ID will be the same. In this case, the neighbor
IP address is the tiebreaker.
Path Selection
When BGP has multiple paths to a destination they are stored in the BGP table. All
paths are in the BGP table but only one gets installed in the routing table.
Which path do we select? We start at the top of the list with BGP attributes and work
our way to the bottom:
1. We start with weight because it’s at the top of the BGP attributes list. We now have
two options:
1. If one path has a better weight then we select this path as the best path.
2. If the weight is equal, we move down to the next attribute.
2. The next attribute is local preference. Once again, we have two options:
1. If one path has a better local preference then we select this path as the best
path.
2. If the local preference is equal, we move down to the next attribute.
3. We work our way down this attribute list until we have a tiebreaker to select the
best path. If all paths have the same BGP attributes then we end up with the
neighbor IP address.
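The walk down the attribute list can be sketched in Python as one sort key per path, where each element of the key is one step of the list. This is a simplified model and not how any router implements it: real implementations have extra rules (for example, MED is normally only compared between paths from the same neighboring AS). The Path class, its field names and the router IDs in the example are my own assumptions:

```python
from dataclasses import dataclass
import ipaddress

ORIGIN_RANK = {"i": 0, "e": 1, "?": 2}   # IGP < EGP < INCOMPLETE


@dataclass
class Path:
    next_hop: str
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    as_path: tuple = ()
    origin: str = "i"
    med: int = 0
    ebgp: bool = True
    igp_metric: int = 0
    age: int = 0                  # seconds since the path was received
    router_id: str = "0.0.0.0"
    neighbor_ip: str = "0.0.0.0"


def preference_key(path):
    """Smaller key = better path. Each element is one step of the list."""
    return (
        -path.weight,                                # 1. highest weight
        -path.local_pref,                            # 2. highest local preference
        not path.locally_originated,                 # 3. locally originated first
        len(path.as_path),                           # 4. shortest AS path
        ORIGIN_RANK[path.origin],                    # 5. lowest origin code
        path.med,                                    # 6. lowest MED
        not path.ebgp,                               # 7. eBGP over iBGP
        path.igp_metric,                             # 8. lowest IGP metric to next hop
        -path.age,                                   # 9. oldest path
        int(ipaddress.ip_address(path.router_id)),   # 10. lowest router ID
        int(ipaddress.ip_address(path.neighbor_ip)), # 11. lowest neighbor IP
    )


def best_path(paths):
    return min(paths, key=preference_key)


via_r2 = Path("192.168.12.2", as_path=(2,), router_id="2.2.2.2",
              neighbor_ip="192.168.12.2")
via_r3 = Path("192.168.13.3", as_path=(2,), router_id="3.3.3.3",
              neighbor_ip="192.168.13.3")

print(best_path([via_r2, via_r3]).next_hop)   # tie until the router ID step
via_r3.weight = 500
print(best_path([via_r2, via_r3]).next_hop)   # weight decides at step 1
```

Because Python compares tuples element by element, the first attribute that differs decides and everything below it is ignored, which is the same idea as working down the list until there is a tiebreaker.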
I hope this lesson has been useful to understand how BGP selects the best path.
Weight is a Cisco proprietary BGP attribute that can be used to select a certain path.
Here’s what you need to know about weight:
Above we have a simple scenario with two autonomous systems. R2 and R3 both have
network 2.2.2.0/24 configured on their loopback0 interface and I’ll advertise that in BGP.
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 remote-as 2
R1(config-router)#neighbor 192.168.13.3 remote-as 2
R2(config)#router bgp 2
R2(config-router)#neighbor 192.168.12.1 remote-as 1
R2(config-router)#neighbor 192.168.23.3 remote-as 2
R2(config-router)#network 2.2.2.0 mask 255.255.255.0
R3(config)#router bgp 2
R3(config-router)#neighbor 192.168.13.1 remote-as 1
R3(config-router)#neighbor 192.168.23.2 remote-as 2
R3(config-router)#network 2.2.2.0 mask 255.255.255.0
Above you’ll find the configuration for BGP, now let’s take a detailed look at R1:
R1#show ip bgp
BGP table version is 2, local router ID is 192.168.13.1
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
Router R1 decided to use 192.168.12.2 as the next hop. All the BGP attributes are the
same so it came down to the router ID to select a winner. Now let’s change this
behavior using the weight attribute…
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.13.3 weight 500
Sometimes BGP behaves like an oil tanker so to speed things up in your lab, reset it.
R1#show ip bgp
BGP table version is 2, local router ID is 192.168.13.1
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
Now you can see that 192.168.13.3 has been selected as the next hop because the
weight is now 500.
What if I want to set the weight to 500 for just a couple of prefixes from AS 2?
R2(config)#interface loopback 1
R2(config-if)#ip address 22.22.22.22 255.255.255.0
R2(config)#router bgp 2
R2(config-router)#network 22.22.22.0 mask 255.255.255.0
R3(config)#interface loopback 1
R3(config-if)#ip address 22.22.22.22 255.255.255.0
R3(config)#router bgp 2
R3(config-router)#network 22.22.22.0 mask 255.255.255.0
I’ll create a new loopback interface on router R2 and R3 and I’ll advertise network
22.22.22.0/24 in BGP. Here’s what router R1 now looks like:
R1#show ip bgp
BGP table version is 5, local router ID is 192.168.13.1
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
As you can see above router R1 will use 192.168.13.3 as the next hop for both prefixes.
What if I want to change the weight for just 1 prefix? Route-maps to the rescue!
R1(config)#router bgp 1
R1(config-router)#no neighbor 192.168.13.3 weight 500
Here’s the route-map that I will use. If the prefixes match access-list 1 we will set the
weight to 400.
R1(config-router)#neighbor 192.168.13.3 route-map SETWEIGHT in
R1#clear ip bgp *
R1#show ip bgp
BGP table version is 3, local router ID is 192.168.13.1
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
BGP attribute local preference is the second BGP attribute and it can be used to choose
the exit path for an autonomous system. Here are the details:
You can use local preference to configure your autonomous system to select a certain
exit point. Instead of configuring weight on each router you can use local preference
because it is exchanged on all internal BGP routers. By increasing the local preference
to 800 we can make AS 1 send all traffic towards AS 2.
In the picture above we have two autonomous systems. R1 will advertise network
1.1.1.0/24 towards AS 2 and R4 will have to make a choice when it wants to reach this
network. It can go through router R2 or R3, we’ll see how local preference influences
this.
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 remote-as 2
R1(config-router)#neighbor 192.168.13.3 remote-as 2
R1(config-router)#network 1.1.1.0 mask 255.255.255.0
R2(config)#interface loopback 0
R2(config-if)#ip address 2.2.2.2 255.255.255.0
R2(config)#router ospf 1
R2(config-router)#network 192.168.24.0 0.0.0.255 area 0
R2(config-router)#network 2.2.2.0 0.0.0.255 area 0
R3(config)#interface loopback 0
R3(config-if)#ip address 3.3.3.3 255.255.255.0
R3(config)#router ospf 1
R3(config-router)#network 192.168.34.0 0.0.0.255 area 0
R3(config-router)#network 3.3.3.0 0.0.0.255 area 0
R4(config)#interface loopback 0
R4(config-if)#ip address 4.4.4.4 255.255.255.0
R4(config)#router ospf 1
R4(config-router)#network 192.168.24.0 0.0.0.255 area 0
R4(config-router)#network 192.168.34.0 0.0.0.255 area 0
R4(config-router)#network 4.4.4.0 0.0.0.255 area 0
R3(config)#router bgp 2
R3(config-router)#neighbor 192.168.13.1 remote-as 1
R3(config-router)#neighbor 2.2.2.2 remote-as 2
R3(config-router)#neighbor 2.2.2.2 update-source loopback0
R3(config-router)#neighbor 4.4.4.4 remote-as 2
R3(config-router)#neighbor 4.4.4.4 update-source loopback0
R3(config-router)#neighbor 4.4.4.4 next-hop-self
R2(config)#router bgp 2
R2(config-router)#neighbor 192.168.12.1 remote-as 1
R2(config-router)#neighbor 3.3.3.3 remote-as 2
R2(config-router)#neighbor 3.3.3.3 update-source loopback0
R2(config-router)#neighbor 4.4.4.4 remote-as 2
R2(config-router)#neighbor 4.4.4.4 update-source loopback0
R2(config-router)#neighbor 4.4.4.4 next-hop-self
R4(config)#router bgp 2
R4(config-router)#neighbor 2.2.2.2 remote-as 2
R4(config-router)#neighbor 2.2.2.2 update-source loopback 0
R4(config-router)#neighbor 3.3.3.3 remote-as 2
R4(config-router)#neighbor 3.3.3.3 update-source loopback 0
Now let’s find out what path R4 will use to reach network 1.1.1.0/24:
R4#show ip bgp
BGP table version is 2, local router ID is 4.4.4.4
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
All attributes are the same so it’s the router ID that makes the decision. All traffic is sent
to R2 right now. Let’s play with the local preference…
R3(config)#router bgp 2
R3(config-router)#bgp default local-preference 600
The default local preference is 100 and you can change it if you like with the bgp
default local-preference command.
R4#show ip bgp
BGP table version is 3, local router ID is 4.4.4.4
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
Now we see that R4 prefers to send traffic to network 1.1.1.0/24 towards R3 because
the local preference is 600 > 100.
Of course we can accomplish the same thing with a route-map, here’s how:
R3(config)#router bgp 2
R3(config-router)#no bgp default local-preference 600
R3(config)#router bgp 2
R3(config-router)#neighbor 192.168.13.1 route-map LOCALPREF in
Route-maps are a more flexible solution. If you don’t use a match statement in a route-
map then everything is matched by default. You can use it to set the local preference to
another value. Don’t forget to activate the route-map by binding it to a BGP neighbor.
R4#show ip bgp
BGP table version is 5, local router ID is 4.4.4.4
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
In my example AS 1 wants to make sure traffic enters the autonomous system through
R2. We can add our own autonomous system number multiple times so that the AS path
becomes longer. Since BGP prefers a shorter AS path, we can influence how other
autonomous systems route towards us.
This is called AS path prepending. Let’s see what this looks like on Cisco routers. This is
the topology that I will use:
Above we have 3 routers. R1 and R3 are both in AS 1 advertising the same network
(1.1.1.0/24) to R2. We can use AS Path prepending to make R2 prefer a certain path.
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 remote-as 2
R3(config)#router bgp 1
R3(config-router)#neighbor 192.168.23.2 remote-as 2
R2(config)#router bgp 2
R2(config-router)#neighbor 192.168.12.1 remote-as 1
R2(config-router)#neighbor 192.168.23.3 remote-as 1
R1(config)#router bgp 1
R1(config-router)#network 1.1.1.0 mask 255.255.255.0
R3(config)#router bgp 1
R3(config-router)#network 1.1.1.0 mask 255.255.255.0
R2#show ip bgp
BGP table version is 2, local router ID is 192.168.23.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
In the table above you can see that it prefers 192.168.12.1 as its path. Because
everything is the same it boils down to the router ID. Let’s change the AS path so that
we’ll use 192.168.23.3 as the preferred path.
Here’s an example for you. First create a route-map and use set as-path prepend to
add your own AS number multiple times.
Don’t forget to add the route-map to your BGP neighbor configuration and since you are
sending this to your remote neighbor it should be outbound!
R2#show ip bgp
BGP table version is 2, local router ID is 192.168.23.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
Now we see that 192.168.23.3 is the next hop IP address that we use. You can also see
that the AS Path has become longer for the second entry.
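To make the effect of prepending concrete, here is a small Python sketch (not router configuration). It assumes R1 adds AS 1 five extra times, which is an illustrative count that I picked, and it only compares AS path length since that is the step that decides this example. The function names are hypothetical.

```python
from typing import Dict, List


def prepend(as_path: List[int], own_as: int, times: int) -> List[int]:
    """Return a new AS path with own_as added `times` extra times at the front."""
    return [own_as] * times + as_path


def shortest_as_path(paths: Dict[str, List[int]]) -> str:
    """Pick the next hop whose AS path has the fewest entries."""
    return min(paths, key=lambda next_hop: len(paths[next_hop]))


# AS paths as R2 would receive them for 1.1.1.0/24.
# Assumption: R1 prepends AS 1 five extra times, R3 advertises normally.
received = {
    "192.168.12.1": prepend([1], own_as=1, times=5),
    "192.168.23.3": [1],
}

best = shortest_as_path(received)
print(best, received[best])
```

With the longer path on the R1 side, the entry through 192.168.23.3 wins on AS path length before the router ID is ever consulted.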
The BGP Origin Code is one of the attributes that is used for path selection. There are
three origin codes that the BGP table can show:
IGP (shows up as i)
EGP (shows up as e)
Incomplete (shows up as ?)
You will see IGP when you use the network command for BGP. It means you
advertised the network yourself in BGP. EGP is historical and you won’t see it in the
BGP table anymore; it is an old routing protocol that we no longer use. Incomplete
means you have redistributed something into BGP. Here’s a demonstration:
Above you can see the topology that I will use. R1 and R3 are in AS1 and connected to
R2 in AS2. Both routers have a loopback0 interface with network 1.1.1.0/24 configured
on it.
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 remote-as 2
R3(config)#router bgp 1
R3(config-router)#neighbor 192.168.23.2 remote-as 2
R2(config)#router bgp 2
R2(config-router)#neighbor 192.168.12.1 remote-as 1
R2(config-router)#neighbor 192.168.23.3 remote-as 1
First we’ll configure BGP. Next step is to get network 1.1.1.0/24 in the BGP table:
R1(config)#router bgp 1
R1(config-router)#network 1.1.1.0 mask 255.255.255.0
R3(config)#router bgp 1
R3(config-router)#redistribute connected
On R1 I’ll advertise network 1.1.1.0/24 in BGP with the network command, on R3 we’ll
redistribute it. Let’s see what R2 thinks of this…
R2#show ip bgp
BGP table version is 4, local router ID is 192.168.23.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
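In the output above you can see that R2 learned network 1.1.1.0/24 from both routers
through BGP. There’s one small difference however. The first entry shows a ? symbol
and the second entry shows an ‘i’. BGP prefers IGP over EGP and EGP over incomplete
when the attributes that are checked earlier are equal, so R2 prefers the entry with the ‘i’.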
MED (also called metric) is exchanged between autonomous systems and you can use
it to let the other AS know which path they should use to enter your AS. R2 is sending a
MED of 200 towards AS 3. R3 is sending a MED of 300 to AS 3. AS 3 will prefer the
lower metric and send all traffic for AS 1 through R2. Let me show you how to configure
this on a Cisco router:
Above we have two autonomous systems. R1 and R3 are in AS 1 and will both advertise
network 1.1.1.0/24 in BGP. We can use MED to tell AS 2 which path to use to reach this
network.
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 remote-as 2
R1(config-router)#network 1.1.1.0 mask 255.255.255.0
R3(config)#router bgp 1
R3(config-router)#neighbor 192.168.23.2 remote-as 2
R3(config-router)#network 1.1.1.0 mask 255.255.255.0
R2(config)#router bgp 2
R2(config-router)#neighbor 192.168.12.1 remote-as 1
R2(config-router)#neighbor 192.168.23.3 remote-as 1
R2#show ip bgp
BGP table version is 2, local router ID is 192.168.23.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
You have seen the example above before. R2 prefers the path through 192.168.12.1.
Note that the metric (MED) is 0. Let’s play with the MED now:
R1(config)#route-map MED permit 10
R1(config-route-map)#set metric 700
R1(config-route-map)#exit
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 route-map MED out
R3(config)#route-map MED permit 10
R3(config-route-map)#set metric 500
R3(config-route-map)#exit
R3(config)#router bgp 1
R3(config-router)#neighbor 192.168.23.2 route-map MED out
I’ll use route-maps so that R1 advertises everything with a MED of 700 and R3
advertises everything with a MED of 500.
R2#show ip bgp
BGP table version is 2, local router ID is 192.168.23.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
You can see R2 prefers the path through 192.168.23.3 because the MED is lower.
That’s all there is to it.
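The attributes we have looked at so far are compared in a fixed order. The Python sketch below is a simplified model of only those four steps (local preference, AS path length, origin code, MED). It leaves out weight, eBGP versus iBGP, IGP metric and the router ID, and it ignores that MED is by default only compared between paths from the same neighboring AS. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

# Lower rank is better: IGP beats EGP, EGP beats incomplete.
ORIGIN_RANK = {"i": 0, "e": 1, "?": 2}


@dataclass
class Path:
    next_hop: str
    local_pref: int = 100            # default local preference
    as_path: Tuple[int, ...] = ()
    origin: str = "i"
    med: int = 0


def preference_key(p: Path):
    # The smallest tuple wins. Local preference is negated because
    # a HIGHER local preference is better, while the rest prefer LOWER.
    return (-p.local_pref, len(p.as_path), ORIGIN_RANK[p.origin], p.med)


def best_path(paths):
    """Return the preferred path from a list of Path objects."""
    return min(paths, key=preference_key)


# The MED example from above: R1 sends 700, R3 sends 500.
via_r1 = Path("192.168.12.1", as_path=(1,), med=700)
via_r3 = Path("192.168.23.3", as_path=(1,), med=500)
print(best_path([via_r1, via_r3]).next_hop)
```

Because the key is a tuple, MED only matters when local preference, AS path length and origin are all tied, which is exactly the situation in the example.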
A BGP community is a bit of “extra information” that you can add to one or more prefixes
that are advertised to BGP neighbors. This extra information can be used for things like
traffic engineering or dynamic routing policies. There are 4 well known BGP
communities that you can use, or you can pick a numeric value to use for
your own policies.
Here are the 4 well known BGP communities: Internet, No-Advertise, No-Export and
Local-AS.
To give you an idea, here are some examples that I found from Level 3 (large ISP in the
US):
--------------------------------------------------------
customer traffic engineering communities - Prepending
--------------------------------------------------------
65001:0 - prepend once to all peers
65001:XXX - prepend once at peerings to AS XXX
65002:0 - prepend twice to all peers
65002:XXX - prepend twice at peerings to AS XXX
65003:0 - prepend 3x to all peers
65003:XXX - prepend 3x at peerings to AS XXX
65004:0 - prepend 4x to all peers
65004:XXX - prepend 4x at peerings to AS XXX
--------------------------------------------------------
customer traffic engineering communities - Regional
--------------------------------------------------------
Will only work for regional peers
64980:0 - announce to customers but not to EU peers
64981:0 - prepend once to all EU peers
64982:0 - prepend twice to all EU peers
64983:0 - prepend 3x to all EU peers
64984:0 - prepend 4x to all EU peers
--------------------------------------------------------
customer traffic engineering communities - LocalPref
--------------------------------------------------------
3356:70 - set local preference to 70
3356:80 - set local preference to 80
3356:90 - set local preference to 90
This list might not be up-to-date anymore but it gives you an impression of how BGP
communities are used. If a customer of Level 3 tags their prefixes with 3356:90 then
Level 3 will set the local preference to 90. If the customer tags them with 64983:0 then
Level 3 will prepend its AS number three times towards all its BGP peers in Europe.
These BGP communities are 32-bit values that are divided in two sections. For labs you
can pick whatever values you like but normally the first 16 bits are used to indicate the
AS number that originates the community, the next 16 bits are assigned by the AS. For
example, Level 3 uses these communities:
--------------------------------------------------------
customer traffic engineering communities - LocalPref
--------------------------------------------------------
3356:70 - set local preference to 70
3356:80 - set local preference to 80
3356:90 - set local preference to 90
The first 16 bits are their AS number (3356) and the next 16 bits (70, 80 and 90)
correspond to the local preference value. On their routers they configured a policy
that sets the local preference to these values if they receive prefixes with these BGP
communities.
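Since a community is just one 32-bit number that we like to read as two 16-bit halves, the conversion is simple arithmetic. Here is a minimal Python sketch of it. The function names are my own and are not part of any router software.

```python
def community_to_int(text: str) -> int:
    """Convert 'AS:value' notation to the single 32-bit number."""
    asn, value = (int(part) for part in text.split(":"))
    if not (0 <= asn <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError("each half must fit in 16 bits")
    # The AS number fills the upper 16 bits, the value the lower 16 bits.
    return (asn << 16) | value


def int_to_community(number: int) -> str:
    """Convert the 32-bit number back to 'AS:value' notation."""
    return f"{number >> 16}:{number & 0xFFFF}"


print(community_to_int("3356:90"))
print(int_to_community(community_to_int("64984:0")))
```

The same value can therefore show up either as one large decimal number or as two smaller numbers separated by a colon, depending on how the device displays it.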
For this example I will use the following topology:
On the left side we have a customer router that is connected to ISP1. This ISP is
connected to ISP2 and ISP3. Let’s imagine that ISP2 is somewhere in Europe and that
ISP1 has a policy that they will prepend their AS number four times to BGP neighbors in
Europe whenever a customer adds BGP community value 64984:0 to their prefixes.
Let’s see how we can configure this on the ISP1 and customer router.
BGP Configuration
Here is the BGP configuration, it’s straight-forward eBGP:
Customer#show running-config | section bgp
router bgp 10
no synchronization
bgp log-neighbor-changes
network 10.10.10.10 mask 255.255.255.255
neighbor 192.168.10.1 remote-as 1
no auto-summary
ISP1#show running-config | section bgp
router bgp 1
no synchronization
bgp log-neighbor-changes
neighbor 192.168.10.10 remote-as 10
neighbor 192.168.12.2 remote-as 2
neighbor 192.168.13.3 remote-as 3
no auto-summary
ISP2#show running-config | section bgp
router bgp 2
no synchronization
bgp log-neighbor-changes
neighbor 192.168.12.1 remote-as 1
no auto-summary
ISP3#show running-config | section bgp
router bgp 3
no synchronization
bgp log-neighbor-changes
neighbor 192.168.13.1 remote-as 1
no auto-summary
Let’s see if ISP1 has learned any prefixes from the customer router:
ISP1 has learned the network on the loopback interface of the customer router. Right
now we don’t have any BGP communities. Let’s start with the configuration of ISP1…
ISP1(config)#ip community-list 1 permit 64984:0
The community-list is similar to an access-list or prefix-list but only used for BGP
communities. Our next step is to create a route-map that will prepend the AS path
whenever we see this value:
This route-map matches on community-list 1 and prepends the AS path four times. Let’s
attach it outbound to ISP2:
ISP1(config)#router bgp 1
ISP1(config-router)#neighbor 192.168.12.2 route-map PREPEND_EU out
This takes care of the configuration of ISP1. Let’s configure our customer router to send
the BGP community with its prefix advertisement now…
Everything that matches the prefix-list will have a community value of 64984:0. Now we
have to activate this route-map:
Customer(config)#router bgp 10
Customer(config-router)#neighbor 192.168.10.1 route-map SET_COMMUNITY out
Customer(config-router)#neighbor 192.168.10.1 send-community
Take a close look at the second command. We have to use the neighbor
send-community command because the router doesn’t automatically send BGP
communities to its neighbors. Everything is in place, let’s verify our work…
Verification
To speed things up I will reset BGP:
Customer#clear ip bgp *
This looks interesting. ISP1 did receive our community value but it’s showing it as a big
32-bit decimal number. There’s a command on Cisco IOS that lets you choose between
this output and the output with two 16-bit values. Let’s change it:
Use the ip bgp community new-format command and it will now look like this:
ISP1#show ip bgp 10.10.10.10
BGP routing table entry for 10.10.10.10/32, version 6
Paths: (1 available, best #1, table Default-IP-Routing-Table)
Advertised to update-groups:
1 2
10
192.168.10.10 from 192.168.10.10 (10.10.10.10)
Origin IGP, metric 0, localpref 100, valid, external, best
Community: 64984:0
ISP2#show ip bgp
BGP table version is 12, local router ID is 192.168.12.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
Great! Just as expected, the ISP1 router has prepended the AS path when it
advertised the prefix to ISP2. Next time our customer wants something prepended, they
only have to set the correct community value.
I hope this example has been useful to understand what BGP communities are about
and how to implement them. Of course we still have our well known communities:
Internet
No-Advertise
No-Export
Local-AS
When you add the no-advertise community to a prefix then the receiving BGP router will
use and store the prefix in its BGP table but it won’t advertise the prefix to any other
neighbors.
Let’s look at an example, this is the topology I will use:
Above you can see R1 with a loopback interface that has network 1.1.1.1/32. We will
advertise this network in BGP towards R2 with the no-advertise community set. As a
result, R2 will not advertise it to R3 (eBGP) or R4 (iBGP).
Configuration
Here’s the basic BGP configuration in case you want to try this example yourself:
neighbor 192.168.24.4 remote-as 24
neighbor 192.168.24.4 next-hop-self
no auto-summary
R3#show running-config | section bgp
router bgp 3
no synchronization
bgp log-neighbor-changes
neighbor 192.168.23.2 remote-as 24
no auto-summary
R4#show running-config | section bgp
router bgp 24
no synchronization
bgp log-neighbor-changes
neighbor 192.168.24.2 remote-as 24
no auto-summary
The prefix is in the BGP table of these routers. Now let’s configure R1 to add the
no-advertise community:
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 send-community
First we have to tell R1 to send BGP communities, by default this is disabled. Now we
can create a route-map that sets the community value:
This route-map doesn’t have any match statements so it will set the no advertise
community to all prefixes. Let’s activate it:
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 route-map NO_ADVERTISE out
The route-map is now activated on R1 for everything that is advertised to R2.
In this example I set the BGP community outbound on R1. It’s also possible to configure it inbound
on R2.
Before we reset BGP to activate our changes, let’s take a closer look at the BGP table:
Above you see the BGP table entry for 1.1.1.1/32 without any community information.
Let’s reset BGP so you can see the difference:
R2#clear ip bgp *
R2 learned the prefix and you can see the no-advertise community. As a result, it will no
longer advertise this prefix to R3 or R4:
Total number of prefixes 0
There’s nothing there…mission accomplished. Make sure you also check out the other
two BGP communities:
No Export
Local AS
The well known BGP community no export tells BGP neighbors to advertise a
prefix only to iBGP neighbors. If you are not sure what BGP communities are and how
they work then I advise you to read my introduction to BGP communities first before you
continue. Having said that, let’s take a look at a configuration example. Here’s the
topology we will use:
Above we see R1 with network 1.1.1.1/32 on a loopback interface. It will advertise this
prefix with the no-export community set. As a result, R2 will install it in its BGP table and
advertise it to R4 (iBGP). It will not be advertised to R3 since this is an eBGP session.
Configuration
Basic BGP Configuration
Here’s the BGP configuration in case you want to try this example yourself:
By default BGP does not send any communities. All routers will learn about 1.1.1.1/32:
Let’s configure our BGP community. First we have to tell R1 to send communities:
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 send-community
Now we can create a route-map that sets the BGP community to no-export and we
attach it to our neighbor R2:
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 route-map NO_EXPORT out
Before we reset the BGP session, take a look at the BGP table of R2:
Above you don’t see any BGP community information. Let’s reset BGP so that you can
see the difference:
R2#clear ip bgp *
192.168.12.1 from 192.168.12.1 (192.168.12.1)
Origin IGP, metric 0, localpref 100, valid, external, best
Community: no-export
You can see that this prefix is tagged with the no export community. R2 no longer
advertises it to eBGP neighbors. Let’s verify this:
The local AS community is a well known BGP community and can be used for BGP
confederations. It’s basically the same as the no-export community but this one works
within the sub-AS of a confederation. Prefixes that are tagged are only advertised to
other neighbors in the same sub-AS, not to other sub-ASes or eBGP routers.
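The three restrictive well known communities differ only in how far a prefix is allowed to travel. The Python sketch below summarizes that as a single decision function. It is a simplified model with hypothetical names: the session labels "ibgp", "confed_ebgp" (a neighbor in another sub-AS of the same confederation) and "ebgp" are my own, and normal BGP rules such as the iBGP split-horizon rule are not modeled.

```python
def may_advertise(communities, session):
    """Decide whether a prefix may be sent to a neighbor.

    communities: set of community names attached to the prefix
    session:     "ibgp", "confed_ebgp" or "ebgp"
    """
    if "no-advertise" in communities:
        return False                      # never advertised to anyone
    if "local-as" in communities and session != "ibgp":
        return False                      # stays inside the (sub-)AS
    if "no-export" in communities and session == "ebgp":
        return False                      # stays inside the AS or confederation
    return True


for community in ("no-advertise", "no-export", "local-as"):
    for session in ("ibgp", "confed_ebgp", "ebgp"):
        print(community, session, may_advertise({community}, session))
```

Reading the output row by row shows the scope growing: no-advertise stops at the receiving router, local AS stops at the sub-AS border, and no-export stops at the border of the whole AS or confederation.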
To demonstrate this I will use the following topology:
AS 2345 has 4 routers and 2 sub-ASes. We will advertise a prefix from R1 to AS 2345
so you can see what happens with and without the use of the local AS community. Let’s
look at the configuration…
Configuration
BGP Configuration
Below you will find the BGP configuration for all these routers. Within AS 2345 I have
used OSPF so that these routers can use their loopback interfaces for BGP.
neighbor 192.168.12.2 remote-as 2345
no auto-summary
R2#show running-config | section bgp
router bgp 23
no synchronization
bgp log-neighbor-changes
bgp confederation identifier 2345
bgp confederation peers 45
neighbor 3.3.3.3 remote-as 23
neighbor 3.3.3.3 update-source Loopback0
neighbor 4.4.4.4 remote-as 45
neighbor 4.4.4.4 ebgp-multihop 2
neighbor 4.4.4.4 update-source Loopback0
neighbor 192.168.12.1 remote-as 1
no auto-summary
R3#show running-config | section bgp
router bgp 23
no synchronization
bgp log-neighbor-changes
bgp confederation identifier 2345
bgp confederation peers 45
neighbor 2.2.2.2 remote-as 23
neighbor 2.2.2.2 update-source Loopback0
neighbor 5.5.5.5 remote-as 45
neighbor 5.5.5.5 ebgp-multihop 2
neighbor 5.5.5.5 update-source Loopback0
neighbor 192.168.36.6 remote-as 6
no auto-summary
R4#show running-config | section bgp
router bgp 45
no synchronization
bgp log-neighbor-changes
bgp confederation identifier 2345
bgp confederation peers 23
neighbor 2.2.2.2 remote-as 23
neighbor 2.2.2.2 ebgp-multihop 2
neighbor 2.2.2.2 update-source Loopback0
neighbor 5.5.5.5 remote-as 45
neighbor 5.5.5.5 update-source Loopback0
no auto-summary
R5#show running-config | section bgp
router bgp 45
no synchronization
bgp log-neighbor-changes
bgp confederation identifier 2345
bgp confederation peers 23
neighbor 3.3.3.3 remote-as 23
neighbor 3.3.3.3 ebgp-multihop 2
neighbor 3.3.3.3 update-source Loopback0
neighbor 4.4.4.4 remote-as 45
neighbor 4.4.4.4 update-source Loopback0
no auto-summary
R6#show running-config | section bgp
router bgp 6
no synchronization
bgp log-neighbor-changes
neighbor 192.168.36.3 remote-as 2345
no auto-summary
R1 has advertised prefix 1.1.1.1/32 in BGP, let’s see if our routers have learned this:
All routers know about this prefix. Time to activate the local AS community…
R2(config)#router bgp 23
R2(config-router)#neighbor 192.168.12.1 route-map LOCAL_AS in
R2(config-router)#neighbor 3.3.3.3 send-community
R2 sets the community so make sure that it advertises it to R3. Before we reset BGP,
take a look at the BGP table of R2:
Above you can see the output without any communities. Let’s reset BGP now:
R2#clear ip bgp *
Above you can see that this prefix has the local AS community. It will not be advertised
outside of our sub-AS. So which of our routers still has it?
Only R3 has the prefix now since it’s in the same sub-AS as R2. Another good method
to verify this is by checking which prefixes are advertised by R2 and R3:
Above you can see that R2 advertises 1.1.1.1/32 to R3, it doesn’t advertise it to R4
anymore:
Total number of prefixes 0
That’s all there is to it. Make sure you also check the other well known BGP
communities:
No-Advertise
No-Export
Regular Expressions are used often for BGP route manipulation or filtering. In this
lesson we’ll take a look at some useful regular expressions. First let’s take a look at the
different characters that we can use:
Characters
[]   is a range.
_    matches the space between AS numbers or the end of the AS PATH list.
Examples
^$               matches an empty AS PATH so it will match all prefixes from the local AS.
_51$             matches prefixes that originated in AS 51, the $ ensures that it’s the beginning of the AS PATH.
^(51_)+([0-9]+)  matches prefixes from the clients of directly connected AS 51, where AS 51 might be doing AS PATH prepending.
^51_([0-9]+_)+   matches prefixes from the clients of directly connected AS 51, where the clients might be doing AS PATH prepending.
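If you want to try these expressions without a router, you can approximate them in Python. The only character that does not translate directly is the underscore, which on Cisco IOS stands for a delimiter (space, comma, brace, parenthesis) or the start or end of the string. The sketch below does a naive textual replacement of every underscore, so it is an approximation for simple patterns like the ones above and not a full reimplementation of the IOS regex engine. The helper names are hypothetical.

```python
import re

# What "_" stands for in a Cisco AS path regular expression.
UNDERSCORE = r"(?:^|$|[ ,{}()])"


def cisco_to_python(pattern: str) -> str:
    """Naively rewrite a Cisco AS path regex into Python syntax."""
    return pattern.replace("_", UNDERSCORE)


def as_path_matches(pattern: str, as_path: str) -> bool:
    """Return True if the AS path string matches the Cisco-style pattern."""
    return re.search(cisco_to_python(pattern), as_path) is not None


print(as_path_matches("^$", ""))                       # locally originated
print(as_path_matches("_51$", "3257 174 51"))          # originated in AS 51
print(as_path_matches("_51$", "3257 151"))             # AS 151 is not AS 51
print(as_path_matches("^51_([0-9]+_)+", "51 100 100")) # client of AS 51
```

The third call shows why the underscore matters: without it, a pattern for AS 51 would also match AS 151.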
By default BGP will advertise all prefixes to EBGP (External BGP) neighbors. This
means that if you are multi-homed (connected to two or more ISPs) you might
become a transit AS. Let me show you an example:
R1 is connected to ISP1 and ISP2 and each router is in a different AS (Autonomous
System). Since R1 is multi-homed it’s possible that the ISPs will use R1 to reach each
other. In order to prevent this we’ll have to ensure that R1 only advertises prefixes from
its own autonomous system.
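As far as I know there are 4 methods you can use to prevent becoming a transit AS: a
filter-list with an AS path access-list, the no-export community, prefix-list filtering and
distribute-list filtering. Let’s start with the basic configuration: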
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 remote-as 2
R1(config-router)#neighbor 192.168.13.3 remote-as 3
ISP1(config)#router bgp 2
ISP1(config-router)#neighbor 192.168.12.1 remote-as 1
ISP2(config)#router bgp 3
ISP2(config-router)#neighbor 192.168.13.1 remote-as 1
The commands above will configure EBGP (External BGP) between R1 – ISP1 and R1
– ISP2. To make sure we have something to look at, I’ll advertise the loopback
interfaces in BGP on each router:
R1(config)#router bgp 1
R1(config-router)#network 1.1.1.0 mask 255.255.255.0
ISP1(config)#router bgp 2
ISP1(config-router)#network 2.2.2.0 mask 255.255.255.0
ISP2(config)#router bgp 3
ISP2(config-router)#network 3.3.3.0 mask 255.255.255.0
With the networks advertised, let’s take a look at the BGP table of ISP1 and ISP2 to see
what they have learned:
ISP1#show ip bgp
BGP table version is 4, local router ID is 11.11.11.11
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
The ISP routers have learned about each other’s networks and they will use R1 as the
next hop. We now have everything in place to play with the different filtering techniques.
R1(config)#ip as-path access-list 1 permit ^$
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 filter-list 1 out
R1(config-router)#neighbor 192.168.13.3 filter-list 1 out
The ^$ regular expression ensures that we will only advertise locally originated prefixes.
We’ll have to apply this filter to both ISPs.
Keep in mind that BGP is slow…if you are doing labs, it’s best to speed things up with clear ip bgp *
R1#show ip bgp
BGP table version is 4, local router ID is 22.22.22.22
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
R1 still knows about the prefixes from the ISP routers. What about ISP1 and ISP2?
ISP1#show ip bgp
BGP table version is 7, local router ID is 11.11.11.11
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
*> 3.3.3.0/24 0.0.0.0 0 32768 i
Besides their own network, ISP1 and ISP2 only know about the 1.1.1.0/24 network.
Excellent, we are no longer a transit AS!
No-Export Community
Using the no-export community will also work pretty well. We will configure R1 so that
prefixes from the ISP routers will be tagged with the no-export community. This ensures
that the prefixes from those routers will be known within AS 1 but won’t be advertised to
other routers.
R1(config)#route-map NO-EXPORT
R1(config-route-map)#set community no-export
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 route-map NO-EXPORT in
R1(config-router)#neighbor 192.168.13.3 route-map NO-EXPORT in
I’m only using one router in AS 1. If you have other routers and are running IBGP (Internal BGP)
then don’t forget to send communities to those routers with the neighbor <ip>
send-community command.
Let’s see what ISP1 and ISP2 think about our configuration:
ISP1#show ip bgp
BGP table version is 11, local router ID is 11.11.11.11
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
*> 1.1.1.0/24 192.168.13.1 0 0 1 i
*> 3.3.3.0/24 0.0.0.0 0 32768 i
Prefix-List Filtering
Using a prefix-list we can determine what prefixes are advertised to our BGP neighbors.
This works fine but it’s not a good solution to prevent becoming a transit AS. Each time
you add new prefixes you’ll have to reconfigure the prefix-list. Anyway let me show you
how it works:
The prefix-list above will only advertise 1.1.1.0 /24 to the ISP routers. Let’s verify the
configuration:
ISP1#show ip bgp
BGP table version is 17, local router ID is 11.11.11.11
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
Distribute-list Filtering
This method is similar to using the prefix-list but this time we’ll use an access-list.
ISP1#show ip bgp
BGP table version is 23, local router ID is 11.11.11.11
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
Filtering IPv6 routes in BGP is similar to IPv4 filtering. There are 3 methods we can use:
Prefix-list
Filter-list
Route-map
Each of these can be applied in- or outbound. I’ll explain how you can use these for
filtering, this is the topology I will use:
R1 and R2 are using IPv6 addresses and will use MP-BGP so that R1 can advertise
some prefixes on its loopback interfaces. All prefixes on the loopback interfaces are /64
subnets, except the one on loopback3, which is a /96 subnet.
Configuration
Let’s start with a basic MP-BGP configuration so that R1 and R2 become eBGP
neighbors:
R1 & R2(config)#ipv6 unicast-routing
R1(config)#router bgp 1
R1(config-router)#bgp router-id 1.1.1.1
R1(config-router)#neighbor 2001:db8:0:12::2 remote-as 2
R1(config-router)#address-family ipv6
R1(config-router-af)#neighbor 2001:db8:0:12::2 activate
R1(config-router-af)#network 2001:db8:0:1::/64
R1(config-router-af)#network 2001:db8:0:11::/64
R1(config-router-af)#network 2001:db8:0:111::/64
R1(config-router-af)#network 2001:db8:0:1111::/96
R2(config)#router bgp 2
R2(config-router)#bgp router-id 2.2.2.2
R2(config-router)#neighbor 2001:db8:0:12::1 remote-as 1
R2(config-router)#address-family ipv6
R2(config-router-af)#neighbor 2001:db8:0:12::1 activate
via FE80::21D:A1FF:FE8B:36D0, FastEthernet0/0
There we go, everything is in the routing table. Now we can play with some of the
filtering options…
Prefix-List Filtering
Let’s start with the prefix-list. R1 is advertising one /96 subnet. Let’s see if we can
configure R2 to filter this network:
This prefix-list checks the entire 2001::/16 range and permits subnets with a /64 or
larger. Anything smaller will be denied. Let’s activate it:
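The matching rule described above can be checked with Python’s ipaddress module. This sketch assumes the prefix-list entry has 2001::/16 as its base and a maximum prefix length of 64 (written as "le 64" on IOS). The exact statement is an assumption based on the description, and the function name is hypothetical.

```python
import ipaddress


def prefix_list_permits(prefix: str, base: str = "2001::/16", le: int = 64) -> bool:
    """Model a single 'permit <base> le <le>' prefix-list entry.

    A prefix is permitted when it falls inside the base range and its
    prefix length is not longer than `le`. Everything else hits the
    implicit deny at the end of the list.
    """
    net = ipaddress.ip_network(prefix)
    base_net = ipaddress.ip_network(base)
    return net.subnet_of(base_net) and net.prefixlen <= le


for candidate in ("2001:db8:0:1::/64", "2001:db8:0:1111::/96"):
    print(candidate, prefix_list_permits(candidate))
```

The /64 passes both checks, while the /96 falls inside the range but has a prefix length that is too long, so it is denied.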
R2(config)#router bgp 2
R2(config-router)#address-family ipv6
R2(config-router-af)#neighbor 2001:db8:0:12::1 prefix-list SMALL_NETWORKS in
We activate the prefix-list inbound on R2 for everything that we receive from R1. Let’s
reset BGP to speed things up:
R2#clear ip bgp *
Filter-List Filtering
Let’s try the filter-list. We can use this to filter prefixes from certain autonomous
systems. Everything that R1 is advertising only has AS 1 in the AS path, I’ll configure
AS prepending so we have something to play with:
R1(config)#router bgp 1
R1(config-router)#address-family ipv6
R1(config-router-af)#neighbor 2001:db8:0:12::2 route-map PREPEND out
The above configuration will make sure that whenever R1 advertises 2001:db8:0:1::/64
it will add AS 11 to the AS path. Let’s verify this:
Above you can see that 2001:DB8:0:1::/64 now has AS 11 in its AS path. Let’s
configure a filter-list on R2 to get rid of this network:
R2(config)#router bgp 2
R2(config-router)#address-family ipv6
R2(config-router-af)#neighbor 2001:db8:0:12::1 filter-list 11 in
R2#clear ip bgp *
The as-path access-list above only permits prefixes from AS1, nothing else. We attach it
inbound to everything we receive from R1. This is the result:
Route-Map Filtering
Route-maps are really useful and can be used to match on many different things. I’ll use
an IPv6 access-list in a route-map to filter 2001:DB8:0:11::/64:
R2(config)#router bgp 2
R2(config-router)#address-family ipv6
R2(config-router-af)#neighbor 2001:db8:0:12::1 route-map MY_FILTER in
R2#clear ip bgp *
The configuration above has an access-list called “THIRD_LOOPBACK” that matches
2001:DB8:0:11::/64 and is denied in the route-map called “MY_FILTER”. Last but not
least, we apply it inbound on R2. Here’s the result:
The access-list tells us that it has a match and you can see it’s gone from the routing
table.
Order of Operation
You have now seen how you can use a prefix-list, filter-list and route-map to filter IPv6
prefixes. You can apply all of these at the same time if you want, I didn’t remove any of
my previous configurations when I was writing this lesson. Take a look at R2:
On a production network you probably won’t use all of these at the same time. The
route-map is a popular choice since you can use it for pretty much anything, filtering and
doing things like prepending the AS path.
If you do activate all of these at the same time then you might want to know in what
order the router will process these filtering techniques. Here they are:
Inbound:
Route-map
Filter-List
Prefix-List
Outbound:
Prefix-List
Filter-List
Route-Map
Why do we care about this? Imagine you have an inbound route-map and prefix-list. If
you permitted a prefix in the prefix-list but denied it in the route-map then you will never
see the prefix in your BGP table since the route-map is processed before the prefix-list.
For outbound filtering it’s the other way around. If you permit something in the route-
map but denied it in a filter-list then it will never be advertised…the filter-list is
processed before the route-map for outbound updates.
Don’t make it too hard for yourself…it’s best to stick to using the route-map only since
you can attach prefix-lists and as-path access-lists to it.
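To see how the processing order plays out, here is a small Python sketch that walks a prefix through the stages in the order listed above and reports the first stage that denies it. The order is taken from this lesson, the function name is hypothetical, and a stage that is not configured is treated as a permit.

```python
INBOUND_ORDER = ("route-map", "filter-list", "prefix-list")
OUTBOUND_ORDER = ("prefix-list", "filter-list", "route-map")


def first_denying_stage(verdicts, direction):
    """Return the first stage that denies the prefix, or None if all permit.

    verdicts:  dict mapping stage name to True (permit) or False (deny)
    direction: "in" or "out"
    """
    order = INBOUND_ORDER if direction == "in" else OUTBOUND_ORDER
    for stage in order:
        if not verdicts.get(stage, True):  # an unconfigured stage permits
            return stage
    return None


# Permitted by the prefix-list but denied by the route-map, inbound.
print(first_denying_stage({"prefix-list": True, "route-map": False}, "in"))
# Permitted by the route-map but denied by the filter-list, outbound.
print(first_denying_stage({"route-map": True, "filter-list": False}, "out"))
```

Either way the prefix is dropped as soon as one stage denies it, so the order mostly matters when you are troubleshooting which filter was responsible.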
In this tutorial we’ll take a look at BGP AS path filtering. Using the AS path filter we can
permit or deny prefixes from certain autonomous systems. You can use this for things
like:
A looking glass server is a router on the Internet that has a (full) internet routing table.
You can use telnet to one and use show commands to view the BGP table. It’s a great
way to practice regular expressions since there’s plenty of prefixes to play with.
You can find a looking glass server on BGP4.as, I picked one that is close to me:
route-server.tinet.net
Once I connect to it through telnet this is what I see:
+--------------------------------------------------------------------+
|                                                                    |
| GTT Route Monitor - AS3257                                         |
|                                                                    |
| This system is solely for internet operational purposes. Any       |
| misuse is strictly prohibited. All connections to this router      |
| are logged.                                                        |
|                                                                    |
| This server provides a view on the Tinet legacy routing table      |
| that is used in Frankfurt/Germany. If you are interested in        |
| other regions of the backbone check out http://www.as3257.net/     |
|                                                                    |
| Please report problems to [email protected]                        |
|                                                                    |
+--------------------------------------------------------------------+
route-server.as3257.net>
route-server.as3257.net>show ip bgp
BGP table version is 4491321, local router ID is 213.200.87.253
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
*> 1.0.5.0/24      213.200.64.93       0   0 3257 6453 7545 56203 i
*> 1.0.6.0/24      213.200.64.93       0   0 3257 174 4826 38803 56203 i
*> 1.0.7.0/24      213.200.64.93       0   0 3257 174 4826 38803 56203 i
*> 1.0.20.0/23     213.200.64.93    1551   0 3257 2516 2519 i
*> 1.0.22.0/23     213.200.64.93    1551   0 3257 2516 2519 i
*> 1.0.24.0/23     213.200.64.93    1551   0 3257 2516 2519 i
*> 1.0.26.0/23     213.200.64.93    1551   0 3257 2516 2519 i
*> 1.0.28.0/22     213.200.64.93    1551   0 3257 2516 2519 i
*> 1.0.38.0/24     213.200.64.93     815   0 3257 9304 24155 i
*> 1.0.41.0/24     213.200.64.93     815   0 3257 9304 24155 i
*> 1.0.43.0/24     213.200.64.93     815   0 3257 9304 24155 i
*> 1.0.46.0/24     213.200.64.93     815   0 3257 9304 24155 i
*> 1.0.48.0/24     213.200.64.93     815   0 3257 9304 24155 i
*> 1.0.64.0/18     213.200.64.93    1551   0 3257 2516 7670 18144 i
*> 1.0.128.0/18    213.200.64.93       0   0 3257 174 38040 9737 i
*> 1.0.128.0/17    213.200.64.93       0   0 3257 38040 9737 9737 i
*> 1.0.129.0/24    213.200.64.93       0   0 3257 4651 9737 9737 23969 i
*> 1.0.130.0/24    213.200.64.93       0   0 3257 6453 4651 9737 9737 9737 23969 i
*> 1.0.131.0/24    213.200.64.93       0   0 3257 6453 4651 9737 9737 9737 23969 i
*> 1.0.142.0/23    213.200.64.93       0   0 3257 6453 4651 9737 9737 9737 23969 i
*> 1.0.160.0/19    213.200.64.93      18   0 3257 2914 38040 9737 i
*> 1.0.192.0/21    213.200.64.93       0   0 3257 6453 4651 9737 9737 9737 23969 i
Plenty of prefixes to play with…let’s try a couple of examples now shall we?
Only allow prefixes that originated from AS 3257
This example will only accept prefixes that originated in AS 3257, all the other prefixes
won’t be permitted:
Let me explain the regular expression that I used here. The ^ symbol means that this is
the beginning of the string and the $ matches the end of the string. We put 3257 in
between so only “3257” matches. If you want to configure this filter on a Cisco IOS
router you can do this with the as-path access-list command:
ip as-path access-list 1 permit ^3257$
route-map AS_PATH_FILTER permit 10
 match as-path 1
router bgp 1
 neighbor 213.200.64.93 remote-as 3257
 neighbor 213.200.64.93 route-map AS_PATH_FILTER in
The as-path access-list works like the normal access-lists, there is a hidden “deny any”
at the bottom. First we create the as-path access-list and then attach it to a route-map.
In the BGP configuration you can attach the route-map to one of your BGP neighbors.
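Here’s a minimal Python sketch of how such a list is evaluated: top to bottom, first match wins, and anything that matches nothing hits the hidden deny. The function name and the list layout are assumptions I made for this illustration, and Python’s re module only stands in for the IOS regex engine (that works here because ^3257$ uses no Cisco-specific characters).

```python
import re


def as_path_acl_permits(entries, as_path):
    """Evaluate an as-path access-list against one AS path string.

    entries is an ordered list of (action, regex) tuples, where action
    is "permit" or "deny". The first matching entry decides.
    """
    for action, regex in entries:
        if re.search(regex, as_path):
            return action == "permit"
    return False  # the hidden "deny any" at the bottom


acl_1 = [("permit", r"^3257$")]

print(as_path_acl_permits(acl_1, "3257"))            # path is only AS 3257
print(as_path_acl_permits(acl_1, "3257 2516 2519"))  # originated in AS 2519
```

The first path is permitted, the second one falls through to the hidden deny.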
BGP table version is 4492787, local router ID is 213.200.87.253
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
The regular expression starts and ends with a _ . This matches the space between the
AS path numbers. I’m not using a ^ or $ to indicate the start and end of the string so
there can be as many autonomous systems as we want, as long as the path passes through
AS 3257 it will match. Here’s what it looks like on a router:
ip as-path access-list 1 permit _3257_
router bgp 1
 neighbor 213.200.64.93 remote-as 3257
 neighbor 213.200.64.93 route-map AS_PATH_FILTER in
Deny prefixes that originated from AS 56203 and permit everything else
This one might be useful if you want to block prefixes that originated in a particular AS:
Origin codes: i - IGP, e - EGP, ? - incomplete
The originating AS is always on the right side of the AS path, so in order to match it we
end the string with a $ and put the AS number just in front of it. The _ will match the
space in front of the AS number. On a router it will look like this:
ip as-path access-list 1 deny _56203$
ip as-path access-list 1 permit .*
router bgp 1
 neighbor 213.200.64.93 remote-as 3257
 neighbor 213.200.64.93 route-map AS_PATH_FILTER in
First we use a deny statement to block the AS number and then we use a permit .* to
allow everything else. The . (dot) matches anything and the * (wildcard) means “repeat
the previous character zero or many times”. This will permit everything.
Allow prefixes from AS 3257 and its directly connected ASes but deny the rest
This one lets us accept all prefixes from AS 3257 and the directly connected
autonomous systems of AS 3257:
route-server.as3257.net>show ip bgp regexp ^3257_[0-9]*$
BGP table version is 4493802, local router ID is 213.200.87.253
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
We start with ^3257 so that we only accept paths that begin with AS 3257. The _ will match
the space and the [0-9] will match any single digit from 0 to 9. The * means
that we repeat the previous item (a digit) zero or more times. This means that AS 1 would
match, but also AS 123 or AS 12345, etc. The $ at the end makes sure that at most one
autonomous system behind AS 3257 is allowed. Since the * also allows zero digits and the _
also matches the end of the string, the path 3257 on its own matches as well.
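If you want to play with these regular expressions away from a router, here’s a rough Python sketch. Python’s re module doesn’t know the Cisco _ character, so I translate it first; that translation (space, comma, braces, parentheses, or the start or end of the string) and the helper names are my own assumptions. The first two sample paths are made up, the last two come from the looking glass output.

```python
import re

# Assumption: on Cisco IOS "_" matches a space, comma, brace, parenthesis,
# or the start/end of the AS path string.
CISCO_UNDERSCORE = r"(?:^|$|[ ,{}()])"


def cisco_to_python(regex):
    """Translate the Cisco "_" into an equivalent Python regex group."""
    return regex.replace("_", CISCO_UNDERSCORE)


def matches(regex, as_path):
    """Return True if the Cisco-style regex matches the AS path string."""
    return re.search(cisco_to_python(regex), as_path) is not None


paths = ["3257", "3257 2516", "3257 2516 2519", "3257 6453 7545 56203"]

for regex in ["^3257$", "_3257_", "_56203$", "^3257_[0-9]*$"]:
    print(regex, [p for p in paths if matches(regex, p)])
```

The last expression matches “3257” and “3257 2516” but not the longer paths, which is exactly the “AS 3257 and its directly connected ASes” behavior.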
ip as-path access-list 1 permit ^3257_[0-9]*$
router bgp 1
 neighbor 213.200.64.93 remote-as 3257
 neighbor 213.200.64.93 route-map AS_PATH_FILTER in
That’s it for now! I hope these examples are helpful to understand regular expressions a
bit more and how to configure the as-path access-list on a Cisco IOS router. If you have
any questions, feel free to ask.
Nowadays we use prefix-lists to filter BGP prefixes. Prefix-lists are very convenient since
they allow you to specify a network address with a specific prefix length or a range of
prefix lengths. Back in the day, before prefix-lists existed on Cisco IOS, you had to use
extended access-lists for this.
You really don’t want to use these anymore since the prefix-list does the same thing and
the configuration is much easier. However, when you face a CCIE lab it might be
possible that a task requires you to filter certain prefixes but you are not allowed to use
the prefix-list. The extended access-list will be your only option then…
Having said that, let’s take a look at how extended access-list filtering works. The
“behavior” of the extended access-list is different compared to when you use it for
filtering IP packets.
When you use IP as the protocol, here’s what the extended access-list normally looks
like:
Above you see the source address with the source wildcard bits and the destination
address with destination wildcard bits. Now forget what you have seen above, this is
how the extended access-list works for BGP filtering:
The first field is for the network address, for example 10.0.0.0.
The second field is used to define what part of the network address to check. For example, when
we specify 10.0.0.0 then we use wildcard bits to tell the router if we want to look for 10.0.0.0,
10.0.0.x, 10.0.x.x or 10.x.x.x.
The subnet mask and its wildcard bits are used to define the prefix length, we can use this to tell
the router to look for /24, /25, /26 or a range like /24 to /32.
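The three points above can be sketched in a few lines of Python. This is only my own model of the matching logic, not how IOS implements it: the function name and the sample entry (10.0.0.0 0.0.255.255 255.255.255.0 0.0.0.255) are assumptions for illustration. A wildcard bit of 0 means “must match”, a 1 means “don’t care”.

```python
import ipaddress


def to_int(dotted):
    """Convert a dotted-decimal string to a 32-bit integer."""
    return int(ipaddress.IPv4Address(dotted))


def acl_matches(prefix, net, net_wildcard, mask, mask_wildcard):
    """Check a BGP prefix against one extended access-list entry.

    net / net_wildcard are compared with the network address,
    mask / mask_wildcard are compared with the subnet mask.
    """
    network = ipaddress.IPv4Network(prefix)
    care_net = ~to_int(net_wildcard) & 0xFFFFFFFF
    care_mask = ~to_int(mask_wildcard) & 0xFFFFFFFF
    net_ok = (int(network.network_address) & care_net) == (to_int(net) & care_net)
    mask_ok = (int(network.netmask) & care_mask) == (to_int(mask) & care_mask)
    return net_ok and mask_ok


# Hypothetical entry: networks 10.0.x.x with a prefix length of /24 to /32.
entry = ("10.0.0.0", "0.0.255.255", "255.255.255.0", "0.0.0.255")

for prefix in ["10.0.5.0/24", "10.0.5.128/25", "10.1.0.0/24", "10.0.0.0/16"]:
    print(prefix, acl_matches(prefix, *entry))
```

The first two prefixes match, 10.1.0.0/24 fails on the network address and 10.0.0.0/16 fails on the prefix length.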
Using the extended access-list for BGP filtering is something that is best explained with
some examples. I’ll use two routers and some prefixes and we’ll walk through some
different filtering examples.
Configuration
I will use the following two routers for this:
R2 has a bunch of loopback interfaces with different networks, we’ll use these to play
with filtering.
*> 10.3.0.0/25 0.0.0.0 0 32768 i
*> 10.3.0.128/25 0.0.0.0 0 32768 i
*> 10.4.0.0/25 0.0.0.0 0 32768 i
*> 10.4.0.128/25 0.0.0.0 0 32768 i
*> 10.5.0.0/26 0.0.0.0 0 32768 i
*> 10.6.0.0/27 0.0.0.0 0 32768 i
*> 10.7.0.0/28 0.0.0.0 0 32768 i
*> 10.8.1.0/24 0.0.0.0 0 32768 i
*> 10.8.2.0/24 0.0.0.0 0 32768 i
*> 20.0.0.0 0.0.0.0 0 32768 i
*> 30.0.0.0 0.0.0.0 0 32768 i
*> 172.16.0.0/24 0.0.0.0 0 32768 i
*> 172.16.1.0/24 0.0.0.0 0 32768 i
*> 172.16.2.0/25 0.0.0.0 0 32768 i
*> 172.16.3.0/25 0.0.0.0 0 32768 i
*> 172.16.4.0/26 0.0.0.0 0 32768 i
*> 172.16.5.0/27 0.0.0.0 0 32768 i
*> 172.16.6.0/28 0.0.0.0 0 32768 i
*> 172.16.7.0/29 0.0.0.0 0 32768 i
*> 192.168.0.0 0.0.0.0 0 32768 i
*> 192.168.1.0 0.0.0.0 0 32768 i
*> 192.168.2.0/25 0.0.0.0 0 32768 i
*> 192.168.3.0/25 0.0.0.0 0 32768 i
*> 192.168.4.0/26 0.0.0.0 0 32768 i
*> 192.168.5.0/27 0.0.0.0 0 32768 i
*> 192.168.6.0/28 0.0.0.0 0 32768 i
*> 192.168.7.0/29 0.0.0.0 0 32768 i
*> 192.168.7.8/29 0.0.0.0 0 32768 i
*> 192.168.7.16/29 0.0.0.0 0 32768 i
*> 192.168.7.24/30 0.0.0.0 0 32768 i
*> 192.168.12.0 0.0.0.0 0 32768 i
20.0.0.0 /8
172.16.0.0 /24
192.168.1.0 /24
Here’s what the access-list will look like:
R1(config)#access-list 100 permit ip 20.0.0.0 0.0.0.0 255.0.0.0 0.0.0.0
R1(config)#access-list 100 permit ip 172.16.0.0 0.0.0.0 255.255.255.0 0.0.0.0
R1(config)#access-list 100 permit ip 192.168.1.0 0.0.0.0 255.255.255.0 0.0.0.0
R1(config)#router bgp 1
R1(config-router)#distribute-list 100 in
R1#clear ip bgp *
In the first entry we want an exact match for “20.0.0.0” so we use network 20.0.0.0 with
wildcard 0.0.0.0. The prefix-length has to be exactly /8 so we use subnet mask 255.0.0.0 with
wildcard 0.0.0.0.
In the second entry we want an exact match for “172.16.0.0” so we use network 172.16.0.0 with
wildcard 0.0.0.0. The prefix-length has to be exactly /24 so we use subnet mask 255.255.255.0 with
wildcard 0.0.0.0.
In the last entry we want an exact match for “192.168.1.0” so we use network 192.168.1.0 with
wildcard 0.0.0.0. The prefix-length has to be exactly /24 so we use subnet mask 255.255.255.0
with wildcard 0.0.0.0.
Let’s see what we get:
R1#show ip bgp
BGP table version is 4, local router ID is 192.168.12.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
Great, we only see our three specific prefixes. One little “extra” that the access-list
offers us that the prefix-list doesn’t is that it shows matches:
20 permit ip host 172.16.0.0 host 255.255.255.0 (1 match)
30 permit ip host 192.168.1.0 host 255.255.255.0 (2 matches)
We only want to see 192.168.0.0 /24, 192.168.1.0 /24 and 192.168.12.0 /24 on R1.
Here’s the access-list we will create:
R1(config)#router bgp 1
R1(config-router)#distribute-list 101 in
R1#clear ip bgp *
Here’s the result:
R1#show ip bgp
BGP table version is 4, local router ID is 192.168.12.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
Great, these are the only 192.168.x.0 /24 networks that we have. Time for the next
example…
R1(config)#router bgp 1
R1(config-router)#distribute-list 102 in
R1#clear ip bgp *
The network we want to check is 10.0.0.0 but we only care about the 1st and 4th octet, the 2nd
and 3rd octet can be everything so we use wildcard 0.255.255.0.
We want all networks with a /24 prefix length so we use 255.255.255.0 as the subnet mask. This
has to be an exact match so we use 0.0.0.0 as the wildcard.
Here’s what we get:
R1#show ip bgp
BGP table version is 6, local router ID is 192.168.12.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
Great, these are all networks in the 10.x.x.0 range with a /24 prefix length. Let’s try
something else…
*> 10.4.0.0/25 0.0.0.0 0 32768 i
*> 10.4.0.128/25 0.0.0.0 0 32768 i
*> 10.5.0.0/26 0.0.0.0 0 32768 i
*> 10.6.0.0/27 0.0.0.0 0 32768 i
*> 10.7.0.0/28 0.0.0.0 0 32768 i
*> 10.8.1.0/24 0.0.0.0 0 32768 i
*> 10.8.2.0/24 0.0.0.0 0 32768 i
R1(config)#router bgp 1
R1(config-router)#distribute-list 103 in
R1#clear ip bgp *
We want to check the 10.0.0.0 network but we don’t care about the 2nd, 3rd or 4th octet. That’s
why we use a 0.255.255.255 wildcard.
The subnet mask is 255.255.255.128 which equals /25. It has to be an exact match so we use
wildcard 0.0.0.0.
Here’s what you will find:
R1#show ip bgp
BGP table version is 5, local router ID is 192.168.12.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
Excellent, these are all 10.x.x.x networks with a /25 prefix length.
This example will be a bit different. This time I want to filter all networks that start with
192.168.7.x but I don’t care about the prefix length. We are talking about the following
prefixes:
R1(config)#router bgp 1
R1(config-router)#distribute-list 104 in
R1#clear ip bgp *
We are looking for network 192.168.7.0 but we only want to check the first three octets, that’s
why we use wildcard 0.0.0.255.
We don’t care about the prefix length, it should be at least a /24 since we are looking at the
192.168.7.x range but it doesn’t matter if it’s a /25, /26, etc. This is why we use subnet mask
255.255.255.0 with wildcard 0.0.0.255. It means that we don’t care about the prefix length in the
4th octet.
Here’s the result:
R1#show ip bgp
BGP table version is 5, local router ID is 192.168.12.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
r> 192.168.7.16/29 192.168.12.2 0 0 2 i
r> 192.168.7.24/30 192.168.12.2 0 0 2 i
R1 will only have these networks in its BGP table now, everything else will be filtered.
*> 192.168.2.0/25 0.0.0.0 0 32768 i
*> 192.168.3.0/25 0.0.0.0 0 32768 i
*> 192.168.4.0/26 0.0.0.0 0 32768 i
*> 192.168.5.0/27 0.0.0.0 0 32768 i
*> 192.168.6.0/28 0.0.0.0 0 32768 i
*> 192.168.7.0/29 0.0.0.0 0 32768 i
*> 192.168.7.8/29 0.0.0.0 0 32768 i
*> 192.168.7.16/29 0.0.0.0 0 32768 i
*> 192.168.7.24/30 0.0.0.0 0 32768 i
*> 192.168.12.0 0.0.0.0 0 32768 i
We have a big list of prefixes, most of them have a prefix length of /24 or
longer. We do have 20.0.0.0 /8 and 30.0.0.0 /8 that will be gone when we create this filter.
Time to find out:
R1(config)#router bgp 1
R1(config-router)#distribute-list 105 in
R1#clear ip bgp *
We don’t care about the network so the network address is 0.0.0.0 with wildcard
255.255.255.255.
We want all prefixes with a prefix length of at least /24, that’s why we pick a subnet mask of
255.255.255.0 and a wildcard of 0.0.0.255. This means we don’t care about the 4th octet so it
will match everything from /24 to /32.
Let’s find out if it works:
R1#show ip bgp
BGP table version is 33, local router ID is 192.168.12.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
r> 10.1.0.0/24 192.168.12.2 0 0 2 i
r> 10.2.0.0/24 192.168.12.2 0 0 2 i
r> 10.3.0.0/25 192.168.12.2 0 0 2 i
r> 10.3.0.128/25 192.168.12.2 0 0 2 i
r> 10.4.0.0/25 192.168.12.2 0 0 2 i
r> 10.4.0.128/25 192.168.12.2 0 0 2 i
r> 10.5.0.0/26 192.168.12.2 0 0 2 i
r> 10.6.0.0/27 192.168.12.2 0 0 2 i
r> 10.7.0.0/28 192.168.12.2 0 0 2 i
r> 10.8.1.0/24 192.168.12.2 0 0 2 i
r> 10.8.2.0/24 192.168.12.2 0 0 2 i
r> 172.16.0.0/24 192.168.12.2 0 0 2 i
r> 172.16.1.0/24 192.168.12.2 0 0 2 i
r> 172.16.2.0/25 192.168.12.2 0 0 2 i
r> 172.16.3.0/25 192.168.12.2 0 0 2 i
r> 172.16.4.0/26 192.168.12.2 0 0 2 i
r> 172.16.5.0/27 192.168.12.2 0 0 2 i
r> 172.16.6.0/28 192.168.12.2 0 0 2 i
r> 172.16.7.0/29 192.168.12.2 0 0 2 i
r> 192.168.0.0 192.168.12.2 0 0 2 i
r> 192.168.1.0 192.168.12.2 0 0 2 i
r> 192.168.2.0/25 192.168.12.2 0 0 2 i
r> 192.168.3.0/25 192.168.12.2 0 0 2 i
r> 192.168.4.0/26 192.168.12.2 0 0 2 i
r> 192.168.5.0/27 192.168.12.2 0 0 2 i
r> 192.168.6.0/28 192.168.12.2 0 0 2 i
r> 192.168.7.0/29 192.168.12.2 0 0 2 i
r> 192.168.7.8/29 192.168.12.2 0 0 2 i
r> 192.168.7.16/29 192.168.12.2 0 0 2 i
r> 192.168.7.24/30 192.168.12.2 0 0 2 i
r> 192.168.12.0 192.168.12.2 0 0 2 i
Our 20.0.0.0 /8 and 30.0.0.0 /8 prefixes are now gone from the BGP table, everything
you see above has at least a /24 prefix length.
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
R1(config)#router bgp 1
R1(config-router)#distribute-list 106 in
R1#clear ip bgp *
We don’t care about the network address so we use 0.0.0.0 as the network address with wildcard
255.255.255.255.
The prefix length has to be at least /26, that’s a 255.255.255.192 subnet mask.
We want to match all prefixes from /26 to /32, so we use wildcard 0.0.0.63. With this wildcard we
tell the router that the first two bits of the 4th octet have to match, we don’t care about the last
six bits. This will match subnet mask 255.255.255.192, 255.255.255.224, 255.255.255.240,
255.255.255.248, 255.255.255.252, 255.255.255.254 and 255.255.255.255 (everything from /26 to /32).
Here’s the end result:
R1#show ip bgp
BGP table version is 15, local router ID is 192.168.12.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
Above you can see that all prefixes with a prefix length shorter than /26 have disappeared.
This example will be similar to the previous one with the exception that we will check a
specific network range. Here are all networks in the 172.16.x.x range that R2 offers us:
R1(config)#router bgp 1
R1(config-router)#distribute-list 107 in
R1#clear ip bgp *
We want to check network 172.16.0.0 but we don’t care about the 3rd or 4th octet so we use
wildcard 0.0.255.255.
The prefix length should be at least /27 so we use a subnet mask of 255.255.255.224.
We want to match all subnet masks from /27 to /32 so we use a wildcard of 0.0.0.31. This means
the first three octets have to match and the first three bits of the 4th octet. This will allow subnet
mask 255.255.255.224, 255.255.255.240, 255.255.255.248, 255.255.255.252,
255.255.255.254 and 255.255.255.255.
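If you want to double check which prefix lengths a subnet mask and wildcard pair will match, you can let a small script do the bit math. This is a minimal sketch; the function name is my own and it simply tries every prefix length from /0 to /32.

```python
import ipaddress


def matched_prefix_lengths(mask, wildcard):
    """List all prefix lengths whose subnet mask matches mask/wildcard.

    Wildcard bits set to 0 have to match, bits set to 1 are ignored.
    """
    want = int(ipaddress.IPv4Address(mask))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    lengths = []
    for length in range(33):
        netmask = (0xFFFFFFFF << (32 - length)) & 0xFFFFFFFF
        if (netmask & care) == (want & care):
            lengths.append(length)
    return lengths


print(matched_prefix_lengths("255.255.255.224", "0.0.0.31"))  # /27 to /32
print(matched_prefix_lengths("255.255.255.192", "0.0.0.63"))  # /26 to /32
print(matched_prefix_lengths("255.255.255.0", "0.0.0.0"))     # exactly /24
```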
Here’s the end result:
R1#show ip bgp
BGP table version is 4, local router ID is 192.168.12.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
r> 172.16.5.0/27 192.168.12.2 0 0 2 i
r> 172.16.6.0/28 192.168.12.2 0 0 2 i
r> 172.16.7.0/29 192.168.12.2 0 0 2 i
Great, we only have a few 172.16.x.x networks with a /27 prefix length or longer.
Conclusion
You have now seen quite a few examples of how you can use BGP filtering with
extended access-lists. This can be pretty annoying and it’s much easier to use
prefix-lists instead. However, if you are not allowed to use them, you now know how to filter
with extended access-lists.
When you configure BGP on a router it’s possible that some of the BGP neighbors
share the exact same configuration. This can be annoying since you have to type in the
exact same commands for each of these neighbors. Also, when BGP prepares updates
it does this separately for each neighbor. This means that it has to use CPU resources
to prepare the update for each neighbor.
To simplify the configuration of BGP and to reduce the number of updates BGP has to
create, we can use peer groups. We can add neighbors to a peer group and then apply
all our configurations to the peer group. BGP will prepare the updates for the peer group
which requires less CPU resources than preparing them for each neighbor separately.
Configuration
Let’s take a look at two examples so you can see the difference between using peer
groups or not. I’ll use the following topology to demonstrate this:
Above we have 4 routers in different autonomous systems. R1 is connected to R2, R3
and R4. Let’s say that we have the following requirements for these eBGP neighbors:
R1(config)#router bgp 1
R1(config-router)#neighbor 2.2.2.2 remote-as 2
R1(config-router)#neighbor 3.3.3.3 remote-as 3
R1(config-router)#neighbor 4.4.4.4 remote-as 4
R1(config-router)#neighbor 2.2.2.2 update-source loopback 0
R1(config-router)#neighbor 3.3.3.3 update-source loopback 0
R1(config-router)#neighbor 4.4.4.4 update-source loopback 0
R1(config-router)#neighbor 2.2.2.2 ebgp-multihop 2
R1(config-router)#neighbor 3.3.3.3 ebgp-multihop 2
R1(config-router)#neighbor 4.4.4.4 ebgp-multihop 2
R1(config-router)#neighbor 2.2.2.2 route-map SET_MED out
R1(config-router)#neighbor 3.3.3.3 route-map SET_MED out
R1(config-router)#neighbor 4.4.4.4 route-map SET_MED out
In the configuration of R1 above the only difference is the AS number for each neighbor.
The update-source, ebgp-multihop and route-map are the same. This works but we
have to repeat the same commands over and over again.
First we have to configure the AS number for each eBGP neighbor separately:
R1(config)#router bgp 1
R1(config-router)#neighbor 2.2.2.2 remote-as 2
R1(config-router)#neighbor 3.3.3.3 remote-as 3
R1(config-router)#neighbor 4.4.4.4 remote-as 4
Now we can create the peer group. If you look at the neighbor command you will see
some options:
R1(config-router)#neighbor ?
A.B.C.D Neighbor address
WORD Neighbor tag
X:X:X:X::X Neighbor IPv6 address
We can specify an IPv4 or IPv6 address for the neighbor or we can use a tag. That’s
what we need to use for the peer group, let’s try that:
I’ll call my peer group R2-R3. The next step is to add my neighbors to this peer group:
That’s all you have to configure. Everything else you want to configure can be applied to
the peer group instead of applying it to the neighbor directly:
That’s all there is to it. These three commands are now applied to R2, R3 and R4
thanks to our peer group. This saves us some typing and copy/pasting and the router
will require less CPU cycles for its BGP updates.
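To get a feeling for how much typing a peer group saves, here’s a small Python sketch that builds both command lists for the three neighbors from this example and counts the lines. The peer group name EBGP_PEERS and the helper functions are hypothetical, I only use them for this illustration.

```python
NEIGHBORS = {"2.2.2.2": 2, "3.3.3.3": 3, "4.4.4.4": 4}
SHARED = ["update-source loopback 0", "ebgp-multihop 2", "route-map SET_MED out"]
GROUP = "EBGP_PEERS"  # hypothetical peer group name


def per_neighbor_config(neighbors, shared):
    """Every shared command is repeated for every neighbor."""
    lines = [f"neighbor {ip} remote-as {asn}" for ip, asn in neighbors.items()]
    lines += [f"neighbor {ip} {cmd}" for cmd in shared for ip in neighbors]
    return lines


def peer_group_config(neighbors, shared, group):
    """Shared commands are applied once, to the peer group."""
    lines = [f"neighbor {ip} remote-as {asn}" for ip, asn in neighbors.items()]
    lines.append(f"neighbor {group} peer-group")
    lines += [f"neighbor {ip} peer-group {group}" for ip in neighbors]
    lines += [f"neighbor {group} {cmd}" for cmd in shared]
    return lines


print(len(per_neighbor_config(NEIGHBORS, SHARED)))       # 12 commands
print(len(peer_group_config(NEIGHBORS, SHARED, GROUP)))  # 10 commands
```

With only three neighbors the difference is small, but every extra neighbor costs four lines without a peer group and only two with one.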
Route reflectors (RR) are one method to get rid of the full-mesh of IBGP peers in your
network. The other method is BGP confederations.
The route reflector allows all IBGP speakers within your autonomous system to learn
about the available routes without introducing loops. Let me show you an example
picture:
Above we have a network with 6 IBGP routers. Using the full mesh formula we can
calculate the number of IBGP peerings:
N(N-1)/2
So that will be:
6(6-1)/2 = 30/2 = 15 IBGP peerings.
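Here’s the same formula as a tiny Python function so you can see how quickly the full mesh grows. The function name is my own choice.

```python
def full_mesh_peerings(routers):
    """Number of IBGP peerings in a full mesh: N(N-1)/2."""
    return routers * (routers - 1) // 2


for n in (3, 6, 10, 50):
    print(n, "routers:", full_mesh_peerings(n), "peerings")
```

Six routers need 15 peerings, 50 routers already need 1225.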
When we use a route reflector our network could look like this:
We still have 6 routers but each router only has an IBGP peering with the route reflector
on top. When one of those IBGP routers advertises a route to the route reflector, it will
be “reflected” to all other IBGP routers:
This simplifies our IBGP configuration a lot but there’s also a downside. What if the
route reflector crashes? It’s a single point of failure when it comes to IBGP peerings. Of
course there’s a solution to this, we can have multiple route reflectors in our network. I’ll
give you some examples later.
EBGP neighbor
IBGP client neighbor
IBGP non-client neighbor
When you configure a route reflector you have to tell the router whether the other IBGP
router is a client or non-client. A client is an IBGP router that the route reflector will
“reflect” routes to, the non-client is just a regular IBGP neighbor.
1. A route learned from an EBGP neighbor can be forwarded to another EBGP neighbor, a client
and non-client.
2. A route learned from a client can be forwarded to another EBGP neighbor, client and non-
client.
3. A route learned from a non-client can be forwarded to another EBGP neighbor and client, but
not to a non-client.
The third rule makes sense, this is our normal IBGP split horizon behavior.
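The three rules above boil down to a single exception, which this minimal Python sketch shows. The function name and the three string labels are assumptions I picked for the illustration.

```python
PEER_TYPES = ("ebgp", "client", "non-client")


def may_advertise(learned_from, send_to):
    """Apply the route reflector rules from the list above.

    Only a route learned from a non-client is held back, and only
    towards another non-client (normal IBGP split horizon).
    """
    if learned_from not in PEER_TYPES or send_to not in PEER_TYPES:
        raise ValueError("unknown peer type")
    return not (learned_from == "non-client" and send_to == "non-client")


for src in PEER_TYPES:
    for dst in PEER_TYPES:
        print(f"{src:>10} -> {dst:<10} {may_advertise(src, dst)}")
```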
Now you have an idea what the route reflector is about, let’s take a look at some
configurations.
Configuration
We’ll use a simple example, 3 IBGP routers with a single route reflector:
In this example we have 3 IBGP routers. With normal IBGP rules, when R2 receives a
route from R1 it will not be forwarded to R3 (IBGP split horizon). We will configure R2 as
the route reflector to get around this. Let’s configure R1 and R3 first:
The configuration of R1 and R3 is exactly the same as a normal IBGP peering. Only the
configuration on the route reflector is special:
R2(config)#router bgp 123
R2(config-router)#neighbor 192.168.12.1 remote-as 123
R2(config-router)#neighbor 192.168.12.1 route-reflector-client
Here’s the magic…when we configure the route reflector we have to specify its clients.
In this case, R1 and R3. In my topology I have added a loopback interface on R1, let’s
advertise that in BGP to see what it looks like on R2 and R3:
That’s all we have to configure. Let’s use some show commands to verify our work.
Verification
First we’ll look at R2, see if it learned anything:
R2 shows us that this route was received from a route reflector client. Did it advertise
anything to R3? Let’s find out:
Excellent, the 1.1.1.1/32 route was advertised to R3. Let’s see what R3 thinks of this:
R3 has learned about this route from R2 and there are two important new fields that you
can see here:
Originator
Cluster List
This information was added by R2 but for what reason?
The IBGP split horizon rule was created to prevent loops, since our route reflector
violates this rule we have to think of a new rule for loop prevention. That’s what these
two items are used for:
The originator ID is set by the route reflector, you can see that this is the router ID of
R1, the router that originated the route. When an IBGP router receives a route with its own
originator ID, it will not accept the route. Just like with OSPF or EIGRP, it’s important
that each BGP router has a unique router ID.
The other item, the cluster list, contains the cluster ID of each route reflector that
reflected the route. By default the cluster ID is the router ID of the route reflector. When
we talk about a cluster, we refer to a route reflector and its clients. Let me give you an
example of a larger topology with multiple route reflectors:
In this topology we have 3 route reflectors, each serves 2 IBGP neighbors. Between the
route reflectors we still have to configure full mesh IBGP. It’s possible that a route
loops between these route reflectors so when a route reflector sees its own cluster ID, it
will drop the route.
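Both loop prevention checks can be sketched in Python like this. It’s a simplified model under my own assumptions: the router IDs (1.1.1.1 for R1, 2.2.2.2 for the route reflector, 3.3.3.3 for R3), the dictionary layout and the function names are all hypothetical.

```python
def reflect(route, reflector_cluster_id, learned_from_router_id):
    """Return a copy of the route the way a route reflector passes it on."""
    return {
        "prefix": route["prefix"],
        # The originator ID is only set when it is not present yet.
        "originator_id": route.get("originator_id") or learned_from_router_id,
        # The reflector adds its own cluster ID to the front of the list.
        "cluster_list": [reflector_cluster_id] + route.get("cluster_list", []),
    }


def accepts(route, router_id, cluster_id=None):
    """Loop check on the receiving router (cluster_id only for reflectors)."""
    if route.get("originator_id") == router_id:
        return False  # our own route came back to us
    if cluster_id is not None and cluster_id in route.get("cluster_list", []):
        return False  # the route already passed through our cluster
    return True


route_from_r1 = {"prefix": "1.1.1.1/32"}
reflected = reflect(route_from_r1, "2.2.2.2", "1.1.1.1")

print(reflected)
print(accepts(reflected, "3.3.3.3"))                        # R3 accepts
print(accepts(reflected, "1.1.1.1"))                        # R1 refuses
print(accepts(reflected, "9.9.9.9", cluster_id="2.2.2.2"))  # same cluster refuses
```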
In this tutorial we’ll take a look at the BGP Confederation. As you might know, IBGP
requires a full mesh of peerings which can become an administrative nightmare. If you
don’t know why we need a full mesh, I recommend reading my IBGP tutorial first.
To reduce the number of IBGP peerings there are two techniques:
Confederations
Route Reflector
Let’s talk about confederations, look at the picture below:
Above we have AS 1 with 6 routers running IBGP. The number of IBGP peerings can be
calculated with the full mesh formula:
N(N-1)/2
So in our case that’s 6(6-1)/2 = 15 IBGP peerings.
A BGP confederation divides our AS into sub-ASes to reduce the number of required
IBGP peerings. Within a sub-AS we still require full-mesh IBGP but between these
sub-ASes we use something that looks like EBGP but behaves like IBGP (called
confederation BGP). Here’s an example of what a BGP confederation could look like:
By dividing our main AS into two sub-ASes we reduced the number of IBGP peerings
from 15 to 8.
Within the sub-AS we still have the full-mesh IBGP requirement. Between sub-ASes it’s
just like EBGP, it’s up to you how many peerings you want. The outside world will never
see your sub-AS numbers, they will only see the main AS number.
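Here’s a rough Python sketch of the math. The split I use (two sub-ASes with 3 routers each and 2 peerings between the sub-ASes) is an assumption on my side that happens to give the 8 peerings mentioned above; your own design can use any split you like.

```python
def full_mesh_peerings(routers):
    """Number of IBGP peerings in a full mesh: N(N-1)/2."""
    return routers * (routers - 1) // 2


def confederation_peerings(sub_as_sizes, inter_sub_as_peerings):
    """Full mesh inside every sub-AS plus the peerings between sub-ASes."""
    inside = sum(full_mesh_peerings(size) for size in sub_as_sizes)
    return inside + inter_sub_as_peerings


print(full_mesh_peerings(6))              # one AS, full mesh
print(confederation_peerings([3, 3], 2))  # assumed split into two sub-ASes
```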
Since the sub-AS numbers are not seen outside of your network you will often see
private AS numbers used for the sub-ASes (64512 – 65534) but you can pick any
number you like.
You should now have an idea what BGP confederations are like, let’s look at the
configuration so I can add some more details. I’ll use the following topology:
Above we have AS 2 which is divided into two sub-ASes, AS 24 and AS 35. There’s
also AS 1 on top that we can use to see how the outside world sees our confederation.
Configuration
Just like any other IBGP configuration it’s best practice to use loopback interfaces for
the BGP sessions. For this reason I created a loopback interface on all routers within
AS 2 and I’ll use OSPF to advertise them.
OSPF Configuration
R2(config)#router ospf 1
R2(config-router)#network 192.168.23.0 0.0.0.255 area 0
R2(config-router)#network 192.168.24.0 0.0.0.255 area 0
R2(config-router)#network 2.2.2.2 0.0.0.0 area 0
R3(config)#router ospf 1
R3(config-router)#network 192.168.23.0 0.0.0.255 area 0
R3(config-router)#network 192.168.35.0 0.0.0.255 area 0
R3(config-router)#network 3.3.3.3 0.0.0.0 area 0
R4(config)#router ospf 1
R4(config-router)#network 192.168.24.0 0.0.0.255 area 0
R4(config-router)#network 192.168.45.0 0.0.0.255 area 0
R4(config-router)#network 4.4.4.4 0.0.0.0 area 0
R5(config)#router ospf 1
R5(config-router)#network 192.168.35.0 0.0.0.255 area 0
R5(config-router)#network 192.168.45.0 0.0.0.255 area 0
R5(config-router)#network 5.5.5.5 0.0.0.0 area 0
Now we can worry about the BGP confederation configuration. I’ll explain all the
different steps…
R2(config)#router bgp 24
R2(config-router)#bgp confederation identifier 2
R2(config-router)#bgp confederation peers 35
R2(config-router)#neighbor 4.4.4.4 remote-as 24
R2(config-router)#neighbor 4.4.4.4 update-source loopback 0
R2(config-router)#neighbor 3.3.3.3 remote-as 35
R2(config-router)#neighbor 3.3.3.3 update-source loopback 0
R2(config-router)#neighbor 3.3.3.3 ebgp-multihop 2
The configuration of R2 requires some explanation. First of all, when you start the BGP
process you have to use the AS number of the sub-AS. Secondly, you have to use
the bgp confederation identifier command to tell BGP what the main AS number is.
We also have to configure all other sub-AS numbers with the bgp confederation
peers command, in this case that’s only AS 35. R4 is in the same sub-AS so you can
configure this neighbor just like any other IBGP neighbor. R3 is a bit different though…
since it’s in another sub-AS we have to use the same rules as EBGP, that means
configuring multihop if you are using loopbacks.
Let’s take a look at R3:
R3(config)#router bgp 35
R3(config-router)#bgp confederation identifier 2
R3(config-router)#bgp confederation peers 24
R3(config-router)#neighbor 2.2.2.2 remote-as 24
R3(config-router)#neighbor 2.2.2.2 update-source loopback 0
R3(config-router)#neighbor 2.2.2.2 ebgp-multihop 2
R3(config-router)#neighbor 5.5.5.5 remote-as 35
R3(config-router)#neighbor 5.5.5.5 update-source loopback 0
R4(config)#router bgp 24
R4(config-router)#bgp confederation identifier 2
R4(config-router)#bgp confederation peers 35
R4(config-router)#neighbor 2.2.2.2 remote-as 24
R4(config-router)#neighbor 2.2.2.2 update-source loopback 0
R4(config-router)#neighbor 5.5.5.5 remote-as 35
R4(config-router)#neighbor 5.5.5.5 update-source loopback 0
R4(config-router)#neighbor 5.5.5.5 ebgp-multihop 2
R5(config)#router bgp 35
R5(config-router)#bgp confederation identifier 2
R5(config-router)#bgp confederation peers 24
R5(config-router)#neighbor 4.4.4.4 remote-as 24
R5(config-router)#neighbor 4.4.4.4 update-source loopback 0
R5(config-router)#neighbor 4.4.4.4 ebgp-multihop 2
R5(config-router)#neighbor 3.3.3.3 remote-as 35
R5(config-router)#neighbor 3.3.3.3 update-source loopback 0
That takes care of configuring the neighbors. The more interesting part is of course
using some show commands to see the differences with normal IBGP and EBGP. Let’s
get going…
Verification
To have something we can look at I will create a loopback interface on R5 and advertise
a network in BGP:
R5(config)#interface loopback 5
R5(config-if)#ip address 55.55.55.55 255.255.255.255
R5(config)#router bgp 35
R5(config-router)#network 55.55.55.55 mask 255.255.255.255
This entry looks pretty much the same as normal IBGP but there’s one important
difference…
R2#show ip bgp 55.55.55.55
BGP routing table entry for 55.55.55.55/32, version 2
Paths: (1 available, best #1, table Default-IP-Routing-Table)
Flag: 0x820
Advertised to update-groups:
2
(35)
5.5.5.5 (metric 3) from 3.3.3.3 (3.3.3.3)
Origin IGP, metric 0, localpref 100, valid, confed-external,
best
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 remote-as 2
R2(config)#router bgp 24
R2(config-router)#neighbor 192.168.12.1 remote-as 1
R1 only sees AS 2 so all the sub-AS magic remains within the BGP confederation.
Pretty neat right? Let’s try one more thing…I’ll advertise something on R1 so our
confederation can learn about it. I’ll create a loopback and advertise it in BGP:
R1(config)#interface loopback 1
R1(config-if)#ip address 11.11.11.11 255.255.255.255
R1(config)#router bgp 1
R1(config-router)#network 11.11.11.11 mask 255.255.255.255
There’s one more thing we have to do…since the next hop doesn’t change inside the
confederation, our routers will not know how to reach 192.168.12.1 (R1). I’ll fix this by
advertising the 192.168.12.0 /24 network in BGP:
R2(config)#router bgp 24
R2(config-router)#network 192.168.12.0 mask 255.255.255.0
This is just plain EBGP information, nothing special. Let’s look at R4 which is in the
same sub-AS:
R4 sees the route and recognizes it as “confed-internal”. Let’s check R3 which is a bit
more interesting:
Paths: (1 available, best #1, table Default-IP-Routing-Table)
Advertised to update-groups:
1
(24) 1
192.168.12.1 (metric 2) from 2.2.2.2 (2.2.2.2)
Origin IGP, metric 0, localpref 100, valid, confed-external,
best
R3 is in a different sub-AS than R2, which is why the route shows up as confed-external.
Something important to note is that the next hop IP address didn’t change. With regular
EBGP, a router changes the next hop of a route to its own IP address when it sends the
route to another EBGP router. Between the sub-ASes of a confederation the next hop is left alone.
The sub-AS number from R2 has been prepended, the AS path is now (24) 1.
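Here is a minimal Python sketch of what happens to the AS path at the two kinds of boundary we have seen so far. The function names and the tuple representation are my own assumptions for illustration, not how IOS stores things. The idea is only that sub-AS numbers live in a separate confederation part of the path (the part IOS prints in parentheses) and that this part is dropped before a route leaves the confederation.

```python
from typing import List, Tuple

# An AS path is modelled as (confed_part, regular_part).
# The confed part is what "show ip bgp" prints in parentheses, e.g. "(24) 1".
AsPath = Tuple[List[int], List[int]]


def send_to_confed_external(path: AsPath, my_sub_as: int) -> AsPath:
    """Peer sits in another sub-AS of the same confederation: prepend our sub-AS."""
    confed, regular = path
    return ([my_sub_as] + confed, regular)


def send_to_real_ebgp(path: AsPath, confed_id: int) -> AsPath:
    """Peer is outside the confederation: drop the sub-AS numbers and
    prepend the confederation identifier (the public AS number)."""
    _confed, regular = path
    return ([], [confed_id] + regular)


def show(path: AsPath) -> str:
    """Render the path roughly the way the BGP table displays it."""
    confed, regular = path
    parts = []
    if confed:
        parts.append("(" + " ".join(str(asn) for asn in confed) + ")")
    parts.extend(str(asn) for asn in regular)
    return " ".join(parts)


# R2 (sub-AS 24) learned 11.11.11.11/32 from AS 1 and passes it to R3.
print(show(send_to_confed_external(([], [1]), 24)))   # (24) 1
# R2 passes the route that came from sub-AS 35 on to R1 in AS 1.
print(show(send_to_real_ebgp(([35], []), 2)))         # 2
```

The second call shows why R1 only ever sees AS 2: everything in parentheses is thrown away at the edge of the confederation.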
If you have played with BGP and regular expressions before, see if you can create some that match
on the sub-AS values…nice exercise!
BGP Synchronization
This tutorial explains the BGP synchronization rule. To understand what this is all about,
make sure you understand why we need IBGP first. If you are a little fuzzy about IBGP,
BGP split horizon and why we need IBGP full mesh adjacencies then please read
my IBGP tutorial first. Having said that, let’s look at the synchronization rule.
BGP synchronization is an old rule from the days where we didn’t run IBGP on all
routers within a transit AS. In short, BGP will not advertise something that it learns
from an IBGP neighbor to an EBGP neighbor if the prefix can’t be validated in its IGP.
It’s best explained with an example, take a look below:
Above we see 5 routers and 3 autonomous systems. When we want to get from R1 to
R5 we’ll have to cross AS2, this makes AS2 our transit AS.
EBGP has been configured between R1/R2 and also between R4/R5. IBGP is
configured between R2/R4 and R3 on top doesn’t run BGP at all.
The routers within AS2 are configured with OSPF, this is required since R2/R4 have to
be able to reach each other to establish the IBGP session.
R1 will advertise a prefix in BGP, AS2 and AS3 will learn about this prefix…
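Before we configure anything, here is the synchronization rule from above as a tiny Python sketch. It is a simplification under my own assumptions: the function name is made up, and the IGP is modelled as a plain set of prefix strings with an exact match, which is cruder than what a real router checks.

```python
def may_use_ibgp_route(prefix: str, igp_prefixes: set, synchronization: bool) -> bool:
    """Decide whether a prefix learned from an IBGP neighbor may be used
    (and therefore advertised to EBGP neighbors).

    With synchronization disabled the route is always usable.
    With synchronization enabled the same prefix must also be known via the IGP.
    """
    if not synchronization:
        return True
    return prefix in igp_prefixes


# What R4 knows through OSPF in this topology (loopbacks and transit links only).
ospf_on_r4 = {"2.2.2.0/24", "3.3.3.0/24", "192.168.23.0/24"}

print(may_use_ibgp_route("1.1.1.0/24", ospf_on_r4, synchronization=False))  # True
print(may_use_ibgp_route("1.1.1.0/24", ospf_on_r4, synchronization=True))   # False

# After redistributing the prefix into OSPF the rule is satisfied.
ospf_on_r4.add("1.1.1.0/24")
print(may_use_ibgp_route("1.1.1.0/24", ospf_on_r4, synchronization=True))   # True
```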
OSPF Configuration
The OSPF configuration is really straight-forward. R2 and R4 have a loopback interface
that is used for the IBGP peering which is advertised in OSPF:
R2#
router ospf 1
network 2.2.2.0 0.0.0.255 area 0
network 192.168.23.0 0.0.0.255 area 0
R3#
router ospf 1
network 3.3.3.0 0.0.0.255 area 0
network 192.168.23.0 0.0.0.255 area 0
network 192.168.34.0 0.0.0.255 area 0
R4#
router ospf 1
network 4.4.4.0 0.0.0.255 area 0
network 192.168.34.0 0.0.0.255 area 0
BGP Configuration
The configuration of R1 is simple, it’s configured to run EBGP with R2 and it advertises
network 1.1.1.0 /24 into BGP:
R1#
router bgp 1
no synchronization
bgp log-neighbor-changes
network 1.1.1.0 mask 255.255.255.0
neighbor 192.168.12.2 remote-as 2
no auto-summary
R2#
router bgp 2
no synchronization
bgp log-neighbor-changes
neighbor 4.4.4.4 remote-as 2
neighbor 4.4.4.4 update-source Loopback0
neighbor 4.4.4.4 next-hop-self
neighbor 192.168.12.1 remote-as 1
no auto-summary
R4 is similar to R2:
R4#
router bgp 2
no synchronization
bgp log-neighbor-changes
neighbor 2.2.2.2 remote-as 2
neighbor 2.2.2.2 update-source Loopback0
neighbor 2.2.2.2 next-hop-self
neighbor 192.168.45.5 remote-as 3
no auto-summary
R4#show ip bgp
BGP table version is 10, local router ID is 4.4.4.4
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
R5#show ip bgp
BGP table version is 6, local router ID is 5.5.5.5
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
Great, R5 also knows about this network. The problem in this scenario however is that
we will never get any IP packets from AS3 to AS1 since R3 doesn’t run BGP…it will
never learn about network 1.1.1.0 /24 so whenever R4 forwards something, it will be
dropped. Take a look at R3 here:
The synchronization rule was created to prevent this problem. Let’s find out how it
works…
R2(config)#router bgp 2
R2(config-router)#synchronization
R4(config)#router bgp 2
R4(config-router)#synchronization
R4#show ip bgp
BGP table version is 11, local router ID is 4.4.4.4
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
R4 sees the network in its BGP table but refuses to install it.
This is because the synchronization rule states that the prefix has to be in its IGP before
it can advertise it to an EBGP neighbor. Since R4 can’t install it, R5 will never learn
about it:
R5#show ip bgp
So to fix this, you can either disable synchronization OR redistribute the prefix into your
IGP (OSPF in our case). Let’s do that:
R2(config)#router ospf 1
R2(config-router)#redistribute bgp 2 route-map PREFIX subnets
I am using a route-map to redistribute only this particular prefix. You don’t have to, but
you don’t want to accidentally redistribute your entire BGP table into OSPF.
According to the BGP synchronization rule, we are now allowed to advertise it to our
EBGP neighbor. Take a look at R4:
R4#show ip bgp
BGP table version is 13, local router ID is 4.4.4.4
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
R4 selected this one as the best one. The “r” in front is there because this router will
install the OSPF (AD 110) entry for 1.1.1.0 /24 instead of the IBGP (AD 200) route.
What about R5?
R5#show ip bgp
BGP table version is 8, local router ID is 5.5.5.5
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
When your router learns about a prefix through EBGP and an IGP (RIP, OSPF or
EIGRP) then it will always prefer the external BGP route. EBGP uses an administrative
distance of 20 so it’s preferred over OSPF (110), RIP (120) or EIGRP (90).
Above you see 3 routers, R1,R2 and R3. Imagine R1 and R2 are two sites from a
customer and R3 is the ISP router.
R1 and R2 have a fast “backdoor” link and OSPF is configured to exchange some
prefixes between the two sites. To illustrate this I have added a loopback interface on
these two routers.
R1 and R2 are also configured to use EBGP with R3, they advertise the same prefixes
as they do in OSPF. This introduces a problem:
Above you see that R1 learns about the 2.2.2.2 /32 prefix through BGP (R3) and OSPF
(R2). Since EBGP has a lower (thus better) AD it will install this path in its routing table.
The same thing applies to R2 for the 1.1.1.1 /32 prefix.
Let’s take a look at this scenario on our routers, I’ll configure OSPF and BGP and you
will learn how to fix this problem.
OSPF Configuration
First we’ll configure R1 and R2 to run OSPF. I’ll advertise their loopback interfaces:
R1(config)#router ospf 1
R1(config-router)#network 192.168.12.0 0.0.0.255 area 0
R1(config-router)#network 1.1.1.1 0.0.0.0 area 0
R2(config)#router ospf 1
R2(config-router)#network 192.168.12.0 0.0.0.255 area 0
R2(config-router)#network 2.2.2.2 0.0.0.0 area 0
Nothing special here, just a basic OSPF configuration. Here’s what the routing table of
R1 and R2 looks like now:
They learned about each other’s prefixes, great! Our next move is configuring BGP…
BGP Configuration
R1 and R2 will both peer with R3 and I’ll advertise their loopback interfaces in BGP:
R3(config-router)#neighbor 192.168.23.2 remote-as 2
Just a plain and simple BGP configuration. Now look again at the routing table of R1
and R2:
R1 and R2 will now use R3 to reach each other’s loopback interfaces. This happens
because the AD of EBGP is 20 while OSPF has an AD of 110. As a result, the OSPF route
is removed from the routing table. So how do we fix this? You could change the
administrative distance manually but this tutorial is about the “backdoor” feature so let’s
see how it works.
You use the network command but add the backdoor keyword at the end.
Verification
Let’s see what changed:
Great! Our routers now prefer the OSPF routes again. The prefixes are still in BGP as
you can see here:
R1#show ip bgp
BGP table version is 7, local router ID is 1.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale, m multipath, b backup-path, x
best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
This is a good thing. When the backdoor link fails we can still use the information from
BGP, let’s simulate that:
Shutting the interface will cause the OSPF adjacency to drop. Here’s what the routing
tables look like when that happens:
Excellent, we now have our BGP information in the routing table. This output also
reveals how the backdoor command really works…if you look closely you can see that it
changed the AD from 20 to 200.
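To make the effect of the backdoor keyword concrete, here is a small Python sketch of the administrative distance comparison. The table only holds the default distances mentioned in this lesson, and the function name and the way I model the backdoor flag are assumptions for illustration.

```python
# Default administrative distances mentioned in this lesson.
AD = {"ebgp": 20, "eigrp": 90, "ospf": 110, "rip": 120, "ibgp": 200}


def best_source(sources, backdoor=False):
    """Return the routing source with the lowest administrative distance.

    sources:  names of the protocols that currently offer the prefix.
    backdoor: when True, the EBGP route is treated with AD 200 instead of 20.
    Returns None when nobody offers the prefix.
    """
    def distance(source):
        if source == "ebgp" and backdoor:
            return 200
        return AD[source]

    return min(sources, key=distance, default=None)


print(best_source(["ebgp", "ospf"]))                 # ebgp  (20 beats 110)
print(best_source(["ebgp", "ospf"], backdoor=True))  # ospf  (110 beats 200)
print(best_source(["ebgp"], backdoor=True))          # ebgp  (backdoor link is down)
```

The last line is the failover case: with OSPF gone, the BGP route is the only candidate left, so it is installed even with the worse distance.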
The normal version of BGP (Border Gateway Protocol) only supported IPv4 unicast
prefixes. Nowadays we use MP-BGP (Multiprotocol BGP) which supports different
addresses:
IPv4 unicast
IPv4 multicast
IPv6 unicast
IPv6 multicast
MP-BGP is also used for MPLS VPN where we use MP-BGP to exchange the VPN
labels. For each different “address” type, MP-BGP uses a different address family.
To allow these new addresses, MP-BGP has some new features that the old BGP doesn’t
have:
Configuration
MP-BGP with IPv6 adjacency & IPv6 prefixes
Let’s start with a simple example where we use IPv6 for the neighbor adjacency and
exchange some IPv6 prefixes. Here’s the topology I will use:
Here’s the configuration of R1:
R1(config)#router bgp 1
R1(config-router)#neighbor 2001:db8:0:12::2 remote-as 2
R1(config-router)#address-family ipv4
R1(config-router-af)#no neighbor 2001:db8:0:12::2 activate
R1(config-router-af)#exit
R1(config-router)#address-family ipv6
R1(config-router-af)#neighbor 2001:db8:0:12::2 activate
R1(config-router-af)#network 2001:db8::1/128
In the configuration above we first specify the remote neighbor. The address-family
command is used to change the IPv4 or IPv6 settings. I disabled the IPv4 address-family
and enabled IPv6. Last but not least, I advertised the prefix on the loopback interface.
The configuration of R2 looks similar:
R2(config)#router bgp 2
R2(config-router)#neighbor 2001:db8:0:12::1 remote-as 1
R2(config-router)#address-family ipv4
R2(config-router-af)#no neighbor 2001:db8:0:12::1 activate
R2(config-router-af)#exit
R2(config-router)#address-family ipv6
R2(config-router-af)#neighbor 2001:db8:0:12::1 activate
R2(config-router-af)#network 2001:db8::2/128
R1#
%BGP-5-ADJCHANGE: neighbor 2001:DB8:0:123::2 Up
R1#show ipv6 route bgp
IPv6 Routing Table - default - 7 entries
Codes: C - Connected, L - Local, S - Static, U - Per-user Static
route
B - BGP, HA - Home Agent, MR - Mobile Router, R - RIP
I1 - ISIS L1, I2 - ISIS L2, IA - ISIS interarea, IS - ISIS
summary
D - EIGRP, EX - EIGRP external, NM - NEMO, ND - Neighbor
Discovery
l - LISP
O - OSPF Intra, OI - OSPF Inter, OE1 - OSPF ext 1, OE2 -
OSPF ext 2
ON1 - OSPF NSSA ext 1, ON2 - OSPF NSSA ext 2
B 2001:DB8::2/128 [20/0]
via FE80::217:5AFF:FEED:7AF0, FastEthernet0/0
R2#show ipv6 route bgp
IPv6 Routing Table - default - 7 entries
Codes: C - Connected, L - Local, S - Static, U - Per-user Static
route
B - BGP, HA - Home Agent, MR - Mobile Router, R - RIP
I1 - ISIS L1, I2 - ISIS L2, IA - ISIS interarea, IS - ISIS
summary
D - EIGRP, EX - EIGRP external, NM - NEMO, ND - Neighbor
Discovery
l - LISP
O - OSPF Intra, OI - OSPF Inter, OE1 - OSPF ext 1, OE2 -
OSPF ext 2
ON1 - OSPF NSSA ext 1, ON2 - OSPF NSSA ext 2
B 2001:DB8::1/128 [20/0]
via FE80::21D:A1FF:FE8B:36D0, FastEthernet0/0
The routers learned each other’s prefixes…great! This example was pretty
straightforward but you have now learned how MP-BGP uses different address families.
Here’s the configuration:
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 remote-as 2
R2(config)#router bgp 2
R2(config-router)#neighbor 192.168.12.1 remote-as 1
R1(config)#router bgp 1
R1(config-router)#address-family ipv6
R1(config-router-af)#network 2001:db8::1/128
R1(config-router-af)#neighbor 192.168.12.2 activate
R2(config)#router bgp 2
R2(config-router)#address-family ipv6
R2(config-router-af)#network 2001:db8::2/128
R2(config-router-af)#neighbor 192.168.12.1 activate
Once we enter the address-family IPv6 configuration there are two things we have to
configure. The prefix has to be advertised and we need to specify the neighbor. The
prefixes on the loopback interface should now be advertised. Let’s check it out:
*> 2001:DB8::1/128 :: 0 32768 i
* 2001:DB8::2/128 ::FFFF:192.168.12.2
0 0 2 i
R2#show ip bgp ipv6 unicast
BGP table version is 2, local router ID is 192.168.12.2
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale, m multipath, b backup-path, x
best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
As you can see, the routers have learned about each other’s prefixes. There’s one
problem though…we were able to exchange IPv6 prefixes but we only use IPv4
between R1 and R2, there is no valid next hop address that we can use.
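The next hop ::FFFF:192.168.12.2 in the output above is an IPv4-mapped IPv6 address: the IPv4 peering address wrapped in IPv6 notation. Here is a short Python sketch, using only the standard ipaddress module, that pulls the IPv4 address back out. The helper name is my own. It shows that nothing behind this next hop is a real IPv6 address we could route to.

```python
import ipaddress


def embedded_ipv4(next_hop: str):
    """Return the IPv4 address hidden inside an IPv4-mapped IPv6 address,
    or None when the address is a regular IPv6 address."""
    return ipaddress.IPv6Address(next_hop).ipv4_mapped


print(embedded_ipv4("::FFFF:192.168.12.2"))   # 192.168.12.2
print(embedded_ipv4("2001:db8:0:12::2"))      # None
```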
To fix this, we need to use some IPv6 addresses that we can use as the next hop. We’ll
have to configure a prefix between R1 and R2 for this:
Now we have IPv6 addresses that we can use as the next hop. We are using IPv4 for
the neighbor peering so the next hop doesn’t change automatically. We’ll have to use a
route-map for this:
R2(config-router-af)#neighbor 192.168.12.1 route-map IPV6_NEXT_HOP in
Both routers will now advertise their IPv6 address as the next hop for all prefixes that
are advertised. Let’s reset BGP:
R1#clear ip bgp *
The next hop IPv6 addresses are now reachable so they can be installed in the routing
table. The downside of this solution is that we had to fix the next hop ourselves, the
advantage however is that we have a single BGP neighbor adjacency that can be used
for the exchange of IPv4 and IPv6 prefixes.
Just like IP addresses, ASNs (Autonomous System Numbers) have to be unique on the
Internet. The main reason for this is that BGP uses the AS number for its loop
prevention mechanism. When BGP learns about a route that has its own AS number in
its path then it will be discarded.
Here’s an example:
Above we have three routers, R1 and R3 are using the same AS number. Once R1
sends an update, R2 will accept it but R3 will not since the AS number is the same.
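The loop prevention check itself is very simple. Here it is as a Python sketch. The AS numbers in the example are hypothetical (I use AS 1 for R1 and R3 and AS 2 for R2), and the function name is made up.

```python
def accept_update(local_as: int, as_path: list) -> bool:
    """BGP loop prevention: refuse any update whose AS path
    already contains our own AS number."""
    return local_as not in as_path


# R1 (AS 1) sends an update to R2 (AS 2): the path is [1].
print(accept_update(local_as=2, as_path=[1]))      # True, R2 accepts it

# R2 forwards it to R3, which also uses AS 1: the path is now [2, 1].
print(accept_update(local_as=1, as_path=[2, 1]))   # False, R3 discards it
```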
To prevent the above from happening, IANA is in control of the AS numbers (similar to
public IP addresses). If you want an AS number for the Internet then you’ll have to
request one. They started with 16-bit AS numbers (also called 2-octet AS numbers)
that were assigned like this:
0: reserved.
1 – 64495: public AS numbers.
64496 – 64511: reserved to use in documentation.
64512 – 65534: private AS numbers.
65535: reserved.
The 1 – 64495 public AS range is pretty small, so we have the same problem as with
public IPv4 addresses: there aren’t enough numbers. Right now (May 2015) there are only
199 AS numbers left that could be assigned. You can see the current status of available
AS numbers here.
To get more AS numbers, an extension has been created that supports 32-bit AS
numbers (also called 4-octet AS numbers). This means we have about 4.294.967.296
AS numbers that we can use.
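If you want to check quickly which block an AS number falls into, the 16-bit ranges listed above translate directly into a few comparisons. This is only a sketch: the function name and the labels are my own, and I don’t subdivide the 32-bit space any further here.

```python
def classify_asn(asn: int) -> str:
    """Classify an AS number using the 16-bit ranges from this lesson."""
    if not 0 <= asn <= 4294967295:
        raise ValueError("not a valid 32-bit AS number")
    if asn == 0 or asn == 65535:
        return "reserved"
    if 64496 <= asn <= 64511:
        return "documentation"
    if 64512 <= asn <= 65534:
        return "private"
    if asn <= 64495:
        return "public"
    # Everything from 65536 upward only exists as a 32-bit AS number.
    return "32-bit"


for asn in (0, 2, 64500, 64512, 65534, 65535, 65536):
    print(asn, classify_asn(asn))
```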
When you request an AS number you’ll have to justify why you need a public AS
number. For some organizations, using a private AS number should also be a solution.
Private AS numbers can be used when you are connected to a single AS that uses a
public AS number. Here’s an example:
R1 is behind R2 and using private AS number 64512. R2 is using public AS number AS
2. In the BGP table of R2 we will find the AS number of R1 but once it advertises
something to AS3, it will remove the private AS number.
Another example where we can use private AS numbers are BGP confederations. Within
the confederation we can use private AS numbers, to the outside world we use a public
AS number.
Removing the private AS numbers is a bit similar to NAT where we hide private IP
addresses behind one or more public IP addresses.
Private range AS numbers (64512 – 65534) should not be used on the Internet since
they are not unique like public AS numbers.
Sometimes, private AS numbers are used for customer networks that are behind
a single ISP. The advantage of doing this is that we will save some public AS numbers,
the disadvantage is that if you ever plan to connect to another ISP, you should switch to
a public AS number.
When the ISP forwards prefixes that it learns from the private AS, it will remove the
private AS number before it forwards the prefix to other autonomous systems.
Cisco IOS routers support the remove-private-as command to achieve this. There are
some restrictions however:
You can only use this for eBGP neighbors.
The private AS numbers are removed from outbound updates.
You can only have private AS numbers in the AS path, if you have a mix of public and private AS
numbers then the router won’t remove anything (there’s a solution for this though that I will
demonstrate).
If the AS path contains the AS number of the eBGP neighbor then it won’t be removed.
If there are confederations, BGP only removes private AS numbers after the confederation part
in the AS path.
Let’s take a look at the configuration!
Configuration
I will use the following 3 routers for this:
R1 is in a private AS while R2 and R3 use public AS numbers. We’ll advertise the
loopback interface on R1 in eBGP so that R2 and R3 can learn it. Here’s the BGP
configuration of these routers:
Remove-Private-AS
Let’s take a look at R2 and R3, they should have learned about 1.1.1.1/32:
R2#show ip bgp
BGP table version is 2, local router ID is 192.168.23.2
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale, m multipath, b backup-path, x
best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale, m multipath, b backup-path, x
best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
In the AS path we see AS 2 and 64512, this is as expected. Now let’s configure R2 to
remove the private AS number:
R2(config)#router bgp 2
R2(config-router)#neighbor 192.168.23.3 remove-private-as
We use the remove-private-as command for this. Let’s clear BGP to speed things up:
R2#clear ip bgp *
R3#show ip bgp
BGP table version is 5, local router ID is 192.168.23.3
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale, m multipath, b backup-path, x
best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
It’s only showing AS 2 in the AS path now, the private AS number has been removed.
That’s easy enough, there are a few other things we can try however…
Remove-Private-AS All
Removing the private AS number(s) will only work if there are no public AS numbers in
the AS path. To demonstrate this I will add extra AS numbers on the update from R1:
R1(config-route-map)#set as-path prepend 1 64513 11 64514 111
I used a mix of public and private AS numbers. Let’s add these to the updates to R2:
Let’s reset R2 to speed things up and check the BGP table of R2 and R3:
R2#clear ip bgp *
R2#show ip bgp
BGP table version is 2, local router ID is 192.168.23.2
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale, m multipath, b backup-path, x
best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
As you can see above, the AS path didn’t change. No private AS numbers have been
removed because there are some public AS numbers in the AS path. Cisco IOS sees
this as a misconfiguration so it won’t do anything. We can change this behavior though:
R2(config)#router bgp 2
R2(config-router)#neighbor 192.168.23.3 remove-private-as all
R2#clear ip bgp *
IOS 15.1T and later support the all parameter. This will remove all private AS numbers,
no matter what else there is in the AS path. Let’s take another look at R3:
R3#show ip bgp
BGP table version is 11, local router ID is 192.168.23.3
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale, m multipath, b backup-path, x
best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
All private AS numbers are now gone from the BGP table, only public AS numbers
remain.
R2(config)#router bgp 2
R2(config-router)#neighbor 192.168.23.3 remove-private-as all replace-as
R2#clear ip bgp *
Add the replace-as parameter behind the remove-private-as all command and that’s it.
Here’s what the BGP table of R3 looks like now:
R3#show ip bgp
BGP table version is 12, local router ID is 192.168.23.3
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale, m multipath, b backup-path, x
best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
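To compare the three variants of the command side by side, here is a simplified Python sketch of what R2 sends to R3 in each case. It is a model under my own assumptions, not IOS code: the function name is made up, the received path is assumed to be R1’s own AS followed by the prepended numbers, and the restrictions about the neighbor’s AS number and confederations are left out. For replace-as I assume the documented behavior that each private AS number is replaced by the local AS number, so the path keeps its length.

```python
PRIVATE_16BIT = range(64512, 65535)   # 64512 up to and including 65534


def is_private(asn: int) -> bool:
    return asn in PRIVATE_16BIT


def advertised_path(received_path, local_as, mode="default"):
    """AS path as the EBGP neighbor will see it (local AS already prepended).

    mode "default":        strip private AS numbers only if the path has nothing else
    mode "all":            strip every private AS number
    mode "all replace-as": replace every private AS number with the local AS
    """
    if mode == "default":
        only_private = all(is_private(asn) for asn in received_path)
        cleaned = [] if only_private else list(received_path)
    elif mode == "all":
        cleaned = [asn for asn in received_path if not is_private(asn)]
    elif mode == "all replace-as":
        cleaned = [local_as if is_private(asn) else asn for asn in received_path]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return [local_as] + cleaned


# Assumed path on R2 after R1 prepends "1 64513 11 64514 111".
mixed = [64512, 1, 64513, 11, 64514, 111]

print(advertised_path([64512], 2))                 # [2]
print(advertised_path(mixed, 2))                   # [2, 64512, 1, 64513, 11, 64514, 111]
print(advertised_path(mixed, 2, "all"))            # [2, 1, 11, 111]
print(advertised_path(mixed, 2, "all replace-as")) # [2, 2, 1, 2, 11, 2, 111]
```

The practical difference between the last two is the AS path length: with replace-as the path stays as long as it was, which matters when other routers compare paths by length.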
Before January 2009, we only had 2-byte AS numbers in the range of 1-65535. 1023 of
those (64512-65534) are reserved for private AS numbers.
Similar to IPv4, we started running out of AS numbers so IANA increased the AS
numbers by introducing 4-byte AS numbers in the range of 65536 to 4294967295.
There are three ways to write down these new 4-byte AS numbers:
Asplain
Asdot
Asdot+
Asplain is the most simple to understand, these are just regular decimal numbers.
For example, AS number 545435, 4294937295, 4254967294, 2294967295, etc. These
numbers are simple to understand but prone to errors. It’s easy to make a
configuration mistake or misread a number in the BGP table.
Asdot represents AS numbers less than 65536 using the asplain notation and AS
numbers above 65535 with the asdot+ notation.
Asdot+ breaks the AS number in two 16-bit parts, a high-order value, and a low-
order value, separated by a dot. All older AS numbers can fit in the second part
where the first part is set to 0. For example:
AS 6541 becomes 0.6541
AS 54233 becomes 0.54233
AS 544 becomes 0.544
For AS numbers above 65535, we use the next high order bit value and start counting
again at 0. For example:
If you want to convert an asplain AS number to an asdot+ AS number, you take the
asplain number and see how many times you can divide it by 65536. This is the integer
that we use for the high order bit value.
Then, you take the asplain number and deduct (65536 * the integer) to get your low
order bit value. In other words, this is the formula:
#AS 5434995
5434995 / 65536 = 82
5434995 - (82 * 65536) = 61043
asdot = 82.61043
#AS 1499547
1499547 / 65536 = 22
1499547 - (22 * 65536) = 57755
asdot = 22.57755
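The same arithmetic fits in a few lines of Python. This is a sketch with function names I made up. divmod() gives us the integer part of the division and the remainder in one step, which are exactly the high-order and low-order values.

```python
def asplain_to_asdot_plus(asn: int) -> str:
    """Always write the AS number as high.low (asdot+)."""
    if not 0 <= asn <= 4294967295:
        raise ValueError("not a valid 32-bit AS number")
    high, low = divmod(asn, 65536)
    return f"{high}.{low}"


def asplain_to_asdot(asn: int) -> str:
    """asdot: plain decimal below 65536, high.low from 65536 upward."""
    if asn < 65536:
        if asn < 0:
            raise ValueError("not a valid AS number")
        return str(asn)
    return asplain_to_asdot_plus(asn)


def asdot_to_asplain(text: str) -> int:
    """Convert '22.57755' (or a plain '6541') back to a decimal AS number."""
    if "." not in text:
        return int(text)
    high, low = (int(part) for part in text.split("."))
    return high * 65536 + low


print(asplain_to_asdot_plus(5434995))   # 82.61043
print(asplain_to_asdot_plus(1499547))   # 22.57755
print(asplain_to_asdot_plus(6541))      # 0.6541
print(asplain_to_asdot(6541))           # 6541
print(asdot_to_asplain("22.57755"))     # 1499547
```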
Once you understand how the conversion is done, you can use the APNIC asplain to
asdot calculator to convert this automatically and make your life a bit easier.
BGP speakers that support 4-byte AS numbers advertise this via BGP capability
negotiation and there is backward compatibility. When a “new” router talks to an “old”
router (one that only supports 2-byte AS numbers), it can use a reserved AS number
(23456) called AS_TRANS instead of its 4-byte AS number. I’ll show you how this works
in the configuration.
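Here is the decision in a few lines of Python, as a rough sketch. The function name is hypothetical and all I model is which AS number the neighbor ends up seeing for us, nothing about how the real 4-byte path is carried along.

```python
AS_TRANS = 23456   # reserved 2-byte placeholder for 4-byte AS numbers


def as_number_shown_to_peer(local_as: int, peer_supports_4byte: bool) -> int:
    """AS number the neighbor uses for us (what it configures as remote-as)."""
    if peer_supports_4byte or local_as <= 65535:
        return local_as
    # Our AS number does not fit in 2 bytes and the peer cannot handle 4 bytes.
    return AS_TRANS


print(as_number_shown_to_peer(22222222, peer_supports_4byte=True))    # 22222222
print(as_number_shown_to_peer(22222222, peer_supports_4byte=False))   # 23456
print(as_number_shown_to_peer(1, peer_supports_4byte=False))          # 1
```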
Configuration
Cisco routers support the asplain and asdot representations. The configuration is
pretty straightforward and I’ll show you two scenarios:
We have two routers:
Both routers support 4-byte AS numbers. You can see this when you configure the AS
number:
R1(config)#router bgp ?
<1-4294967295> Autonomous system number
<1.0-XX.YY> Autonomous system number
As you can see, this IOS router supports asplain and asdot numbers. Let’s pick asplain
and establish a BGP neighbor adjacency:
You can see the asplain AS numbers in all bgp show commands:
If you want, you can change the representation to the asdot format:
R1(config-router)#bgp asnotation ?
dot asdot notation
You will now see the asdot format in all show commands:
2-byte AS support
Let’s use two routers. R1 only supports 2-byte AS numbers, R2 supports 4-byte AS
numbers:
R1 has no clue what an AS number above 65535 is:
R1(config)#router bgp ?
<1-65535> Autonomous system number
What value do we enter for R2 here? R2 is in AS 22222222 but we can’t configure that
number. We need to use the reserved AS number (AS_TRANS) here:
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 remote-as 23456
Here’s a capture of the neighbor adjacency. You can see that R2 adds the 4-byte AS
number capability in its OPEN message:
BGP 4-byte AS number capability
Once the neighbor adjacency is established, R2 uses AS 23456 when talking with R1:
192.168.12.1 4 1 2 2 1 0 0 00:00:25
0
Conclusion
You have now learned about BGP 4-byte AS numbers:
When we change the BGP routing policy (changing the attributes or adding filters) we
need to reset the BGP session before the new policy takes effect. This is no problem in
a lab but it’s something you don’t want to do in a production network. There are three
methods you can use to refresh your BGP policies:
Hard reset
Dynamic Soft Reset (route refresh)
Soft reset with pre-stored information
The hard reset is the most simple method (clear ip bgp command). It kills the TCP
session with your BGP neighbor which forces it to restart and as a result you’ll receive
all prefixes from your neighbor again. It works, but it’s cruel…
Dynamic soft reset is the most preferred method, it requires the route refresh capability.
Simply said, this feature lets your router request its BGP neighbor to send its prefixes
again.
Routers that don’t support the route refresh capability will have to use the soft
reset option. That’s what this tutorial is about. You can read about dynamic soft reset /
route refresh in my other tutorial.
Normally I talk about “prefixes” or “routes” but technically the information that BGP exchanges in
update messages is called NLRI (Network Layer Reachability Information). The NLRI field
contains the prefixes and length.
The soft reset option uses “pre-stored” information. Basically when we receive prefixes
from a BGP neighbor we will store this information in a new table and we don’t make
any changes to it. Our router will then apply its inbound BGP policy to this table and
store the end result as the BGP table.
Since you are now storing another table for each neighbor instead of one BGP table you
will have some overhead, your router will require more memory. This is especially true
when you enable soft reset for all your BGP neighbors…keep this in mind before you
configure this.
The tables that I’m talking about have some special names, let me show you a picture
and explain this a bit more:
On the left side we see a table called adj-RIB-in. This is the unedited routing information
from a BGP neighbor. There’s a separate table for each BGP neighbor that you peer
with. We apply our inbound BGP policy to this information and the result is a table called
the loc-RIB, this is the actual BGP table.
BGP will select the best path from the BGP table and the router will install this in the
routing table. Also, the best paths can be advertised to other BGP neighbors. We can
apply an outbound BGP policy to outbound updates and when this is done we have a
table called adj-RIB-out (per neighbor). The adj-RIB-in table is actually stored in
memory for each neighbor, the adj-RIB-out table is not.
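The relationship between adj-RIB-in and the BGP table can be sketched in Python like this. It’s a toy model under my own assumptions: tables are plain dictionaries, a policy is a function that returns the modified attributes or None to filter the prefix, and all names are hypothetical.

```python
from typing import Callable, Dict, Optional

Route = Dict[str, int]                              # e.g. {"metric": 0}
Policy = Callable[[str, Route], Optional[Route]]    # None means "filtered"


def build_loc_rib(adj_rib_in: Dict[str, Route], policy: Policy) -> Dict[str, Route]:
    """Apply the inbound policy to a copy of every stored route.
    The adj-RIB-in itself is never modified."""
    loc_rib = {}
    for prefix, attrs in adj_rib_in.items():
        result = policy(prefix, dict(attrs))   # work on a copy
        if result is not None:
            loc_rib[prefix] = result
    return loc_rib


# Unmodified routes as received from the neighbor.
adj_rib_in = {
    "1.1.1.1/32": {"metric": 0},
    "11.11.11.11/32": {"metric": 0},
}


def local_pref_200(prefix: str, attrs: Route) -> Optional[Route]:
    attrs["local_pref"] = 200
    return attrs


def deny_11(prefix: str, attrs: Route) -> Optional[Route]:
    return None if prefix == "11.11.11.11/32" else attrs


# A "soft reset" is just running the (new) policy over the stored table again.
print(build_loc_rib(adj_rib_in, local_pref_200))
print(build_loc_rib(adj_rib_in, deny_11))
print(adj_rib_in)   # still the raw information from the neighbor
```

Because the raw table is kept around, changing the policy never requires the neighbor to send anything again. The price is the extra memory for that second table.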
Now you have an idea about the different tables and how soft reconfiguration works,
let’s take a look at this on some BGP routers.
Configuration
To demonstrate the soft reset we only need two routers. R1 has two loopback interfaces
so that we have a couple of networks to advertise:
First we will configure BGP between the two routers:
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 remote-as 2
R1(config-router)#network 1.1.1.1 mask 255.255.255.255
R1(config-router)#network 11.11.11.11 mask 255.255.255.255
R2(config)#router bgp 2
R2(config-router)#neighbor 192.168.12.1 remote-as 1
Nothing special here, we run EBGP and R1 advertises its two loopback interfaces. By
default the soft reset option is disabled, let’s configure it on R2:
R2(config)#router bgp 2
R2(config-router)#neighbor 192.168.12.1 soft-reconfiguration inbound
This will set the local preference to 200 for all incoming prefixes from R1. Instead of
clearing the TCP session, we’ll do a soft reset:
Use the soft in parameter to do a soft reset. Now look at the BGP table first:
R2#show ip bgp
BGP table version is 3, local router ID is 192.168.12.2
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale, m multipath, b backup-path, x
best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
The BGP table (loc-RIB) was modified as expected, now take a look at the adj-RIB-in
table:
Above you see the raw routing information from R1 before we applied the inbound BGP
policy. You can see that no changes were made to the local preference of my prefixes.
R2(config-router)#neighbor 192.168.12.1 distribute-list 1 in
I’ll use a distribute-list so that 11.11.11.11 /32 is not allowed anymore. Before I do
another soft reset I’ll enable a debug, this allows you to see what the router is doing with
the BGP updates:
R2#
BGP(0): start inbound soft reconfiguration for
BGP(0): process 1.1.1.1/32, next hop 192.168.12.1, metric 0 from
192.168.12.1
BGP(0): process 11.11.11.11/32, next hop 192.168.12.1, metric 0
from 192.168.12.1
BGP(0): Prefix 11.11.11.11/32 rejected by inbound
distribute/prefix-list.
BGP(0): update denied
BGP(0): complete inbound soft reconfiguration, ran for 0ms
The router starts the soft reconfiguration, rejects the 11.11.11.11 /32 prefix and
completes the soft reconfiguration. Take a look at the BGP table:
R2#show ip bgp
BGP table version is 4, local router ID is 192.168.12.2
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale, m multipath, b backup-path, x
best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
As expected it’s gone but you will still find it in the adj-RIB-in table:
R2#show ip bgp neighbors 192.168.12.1 received-routes
BGP table version is 4, local router ID is 192.168.12.2
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale, m multipath, b backup-path, x
best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
Those are two good examples that show the difference between the adj-RIB-in and Loc-RIB
tables. Of course we can also view the adj-RIB-out table, I’ll show you an example
of R1:
A long time ago there was no method to dynamically request a re-advertisement of the
prefixes of one of your BGP neighbors. When you change your policy, somehow you
have to compare all the prefixes from your BGP neighbor against your new policy.
additional memory since you are saving an additional table for each BGP neighbor.
Since 2000 we also have the route refresh capability, simply said…your router will ask
its BGP neighbor to re-send its prefixes.
Here are the 3 options that we have to refresh our BGP table when our policy changes:
Hard reset
Soft reconfiguration
Route refresh capability
The hard reset is the simplest method (clear ip bgp command). It tears down the TCP
session with your BGP neighbor, which forces the session to restart, and as a result you’ll
receive all prefixes from your neighbor again. It works but it interrupts traffic, so it’s not a
good idea.
The soft reconfiguration will store everything that you receive from a BGP neighbor in
a separate table before applying the policy. I explain this in my soft reconfiguration
tutorial. This works but it’s not very efficient. Your router will store an entire table for
each BGP neighbor with the unmodified prefixes, you’ll need extra memory.
Route refresh capability is the preferred method. When you change your BGP
policy, your router sends a message to its BGP neighbor and the neighbor re-sends all its
prefixes. The BGP session stays up, so there is no disruption at all.
In this tutorial we’ll look at the route refresh capability, it’s described in RFC 2918 and
supported on most routers.
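To make the difference between the stored table and the BGP table concrete, here is a minimal Python sketch of what soft reconfiguration does conceptually: the unmodified routes from a neighbor are kept, and the inbound policy is re-run over them whenever it changes. The names (`Route`, `apply_inbound_policy`, the two policy functions) are made up for illustration and are not IOS internals.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    local_pref: int = 100


# An inbound policy returns a (possibly modified) route, or None to reject it.
Policy = Callable[[Route], Optional[Route]]


def apply_inbound_policy(adj_rib_in: list[Route], policy: Policy) -> dict[str, Route]:
    """Rebuild the accepted routes for one neighbor from its stored adj-RIB-in."""
    accepted: dict[str, Route] = {}
    for route in adj_rib_in:
        result = policy(route)
        if result is not None:
            accepted[result.prefix] = result
    return accepted


# Unmodified routes as received from R1 (what soft reconfiguration stores).
adj_rib_in = [
    Route("1.1.1.1/32", "192.168.12.1"),
    Route("11.11.11.11/32", "192.168.12.1"),
]


def set_local_pref_200(route: Route) -> Optional[Route]:
    # Models an inbound route-map that sets local preference to 200.
    return replace(route, local_pref=200)


def deny_11(route: Route) -> Optional[Route]:
    # Models an inbound filter that rejects 11.11.11.11/32.
    return None if route.prefix == "11.11.11.11/32" else route


loc_rib = apply_inbound_policy(adj_rib_in, set_local_pref_200)
filtered = apply_inbound_policy(adj_rib_in, deny_11)
```

The stored list is never modified, which is why the rejected prefix and the original local preference remain visible in the received-routes view. With route refresh, the router would not keep `adj_rib_in` at all and would ask the neighbor to send it again instead.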
Configuration
I will use two routers for this, R1 and R2. I have added two loopback interfaces on R1
so that we have something to advertise:
R1(config)#router bgp 1
R1(config-router)#neighbor 192.168.12.2 remote-as 2
R1(config-router)#network 1.1.1.1 mask 255.255.255.255
R1(config-router)#network 11.11.11.11 mask 255.255.255.255
R2(config)#router bgp 2
R2(config-router)#neighbor 192.168.12.1 remote-as 1
Route refresh is enabled by default, you can verify this by using the following show
command:
This router can do a route refresh for inbound prefixes (what you learn from your BGP
neighbor) or outbound prefixes (what you send to them). On my IOS 15.x router you
see “(new)” which means this router supports the RFC 2918 version of route refresh.
Some older IOS versions might show “(old & new)” which means they also support a
version of route refresh that Cisco implemented before the RFC was created.
R2#show ip bgp
BGP table version is 3, local router ID is 192.168.12.2
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale, m multipath, b backup-path, x
best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
That’s looking good. Now I will create a route-map that changes one of the BGP
attributes. This means the router will have to update its BGP table somehow:
R2(config)#router bgp 2
R2(config-router)#neighbor 192.168.12.1 route-map METRIC in
This route-map will set the metric to 222 for all prefixes that we receive from R1. Let’s
look at the BGP table again:
R2#show ip bgp
BGP table version is 3, local router ID is 192.168.12.2
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale, m multipath, b backup-path, x
best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
As you can see nothing has changed yet. We’ll use the route refresh method to fix this
but before I do so, let’s enable a debug so you can see in realtime what is going on:
You can choose between inbound, outbound or both. Let’s do inbound:
R2#
BGP: 192.168.12.1 sending REFRESH_REQ(5) for afi/safi: 1/1
R1#
BGP: 192.168.12.2 rcvd REFRESH_REQ for afi/safi: 1/1
R2#show ip bgp
BGP table version is 5, local router ID is 192.168.12.2
Status codes: s suppressed, d damped, h history, * valid, > best, i
- internal,
r RIB-failure, S Stale, m multipath, b backup-path, x
best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete
Very nice, the metric has been updated and we didn’t clear the BGP session…mission
accomplished!
When you enable soft reconfiguration, your router will no longer send a route refresh update
request to its BGP neighbor but it will use the routing information that it stored for this neighbor.
External BGP uses a simple loop prevention mechanism: when a router sees its own AS
number in the AS path, it doesn’t accept the prefix. There are some scenarios where this
might be an issue. Take a look at the following topology:
Above we have an MPLS VPN network where the customer is using the same AS
number (12) on both sites. CE1 and CE2 will be unable to learn each other’s prefixes
since they are using the same AS number.
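The loop check itself is easy to model. Below is a minimal Python sketch, assuming the AS path is a plain list of AS numbers and that the provider uses AS 234 (the PE AS number used in these lessons). The function name and the `allowas_in` parameter are hypothetical; the parameter models how many times the local AS may appear in the path, with 0 as the default behaviour.

```python
def accept_ebgp_update(local_as: int, as_path: list[int], allowas_in: int = 0) -> bool:
    """Return True if the update passes the eBGP AS path loop check.

    allowas_in is the number of times our own AS may appear in the path.
    0 models the default: any occurrence of our own AS rejects the update.
    """
    return as_path.count(local_as) <= allowas_in


# CE1 (AS 12) receives the prefix of CE2 through the provider (AS 234).
# The path still contains the customer AS number 12.
path_from_pe1 = [234, 12]

default_result = accept_ebgp_update(12, path_from_pe1)               # rejected
relaxed_result = accept_ebgp_update(12, path_from_pe1, allowas_in=1)  # accepted
```

The value 1 for `allowas_in` is only an illustrative choice; it is enough here because AS 12 appears once in the path.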
Each CE router has a loopback interface that was advertised in BGP (1.1.1.1/32 and
5.5.5.5/32). The first thing to check is to see if the PE routers have learned the prefixes
from our CE routers:
PE1#show ip bgp vpnv4 all
Above you can see that both PE routers have a VPN route for these prefixes. Did they
advertise these prefixes to our CE routers?
No issues there, our PE routers are advertising these prefixes to the CE routers. Let’s
see what we find in the BGP tables of the CE routers:
CE1#show ip bgp
The CE routers only have their own prefixes in their BGP tables. Why did they refuse
the updates from the PE routers? Time for a debug:
CE1#clear ip bgp *
CE1(config)#router bgp 12
CE1(config-router)#neighbor 192.168.12.2 allowas-in
CE1 is now configured to allow prefixes with its own AS number from the PE1 router. If
you left the debug enabled then you will see this:
CE1#
BGP(0): Revise route installing 1 of 1 routes for 5.5.5.5/32 ->
192.168.12.2(global) to main IP table
That should take care of our problem. Let’s see if the prefix has been installed:
There we go, it’s in the routing table. Don’t forget to configure the same change on CE2:
CE2(config)#router bgp 12
CE2(config-router)#neighbor 192.168.45.4 allowas-in
That’s looking good. One final check left, let’s see if there is connectivity between
1.1.1.1 and 5.5.5.5:
Packet sent with a source address of 1.1.1.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 7/9/11
ms
Conclusion
The allowas-in command is a simple trick to overrule the loop prevention mechanism of
external BGP. In this example it’s safe to relax the check since CE1 and CE2 are stub
routers: they only have one exit path through the PE routers. This solution allowed us to
solve the problem on the CE routers. We can also fix it by making a change on the PE
routers, I’ll show you how to do this in the AS override lesson.
When your customer sites are multihomed or have a backdoor link between them,
you have to be careful as this solution can introduce loops. The BGP SoO (Site of
Origin) community attribute is then used as a loop prevention mechanism. This is
something we will cover in another lesson.
BGP has a simple loop prevention mechanism for external BGP. When you see your
own AS number in the AS path, we do not accept the prefix. This mechanism is fine for
Internet routing but there are some other scenarios where this might be an issue. Take
a look at the following topology:
Above we have a small MPLS VPN network with two customer sites. The customer is
using the same AS number (12) for both sites. When CE1 or CE2 receive an update
from each other they will not accept it since their own AS number will be in the AS path.
Let’s find out if this is true. Here are the configurations of all routers:
Let’s find out what is going on. First we’ll check if the PE routers have a VPN route for
the prefixes from the CE routers:
Route Distinguisher: 1:1 (default for vrf CUSTOMER)
*> 1.1.1.1/32 192.168.12.1 0 0 12 i
*>i 5.5.5.5/32 4.4.4.4 0 100 0 12 i
PE2#show ip bgp vpnv4 all
The PE routers have an entry for the loopback interfaces of the CE routers. Are they
advertising these to the CE routers?
The PE routers are advertising these to the CE routers. Let’s check the CE routers:
CE1#show ip bgp
There’s nothing there…they only have the prefix on their own loopback interface in the
BGP table. Let’s enable a debug on CE1 to figure out why it’s not accepting anything
from PE1:
CE1#clear ip bgp *
CE1#
BGP(0): 192.168.12.2 rcv UPDATE about 5.5.5.5/32 -- DENIED due to:
AS-PATH contains our own AS;
No surprises here…CE1 is denying the update since it sees its own AS number in the
AS path. If we want to keep the same AS number on CE1 and CE2 then there are two
possible solutions for this issue:
Allow-AS in: this can be configured on the CE routers which tells them to accept prefixes with
their own AS number in the AS path.
AS override: this can be configured on the PE routers, the AS number will be replaced with the
AS number from the service provider.
This lesson is about AS override so that’s what we will do. Let’s configure the PE
routers:
PE2(config)#router bgp 234
PE2(config-router)#address-family ipv4 vrf CUSTOMER
PE2(config-router-af)#neighbor 192.168.45.5 as-override
To speed things up, let’s clear the BGP neighbor adjacencies on the PE routers:
The CE routers have now learned each other’s prefixes. If you take a closer look, you
can see that AS number 12 has been replaced with AS number 234.
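As a rough model of what the PE router does, here is a short Python sketch. It assumes the AS path is a list of AS numbers, that the customer uses AS 12 and the provider AS 234 as in this lesson, and that the PE prepends its own AS when it advertises to an eBGP neighbor. The function names are hypothetical.

```python
def as_override(as_path: list[int], provider_as: int, neighbor_as: int) -> list[int]:
    """Replace every occurrence of the neighbor's AS with the provider AS."""
    return [provider_as if asn == neighbor_as else asn for asn in as_path]


def advertise_to_ce(as_path: list[int], provider_as: int, neighbor_as: int,
                    override: bool) -> list[int]:
    """Return the AS path the CE router receives from the PE router."""
    path = as_override(as_path, provider_as, neighbor_as) if override else list(as_path)
    # The PE prepends its own AS when sending to an eBGP neighbor.
    return [provider_as] + path


# The VPN route for the remote site carries the customer AS (12) in its path.
without_override = advertise_to_ce([12], provider_as=234, neighbor_as=12, override=False)
with_override = advertise_to_ce([12], provider_as=234, neighbor_as=12, override=True)
```

Without the override, the CE router sees its own AS number in the path and drops the update. With the override, the customer AS no longer appears, so the loop check passes.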
One final check, let’s see if there is connectivity between 1.1.1.1 and 5.5.5.5:
Conclusion
AS override is a simple technique to change the AS number of updates that you
advertise to your external BGP neighbors. Another solution is allowas-in, but that is
configured on the CE routers. Since we are “overruling” the external BGP loop
prevention mechanism you have to make sure that you have a loop-free topology.
In this scenario there are no issues since the CE routers are stubs, they only have one
exit path. When your customer sites are multihomed or have a backdoor link then you
need to use the BGP SoO (Site of Origin) community to ensure you have a loop free
topology. This is something we’ll cover in another lesson.
When you use the BGP aggregate-address command on Cisco IOS without any
parameters, then all information of individual route attributes such as AS_PATH is lost.
This can cause issues since the AS_PATH is used for loop prevention. For example, it’s
possible that an AS installs a summary that it shouldn’t. With the AS-SET parameter,
you can optionally include AS information in the summary. In this lesson, I’ll show you
how to do this.
Configuration
Here is the topology we’ll use:
We have four routers, each in a different AS. R2 and R3 each have a loopback interface
with an IP address that is advertised in BGP. R1 will send an aggregate to R4.
Right now, there is no aggregate so R4 sees two separate prefixes with the correct AS
path information:
R4#show ip bgp
BGP table version is 3, local router ID is 192.168.14.4
Status codes: s suppressed, d damped, h history, * valid, > best, i -
internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Without AS-SET
Let’s create a summary/aggregate. We’ll start without the AS-SET parameter so that we
have a before and after example:
R1(config)#router bgp 1
R1(config-router)#aggregate-address 172.16.0.0 255.255.0.0 summary-only
R4#show ip bgp
BGP table version is 10, local router ID is 192.168.14.4
Status codes: s suppressed, d damped, h history, * valid, > best, i -
internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
We see the 172.16.0.0/16 prefix but all AS path information is lost. This prefix seems to
come from AS 1 only.
If R4 was connected to R2 or R3 then those routers would install this prefix without
hesitation since they don’t see their own AS number in the summary route. This could
cause routing loops.
With AS-SET
R1(config)#router bgp 1
R1(config-router)#aggregate-address 172.16.0.0 255.255.0.0 summary-only as-set
R4#show ip bgp
BGP table version is 11, local router ID is 192.168.14.4
Status codes: s suppressed, d damped, h history, * valid, > best, i -
internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
We now see the AS path information in the aggregate. This helps against routing loops
as it shows the AS numbers in the aggregate. If R2 or R3 would somehow receive this
aggregate, they would not accept it since they see their own AS number.
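The effect of the AS-SET parameter on the AS path can be sketched in a few lines of Python. This is a deliberately simplified model, not the exact algorithm a router uses: it assumes each contributing prefix has a short AS path, puts the union of all AS numbers into one unordered set when the paths differ, and keeps a plain sequence when there is only one distinct path. The function name is hypothetical.

```python
def aggregate_as_path(local_as: int, contributing_paths: list[list[int]],
                      as_set: bool) -> str:
    """Return the AS path a neighbor would see for the aggregate (simplified)."""
    if not as_set or not contributing_paths:
        # Without AS-SET the aggregate looks as if it originated in our own AS.
        return str(local_as)

    distinct_paths = {tuple(path) for path in contributing_paths}
    if len(distinct_paths) == 1:
        # Only one distinct path: it can be kept as an ordered sequence.
        sequence = " ".join(str(asn) for asn in contributing_paths[0])
        return f"{local_as} {sequence}".strip()

    # Different paths: collect every AS number into an unordered set.
    members = sorted({asn for path in contributing_paths for asn in path})
    as_set_text = "{" + ",".join(str(asn) for asn in members) + "}"
    return f"{local_as} {as_set_text}"


# R1 (AS 1) aggregates the prefixes learned from R2 (AS 2) and R3 (AS 3).
without_as_set = aggregate_as_path(1, [[2], [3]], as_set=False)
with_as_set = aggregate_as_path(1, [[2], [3]], as_set=True)
```

Because AS 2 and AS 3 are visible in the second result, a router in either AS would reject the aggregate when it applies the normal eBGP loop check.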
So, should we always use AS-SET? Not necessarily, there is a downside. Whenever
the AS path information in the aggregate changes, R1 will send an update. For example,
let’s shut the loopback on R3:
R3(config)#interface Loopback 0
R3(config-if)#shutdown
R4#show ip bgp
BGP table version is 12, local router ID is 192.168.14.4
Status codes: s suppressed, d damped, h history, * valid, > best, i -
internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Information about AS 3 has been removed. It’s interesting to see that this router does
not show 1 {2} but just 1 2.
If you have an aggregate that covers hundreds or thousands of prefixes then a change
in your aggregate is likely. If you have a flapping network somewhere, it’s possible that
your aggregate keeps getting updated.
Conclusion
You have now learned how you can include AS path information in aggregates
(summaries) with the AS-SET parameter. This helps to prevent routing loops in case the
aggregate somehow makes it back to one of the ASes that originated a prefix covered by
the aggregate.
The disadvantage of AS-SET is that, because the aggregate carries AS path information,
it gets updated whenever that information changes. With a flapping network
somewhere, this could mean that your aggregate keeps getting updated over and over
again.
Unlike most routing protocols, BGP only selects a single best path for each prefix. It
doesn’t do ECMP (Equal Cost Multi-Path Routing) by default but it is possible to enable
this.
In order for BGP to use the second path, the following attributes have to match:
Weight
Local Preference
AS Path (both AS number and AS path length)
Origin code
MED
IGP metric
Also, the next hop address for each path must be different. This comes into play when
you are multihomed to the same router.
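The requirements above can be written down as a small eligibility check. The Python sketch below is only a model of the rules listed in this lesson: the `Path` fields, the example next hop addresses, and the `relax_as_path` flag (which compares only the AS path length, as needed later for paths through different AS numbers) are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    next_hop: str
    weight: int
    local_pref: int
    as_path: tuple[int, ...]
    origin: str
    med: int
    igp_metric: int


def multipath_eligible(best: Path, other: Path, relax_as_path: bool = False) -> bool:
    """Return True if 'other' may be installed next to the best path."""
    if best.next_hop == other.next_hop:
        # Both paths must use a different next hop address.
        return False
    if relax_as_path:
        as_path_ok = len(best.as_path) == len(other.as_path)
    else:
        as_path_ok = best.as_path == other.as_path
    same_attributes = (
        (best.weight, best.local_pref, best.origin, best.med, best.igp_metric)
        == (other.weight, other.local_pref, other.origin, other.med, other.igp_metric)
    )
    return as_path_ok and same_attributes


# Two paths to the same prefix through different AS numbers (hypothetical values).
via_as2 = Path("192.168.12.2", 0, 100, (2, 4), "i", 0, 0)
via_as3 = Path("192.168.13.3", 0, 100, (3, 4), "i", 0, 0)

strict_result = multipath_eligible(via_as2, via_as3)
relaxed_result = multipath_eligible(via_as2, via_as3, relax_as_path=True)
```

With the strict check the second path is refused because the AS numbers differ. With the relaxed check it qualifies, since both paths have the same length.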
In this lesson, I’ll show you how to configure eBGP and iBGP to use more than one
path.
Configuration
We’ll start with two eBGP scenarios.
eBGP
Let’s look at a scenario where we have two paths to the same AS. Here’s the topology:
R1 is in AS 1 and connected to R2/R3 in AS 23. R1 will have two paths to get to
192.168.23.0/24.
R1#show ip bgp
BGP table version is 2, local router ID is 192.168.13.1
Status codes: s suppressed, d damped, h history, * valid, > best, i -
internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
R1 has two equal paths but decided to install the path to R2. We can enable load
balancing with the maximum-paths command:
R1(config)#router bgp 1
R1(config-router)#maximum-paths 2
R1#show ip bgp
BGP table version is 3, local router ID is 192.168.13.1
Status codes: s suppressed, d damped, h history, * valid, > best, i -
internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Now we have two entries. Note the “m” that stands for multipath. Both paths are
installed in the routing table:
Different AS
Let’s look at another eBGP scenario. This time, we have multiple AS numbers:
R1 can go through AS 3 or AS 2 to get to 4.4.4.4/32 in AS 4.
R1#show ip bgp
BGP table version is 2, local router ID is 192.168.13.1
Status codes: s suppressed, d damped, h history, * valid, > best, i -
internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
R1 has installed R2 as its next hop address. Let’s see if we can change that:
R1(config)#router bgp 1
R1(config-router)#maximum-paths 2
R1#show ip bgp
*Mar 21 11:10:13.118: %SYS-5-CONFIG_I: Configured from console by console
BGP table version is 2, local router ID is 192.168.13.1
Status codes: s suppressed, d damped, h history, * valid, > best, i -
internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
The problem here is that we have two different AS numbers, AS 2 and AS 3. We can tell
BGP to “relax” its requirement of matching both the AS numbers and the AS path length
so that it only checks the AS path length. This can be done with the following hidden
command:
R1(config)#router bgp 1
R1(config-router)#bgp bestpath as-path multipath-relax
R1#show ip bgp
BGP table version is 3, local router ID is 192.168.13.1
Status codes: s suppressed, d damped, h history, * valid, > best, i -
internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
We now see the “m” so we know R1 uses R3 as well. We can confirm this by looking at
the routing table:
iBGP
What about iBGP? Let’s take a look at the following topology:
R1, R2, and R3 are in AS 123 while R4 is in AS 4. R1 can use either R2 or R3 to get to
4.4.4.4/32.
R1#show ip bgp
BGP table version is 12, local router ID is 1.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i -
internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
R1 has two options to get to 4.4.4.4/32 but is using only one entry. We can change this
with the maximum-paths command but you need to add the ibgp parameter:
R1(config)#router bgp 123
R1(config-router)#maximum-paths ibgp 2
R1#show ip bgp
BGP table version is 10, local router ID is 1.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i -
internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
B 4.4.4.4 [200/0] via 192.168.34.4, 00:00:37
[200/0] via 192.168.24.4, 00:00:37
B 192.168.24.0/24 [200/0] via 2.2.2.2, 00:26:11
B 192.168.34.0/24 [200/0] via 3.3.3.3, 00:26:18
Conclusion
You have now learned how to enable ECMP (Equal-Cost Multi-Path Routing) for BGP.
All BGP attributes have to be the same for different paths, except for the next hop
address.
You can enable load balancing with the maximum-paths command.
When you use eBGP with different AS numbers, you need to add the hidden bgp
bestpath as-path multipath-relax command.
When you use iBGP, make sure you add the ibgp parameter to the maximum-paths
command.
For each route in the BGP table, the next hop has to exist and has to be reachable. If
not, the route can’t be used. BGP uses a scanner that checks all routes in the BGP table
every 60 seconds. The BGP scanner runs the best path calculation and checks whether
the next hop addresses exist and are reachable.
60 seconds is a long time. When something happens with a next hop during the 60
seconds between two scans, we have to wait for the next scan to start before problems
are resolved. Meanwhile, we can have black holes and/or routing loops.
BGP next hop tracking is a feature that reduces the BGP convergence time by
monitoring BGP next hop address changes in the routing table. It’s event-based
because it detects changes in the routing table. When it detects a change,
it schedules a next hop scan to adjust the next hop in the BGP table.
After detecting a change, the next hop scan has a default delay of 5 seconds. Next
hop tracking also supports dampening penalties. This increases the delay of the next
hop scan for next hop addresses that keep changing in the routing table.
In this lesson, I’ll show you what the BGP next hop scanner looks like and how
dampening works.
Configuration
We use the following topology:
We have three routers in AS 123 running iBGP. Each router has a loopback interface
with an IP address that we advertise in OSPF. Those IP addresses are used by iBGP as
the next hop addresses. In between R2 and R3, we have the 192.168.23.0/24 network
that we will advertise in BGP.
As explained earlier, BGP has a scanner that runs every 60 seconds. If you have never
seen it before, it’s interesting to take a look at. You can see that it runs every 60
seconds if you enable the following debug:
R1#debug ip bgp
BGP debugging is on for address family: IPv4 Unicast
R1#
*Apr 9 09:56:53.743: BGP: topo global:IPv4 Unicast:base Scanning routing
tables
*Apr 9 09:56:53.744: BGP: topo global:IPv4 Multicast:base Scanning routing
tables
*Apr 9 09:56:53.746: BGP: topo global:L2VPN E-VPN:base Scanning routing
tables
*Apr 9 09:56:53.747: BGP: topo global:MVPNv4 Unicast:base Scanning routing
tables
I left the timestamps so that you can see it runs every 60 seconds. The BGP scanner is a
bit too slow to rely on for next hop changes. Let’s see how next hop tracking works!
Next hop tracking is enabled by default so it’s not something we have to configure.
You can see the two commands here:
R1#show run all | include nexthop trigger
bgp nexthop trigger enable
bgp nexthop trigger delay 5
You can disable it by adding no to the first command. The only value we can change is
the delay for when the next hop scanner starts (5 seconds).
We want to see next hop tracking in action so let’s enable the following two debugs:
R1#debug ip routing
IP routing debugging is on
The first debug is useful to see changes to the routing table. The second debug shows
what next hop tracking will do.
R1#show ip bgp
BGP table version is 19, local router ID is 1.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i -
internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
The 192.168.23.0/24 network is advertised by both R2 and R3 but we use the path
through R3. Let’s shut the loopback interface of R3 to see what happens:
R3(config)#interface Loopback 0
R3(config-if)#shutdown
R1#
RT: del 3.3.3.3 via 192.168.123.3, ospf metric [110/2]
RT: delete subnet route to 3.3.3.3/32
EvD: accum. penalty decayed to 0 after 67 second(s)
BGP(0): IPv4 Unicast::base nexthop modified, reuse in 00:00:19, 19000 ,
scheduling nexthop scan in 5 secs
BGP: BGP Event nhop timer
BGP: tbl IPv4 Unicast:base Nexthop walk
BGP(IPv4 Unicast): CHANGED Path metric 0 Path aigp-metric 0 nexthop:
3.3.3.3
RT: updating bgp 192.168.23.0/24 (0x0) :
via 2.2.2.2 0 1048577
As soon as OSPF figures out that 3.3.3.3/32 is gone, the route is deleted from the
routing table. Immediately after the OSPF event, you can see that BGP schedules the
next hop scanner in 5 seconds.
Once those 5 seconds have expired, it changes the next hop address to 2.2.2.2 (R2) and
adds this change to the routing table. This process is much faster than the BGP scanner
that runs every 60 seconds.
R1#show ip bgp
BGP table version is 22, local router ID is 1.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i -
internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
We still see 3.3.3.3 as a next hop in the BGP table. That’s because the BGP hold-down
timer hasn’t expired yet.
Known via "bgp 123", distance 200, metric 0, type internal
Last update from 2.2.2.2 00:00:32 ago
Routing Descriptor Blocks:
* 2.2.2.2, from 2.2.2.2, 00:00:32 ago
Route metric is 0, traffic share count is 1
AS Hops 0
MPLS label: none
Let’s try one more thing. Let’s shut the loopback 0 interface of R2 so that next hop
address 2.2.2.2 is invalid. Here’s the BGP table right now:
R1#show ip bgp
BGP table version is 22, local router ID is 1.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i -
internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
R2(config)#interface Loopback 0
R2(config-if)#shutdown
R1#
RT: del 2.2.2.2 via 192.168.123.2, ospf metric [110/2]
RT: delete subnet route to 2.2.2.2/32
EvD: accum. penalty decayed to 0 after 208 second(s)
BGP(0): IPv4 Unicast::base nexthop modified, reuse in 00:00:19, 19000 ,
scheduling nexthop scan in 5 secs
BGP: BGP Event nhop timer
BGP: tbl IPv4 Unicast:base Nexthop walk
BGP(IPv4 Unicast): CHANGED Path metric 0 Path aigp-metric 0 nexthop: 2.2.2.2
RT: del 192.168.23.0 via 2.2.2.2, bgp metric [200/0]
RT: delete network route to 192.168.23.0/24
OSPF detects the change and 2.2.2.2/32 is deleted from the routing table. Right after,
BGP schedules a next hop scan in 5 seconds and when the timer expires, it deletes the
192.168.23.0/24 route from the routing table.
Here’s what our BGP table looks like now:
R1#show ip bgp
BGP table version is 24, local router ID is 1.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i -
internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
The entry is still there because our iBGP hold-down timer hasn’t expired yet but the
network is no longer installed. We can verify this by checking the routing table:
Dampening
You have now seen how next hop tracking is scheduled and runs after 5 seconds. What
if we have a flapping network that causes the next hop to change over and over again?
Right now, that means that the BGP table gets updated after 5 seconds over and over
again.
Each time a next hop changes, a value of 500 is added to the penalty. When the penalty
is below 950, the next hop scanner is scheduled in 5 seconds. This is what we just
witnessed.
When the penalty is above 950, the next hop scanner is scheduled to run when the
penalty has decreased to 100 or below.
The penalty decreases by half every 8 seconds. If the current penalty is 2000 then 8
seconds later, it will be 1000. Another 8 seconds later, it will be 500. These parameters
cannot be configured.
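The decay is a plain half-life calculation, which the following Python sketch works out. The constants come from the text above (500 per change, a half-life of 8 seconds, a target of 100); the function names are made up for illustration, and the sketch says nothing about how IOS implements this internally.

```python
import math

HALF_LIFE = 8.0           # seconds until the penalty is cut in half
PENALTY_PER_CHANGE = 500  # added each time the next hop changes
REUSE_TARGET = 100        # the scan runs once the penalty is at or below this


def decay(penalty: float, seconds: float) -> float:
    """Return the penalty that remains after the given number of seconds."""
    return penalty * 0.5 ** (seconds / HALF_LIFE)


def seconds_until_target(penalty: float, target: float = REUSE_TARGET) -> float:
    """Return how long it takes for the penalty to decay to the target value."""
    if penalty <= target:
        return 0.0
    return HALF_LIFE * math.log2(penalty / target)


after_8_seconds = decay(2000, 8)    # 1000.0
after_16_seconds = decay(2000, 16)  # 500.0
fresh_penalty_wait = seconds_until_target(PENALTY_PER_CHANGE)
```

A single penalty of 500 needs roughly 18.6 seconds to decay to 100. That is close to the “reuse in 00:00:19” value in the debug output of this lesson, although linking the two is my own reading and not something the debug states explicitly.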
We can test dampening by changing the next hop in the routing table over and over
again. You could shut/unshut the loopback 0 interfaces of R2 or R3 a couple of times
but then you need to wait until OSPF converges.
I’m going to add and remove a static route on R1 a couple of times. This is the quickest
method to change the routing table and trigger next hop tracking.
Now I add and remove the following static route for next hop 2.2.2.2 a couple of times
in a row:
R1#
EvD: accum. penalty decayed to 0 after 127 second(s)
BGP(0): IPv4 Unicast::base nexthop modified, reuse in 00:00:19, 19000 ,
scheduling nexthop scan in 5 secs
The first time, our next hop scanner is scheduled in 5 seconds. This is normal. The next
hop keeps “flapping” and we get the following debug messages:
R1#
EvD: accum. penalty decayed to 353 after 4 second(s)
BGP(0): IPv4 Unicast::base nexthop modified, reuse in 00:00:25, 25000 , timer
already running
EvD: accum. penalty decayed to 853 after 0 second(s)
BGP(0): IPv4 Unicast::base nexthop modified, reuse in 00:00:30, 30000 , timer
already running
Above, you see the current penalty values (first 353, then 853) but nothing happens.
This is because the next hop scanner has already been scheduled. Once those 5
seconds have expired, it runs.
R1#
EvD: accum. penalty decayed to 956 after 4 second(s)
BGP(0): IPv4 Unicast::base nexthop modified, reuse in 00:00:31, 31000 ,
scheduling nexthop scan in 31 secs
Above, you see that the penalty was 956 and a bit later, the next hop scanner is
scheduled to run in 31 seconds. That’s when the penalty is supposed to have a value of
100.
The next hop keeps flapping and we see the following debug messages:
BGP(0): IPv4 Unicast::base nexthop modified, reuse in 00:00:40, 40000 , timer
already running
Above, you can see that the penalty keeps increasing but nothing happens. We already
scheduled the next hop scanner so right now, the penalty increased but that’s it.
R1#
BGP: BGP Event nhop timer
BGP: tbl IPv4 Unicast:base Nexthop walk
If the network keeps flapping, the penalty will rise higher and higher, and the next hop
scanner will be delayed even more.
Conclusion
You have now learned how BGP next hop tracking works.
BGP has a BGP scanner which checks next hops and next hop reachability every 60
seconds.
When a next hop changes or fails in between two runs of the BGP scanner then we
can have temporary black holes or routing loops.
Next hop tracking reduces BGP convergence time by checking changes to next
hops in the routing table.
When a change is detected, the BGP next hop scanner is scheduled to run in 5
seconds. You can change this value.
Next hop tracking supports dampening:
o Flapping networks get a penalty of 500 each time they flap.
o The penalty is reduced by half every 8 seconds.
o When the penalty is below 950, the next hop scanner is scheduled with the
default value (5 seconds).
o When the penalty is above 950, the next hop scanner is scheduled to run when
the penalty has a value of 100.
BGP routers only advertise the best path to their neighbors. When a better path is
found, it replaces the current path. Advertising a path and replacing it with a new path
is called an implicit withdraw.
Since we only advertise the best path, a lot of other possible paths are unknown to
some of the routers. We call this path hiding.
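A minimal sketch of the implicit withdraw, assuming the receiving router keeps exactly one entry per prefix for each neighbor. The dictionary, the function name and the example prefix and next hops are illustrative assumptions, not how a router stores this internally.

```python
# Hypothetical model of what a neighbor stores without additional paths:
# one entry per prefix, so a new advertisement overwrites the old one.
received = {}  # prefix -> next hop


def advertise(prefix: str, next_hop: str) -> None:
    """Store an advertisement; an existing entry is replaced (implicit withdraw)."""
    received[prefix] = next_hop


advertise("6.6.6.6/32", "5.5.5.5")  # first best path
advertise("6.6.6.6/32", "4.4.4.4")  # better path found: replaces the old one
print(received)
```

After the second advertisement, the receiver has no record of the first path anymore. That is the path hiding described above.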
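Path hiding has a couple of disadvantages:
o We can’t use BGP multipath.
o Sub-optimal routing is possible.
o MED oscillation can occur.
o When a next hop fails, BGP has to reconverge.
BGP additional paths let us advertise additional paths next to the best path. There are
three options to select these additional paths:
Best N: the best path plus the next best path(s). Each next best path is chosen by
eliminating the current best path and then running best path selection again on the
remaining paths.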
Group-best: the set of paths that are the best paths from each AS. This is best
explained with a quick example:
o AS 1 advertises three paths for prefix 1.1.1.1/32:
    o One with next hop 192.168.1.1
    o One with next hop 192.168.1.2
    o One with next hop 192.168.1.3
o AS 2 advertises three paths for prefix 2.2.2.2/32:
    o One with next hop 192.168.2.1
    o One with next hop 192.168.2.2
    o One with next hop 192.168.2.3
o AS 3 advertises three paths for prefix 3.3.3.3/32:
    o One with next hop 192.168.3.1
    o One with next hop 192.168.3.2
    o One with next hop 192.168.3.3
o We choose the best path from each AS and that becomes our set. This could be:
    o 1.1.1.1/32 with next hop 192.168.1.1
    o 2.2.2.2/32 with next hop 192.168.2.1
    o 3.3.3.3/32 with next hop 192.168.3.1
All: all paths that have a unique next hop can be used as an additional path.
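The three selection options can be sketched as follows. This is a rough model, not the router's algorithm: the Path class and the rank field are hypothetical stand-ins for the outcome of BGP best path selection (lower is better), and all paths are assumed to be candidates for the same prefix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    neighbor_as: int
    next_hop: str
    rank: int  # hypothetical stand-in for best path selection; lower is better


def best_n(paths, n):
    """Best path plus the next best ones: remove the winner and pick again."""
    remaining, chosen = list(paths), []
    while remaining and len(chosen) < n:
        winner = min(remaining, key=lambda p: p.rank)
        chosen.append(winner)
        remaining.remove(winner)
    return chosen


def group_best(paths):
    """The best path from each neighboring AS."""
    best = {}
    for p in paths:
        if p.neighbor_as not in best or p.rank < best[p.neighbor_as].rank:
            best[p.neighbor_as] = p
    return [best[asn] for asn in sorted(best)]


def select_all(paths):
    """Every path that has a unique next hop."""
    seen, chosen = set(), []
    for p in sorted(paths, key=lambda p: p.rank):
        if p.next_hop not in seen:
            seen.add(p.next_hop)
            chosen.append(p)
    return chosen


paths = [
    Path(1, "192.168.1.1", 1), Path(1, "192.168.1.2", 4),
    Path(2, "192.168.2.1", 2), Path(2, "192.168.2.2", 5),
    Path(3, "192.168.3.1", 3), Path(3, "192.168.3.1", 6),  # duplicate next hop
]
print([p.next_hop for p in best_n(paths, 2)])
print([p.next_hop for p in group_best(paths)])
print([p.next_hop for p in select_all(paths)])
```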
Let’s take a look at the additional paths feature in action.
Configuration
We’ll use the following topology:
We have five routers in AS 12345 where R2 is the RR (Route Reflector). R6 has a
loopback interface with prefix 6.6.6.6/32 that is advertised in BGP.
Let’s see what we have. R4 and R5 have learned about 6.6.6.6/32 from R6:
R2 has both paths in its BGP table but installs the path to R4. It only advertises this best
path to its clients. Let’s look at two scenarios where this might be an issue.
Load Balancing
Network Next Hop Metric LocPrf Weight Path
*>i 6.6.6.6/32 4.4.4.4 0 100 0 6 i
Since R2 only advertises its best path, R1 only knows about the path through R4. How is
the next hop resolved? Let’s have a look:
4.4.4.4 resolves to R2 so R1 will never use the path through R3. Because R2 is “hiding”
the path, we have two disadvantages here:
R3 also uses 4.4.4.4 as the next hop to get to 6.6.6.6. How does 4.4.4.4 get resolved?
R3 has two different paths to get to 4.4.4.4/32: through R2 or through R5. The path
through R2 is not optimal since it’s one more hop compared to the path
through R5.
Also, when 4.4.4.4 fails, BGP has to reconverge. What we have seen so far is just the
way it is; this is how BGP works. Can we improve it though?
BGP Additional Paths
Let’s enable the additional-paths feature. I’ll start with R2. Let’s take a look at the
command:
We enable this for the address-family, IPv4 in our case. There are a couple of options.
Let’s start by enabling the extension itself:
For each neighbor, you can define if you want to send and/or receive additional paths.
I’ll configure R2 to send additional paths to both R1 and R3:
The next thing to do is configure R2 how to globally select additional paths with the bgp
additional-paths select command:
R2(config-router-af)#bgp additional-paths select ?
all Select all available paths
backup Select backup path
best Select best N paths
best-external Select best-external path
group-best Select group-best path
We’ll keep it simple and tell R2 to use all available paths as additional paths:
R2#show ip bgp
BGP table version is 11, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
              x best-external, a additional-path, c RIB-compressed,
              t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Note the a for an additional path. This does not affect the routing table and/or
forwarding table for R2:
R2#show ip route bgp
We still see only a single route here; the additional path is not installed. We are almost
there. The last thing to do on R2 is to tell it which of the additional paths (that we
previously selected globally) we want to advertise to our neighbors:
Above, you can see that we only have three options here. Let’s keep it simple and
configure R2 to advertise all additional paths:
R1 & R3
(config)#router bgp 12345
(config-router)#neighbor 2.2.2.2 additional-paths receive
Let’s take a look at R1 again:
R1#show ip bgp
R1 now has two options, which tells us that R2 advertises two paths. R1 has selected
4.4.4.4 as the next hop but we could use 5.5.5.5 as well. Let’s take a closer look at the
BGP table for prefix 6.6.6.6/32:
Above, you can see the received path IDs. These are two different values, which makes
each path unique. You could now configure R1 to use multipath so that it uses both
paths. That’s not a bad idea, but since I already did this in the multipath lesson, we’ll try
something else here. Let’s configure the path through R3 as a backup path.
How and why to use backup paths in BGP is explained in detail in the BGP PIC (Prefix
Independent Convergence) lesson.
Two commands are needed: first the bgp additional-paths select all command to tell R1
which additional paths to use, and then a second command to actually install the
additional path:
Let’s check the BGP table again:
Above, we see that the second path is now a backup/repair path. We can also verify this
in the CEF table:
This is very nice. When our next hop 4.4.4.4 fails, we can install 5.5.5.5 right away
without having to wait for BGP to reconverge.
R3#show ip bgp
Because R3 has received the additional path, it’s now able to make better choices. It
installed 5.5.5.5 as the next hop.
It’s not a bad idea to configure R3 to use the other path as a backup/repair path:
That’s looking good, R5 is now our best path and R4 is our backup path.
Conclusion
You have now learned about BGP additional paths:
BGP only advertises the best path to its neighbors. When a better path is found, it
advertises the new path, which replaces the old one. This is called an implicit
withdraw.
Not advertising all paths is also called path hiding.
Path hiding has some disadvantages:
o Can’t use BGP multipath
o The possibility of sub-optimal routing
o MED oscillation
o Next hop failure means BGP has to reconverge
BGP additional paths let you advertise additional paths next to the best path.
This leads to path diversity instead of path hiding.
Each path gets a unique path identifier.
BGP additional paths are only available to iBGP.
There are three things needed for additional paths:
o Configure the router to send and/or receive additional paths.
o Configure a global selection criterion for additional paths.
o To each neighbor, advertise a set or sets of additional paths that we selected as
candidate paths.
There are three options for additional paths selection:
o Best N
o Group Best
o All
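To see why the path identifier matters, here is a minimal sketch of a receiver's table with additional paths enabled. It assumes entries are keyed by prefix and path ID together; the dictionary, the function names and the path ID values 1 and 2 are illustrative assumptions.

```python
# Hypothetical model of a receiver's table when additional paths are enabled:
# entries are keyed by (prefix, path ID) instead of by prefix alone.
received = {}  # (prefix, path_id) -> next hop


def advertise(prefix: str, path_id: int, next_hop: str) -> None:
    """Store a path; only an entry with the same prefix AND path ID is replaced."""
    received[(prefix, path_id)] = next_hop


def paths_for(prefix: str) -> list:
    """All next hops currently known for a prefix, ordered by path ID."""
    return [nh for (pfx, _), nh in sorted(received.items()) if pfx == prefix]


advertise("6.6.6.6/32", 1, "4.4.4.4")
advertise("6.6.6.6/32", 2, "5.5.5.5")  # different path ID: no implicit withdraw
print(paths_for("6.6.6.6/32"))
```

Because the second advertisement carries a different path ID, it is stored next to the first one instead of replacing it.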
I hope you enjoyed this lesson. If you have any questions feel free to leave a comment!
Above, we have a small provider network in AS 2 with routers running iBGP. P2 is our
route reflector in this network. We have two customers, CE1 and CE2. CE2 is advertising
two prefixes in BGP. Within AS 2, we use OSPF as the IGP. I increased the cost on the
interfaces of the P2 router so that P1 is the preferred path.
PE1#show ip bgp
BGP table version is 23, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
              x best-external, a additional-path, c RIB-compressed,
              t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Let’s focus on 8.8.8.8/32 and 88.88.88.88/32. These two prefixes were advertised by
CE2 and PE1 learned them from P2 (our route reflector). The next hop for both prefixes
is 6.6.6.6. When PE1 wants to reach CE2, our traffic path looks like this:
How does 6.6.6.6 get resolved? Let’s have a look:
PE1 learns about 6.6.6.6 through OSPF. To get to this BGP next hop, it uses next hop
192.168.24.4 through interface GigabitEthernet0/1. This is what gets installed in the FIB
(CEF table):
What about those two prefixes? This is how they show up in the FIB:
PE1#show ip cef | include 8.8
8.8.8.8/32 192.168.24.4 GigabitEthernet0/1
88.88.88.88/32 192.168.24.4 GigabitEthernet0/1
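The recursive lookup behind these FIB entries can be sketched like this. The addresses and interface come from the outputs above; the dictionary layout and the flatten function are assumptions for illustration only.

```python
# Values taken from the outputs above; the data layout is an assumption.
igp_routes = {"6.6.6.6": ("192.168.24.4", "GigabitEthernet0/1")}
bgp_routes = {"8.8.8.8/32": "6.6.6.6", "88.88.88.88/32": "6.6.6.6"}


def flatten(bgp, igp):
    """Resolve every BGP next hop through the IGP and store the result per prefix."""
    return {prefix: igp[bgp_next_hop] for prefix, bgp_next_hop in bgp.items()}


fib = flatten(bgp_routes, igp_routes)
for prefix, (next_hop, interface) in fib.items():
    print(prefix, next_hop, interface)
```

Each prefix ends up with its own copy of the resolved next hop and outgoing interface; the BGP next hop 6.6.6.6 is no longer visible in the entry.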
Let’s find out! To see what happens in real-time, I’ll create an access-list that matches
the next hop (6.6.6.6) and our two prefixes (8.8.8.8/32 and 88.88.88.88/32). We attach
the access-list to a debug:
PE1#debug ip routing 1
IP routing debugging is on for access list 1
Now it’s up to OSPF to figure out that P1 is gone and find a new path to 6.6.6.6:
PE1#
RT: updating ospf 6.6.6.6/32 (0x0) :
via 192.168.25.5 Gi0/2 0 1048578
We see that OSPF reconverges and installs 192.168.25.5 (P2) as the new next hop to get
to 6.6.6.6/32. BGP is also updated. Let’s take a look at the BGP table:
PE1#show ip bgp
BGP table version is 33, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
              x best-external, a additional-path, c RIB-compressed,
              t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Nothing changed in the BGP table. This makes sense since our BGP next hop did not
change, only the path to get to the next hop did. Let’s take a look at OSPF:
PE1#show ip route 6.6.6.6
Routing entry for 6.6.6.6/32
Known via "ospf 1", distance 110, metric 4, type intra area
Last update from 192.168.25.5 on GigabitEthernet0/2, 00:02:06 ago
Routing Descriptor Blocks:
* 192.168.25.5, from 6.6.6.6, 00:02:06 ago, via GigabitEthernet0/2
Route metric is 4, traffic share count is 1
We see that OSPF now uses 192.168.25.5 (P2) to get to 6.6.6.6/32 via the
GigabitEthernet0/2 interface. Let’s take a look at the FIB for our next hop:
Above, we see the new next hop for 6.6.6.6/32 and interface in the FIB. Let’s take a look
at the FIB entries for our two prefixes:
The FIB entries for our two prefixes have changed as well to reflect the new next hop
and outgoing interface. Think about this for a second…
Our router had to figure out a new path to get to 6.6.6.6/32 and update all entries in
the FIB that used the old next hop (192.168.24.4) to the new next hop (192.168.25.5).
With two prefixes, this is a piece of cake, but what if we have one million prefixes in our
FIB? They all have to be updated one-by-one, and this can take a long time (minutes).
The problem here is how our FIB works. We use what we call a “flat” or “flattened” FIB
structure:
With a flat FIB, there is a one-to-one relationship between the prefix and the next
hop / outgoing interface. When the next hop changes, the router walks through each
prefix in the FIB to update the next hop. This is a time-consuming process that requires
a lot of CPU cycles.
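A rough sketch of why the flat FIB is expensive to update. The functions and the prefix names are hypothetical; the only point is that the number of entries touched grows with the number of prefixes.

```python
def build_flat_fib(prefix_count, next_hop, interface):
    """One independent (next hop, interface) entry per prefix."""
    return {f"prefix-{i}": (next_hop, interface) for i in range(prefix_count)}


def update_flat_fib(fib, old_next_hop, new_next_hop, new_interface):
    """Walk every prefix and rewrite matching entries; return how many were touched."""
    touched = 0
    for prefix, (next_hop, _) in list(fib.items()):
        if next_hop == old_next_hop:
            fib[prefix] = (new_next_hop, new_interface)
            touched += 1
    return touched


fib = build_flat_fib(100_000, "192.168.24.4", "GigabitEthernet0/1")
touched = update_flat_fib(fib, "192.168.24.4", "192.168.25.5", "GigabitEthernet0/2")
print(f"entries rewritten: {touched}")
```

With 100,000 prefixes behind the same next hop, a single IGP change means 100,000 rewrites in this model.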
To improve this, we use a different structure for our FIB, called the hierarchical FIB:
Instead of a flat table, we use a hierarchical table where we also store the BGP next
hops and the IGP next hops separately.
This has a huge advantage: when the IGP next hop changes, only the part of the FIB
that is affected has to change:
We change the pointer of the BGP path list from “PE3 via 192.168.24.4” to “PE3 via
192.168.25.5” and that’s it! We don’t have to touch the BGP prefixes or BGP path list
since those did not change! That’s why we call it “prefix independent”.
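The same idea as a small sketch, assuming all prefixes share one BGP path list object that points to a shared IGP path. The class names are hypothetical; the addresses and interfaces are the ones used in this lesson.

```python
class IgpPath:
    """Shared IGP level: how to reach a BGP next hop."""

    def __init__(self, next_hop, interface):
        self.next_hop = next_hop
        self.interface = interface


class BgpPathList:
    """Shared BGP level: a BGP next hop that points to an IGP path."""

    def __init__(self, bgp_next_hop, igp_path):
        self.bgp_next_hop = bgp_next_hop
        self.igp_path = igp_path


path_list = BgpPathList("6.6.6.6", IgpPath("192.168.24.4", "GigabitEthernet0/1"))

# Every prefix references the same path list object instead of holding a copy.
fib = {"8.8.8.8/32": path_list, "88.88.88.88/32": path_list}


def lookup(prefix):
    """Follow the pointers: prefix -> BGP path list -> IGP path."""
    path = fib[prefix].igp_path
    return path.next_hop, path.interface


# The IGP reconverges: one pointer changes, no prefix entry is touched.
path_list.igp_path = IgpPath("192.168.25.5", "GigabitEthernet0/2")
print(lookup("8.8.8.8/32"))
print(lookup("88.88.88.88/32"))
```

The update is a single assignment no matter how many prefixes point to the path list.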
With something like a million prefixes, this makes the difference between a
convergence time of minutes and one of less than 200 ms. The convergence time depends
on how fast your IGP converges to find the new next hop. Some other advantages are:
It’s only a single command to enable the hierarchical FIB and PIC core but if you want to
use this, it doesn’t end here. Since this relies on your IGP convergence time, there are
some things you need to implement to speed up your convergence times like:
P1(config-if-range)#no shutdown
When that happens, BGP has to reconverge and switch from one PE router to another
PE router.
I’ll show you a “before” and “after” scenario of PIC edge from PE1’s perspective when
PE3 fails. We use the exact same topology that we started with.
PE1#show ip bgp
We see the 8.8.8.8/32 and 88.88.88.88/32 prefixes, which both use next hop 6.6.6.6 (PE3).
Let’s check the OSPF routing table:
To get to 6.6.6.6/32, we use next hop 192.168.24.4. This gets installed in the FIB:
PE1#show ip cef | include 8.8
8.8.8.8/32 192.168.24.4 GigabitEthernet0/1
88.88.88.88/32 192.168.24.4 GigabitEthernet0/1
Disaster strikes again…our provider recovered from the P1 failure but now their PE3
router fails:
On PE1, I enable this debug again:
PE1#debug ip routing 1
IP routing debugging is on for access list 1
And to simulate the PE3 failure, we’ll shut down all its interfaces:
PE1#
RT: del 6.6.6.6 via 192.168.24.4, ospf metric [110/3]
RT: delete subnet route to 6.6.6.6/32
RT: del 8.8.8.8 via 6.6.6.6, bgp metric [200/0]
RT: delete subnet route to 8.8.8.8/32
RT: del 88.88.88.88 via 6.6.6.6, bgp metric [200/0]
RT: delete subnet route to 88.88.88.88/32
RT: updating bgp 8.8.8.8/32 (0x0) :
via 7.7.7.7 0 1048577
RT: updating bgp 88.88.88.88/32 (0x0) :
via 7.7.7.7 0 1048577
OSPF figures out that 6.6.6.6/32 is gone so it gets removed from the routing table. BGP
reconverges and figures out that 7.7.7.7 (PE4) is the new next hop.
PE1#show ip bgp
We see that BGP now uses 7.7.7.7 as the next hop. How does it get resolved?
This is all looking good but it’s a pretty slow process. BGP has to reconverge since the
path through PE3 no longer exists. It will take some time before P2 (the route reflector)
figures out that PE3 is gone and advertises the new path through PE4 (7.7.7.7) to PE1.
Let’s see how we can improve this. Let’s recover PE3 before we continue:
We can configure BGP to pre-install a backup next hop. To do that, we need a second
next hop. That’s a problem right now. P2 is our route reflector and has two paths:
P2#show ip bgp
BGP only advertises the best path so that’s why PE1 only has an entry through PE3 to
get to these prefixes. We can change this behavior with the BGP additional
paths feature. Let’s enable this on P2 and PE1 so that PE1 receives two paths.
P2 has to send additional paths:
P2(config)#router bgp 2
P2(config-router)#address-family ipv4
P2(config-router-af)#bgp additional-paths select all
P2(config-router-af)#neighbor 2.2.2.2 additional-paths send
P2(config-router-af)#neighbor 2.2.2.2 advertise additional-paths all
PE1(config)#router bgp 2
PE1(config-router)#address-family ipv4
PE1(config-router-af)#neighbor 5.5.5.5 additional-paths receive
PE1#show ip bgp
Very nice, we see two paths…one through PE3 and the other one through PE4. Only the
path through PE3 is installed though:
PE1#show ip bgp
BGP table version is 67, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
              x best-external, a additional-path, c RIB-compressed,
              t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*>i 8.8.8.8/32 6.6.6.6 0 100 0 8 i
*bi 7.7.7.7 0 100 0 8 i
*>i 88.88.88.88/32 6.6.6.6 0 100 0 8 i
*bi 7.7.7.7 0 100 0 8 i
Note the b that stands for backup path. You can also see this if you take a closer look at
a single prefix:
This entry gets installed in the FIB right away. Take a look here:
Once the primary next hop (6.6.6.6) fails, we can forward to the backup (7.7.7.7) right
away!
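A minimal sketch of what the pre-installed backup buys us, assuming each FIB entry holds a primary and an optional backup next hop. The FibEntry class, the reachable table and the forward function are hypothetical; the next hops are the ones from this lesson.

```python
class FibEntry:
    """A prefix entry with a primary next hop and an optional backup."""

    def __init__(self, primary, backup=None):
        self.primary = primary
        self.backup = backup


# Hypothetical reachability state of the BGP next hops (PE3 and PE4).
reachable = {"6.6.6.6": True, "7.7.7.7": True}


def forward(entry):
    """Return the next hop to use: the primary, else the pre-installed backup."""
    if reachable.get(entry.primary):
        return entry.primary
    if entry.backup is not None and reachable.get(entry.backup):
        return entry.backup
    return None  # nothing usable until BGP reconverges


fib = {
    "8.8.8.8/32": FibEntry("6.6.6.6", backup="7.7.7.7"),
    "88.88.88.88/32": FibEntry("6.6.6.6", backup="7.7.7.7"),
}

before = forward(fib["8.8.8.8/32"])
reachable["6.6.6.6"] = False  # PE3 fails
after = forward(fib["8.8.8.8/32"])
print(before, "->", after)
```

Without the backup field, the lookup would return nothing after the failure until BGP has advertised and installed the path through PE4.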
Conclusion
You have now learned what BGP PIC (Prefix Independent Convergence) is.
BGP PIC (Prefix Independent Convergence) helps to decrease the data plane
convergence time.
There are two flavors:
o BGP PIC Core:
    o Helps when a core router fails: the BGP next hop does not change, but your
      IGP has to find a new path to it.
    o In a flat FIB, BGP prefixes are installed with the recursed next hop and
      outgoing interface. When the next hop changes, the router has to go
      through all prefixes in the FIB one-by-one and change the next hop.
    o A hierarchical FIB stores the BGP and IGP path lists separately. When an IGP
      next hop changes, you don’t have to re-flatten the entire table. This
      decreases convergence time in the data plane.
o BGP PIC Edge:
    o Helps when a PE router fails and BGP has to find a new next hop.
    o BGP can pre-install a backup next hop for another BGP path in the FIB.
    o When using iBGP and a route reflector, you’ll need to use the BGP additional
      paths feature.
I hope you enjoyed this lesson. If you have any questions feel free to leave a comment!