Cumulus Linux 2.5.5 User Guide
Table of Contents
Welcome to Cumulus Networks
System Management
    Setting Date and Time
    Authentication, Authorization, and Accounting
    Netfilter - ACLs
    Configuring switchd
    Power over Ethernet - PoE
Index
Welcome to Cumulus Networks
This documentation is current as of December 14, 2015 for version 2.5.5. Please visit the Cumulus
Networks Web site for the most up-to-date documentation.
Read the release notes for new features and known issues in this release.
Release Notes for Cumulus Linux 2.5.5 (see page 5)
Quick Start Guide (see page 5)
Installation, Upgrading and Package Management
System Management (see page 56)
Configuring and Managing Network Interfaces
Layer 2 Features (see page 117)
Layer 3 Features (see page 281)
Monitoring and Troubleshooting (see page 358)
Contents
Contents (see page 6)
What's New in Cumulus Linux 2.5.5 (see page 6)
Open Source Contributions (see page 6)
Prerequisites (see page 7)
Hardware Compatibility List (see page 7)
Installing Cumulus Linux (see page 7)
Upgrading Cumulus Linux (see page 8)
Configuring Cumulus Linux (see page 8)
Login Credentials (see page 9)
Serial Console Management (see page 9)
Wired Ethernet Management (see page 9)
Configuring the Hostname and Time Zone (see page 9)
Installing the License (see page 10)
Configuring 4x10G Port Configuration (Splitter Cables) (see page 11)
Testing Cable Connectivity (see page 11)
Configuring Switch Ports (see page 12)
Layer 2 Port Configuration (see page 12)
Layer 3 Port Configuration (see page 13)
Configuring a Loopback Interface (see page 14)
Prerequisites
Prior intermediate Linux knowledge is assumed for this guide. You should be familiar with basic text
editing, Unix file permissions, and process monitoring. A variety of text editors are pre-installed,
including vi and nano.
You must have access to a Linux or UNIX shell. If you are running Windows, you should use a Linux
environment like Cygwin as your command line tool for interacting with Cumulus Linux.
If you're a networking engineer but are unfamiliar with Linux concepts, use this reference
guide to see examples of the Cumulus Linux CLI and configuration options, and their
equivalent Cisco Nexus 3000 NX-OS commands and settings for comparison. You can also
watch a series of short videos introducing you to Linux in general and some Cumulus Linux-
specific concepts in particular.
1. Powering on the switch and entering ONIE, the Open Network Install Environment.
2. Installing Cumulus Linux on the switch via ONIE.
3. Booting into Cumulus Linux and installing the license.
4. Rebooting the switch to activate the switch ports.
5. Configuring switch ports and a loopback interface.
To install Cumulus Linux, you use ONIE (Open Network Install Environment), an extension to the
traditional U-Boot software that allows for automatic discovery of a network installer image. This
facilitates the ecosystem model of procuring switches, with a user's own choice of operating system
loaded, such as Cumulus Linux.
If Cumulus Linux is already installed on your switch, and you need to upgrade the software
only, you can skip to Upgrading Cumulus Linux (see page 8) below.
The easiest way to install Cumulus Linux with ONIE is via local HTTP discovery:
1. If your host (like a laptop or server) is IPv6-enabled, make sure it is running a Web server.
If the host is IPv4-enabled, make sure it is running DHCP as well as a Web server.
2. Download the Cumulus Linux installation file to the root directory of the Web server. Rename
this file onie-installer.
3. Connect your host via Ethernet cable to the management Ethernet port of the switch.
4. Power on the switch. The switch downloads the ONIE image installer and boots it. You can watch
the progress of the install in your terminal. After the installation finishes, the Cumulus Linux
login prompt appears in the terminal window.
These steps describe a flexible unattended installation method. You should not need a
console cable. A fresh install via ONIE using a local Web server should generally complete in
less than 10 minutes.
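Step 2 of the procedure above can be scripted on the host. The following is a minimal sketch rather than a Cumulus Networks tool: the function name stage_onie_installer and both of its arguments are assumptions for illustration, and you would substitute the path of your downloaded image and the document root of whatever Web server you run.

```shell
# Sketch: stage a downloaded installer image for ONIE local HTTP discovery.
# The function name and arguments are hypothetical.
stage_onie_installer() {
    image="$1"      # path to the downloaded Cumulus Linux installation file
    webroot="$2"    # root directory served by your Web server
    [ -f "$image" ] || { echo "no such image: $image" >&2; return 1; }
    [ -d "$webroot" ] || { echo "no such directory: $webroot" >&2; return 1; }
    # The guide says to place the file in the server root, named onie-installer.
    cp "$image" "$webroot/onie-installer"
}
```

For example, stage_onie_installer ./downloaded-image.bin /var/www would leave /var/www/onie-installer in place; both paths here are placeholders.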
You have more options for installing Cumulus Linux with ONIE. Read this knowledge base
article to install Cumulus Linux using ONIE in the following ways:
DHCP/Web server with and without DHCP options
Web server without DHCP
FTP or TFTP without a Web server
Local file
USB
ONIE supports many other discovery mechanisms using USB (copy the installer to the root of the drive),
DHCPv6 and DHCPv4, and image copy methods including HTTP, FTP, and TFTP. For more information
on these discovery methods, refer to the ONIE documentation.
After installing Cumulus Linux, you are ready to:
Log in to Cumulus Linux on the switch.
Install the Cumulus Linux license.
Configure Cumulus Linux. This quick start guide provides instructions on configuring switch
ports and a loopback interface.
Login Credentials
The default installation includes one system account, root, with full system privileges, and one user
account, cumulus, with sudo privileges. The root account password is set to null by default (which
prohibits login), while the cumulus account is configured with this default password:
CumulusLinux!
In this quick start guide, you will use the cumulus account to configure Cumulus Linux.
For best security, you should change the default password (using the passwd command)
before you configure Cumulus Linux on the switch.
All accounts except root are permitted remote SSH login; sudo may be used to grant a non-root
account root-level access. Commands which change the system configuration require this elevated
level of access.
For more information about sudo, read Using sudo to Delegate Privileges (see page 61).
auto eth0
iface eth0
address [Link]/24
gateway [Link]
Then replace the [Link] IP address in /etc/hosts with the new hostname:
To update the time zone, update the /etc/timezone file with the correct timezone, run
dpkg-reconfigure --frontend noninteractive tzdata, then reboot the switch:
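The two commands named above can be wrapped in a small helper. This is a sketch under stated assumptions, not a Cumulus Linux utility: the names set_timezone and run, the DRY_RUN variable, and the example zone US/Eastern are all illustrative. Run it as root, then reboot the switch as described.

```shell
# run: execute a command, or only print it when DRY_RUN is set (hypothetical helper).
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Sketch: write the zone name, then regenerate the time zone data.
set_timezone() {
    zone="$1"                       # for example US/Eastern (illustrative value)
    tzfile="${2:-/etc/timezone}"    # overridable so the function can be tried safely
    echo "$zone" > "$tzfile" || return 1
    run dpkg-reconfigure --frontend noninteractive tzdata
}
```

Setting DRY_RUN=1 and passing a scratch file as the second argument lets you see what would happen without touching the system.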
It is possible to change the hostname without a reboot via a script available on the Cumulus
Networks GitHub site.
user@[Link]|thequickbrownfoxjumpsoverthelazydog312
There are three ways to install the license onto the switch:
Copy it from a local server. Create a text file with the license and copy it to a server accessible
from the switch. On the switch, use the following command to transfer the file directly to the
switch, then install the license file:
Copy the file to an HTTP server (not HTTPS), then reference the URL when you run cl-license:
Copy and paste the license key into the cl-license command:
After the switch reboots, all front panel ports will be active. The front panel ports are identified as
switch ports, and show up as swp1, swp2, and so forth.
Run the following bash script, as root, to administratively enable all physical ports:
cumulus@switch:~$ sudo su -
root@switch:~# for i in /sys/class/net/*; do iface=$(basename "$i"); if [[ $iface == swp* ]]; then ip link set "$iface" up; fi; done
To view link status, use ip link show. The following examples show the output of a port in "admin
down", "down" and "up" mode, respectively:
# Administratively Down
swp1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT qlen 1000
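The three states can be told apart from the flags inside the angle brackets. The sketch below is illustrative only: the function name link_state is hypothetical, and it assumes the usual iproute2 convention that an administratively enabled interface carries the UP flag and an interface with carrier also carries LOWER_UP.

```shell
# Sketch: classify one line of "ip link show" output by its flags.
link_state() {
    # Extract the comma-separated flag list between < and >.
    flags=$(printf '%s\n' "$1" | sed -n 's/.*<\([^>]*\)>.*/\1/p')
    case ",$flags," in
        *,LOWER_UP,*) echo "up" ;;          # admin up and carrier present
        *,UP,*)       echo "down" ;;        # admin up, no carrier
        *)            echo "admin down" ;;  # UP flag absent
    esac
}
```

Passing the "Administratively Down" line shown above to link_state prints "admin down", because its flag list contains neither UP nor LOWER_UP.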
Examples
In the following configuration example, the front panel port swp1 is placed into a bridge called br0:
auto br0
iface br0
bridge-ports swp1
bridge-stp on
To put a range of ports into a bridge, use the glob keyword. For example, add swp1 through swp10,
swp12, and swp14 through swp20 to br0:
auto br0
iface br0
bridge-ports glob swp1-10 swp12 glob swp14-20
bridge-stp on
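To check which ports a bridge-ports line covers before applying it, the ranges can be expanded by hand or with a short helper. This is a sketch, not the parser the system itself uses: the function name expand_bridge_ports is hypothetical, and it simply treats any name ending in N-M as a range.

```shell
# Sketch: expand the arguments of a bridge-ports line into individual port names.
expand_bridge_ports() {
    out=""
    for tok in "$@"; do
        case "$tok" in
            glob) continue ;;                 # keyword only, not a port
            *[0-9]-[0-9]*)
                prefix=${tok%%[0-9]*}         # e.g. "swp"
                range=${tok#"$prefix"}        # e.g. "1-10"
                n=${range%-*}
                last=${range#*-}
                while [ "$n" -le "$last" ]; do
                    out="$out $prefix$n"
                    n=$((n + 1))
                done ;;
            *) out="$out $tok" ;;             # a single port name
        esac
    done
    echo "${out# }"
}
```

Calling expand_bridge_ports glob swp1-10 swp12 glob swp14-20 lists swp1 through swp10, swp12, and swp14 through swp20, with swp11 and swp13 absent.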
A script is available to generate a configuration that places all physical ports in a single bridge.
auto swp1
iface swp1
address [Link]/30
To add an IP address to a bridge interface, include the address under the iface configuration in
/etc/network/interfaces:
auto br0
iface br0
address [Link]/24
To view the changes in the kernel use the ip addr show command:
auto lo
iface lo inet loopback
address [Link]
If an IP address is configured without a mask, as shown above, the IP address becomes a /32.
So, in the above case, [Link] is actually [Link]/32.
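The default-mask rule can be stated as a one-line transformation. The helper below is only an illustration of that rule for IPv4 loopback addresses; the function name with_default_mask and the address 192.0.2.1 are assumptions, not values from this guide.

```shell
# Sketch: show the address as it is effectively configured on the loopback.
with_default_mask() {
    case "$1" in
        */*) echo "$1" ;;       # mask given explicitly: unchanged
        *)   echo "$1/32" ;;    # no mask: treated as a /32
    esac
}
```

So with_default_mask 192.0.2.1 prints 192.0.2.1/32, while an address written with an explicit /24 is left as it is.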
auto lo
iface lo inet loopback
address [Link]
address [Link]/24
Installation, Upgrading and Package Management
Contents
Contents (see page 16)
Commands (see page 16)
Installing a New Cumulus Linux Image (see page 17)
Clean Installation of Cumulus Linux Using ONIE over USB (see page 17)
Installing a New Image when Cumulus Linux Is already Installed (see page 24)
Understanding Image Slots (see page 31)
PowerPC vs x86 vs ARM Switches (see page 32)
PowerPC Image Slots (see page 32)
x86 and ARM Image Slots (see page 33)
Upgrading Cumulus Linux (see page 36)
Reverting an Image to its Original Configuration (PowerPC Only) (see page 37)
Reprovisioning the System (Restart Installer) (see page 37)
Uninstalling All Images and Removing the Configuration (see page 38)
Booting into Rescue Mode (see page 39)
Inspecting Image File Contents (see page 39)
Useful Links (see page 40)
Commands
apt-get
cl-img-install
cl-img-select
cl-img-clear-overlay
cl-img-pkg
ONIE is an open source project, equivalent to PXE on servers, that enables the installation of
network operating systems (NOS) on bare metal switches.
Make sure to back up any important configuration files that you may need to restore the
configuration of your switch after the installation finishes.
It is possible that you could severely damage your system with the following utilities,
so please use caution when performing the actions below!
a. Insert your flash drive into the USB port on the switch running Cumulus Linux and log in
to the switch.
b. Determine and note at which device your flash drive can be found by using output from
cat /proc/partitions and sudo fdisk -l [device]. For example, sudo fdisk
-l /dev/sdb.
These instructions assume your USB drive is the /dev/sdb device, which is
typical if the USB stick was inserted after the machine was already booted.
However, if the USB stick was plugged in during the boot process, it is possible
the device could be /dev/sda. Make sure to modify the commands below to
use the proper device for your USB drive!
e. Format the partition to your filesystem of choice using ONE of the examples below:
f. To continue installing Cumulus Linux, mount the USB drive in order to move files to it.
3. Copy the image and license files over to the flash drive and rename the image file to:
onie-installer-x86_64, if installing on an x86 platform
onie-installer-powerpc, if installing on a PowerPC platform
onie-installer-arm, if installing on an ARM platform
4. Insert the USB stick into the switch, then continue with the appropriate instructions below for
your x86, ARM or PowerPC platform.
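When deciding whether the flash drive is /dev/sdb or /dev/sda, it can help to list only the whole-disk entries from /proc/partitions and compare the list before and after inserting the drive. A minimal sketch follows; the function name list_whole_disks is hypothetical, and it assumes the drive appears under an sd name, as in the examples above.

```shell
# Sketch: print whole-disk sd devices (no partition number) from /proc/partitions.
list_whole_disks() {
    # $1: file in /proc/partitions format; defaults to the live file.
    # The first two lines are the column header and a blank line.
    awk 'NR > 2 && $4 ~ /^sd[a-z]+$/ { print "/dev/" $4 }' "${1:-/proc/partitions}"
}
```

Confirm the result with sudo fdisk -l on the reported device before formatting anything.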
SSH sessions to the switch get dropped after this step. To complete the remaining
instructions, connect to the console of the switch. Cumulus Linux switches display their
boot process to the console, so you need to monitor the console specifically to
complete the next step.
2. Monitor the console and select the ONIE option from the first GRUB screen shown below.
3. Cumulus Linux on x86 uses GRUB chainloading to present a second GRUB menu specific to the
ONIE partition. No action is necessary in this menu to select the default option ONIE: Install OS.
4. At this point, the USB drive should be automatically recognized and mounted. The image file
should be located and automatic installation of Cumulus Linux should begin. Here is some
sample output:
5. After installation completes, the switch automatically reboots into the newly installed instance of
Cumulus Linux.
6. Determine and note at which device your flash drive can be found by using output from cat
/proc/partitions and sudo fdisk -l [device]. For example, sudo fdisk -l /dev/sdb.
These instructions assume your USB drive is the /dev/sdb device, which is typical if
the USB stick was inserted after the machine was already booted. However, if the USB
stick was plugged in during the boot process, it is possible the device could be
/dev/sda. Make sure to modify the commands below to use the proper device for your USB
drive!
10. Check that your license is installed with the cl-license command.
11. Reboot the switch to utilize the new license.
sudo reboot
If the switch is already online in Cumulus Linux, connect to the console and reboot the
switch into the ONIE environment with the sudo cl-img-select -i command,
followed by sudo reboot. Then skip to step 4.
SSH sessions to the switch get dropped after this step. To complete the remaining
instructions, connect to the console of the switch. Cumulus Linux switches display their
boot process to the console, so you need to monitor the console specifically to
complete the next step.
2. Interrupt the normal boot process before the countdown (shown below) completes. Press any
key to stop the autobooting.
3. A command prompt appears, so you can run commands. Execute the following command:
run onie_bootcmd
4. At this point the USB drive should be automatically recognized and mounted. The image file
should be located and automatic installation of Cumulus Linux should begin. Here is some
sample output:
5. After installation completes, the switch automatically reboots into the newly installed instance of
Cumulus Linux.
6. Determine and note at which device your flash drive can be found by using output from cat
/proc/partitions and sudo fdisk -l [device]. For example, sudo fdisk -l /dev/sdb.
These instructions assume your USB drive is the /dev/sdb device, which is typical if
the USB stick was inserted after the machine was already booted. However, if the USB
stick was plugged in during the boot process, it is possible the device could be
/dev/sda. Make sure to modify the commands below to use the proper device for your USB
drive!
10. Check that your license is installed with the cl-license command.
11. Reboot the switch to utilize the new license.
sudo reboot
1. Installing the new image into the alternate image slot (see below (see page 31)).
2. Backing up your configuration files into /mnt/persist.
3. Selecting the alternate slot for next boot (that is, the slot you just installed into).
4. Rebooting the switch.
5. Copying the files from /mnt/persist to the new slot; this happens automatically if you follow
the instructions below.
6. Clearing /mnt/persist out so subsequent reboots don't load /mnt/persist.
Installing a new image overwrites all files — including configuration files — on the target slot.
Cumulus Networks strongly recommends you create a persistent configuration (see page )
to back up your important files, like your configurations.
You can only install into the alternate slot, as it is not possible to install into the actively
running slot. The system automatically determines which slot is the alternate slot (slot 2 in
this case).
This example assumes the new image is located in the current directory (where the user is running the
command from):
/etc/cumulus/switchd.conf    Switchd configuration    Configuring switchd (see page 82)    N/A; please read the guide on switchd configuration
If you are using the root user account, consider including /root/.
If you have custom user accounts, consider including /home/<username>/.
If you are using VXLANs without a controller (see page 226), see this list of files (see page ) to
include in a persistent configuration.
#!/bin/bash
#network configuration files
cp -r --parents /etc/network/ /mnt/persist/
cp --parents /etc/[Link] /mnt/persist/
if [ -f /etc/quagga/[Link] ]; then cp --parents /etc/quagga/[Link] /mnt/persist; fi
cp --parents /etc/quagga/daemons /mnt/persist
cp --parents /etc/hostname /mnt/persist
cp --parents /etc/cumulus/[Link] /mnt/persist
#commonly used files
cp --parents /etc/motd /mnt/persist/
cp --parents /etc/passwd /mnt/persist/
cp --parents /etc/shadow /mnt/persist/
if [ -f /etc/[Link] ]; then cp --parents /etc/[Link] /mnt/persist/; fi
cp -r --parents /etc/lldpd.d/* /mnt/persist/
cp --parents /etc/[Link] /mnt/persist
cp -a --parents /etc/ssh/ /mnt/persist/
To run the script, copy the above into a .sh file (for example, sudo nano [Link]).
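If you maintain your own list of files, the repeated if/cp pairs can be folded into one loop that skips anything absent on a given switch. The following is a sketch of that idea, not a replacement supplied by Cumulus Networks: the function name persist_files is hypothetical, and it uses mkdir -p plus cp -a in place of cp --parents to recreate the original path under the destination.

```shell
# Sketch: copy each existing file or directory under a destination root,
# preserving its full original path (like cp --parents in the script above).
persist_files() {
    dest="$1"; shift
    for f in "$@"; do
        [ -e "$f" ] || continue                        # skip paths that do not exist
        mkdir -p "$dest$(dirname "$f")" || return 1    # recreate the parent path
        cp -a "$f" "$dest$(dirname "$f")/" || return 1
    done
}
```

Run as root, a call such as persist_files /mnt/persist /etc/network /etc/hostname /etc/motd would copy those three paths; extend the argument list to match your configuration.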
|-- ssh_host_rsa_key.pub
`-- sshd_config
cumulus@switch$ reboot
This is an extra reminder to clear out /mnt/persist. A future reboot will cause everything
in /mnt/persist to overwrite the current primary slot.
To identify which slot is active, which slot is the primary, and which slot is alternate, use the
cl-img-select command:
The above switch is currently running 2.5.3 as indicated by the active. When the switch is rebooted, it
will boot into slot 1, as indicated by primary. The alternate slot is running Cumulus Linux 2.5.2 and
won't be booted into unless the user selects it.
cumulus@PPCswitch$ uname -m
ppc
cumulus@leaf1$ uname -m
x86_64
cumulus@leaf1$ uname -m
armv7l
You can also visit the HCL (hardware compatibility list) to look at your hardware to determine the
processor type.
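The three uname -m results shown above map directly onto the platform families used in this chapter. As a small sketch (the function name platform_family is hypothetical, and only the three machine strings shown above are handled):

```shell
# Sketch: map "uname -m" output to the platform family named in this guide.
platform_family() {
    case "${1:-$(uname -m)}" in
        ppc)    echo "PowerPC" ;;
        x86_64) echo "x86" ;;
        armv7l) echo "ARM" ;;
        *)      echo "unknown"; return 1 ;;   # anything else: consult the HCL
    esac
}
```

With no argument the function inspects the machine it runs on; on a PowerPC switch, the PowerPC image slot section applies, otherwise the x86 and ARM section does.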
Files you edit and create reside in the read-write user overlay. This also includes any additional
software you install on top of Cumulus Linux. After an install, the user overlay is empty.
The following table describes the mount points and directories used to create the overlay for image
slots 1 and 2.
Slot Number    R/O squashfs device    R/O mount point    R/W block device    R/W directory
A single read-write partition provides separate read-write directories for the upper part of the
overlay. The lower part of the overlay is a partition, while the upper part is a directory.
/mnt/root-rw      ext2     Contains the read-write user directories for the overlay.
/mnt/persist      ext2     Contains the persistent user configuration applied to each image slot.
/mnt/initramfs    tmpfs    Contains the initramfs used at boot. Needed during shutdown.
1. Check utilization on the root filesystem with the df command. In the following example,
filesystem utilization is 16%:
cumulus@switch$ df -h /
Filesystem                                              Size  Used Avail Use% Mounted on
/dev/disk/by-uuid/64650289-cebf-4849-91ae-a34693fce2f1  4.0G  579M  3.2G  16% /
2. To increase available space in the root filesystem, first use the vgs command to check the
available space in the volume group. In this example, there are 6.34 gigabytes of free space
available in the volume group CUMULUS:
3. Once you confirm the available space, determine the number of the currently active slot using
cl-img-select.
The use of + is very important with the lvresize command. Issuing lvresize without
the + results in the logical volume size being set directly to the specified size, rather
than extended.
5. Once the slot has been extended, use the resize2fs command to expand the filesystem to fit
the new space in the slot. Again, replace the "#" character in the example with the active slot
number.
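Two parts of this procedure lend themselves to a quick check: reading the utilization figure from df, and understanding what the + in the lvresize size argument does. The helpers below are a sketch for illustration only. Both function names are hypothetical, sizes are whole gigabytes, and lv_size_after merely models the arithmetic; it does not call lvresize.

```shell
# Sketch: print the utilization percentage from "df -P" output on stdin.
usage_pct() {
    # In POSIX df output, column 5 of the first data line is the capacity.
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Sketch: size a volume would have after a resize with the given argument.
lv_size_after() {
    current="$1"; arg="$2"
    case "$arg" in
        +*) echo $((${current} + ${arg#+})) ;;   # "+N" extends the volume by N
        *)  echo "$arg" ;;                       # "N" sets the size to exactly N
    esac
}
```

For instance, df -P / | usage_pct prints the number alone, and lv_size_after 4 +2 gives 6 whereas lv_size_after 4 2 gives 2, which is why omitting the + on a 4 GB volume would set it to 2 GB instead of growing it.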
cl-img-install fails while the alternate slot is mounted. It is important to unmount the
alternate slot as shown in step 4 below when done.
cumulus@switch$ cd /
Before you upgrade a PowerPC switch, run df -m and make sure the overlay filesystem
/mnt/root-rw has at least 200 MB of free disk space. See this release note for more details.
While this method doesn't overwrite the target image slot, the disk image does occupy a lot of
disk space used by both Cumulus Linux image slots.
After you successfully upgrade Cumulus Linux, you may notice some results that you
may or may not have expected:
apt-get dist-upgrade always updates the operating system to the most current
version, so if you are currently running Cumulus Linux 2.5.2 and run apt-get
dist-upgrade on that switch, the packages will get upgraded to their 2.5.4 versions.
When you run cl-img-select, the output still shows the version of Cumulus Linux
from the last binary install. So if you installed Cumulus Linux 2.5.3 as a full image
install and then upgraded to 2.5.4 using apt-get dist-upgrade, the output from
cl-img-select still shows version 2.5.3.
Why you should use apt-get dist-upgrade instead of apt-get upgrade
Cumulus Networks recommends you upgrade Cumulus Linux using apt-get dist-upgrade
instead of apt-get upgrade.
This ensures all the packages in the distribution get updated to the current version. apt-get
upgrade may work correctly if no packages are held back by apt. A package can be held back
if one or more of its dependencies has changed, or it can occur for other reasons. For
example, if you see this message when running apt-get upgrade:
It means apt-get upgrade did not install the kernel package. However, apt-get
dist-upgrade would have picked it up. Most applications in Cumulus Linux rely on the correct
kernel version. If an application doesn't get the kernel version it expects, the result may be a
non-functional system.
You can manually install a held back package by running apt-get install on it:
If you must use apt-get upgrade, run it twice. The second time, include the -s or
--dry-run option to verify that all packages were picked up when you upgraded. Otherwise,
you must manually install any held back packages to complete the upgrade.
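The dry run can be filtered so that only the held back package names remain. The sketch below assumes the usual apt-get wording, in which the names follow a line ending in "kept back:" on indented lines; the function name held_back is hypothetical.

```shell
# Sketch: list packages that apt-get reports as kept back (input on stdin).
held_back() {
    awk '
        /kept back:/  { grab = 1; next }                              # start of the list
        grab && /^ /  { for (i = 1; i <= NF; i++) print $i; next }    # indented names
                      { grab = 0 }                                    # list has ended
    '
}
```

A typical use is apt-get -s upgrade | held_back; an empty result suggests nothing was held back, and any name printed can then be installed with apt-get install.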
WARNING:
WARNING: Operating System install requested.
WARNING: This will wipe out all system data.
WARNING:
Are you sure (y/N)? y
Enabling install at next reboot...done.
Reboot required to take effect.
If you change your mind, you can cancel a pending reinstall operation by using
cl-img-select -c:
If you change your mind, you can cancel a pending uninstall operation by using
cl-img-select -c:
If you change your mind, you can cancel a pending rescue boot operation by using
cl-img-select -c:
You can also extract the image files to the current directory with the -e option:
Useful Links
Open Network Install Environment (ONIE) Home Page
Contents
Contents (see page 40)
Commands (see page 41)
Updating the Package Cache (see page 41)
Listing Available Packages (see page 42)
Adding a Package (see page 43)
Listing Installed Packages (see page 44)
Upgrading to Newer Versions of Installed Packages (see page 45)
Upgrading a Single Package (see page 45)
Upgrading All Packages (see page 45)
Adding Packages from Another Repository (see page 45)
Configuration Files (see page 46)
Useful Links (see page 46)
Commands
apt-get
apt-cache
dpkg
Translation-en
Ign [Link] CumulusLinux-2.5/updates Translation-en
Fetched 413 kB in 3s (117 kB/s)
Reading package lists... Done
Section: net
Installed-Size: 984
Maintainer: Noël Köthe <noel@[Link]>
Architecture: powerpc
Version: 3.4.3-2+wheezy1
Depends: libc6 (>= 2.7), libpcap0.8 (>= 0.9.8)
Filename: pool/CumulusLinux-2.5/addons/tcpreplay_3.4.3-2+wheezy1_powerpc.deb
Size: 435904
MD5sum: cf20bec7282ef77a091e79372a29fe1e
SHA1: 8ee1b9b02dacd0c48a474844f4466eb54c7e1568
SHA256: 03dc29057cb608d2ddf08207aedf18d47988ed6c23db0af69d30746768a639ae
SHA512: a411b08e7a7bea62331c527d152533afca735b795f2118507260a5a0c3b6143500df9f6723cff736a1de0969a63e7a7ad0ce8a181ea7dfb36e2330a95d046fb1
Description: Tool to replay saved tcpdump files at arbitrary speeds
Tcpreplay is aimed at testing the performance of a NIDS by
replaying real background network traffic in which to hide
attacks. Tcpreplay allows you to control the speed at which the
traffic is replayed, and can replay arbitrary tcpdump traces. Unlike
programmatically-generated artificial traffic which doesn't
exercise the application/protocol inspection that a NIDS performs,
and doesn't reproduce the real-world anomalies that appear on
production networks (asymmetric routes, traffic bursts/lulls,
fragmentation, retransmissions, etc.), tcpreplay allows for exact
replication of real traffic seen on real networks.
Homepage: [Link]
cumulus@switch:~$
The search commands look for the search terms not only in the package name but also in other
parts of the package information. Consequently, they match more packages than you
might expect.
Adding a Package
In order to add a new package, first ensure the package is not already installed in the system:
If the package is installed already, ensure it’s the version you need. If it’s an older version, then update
the package from the Cumulus Linux repository:
If the package is not already on the system, add it by running apt-get install. This retrieves the
package from the Cumulus Linux repository and installs it on your system together with any other
packages that this package might depend on.
For example, the following adds the package tcpreplay to the system:
anal
ii tcpreplay 3.4.3-2+whee powerpc Tool to replay saved tcpdump file
cumulus@switch:~$
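In a listing like the one above, a package counts as installed when its line begins with the status ii. A script can test for that instead of reading the output by eye. This is a minimal sketch; the function name pkg_installed is hypothetical, and it matches the package name exactly as dpkg prints it in the second column.

```shell
# Sketch: succeed if the named package appears with status "ii" in dpkg -l output.
pkg_installed() {
    # $1: package name; the listing is read from stdin
    awk -v pkg="$1" '$1 == "ii" && $2 == pkg { found = 1 } END { exit !found }'
}
```

For example, dpkg -l | pkg_installed tcpreplay && echo "already installed" prints the message only when the package is present.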
For several packages, Cumulus Networks has added features or made bug fixes and these
packages must not be replaced with versions from other repositories. Cumulus Linux has
been configured to ensure that the packages from the Cumulus Linux repository are always
preferred over packages from other repositories.
If you want to install packages that are not in the Cumulus Linux repository, the procedure is the same
as above with one additional step.
Packages that are not part of the Cumulus Linux repository have generally not been tested, and may
not be supported by Cumulus Linux support.
Installing packages outside of the Cumulus Linux repository requires the use of apt-get, but,
depending on the package, easy-install and other commands can also be used.
To install a new package, please complete the following steps:
1. First, ensure the package is not already installed in the system. Use the dpkg command:
2. If the package is installed already, ensure it's the version you need. If it's an older version, then
update the package from the Cumulus Linux repository:
3. If the package is not on the system, then most likely the package source location is also not in
the /etc/apt/[Link] file. If the source for the new package is not in [Link],
please edit and add the appropriate source to the file. For example, add the following if you
wanted a package from the Debian repository that is not in the Cumulus Linux repository:
To uncomment the repository, remove the # at the start of the line, then save the file:
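If you prefer not to edit the file by hand, the same change can be made with a filter that strips the leading # from the one line you identify. The following is a sketch under stated assumptions: the function name uncomment_repo is hypothetical, and the repository URL in the usage note is a placeholder, not a real source.

```shell
# Sketch: uncomment the line(s) containing a given fixed string.
uncomment_repo() {
    # $1: text that identifies the repository line; the file is read from stdin
    awk -v key="$1" 'index($0, key) { sub(/^#[ \t]*/, "") } { print }'
}
```

Write the result to a scratch file first, for example uncomment_repo repo.example.invalid < your-sources-file > /tmp/sources.new, review it, and only then copy it over the original as root.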
Configuration Files
/etc/apt/[Link]
/etc/apt/preferences
/etc/apt/[Link]
Useful Links
Debian GNU/Linux FAQ, Ch 8 Package management tools
man pages for apt-get, dpkg, [Link], apt_preferences
The standard Cumulus Linux license requires you to page through the license file before
accepting the terms, which can hinder an unattended installation like zero touch provisioning.
To request a license without the EULA, email licensing@[Link].
Contents
Contents (see page 47)
Commands (see page 48)
Zero Touch Provisioning over DHCP (see page 48)
Triggering ZTP over DHCP (see page 48)
Configuring the DHCP Server (see page 48)
Detailed Look at HTTP Headers (see page 49)
Testing and Debugging ZTP Scripts for DHCP (see page 49)
Zero Touch Provisioning Using USB (ZTP-USB) (see page 49)
Testing and Debugging ZTP-USB Scripts (see page 51)
Writing ZTP Scripts (see page 52)
Example ZTP Scripts (see page 52)
(see page 53)
Manually Using the autoprovision Command (see page 54)
Notes (see page 55)
Configuration Files (see page 56)
Commands
autoprovision
1. The first time you boot Cumulus Linux, eth0 is configured for DHCP and makes a DHCP request.
2. The DHCP server offers a lease to the switch.
3. If option 239 is present in the response, the zero touch provisioning process itself will start.
4. The zero touch provisioning process requests the contents of the script from the URL, sending
additional HTTP headers (see page 49) containing details about the switch.
5. The script's contents are parsed to ensure it contains the CUMULUS-AUTOPROVISIONING flag
(see example scripts (see page 53)).
6. The autoprovision command checks its configuration file (see page 56) to see if
autoprovisioning has already occurred and completed.
7. If autoprovision determines that provisioning is necessary, then the script executes locally on
the switch with root privileges.
8. The return code of the script gets examined. If it is 0, then the provisioning state is marked as
complete in the autoprovisioning configuration file.
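Steps 5 through 8 amount to a run-once guard around the script. The sketch below models that logic so you can reason about how your own script will be treated. It is not the autoprovision implementation: the function name ztp_run_once is hypothetical, and a plain marker file stands in for the autoprovisioning configuration file, whose real format is not assumed here.

```shell
# Sketch: model of the flag check, the "already provisioned" check,
# and marking completion only when the script exits with 0.
ztp_run_once() {
    script="$1"    # provisioning script to evaluate
    state="$2"     # hypothetical marker file standing in for the config file
    # Step 5: the script must contain the flag.
    grep -q 'CUMULUS-AUTOPROVISIONING' "$script" || return 2
    # Step 6: do nothing if provisioning has already completed.
    [ -e "$state" ] && return 0
    # Steps 7 and 8: run the script; record completion only on exit code 0.
    sh "$script" && : > "$state"
}
```

A script that lacks the flag is rejected with status 2 in this model, and a script that fails leaves no marker, so it would be attempted again.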
Additionally, the hostname of the switch can be specified via the host-name option:
This feature has been tested only with "thumb" drives, not an actual external large USB hard
drive.
Cumulus Linux supports the use of a FAT32, FAT16, or VFAT-formatted USB drive as an installation
source for ZTP scripts. A daemon called ztp-usb runs by default in Cumulus Linux (you can disable it
by specifying START=no in /etc/default/ztp-usb). You can plug in a USB stick at any time, whether
you are powering up a switch or the switch has been running for some time. This is useful for
performing a full installation of the operating system for cases like fresh installs or disaster recovery.
At minimum, the script should:
Install the Cumulus Linux operating system and license.
Copy over a basic configuration to the switch.
Restart the switch or the relevant services to get switchd up and running with that configuration.
Follow these steps to perform zero touch provisioning using USB:
1. Copy the Cumulus Linux license and installation image (see page 17) to the USB stick.
2. When Cumulus Linux boots, the ztp-usb daemon starts.
3. Every 30 seconds, the ztp-usb daemon looks for unmounted FAT32-, FAT16- or VFAT-formatted
volumes.
4. Each new device detected by the kernel is mounted to /mnt/usb.
5. The daemon searches the root filesystem of the newly mounted device for filenames matching
an ONIE-style waterfall (see the patterns and examples below), looking for the most specific
name first, and ending at the most generic.
6. The script's contents are parsed to ensure it contains the CUMULUS-AUTOPROVISIONING flag
(see example scripts (see page 53)).
7. The autoprovision command checks its configuration file (see page 56) to see if
autoprovisioning has already occurred and completed.
8. If autoprovision determines that provisioning is necessary, then the script executes locally on
the switch with root privileges.
9. The return code of the script gets examined. If it is 0, then the provisioning state is marked as
complete in the autoprovisioning configuration file.
/mnt/usb/cumulus-ztp-powerpc-cel_smallstone-rUNKNOWN
/mnt/usb/cumulus-ztp-powerpc-cel_smallstone
/mnt/usb/cumulus-ztp-cel_smallstone
/mnt/usb/cumulus-ztp-powerpc
/mnt/usb/cumulus-ztp
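The example file names above run from the most specific name to the most generic one. A minimal sketch of how such a candidate list could be built and searched is shown here; the function names and arguments are hypothetical and are not part of the ztp-usb daemon.

```shell
# Print candidate ZTP script names, most specific first.
# Arguments (all illustrative): mount point, CPU architecture,
# vendor_model string, hardware revision.
ztp_waterfall() {
    mnt=$1 arch=$2 model=$3 rev=$4
    printf '%s\n' \
        "$mnt/cumulus-ztp-$arch-$model-r$rev" \
        "$mnt/cumulus-ztp-$arch-$model" \
        "$mnt/cumulus-ztp-$model" \
        "$mnt/cumulus-ztp-$arch" \
        "$mnt/cumulus-ztp"
}

# Print the first candidate that exists as a regular file, if any.
ztp_find_script() {
    ztp_waterfall "$@" | while IFS= read -r candidate; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            break
        fi
    done
}
```

For example, ztp_waterfall /mnt/usb powerpc cel_smallstone UNKNOWN prints the five names listed above in the same order.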
By moving the configuration file to a new location, the autoprovision framework has no
record of previous provisioning successes or failures, which means any new attempt to
autoprovision succeeds.
4. Use debugging mode to run the ztp-usb script.
Remember to include the following line in any of the supported scripts which are expected to
be run via the autoprovisioning framework.
# CUMULUS-AUTOPROVISIONING
This line is required somewhere in the script file in order for execution to occur.
The script must contain the CUMULUS-AUTOPROVISIONING flag. This can be in a comment or remark
and does not need to be echoed or written to stdout.
The script can be written in any language currently supported by Cumulus Linux, such as:
Perl
Python
Ruby
Shell
The script must return an exit code of 0 upon success, as this triggers the autoprovisioning process to
be marked as complete in the autoprovisioning configuration file.
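Before copying a script to a web server or USB stick, it can help to confirm that it carries the required flag. The check below is a sketch only; ztp_script_ok is a hypothetical helper and does not reproduce how the autoprovisioning framework parses scripts.

```shell
# Succeeds only if the file is readable and contains the flag
# anywhere in its text (for example, in a comment line).
ztp_script_ok() {
    [ -r "$1" ] && grep -q 'CUMULUS-AUTOPROVISIONING' "$1"
}
```

Typical use would be: ztp_script_ok ./my-ztp.sh || echo "flag missing", where my-ztp.sh is a placeholder name.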
The following script installs Cumulus Linux and its license from USB and applies a configuration:
#!/bin/bash
function error() {
    echo -e "\e[0;33mERROR: The Zero Touch Provisioning script failed while running the command $BASH_COMMAND at line $BASH_LINENO.\e[0m" >&2
    exit 1
}
# CUMULUS-AUTOPROVISIONING
exit 0
#!/bin/bash
function error() {
    echo -e "\e[0;33mERROR: The Zero Touch Provisioning script failed while running the command $BASH_COMMAND at line $BASH_LINENO.\e[0m" >&2
    exit 1
}
trap error ERR
apt-get update -y
apt-get upgrade -y
apt-get install puppet -y
sed -i /etc/default/puppet -e 's/START=no/START=yes/'
sed -i /etc/puppet/[Link] -e 's/\[main\]/\[main\]\npluginsync=true/'
service puppet restart
# CUMULUS-AUTOPROVISIONING
exit 0
This script illustrates how to specify an internal APT mirror and puppet master:
#!/bin/bash
function error() {
    echo -e "\e[0;33mERROR: The Zero Touch Provisioning script failed while running the command $BASH_COMMAND at line $BASH_LINENO.\e[0m" >&2
    exit 1
}
trap error ERR
sed -i /etc/apt/[Link] -e 's/[Link]/labrepo.[Link]/'
apt-get update -y
apt-get upgrade -y
apt-get install puppet -y
sed -i /etc/default/puppet -e 's/START=no/START=yes/'
sed -i /etc/puppet/[Link] -e 's/\[main\]/\[main\]\npluginsync=true/'
sed -i /etc/puppet/[Link] -e 's/\[main\]/\[main\]\nserver=[Link]/'
service puppet restart
# CUMULUS-AUTOPROVISIONING
exit 0
Now puppet can take over management of the switch: configuring authentication, changing the
default root password, and setting up interfaces and routing protocols.
All forms of ZTP use the autoprovision command on the backend to execute a provided provisioning
script, whether that script is sourced from a URL over the network or locally via a file from a USB drive.
One of the benefits of using the autoprovision command — instead of simply scheduling a cronjob
to run your script — is that autoprovision tracks whether or not a script has already been executed
(and when) in its configuration file /var/lib/cumulus/[Link], ensuring that a switch
that has already been provisioned is not accidentally provisioned again at a later date.
Users with root privileges can interact with the autoprovision command directly using the examples
below.
To enable zero touch provisioning, use the -e option:
To run a provisioning script hosted on a web server, use the -u option and include
the URL to the script:
To run a provisioning script stored on the local filesystem, use the --file or -i
option and include the file location of the script:
To enable startup discovery mode, without relying on DHCP when you boot the switch, use the -s
option:
To force provisioning to occur and ignore the status listed in the configuration file, use the -f option:
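The options described above can be tried safely with a small dry-run wrapper. This is a sketch under assumptions: the wrapper names and the ZTP_DRY_RUN variable are hypothetical, the URL and file path in the usage note are placeholders, and only the option letters come from this section. Whether options can be combined (for example, -f with -u) is not described here, so each wrapper passes a single option.

```shell
# Print the autoprovision command instead of running it unless
# ZTP_DRY_RUN=0 is set (hypothetical variable, default is dry run).
ztp_run() {
    if [ "${ZTP_DRY_RUN:-1}" = "1" ]; then
        echo "autoprovision $*"
    else
        autoprovision "$@"
    fi
}

ztp_enable()    { ztp_run -e; }            # enable zero touch provisioning
ztp_from_url()  { ztp_run -u "$1"; }       # script hosted on a web server
ztp_from_file() { ztp_run --file "$1"; }   # script on the local filesystem
ztp_startup()   { ztp_run -s; }            # startup discovery mode
ztp_force()     { ztp_run -f; }            # ignore recorded provisioning status
```

For example, ztp_from_url http://192.0.2.10/ztp.sh prints the command that would run; set ZTP_DRY_RUN=0 and run as root to execute it.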
Notes
During the development of a provisioning script, the switch may need to be reset.
You can use the Cumulus Linux cl-img-clear-overlay command to revert the image to its
original configuration.
You can use the Cumulus Linux cl-img-select -i command to cause the switch to
reprovision itself and install a network operating system again using ONIE.
Configuration Files
/var/lib/cumulus/[Link]: Stores configuration options and details for the
autoprovisioning framework
/etc/default/ztp-usb: Stores the enable/disable flag for the ztp-usb service
System Management
Contents
Contents (see page 57)
Commands (see page 57)
Setting the Time Zone (see page 57)
Setting the Date and Time (see page 58)
Setting Time Using NTP (see page 59)
Configuration Files (see page 59)
Useful Links (see page 59)
Commands
date
dpkg-reconfigure tzdata
hwclock
ntpd (daemon)
ntpq
Then navigate the menus to enable the time zone you want. The following example selects the
US/Pacific time zone:
Configuring tzdata
------------------
For more info see the Debian System Administrator’s Manual – Time.
Configuration Files
/etc/default/ntp — ntpd init.d configuration variables
/etc/[Link] — default NTP configuration file
/etc/init.d/ntp — ntpd init script
Useful Links
Debian System Administrator’s Manual – Time
[Link]
[Link]
[Link]
Contents
Contents (see page 60)
Access Using Passkey (Basic Setup) (see page 60)
Completely Passwordless System (see page 61)
Useful Links (see page 61)
cumulus@management-station:~$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/cumulus/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/cumulus/.ssh/id_rsa.
Your public key has been saved in /home/cumulus/.ssh/id_rsa.pub.
The key fingerprint is:
[Link] cumulus@management-station
The key's randomart image is:
+--[ RSA 2048]----+
| . .= o o. |
| o . O *.. |
| . o = =.o |
| . O oE |
| + S |
| + |
| |
| |
| |
+-----------------+
Next, append the public key in ~/.ssh/id_rsa.pub into ~/.ssh/authorized_keys in the target
user’s home directory:
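One way to append the key without creating duplicate entries is sketched below. The helper name authorize_key is hypothetical; the directory and file modes (700 and 600) are the permissions OpenSSH conventionally expects. The standard OpenSSH ssh-copy-id utility performs a similar job from the management station.

```shell
# Append a public key to an authorized_keys file unless an identical
# line is already present. Usage: authorize_key PUBKEY_FILE AUTH_FILE
authorize_key() {
    pubkey_file=$1
    auth_file=$2
    # Create the containing directory and file with restrictive modes.
    mkdir -p "$(dirname "$auth_file")" && chmod 700 "$(dirname "$auth_file")"
    touch "$auth_file" && chmod 600 "$auth_file"
    # -x matches whole lines, -F treats the key as a fixed string.
    if ! grep -qxF -- "$(cat "$pubkey_file")" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}
```

On the switch this might be called as authorize_key /tmp/id_rsa.pub ~/.ssh/authorized_keys, assuming the public key was first copied to /tmp.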
Remember, you cannot use the root account to SSH to a switch in Cumulus Linux.
Useful Links
[Link]
User Accounts
By default, Cumulus Linux has two user accounts: cumulus and root.
The cumulus account:
Default password is CumulusLinux!
Is a user account in the sudo group with sudo privileges
User can log in to the system via all the usual channels like console and SSH (see page 60)
The root account:
Password is disabled by default
Has the standard Linux root user access to everything on the switch
Disabled password prohibits login to the switch by SSH, telnet, FTP, and so forth
For best security, you should change the default password (using the passwd command) before you
configure Cumulus Linux on the switch.
You can enable a valid password for the root account using the sudo passwd root command and can
install an SSH key for the root account if needed. Enabling a password for the root account allows the
root user to log in directly to the switch. The Cumulus Linux default root account behavior is consistent
with Debian.
You can add more user accounts as needed. Like the cumulus account, these accounts must use sudo
to execute privileged commands (see page 61), so be sure to include them in the sudo group.
To access the switch without any password, you must boot into single-user mode. Follow the
instructions (see page 364) for doing this on PowerPC and x86 switches.
Contents
Contents (see page 62)
Commands (see page 62)
Using sudo (see page 62)
sudoers Examples (see page 63)
Configuration Files (see page 68)
Useful Links (see page 68)
Commands
sudo
visudo
Using sudo
sudo allows you to execute a command as superuser or another user as specified by the security
policy. See man sudo(8) for details.
The default security policy is sudoers, which is configured using /etc/sudoers. Use /etc/sudoers.d/
to add to the default sudoers policy. See man sudoers(5) for details.
Use visudo only to edit the sudoers file; do not use another editor like vi or emacs. See man
visudo(8) for details.
Errors in the sudoers file can result in losing the ability to elevate privileges to root. You can
fix this issue only by power cycling the switch and booting into single user mode. Before
modifying sudoers, enable the root user by setting a password for the root user.
By default, users in the sudo group can use sudo to execute privileged commands. To add users to the
sudo group, use the useradd(8) or usermod(8) command. To see which users belong to the sudo
group, see /etc/group (man group(5)).
Any command can be run with sudo, including su. A password is required.
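To check sudo group membership from a script, the group file can be parsed directly. This is a sketch; group_members and in_group are hypothetical helpers that read a group(5)-format file and see only supplementary members listed in the fourth field, not users whose primary group is the one queried.

```shell
# Print the members of a group, one per line.
# Usage: group_members GROUP [GROUP_FILE]
group_members() {
    group=$1
    file=${2:-/etc/group}
    awk -F: -v g="$group" \
        '$1 == g { n = split($4, m, ","); for (i = 1; i <= n; i++) print m[i] }' \
        "$file"
}

# Succeed if USER is listed as a member of GROUP.
# Usage: in_group USER GROUP [GROUP_FILE]
in_group() {
    group_members "$2" "${3:-/etc/group}" | grep -qxF -- "$1"
}
```

For example, in_group cumulus sudo && echo "cumulus can use sudo".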
The example below shows how to use sudo as a non-privileged user cumulus to bring up an interface:
sudoers Examples
The following examples show how to grant as few privileges as necessary to a user or group of users
to allow them to perform the required task. Each example uses the system group noc; groups
are prefixed with a % in sudoers.
When executed by an unprivileged user, the example commands below must be prefixed with sudo.
Category | Privilege | Example Command | sudoers Entry
Monitoring | Switch port info | ethtool -m swp1 | %noc ALL=(ALL) NOPASSWD:/sbin/ethtool
Monitoring | System diagnostics | cl-support | %noc ALL=(ALL) NOPASSWD:/usr/cumulus/bin/cl-support
Monitoring | Routing diagnostics | cl-resource-query | %noc ALL=(ALL) NOPASSWD:/usr/cumulus/bin/cl-resource-query
Image management | Install images | cl-img-install [Link]/[Link] | %noc ALL=(ALL) NOPASSWD:/usr/cumulus/bin/cl-img-install
Image management | Swapping slots | cl-img-select 1 | %noc ALL=(ALL) NOPASSWD:/usr/cumulus/bin/cl-img-select
Image management | Clearing an overlay | cl-img-clear-overlay 1 | %noc ALL=(ALL) NOPASSWD:/usr/cumulus/bin/cl-img-clear-overlay
Package management | Install packages | apt-get install mtr-tiny | %noc ALL=(ALL) NOPASSWD:/usr/bin/apt-get install *
Package management | Upgrading | apt-get upgrade |
Netfilter | List iptables rules | iptables -L | %noc ALL=(ALL) NOPASSWD:/sbin/iptables
Interfaces | Up any interface | ifup swp1 | %noc ALL=(ALL) NOPASSWD:/sbin/ifup
Interfaces | Up/down only swp2 | ifup swp2 / ifdown swp2 | %noc ALL=(ALL) NOPASSWD:/sbin/ifup swp2,/sbin/ifdown swp2
Interfaces | Any IP address chg | ip addr {add|del} [Link]/30 dev swp1 | %noc ALL=(ALL) NOPASSWD:/sbin/ip addr *
Ethernet bridging | Add bridges and ints | brctl addbr br0 / brctl addif br0 swp1 | %noc ALL=(ALL) NOPASSWD:/sbin/brctl addbr *,/sbin/brctl addif *
Troubleshooting | Restart switchd | service switchd restart | %noc ALL=(ALL) NOPASSWD:/usr/sbin/service switchd *
Troubleshooting | Restart any service | service switchd cron | %noc ALL=(ALL) NOPASSWD:/usr/sbin/service
Troubleshooting | Packet capture | tcpdump | %noc ALL=(ALL) NOPASSWD:/usr/sbin/tcpdump
L3 | Add static routes | ip route add [Link]/16 via [Link] | %noc ALL=(ALL) NOPASSWD:/bin/ip route add *
L3 | Delete static routes | ip route del [Link]/16 via [Link] | %noc ALL=(ALL) NOPASSWD:/bin/ip route del *
L3 | Any static route chg | ip route * | %noc ALL=(ALL) NOPASSWD:/bin/ip route *
L3 | Any iproute command | ip * | %noc ALL=(ALL) NOPASSWD:/bin/ip
L3 | Non-modal OSPF | cl-ospf area [Link] range [Link]/24 | %noc ALL=(ALL) NOPASSWD:/usr/bin/cl-ospf
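The sudoers entries in these examples share one shape: a %group, the ALL=(ALL) host and run-as specification, the NOPASSWD: tag, and a comma-separated command list. A small sketch for generating such a line is shown here; sudoers_entry is a hypothetical helper, not a Cumulus Linux tool.

```shell
# Build a sudoers line that grants a group passwordless access to the
# listed commands. Usage: sudoers_entry GROUP COMMAND [COMMAND...]
sudoers_entry() {
    grp=$1; shift
    cmds=""
    for c in "$@"; do
        cmds="${cmds:+$cmds,}$c"    # join commands with commas
    done
    printf '%%%s ALL=(ALL) NOPASSWD:%s\n' "$grp" "$cmds"
}
```

The output could be written to a file under /etc/sudoers.d/ (the file name is your choice) and syntax-checked with visudo -c -f FILE before relying on it.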
Configuration Files
/etc/sudoers - default security policy
/etc/sudoers.d/ - additions to the default security policy
Useful Links
sudo
Adding Yourself to sudoers
Contents
Contents (see page 68)
Configuring LDAP (see page 69)
Installing libnss-ldapd (see page 69)
Configuring [Link] (see page 69)
Troubleshooting LDAP Authentication (see page 69)
Common Problems (see page 70)
Configuring LDAP Authorization (see page 70)
A Longer Example (see page 70)
References (see page 70)
Configuring LDAP
There are 3 common ways of configuring LDAP authentication on Linux:
libnss-ldap
libnss-ldapd
libnss-sss
This chapter covers using libnss-ldapd only. From internal testing, this library worked best with
Cumulus Linux and was the easiest to configure, automate and troubleshoot.
Installing libnss-ldapd
To install libnss-ldapd, run:
This brings up an interactive prompt asking questions about the LDAP URI, base domain name and so
on. To pre-fill these details, run apt-get install debconf-utils and populate debconf-set-
selections with the appropriate answers. Run debconf-show <pkg> to check the settings.
Here is an example of how to prefill questions using debconf-set-selections.
For nested group support, libnss-ldapd must be version 0.9 or higher. For Cumulus Linux 2.x,
you can get this from the wheezy-backports repo.
Configuring [Link]
/etc/[Link] is the main configuration file that needs to be changed after the package is
installed. The [Link] man page details all the available configuration options.
Here is an example configuration using Cumulus Linux.
Common Problems
nslcd cannot read the SSL certificate. nslcd reports a “Permission denied” error in the
debug output during server connection negotiation. A sniffer trace shows only a TCP
handshake followed by a TCP FIN from the switch. Check the permissions on each directory in the
path of the root SSL certificate, and ensure that the certificate is readable by the nslcd user.
The FQDN on the LDAP URI does not match the SSL FQDN exactly.
The search filter returns wrong results. Check for typos in the search filter. Use ldapsearch to
test your filter. For example:
# ldapsearch \
-D 'CN=cumulus admin,CN=Users,DC=rtp,DC=example,DC=test' \
-w '1Q2w3e4r!' \
"(&(ObjectClass=user) \
(memberOf=cn=cumuluslnxadm,ou=groups,ou=support,dc=rtp,dc=example,dc=test))"
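For the certificate permission problem listed above, it helps to look at every directory on the way to the certificate, since a single unreadable parent is enough to cause the error. The sketch below assumes a hypothetical certificate path; path_dirs and show_path_perms are illustrative helpers, not part of nslcd.

```shell
# Print each directory on the path to a file, starting with the
# file's own directory and ending at / (or . for relative paths).
path_dirs() {
    dir=$(dirname "$1")
    while :; do
        printf '%s\n' "$dir"
        if [ "$dir" = "/" ] || [ "$dir" = "." ]; then
            break
        fi
        dir=$(dirname "$dir")
    done
}

# Show owner and mode for the file and every directory above it.
show_path_perms() {
    ls -l "$1"
    path_dirs "$1" | while IFS= read -r d; do
        ls -ld "$d"
    done
}
```

For example, show_path_perms /etc/ssl/certs/ca.pem (a placeholder path) lists the certificate and each parent directory. Running sudo -u nslcd test -r on the same file is another quick check, assuming the daemon runs as the nslcd user.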
# This filter says to get all users who are part of the cumuluslnxadm group.
filter passwd (&(Objectclass=user)(!(objectClass=computer))
(memberOf=cn=cumuluslnxadm,ou=groups,ou=support,dc=rtp,dc=example,dc=test))
A Longer Example
A longer, more complete example for configuring LDAP is available on our knowledge base.
References
[Link]
[Link]
[Link]
Netfilter - ACLs
Netfilter is the packet filtering framework in Cumulus Linux, as in every other Linux distribution.
iptables, ip6tables and ebtables are userspace tools in Linux to administer filtering rules for IPv4
packets, IPv6 packets and Ethernet frames respectively. cl-acltool is the userspace tool to
administer filtering rules on Cumulus Linux, and is the only tool for configuring ACLs in Cumulus Linux.
cl-acltool operates on a series of configuration files, and uses iptables, ip6tables and ebtables
to install rules into the kernel. In addition to programming rules in the kernel, cl-acltool programs
rules in hardware for interfaces involving switch port interfaces, which iptables, ip6tables and
ebtables do not do on their own.
Contents
Contents (see page 71)
Commands (see page 71)
Files (see page 72)
Netfilter Framework in the Cumulus Linux Kernel (see page 72)
Limitations on Number of Rules (see page 72)
Enabling Nonatomic Updates (see page 73)
ebtables and Memory Spaces (see page 74)
Memory Spaces with Multiple Commands Line Options (see page 74)
Installing Packet Filtering (ACL) Rules using cl-acltool (see page 75)
Specifying which Policy Files to Install (see page 77)
Managing ACL Rules with cl-acltool (see page 77)
Further Examples (see page 78)
cl-acltool and Network Troubleshooting (see page 78)
Policing Control Plane and Data Plane Traffic (see page 79)
Useful Links (see page 80)
Caveats and Errata (see page 80)
Not All Rules Supported (see page 80)
iptables Interactions with cl-acltool (see page 81)
Where to Assign Rules (see page 81)
Generic Error Message Displayed after ACL Rule Installation Failure (see page 82)
Commands
cl-acltool
ebtables
iptables
ip6tables
Files
/etc/cumulus/acl/[Link]
/etc/cumulus/acl/policy.d/
Firebolt2 Limits
Trident/Trident+ Limits
Trident II Limits
1. Edit /etc/cumulus/[Link].
2. Add the following line to the file:
acl.non_atomic_update_mode = TRUE
3. Restart switchd for the change to take effect.
During nonatomic updates, traffic is stopped first, and enabled after the new configuration is
written into the hardware completely.
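The edit described in the steps above can be scripted so that the setting ends up in the file exactly once. This is a sketch under assumptions: set_conf_option is a hypothetical helper, the file is assumed to already exist and use "key = value" lines, and the key is used as a regular expression, so the dots in it match any character.

```shell
# Ensure "KEY = VALUE" appears exactly once in a config file.
# Usage: set_conf_option CONF_FILE KEY VALUE
set_conf_option() {
    conf=$1 key=$2 value=$3
    tmp=$(mktemp) || return 1
    # Keep every line except existing uncommented assignments of KEY.
    grep -v "^[[:space:]]*$key[[:space:]]*=" "$conf" > "$tmp"
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    # Overwrite in place so the original file's owner and mode are kept.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}
```

Called with the configuration file named in step 1, the key acl.non_atomic_update_mode and the value TRUE, it leaves a single matching line at the end of the file.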
If you set an output flag with the INPUT chain you will get an error. For example, running
cl-acltool -i on the following rule:
However, simply removing the -o option and interface would make it a valid rule.
[iptables]
-A INPUT --in-interface swp1 -p tcp --dport 80 -j ACCEPT
-A FORWARD --in-interface swp1 -p tcp --dport 80 -j ACCEPT
[ip6tables]
-A INPUT --in-interface swp1 -p tcp --dport 80 -j ACCEPT
-A FORWARD --in-interface swp1 -p tcp --dport 80 -j ACCEPT
[ebtables]
-A INPUT -p IPv4 -j ACCEPT
-A FORWARD -p IPv4 -j ACCEPT
Variables can be used to specify chain and interface lists to ease administration of rules:
INGRESS = swp+
INPUT_PORT_CHAIN = INPUT,FORWARD
[iptables]
-A $INPUT_PORT_CHAIN --in-interface $INGRESS -p tcp --dport 80 -j ACCEPT
[ip6tables]
-A $INPUT_PORT_CHAIN --in-interface $INGRESS -p tcp --dport 80 -j ACCEPT
[ebtables]
-A INPUT -p IPv4 -j ACCEPT
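Comparing the two policy examples above suggests that a comma-separated chain variable such as INPUT,FORWARD stands for one rule per chain. The sketch below only illustrates that intended effect; expand_chains is a hypothetical helper and says nothing about how cl-acltool implements variable expansion.

```shell
# Print one "-A CHAIN RULE" line for each chain in a comma-separated list.
# Usage: expand_chains CHAIN_LIST "RULE TEXT"
expand_chains() {
    chains=$1
    rule=$2
    for chain in $(printf '%s' "$chains" | tr ',' ' '); do
        printf '%s %s %s\n' "-A" "$chain" "$rule"
    done
}
```

For example, expand_chains INPUT,FORWARD "--in-interface swp+ -p tcp --dport 80 -j ACCEPT" prints two rules, one for INPUT and one for FORWARD.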
ACL rules for the system can be written into multiple files under the default
/etc/cumulus/acl/policy.d/ directory. During installation, rules are applied following the sorted
order of the file names.
Use multiple files to stack rules. The example below shows two rules files separating rules for
management and datapath traffic:
cumulus@switch:~$ ls /etc/cumulus/acl/policy.d/
00sample_mgmt.rules
01sample_datapath.rules
[iptables]
# protect the switch management
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -s [Link] -d [Link] -p tcp -j ACCEPT
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -s [Link] -d [Link] -p tcp -j ACCEPT
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -d [Link] -p udp -j DROP
[iptables]
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -s [Link] -p icmp -j ACCEPT
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -s [Link] -d [Link] -j DROP
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -s [Link] -d [Link] -j DROP
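Because the file names decide the order in which rules are installed, it can be useful to preview that order before installing. A minimal sketch follows; rules_in_order is a hypothetical helper that relies on the shell sorting glob matches, which may differ from the tool's own sort for names that depend on locale collation. Numeric prefixes such as 00 and 01 avoid that ambiguity.

```shell
# List the .rules files in a policy directory in sorted (glob) order.
# Usage: rules_in_order DIRECTORY
rules_in_order() {
    for f in "$1"/*.rules; do
        # Skip the unexpanded pattern when no file matches.
        if [ -e "$f" ]; then
            basename "$f"
        fi
    done
}
```

For the directory shown above, rules_in_order /etc/cumulus/acl/policy.d prints 00sample_mgmt.rules before 01sample_datapath.rules.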
#
# This file is a master file for acl policy file inclusion
#
# Note: This is not a file where you list acl rules.
#
# This file can contain:
# - include lines with acl policy files
# example:
# include <filepath>
#
# see manpage cl-acltool(5) and cl-acltool(8) for how to write policy files
#
include /etc/cumulus/acl/policy.d/*.rules
include /etc/cumulus/acl/policy.d/01_new.acl
...
To list installed rules using native iptables, ip6tables and ebtables, run these commands:
If the install fails, ACL rules in the kernel and hardware are rolled back to the previous state.
Errors from programming rules in the kernel or BCM hardware are reported.
Further Examples
More examples demonstrating how to use cl-acltool are available in the Help Center.
Counters on POLICE ACL rules in iptables do not currently show the packets that are dropped
due to those rules.
Use the POLICE target with iptables. POLICE takes these arguments:
--set-class value: Sets the system internal class of service queue configuration to value.
--set-rate value: Specifies the maximum rate in kilobytes (KB) or packets.
--set-burst value: Specifies the number of packets or kilobytes (KB) allowed to arrive
sequentially.
--set-mode string: Sets the mode in KB (kilobytes) or pkt (packets) for rate and burst size.
For example, to rate limit the incoming traffic on swp1 to 400 packets/second with a burst of 100
packets and set the class of the queue for the policed traffic to 0, set this rule in the
appropriate .rules file:
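A rule of this kind can be assembled from the POLICE arguments listed above. The sketch below is an assumption-laden illustration: police_rule is a hypothetical helper, and the choice of the INPUT chain and of matching all protocols is an assumption rather than something stated in this section.

```shell
# Print a packet-mode POLICE rule for one ingress interface.
# Usage: police_rule INTERFACE RATE_PKTS BURST_PKTS CLASS
police_rule() {
    printf '%s\n' "-A INPUT --in-interface $1 -j POLICE --set-mode pkt --set-rate $2 --set-burst $3 --set-class $4"
}
```

For example, police_rule swp1 400 100 0 prints a rule that could be placed under the [iptables] section of a .rules file.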
Here is another example of control plane ACL rules to lock down the switch. This is specified in
/etc/cumulus/acl/policy.d/00control_plane.rules:
INGRESS_INTF = swp+
INGRESS_CHAIN = INPUT
INNFWD_CHAIN = INPUT,FORWARD
MARTIAN_SOURCES_4 = "[Link]/5,[Link]/8,[Link]/8,[Link]/32"
MARTIAN_SOURCES_6 = "ff00::/8,::/128,::ffff:[Link]/96,::1/128"
[iptables]
-A $INNFWD_CHAIN --in-interface $INGRESS_INTF -s $MARTIAN_SOURCES_4 -j DROP
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p ospf -j POLICE --set-mode pkt --set-rate 2000 --set-burst 2000 --set-class 7
# Custom policy
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p tcp --dport 22 -s $SSH_SOURCES_4 -j ACCEPT
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p udp --sport 123 -s $NTP_SERVERS_4 -j ACCEPT
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p udp --sport 53 -s $DNS_SERVERS_4 -j ACCEPT
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p udp --dport 161 -s $SNMP_SERVERS_4 -j ACCEPT
# Allow UDP traceroute when we are the current TTL expired hop
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p udp --dport 1024:65535 -m ttl --ttl-eq 1 -j ACCEPT
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -j DROP
Useful Links
[Link]
[Link]
Does work, and the rules appear when you run cl-acltool -L:
However, running cl-acltool -i or rebooting the switch removes them. To ensure that all rules that
can be in hardware are hardware accelerated, place them in /etc/cumulus/acl/[Link] and
run cl-acltool -i.
When using the OUTPUT chain, rules must be assigned to the source. For example, if a rule is
assigned to the switch port in the direction of traffic but the source is a bridge (VLAN), the traffic
is not affected by the rule; the rule must be applied to the bridge instead.
If all transit traffic needs to have a rule applied, use the FORWARD chain, not the OUTPUT chain.
Configuring switchd
switchd is the daemon at the heart of Cumulus Linux. It communicates between the switch and
Cumulus Linux, and all the applications running on Cumulus Linux.
The switchd configuration is stored in /etc/cumulus/[Link].
Versions of Cumulus Linux prior to 2.1 stored the switchd configuration at
/etc/default/switchd.
Contents
Contents (see page 82)
The switchd File System (see page 82)
Configuring switchd Parameters (see page 84)
Restarting switchd (see page 85)
Commands (see page 85)
Configuration Files (see page 85)
| | |-- max
| | `-- max_per_route
| |-- host
| | |-- count
| | |-- count_v4
| | |-- count_v6
| | `-- max
| |-- mac
| | |-- count
| | `-- max
| `-- route
| |-- count_0
| |-- count_1
| |-- count_total
| |-- count_v4
| |-- count_v6
| |-- mask_limit
| |-- max_0
| |-- max_1
| `-- max_total
`-- version
To modify the configuration, run cl-cfg -w. For example, to set the buffer utilization measurement
interval to 1 minute, run:
You can get some of this information by running cl-resource-query, though you cannot
update the switchd configuration with it.
Restarting switchd
Whenever you modify any switchd hardware configuration file (typically changing any *.conf file that
requires making a change to the switching hardware, like /etc/cumulus/datapath/[Link]),
you must restart switchd for the change to take effect:
You do not have to restart the switchd service when you update a network interface
configuration (that is, edit /etc/network/interfaces).
Restarting switchd causes all network ports to reset in addition to resetting the switch
hardware configuration.
Commands
cl-cfg
Configuration Files
/etc/cumulus/[Link]
How It Works
When a powered device is connected to the switch via an Ethernet cable:
If the available power is greater than the power required by the connected device, power is
supplied to the switch port, and the device powers on
If available power is less than the power required by the connected device and the switch port's
priority is less than the port priority set on all powered ports, power is not supplied to the port
If available power is less than the power required by the connected device and the switch port's
priority is greater than the priority of a currently powered port, power is removed from lower
priority port(s) and power is supplied to the port
If the total consumed power exceeds the configured power limit of the power source, low
priority ports are turned off. In the case of a tie, the port with the lower port number gets
priority
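The admission rules above can be summarized as a small decision function. This is a sketch only, not the poed daemon's logic: the function names are hypothetical, wattages are whole numbers, and the numeric ranking of low, high and critical is an assumption made for comparison. Cases the text does not cover (available power exactly equal to required power, or equal priorities) fall through to "deny" here, which is also an assumption.

```shell
# Map a priority name to a number so priorities can be compared.
prio_rank() {
    case "$1" in
        low)      echo 1 ;;
        high)     echo 2 ;;
        critical) echo 3 ;;
        *)        return 1 ;;
    esac
}

# Usage: poe_decide AVAILABLE_W REQUIRED_W PORT_PRIORITY LOWEST_POWERED_PRIORITY
# Prints power, preempt (take power from lower priority ports) or deny.
poe_decide() {
    avail=$1 need=$2
    mine=$(prio_rank "$3") || return 1
    lowest=$(prio_rank "$4") || return 1
    if [ "$avail" -gt "$need" ]; then
        echo power
    elif [ "$mine" -gt "$lowest" ]; then
        echo preempt
    else
        echo deny
    fi
}
```

For example, poe_decide 10 30 critical low prints preempt, because a critical port outranks a powered low-priority port when power is short.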
For the Accton AS4610-54P switch, power is available as follows:
920W x 750W
x 920W 750W
The AS4610-54P has an LED on the front panel to indicate PoE status:
Green: The poed daemon is running and no errors are detected
Yellow: One or more errors are detected or the poed daemon is not running
Configuring PoE
You use the poectl command utility to configure PoE on a switch that supports the feature. You can:
Enable or disable PoE for a given switch port
Set a switch port's PoE priority to one of three values: low, high or critical
By default, PoE is enabled on all Ethernet/1G switch ports, and these ports are set with a low priority.
Switch ports can have low, high or critical priority.
To change the priority for one or more switch ports, run poectl -p swp# [low|high|critical].
For example:
To display PoE information for a set of switch ports, run poectl -i [port_numbers]:
Or to see all the PoE information for a switch, run poectl -s:
cumulus@switch:~$ poectl -s
System power:
Total: 730.0 W
Used: 11.0 W
Available: 719.0 W
Connected ports:
swp11, swp24, swp27, swp48
The set commands (priority, enable, disable) either succeed silently or display an error message if the
command fails.
poectl Arguments
The poectl command takes the following arguments:
Argument | Description
-i, --port-info PORT_LIST | Returns detailed information for the specified ports. For example: -i swp1-swp5,swp10
-p, --priority PORT_LIST PRIORITY | Sets priority for the specified ports: low, high, critical.
-r, --reset PORT_LIST | Performs a hardware reset on the specified ports. Use this if one or more ports are stuck in an error state. This does not reset any configuration settings for the specified ports.
--save | Saves the current configuration. The saved configuration is automatically loaded on system boot.
Man Pages
man poectl
Configuring and Managing Network Interfaces
By default, ifupdown is quiet; use the verbose option -v when you want to know what is
going on when bringing an interface down or up.
Contents
Contents (see page 90)
Commands (see page 90)
Man Pages (see page 91)
Configuration Files (see page 91)
Basic Commands (see page 91)
Bringing All auto Interfaces Up or Down (see page 92)
ifupdown Behavior with Child Interfaces (see page 93)
ifupdown2 Interface Dependencies (see page 94)
ifup Handling of Upper (Parent) Interfaces (see page 97)
Configuring IP Addresses (see page 97)
Purging Existing IP Addresses on an Interface (see page 99)
Specifying User Commands (see page 99)
Sourcing Interface File Snippets (see page 100)
Using Globs for Port Lists (see page 100)
Using Templates (see page 101)
Adding Descriptions to Interfaces (see page 102)
Caveats and Errata (see page 102)
Useful Links (see page 103)
Commands
ifdown
ifquery
ifreload
ifup
mako-render
Man Pages
The following man pages have been updated for ifupdown2:
man ifdown(8)
man ifquery(8)
man ifreload(8)
man ifup(8)
man ifupdown-addons-interfaces(5)
man interfaces(5)
Configuration Files
/etc/network/interfaces
Basic Commands
To bring up an interface or apply changes to an existing interface, run:
A runtime configuration is non-persistent, which means the configuration you create here
does not persist after you reboot the switch.
If you specified manual as the address family, you must bring up that interface manually
using ifconfig. For example, if you configured a bridge like this:
auto bridge01
iface bridge01 inet manual
ifdown always deletes logical interfaces after bringing them down. Use the --admin-state
option if you only want to administratively bring the interface up or down.
To see the link and administrative state, use the ip link show command:
In this example, swp1 is administratively UP and the physical link is UP (LOWER_UP flag). More
information on interface administrative state and physical state can be found in this knowledge base
article.
To reload all network interfaces marked auto, use the ifreload command. It is equivalent to
running ifdown then ifup, with the one difference being that ifreload skips any configurations
that didn't change:
bridge-stp on
For more information on the bridge in traditional mode vs the bridge in VLAN-aware mode, please read
this knowledge base article.
auto bond1
iface bond1
address [Link]/16
bond-slaves swp29 swp30
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
auto bond2
iface bond2
address [Link]/16
bond-slaves swp31 swp32
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
auto br2001
iface br2001
address [Link]/24
bridge-ports bond1.2001 bond2.2001
bridge-stp on
Using ifup --with-depends br2001 brings up all dependents of br2001: bond1.2001, bond2.2001,
bond1, bond2, swp29, swp30, swp31, swp32.
Similarly, specifying ifdown --with-depends br2001 brings down all dependents of br2001:
bond1.2001, bond2.2001, bond1, bond2, swp29, swp30, swp31, swp32.
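The set of dependents in this example can be derived by walking the configuration level by level: the bridge depends on its VLAN subinterfaces, each subinterface on its bond, and each bond on its slave ports. The sketch below hard-codes that example topology; deps_of and list_dependents are hypothetical helpers and do not reflect how ifupdown2 computes dependencies internally.

```shell
# Direct dependents of each interface in the example configuration.
deps_of() {
    case "$1" in
        br2001)     echo "bond1.2001 bond2.2001" ;;
        bond1.2001) echo "bond1" ;;
        bond2.2001) echo "bond2" ;;
        bond1)      echo "swp29 swp30" ;;
        bond2)      echo "swp31 swp32" ;;
    esac
}

# Breadth-first walk that prints every dependent once, nearest first.
list_dependents() {
    queue="$1"
    seen=" $1 "
    out=""
    while [ -n "$queue" ]; do
        cur=${queue%% *}                       # take the first queued name
        case "$queue" in
            *" "*) queue=${queue#* } ;;        # drop it from the queue
            *)     queue="" ;;
        esac
        for dep in $(deps_of "$cur"); do
            case "$seen" in *" $dep "*) continue ;; esac
            seen="$seen$dep "
            out="${out:+$out }$dep"
            queue="${queue:+$queue }$dep"
        done
    done
    echo "$out"
}
```

For example, list_dependents br2001 prints the eight interfaces in the order given above, from the VLAN subinterfaces down to the switch ports.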
As mentioned earlier, ifdown always deletes logical interfaces after bringing them down.
Use the --admin-state option if you only want to administratively bring the interface up or
down. In terms of the above example, ifdown br2001 deletes br2001.
To guide you through which interfaces will be brought down and up, use the --print-dependency
option to get the list of dependents.
Use ifquery --print-dependency=list -a to get the dependency list of all interfaces:
swp30 : None
swp31 : None
swp32 : None
You can use dot to render the graph on an external system where dot is installed.
auto br100
iface br100
bridge-ports bond1.100 bond2.100
auto bond1
iface bond1
bond-slaves swp1 swp2
If you run ifdown bond1, ifdown deletes bond1 and the VLAN interface on bond1 (bond1.100); it also
removes bond1 from the bridge br100. Next, when you run ifup bond1, it creates bond1 and the
VLAN interface on bond1 (bond1.100); it also executes ifup br100 to add the bond VLAN interface
(bond1.100) to the bridge br100.
As you can see above, implicitly bringing up the upper interface helps, but there can be cases where an
upper interface (like br100) is not in the right state, which can result in warnings. The warnings are
mostly harmless.
If you want to disable these warnings, you can disable the implicit upper interface handling by setting
skip_upperifaces=1 in /etc/network/ifupdown2/[Link].
With skip_upperifaces=1, you will have to explicitly execute ifup on the upper interfaces. In this
case, you will have to run ifup br100 after an ifup bond1 to add bond1 back to bridge br100.
Although specifying a subinterface like swp1.100 and then running ifup swp1.100 will also
result in the automatic creation of the swp1 interface in the kernel, Cumulus Networks
recommends you specify the parent interface swp1 as well. A parent interface is one where
any physical layer configuration can reside, such as link-speed 1000 or link-duplex
full.
It's important to note that if you only create swp1.100 and not swp1, then you cannot run
ifup swp1 since you did not specify it.
Configuring IP Addresses
In /etc/network/interfaces, list all IP addresses as shown below under the iface section (see
man interfaces for more information):
auto swp1
iface swp1
address [Link]/30
address [Link]/30
The address method and address family are not mandatory. They default to inet/inet6 and static,
but you must specify inet/inet6 if you need to specify dhcp or loopback:
auto lo
iface lo inet loopback
You can specify both IPv4 and IPv6 addresses in the same iface stanza:
auto swp1
iface swp1
address [Link]/30
address [Link]/30
address [Link]/126
A runtime configuration is non-persistent, which means the configuration you create here
does not persist after you reboot the switch.
See man ip for more details on the options available to manage and query interfaces.
auto swp1
iface swp1
address-purge no
Purging existing addresses on interfaces with multiple iface stanzas is not supported. Doing
so can result in the configuration of multiple addresses for an interface after you change an
interface address and reload the configuration with ifreload -a. If this happens, you must
shut down and restart the interface with ifdown and ifup, or manually delete superfluous
addresses with ip address delete [Link]/mask dev DEVICE. See
also the Caveats and Errata (see page 102) section below for some cautions about using
multiple iface stanzas for the same interface.
auto swp1
iface swp1
address [Link]/30
up /sbin/foo bar
Any valid command can be hooked in the sequencing of bringing an interface up or down, although
commands should be limited in scope to network-related commands associated with the particular
interface.
For example, it wouldn't make sense to install some Debian package on ifup of swp1, even though
that is technically possible. See man interfaces for more details.
source /etc/network/interfaces.d/bond0
auto br0
iface br0
bridge-ports glob swp1-6.100
auto br1
iface br1
bridge-ports glob swp7-9.100 swp11.100 glob swp15-18.100
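To make the glob keyword concrete, the sketch below expands a bridge-ports value into individual port names. It assumes that a pattern such as swp1-6.100 means swp1.100 through swp6.100; the helper names expand_glob and expand_bridge_ports are hypothetical and are not part of ifupdown2.

```python
import re


def expand_glob(pattern):
    """Expand a pattern such as 'swp1-6.100' into individual interface names."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+)-(\d+)(\.\d+)?", pattern)
    if match is None:
        return [pattern]  # not a range; return it unchanged
    prefix, first, last, suffix = match.groups()
    return [f"{prefix}{n}{suffix or ''}" for n in range(int(first), int(last) + 1)]


def expand_bridge_ports(value):
    """Expand a bridge-ports value, treating the word after 'glob' as a range."""
    ports = []
    tokens = value.split()
    i = 0
    while i < len(tokens):
        if tokens[i] == "glob" and i + 1 < len(tokens):
            ports.extend(expand_glob(tokens[i + 1]))
            i += 2
        else:
            ports.append(tokens[i])
            i += 1
    return ports


print(expand_bridge_ports("glob swp7-9.100 swp11.100 glob swp15-18.100"))
```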
Using Templates
ifupdown2 supports Mako-style templates. The Mako template engine is run over the interfaces file
before parsing.
Use the template to declare cookie-cutter bridges in the interfaces file:
%for v in [11,12]:
auto vlan${v}
iface vlan${v}
address 10.20.${v}.3/24
bridge-ports glob swp19-20.${v}
bridge-stp on
%endfor
%for i in [1,12]:
auto swp${i}
iface swp${i}
address 10.20.${i}.3/24
%endfor
Regarding Mako syntax, use square brackets ([1,12]) to specify a list of individual numbers
(in this case, 1 and 12). Use range(1,12) to specify a range of interfaces; as in Python, the
range excludes its upper bound, so range(1,12) covers 1 through 11.
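Because Mako loop expressions are Python, the difference between a list and a range can be checked in a Python shell. The render_stanzas helper below is hypothetical: it imitates the loop body with plain string formatting and does not call the Mako engine.

```python
def render_stanzas(indices):
    """Mimic the %for loop with plain string formatting (this is not Mako itself)."""
    return "\n".join(
        f"auto swp{i}\niface swp{i}\n    address 10.20.{i}.3/24" for i in indices
    )


print(list([1, 12]))       # two interfaces: 1 and 12
print(list(range(1, 12)))  # eleven interfaces: 1 through 11
print(list(range(1, 13)))  # twelve interfaces: 1 through 12
print(render_stanzas([1, 12]))
```

To cover swp1 through swp12 in a template, the upper bound would therefore need to be 13.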
You can test your template and confirm it evaluates correctly by running
mako-render /etc/network/interfaces.
For more examples of configuring Mako templates, read this knowledge base article.
auto swp1
iface swp1
alias swp1 hypervisor_port_1
You can query interface descriptions by running ip link show. The alias appears on the alias line:
Interface descriptions also appear in the SNMP OID (see page 380) IF-MIB::ifAlias.
source /etc/interfaces.d/speed_settings
auto swp1
iface swp1
address [Link]/24
As well as /etc/interfaces.d/speed_settings:
auto swp1
iface swp1
link-speed 1000
link-duplex full
ifupdown2 correctly parses a configuration like this because the same attributes are not specified in
multiple iface stanzas.
And, as stated in the note above, you cannot purge existing addresses on interfaces with multiple
iface stanzas.
Useful Links
[Link]
[Link]
[Link]
[Link]
Contents
Contents (see page 103)
Commands (see page 104)
Man Pages (see page 104)
Configuration Files (see page 104)
Interface Types (see page 104)
Settings (see page 104)
Port Speed and Duplexing (see page 105)
Auto-negotiation (see page 106)
MTU (see page 106)
Configuring Breakout Ports (see page 108)
Breaking out a 40G port into 4x10G Ports (see page 108)
Combining Four 10G Ports into One 40G Port (see page 109)
Logical Switch Port Limitations (see page 110)
Verification and Troubleshooting Commands (see page 111)
Statistics (see page 111)
Querying SFP Port Information (see page 112)
Commands
ethtool
ip
Man Pages
man ethtool
man interfaces
man ip
man ip addr
man ip link
Configuration Files
/etc/network/interfaces
Interface Types
Cumulus Linux exposes network interfaces for several types of physical and logical devices:
lo, network loopback device
ethN, switch management port(s), for out of band management only
swpN, switch front panel ports
(optional) brN, bridges (IEEE 802.1Q VLANs)
(optional) bondN, bonds (IEEE 802.3ad link aggregation trunks, or port channels)
Settings
You can set the MTU, speed, duplex and auto-negotiation settings under a physical or logical interface
stanza:
auto swp1
iface swp1
address [Link]/24
mtu 9000
link-speed 10000
link-duplex full
link-autoneg off
auto swp1
iface swp1
address [Link]/24
link-speed 10000
link-duplex full
If you specify the port speed in /etc/network/interfaces, you must also specify the
duplex mode setting along with it; otherwise, ethtool defaults to half duplex.
You can also configure these settings at run time, using ethtool.
Runtime Configuration (Advanced)
A runtime configuration is non-persistent, which means the configuration you create here
does not persist after you reboot the switch.
You can use ethtool to configure duplexing and the speed for your switch ports. You must specify
both port speed and duplexing in the ethtool command; auto-negotiation is optional. The following
examples use swp1.
To set the port speed to 1G, run:
1G 100 Mb
40G 10G*
Auto-negotiation
You can enable or disable auto-negotiation (that is, set it on or off) on a switch port.
auto swp1
iface swp1
link-autoneg off
A runtime configuration is non-persistent, which means the configuration you create here
does not persist after you reboot the switch.
You can use ethtool to configure auto-negotiation for your switch ports. The following example uses
swp1:
To enable or disable auto-negotiation, run:
MTU
Interface MTU applies to the management port, front panel port, bridge, VLAN subinterfaces and
bonds.
auto swp1
iface swp1
mtu 9000
A runtime configuration is non-persistent, which means the configuration you create here
does not persist after you reboot the switch.
You must take care to ensure there are no MTU mismatches in the conversation path. MTU
mismatches will result in dropped or truncated packets, degrading or blocking network
performance.
When you are configuring MTU for a bridge, don't set MTU on the bridge itself; set it on the individual
members of the bridge. The bridge's MTU is the lowest MTU setting of any interface that is a member of
that bridge (that is, every interface specified in bridge-ports in the bridge configuration in the
interfaces file), even if another bridge member has a higher MTU value. Consider this bridge
configuration:
auto br0
iface br0
bridge-ports bond1 bond2 bond3 bond4 peer5
bridge-vlan-aware yes
bridge-vids 100-110
bridge-stp on
In order for br0 to have an MTU of 9000, set the MTU for each of the member interfaces (bond1 to
bond4, and peer5) to 9000 at minimum.
auto peer5
iface peer5
bond-slaves swp3 swp4
bond-mode 802.3ad
bond-miimon 100
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
mtu 9000
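The lowest-member rule for bridge MTU can be expressed in a few lines. This is an illustrative sketch only: effective_bridge_mtu is a hypothetical helper, and the 1500 value for peer5 is an assumed starting point rather than a value taken from the configuration.

```python
def effective_bridge_mtu(member_mtus):
    """Return the MTU a bridge ends up with: the lowest MTU among its members."""
    if not member_mtus:
        raise ValueError("bridge has no member interfaces")
    return min(member_mtus.values())


# Assumed starting point: four bonds already at 9000, peer5 still at 1500.
members = {"bond1": 9000, "bond2": 9000, "bond3": 9000, "bond4": 9000, "peer5": 1500}
print(effective_bridge_mtu(members))

# After raising peer5, every member is at 9000 and so is the bridge.
members["peer5"] = 9000
print(effective_bridge_mtu(members))
```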
When configuring MTU for a bond, configure the MTU value directly under the bond interface; the
configured value is inherited by member links.
To show MTU, use ip link show:
You configure breakout ports with the /etc/cumulus/[Link] file. After you modify the
configuration, restart switchd to push the new configuration by running sudo service switchd restart.
Restarting switchd interrupts network services (see page 85).
# QSFP+ ports
#
# <port label 49-52> = [4x10G|40G]
49=40G
50=40G
51=40G
52=40G
To change a 40G port to 4x10G ports, edit the /etc/cumulus/[Link] file with a text editor
(nano, vi, zile). Change 40G to 4x10G.
In the following example, switch port 49 is changed to a breakout port:
# QSFP+ ports
#
# <port label 49-52> = [4x10G|40G]
49=4x10G
50=40G
51=40G
52=40G
Many services depend on switchd. It is highly recommended to restart Cumulus Linux if possible in
this situation.
# SFP+ ports
#
# <port label 1-48> = [10G|40G/4]
1=10G
2=10G
3=10G
4=10G
5=10G
To change four 10G ports into one 40G port, edit the /etc/cumulus/[Link] file with a text
editor (nano, vi, zile). Change 10G to 40G/4 for every port being ganged.
In the following example, switch ports swp1-4 are changed to a ganged port:
# SFP+ ports
#
# <port label 1-48> = [10G|40G/4]
1=40G/4
2=40G/4
3=40G/4
4=40G/4
5=10G
Many services depend on switchd. It is highly recommended to restart Cumulus Linux if possible in
this situation.
You must gang four 10G ports in sequential order. For example, you cannot gang
swp1, swp10, swp20 and swp40 together.
The ports must be in increments of four, with the starting port being swp1 (or swp5,
swp9, or so forth); so you cannot gang swp2, swp3, swp4 and swp5 together.
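The two ganging rules above (four sequential ports, starting at swp1, swp5, swp9 and so on) can be checked mechanically. The can_gang helper is hypothetical and takes bare port numbers; it is a sketch of the rules as stated here, not a validator shipped with Cumulus Linux.

```python
def can_gang(ports):
    """Check the two ganging rules for a list of port numbers, e.g. [1, 2, 3, 4]."""
    if len(ports) != 4:
        return False
    ordered = sorted(ports)
    sequential = ordered == list(range(ordered[0], ordered[0] + 4))
    aligned = ordered[0] % 4 == 1  # groups start at swp1, swp5, swp9, ...
    return sequential and aligned


print(can_gang([1, 2, 3, 4]))     # sequential and aligned
print(can_gang([2, 3, 4, 5]))     # sequential but not aligned
print(can_gang([1, 10, 20, 40]))  # not sequential
```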
# [Link] --
#
# This file controls port aggregation and subdivision. For example, QSFP+
# ports are typically configurable as either one 40G interface or four
This means the maximum number of ports for this Dell S6000 is 104.
Statistics
High-level interface statistics are available with the ip -s link command:
HwIfInMcastPkts: 243
HwIfOutOctets: 1148217
HwIfOutUcastPkts: 0
HwIfOutMcastPkts: 11353
HwIfOutBcastPkts: 0
HwIfInDiscards: 0
HwIfInL3Drops: 0
HwIfInBufferDrops: 0
HwIfInAclDrops: 0
HwIfInBlackholeDrops: 0
HwIfInDot3LengthErrors: 0
HwIfInErrors: 0
SoftInErrors: 0
SoftInDrops: 0
SoftInFrameErrors: 0
HwIfOutDiscards: 0
HwIfOutErrors: 0
HwIfOutQDrops: 0
HwIfOutNonQDrops: 0
SoftOutErrors: 0
SoftOutDrops: 0
SoftOutTxFifoFull: 0
HwIfOutQLen: 0
Useful Links
[Link]
[Link]
[Link]
[Link]
Versions of these files prior to Cumulus Linux 2.1 are incompatible with Cumulus Linux 2.1
and later; using older files will cause switchd to fail to start and return an error that it cannot
find the /var/lib/cumulus/[Link] file.
Each packet is assigned to an ASIC Class of Service (CoS) value based on the packet’s priority value
stored in the 802.1p (Class of Service) or DSCP (Differentiated Services Code Point) header field. The
packet is assigned to a priority group based on the CoS value.
Priority groups include:
Control: Highest priority traffic
Service: Second-highest priority traffic
Lossless: Traffic protected by priority flow control
Bulk: All remaining traffic
A lossless traffic group is protected from packet drops by configuring the datapath to use priority
pause. A lossless priority group requires a port group configuration, which specifies the ports
configured for priority flow control and the additional buffer space assigned to each port for packets in
the lossless priority group.
The scheduler is configured to use a hybrid scheduling algorithm. It applies strict priority to control
traffic queues and a weighted round robin selection from the remaining queues. Unicast packets and
multicast packets with the same priority value are assigned to separate queues, which are assigned
equal scheduling weights.
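The hybrid algorithm can be illustrated with a toy simulation. This sketch assumes a single control queue and made-up queue names and weights; the real scheduling is performed by the switch hardware, so treat this as a conceptual model only.

```python
from collections import deque


def drain(control, weighted, weights):
    """Drain queues: control first (strict priority), then weighted round robin.

    control  -- deque of packets in the control queue
    weighted -- dict mapping a queue name to a deque of packets
    weights  -- dict mapping a queue name to packets served per round
    """
    if any(weights[name] < 1 for name in weighted):
        raise ValueError("weights must be positive integers")
    order = []
    while control or any(weighted.values()):
        # Strict priority: control traffic always goes first.
        while control:
            order.append(control.popleft())
        # One weighted round-robin pass over the remaining queues.
        for name, queue in weighted.items():
            for _ in range(weights[name]):
                if not queue:
                    break
                order.append(queue.popleft())
    return order


result = drain(
    deque(["c1", "c2"]),
    {"service": deque(["s1", "s2", "s3"]), "bulk": deque(["b1", "b2", "b3"])},
    {"service": 2, "bulk": 1},  # hypothetical weights
)
print(result)
```

With equal weights, as described above for unicast and multicast queues of the same priority, each pass serves the same number of packets from each queue.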
Datapath configuration takes effect when you initialize switchd. Changes to the [Link] file
require you to restart switchd (see page 85).
Contents
Contents (see page 113)
Commands (see page 114)
Configuration Files (see page 114)
Configuring Traffic Marking through ACL Rules (see page 115)
Configuring Link Pause (see page 116)
Useful Links (see page 117)
Caveats and Errata (see page 117)
Commands
If you modify the configuration in the /etc/cumulus/datapath/[Link] file, you must restart
switchd (see page 85) for the changes to take effect:
Configuration Files
The following configuration applies to 10G and 40G switches only (any switch on the Trident, Trident+,
or Trident II platform).
/etc/cumulus/datapath/[Link]: The default datapath configuration file.
/etc/cumulus/datapath/custom_traffic.conf: An optional customized configuration file.
An example traffic configuration file:
section: traffic
# traffic configurations:
# -- name: an arbitrary label
# -- type: lossless, control, service, or bulk packets
# -- priorities assigned to each group
config_end
Option                   Description
--set-cos INT            Sets the datapath resource/queuing class value. Values are defined in IEEE_P802.1p.
--set-dscp value         Sets the DSCP field in the packet header to a value, which can be either a decimal or hex value.
--set-dscp-class class   Sets the DSCP field in the packet header to the value represented by the DiffServ class value. This class can be EF, BE or any of the CSxx or AFxx classes.
[iptables]
-t mangle -A FORWARD --in-interface swp+ -p tcp --dport bgp -j SETQOS --set-dscp 10 --set-cos 5
[ip6tables]
-t mangle -A FORWARD --in-interface swp+ -j SETQOS --set-dscp 10
You can put the rule in either the mangle table or the default filter table; the mangle table and filter
table are put into separate TCAM slices in the hardware.
To put the rule in the mangle table, include -t mangle; to put the rule in the filter table, omit -t
mangle.
A port group refers to one or more sequences of contiguous ports. Multiple port groups can be defined
by:
Adding a comma-separated list of port group names to the port_group_list.
Adding the port_set, rx_enable, and tx_enable configuration lines for each port group.
You can specify the set of ports in a port group in comma-separated sequences of contiguous ports;
you can see which ports are contiguous in /var/lib/cumulus/porttab. The syntax supports:
A single port (swp1s0 or swp5)
A sequence of regular swp ports (swp2-swp5)
A sequence within a breakout swp port (swp6s0-swp6s3)
...
swp2
swp3
swp4
swp5
swp6s0
swp6s1
swp6s2
swp6s3
swp7
...
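The port set syntax can be illustrated with a small expander. This sketch assumes ports in a sequence are numbered consecutively; on a real switch, contiguity is determined by /var/lib/cumulus/porttab, which this hypothetical expand_port_set helper does not read.

```python
import re


def expand_port_set(port_set):
    """Expand 'swp2-swp5,swp6s0-swp6s3' into a flat list of port names."""
    ports = []
    for item in port_set.split(","):
        item = item.strip()
        if "-" not in item:
            ports.append(item)  # a single port, e.g. swp1s0 or swp5
            continue
        start, end = item.split("-")
        s = re.fullmatch(r"(.*?)(\d+)", start)
        e = re.fullmatch(r"(.*?)(\d+)", end)
        if s is None or e is None or s.group(1) != e.group(1):
            raise ValueError(f"unsupported sequence: {item}")
        for n in range(int(s.group(2)), int(e.group(2)) + 1):
            ports.append(f"{s.group(1)}{n}")
    return ports


print(expand_port_set("swp2-swp5,swp6s0-swp6s3,swp7"))
```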
Restart switchd (see page 85) to allow link pause configuration changes to take effect:
Useful Links
iptables-extensions man page
Layer 2 Features
The STP modes Cumulus Linux supports vary depending upon which bridge driver mode (see
page 154) is in use. For a bridge configured in traditional mode, STP, RSTP, PVST and PVRST
are supported, with the default set to PVRST. VLAN-aware (see page 175) bridges only operate
in RSTP mode.
If a bridge running RSTP (802.1w) receives a common STP (802.1D) BPDU, it will automatically
fall back to 802.1D operation.
You can configure mstpd to be in common STP mode only, by setting setforcevers to STP.
Contents
Contents (see page 118)
Commands (see page 118)
PVST/PVRST (see page 119)
Creating a Bridge and Configuring STP (see page 119)
Configuring Spanning Tree Parameters (see page 121)
Understanding the Spanning Tree Parameters (see page 122)
Bridge Assurance (see page 129)
BPDU Guard (see page 130)
Configuring BPDU Guard (see page 130)
Recovering a Port Disabled by BPDU Guard (see page 130)
BPDU Filter (see page 132)
Configuration Files (see page 133)
Man Pages (see page 133)
Useful Links (see page 133)
Caveats and Errata (see page 133)
Commands
brctl
mstpctl
mstpctl
mstpctl is a utility to configure STP. mstpd is started by default on bootup. mstpd logs and errors are
located in /var/log/syslog.
PVST/PVRST
Per VLAN Spanning Tree (PVST) creates a spanning tree instance for a bridge. Rapid PVST (PVRST)
supports RSTP enhancements for each spanning tree instance. You must create a bridge corresponding
to the untagged native/access VLAN, and all the physical switch ports must be part of the same VLAN.
When connected to a switch that has a native VLAN configuration, the native VLAN must be configured
to be VLAN 1 only.
Cumulus Linux supports the RSTP/PVRST/PVST modes of STP natively when the bridge is configured in
traditional mode (see page 154).
auto br2
iface br2
bridge-ports swp1.101 swp4.101 swp5.101
bridge-stp on
A runtime configuration is non-persistent, which means the configuration you create here
does not persist after you reboot the switch.
You use brctl to create the bridge, add bridge ports to the bridge and configure STP on the bridge.
mstpctl is used only when an admin needs to change the default configuration parameters for STP:
auto br2
iface br2 inet static
bridge-ports swp1 swp2 swp3 swp4
bridge-stp on
mstpctl-maxage 20
mstpctl-ageing 300
mstpctl-fdelay 15
mstpctl-maxhops 20
mstpctl-txholdcount 6
mstpctl-forcevers rstp
mstpctl-treeprio 32768
mstpctl-treeportprio swp3=128
mstpctl-hello 2
mstpctl-portpathcost swp1=0 swp2=0
mstpctl-portadminedge swp1=no swp2=no
mstpctl-portautoedge swp1=yes swp2=yes
mstpctl-portp2p swp1=no swp2=no
mstpctl-portrestrrole swp1=no swp2=no
mstpctl-portrestrtcn swp1=no swp2=no
mstpctl-portnetwork swp1=no
mstpctl-bpduguard swp1=no swp2=no
mstpctl-bpdufilter swp4=yes
A runtime configuration is non-persistent, which means the configuration you create here
does not persist after you reboot the switch.
The mstp daemon is an open source project that some network engineers may be unfamiliar with. For
example, many incumbent vendors use the keyword portfast to describe a port that is automatically
set to forwarding when the port is brought up. The mstpd equivalent is mstpctl-portadminedge. For
more comparison please read this knowledge base article.
Examples are included below:
Parameter Description
maxage Sets the bridge's maximum age to <max_age> seconds. The default is 20.
The maximum age must meet the condition 2 * (Bridge Forward Delay - 1 second)
>= Bridge Max Age.
To set this parameter persistently, configure it under the bridge stanza:
mstpctl-maxage 24
ageing Sets the Ethernet (MAC) address ageing time in <time> seconds for the bridge
when the running version is STP, but not RSTP/MSTP. The default is 300.
To set this parameter persistently, configure it under the bridge stanza:
mstpctl-ageing 240
fdelay Sets the bridge's bridge forward delay to <time> seconds. The default is 15.
The bridge forward delay must meet the condition 2 * (Bridge Forward Delay - 1
second) >= Bridge Max Age.
To set this parameter persistently, configure it under the bridge stanza:
mstpctl-fdelay 15
maxhops Sets the bridge's maximum hops to <max_hops>. The default is 20.
To set this parameter persistently, configure it under the bridge stanza:
mstpctl-maxhops 24
txholdcount Sets the bridge's bridge transmit hold count to <tx_hold_count>. The default is 6.
To set this parameter persistently, configure it under the bridge stanza:
mstpctl-txholdcount 6
forcevers Sets the bridge's force STP version to either RSTP or STP. MSTP is not supported
currently. The default is RSTP.
To set this parameter persistently, configure it under the bridge stanza:
mstpctl-forcevers rstp
treeprio Sets the bridge's tree priority to <priority> for an MSTI instance. The priority
value is a number between 0 and 65535 and must be a multiple of 4096. The
bridge with the lowest priority is elected the root bridge. The default is 32768.
mstpctl-treeprio 8192
treeportprio Sets the priority of port <port> to <priority> for the MSTI instance. The priority
value is a number between 0 and 240 and must be a multiple of 16. The default is
128.
mstpctl-treeportprio swp4.101=64
hello Sets the bridge's bridge hello time to <time> seconds. The default is 2.
To set this parameter persistently, configure it under the bridge stanza:
mstpctl-hello 20
portpathcost Sets the port cost of the port <port> in bridge <bridge> to <cost>. The default is
0.
mstpd supports only long mode; that is, 32 bits for the path cost.
To set this parameter persistently, configure it under the bridge stanza:
mstpctl-portpathcost swp1.101=10
portadminedge Enables/disables the initial edge state of the port <port> in bridge <bridge>. The
default is no.
To set this parameter persistently, configure it under the bridge stanza:
mstpctl-portadminedge swp1.101=yes
portautoedge Enables/disables the auto transition to/from the edge state of the port <port> in
bridge <bridge>. The default is yes.
To set this parameter persistently, configure it under the bridge stanza:
mstpctl-portautoedge swp1.101=no
portp2p Enables/disables the point-to-point detection mode of the port <port> in bridge
<bridge>. The default is auto.
To set this parameter persistently, configure it under the bridge stanza:
mstpctl-portp2p swp1.101=no
portrestrrole Enables/disables the ability of the port <port> in bridge <bridge> to take the root
role. The default is no.
To set this parameter persistently, configure it under the bridge stanza:
mstpctl-portrestrrole swp1.101=no
portrestrtcn Enables/disables the ability of the port <port> in bridge <bridge> to propagate
received topology change notifications. The default is no.
To set this parameter persistently, configure it under the bridge stanza:
mstpctl-portrestrtcn swp1.101=yes
portnetwork Enables/disables the bridge assurance capability for a network port <port> in
bridge <bridge>. The default is no.
To set this parameter persistently, configure it under the bridge stanza:
mstpctl-portnetwork swp4.101=yes
bpduguard Enables/disables the BPDU guard configuration of the port <port> in bridge
<bridge>. The default is no.
To set this parameter persistently, configure it under the bridge stanza:
mstpctl-bpduguard swp1=no
portbpdufilter Enables/disables the BPDU filter functionality for a port <port> in bridge <bridge>.
The default is no.
To set this parameter persistently, configure it under the bridge stanza:
mstpctl-bpdufilter swp4.101=yes
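Several of the parameters in the table carry numeric constraints: the maximum age and forward delay relationship, and the step sizes for tree and port priorities. The hypothetical check_stp_parameters helper below is a sketch of those constraints as stated in the table; it is not part of mstpctl, and its defaults are the defaults listed above.

```python
def check_stp_parameters(maxage=20, fdelay=15, treeprio=32768, portprio=128):
    """Return a list of problems with the given values; an empty list means they pass."""
    problems = []
    if 2 * (fdelay - 1) < maxage:
        problems.append("2 * (fdelay - 1) must be >= maxage")
    if not (0 <= treeprio <= 65535 and treeprio % 4096 == 0):
        problems.append("treeprio must be a multiple of 4096 between 0 and 65535")
    if not (0 <= portprio <= 240 and portprio % 16 == 0):
        problems.append("treeportprio must be a multiple of 16 between 0 and 240")
    return problems


print(check_stp_parameters())                     # the defaults pass
print(check_stp_parameters(maxage=24))            # 2 * (15 - 1) = 28 >= 24, passes
print(check_stp_parameters(maxage=30))            # 28 < 30, fails
print(check_stp_parameters(treeprio=8192, portprio=64))
```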
Bridge Assurance
On a point-to-point link where RSTP is running, if you want to detect unidirectional links and put the
port in a discarding state (in error), you can enable bridge assurance on the port by enabling port type
network. The port would be in a bridge assurance inconsistent state until a BPDU is received from the
peer. You need to configure the port type network on both the ends of the link:
BPDU Guard
To protect the spanning tree topology from unauthorized switches affecting the forwarding path, you
can configure BPDU (Bridge Protocol Data Unit) guard. One very common example is when someone
hooks up a new switch to an access port off of a leaf switch. If this new switch is configured with a low
priority, it could become the new root switch and affect the forwarding path for the entire Layer 2
topology.
auto br2
iface br2 inet static
bridge-ports swp1 swp2 swp3 swp4 swp5 swp6
bridge-stp on
mstpctl-bpduguard swp1=yes swp2=yes swp3=yes swp4=yes
Non-Persistent Configuration
You can also configure BPDU guard on an individual port using a runtime configuration.
Runtime Configuration (Advanced)
A runtime configuration is non-persistent, which means the configuration you create here
does not persist after you reboot the switch.
To determine whether BPDU guard is configured, or if a BPDU has been received, run mstpctl
showportdetail <bridge name>:
The only way to recover a port that has been placed in the disabled state is to manually un-shut or
bring up the port with sudo ifup [port], as shown in the example below:
Bringing up the disabled port does not fix the problem if the configuration on the connected
end-station has not been rectified.
BPDU Filter
You can enable bpdufilter on a switch port, which filters BPDUs in both directions. This effectively
disables STP on the port.
To enable it, add the following to /etc/network/interfaces under the bridge's iface
stanza, as in this example:
auto br100
iface br100
bridge-ports swp1.100 swp2.100
mstpctl-portbpdufilter swp1=yes swp2=yes
A runtime configuration is non-persistent, which means the configuration you create here
does not persist after you reboot the switch.
Configuration Files
/etc/network/interfaces
Man Pages
brctl(8)
bridge-utils-interfaces(5)
ifupdown-addons-interfaces(5)
mstpctl(8)
mstpctl-utils-interfaces(5)
Useful Links
The source code for mstpd/mstpctl was written by Vitalii Demianets and is hosted at the sourceforge
URL below.
[Link]
[Link]
Contents
Contents (see page 134)
Commands (see page 134)
Man Pages (see page 134)
Configuring LLDP (see page 134)
Example lldpcli Commands (see page 135)
Enabling the SNMP Subagent in LLDP (see page 138)
Configuration Files (see page 139)
Useful Links (see page 139)
Caveats and Errata (see page 139)
Commands
lldpd (daemon)
lldpcli (interactive CLI)
Man Pages
man lldpd
man lldpcli
Configuring LLDP
You configure lldpd settings in /etc/[Link] or /etc/lldpd.d/.
Here is an example persistent configuration:
Cumulus Linux
MgmtIP: [Link]
Capability: Router, on
Port:
PortID: ifname swp1
PortDescr: swp1
---------------------------------------------------------------------
Interface: swp2, via: CDPv1, RID: 123, Time: 0 day, [Link]
Chassis:
ChassisID: local T2
SysName: T2
SysDescr: Linux running on
Cumulus Linux
MgmtIP: [Link]
Capability: Router, on
Port:
PortID: ifname swp1
PortDescr: swp1
Interface: swp2
Transmitted: 9423
Received: 6264
Discarded: 0
Unrecognized: 0
Ageout: 0
Inserted: 2
Deleted: 0
---------------------------------------------------------------------
Interface: swp3
Transmitted: 9423
Received: 6265
Discarded: 0
Unrecognized: 0
Ageout: 0
Inserted: 2
Deleted: 0
----------------------------------------------------------------------
... and more (output truncated to fit this document)
Transmit delay: 1
Transmit hold: 4
Receive mode: no
Pattern for management addresses: (none)
Interface pattern: (none)
Interface pattern for chassis ID: (none)
Override description with: (none)
Override platform with: (none)
Advertise version: yes
Disable LLDP-MED inventory: yes
LLDP-MED fast start mechanism: yes
LLDP-MED fast start interval: 1
--------------------------------------------------------------------
A runtime configuration does not persist when you reboot the switch — all changes are lost.
The active interface list always overrides the inactive interface list.
Configuration Files
/etc/[Link]
/etc/lldpd.d
/etc/default/lldpd
Useful Links
[Link]
[Link]
Contents
Contents (see page 139)
Supported Features (see page 140)
Configuring PTM (see page 140)
Configuration Parameters (see page 141)
Supported Features
Topology verification using LLDP. ptmd creates a client connection to the LLDP daemon, lldpd,
and retrieves the neighbor relationship between the nodes/ports in the network and compares
them against the prescribed topology specified in the [Link] file.
Only physical interfaces, like swp1 or eth0, are currently supported. Cumulus Linux does not
support specifying virtual interfaces like bonds or subinterfaces like eth0.200 in the topology
file.
Forwarding path failure detection using Bidirectional Forwarding Detection (BFD); however,
demand mode is not supported. For more information on how BFD operates in Cumulus Linux,
see below (see page 144) and see man ptmd(8).
Integration with Quagga (PTM to Quagga notification).
Client management: ptmd creates an abstract named socket /var/run/[Link] on
startup. Other applications can connect to this socket to receive notifications and send
commands.
Event notifications: see Scripts below.
User configuration via a [Link] file; see below (see page 140).
Configuring PTM
ptmd verifies the physical network topology against a DOT-specified network graph file,
/etc/ptm.d/[Link]. This file must be present or else ptmd will not start. You can specify an alternate file
using the -c option.
At startup, ptmd connects to lldpd, the LLDP daemon, over a Unix socket and retrieves the neighbor
name and port information. It then compares the retrieved port information with the configuration
information that it read from the topology file. If there is a match, then it is a PASS, else it is a FAIL.
PTM performs its LLDP neighbor check using the PortID ifname TLV information. Previously, it
used the PortID port description TLV information.
graph G {
node [shape=record];
graph [hostnametype="hostname", version="1:0", date="04/12/2013"];
edge [dir=none, len=1, headport=center, tailport=center];
//R1's connections - R1 is top-tier spine
"R1":"swp1" -- "R3":"swp3";
"R1":"swp2" -- "R4":"swp3";
}
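The PASS/FAIL comparison can be pictured as a per-port lookup. The sketch below uses R1's two connections from the graph above as the expected topology; the discovered dictionary is made-up sample data standing in for what lldpd would report, and check_topology is a hypothetical helper, not a ptmd interface.

```python
def check_topology(expected, discovered):
    """Compare expected neighbors with LLDP-discovered ones, port by port.

    Both arguments map a local port to a (neighbor_host, neighbor_port) tuple.
    """
    return {
        port: "PASS" if discovered.get(port) == neighbor else "FAIL"
        for port, neighbor in expected.items()
    }


# Expected neighbors of R1, taken from the graph file.
expected = {"swp1": ("R3", "swp3"), "swp2": ("R4", "swp3")}
# Sample LLDP data (invented): swp2 is cabled to the wrong port on R4.
discovered = {"swp1": ("R3", "swp3"), "swp2": ("R4", "swp5")}

print(check_topology(expected, discovered))
```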
It’s a good idea to always wrap the hostname in double quotes, like “[Link]”.
Otherwise, ptmd can fail if you specify a fully-qualified domain name as the hostname and do
not wrap it in double quotes.
Configuration Parameters
You can configure ptmd parameters in the topology file. The parameters are classified as host-only,
global, per-port/node and templates.
Host-only Parameters
Host-only parameters apply to the entire host on which PTM is running. You can include the
hostnametype host-only parameter, which specifies whether PTM should use only the host name
(hostname) or the fully-qualified domain name (fqdn) while looking for the self-node in the graph file.
For example, in the graph file below, PTM will ignore the FQDN and only look for switch04, since that is
the host name of the switch it's running on:
graph G {
hostnametype="hostname"
BFD="upMinTx=150,requiredMinRx=250"
"cumulus":swp44 -- "[Link]":swp20
"cumulus":swp46 -- "[Link]":swp22
}
However, in this next example, PTM will compare using the FQDN and look for switch05.
[Link], which is the FQDN of the switch it’s running on:
graph G {
hostnametype="fqdn"
"cumulus":swp44 -- "[Link]":swp20
"cumulus":swp46 -- "[Link]":swp22
}
Global Parameters
Global parameters apply to every port listed in the topology file. There are two global parameters: LLDP
and BFD. LLDP is enabled by default; if no keyword is present, default values are used for all ports.
However, BFD is disabled if no keyword is present, unless there is a per-port override configured. For
example:
graph G {
LLDP=""
BFD="upMinTx=150,requiredMinRx=250,afi=both"
"cumulus":swp44 -- "qct-ly2-04":swp20
"cumulus":swp46 -- "qct-ly2-04":swp22
}
Per-port Parameters
Per-port parameters provide finer-grained control at the port level. These parameters override any
global or compiled defaults. For example:
graph G {
LLDP=""
BFD="upMinTx=300,requiredMinRx=100"
"cumulus":swp44 -- "qct-ly2-04":swp20 [BFD="upMinTx=150,requiredMinRx=250,afi=both"]
"cumulus":swp46 -- "qct-ly2-04":swp22
}
Templates
Templates provide flexibility in choosing different parameter combinations and applying them to a
given port. A template instructs ptmd to reference a named parameter string instead of a default one.
There are two parameter strings ptmd supports:
bfdtmpl, which specifies a custom parameter tuple for BFD.
lldptmpl, which specifies a custom parameter tuple for LLDP.
For example:
graph G {
LLDP=""
BFD="upMinTx=300,requiredMinRx=100"
BFD1="upMinTx=200,requiredMinRx=200"
BFD2="upMinTx=100,requiredMinRx=300"
LLDP1="match_type=ifname"
LLDP2="match_type=portdescr"
"cumulus":swp44 -- "qct-ly2-04":swp20 [BFD="bfdtmpl=BFD1", LLDP="lldptmpl=LLDP1"]
"cumulus":swp46 -- "qct-ly2-04":swp22 [BFD="bfdtmpl=BFD2", LLDP="lldptmpl=LLDP2"]
"cumulus":swp46 -- "qct-ly2-04":swp22
}
In this example, LLDP1 and LLDP2 are templates for LLDP parameters, while BFD1 and BFD2 are
templates for BFD parameters.
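The way global values, per-port overrides and templates combine can be sketched in Python. This is a simplified model under stated assumptions, not ptmd's parser: the function names are hypothetical, and the sketch assumes that a per-port string overrides the global string key by key and that bfdtmpl simply substitutes the named template string.

```python
def parse_params(text):
    """Turn "a=1,b=2" into {"a": "1", "b": "2"}; an empty string gives {}."""
    params = {}
    for item in text.split(","):
        key, sep, value = item.partition("=")
        if sep:
            params[key.strip()] = value.strip()
    return params


def effective_bfd(global_bfd, port_bfd, templates):
    """Resolve the BFD settings for one port.

    global_bfd: the graph-level BFD string, or None when BFD is not set.
    port_bfd:   the per-port BFD string, or None when the port has none.
    templates:  maps a template name (for example "BFD1") to its string.
    Returns a dict of settings, or None when BFD stays disabled.
    """
    if global_bfd is None and port_bfd is None:
        return None  # no keyword and no per-port override: BFD is off
    settings = parse_params(global_bfd or "")
    port = parse_params(port_bfd or "")
    if "bfdtmpl" in port:
        # Assumption: a template reference replaces the per-port string.
        port = parse_params(templates[port["bfdtmpl"]])
    settings.update(port)  # assumption: per-port keys win over global keys
    return settings


templates = {"BFD1": "upMinTx=200,requiredMinRx=200"}
print(effective_bfd("upMinTx=300,requiredMinRx=100", "bfdtmpl=BFD1", templates))
```

The same model covers LLDP templates if you pass the LLDP strings and look up lldptmpl instead of bfdtmpl.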
graph G {
"cumulus-1":swp44 -- "cumulus-2":swp20 [BFD="upMinTx=300,requiredMinRx=100,afi=v6"]
"cumulus-1":swp46 -- "cumulus-2":swp22 [BFD="detectMult=4"]
}
graph G {
"cumulus-1":swp44 -- "cumulus-2":swp20 [LLDP="match_hostname=fqdn"]
"cumulus-1":swp46 -- "cumulus-2":swp22 [LLDP="match_type=portdescr"]
}
When you specify match_hostname=fqdn, ptmd will match the entire FQDN, like cumulus-2.
[Link] in the example below. If you do not specify anything for match_hostname, ptmd
will match based on hostname only, like cumulus-3 below, and ignore the rest of the domain name:
graph G {
"cumulus-1":swp44 -- "[Link]":swp20 [LLDP="match_hostname=fqdn"]
"cumulus-1":swp46 -- "cumulus-3":swp22 [LLDP="match_type=portdescr"]
}
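The difference between the two match_hostname behaviors can be shown with a small Python sketch. It is a rough model only: the function name names_match is hypothetical, the domain example.com is made up, and trimming both names at the first dot is an assumption about how a hostname-only comparison could work.

```python
def names_match(expected, advertised, match_hostname="hostname"):
    """Compare a topology-file node name with an LLDP system name."""
    if match_hostname == "fqdn":
        # FQDN mode: the whole name has to be identical.
        return expected == advertised
    # Hostname-only mode: keep the label before the first dot on each side.
    return expected.split(".", 1)[0] == advertised.split(".", 1)[0]


# Hostname-only: the domain part of the advertised name is ignored.
print(names_match("cumulus-3", "cumulus-3.example.com"))
# FQDN: a bare hostname no longer matches the advertised FQDN.
print(names_match("cumulus-2", "cumulus-2.example.com", "fqdn"))
```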
BFD requires an IP address for any interface on which it is configured. The neighbor IP
address for a single hop BFD session must be in the ARP table before BFD can start sending
control packets.
You cannot specify BFD multihop sessions in the [Link] file since you cannot specify
the source and destination IP address pairs in that file. Use Quagga (see page 293) to
configure multihop sessions.
Configuring BFD
You can configure BFD in one of two ways: by specifying the configuration in the [Link] file, or by using
Quagga (see page 339). However, the topology file has some limitations:
The [Link] file supports creating BFD IPv4 and IPv6 single hop sessions only; you
cannot specify IPv4 or IPv6 multihop sessions in the topology file.
The topology file supports BFD sessions for only link-local IPv6 peers; BFD sessions for global
IPv6 peers discovered on the link will not be created.
Echo Function
Cumulus Linux supports the echo function for IPv4 single hops only, and with the asynchronous
operating mode only (Cumulus Linux does not support demand mode).
You use the echo function primarily to test the forwarding path on a remote system. To enable the
echo function, set echoSupport to 1 in the topology file.
Once the echo packets are looped by the remote system, the BFD control packets can be sent at a
much lower rate. You configure this lower rate by setting the slowMinTx parameter in the topology file
to a non-zero value of milliseconds.
You can use more aggressive detection times for echo packets since the round-trip time is reduced
because they are accessing the forwarding path. You configure the detection interval by setting the
echoMinRx parameter in the topology file to a non-zero value of milliseconds; the minimum setting is
50 milliseconds. Once configured, BFD control packets are sent out at this required minimum echo Rx
interval. This indicates to the peer that the local system can loop back the echo packets. Echo packets
are transmitted if the peer supports receiving echo packets.
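The constraints on the echo parameters can be collected into a small check. The Python sketch below is an illustration, not part of ptmd: the function name and the dictionary of integer settings are assumptions, and it only checks values that are actually present.

```python
ECHO_MIN_RX_FLOOR_MS = 50  # minimum echoMinRx stated in the text


def check_echo_settings(params):
    """Return a list of problems found in a dict of BFD settings (ints)."""
    problems = []
    if params.get("echoSupport", 0) != 1:
        return problems  # echo function not enabled; nothing to check
    if "slowMinTx" in params and params["slowMinTx"] <= 0:
        problems.append("slowMinTx must be a non-zero number of milliseconds")
    if "echoMinRx" in params and params["echoMinRx"] < ECHO_MIN_RX_FLOOR_MS:
        problems.append("echoMinRx must be at least 50 milliseconds")
    return problems


# Hypothetical values: echoMinRx is below the documented minimum.
print(check_echo_settings({"echoSupport": 1, "echoMinRx": 40, "slowMinTx": 2000}))
```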
[Figure: BFD echo packet format, showing the Version, Length and My Discriminator fields]
Where:
Version is the version of the BFD echo packet.
Length is the length of the BFD echo packet.
My Discriminator is a non-zero value that uniquely identifies a BFD session on the transmitting
side. When the originating node receives the packet after being looped back by the receiving
system, this value uniquely identifies the BFD session.
Scripts
ptmd executes scripts at /etc/ptm.d/if-topo-pass and /etc/ptm.d/if-topo-fail for each
interface that goes through a change, running if-topo-pass when an LLDP or BFD check passes and
running if-topo-fail when the check fails. The scripts receive an argument string that is the result
of the ptmctl command, described in ptmd Commands below.
You should modify these default scripts as needed.
You only need to do this to check link state; you don't need to enable PTM to determine BFD
status.
quagga# conf t
quagga(config)# ptm-enable
quagga(config)#
quagga# conf t
quagga(config)# no ptm-enable
quagga(config)#
When the ptm-enable flag is configured by the user, the zebra daemon connects to ptmd over a Unix
socket. Any time there is a change of status for an interface, ptmd sends notifications to zebra. Zebra
maintains a ptm-status flag per interface and evaluates routing adjacency based on this flag. To
check the per-interface ptm-status:
ptmctl Examples
For basic output, use ptmctl without any options:
-------------------------------------------------------------
port cbl BFD BFD BFD BFD
status status peer local type
-------------------------------------------------------------
swp1 pass pass [Link] N/A singlehop
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
port   cbl     exp      act      sysname  portID  portDescr  match   last    BFD   BFD    BFD   BFD       det_mult  tx_timeout  rx_timeout  echo_tx_timeout  echo_rx_timeout  max_hop_cnt
       status  nbr      nbr                                  on      upd     Type  state  peer  DownDiag
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
swp45  pass    h1:swp1  h1:swp1  h1       swp1    swp1       IfName  5m: 5s  N/A   N/A    N/A   N/A       N/A       N/A         N/A         N/A              N/A              N/A
swp46  fail    h2:swp1  h2:swp1  h2       swp1    swp1       IfName  5m: 5s  N/A   N/A    N/A   N/A       N/A       N/A         N/A         N/A              N/A              N/A
To return information on active BFD sessions ptmd is tracking, use the -b option:
----------------------------------------------------------
port peer state local type diag
----------------------------------------------------------
swp1 [Link] Up N/A singlehop N/A
N/A [Link] Up [Link] multihop N/A
To return LLDP information, use the -l option. It returns only the active neighbors currently being
tracked by ptmd.
---------------------------------------------
port sysname portID port match last
descr on upd
---------------------------------------------
swp45 h1 swp1 swp1 IfName 5m:59s
swp46 h2 swp1 swp1 IfName 5m:59s
To return detailed information on active BFD sessions ptmd is tracking, use the -b and -d options
(results are for an IPv6-connected peer):
----------------------------------------------------------------------------------------------------------------------
port  peer  state  local  type  diag  det  tx_timeout  rx_timeout  echo  echo  max  rx_ctrl  tx_ctrl  rx_echo  tx_echo
Unsupported command
For example:
If you encounter errors with the [Link] file, you can use dot (included in the Graphviz
package) to validate the syntax of the topology file.
Configuration Files
/etc/ptm.d/[Link]
/etc/ptm.d/if-topo-pass
/etc/ptm.d/if-topo-fail
Useful Links
Bidirectional Forwarding Detection (BFD)
Graphviz
LLDP on Wikipedia
PTMd GitHub repo
Contents
Contents (see page 151)
Example: Bonding 4 Slaves (see page 151)
Hash Distribution (see page 154)
Configuration Files (see page 154)
Useful Links (see page 154)
Caveats and Errata (see page 154)
In this example, front panel port interfaces swp1-swp4 are slaves in bond0 (swp5 and swp6 are not
part of bond0). The name of the bond is arbitrary as long as it follows Linux interface naming
guidelines, and is unique within the switch. The only bonding mode supported in Cumulus Linux is
802.3ad. There are several 802.3ad settings that can be applied to each bond:
auto bond0
iface bond0
address [Link]/30
bond-slaves swp1 swp2 swp3 swp4
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
However, if you are intending that the bond become part of a bridge, you don't need to specify an IP
address. The configuration would look like this:
auto bond0
iface bond0
bond-slaves glob swp1-4
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
All slave interfaces within a bond will have the same MAC address as the bond. Typically, the
first slave added to the bond donates its MAC address for the bond. The other slaves’ MAC
addresses are set to the bond MAC address. The bond MAC address is used as source MAC
address for all traffic leaving the bond, and provides a single destination MAC address to
address traffic to the bond.
Hash Distribution
Egress traffic through a bond is distributed to a slave based on a packet hash calculation. This
distribution provides load balancing over the slaves. The hash calculation uses packet header data to
pick which slave to transmit the packet. For IP traffic, IP header source and destination fields are used
in the calculation. For IP + TCP/UDP traffic, source and destination ports are included in the hash
calculation. Traffic for a given conversation flow will always hash to the same slave. Many flows will be
distributed over all the slaves to load balance the total traffic. In a failover event, the hash calculation is
adjusted to steer traffic over available slaves.
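A hash-based choice of slave can be sketched in a few lines of Python. This is a conceptual model only: the real calculation is done by the bonding driver and switch hardware, the CRC32 function is a stand-in chosen for illustration, and the function name and addresses are hypothetical.

```python
import zlib


def pick_slave(slaves, src_ip, dst_ip, src_port=None, dst_port=None):
    """Pick the slave that carries a flow, based on its header fields.

    slaves is the list of slaves that are currently available. The same
    header values always map to the same slave while the list is unchanged.
    """
    if not slaves:
        raise ValueError("no slaves available")
    # Stand-in hash: CRC32 over the addresses and (if present) the L4 ports.
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return slaves[zlib.crc32(key) % len(slaves)]


slaves = ["swp1", "swp2", "swp3", "swp4"]
flow = ("192.0.2.10", "192.0.2.20", 49152, 80)  # made-up flow
print(pick_slave(slaves, *flow))

# Failover: swp3 is removed, and the flow is hashed over the remaining slaves.
print(pick_slave([s for s in slaves if s != "swp3"], *flow))
```

Because the result depends only on the header fields and the list of available slaves, all packets of one conversation take the same link, while different conversations spread across the bond.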
Configuration Files
/etc/network/interfaces
Useful Links
[Link]
802.3ad (Accessible writeup)
Link aggregation from Wikipedia
You can configure both VLAN-aware and traditional mode bridges on the same network in
Cumulus Linux; however you should not have more than one VLAN-aware bridge on a given
switch. If you are implementing VXLANs (see page 226), you must use traditional bridge
mode.
Contents
Contents (see page 155)
Configuration Files (see page 155)
Commands (see page 155)
Creating a Bridge between Physical Interfaces (see page 155)
Creating the Bridge and Adding Interfaces (see page 156)
Showing and Verifying the Bridge Configuration (see page 157)
Examining MAC Addresses (see page 158)
Multiple Bridges (see page 159)
Configuring an SVI (Switch VLAN Interface) (see page 161)
Showing and Verifying the Bridge Configuration (see page 163)
Using Trunks in Traditional Bridging Mode (see page 164)
Trunk Example (see page 165)
Showing and Verifying the Trunk (see page 166)
Additional Examples (see page 166)
Configuration Files (see page 166)
Useful Links (see page 167)
Caveats and Errata (see page 167)
Configuration Files
/etc/network/interfaces
Commands
brctl
bridge
ip addr
ip link
auto my_bridge
iface my_bridge
bridge-ports bond0 swp5 swp6
bridge-ageing 150
bridge-stp on
Keyword        Explanation
bridge-ports   List of logical and physical ports belonging to the logical bridge.
bridge-ageing  Maximum amount of time before a MAC address learned on the bridge expires from
               the bridge MAC cache. The default value is 300 seconds.
bridge-stp     Enables spanning tree protocol on this bridge. The default spanning tree mode is Per
               VLAN Rapid Spanning Tree Protocol (PVRST).
For more information on spanning-tree configurations see the configuration section:
Spanning Tree and Rapid Spanning Tree (see page 118).
A runtime configuration is non-persistent, which means the configuration you create here
does not persist after you reboot the switch.
Do not try to bridge the management port, eth0, with any switch ports (like swp0, swp1, and
so forth). For example, if you created a bridge with eth0 and swp1, it will not work.
You can use the bridge fdb command to display the MAC address table as well:
You can clear a MAC address from the table using the bridge fdb command:
Multiple Bridges
Sometimes it is useful to logically divide a switch into multiple layer 2 domains, so that hosts in one
domain can communicate with other hosts in the same domain but not in other domains. You can
achieve this by configuring multiple bridges and putting different sets of interfaces in the different
bridges. In the following example, host-1 and host-2 are connected to the same bridge (bridge-A), while
host-3 and host-4 are connected to another bridge (bridge-B). host-1 and host-2 can communicate with
each other, as can host-3 and host-4, but host-1 and host-2 cannot communicate with host-3 or host-4.
auto bridge-A
iface bridge-A
bridge-ports swp1 swp2
bridge-stp on
auto bridge-B
iface bridge-B
bridge-ports swp3 swp4
bridge-stp on
To bring up the bridges bridge-A and bridge-B, use the ifreload command:
A runtime configuration is non-persistent, which means the configuration you create here
does not persist after you reboot the switch.
When an interface is added to a bridge, it ceases to function as a router interface, and the IP
address on the interface, if any, becomes unreachable.
The configuration for the two bridges example looks like the following:
auto swp5
iface swp5
address [Link]/24
address [Link]/64
auto bridge-A
iface bridge-A
address [Link]/24
address [Link]/64
bridge-ports swp1 swp2
bridge-stp on
auto bridge-B
iface bridge-B
address [Link]/24
address [Link]/64
bridge-ports swp3 swp4
bridge-stp on
To bring up swp5 and bridges bridge-A and bridge-B, use the ifreload command:
To see all the routes on the switch use the ip route show command:
A runtime configuration is non-persistent, which means the configuration you create here
does not persist after you reboot the switch.
The interaction of tagged and un-tagged frames on the same trunk often leads to undesired
and unexpected behavior. A switch that uses VLAN 1 for the native VLAN may send frames to
a switch that uses VLAN 2 for the native VLAN, thus merging those two VLANs and their
spanning tree state.
Trunk Example
auto br-VLAN100
iface br-VLAN100
bridge-ports swp1.100 swp2.100
bridge-stp on
auto br-VLAN200
iface br-VLAN200
bridge-ports swp1.200 swp2.200
bridge-stp on
Additional Examples
You can find additional examples of VLAN tagging in this chapter (see page 167).
Configuration Files
/etc/network/interfaces
/etc/network/interfaces.d/
/etc/network/if-down.d/
/etc/network/if-post-down.d/
/etc/network/if-pre-up.d/
/etc/network/if-up.d/
Useful Links
[Link]
[Link]
[Link]
VLAN Tagging
This article shows two examples of VLAN tagging, one basic and one more advanced.
They both demonstrate the streamlined interface configuration from ifupdown2. For more
information, see Configuring and Managing Network Interfaces (see page 89).
Contents
Contents (see page 167)
VLAN Tagging, a Basic Example (see page 167)
Persistent Configuration (see page 168)
VLAN Tagging, an Advanced Example (see page 168)
Persistent Configuration (see page 169)
VLAN Translation (see page 174)
host1 connects to swp1 with both untagged frames and with 802.1Q frames tagged for vlan100.
host2 connects to swp2 with 802.1Q frames tagged for vlan120 and vlan130.
Persistent Configuration
To configure the above example persistently, configure /etc/network/interfaces like this:
auto swp1
iface swp1
auto swp1.100
iface swp1.100
auto swp2
iface swp2
auto swp2.120
iface swp2.120
auto swp2.130
iface swp2.130
host1 connects to bridge br-untagged with bare Ethernet frames and to bridge br-tag100 with
802.1q frames tagged for vlan100.
host2 connects to bridge br-tag100 with 802.1q frames tagged for vlan100 and to bridge br-
vlan120 with 802.1q frames tagged for vlan120.
host3 connects to bridge br-vlan120 with 802.1q frames tagged for vlan120 and to bridge v130
with 802.1q frames tagged for vlan130.
bond2 carries tagged and untagged frames in this example.
Although not explicitly designated, the bridge member ports function as 802.1Q access ports and trunk
ports. In the example above, comparing Cumulus Linux with a traditional Cisco device:
swp1 is equivalent to a trunk port with untagged and vlan100.
swp2 is equivalent to a trunk port with vlan100 and vlan120.
swp3 is equivalent to a trunk port with vlan120 and vlan130.
bond2 is equivalent to an EtherChannel in trunk mode with untagged, vlan100, vlan120, and
vlan130.
Bridges br-untagged, br-tag100, br-vlan120, and v130 are equivalent to SVIs (switched virtual
interfaces).
Persistent Configuration
From /etc/network/interfaces :
# swp1 does not need an iface section unless it has a specific setting,
# it will be picked up as a dependent of swp1.100.
# And swp1 must exist in the system to create the .1q subinterfaces..
# but it is not applied to any bridge..or assigned an address.
auto swp1.100
iface swp1.100
auto swp2.100
iface swp2.100
auto swp2.120
iface swp2.120
auto swp3.120
iface swp3.120
auto swp3.130
iface swp3.130
auto bond2
iface bond2
bond-slaves glob swp4-7
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
auto br-untagged
iface br-untagged
address [Link]/24
bridge-ports swp1 bond2
bridge-stp on
auto br-tag100
iface br-tag100
address [Link]/24
bridge-ports swp1.100 swp2.100 bond2.100
bridge-stp on
auto br-vlan120
iface br-vlan120
address [Link]/24
bridge-ports swp2.120 swp3.120 bond2.120
bridge-stp on
auto v130
iface v130
address [Link]/24
bridge-ports swp2.130 swp3.130 bond2.130
bridge-stp on
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
To verify:
802.3ad info
LACP rate: fast
Min links: 0
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
Aggregator ID: 3
Number of ports: 4
Actor Key: 33
Partner Key: 33
Partner Mac Address: [Link]
Aggregator ID: 3
Slave queue ID: 0
A single bridge cannot contain multiple subinterfaces of the same port as members.
Attempting to apply such a configuration will result in an error:
VLAN Translation
By default, Cumulus Linux does not allow VLAN subinterfaces associated with different VLAN IDs to be
part of the same bridge. Base interfaces are not explicitly associated with any VLAN IDs and are exempt
from this restriction:
cumulus@switch:~$ sudo ip link add link swp10 name swp10.100 type vlan id 100
cumulus@switch:~$ sudo ip link add link swp11 name swp11.200 type vlan id 200
In some cases, it may be useful to relax this restriction. For example, two servers may be connected to
the switch using VLAN trunks, but the VLAN numbering provisioned on the two servers is not
consistent. You can bridge two VLAN subinterfaces with different VLAN IDs from the
servers. You do this by enabling the sysctl [Link]-allow-multiple-vlans. Packets
entering a bridge from a member VLAN subinterface will egress another member VLAN subinterface
with the VLAN ID translated.
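The effect of translation on a frame's tag can be modeled in a few lines of Python. This is a conceptual sketch and not how the kernel bridge is implemented; the function names are hypothetical, and the interface names follow the swp10.100 and swp11.200 subinterfaces used in this section.

```python
def vlan_of(interface):
    """Return the VLAN ID of a subinterface such as "swp10.100", or None."""
    _, dot, vlan = interface.partition(".")
    return int(vlan) if dot else None


def translate(ingress, egress, frame_vlan):
    """Return the VLAN ID a frame carries after being bridged.

    The frame must arrive with the tag of the ingress subinterface; it
    leaves with the tag of the egress subinterface (None means untagged).
    """
    if vlan_of(ingress) != frame_vlan:
        raise ValueError("frame tag does not belong to the ingress interface")
    return vlan_of(egress)


# A frame tagged 100 enters on swp10.100 and leaves on swp11.200.
print(translate("swp10.100", "swp11.200", 100))
```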
A bridge in VLAN-aware mode (see page 175) cannot have VLAN translation enabled for it;
only bridges configured in traditional mode can utilize VLAN translation.
If the sysctl is enabled and you want to disable it, run the above example, setting the sysctl net.
[Link]-allow-multiple-vlans to 0.
Once the sysctl is enabled, ports with different VLAN IDs can be added to the same bridge. In the
following example, packets entering the bridge br-mix from swp10.100 will be bridged to swp11.200
with the VLAN ID translated from 100 to 200:
native VLAN — see below). MAC address learning, filtering and forwarding are VLAN-aware. This
significantly reduces the configuration size, and eliminates the large overhead of managing the port
/VLAN instances as subinterfaces, replacing them with lightweight VLAN bitmaps and state updates.
You can configure both VLAN-aware and traditional mode bridges on the same network in
Cumulus Linux; however you should not have more than one VLAN-aware bridge on a given
switch. If you are implementing VXLANs (see page 226), you must use non-aware bridges.
Contents
Contents (see page 176)
Defining VLAN-aware Bridge Attributes (see page 176)
Basic Trunking (see page 176)
VLAN Filtering/VLAN Pruning (see page 177)
Untagged/Access Ports (see page 178)
VLAN Layer 3 Addressing/Switch Virtual Interfaces and other VLAN Attributes (see page 179)
Using the glob Keyword to Configure Multiple Ports in a Range (see page 179)
Example Configuration with Access Ports and Pruned VLANs (see page 179)
Example Configuration with Bonds (see page 180)
Converting a Traditional Bridge to VLAN-aware or Vice Versa (see page 182)
Caveats and Errata (see page 183)
Basic Trunking
A basic configuration for a VLAN-aware bridge configured for STP that contains two switch ports looks
like this:
auto bridge
iface bridge
bridge-vlan-aware yes
bridge-ports swp1 swp2
bridge-vids 100 200
bridge-pvid 1
bridge-stp on
The above configuration actually includes 3 VLANs: the tagged VLANs 100 and 200 and the untagged
(native) VLAN of 1.
The bridge-pvid 1 is implied by default, so you do not have to specify bridge-pvid. Specifying it
does not hurt the configuration, and it makes the configuration easier for other users to read.
The following configurations are identical to each other and the configuration above:
auto bridge
iface bridge
bridge-vlan-aware yes
bridge-ports swp1 swp2 swp3
bridge-vids 100 200
bridge-pvid 1
bridge-stp on
auto swp3
iface swp3
bridge-vids 200
Untagged/Access Ports
As described above, access ports ignore all tagged packets. In the configuration below, swp1 and swp2
are configured as access ports. All untagged traffic goes to the specified VLAN, which is VLAN 100 in the
example below.
auto bridge
iface bridge
bridge-vlan-aware yes
bridge-ports swp1 swp2
bridge-vids 100 200
bridge-pvid 1
bridge-stp on
auto swp1
iface swp1
bridge-access 100
auto swp2
iface swp2
bridge-access 100
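How a member port ends up with its VLANs (inherited from the bridge, pruned with its own bridge-vids, or reduced to a single access VLAN) can be summarized in a Python sketch. It is a simplified model of the rules described in this chapter, not the bridge driver's logic; the function name and the dictionary layout are assumptions.

```python
DEFAULT_PVID = 1  # the text states that bridge-pvid 1 is implied


def port_vlans(bridge_vids, port_config=None, bridge_pvid=DEFAULT_PVID):
    """Return (tagged_vlans, untagged_vlan) for one bridge member port.

    bridge_vids: VLAN IDs listed under the bridge stanza.
    port_config: dict of attributes from the port's own stanza, if any.
    """
    port_config = port_config or {}
    if "bridge-access" in port_config:
        # Access port: no tagged VLANs; untagged traffic goes to one VLAN.
        return set(), port_config["bridge-access"]
    # Trunk port: its own bridge-vids prune the list, else it inherits.
    tagged = set(port_config.get("bridge-vids", bridge_vids))
    untagged = port_config.get("bridge-pvid", bridge_pvid)
    return tagged, untagged


print(port_vlans([100, 200]))                           # inherits from the bridge
print(port_vlans([100, 200], {"bridge-vids": [200]}))   # pruned trunk
print(port_vlans([100, 200], {"bridge-access": 100}))   # access port
```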
auto bridge.100
iface bridge.100
address [Link]/24
address [Link]/32
hwaddress [Link]
# l2 attributes
auto bridge.100
vlan bridge.100
bridge-igmp-querier-src [Link]
auto bridge.[1-2000]
vlan bridge.[1-2000]
ATTRIBUTE VALUE
auto bridge
iface bridge
bridge-vlan-aware yes
bridge-ports glob swp1-52
bridge-stp on
bridge-vids 310 700 707 712 850 910
# ports swp3-swp48 are trunk ports which inherit vlans from the 'bridge'
# ie vlans 310,700,707,712,850,910
#
auto bridge
iface bridge
bridge-vlan-aware yes
bridge-ports glob swp1-52
bridge-stp on
bridge-vids 310 700 707 712 850 910
auto swp1
iface swp1
mstpctl-portadminedge yes
mstpctl-bpduguard yes
bridge-access 310
# The following port is the trunk uplink and inherits all vlans
# from 'bridge'; bridge assurance is enabled using 'portnetwork' attribute
auto swp49
iface swp49
mstpctl-portpathcost 10
mstpctl-portnetwork yes
# The following port is the trunk uplink and inherits all vlans
# from 'bridge'; bridge assurance is enabled using 'portnetwork' attribute
auto swp50
iface swp50
mstpctl-portpathcost 0
mstpctl-portnetwork yes
#
# vlan-aware bridge with bonds example
#
# uplink1, peerlink and downlink are bond interfaces.
# 'bridge' is a vlan aware bridge with ports uplink1, peerlink
# and downlink (swp2-20).
#
# native vlan is by default 1
#
# 'bridge-vids' attribute is used to declare vlans.
# 'bridge-pvid' attribute is used to specify native vlans if other than 1
# 'bridge-access' attribute is used to declare access port
#
auto lo
iface lo
auto eth0
iface eth0 inet dhcp
# bond interface
auto uplink1
iface uplink1
bond-slaves swp32
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
bridge-vids 2000-2079
# bond interface
auto peerlink
iface peerlink
bond-slaves swp30 swp31
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
bridge-vids 2000-2079 4094
# bond interface
auto downlink
iface downlink
bond-slaves swp1
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
bridge-vids 2000-2079
#
# Declare vlans for all swp ports
# swp2-20 get vlans from 2004 to 2022.
# The below uses mako templates to generate iface sections
# with vlans for swp ports
#
%for port, vlanid in zip(range(2, 21), range(2004, 2023)) :
auto swp${port}
iface swp${port}
bridge-vids ${vlanid}
%endfor
#
# vlan-aware bridge
#
auto bridge
iface bridge
bridge-vlan-aware yes
bridge-ports uplink1 peerlink downlink glob swp2-20
bridge-stp on
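The comments in the template above say that swp2 through swp20 receive VLANs 2004 through 2022, which is 19 ports and 19 VLANs. Because Python's range() excludes its stop value, a loop has to stop one past the last port and one past the last VLAN to cover that span. The following sketch, with a hypothetical helper name port_vlan_stanzas, generates the same stanzas in plain Python so you can check the mapping outside of Mako:

```python
def port_vlan_stanzas(first_port, last_port, first_vlan):
    """Build one interfaces stanza per port, pairing ports with VLANs."""
    ports = range(first_port, last_port + 1)          # last_port is included
    vlans = range(first_vlan, first_vlan + len(ports))
    stanzas = []
    for port, vlanid in zip(ports, vlans):
        stanzas.append(
            f"auto swp{port}\n"
            f"iface swp{port}\n"
            f"    bridge-vids {vlanid}\n"
        )
    return stanzas


stanzas = port_vlan_stanzas(2, 20, 2004)
print(len(stanzas))     # number of generated stanzas
print(stanzas[0])       # first stanza (swp2)
print(stanzas[-1])      # last stanza (swp20)
```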
1. Delete the traditional mode bridge from the configuration and bring down all its member switch
port interfaces.
2. Create a new VLAN-aware bridge, as described above.
3. Bring up the bridge.
These steps assume you are converting a traditional mode bridge to a VLAN-aware one. To do the
opposite, delete the VLAN-aware bridge in step 1, and create a new traditional mode bridge in step 2.
While restarting switchd, all running ports will flap and forwarding will be
interrupted (see page 85).
VLAN translation: A bridge in VLAN-aware mode cannot have VLAN translation enabled for it;
only bridges configured in traditional mode (see page 154) can utilize VLAN translation.
The two switches, S1 and S2, known as peer switches, cooperate so that they appear as a single device
to host H1's bond. H1 distributes traffic between the two links to S1 and S2 in any manner that you
configure on the host. Similarly, traffic inbound to H1 can traverse S1 or S2 and arrive at H1.
Contents
Contents (see page 184)
MLAG Requirements (see page 185)
LACP and Dual-Connectedness (see page 186)
Understanding Switch Roles (see page 186)
Configuring MLAG (see page 187)
Configuring the Host or Switch (see page 187)
Configuring the Interfaces (see page 188)
Example MLAG Configuration (see page 189)
Configuring MLAG with a Traditional Mode Bridge (see page 193)
Using the clagd Command Line Interface (see page 193)
Peer Link Interfaces and the PROTO_DOWN State (see page 194)
Specifying a Backup Link (see page 195)
Monitoring Dual-Connected Peers (see page 196)
IGMP Snooping with MLAG (see page 196)
Monitoring the Status of the clagd Service (see page 197)
MLAG Best Practices (see page 198)
Understanding MTU in an MLAG Configuration (see page 198)
STP Interoperability with MLAG (see page 199)
Debugging STP with MLAG (see page 199)
Best Practices for STP with MLAG (see page 200)
Troubleshooting MLAG (see page 200)
Caveats and Errata (see page 200)
MLAG Requirements
MLAG has these requirements:
There must be a direct connection between the two peer switches implementing MLAG (S1 and
S2). This is typically a bond for increased reliability and bandwidth.
There must be only two peer switches in one MLAG configuration, but you can have multiple
configurations in a network for switch-to-switch MLAG (see below).
The peer switches implementing MLAG must be running Cumulus Linux version 2.5 or later.
You must specify a unique clag-id for every dual-connected bond on each peer switch; the
value must be between 1 and 65535 and must be the same on both peer switches in order for
the bond to be considered dual-connected.
The dual-connected devices (hosts or switches) must use LACP (IEEE 802.3ad/802.1ax) to form
the bond. The peer switches must also use LACP.
More elaborate configurations are also possible. The number of links between the host and the
switches can be greater than two, and does not have to be symmetrical:
Additionally, since S1 and S2 appear as a single switch to other bonding devices, pairs of MLAG
switches can also be connected to each other in a switch-to-switch MLAG setup:
In this case, L1 and L2 are also MLAG peer switches, and thus present a two-port bond from a single
logical system to S1 and S2. S1 and S2 do the same as far as L1 and L2 are concerned. For a switch-to-
switch MLAG configuration, each switch pair must have a unique system MAC address. In the above
example, switches L1 and L2 each have the same system MAC address configured. Switch pair S1 and
S2 each have the same system MAC address configured; however, it is a different system MAC address
than the one used by the switch pair L1 and L2.
All of the dual-connected bonds on the peer switches have their system ID set to the MLAG system ID.
Therefore, from the point of view of the hosts, each of the links in its bond is connected to the same
system, and so the host will use both links.
Each peer switch periodically makes a list of the LACP partner MAC addresses of all of its bonds and
sends that list to its peer (using the clagd service; see below). The LACP partner MAC address is the
MAC address of the system at the other end of a bond, which in the figure above would be hosts H1
and H2. When a switch receives this list from its peer, it compares the list to the LACP partner MAC
addresses on its own bonds. If any matches are found and the clag-id values for those bonds match, then
that bond is a dual-connected bond. You can also find the LACP partner MAC address in the
/sys/class/net/<bondname>/bonding/ad_partner_mac sysfs file for each bond.
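The matching rule for dual-connected bonds can be written down as a short Python sketch. It models the comparison only and is not clagd's code: the function names, bond names, clag-id values and MAC addresses are all hypothetical. The read_partner_mac helper reads the sysfs file named above and only works on a system that has such a bond.

```python
from pathlib import Path


def read_partner_mac(bond):
    """Read a bond's LACP partner MAC from the sysfs file named in the text."""
    path = Path("/sys/class/net") / bond / "bonding" / "ad_partner_mac"
    return path.read_text().strip().lower()


def dual_connected_bonds(local_bonds, peer_bonds):
    """Return the local bond names that the peer switch also reaches.

    Each argument maps a bond name to a (clag_id, partner_mac) tuple. A bond
    counts as dual-connected when the peer reports the same partner MAC
    together with the same clag-id.
    """
    peer_seen = {(clag_id, mac.lower()) for clag_id, mac in peer_bonds.values()}
    return sorted(
        name
        for name, (clag_id, mac) in local_bonds.items()
        if (clag_id, mac.lower()) in peer_seen
    )


# Made-up data: bond2 has the same partner MAC but a mismatched clag-id.
local = {"bond1": (1, "00:00:5e:00:53:01"), "bond2": (2, "00:00:5e:00:53:02")}
peer = {"bond1": (1, "00:00:5E:00:53:01"), "bond2": (3, "00:00:5e:00:53:02")}
print(dual_connected_bonds(local, peer))
```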
By default, the role is determined by comparing the MAC addresses of the two sides of the peering link;
the switch with the lower MAC address assumes the primary role. You can override this by setting the
priority configuration, either by specifying the clagd-priority option in /etc/network/interfaces
, or by using clagctl. The switch with the lower priority value is given the primary role; the default
value is 32768, and the range is 0 to 65535. Read the clagd(8) and clagctl(8) man pages for more
information.
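The role selection described above can be sketched in Python. This is an illustration under an assumption: it treats the priority as the first criterion and falls back to the lower MAC address when the priorities are equal, which is consistent with the text but not confirmed as clagd's exact algorithm. The function names and MAC addresses are hypothetical.

```python
DEFAULT_PRIORITY = 32768  # default stated in the text


def mac_to_int(mac):
    """Convert a colon-separated MAC address to an integer for comparison."""
    return int(mac.replace(":", ""), 16)


def elect_primary(switch_a, switch_b):
    """Each switch is a (name, priority, mac) tuple; return the primary's name."""
    for _, priority, _ in (switch_a, switch_b):
        if not 0 <= priority <= 65535:
            raise ValueError("priority must be between 0 and 65535")
    # Lower priority value wins; equal priorities fall back to the lower MAC.
    winner = min(switch_a, switch_b, key=lambda sw: (sw[1], mac_to_int(sw[2])))
    return winner[0]


s1 = ("S1", DEFAULT_PRIORITY, "00:00:5e:00:53:0a")
s2 = ("S2", DEFAULT_PRIORITY, "00:00:5e:00:53:0b")
print(elect_primary(s1, s2))                                   # default priorities
print(elect_primary(s1, ("S2", 4096, "00:00:5e:00:53:0b")))    # S2 overridden
```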
When the clagd service is exited during switch reboot or the service is stopped in the primary switch,
the peer switch that is in the secondary role will become primary. If the primary switch goes down
without stopping the clagd service for any reason or the peer link goes down, the secondary switch
will not change its role. If the peer switch is determined not to be alive, the switch in the
secondary role rolls back the LACP system ID to the bond interface MAC address instead of the
clagd-sys-mac, and the switch in the primary role uses the clagd-sys-mac as the LACP system ID on the
bonds.
Configuring MLAG
Configuring MLAG involves:
On the dual-connected devices, create a bond that uses LACP.
On each peer switch, configure the interfaces, including bonds, VLANs, bridges and peer links.
MLAG synchronizes the dynamic state between the two peer switches, but it does not
synchronize the switch configurations. After modifying the configuration of one peer switch,
you must make the same changes to the configuration on the other peer switch. This applies
to all configuration changes, including:
Port configuration: For example, VLAN membership, MTU (see page 198), and bonding
parameters.
Bridge configuration: For example, spanning tree parameters or bridge properties.
Static address entries: For example, static FDB entries and static IGMP entries.
QoS configuration: For example, ACL entries.
You can verify the configuration of VLAN membership using the clagctl -v verifyvlans
command.
auto peerlink.4094
iface peerlink.4094
address [Link]/30
clagd-enable yes
clagd-peer-ip [Link]
clagd-backup-ip [Link]
clagd-sys-mac [Link]
Then run ifup on the peerlink VLAN interface. In this example, the command would be sudo ifup
peerlink.4094.
You do not need to add VLAN 4094 to the bridge VLAN list.
Keep in mind that when you change the MLAG configuration in the interfaces file, the
changes take effect when you bring the peerlink interface up with ifup. Do not use service
clagd restart to apply the new configuration.
Configuring these interfaces uses syntax from ifupdown2 and the VLAN-aware bridge driver mode (see
page 175). The bridges use these Cumulus Linux-specific keywords:
bridge-vids, which defines the allowed list of tagged 802.1q VLAN IDs for all bridge member
interfaces. You can specify non-contiguous ranges with a space-separated list, like
bridge-vids 100-200 300 400-500.
bridge-pvid, which defines the untagged VLAN ID for each port. This is commonly referred to
as the native VLAN.
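To make the bridge-vids syntax concrete, the following Python sketch expands a space-separated list of VLAN IDs and ranges into the set of allowed VLANs. The function name is hypothetical, and the 1 to 4094 bound is the general 802.1Q limit; it does not model any additional VLAN IDs that Cumulus Linux may reserve.

```python
def expand_vids(spec):
    """Expand a bridge-vids value such as '100-200 300 400-500' into a set."""
    vids = set()
    for token in spec.split():
        first, _, last = token.partition("-")
        low = int(first)
        high = int(last) if last else low   # a single ID is a one-element range
        if not 1 <= low <= high <= 4094:
            raise ValueError(f"invalid VLAN range: {token}")
        vids.update(range(low, high + 1))
    return vids


allowed = expand_vids("100-200 300 400-500")
print(len(allowed), 300 in allowed, 250 in allowed)
```

For the example value, the result holds 203 VLAN IDs: two ranges of 101 IDs each plus VLAN 300.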
The bridge configurations below indicate that each bond carries tagged frames on VLANs 1000 to 3000
but untagged frames on VLAN 1. Also note how you configure the VLAN subinterface used for
clagd communication (peerlink.4094 in the sample configuration below).
At minimum, this VLAN subinterface should not be in your Layer 2 domain, and you should
give it a very high VLAN ID (up to 4094). Read more about the range of VLAN IDs you can use
(see page ).
The configuration for the spines should look like the following (note that the clag-id and
clagd-sys-mac must be the same for the corresponding bonds on spine1 and spine2):
spine1 and spine2 (the bridge stanza shown is the same on both switches):
auto br
iface br
bridge-vlan-aware yes
bridge-ports uplinkA peerlink downlink1 downlink2
bridge-stp on
bridge-vids 1000-3000
bridge-pvid 1
bridge-mcsnoop 1
Here is an example configuration file for the switches leaf1 and leaf2. Note that the clag-id and
clagd-sys-mac must be the same for the corresponding bonds on leaf1 and leaf2:
leaf1 leaf2
The configuration is almost identical, except for the IP addresses used for managing the clagd service.
auto br
iface br
bridge-ports peerlink spine1-2 host1 host2
For a deeper comparison of traditional versus VLAN-aware bridge modes, read this
knowledge base article.
cumulus@switch$ clagctl
The peer is alive
The PROTO_DOWN state is an experimental feature. As such, the name and format could
change in a future version of Cumulus Linux.
auto peerlink.4094
iface peerlink.4094
address [Link]
netmask [Link]
clagd-enable yes
clagd-priority 8192
clagd-peer-ip [Link]
clagd-backup-ip [Link]
clagd-sys-mac [Link]
clagd-args --priority 1000
The backup IP address must be different from the peer link IP address (clagd-peer-ip
above). It must be reachable by a route that doesn't use the peer link, and it must be in the
same network namespace as the peer link IP address.
Cumulus Networks recommends you use the switch's management IP address for this
purpose.
You can also specify the backup UDP port. The port defaults to 5342, but you can configure it as an
argument in clagd-args using --backupPort <PORT>.
auto peerlink.4094
iface peerlink.4094
address [Link]
netmask [Link]
clagd-enable yes
clagd-priority 8192
clagd-peer-ip [Link]
clagd-backup-ip [Link]
clagd-sys-mac [Link]
clagd-args --backupPort 5400
cumulus@switch$ clagctl
The peer is alive
Our Priority, ID, and Role: 8192 [Link] primary
Peer Priority, ID, and Role: 8192 [Link] secondary
Peer Interface and IP: peerlink.4094 [Link]
Backup IP: [Link]
System MAC: [Link]
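If you want to consume this output in a script, a minimal parsing sketch in Python might look like the following. It assumes the label and value layout shown above stays stable; the sample addresses are made up for illustration, and the function names are hypothetical.

```python
# Sample text in the same layout as the clagctl output above; every address
# here is a made-up placeholder.
SAMPLE = """\
The peer is alive
Our Priority, ID, and Role: 8192 44:38:39:00:00:01 primary
Peer Priority, ID, and Role: 8192 44:38:39:00:00:02 secondary
Peer Interface and IP: peerlink.4094 169.254.255.2
Backup IP: 192.0.2.2
System MAC: 44:38:39:ff:00:01
"""


def parse_clagctl(text):
    """Map each 'label: values' line to a list of whitespace-separated values."""
    fields = {}
    for line in text.splitlines():
        # Split at the first colon only, so MAC addresses stay intact.
        label, sep, value = line.partition(":")
        if sep:  # lines without a colon, such as the status line, are skipped
            fields[label.strip()] = value.split()
    return fields


def local_role(text):
    """Return (priority, system ID, role) for the local switch."""
    priority, system_id, role = parse_clagctl(text)["Our Priority, ID, and Role"]
    return int(priority), system_id, role


print(local_role(SAMPLE))
```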
IGMP snooping is enabled by default on the bridge. IGMP snooping multicast database entries and
router port entries are synced to the peer MLAG switch. If there is no multicast router in the VLAN, the
IGMP querier can be configured on the switch to generate IGMP query messages by adding a
configuration like the following to /etc/network/interfaces:
auto br.100
vlan br.100
#igmp snooping is enabled by default, but is shown here for completeness
bridge-mcsnoop 1
# If you need to specify the querier IP address
bridge-igmp-querier-source [Link]
To display multicast group and router port information, use the bridge -d mdb show command:
service. This check is performed every 30 seconds. Due to the way the jdoo process implements
this check, it may start the clagd process twice. This is harmless, since clagd checks to make
sure another instance is not already running when it begins executing. This is indicated with a
message in the clagd log file, /var/log/[Link].
The modification time of the /var/run/[Link] file. As clagd runs, it periodically
updates the modification time of the /var/run/[Link] file (by default, every 4 seconds).
If jdoo notices that this file's modification time has not been updated within the last 4 minutes,
it assumes clagd is alive but hung, and restarts clagd. This check still occurs when clagd is not
enabled to run, so jdoo still tries to start clagd. Because clagd is not configured to run, the
only effect is a message in the jdoo log file recording that it tried to start clagd.
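The modification-time check can be illustrated with a short Python sketch. It only mirrors the idea (a heartbeat file that counts as stale after 4 minutes) and is not how jdoo itself is implemented; the function names are hypothetical.

```python
import os
import time

STALE_AFTER_SECONDS = 4 * 60  # "within the last 4 minutes" in the text above


def is_stale(mtime, now, limit=STALE_AFTER_SECONDS):
    """Return True if a heartbeat last written at mtime is older than limit."""
    return now - mtime > limit


def file_is_stale(path, limit=STALE_AFTER_SECONDS):
    """Apply the same test to the modification time of a file on disk."""
    return is_stale(os.path.getmtime(path), time.time(), limit)


# A heartbeat written every 4 seconds is never stale; one that stopped
# 5 minutes ago is.
print(is_stale(mtime=1000.0, now=1004.0), is_stale(mtime=1000.0, now=1300.0))
```

Because the heartbeat interval (4 seconds by default) is far shorter than the limit, a healthy daemon can miss many updates before a restart is triggered.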
You can check the status of clagd monitoring by using the jdoo summary command:
auto br0
iface br0
bridge-vlan-aware yes
bridge-ports spine1-2 peerlink host1 host2 <- List of bridge member interfaces
...
Likewise, to ensure the MTU 9000 path is respected through the spine switches above, also change the
MTU setting for bridge br by configuring mtu 9000 for each of the following members of bridge br on
spine1 and spine2: uplinkA, peerlink, downlink1, downlink2.
auto br
iface br
bridge-vlan-aware yes
bridge-ports uplinkA peerlink downlink1 downlink2
...
11:1
clag remote portID [Link] clag system mac [Link]
[Link]
root@se3-sp1:~#
Troubleshooting MLAG
By default, when clagd is running, it logs its status to the /var/log/[Link] file and syslog.
Example log file output is below:
Configuration Files
/etc/network/interfaces
LACP Bypass
On Cumulus Linux, LACP Bypass is a feature that allows a bond (see page 151) configured in 802.3ad
mode to become active and forward traffic even when there is no LACP partner. A typical use case for
this feature is to enable a host, without the capability to run LACP, to PXE boot while connected to a
switch on a bond configured in 802.3ad mode. Once the pre-boot process finishes and the host is
capable of running LACP, the normal 802.3ad link aggregation operation takes over.
Contents
Contents (see page 200)
Understanding LACP Bypass Modes (see page 201)
LACP Bypass Timeout (see page 201)
LACP Bypass and MLAG Deployments (see page 202)
Configuring LACP Bypass (see page 202)
Configuration Examples (see page 202)
Default Configuration with Priority Mode and Optional Timeout Period (see page 202)
All-active Mode Configuration with Multiple Simultaneous Active Interfaces (see page 204)
All-active mode is not supported on bonds that are not specified as bridge ports on
the switch.
STP does not run on the individual bond slave interfaces when the LACP bond is in all-active
mode. Therefore, only use all-active mode on host-facing LACP bonds. Cumulus Networks strongly
recommends that you configure STP BPDU guard along with all-active mode.
Configuration Examples
auto bond0
iface bond0
bond-mode 802.3ad
bond-lacp-rate 1
bond-min-links 1
bond-lacp-bypass-allow 1
bond-slaves swp4 swp5
bond-lacp-bypass-period 300
bond-lacp-bypass-priority swp4=2 swp5=1
The following command shows that swp4 bypass timeout has expired and the bond is operationally
down:
auto bond1
iface bond1 inet static
bond-slaves swp3 swp4
bond-mode 802.3ad
bond-lacp-rate 1
bond-min-links 1
bond-lacp-bypass-allow 1
bond-lacp-bypass-all-active 1
mstpctl-bpduguard yes
auto br0
iface br0 inet static
bridge-vlan-aware yes
bridge-ports bond1 bond2 bond3 bond4 peer5
bridge-stp on
bridge-vids 100-105
802.3ad info
LACP rate: fast
Min links: 1
Aggregator selection policy (ad_select): stable
System Identification: 65535 [Link]
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 1
Actor Key: 33
Partner Key: 33
Partner Mac Address: [Link]
LACP Bypass Info:
Allowed: 1
Timeout: 0
All-active: 1
The following configuration shows LACP bypass enabled for multiple active interfaces (all-active mode)
with a bridge in traditional bridge mode (see page 154):
auto bond1
iface bond1 inet static
bond-slaves swp3 swp4
bond-mode 802.3ad
bond-lacp-rate 1
bond-min-links 1
bond-lacp-bypass-allow 1
bond-lacp-bypass-all-active 1
auto br0
iface br0 inet static
bridge-ports bond1 bond2 bond3 bond4 peer5
bridge-stp on
mstpctl-bpduguard bond1=yes
An actual implementation will have many more server hosts and network connections than are shown
here. But this basic configuration provides a complete description of the important aspects of the VRR
setup.
Contents
Contents (see page 207)
Configuring the Network (see page 208)
Configuring the Hosts (see page 209)
Configuring the Routers (see page 209)
Other Network Connections (see page 209)
Handling ARP Requests (see page 209)
Monitoring Peer Links and Uplinks (see page 209)
Using ifplugd (see page 210)
Notes (see page 211)
auto bridge.500
iface bridge.500
address [Link]/24
address-virtual [Link] [Link]/24
Notice the simpler configuration of the bridge with ifupdown2. For more information, see
Configuring and Managing Network Interfaces (see page 89).
You should always use ifupdown2 to configure VRR, because it ensures correct ordering
when bringing up the virtual and physical interfaces and it works best with VLAN-aware
bridges (see page 175).
If you are using the non-VLAN-aware bridge driver, the configuration would look like this:
auto bridge500
iface bridge500
address [Link]/24
address-virtual [Link] [Link]/24
bridge-ports bond1.500 bond2.500 bond3.500
The IP address assigned to the bridge is the unique address for the bridge. The parameters of this
configuration are:
bridge.500: 500 represents a VLAN subinterface of the bridge, sometimes called a switched
virtual interface, or SVI.
[Link]/24: The unique IP address assigned to this bridge. It is unique because, unlike
the [Link] address, it is assigned only to this bridge, not the bridge on the other router.
[Link] The MAC address of the virtual router. This must be the same on all
virtual routers.
[Link]/24: The IP address of the virtual router, including the routing prefix. This must
be the same on all the virtual routers and must match the default gateway address configured
on the servers as well as the size of the subnet.
address-virtual: This keyword enables and configures VRR.
The above bridge configuration enables VRR by creating a MAC VLAN interface on the SVI. This MAC
VLAN interface is:
Named bridge-500-v0, which is the name of the SVI with dots changed to dashes and "-v0"
appended to the end.
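The naming rule is simple enough to express directly. The Python helper below is a hypothetical illustration of that rule only (dots become dashes, then "-v0" is appended); it is not taken from ifupdown2.

```python
def vrr_macvlan_name(svi_name):
    """Derive the MAC VLAN interface name from the SVI name."""
    return svi_name.replace(".", "-") + "-v0"


print(vrr_macvlan_name("bridge.500"))
```

This is handy when a script needs to find the VRR interface that corresponds to a given SVI, for example to pass the derived name to ip link show.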
If you are configuring VRR without MLAG (see page 183), use active-standby mode instead.
If you are not using MLAG, then the bridge should have one switch port interface to each host
instead of a bond.
Using ifplugd
ifplugd is a link state monitoring daemon that can execute user-specified scripts on link transitions
(not admin-triggered transitions, but transitions when a cable is plugged in or removed).
Run the following commands to install the ifplugd service:
Next, configure ifplugd. The example below indicates that when the peerbond goes down in an MLAG
environment, ifplugd brings down all the uplinks. Run the following ifplugd script on both the
primary and secondary MLAG (see page 183) switches.
To configure ifplugd, modify /etc/default/ifplugd and add the appropriate peerbond interface
name. /etc/default/ifplugd will look like this:
INTERFACES="peerbond"
HOTPLUG_INTERFACES=""
ARGS="-q -f -u0 -d1 -w -I"
SUSPEND_ACTION="stop"
#!/bin/sh
# ifplugd action script. ifplugd passes the interface name as $1 and the
# new link state ("up" or "down") as $2.
set -e
case "$2" in
up)
    # Read the local role from the "Our Priority, ID, and Role" line.
    clagrole=$(clagctl | grep "Our Priority" | awk '{print $8}')
    if [ "$clagrole" = "secondary" ]
    then
        # List all the interfaces below to bring up when clag peerbond comes up.
        for interface in swp1 bond1 bond3 bond4
        do
            echo "bringing up : $interface"
            ip link set "$interface" up
        done
    fi
    ;;
down)
    clagrole=$(clagctl | grep "Our Priority" | awk '{print $8}')
    if [ "$clagrole" = "secondary" ]
    then
        # List all the interfaces below to bring down when clag peerbond goes down.
        for interface in swp1 bond1 bond3 bond4
        do
            echo "bringing down : $interface"
            ip link set "$interface" down
        done
    fi
    ;;
esac
Notes
The default shell is /bin/sh, which is dash and not bash. This makes for faster execution of the
script since dash is small and quick, but consequently less featureful than bash. For example, it
doesn't handle multiple uplinks.
Network Virtualization
Cumulus Linux supports these forms of network virtualization:
VXLAN (Virtual Extensible LAN) is a standard overlay protocol that abstracts logical virtual networks
from the physical network underneath. You can deploy simple and scalable layer 3 Clos architectures
while extending layer 2 segments over that layer 3 network.
VXLAN uses a VLAN-like encapsulation technique to encapsulate MAC-based layer 2 Ethernet frames
within layer 3 UDP packets. Each virtual network is a VXLAN logical L2 segment. The 24-bit VXLAN
network identifier (VNI) in the VXLAN header lets VXLAN scale to 16 million segments for multi-tenancy.
Hosts on a given virtual network are joined together through an overlay protocol that initiates and
terminates tunnels at the edge of the multi-tenant network, typically the hypervisor vSwitch or top of
rack. These edge points are the VXLAN tunnel end points (VTEP).
Cumulus Linux can initiate and terminate VTEPs in hardware and supports wire-rate VXLAN with
Trident II platforms. VXLAN provides an efficient hashing scheme across the IP fabric during the
encapsulation process; the source UDP port is set per flow, from a hash based on L2-L4 information
from the original frame. The UDP destination port is the standard port 4789.
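The idea of deriving the source UDP port from a hash of the original frame can be sketched as follows. This is an illustration only: the hash actually computed by the switch hardware is not described here, so CRC32 is a stand-in, the function name is hypothetical, and the 49152 to 65535 range is the dynamic port range that RFC 7348 recommends rather than a documented Cumulus Linux setting.

```python
import zlib

VXLAN_DST_PORT = 4789                     # standard destination port
SRC_PORT_MIN, SRC_PORT_MAX = 49152, 65535  # range recommended by RFC 7348


def vxlan_source_port(*flow_fields):
    """Map the L2-L4 fields of the inner frame to an outer UDP source port."""
    flow = "|".join(str(field) for field in flow_fields).encode()
    span = SRC_PORT_MAX - SRC_PORT_MIN + 1
    # CRC32 stands in for whatever hash the forwarding hardware uses.
    return SRC_PORT_MIN + zlib.crc32(flow) % span


# Hypothetical inner flow: MAC addresses, IP addresses, protocol and L4 ports.
flow = ("00:00:10:00:00:0a", "00:00:10:00:00:0c",
        "10.0.0.1", "10.0.0.3", 6, 33000, 80)
print(vxlan_source_port(*flow), VXLAN_DST_PORT)
```

Packets of one flow always map to the same source port, so they stay in order on one path, while different flows tend to spread across the equal-cost paths of the fabric.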
Cumulus Linux includes the native Linux VXLAN kernel support and integrates with controller-based
overlay solutions like VMware NSX and Midokura MidoNet.
VXLAN is supported only on switches in the Cumulus Linux HCL using Trident II chipsets.
VXLAN encapsulation over layer 3 subinterfaces is not supported. Therefore, VXLAN uplinks
should be only configured as layer 3 interfaces without any subinterfaces.
Commands
brctl
bridge fdb
ip link
ovs-pki
ovsdb-client
vtep-ctl
Useful Links
VXLAN IETF draft
ovsdb-server
Contents
Contents (see page 213)
Getting Started (see page 213)
Caveats and Errata (see page 213)
Bootstrapping the NSX Integration (see page 214)
Enabling the openvswitch-vtep Package (see page 214)
Using the Bootstrapping Script (see page 214)
Manually Bootstrapping the NSX Integration (see page 215)
Generating the Credentials Certificate (see page 215)
Configuring the Switch as a VTEP Gateway (see page 217)
Configuring the Transport Layer (see page 220)
Configuring the Logical Layer (see page 221)
Defining Logical Switches (see page 221)
Defining Logical Switch Ports (see page 223)
Verifying the VXLAN Configuration (see page 225)
Persistent VXLAN Configuration in NSX (see page 226)
Troubleshooting VXLANs in NSX (see page 226)
Getting Started
Before you integrate VXLANs with NSX, make sure you have the following components:
A switch (L2 gateway) with a Trident II chipset running Cumulus Linux 2.0 or later
OVSDB server (ovsdb-server), included in Cumulus Linux 2.0 and later
VTEPd (ovs-vtepd), included in Cumulus Linux 2.0 and later
Integrating a VXLAN with NSX involves:
Bootstrapping the NSX Integration
Configuring the Transport Layer
Configuring the Logical Layer
Verifying the VXLAN Configuration
Once you finish the integration, you can make the configuration persistent across upgrades (see
Persistent VXLAN Configuration in NSX (see page 226) below).
For more information about NSX, see the VMware NSX User Guide, version 4.0.0 or later.
Make sure to include this file in your persistent configuration (see Persistent VXLAN Configuration in
NSX (see page 226) below) so it’s available after you upgrade Cumulus Linux.
In the above example, the following information was passed to the vtep-bootstrap script:
--credentials-path /var/lib/openvswitch: is the path where the certificate and key
pairs for authenticating with the NSX controller are stored.
vtep7: is the ID for the VTEP.
[Link]: is the IP address of the NSX controller.
[Link]: is the datapath IP address of the VTEP.
[Link]: is the IP address of the management interface on the switch.
These IP addresses will be used throughout the rest of the examples below.
# Start ovsdb-server.
set ovsdb-server "$DB_FILE"
set "$@" -vANY:CONSOLE:EMER -vANY:SYSLOG:ERR -vANY:FILE:INFO
set "$@" --remote=punix:"$DB_SOCK"
set "$@" --remote=db:Global,managers
set "$@" --remote=ptcp:6633:$LOCALIP
set "$@" --private-key=/root/[Link]
set "$@" --certificate=/root/[Link]
set "$@" --bootstrap-ca-cert=/root/[Link]
If files have been moved or regenerated, restart the OVSDB server and vtepd:
3. Define the NSX controller cluster IP address in OVSDB. This causes the OVSDB server to start
contacting the NSX controller:
4. Define the local IP address on the VTEP for VXLAN tunnel termination. First, find the physical
switch name as recorded in OVSDB:
Then set the tunnel source IP address of the VTEP. This is the datapath address of the VTEP,
which is typically an address on a loopback interface on the switch that is reachable from the
underlying L3 network:
Once you finish generating the certificate, keep the terminal session active, as you need to paste the
certificate into NSX Manager when you configure the VTEP gateway.
1. In NSX Manager, add a new gateway. Click the Network Components tab, then the Transport
Layer category. Under Transport Node, click Add, then select Manually Enter All Fields. The
Create Gateway wizard appears.
2. In the Create Gateway dialog, select Gateway for the Transport Node Type, then click Next.
3. In the Display Name field, give the gateway a name, then click Next.
4. Enable the VTEP service. Select the VTEP Enabled checkbox, then click Next.
5. From the terminal session connected to the switch where you generated the certificate, copy the
certificate and paste it into the Security Certificate text field. Copy only the bottom portion,
including the BEGIN CERTIFICATE and END CERTIFICATE lines. For example, copy all the
highlighted text in the terminal:
Once communication is established between the switch and the controller, a [Link] file
will be downloaded onto the switch.
Verify the controller and switch handshake is successful. In a terminal connected to the switch, run this
command:
1. In NSX Manager, add a new gateway service. Click the Network Components tab, then the
Services category. Under Gateway Service, click Add. The Create Gateway Service wizard
appears.
2. In the Create Gateway Service dialog, select VTEP L2 Gateway Service as the Gateway Service Type.
3. Give the service a Display Name to represent the VTEP in NSX.
4. Click Add Gateway to associate the service with the gateway you created earlier.
5. In the Transport Node field, choose the name of the gateway you created earlier.
6. In the Port ID field, choose the physical port on the gateway (for example, swp10) that will
connect to a logical L2 segment and carry data traffic.
7. Click OK to save this gateway in the service, then click Save to save the gateway service.
1. In NSX Manager, add a new logical switch. Click the Network Components tab, then the Logical
Layer category. Under Logical Switch, click Add. The Create Logical Switch wizard appears.
2. In the Display Name field, enter a name for the logical switch, then click Next.
4. Specify the transport zone bindings for the logical switch. Click Add Binding. The Create
Transport Zone Binding dialog appears.
5. In the Transport Type list, select VXLAN, then click OK to add the binding to the logical switch.
6. In the VNI field, assign the switch a VNI ID, then click OK.
Do not use 0 or 16777215 as the VNI ID, as they are reserved values under Cumulus
Linux.
1. In NSX Manager, add a new logical switch port. Click the Network Components tab, then the
Logical Layer category. Under Logical Switch Port, click Add. The Create Logical Switch Port
wizard appears.
2. In the Logical Switch UUID list, select the logical switch you created above, then click Create.
3. In the Display Name field, give the port a name that indicates it is the port that connects the
gateway, then click Next.
4. In the Attachment Type list, select VTEP L2 Gateway.
5. In the VTEP L2 Gateway Service UUID list, choose the name of the gateway service you created
earlier.
6. In the VLAN list, you can optionally choose a VLAN if you wish to connect only traffic on a specific
VLAN of the physical network. Leave it blank to handle all traffic.
7. Click Save to save the logical switch port. Connectivity is established. Repeat this procedure for
each logical switch port you want to define.
or
Copying the ovsdb database file is optional; the persistent database file helps to
speed up convergence on a system upgrade. NSX Manager pushes any configuration
created or changed in NSX Manager when the connection with the VTEP is
reestablished, which overwrites the database file.
Contents
Contents (see page 226)
Requirements (see page 227)
Requirements
A VXLAN configuration requires:
Switches with a Trident II chipset running Cumulus Linux 2.0 or later.
A service to carry unknown destination, broadcast and multicast frames. As mentioned in the
VXLAN IETF documents, you can do this through various mechanisms such as a learning-based
control plane (like multicast) or through a central authority (like a service node).
For a basic VXLAN configuration, you should ensure that:
The VXLAN has a network identifier (VNI); do not use 0 or 16777215 as the VNI ID, as they are
reserved values under Cumulus Linux.
The VXLAN instance is modeled as a link (netdev).
The VXLAN link and local interfaces are added to a bridge to create the association between port,
VLAN and VXLAN instance.
Each bridge on the switch has only one VXLAN interface. Cumulus Linux does not support more
than one VXLAN link in a bridge; however a switch can have multiple bridges.
You use static ARP entries to assign MAC addresses to a VXLAN interface.
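The VNI constraint in the list above is easy to check before you write a configuration. The Python sketch below encodes only what this guide states (a 24-bit identifier, with 0 and 16777215 reserved under Cumulus Linux); the names are hypothetical.

```python
VNI_BITS = 24
VNI_COUNT = 2 ** VNI_BITS            # 16777216 possible identifiers
RESERVED_VNIS = {0, VNI_COUNT - 1}   # 0 and 16777215 are reserved


def is_usable_vni(vni):
    """Return True if vni fits in 24 bits and is not a reserved value."""
    return 0 <= vni < VNI_COUNT and vni not in RESERVED_VNIS


for candidate in (0, 10, 1000, 16777215, 16777216):
    print(candidate, is_usable_vni(candidate))
```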
Pre-configuring remote MAC addresses does not scale. A better solution is to use a VXLAN
controller, such as LNV (see page 232), or implement an integrated solution such as VMware
NSX (see page 212).
auto vtep1000
iface vtep1000
vxlan-id 1000
vxlan-local-tunnelip [Link]
auto br-100
iface br-100
bridge-ports swp1.100 swp2.100 vtep1000
post-up bridge fdb add [Link] dev vtep1000 dst [Link] vni 1000
auto vtep1000
iface vtep1000
vxlan-id 1000
vxlan-local-tunnelip [Link]
auto br-100
iface br-100
bridge-ports swp1.100 swp2.100 vtep1000
post-up bridge fdb add [Link] dev vtep1000 dst [Link] vni 1000
post-up bridge fdb add [Link] dev vtep1000 dst [Link] vni 1000
A runtime configuration is non-persistent, which means the configuration you create here
does not persist after you reboot the switch.
In general, to configure a VXLAN in Cumulus Linux without a controller, run the following commands in
a terminal connected to the switch:
If you are specifying ageing, you must specify the service node (svcnode).
cumulus@switch1:~$ sudo bridge fdb add <mac addr> dev <device> dst <ip addr> vni <vni> port <port> via <device>
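When you have many static entries to install, it can help to generate the bridge fdb add command from data. The Python sketch below builds the argument list in the order shown above; the function name and the sample MAC and IP addresses are hypothetical, and treating vni, port and via as optional follows the general iproute2 syntax rather than anything stated in this guide.

```python
import shlex


def fdb_add_command(mac, dev, dst, vni=None, port=None, via=None):
    """Build the argument list for a 'bridge fdb add' invocation."""
    command = ["bridge", "fdb", "add", mac, "dev", dev, "dst", dst]
    for keyword, value in (("vni", vni), ("port", port), ("via", via)):
        if value is not None:       # only emit the options that were given
            command += [keyword, str(value)]
    return command


# Hypothetical remote host MAC and tunnel IP, for illustration only.
command = fdb_add_command("00:00:10:00:00:0c", "vtep1000", "172.16.20.103", vni=10)
print(shlex.join(command))
```

Returning a list rather than a string means the result can be passed to subprocess.run without shell quoting concerns.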
To create a runtime configuration that matches the image above, do the following:
1. Configure hosts A and B as part of the same tenant as C (VNI 10) on switch1. Hosts A and B are
part of VLAN 100. To configure the VTEP interface with VNI 10, run the following commands in a
terminal connected to switch1 running Cumulus Linux:
cumulus@switch1:~$ sudo ip link add link swp1 name swp1.100 type vlan id 100
cumulus@switch1:~$ sudo ip link add link swp2 name swp2.100 type vlan id 100
cumulus@switch1:~$ sudo ip link add vtep1000 type vxlan id 10 local [Link] nolearning
cumulus@switch1:~$ sudo ip link set swp1 up
2. Configure VLAN 100 and VTEP 1000 to be part of the same bridge br-100 on switch1:
3. Install a static MAC binding to a remote tunnel IP, assuming the MAC address for host C is [Link]
[Link]
cumulus@switch2:~$ sudo ip link add link swp1 name swp1.100 type vlan id 100
cumulus@switch2:~$ sudo ip link add name vtep1000 type vxlan id 10 local [Link] nolearning
cumulus@switch2:~$ sudo ip link set swp1 up
cumulus@switch2:~$ sudo ip link set vtep1000 up
5. Configure VLAN 100 and VTEP 1000 to be part of the same bridge br-100 on switch2:
6. Install a static MAC binding to a remote tunnel IP on switch2, assuming the MAC address for host
A is [Link] and the MAC address for host B is [Link]
LNV is a lightweight controller option. Please contact Cumulus Networks with your scale
requirements so we can make sure this is the right fit for you. There are also other controller
options that can work on Cumulus Linux.
Contents
Contents (see page 232)
Understanding LNV Concepts (see page 233)
Acquiring the Forwarding Database at the Service Node (see page 234)
MAC Learning and Flooding (see page 234)
Handling BUM Traffic (see page 234)
Requirements (see page 235)
Hardware Requirements (see page 235)
The two switches running Cumulus Linux, called leaf1 and leaf2, each have a bridge configured. These
two bridges contain the physical switch port interfaces connecting to the servers as well as the logical
VXLAN interface associated with the bridge. By creating a logical VXLAN interface on both leaf switches,
the switches become VTEPs (VXLAN tunnel end points). The IP address associated with this VTEP is most
commonly configured as its loopback address; in the image above, the loopback address is [Link]
for leaf1 and [Link] for leaf2.
You cannot have both service node and head end replication configured simultaneously, as
this causes the BUM traffic to be duplicated: both the source VTEP and the service node
send their own copy of each packet to every remote VTEP.
Requirements
Hardware Requirements
Switches with a Trident II chipset running Cumulus Linux 2.5.4 or later. Please refer to the
Cumulus Networks hardware compatibility list for a list of supported switch models.
Configuration Requirements
The VXLAN has an associated VXLAN Network Identifier (VNI), also interchangeably called a
VXLAN ID.
The VNI should not be 0 or 16777215, as these two numbers are reserved values under
Cumulus Linux.
The VXLAN link and physical interfaces are added to the bridge to create the association
between the port, VLAN and VXLAN instance.
Each bridge on the switch has only one VXLAN interface. Cumulus Linux does not support more
than one VXLAN link in a bridge; however, a switch can have multiple bridges.
Only use bridges in traditional mode (see page 154); VLAN-aware bridges (see page 175) are not
supported with VXLAN at this time.
An SVI (switched virtual interface) or L3 address on the bridge is not supported. For example, you
can't ping from the leaf1 SVI to the leaf2 SVI via the VXLAN tunnel; you would need to use
server1 and server2 to verify. See Creating a Layer 3 Gateway (see page 252) below for more
information.
Want to try out configuring LNV and don't have a Cumulus Linux switch? Sign up to use the
Cumulus Workbench, which has this exact topology.
Network Connectivity
There must be full network connectivity before you can configure LNV. The layer 3 IP addressing
information as well as the OSPF configuration (/etc/quagga/[Link]) below is provided to
make the LNV example easier to understand.
OSPF is not a requirement for LNV; LNV just requires L3 connectivity. With Cumulus Linux, this
can be achieved with static routes, OSPF or BGP.
Layer 3 IP Addressing
Here is the configuration for the IP addressing information used in this example.
auto lo auto lo
iface lo inet loopback iface lo inet loopback
address [Link]/32 address [Link]/32
auto lo auto lo
iface lo inet loopback iface lo inet loopback
address [Link]/32 address [Link]/32
Layer 3 Fabric
The service nodes and registration nodes must all be routable between each other. The L3 fabric on
Cumulus Linux can either be BGP (see page 318) or OSPF (see page 305). In this example, OSPF is used
to demonstrate full reachability. Expand the Quagga configurations below.
Quagga configuration using OSPF:
spine1:
interface lo
ip ospf area [Link]
interface swp49
ip ospf network point-to-point
ip ospf area [Link]
!
interface swp50
ip ospf network point-to-point
ip ospf area [Link]
!
interface swp51
ip ospf network point-to-point
ip ospf area [Link]
!
interface swp52
ip ospf network point-to-point
ip ospf area [Link]
!
!
!
!
!
router-id [Link]
router ospf
ospf router-id [Link]
spine2:
interface lo
ip ospf area [Link]
interface swp49
ip ospf network point-to-point
ip ospf area [Link]
!
interface swp50
ip ospf network point-to-point
ip ospf area [Link]
!
interface swp51
ip ospf network point-to-point
ip ospf area [Link]
!
interface swp52
ip ospf network point-to-point
ip ospf area [Link]
!
!
!
!
!
router-id [Link]
router ospf
ospf router-id [Link]
leaf1:
interface lo
ip ospf area [Link]
interface swp1s0
ip ospf network point-to-point
ip ospf area [Link]
!
interface swp1s1
ip ospf network point-to-point
ip ospf area [Link]
!
interface swp1s2
ip ospf network point-to-point
ip ospf area [Link]
!
interface swp1s3
ip ospf network point-to-point
ip ospf area [Link]
!
!
!
!
!
router-id [Link]
router ospf
ospf router-id [Link]
leaf2:
interface lo
ip ospf area [Link]
interface swp1s0
ip ospf network point-to-point
ip ospf area [Link]
!
interface swp1s1
ip ospf network point-to-point
ip ospf area [Link]
!
interface swp1s2
ip ospf network point-to-point
ip ospf area [Link]
!
interface swp1s3
ip ospf network point-to-point
ip ospf area [Link]
!
!
!
!
!
router-id [Link]
router ospf
ospf router-id [Link]
Host Configuration
In this example, the servers are running Ubuntu 14.04. A trunk must be mapped from server1 and
server2 to the respective switch. In Ubuntu this is done with subinterfaces.
server1 server2
On Ubuntu it is more reliable to use ifup and ifdown to bring the interfaces up and down
individually, rather than restarting networking entirely (that is, there is no equivalent to ifreload like
there is in Cumulus Linux):
Add the following to the loopback stanza Add the following to the loopback stanza
auto lo auto lo
iface lo iface lo
vxrd-src-ip [Link] vxrd-src-ip [Link]
vxrd-svcnode-ip [Link] vxrd-svcnode-ip [Link]
Now append the following for the VXLAN Now append the following for the VXLAN
configuration itself: configuration itself:
To bring up the bridges and VNIs, use the To bring up the bridges and VNIs, use the
ifreload command: ifreload command:
Why is br-20 not vni-20? For example, why not tie VLAN 20 to VNI 20, or why was 2000 used?
VXLANs and VLANs do not need to be the same number. This was done on purpose to
highlight this fact. However if you are using fewer than 4096 VLANs, there is no reason not to
make it easy and correlate VLANs to VXLANs. It is completely up to you.
As with any logical interfaces on Linux, the name does not matter (other than a 15-character limit). To
verify the associated VNI for the logical name, use the ip -d link show command:
The vxlan id 10 indicates the VXLAN ID/VNI is indeed 10 as the logical name suggests.
START=yes
Save and quit the text editor and restart the vxsnd daemon:
START=yes
Save and quit the text editor and restart the vxrd daemon:
Open the vxrd configuration file on leaf2 with the following commands:
START=yes
Save and quit the text editor and restart the vxrd daemon:
svcnode_ip = [Link]
svcnode_ip = [Link]
Restart the registration node daemon for the change to take effect:
loglevel: The log level, which can be DEBUG, INFO, WARNING, ERROR, or CRITICAL. Default: INFO
logdest: The destination for log messages. It can be a file name, stdout, or syslog. Default: syslog
logfilesize: Log file size in bytes. Used when logdest is a file name. Default: 512000
logbackupcount: Maximum number of log files stored on the disk. Used when logdest is a file name.
    Default: 14
pidfile: The PID file location for the vxrd daemon. Default: /var/run/vxrd.pid
udsfile: The file name for the Unix domain socket used for management. Default: /var/run/vxrd.sock
vxfld_port: The UDP port used for VXLAN control messages. Default: 10001
svcnode_ip: The address to which registration daemons send control messages for registration and/or
    BUM packets for replication. This can also be configured under /etc/network/interfaces with the
    vxrd-svcnode-ip keyword.
holdtime: Hold time (in seconds) for soft state, which is how long the service node waits before
    ageing out an IP address for a VNI. The vxrd includes this in the register messages it sends to a
    vxsnd. Default: 90 seconds
src_ip: Local IP address to bind to for receiving control traffic from the service node daemon.
refresh_rate: Number of times to refresh within the hold time. The higher this number, the more lost
    UDP refresh messages can be tolerated. Default: 3
config_check_rate: The number of seconds to poll the system for current VXLAN membership.
    Default: 5 seconds
head_rep: Enables self replication. Instead of using the service node to replicate BUM packets, it
    will be done in hardware on the VTEP switch. Default: true
Use 1, yes, true or on for True for each relevant option. Use 0, no, false or off for False.
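The boolean spellings and the relationship between holdtime and refresh_rate can be illustrated with a short Python sketch. This is only an illustration, not how vxrd itself is implemented: the [common] and [vxrd] section names, the placement of each option, and the helper name load_vxrd_settings are assumptions, and the refresh interval is simply a reading of "number of times to refresh within the hold time". Python's standard configparser happens to accept the same true/false spellings listed above.

```python
import configparser

# Hypothetical vxrd.conf fragment. The section names and the placement of
# each option are assumptions made for this sketch.
SAMPLE = """
[common]
holdtime = 90

[vxrd]
refresh_rate = 3
head_rep = on
"""


def load_vxrd_settings(text):
    """Parse a vxrd-style config and derive the refresh interval."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    holdtime = parser.getint("common", "holdtime")
    refresh_rate = parser.getint("vxrd", "refresh_rate")
    return {
        # getboolean accepts 1/yes/true/on and 0/no/false/off
        "head_rep": parser.getboolean("vxrd", "head_rep"),
        # refresh_rate refreshes spread evenly across one hold time
        "refresh_interval": holdtime / refresh_rate,
    }


settings = load_vxrd_settings(SAMPLE)
print(settings)
```

With the defaults shown (a 90-second hold time and 3 refreshes), this reading gives one refresh roughly every 30 seconds, so up to two consecutive refresh messages could be lost before the entry ages out.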
For the example configuration, default values are used, except for the svcnode_ip field.
The address field is set to the loopback address of the switch running the vxsnd daemon.
svcnode_ip = [Link]
Restart the service node daemon for the change to take effect:
loglevel: The log level, which can be DEBUG, INFO, WARNING, ERROR, CRITICAL. Default: INFO
logdest: Destination for log messages. It can be a file name, stdout or syslog. Default: syslog
logfilesize: The log file size in bytes. Used when logdest is a file name. Default: 512000
logbackupcount: Maximum number of log files stored on disk. Used when logdest is a file name.
    Default: 14
pidfile: The PID file location for the vxrd daemon. Default: /var/run/vxrd.pid
udsfile: The file name for the Unix domain socket used for management. Default: /var/run/vxrd.sock
vxfld_port: The UDP port used for VXLAN control messages. Default: 10001
svcnode_ip: This is the address to which registration daemons send control messages for registration
    and/or BUM packets for replication. Default: [Link]
holdtime: Holdtime (in seconds) for soft state. It is used when sending a register message to peers
    in response to learning a <vni, addr> from a VXLAN data packet. Default: 90
src_ip: Local IP address to bind to for receiving inter-vxsnd control traffic. Default: [Link]
svcnode_peers: Space-separated list of IP addresses with which the vxsnd shares its state.
enable_vxlan_listen: When set to true, the service node listens for VXLAN data traffic. Default: true
install_svcnode_ip: When set to true, the snd_peer_address gets installed on the loopback interface.
    It gets withdrawn when the vxsnd is not in service. If set to true, you must define the
    snd_peer_address configuration variable. Default: false
age_check: Number of seconds to wait before checking the database to age out stale entries.
    Default: 90 seconds
Use 1, yes, true or on for True for each relevant option. Use 0, no, false or off for False.
Use the vxrdctl peers command to see configured VNIs and all VTEPs (leaf switches) within the
network that have them configured.
cumulus@leaf1$ cumulus@leaf2$
vxrdctl peers vxrdctl peers
VNI Peer Addrs VNI Peer Addrs
=== ========== === ==========
10 [Link], 10 [Link],
[Link] [Link]
30 [Link], 30 [Link],
[Link] [Link]
2000 [Link], 2000 [Link],
[Link] [Link]
When head end replication mode is disabled, the command won't work.
Use the vxrdctl peers command to see the other VTEPs (leaf switches) and what VNIs are
associated with them. This does not show anything unless you enabled head end replication
mode by setting the head_rep option to True. Otherwise, replication is done by the service
node.
As mentioned above, SVIs (switch VLAN interfaces) are not supported when using VXLAN. That
is, there cannot be an IP address on the bridge that also contains a VXLAN.
10 [Link] [Link]
30 [Link] [Link]
The other VNIs were also tested and can be viewed in the expanded output below.
Test connectivity between VNI-2000 connected servers by pinging from server1:
[Link] appears in the MAC address table, which indicates that connectivity is occurring
between leaf1 and server1.
auto swp47
iface swp47
alias l2 port connected to swp48
auto swp48
iface swp48
alias gateway
address [Link]/24
auto vni-10
iface vni-10
vxlan-id 10
vxlan-local-tunnelip [Link]
auto br-10
iface br-10
bridge-ports swp47 swp32s0.10 vni-10
A loopback cable must be connected between swp47 and swp48 for this to work. This will be addressed
in a future version of Cumulus Linux so a physical port does not need to be used for this purpose.
START=yes
Save and quit the text editor and restart the vxsnd daemon:
spine1 spine2
Use a text editor to edit the network Use a text editor to edit the network
configuration: configuration:
Add the [Link]/32 address to the loopback Add the [Link]/32 address to the loopback
address: address:
auto lo auto lo
iface lo inet loopback iface lo inet loopback
address [Link]/32 address [Link]/32
address [Link]/32 address [Link]/32
spine1 spine2
Use a text editor to edit the network Use a text editor to edit the network
configuration: configuration:
This sets the address on which the This sets the address on which the
service node listens to VXLAN messages service node listens to VXLAN messages
to the configured Anycast address and to the configured Anycast address and
sets it to sync with spine2. sets it to sync with spine1.
leaf1 leaf2
Use a text editor to edit the network Use a text editor to edit the network
configuration: configuration:
Change the vxrd-svcnode-ip field to the Change the vxrd-svcnode-ip field to the
Anycast address: Anycast address:
auto lo auto lo
iface lo inet loopback iface lo inet loopback
address [Link] address [Link]
vxrd-svcnode-ip [Link] vxrd-svcnode-ip [Link]
Verify the new service node is configured: Verify the new service node is configured:
Testing Connectivity
Repeat the ping tests from the previous section. Here is the table again for reference:
10 [Link] [Link]
30 [Link] [Link]
Additional Resources
Both vxsnd and vxrd have man pages in Cumulus Linux.
For vxsnd:
For vxrd:
See Also
[Link]
[Link]
LNV is a lightweight controller option. Please contact Cumulus Networks with your scale
requirements and we can make sure this is the right fit for you. There are also other
controller options that can work on Cumulus Linux.
Contents
Example LNV Configuration (see page 259)
Layer 3 IP Addressing (see page 260)
Quagga Configuration (see page 262)
Host Configuration (see page 263)
Service Node Configuration (see page 265)
See Also (see page 266)
Want to try out configuring LNV and don't have a Cumulus Linux switch? Sign up to use the
Cumulus Workbench, which has this exact topology.
Feeling Overwhelmed? Come join a Cumulus Boot Camp and get instructor-led
training!
Layer 3 IP Addressing
Here is the configuration for the IP addressing information used in this example:
auto lo auto lo
iface lo inet loopback iface lo inet loopback
address [Link]/32 address [Link]/32
auto lo auto lo
iface lo inet loopback iface lo inet loopback
address [Link]/32 address [Link]/32
vxrd-src-ip [Link] vxrd-src-ip [Link]
vxrd-svcnode-ip [Link] vxrd-svcnode-ip [Link]
vxlan-id 30 vxlan-id 30
vxlan-local-tunnelip [Link] vxlan-local-tunnelip [Link]
Quagga Configuration
The service nodes and registration nodes must all be routable between each other. The L3 fabric on
Cumulus Linux can either be BGP (see page 318) or OSPF (see page 305). In this example, OSPF is used
to demonstrate full reachability.
Here is the Quagga configuration using OSPF:
interface lo interface lo
ip ospf area [Link] ip ospf area [Link]
interface swp49 interface swp49
ip ospf network point-to-point ip ospf network point-to-point
ip ospf area [Link] ip ospf area [Link]
! !
interface swp50 interface swp50
ip ospf network point-to-point ip ospf network point-to-point
ip ospf area [Link] ip ospf area [Link]
! !
interface swp51 interface swp51
ip ospf network point-to-point ip ospf network point-to-point
ip ospf area [Link] ip ospf area [Link]
! !
interface swp52 interface swp52
ip ospf network point-to-point ip ospf network point-to-point
ip ospf area [Link] ip ospf area [Link]
! !
! !
! !
! !
! !
router-id [Link] router-id [Link]
router ospf router ospf
ospf router-id [Link] ospf router-id [Link]
interface lo interface lo
ip ospf area [Link] ip ospf area [Link]
interface swp1s0 interface swp1s0
ip ospf network point-to-point ip ospf network point-to-point
ip ospf area [Link] ip ospf area [Link]
! !
interface swp1s1 interface swp1s1
ip ospf network point-to-point ip ospf network point-to-point
ip ospf area [Link] ip ospf area [Link]
! !
interface swp1s2 interface swp1s2
ip ospf network point-to-point ip ospf network point-to-point
ip ospf area [Link] ip ospf area [Link]
! !
interface swp1s3 interface swp1s3
ip ospf network point-to-point ip ospf network point-to-point
ip ospf area [Link] ip ospf area [Link]
! !
! !
! !
! !
! !
router-id [Link] router-id [Link]
router ospf router ospf
ospf router-id [Link] ospf router-id [Link]
Host Configuration
In this example, the servers are running Ubuntu 14.04. A trunk must be mapped from server1 and
server2 to the respective switch. In Ubuntu this is done with subinterfaces.
server1 server2
spine1:/etc/[Link]
[common]
# Log level is one of DEBUG, INFO, WARNING, ERROR, CRITICAL
#loglevel = INFO
# Destination for log message. Can be a file name, 'stdout', or 'syslog'
#logdest = syslog
# log file size in bytes. Used when logdest is a file
#logfilesize = 512000
# maximum number of log files stored on disk. Used when logdest is a file
#logbackupcount = 14
# The file to write the pid. If using monit, this must match the one
# in the [Link]
#pidfile = /var/run/[Link]
# The file name for the unix domain socket used for mgmt.
#udsfile = /var/run/[Link]
# UDP port for vxfld control messages
#vxfld_port = 10001
# This is the address to which registration daemons send control messages for
# registration and/or BUM packets for replication
svcnode_ip = [Link]
# Holdtime (in seconds) for soft state. It is used when sending a
# register msg to peers in response to learning a <vni, addr> from a
# VXLAN data pkt
#holdtime = 90
# Local IP address to bind to for receiving inter-vxsnd control traffic
src_ip = [Link]
[vxsnd]
# Space separated list of IP addresses of vxsnd to share state with
svcnode_peers = [Link]
# When set to true, the service node will listen for vxlan data traffic
# Note: Use 1, yes, true, or on, for True and 0, no, false, or off,
# for False
#enable_vxlan_listen = true
# When set to true, the svcnode_ip will be installed on the loopback
# interface, and it will be withdrawn when the vxsnd is no longer in
# service. If set to true, the svcnode_ip configuration
# variable must be defined.
# Note: Use 1, yes, true, or on, for True and 0, no, false, or off,
# for False
#install_svcnode_ip = false
# Seconds to wait before checking the database to age out stale entries
#age_check = 90
spine2:/etc/[Link]
[common]
# Log level is one of DEBUG, INFO, WARNING, ERROR, CRITICAL
#loglevel = INFO
# Destination for log message. Can be a file name, 'stdout', or 'syslog'
#logdest = syslog
# log file size in bytes. Used when logdest is a file
#logfilesize = 512000
# maximum number of log files stored on disk. Used when logdest is a file
#logbackupcount = 14
# The file to write the pid. If using monit, this must match the one
# in the [Link]
#pidfile = /var/run/[Link]
# The file name for the unix domain socket used for mgmt.
#udsfile = /var/run/[Link]
# UDP port for vxfld control messages
#vxfld_port = 10001
# This is the address to which registration daemons send control messages for
# registration and/or BUM packets for replication
svcnode_ip = [Link]
# Holdtime (in seconds) for soft state. It is used when sending a
# register msg to peers in response to learning a <vni, addr> from a
# VXLAN data pkt
#holdtime = 90
# Local IP address to bind to for receiving inter-vxsnd control traffic
src_ip = [Link]
[vxsnd]
# Space separated list of IP addresses of vxsnd to share state with
svcnode_peers = [Link]
# When set to true, the service node will listen for vxlan data traffic
# Note: Use 1, yes, true, or on, for True and 0, no, false, or off,
# for False
#enable_vxlan_listen = true
# When set to true, the svcnode_ip will be installed on the loopback
# interface, and it will be withdrawn when the vxsnd is no longer in
# service. If set to true, the svcnode_ip configuration
# variable must be defined.
# Note: Use 1, yes, true, or on, for True and 0, no, false, or off,
# for False
#install_svcnode_ip = false
# Seconds to wait before checking the database to age out stale entries
#age_check = 90
See Also
[Link]
[Link]
Detailed LNV Configuration Guide (see page 232)
Cumulus Networks Training
Requirements
Each MLAG switch should be provisioned with a virtual IP address in the form of an anycast IP
address for VXLAN datapath termination.
All MLAG requirements (see page 185) apply for VXLAN Active-Active mode.
LNV (see page 232) is the only supported control plane option for VXLAN active-active mode in
this release. LNV can be configured for either service node replication or head-end replication.
If STP (see page 118) is enabled on the bridge that is connected to VXLAN, then BPDU filter and
BPDU guard (see page 129) should be enabled in the VXLAN interface.
Anycast IP Addresses
The VXLAN termination address is an anycast IP address that you configure as a clagd parameter
(clagd-vxlan-anycast-ip) under the loopback interface. clagd dynamically adds and removes this
address as the loopback interface address as follows:
When the switches come up, ifupdown2 places all VXLAN interfaces in a PROTO_DOWN state
(see page 274).
Upon MLAG peering and a successful VXLAN interface consistency check between the switches,
clagd adds the anycast address as the interface address to the loopback interface. It then
changes the local IP address of the VXLAN interface from a unique non-virtual IP address to an
anycast virtual IP address and puts the interface in an UP state.
If after establishing MLAG peering, the peer link goes down, then the primary switch continues
to keep all VXLAN interfaces up with the anycast IP address while the secondary switch brings
down all VXLAN interfaces and places them in a PROTO_DOWN state. It also removes the
anycast IP address from the loopback interface and changes the local IP address of the VXLAN
interface to a unique non-virtual IP address.
If after establishing MLAG peering, one of the switches goes down, then the other running
switch continues to use the anycast IP address.
If after establishing MLAG peering, clagd is stopped, all VXLAN interfaces are put in a
PROTO_DOWN state. The anycast IP address is removed from the loopback interface and the
local IP addresses of the VXLAN interfaces are changed from the anycast IP address to unique
non-virtual IP addresses.
If MLAG peering could not be established between the switches, clagd brings up all the VXLAN
interfaces after the reload timer expires with unique non-virtual IP addresses. This allows the
VXLAN interface to be up and running on both switches even though peering is not established.
auto lo
iface lo
address [Link]/32
clagd-vxlan-anycast-ip [Link]
This is not a loopback interface address configuration. It's a clagd parameter configuration
under the loopback interface. Only clagd can add or remove an anycast virtual IP address as
an interface address to the loopback interface.
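The cases described in this section can be summarized as a small lookup, sketched here in Python. This is only a reading aid under stated assumptions: the event names, the VxlanState type and the vxlan_state function are hypothetical and are not part of clagd or any Cumulus Linux tool.

```python
from typing import NamedTuple


class VxlanState(NamedTuple):
    link_state: str            # "UP" or "PROTO_DOWN"
    anycast_on_loopback: bool  # has clagd added the anycast address to lo?
    tunnel_ip: str             # "anycast" or "unique" local tunnel IP


UP_ANYCAST = VxlanState("UP", True, "anycast")
UP_UNIQUE = VxlanState("UP", False, "unique")
DOWN_UNIQUE = VxlanState("PROTO_DOWN", False, "unique")


def vxlan_state(event: str, role: str = "primary") -> VxlanState:
    """Return the VXLAN interface state on one switch after an event.

    Event names are hypothetical labels for the cases in the text.
    """
    if event == "boot":
        # ifupdown2 holds VXLAN interfaces in PROTO_DOWN at startup
        return DOWN_UNIQUE
    if event in ("peering_established", "peer_switch_down"):
        # Peering succeeded, or this switch is the surviving one
        return UP_ANYCAST
    if event == "peerlink_down":
        # Only the primary keeps the anycast address
        return UP_ANYCAST if role == "primary" else DOWN_UNIQUE
    if event == "clagd_stopped":
        return DOWN_UNIQUE
    if event == "peering_timeout":
        # Reload timer expired without peering: come up with unique IPs
        return UP_UNIQUE
    raise ValueError(f"unknown event: {event}")


print(vxlan_state("peerlink_down", role="secondary"))
```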
Configuring MLAG
Refer to the MLAG chapter (see page 187) for configuration information.
Configuring LNV
Refer to the LNV chapter (see page 232) for configuration information.
Configuring STP
You should enable BPDU filter and BPDU guard (see page 129) in the VXLAN interfaces if STP (see page
118) is enabled in the bridge that is connected to the VXLAN.
Note the configuration of the local IP address in the VXLAN interfaces below. They are configured with
individual IP addresses, which clagd changes to anycast upon MLAG peering.
leaf1 Configuration
auto eth0
iface eth0
address [Link]
netmask [Link]
auto lo
iface lo
address [Link]/32
clagd-vxlan-anycast-ip [Link]
auto swp1
iface swp1
address [Link]/30
mtu 9050
auto swp2
iface swp2
address [Link]/30
mtu 9050
auto swp3
iface swp3
address [Link]/30
mtu 9050
auto swp4
iface swp4
address [Link]/30
mtu 9050
auto peerlink
iface peerlink
bond-slaves swp31 swp32
bond-mode 802.3ad
bond-miimon 100
bond-min-links 1
bond-xmit_hash_policy layer3+4
bond-lacp-rate 1
mtu 9050
auto peerlink.4094
iface peerlink.4094
address [Link]/32
address [Link]/30
mtu 9050
clagd-priority 4096
clagd-sys-mac [Link]
clagd-peer-ip [Link]
clagd-backup-ip [Link]
auto host1
iface host1
bond-slaves swp5
bond-mode 802.3ad
bond-miimon 100
bond-min-links 1
bond-xmit_hash_policy layer3+4
bond-lacp-rate 1
mtu 9050
clag-id 1
auto host2
iface host2
bond-slaves swp6
bond-mode 802.3ad
bond-miimon 100
bond-min-links 1
bond-xmit_hash_policy layer3+4
bond-lacp-rate 1
mtu 9050
clag-id 2
auto vxlan-1000
iface vxlan-1000
vxlan-id 1000
vxlan-local-tunnelip [Link]
mtu 9000
auto vxlan-2000
iface vxlan-2000
vxlan-id 2000
vxlan-local-tunnelip [Link]
mtu 9000
auto br1000
iface br1000
bridge-ports host1 host2.1000 peerlink.1000 vxlan-1000
bridge-stp on
mstpctl-portbpdufilter vxlan-1000=yes
mstpctl-bpduguard vxlan-1000=yes
mstpctl-portautoedge host1=yes host2.1000=yes peerlink.1000=yes
auto br2000
iface br2000
bridge-ports host1.2000 host2 peerlink.2000 vxlan-2000
bridge-stp on
mstpctl-portbpdufilter vxlan-2000=yes
mstpctl-bpduguard vxlan-2000=yes
mstpctl-portautoedge host1.2000=yes host2=yes peerlink.2000=yes
leaf2 Configuration
auto eth0
iface eth0
address [Link]
netmask [Link]
auto lo
iface lo
address [Link]/32
clagd-vxlan-anycast-ip [Link]
auto swp1
iface swp1
address [Link]/30
mtu 9050
auto swp2
iface swp2
address [Link]/30
mtu 9050
auto swp3
iface swp3
address [Link]/30
mtu 9050
auto swp4
iface swp4
address [Link]/30
mtu 9050
auto peerlink
iface peerlink
bond-slaves swp31 swp32
bond-mode 802.3ad
bond-miimon 100
bond-min-links 1
bond-xmit_hash_policy layer3+4
bond-lacp-rate 1
mtu 9050
auto peerlink.4094
iface peerlink.4094
address [Link]/32
address [Link]/30
mtu 9050
clagd-priority 4096
clagd-sys-mac [Link]
clagd-peer-ip [Link]
clagd-backup-ip [Link]
auto host1
iface host1
bond-slaves swp5
bond-mode 802.3ad
bond-miimon 100
bond-min-links 1
bond-xmit_hash_policy layer3+4
bond-lacp-rate 1
mtu 9050
clag-id 1
auto host2
iface host2
bond-slaves swp6
bond-mode 802.3ad
bond-miimon 100
bond-min-links 1
bond-xmit_hash_policy layer3+4
bond-lacp-rate 1
mtu 9050
clag-id 2
auto vxlan-1000
iface vxlan-1000
vxlan-id 1000
vxlan-local-tunnelip [Link]
mtu 9000
auto vxlan-2000
iface vxlan-2000
vxlan-id 2000
vxlan-local-tunnelip [Link]
mtu 9000
auto br1000
iface br1000
bridge-ports host1 host2.1000 peerlink.1000 vxlan-1000
bridge-stp on
mstpctl-portbpdufilter vxlan-1000=yes
mstpctl-bpduguard vxlan-1000=yes
mstpctl-portautoedge host1=yes host2.1000=yes peerlink.1000=yes
auto br2000
iface br2000
bridge-ports host1.2000 host2 peerlink.2000 vxlan-2000
bridge-stp on
mstpctl-portbpdufilter vxlan-2000=yes
mstpctl-bpduguard vxlan-2000=yes
mstpctl-portautoedge host1.2000=yes host2=yes peerlink.2000=yes
Quagga Configuration
The layer 3 fabric can be configured using BGP (see page 318) or OSPF (see page 305). The following
example uses OSPF; the configuration needed in the MLAG switches in the above specified topology is
as follows:
interface lo interface lo
ip ospf area [Link] ip ospf area [Link]
interface swp1 interface swp1
ip ospf network point-to-point ip ospf network point-to-point
ip ospf area [Link] ip ospf area [Link]
! !
interface swp2 interface swp2
ip ospf network point-to-point ip ospf network point-to-point
ip ospf area [Link] ip ospf area [Link]
! !
interface swp3 interface swp3
ip ospf network point-to-point ip ospf network point-to-point
ip ospf area [Link] ip ospf area [Link]
! !
interface swp4 interface swp4
ip ospf network point-to-point ip ospf network point-to-point
ip ospf area [Link] ip ospf area [Link]
! !
! !
! !
! !
! !
router-id [Link] router-id [Link]
router ospf router ospf
ospf router-id [Link] ospf router-id [Link]
LNV Configuration
The following configuration variables should be set on leaf1 and leaf2 in /etc/[Link]. This
configuration assumes head-end replication is used to replicate BUM traffic. If service node based
replication is used, the svcnode_ip variable must be set to the service node address. Refer to
Configuring the Registration Node (see page 244) for setting that variable.
leaf1 Configuration
# Local IP address to bind to for receiving control traffic from the snd
src_ip = [Link]
leaf2 Configuration
# Local IP address to bind to for receiving control traffic from the snd
src_ip = [Link]
You can use the clagctl command to check if any VXLAN devices are in a PROTO_DOWN state. As
shown below, VXLAN devices are kept in a PROTO_DOWN state due to the missing anycast
configuration.
cumulus@switch$ clagctl
The peer is alive
Our Priority, ID, and Role: 4096 [Link] primary
Peer Priority, ID, and Role: 8192 [Link] secondary
Peer Interface and IP: peerlink.4094 [Link]
Backup IP: [Link] (active)
System MAC: [Link]
CLAG Interfaces
Our Interface      Peer Interface     CLAG Id   Conflicts              Proto-Down Reason
----------------   ----------------   -------   --------------------   -----------------
host1              host2              1         -                      -
host1              host2              2         -                      -
vxlan-1000         -                  -         -                      vxlan-single,no-anycast-ip
vxlan-2000         -                  -         -                      vxlan-single,no-anycast-ip
An IGMP query message received on a port is used to identify the port that is connected to a router and
is interested in receiving multicast traffic.
MLD snooping processes MLD v1/v2 reports, queries and v1 done messages for IPv6 groups. If IGMP or
MLD snooping is disabled, multicast traffic will be flooded to all the bridge ports in the bridge. The
multicast group IP address is mapped to a multicast MAC address and a forwarding entry is created
with a list of ports interested in receiving multicast traffic destined to that group.
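The mapping from a multicast group IP address to a multicast MAC address follows the standard Ethernet convention: an IPv4 group maps to 01:00:5e followed by the low 23 bits of the address, and an IPv6 group maps to 33:33 followed by the low 32 bits. The Python sketch below illustrates that convention only; the function name multicast_mac is hypothetical and the example group addresses are arbitrary.

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Return the Ethernet multicast MAC for an IPv4 or IPv6 group address."""
    addr = ipaddress.ip_address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    if addr.version == 4:
        # 01:00:5e prefix plus the low 23 bits of the IPv4 group
        low = int(addr) & 0x7FFFFF
        octets = [0x01, 0x00, 0x5E,
                  (low >> 16) & 0xFF, (low >> 8) & 0xFF, low & 0xFF]
    else:
        # 33:33 prefix plus the low 32 bits of the IPv6 group
        low = int(addr) & 0xFFFFFFFF
        octets = [0x33, 0x33] + [(low >> shift) & 0xFF
                                 for shift in (24, 16, 8, 0)]
    return ":".join(f"{octet:02x}" for octet in octets)


print(multicast_mac("239.1.1.1"))
print(multicast_mac("ff1a::9"))
```

Because only 23 bits of an IPv4 group are carried into the MAC address, several groups (for example 239.1.1.1 and 224.129.1.1) share one MAC address.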
Contents
Commands (see page 277)
Creating a Bridge and Configuring IGMP/MLD Snooping (see page 277)
Configuring IGMP/MLD Snooping Parameters (see page 279)
Persistent Configuration (see page 279)
Querier and Fast Leave Configuration (see page 280)
Static Group and Router Port Configuration (see page 280)
Configuration Files (see page 281)
Man Pages (see page 281)
Useful Links (see page 281)
Commands
brctl
bridge
 flags
swp1 (1)
 port id             8001                 state                  forwarding
 designated root     8000.7072cf8c272c    path cost              2
 designated bridge   8000.7072cf8c272c    message age timer      0.00
 designated port     8001                 forward delay timer    0.00
 designated cost     0                    hold timer             0.00
 mc router           1                    mc fast leave          0
 flags
swp2 (2)
 port id             8002                 state                  forwarding
 designated root     8000.7072cf8c272c    path cost              2
 designated bridge   8000.7072cf8c272c    message age timer      0.00
 designated port     8002                 forward delay timer    0.00
 designated cost     0                    hold timer             0.00
 mc router           1                    mc fast leave          0
 flags
swp3 (3)
 port id             8003                 state                  forwarding
 designated root     8000.7072cf8c272c    path cost              2
 designated bridge   8000.7072cf8c272c    message age timer      0.00
 designated port     8003                 forward delay timer    8.98
 designated cost     0                    hold timer             0.00
 mc router           1                    mc fast leave          0
 flags
To get the groups and bridge port state, use the bridge mdb show command. To display router ports and
group information, use the bridge -d mdb show command:
Persistent Configuration
The configuration in /etc/network/interfaces below is for the example bridge above:
auto br0
iface br0 inet static
If only one host is attached to each host port, fast leave can be configured on that port. When a leave
message is received on that port, no query messages will be sent and the group will be deleted
immediately:
cumulus@switch:~# sudo bridge mdb add dev br0 port swp2 grp ff1a::9 permanent
cumulus@switch:~# sudo bridge mdb add dev br0 port swp1 grp [Link] permanent
A static temporary multicast group can also be configured on a port, which would be deleted after the
membership timer expires, if no report is received on that port:
cumulus@switch:~# sudo bridge mdb add dev br0 port swp1 grp [Link] temp
Configuration Files
/etc/network/interfaces
Man Pages
brctl(8)
bridge(8)
bridge-utils-interfaces(5)
Useful Links
[Link]
[Link]
[Link]
[Link]
[Link]
[Link]
[Link]
Layer 3 Features
Routing
This chapter discusses routing on switches running Cumulus Linux.
Contents
Commands (see page 282)
Static Routing via ip route (see page 282)
Persistently Adding a Static Route (see page 284)
Static Routing via quagga (see page 284)
Persistent Configuration (see page 286)
Supported Route Table Entries (see page 286)
Configuration Files (see page 287)
Useful Links (see page 287)
Caveats and Errata (see page 287)
Commands
ip route
auto swp3
iface swp3
address [Link]/24
up ip route add [Link]/24 via [Link]
Notice the simpler configuration of swp3 due to ifupdown2. For more information, see
Configuring Network Interfaces with ifupdown (see page 89).
switch# conf t
switch(config)# ip route [Link]/24 [Link]
switch(config)# end
switch# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, A - Babel,
> - selected route, * - FIB route
switch# conf t
switch(config)# no ip route [Link]/24 [Link]
switch(config)# end
switch# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, A - Babel,
> - selected route, * - FIB route
Persistent Configuration
From the quagga CLI, the running configuration can be saved so it persists between reboots:
The following numbers of routes are supported on Trident II switches with ALPM:
32K IPv4 routes
16K IPv6 routes
32K total routes (both IPv4 and IPv6)
The following numbers of routes are supported on Trident and Trident+ switches:
16K IPv4 routes
8K IPv6 routes
16K total routes (both IPv4 and IPv6)
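As a rough planning aid, the limits above can be turned into a simple check. The Python sketch below takes the lists literally and makes two assumptions that the text does not confirm: that "K" means 1024 entries, and that the "total" limit applies to the plain sum of IPv4 and IPv6 routes. The platform keys and the function name fits are hypothetical.

```python
K = 1024  # assumption: "K" in the lists above means 1024 entries

# Limits copied from the lists above; the dictionary keys are hypothetical.
ROUTE_LIMITS = {
    "trident2_alpm": {"ipv4": 32 * K, "ipv6": 16 * K, "total": 32 * K},
    "trident": {"ipv4": 16 * K, "ipv6": 8 * K, "total": 16 * K},
}


def fits(platform: str, ipv4_routes: int, ipv6_routes: int) -> bool:
    """Check a planned route count against the documented limits."""
    limits = ROUTE_LIMITS[platform]
    return (ipv4_routes <= limits["ipv4"]
            and ipv6_routes <= limits["ipv6"]
            and ipv4_routes + ipv6_routes <= limits["total"])


print(fits("trident2_alpm", 20000, 10000))
print(fits("trident", 12000, 6000))
```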
Configuration Files
/etc/network/interfaces
/etc/quagga/[Link]
Useful Links
[Link]
[Link]
Contents
Defining Routing Protocols (see page 287)
Configuring Routing Protocols (see page 288)
Protocol Tuning (see page 288)
Configuration Files (see page 289)
IP routing protocols are typically distributed; that is, an instance of the routing protocol runs on each of
the routers in a network.
Cumulus Linux does not support running multiple instances of the same protocol on a router.
Distributed routing protocols compute reachability between end points by disseminating relevant
information and running a routing algorithm on this information to determine the routes to each end
station. To scale the amount of information that needs to be exchanged, routes are computed on
address prefixes rather than on every end point address.
Protocol Tuning
Most protocols provide certain tunable parameters that are specific to convergence during changes.
Wikipedia defines convergence as the “state of a set of routers that have the same topological
information about the network in which they operate”. It is imperative that the routers in a network
have the same topological state for the proper functioning of a network. Without this, traffic can be
blackholed, and thus not reach its destination. It is normal for different routers to have differing
topological states during changes, but this difference should vanish as the routers exchange
information about the change and recompute the forwarding paths. Different protocols converge at
different speeds in the presence of changes.
A key factor that governs how quickly a routing protocol converges is the time it takes to detect the
change. For example, how quickly can a routing protocol be expected to act when there is a link failure?
Routing protocols classify changes into two kinds: hard changes such as link failures, and soft changes
such as a peer dying silently. They’re classified differently because protocols provide different
mechanisms for dealing with these failures.
It is important to configure the protocols to be notified immediately on link changes. This is also true
when a node goes down, causing all of its links to go down.
Even if a link doesn’t fail, a routing peer can crash. Usually this causes that router to delete the routes
it has computed. Worse, it can leave that router unaware of changes in the network, so that it goes out
of sync with the other routers in the network because it no longer shares the same topological
information as its peers.
Configuration Files
/etc/quagga/daemons
Network Topology
In computer networks, topology refers to the structure of interconnecting various nodes. Some
commonly used topologies in networks are star, hub and spoke, leaf and spine, and broadcast.
Contents
Clos Topologies (see page 289)
Over-Subscribed and Non-Blocking Configurations (see page 290)
Containing the Failure Domain (see page 290)
Load Balancing (see page 290)
Clos Topologies
The Clos, or fat tree, topology is used in the vast majority of modern data centers. This topology is
shown in the figure below. It is also commonly referred to as leaf-spine topology. We shall use this
topology throughout the routing protocol guide.
This topology allows the building of networks of varying size using nodes of different port counts and
/or by increasing the tiers. The picture above is a three-tiered Clos network. We number the tiers from
the bottom to the top. Thus, in the picture, the lowermost layer is called tier 1 and the topmost tier is
called tier 3.
The number of end stations (such as servers) that can be attached to such a network is determined by
a very simple mathematical formula.
In a 2-tier network, if each node is made up of m ports, then the total number of end stations that can
be connected is m^2/2. In more general terms, if tier-1 nodes are m-port nodes and tier-2 nodes are
n-port nodes, then the total number of end stations that can be connected is (m*n)/2. In a three-tier
network, where tier-3 nodes are o-port nodes, the total number of end stations that can be connected
is (m*n*o)/2^(number of tiers - 1), that is, (m*n*o)/4.
Let’s consider some practical examples. In many data centers, it is typical to connect 40 servers to a top-
of-rack (ToR) switch. The ToRs are all connected via a set of spine switches. If a ToR switch has 64 ports,
then after hooking up 40 ports to the servers, the remaining 24 ports can be hooked up to 24 spine
switches of the same link speed or to a smaller number of higher link speed switches. For example, if
the servers are all hooked up as 10GE links, then the ToRs can connect to the spine switches via 40G
links. So, instead of connecting to 24 spine switches with 10G links, the ToRs can connect to 6 spine
switches with each link being 40G. If the spine switches are also 64-port switches, then the total
number of end stations that can be connected is 2560 (40*64) stations.
In a three-tier network of 64-port switches, again with 40 servers per ToR, the total number of servers
that can be connected is (40*64*64)/2 = 81920. As you can see, this kind of topology can serve quite a
large network with three tiers.
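The capacity arithmetic above can be checked with a short script. This is a minimal sketch in Python; the function names non_blocking_capacity and tor_capacity are illustrative assumptions and are not part of Cumulus Linux.

```python
from math import prod


def non_blocking_capacity(port_counts):
    """End stations for a Clos network, one port count per tier (tier 1 first).

    Implements (m*n*o...)/2^(number of tiers - 1) from the text.
    """
    return prod(port_counts) // 2 ** (len(port_counts) - 1)


def tor_capacity(servers_per_tor, upper_tier_ports):
    """End stations when each ToR serves a fixed number of servers.

    upper_tier_ports lists the port count of each tier above the ToRs.
    """
    return servers_per_tor * prod(upper_tier_ports) // 2 ** (len(upper_tier_ports) - 1)


print(non_blocking_capacity([64, 64]))      # 2048
print(non_blocking_capacity([64, 64, 64]))  # 65536
print(tor_capacity(40, [64]))               # 2560, the two-tier example
print(tor_capacity(40, [64, 64]))           # 81920, the three-tier example
```

The second function mirrors the worked examples, where the ToR dedicates 40 ports to servers instead of splitting its ports evenly.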
Load Balancing
In a Clos network, traffic is load balanced across the multiple links using equal cost multi-pathing
(ECMP).
Routing algorithms compute shortest paths between two end stations where shortest is typically the
lowest path cost. Each link is assigned a metric or cost. By default, a link’s cost is a function of the link
speed. The higher the link speed, the lower its cost. A 10G link has a higher cost than a 40G or 100G
link, but a lower cost than a 1G link. Thus, the link cost is a measure of its traffic carrying capacity.
In the modern data center, the links between tiers of the network are homogeneous; that is, they have
the same characteristics (same speed and therefore link cost) as the other links. As a result, the first
hop router can pick any of the spine switches to forward a packet to its destination (assuming that
there is no link failure between the spine and the destination switch). Most routing protocols recognize
that there are multiple equal-cost paths to a destination and enable any of them to be selected for a
given traffic flow.
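The per-flow selection among equal-cost paths can be illustrated with a small sketch. It assumes a hash over the flow's 5-tuple (source address, destination address, protocol, source port, destination port) and uses SHA-256 purely for illustration; switching hardware uses its own hash function and field selection. The spine names and addresses are hypothetical.

```python
import hashlib


def pick_next_hop(flow, next_hops):
    """Map a flow 5-tuple onto one of the equal-cost next hops."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    # Use the first four bytes of the digest as an index into the path list.
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


# Hypothetical spine switches and flow, for illustration only.
spines = ["spine1", "spine2", "spine3", "spine4"]
flow = ("10.1.1.10", "10.2.2.20", 6, 49152, 80)

first = pick_next_hop(flow, spines)
second = pick_next_hop(flow, spines)
print(first == second)  # True: packets of one flow always take the same path
```

Hashing on the flow rather than on each packet keeps the packets of a flow on one path, which avoids reordering while still spreading different flows across the spines.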
Quagga Overview
Cumulus Linux uses quagga, an open source routing software suite, to provide the routing protocols
for dynamic routing. Cumulus Linux supports the latest Quagga version, [Link]. Quagga is a fork of
the GNU Zebra project.
Quagga provides many routing protocols, of which Cumulus Linux supports the following:
Open Shortest Path First (v2 (see page 305) and v3 (see page 315))
Border Gateway Protocol (see page 318)
Contents
Architecture (see page 292)
Zebra (see page 292)
Configuration Files (see page 292)
Useful Links (see page 293)
Architecture
As shown in the figure above, the Quagga routing suite consists of various protocol-specific daemons
and a protocol-independent daemon called zebra. Each of the protocol-specific daemons is
responsible for running the relevant protocol and building the routing table based on the information
exchanged.
It is not uncommon to have more than one protocol daemon running at the same time. For example, at
the edge of an enterprise, protocols internal to an enterprise (called IGP for Interior Gateway Protocol)
such as OSPF (see page 305) or RIP run alongside the protocols that connect an enterprise to the rest of
the world (called EGP or Exterior Gateway Protocol) such as BGP (see page 318).
zebra is the daemon that resolves the routes provided by multiple protocols (including static routes
specified by the user) and programs these routes in the Linux kernel via netlink. zebra
does more than this, of course.
Zebra
The quagga documentation defines zebra as the IP routing manager for quagga that “provides kernel
routing table updates, interface lookups, and redistribution of routes between different routing
protocols.”
Configuration Files
/etc/quagga/[Link]
/etc/quagga/daemons
/etc/quagga/[Link]
/etc/quagga/[Link]
/etc/quagga/[Link]
/etc/quagga/[Link]
/etc/quagga/[Link]
Useful Links
[Link]
[Link]
Configuring Quagga
This section provides an overview of configuring quagga.
Before you run quagga, make sure all relevant daemons, such as zebra, are running. Make your
changes in /etc/quagga/daemons then restart quagga with service quagga restart.
Contents
Configuration Files (see page 294)
Starting Quagga (see page 294)
Understanding Integrated Configurations (see page 294)
Interface IP Addresses (see page 296)
Using the vtysh Modal CLI (see page 296)
Using the Cumulus Linux Non-Modal CLI (see page 300)
Comparing vtysh and Cumulus Linux Commands (see page 301)
Displaying the Routing Table (see page 301)
Creating a New Neighbor (see page 301)
Redistributing Routing Information (see page 301)
Defining a Static Route (see page 302)
Configuring an IPv6 Interface (see page 302)
Enabling PTM (see page 302)
Configuring MTU in IPv6 Network Discovery (see page 303)
Logging OSPF Adjacency Changes (see page 303)
Setting OSPF Interface Priority (see page 303)
Configuring Timing for OSPF SPF Calculations (see page 304)
Configuring Hello Packet Intervals (see page 304)
Displaying OSPF Debugging Status (see page 304)
Displaying BGP Information (see page 305)
Useful Links (see page 305)
Configuration Files
At startup, quagga reads a set of files to determine the startup configuration. The files and what they
contain are specified below:
File Description
[Link] The default, integrated, single configuration file for all quagga daemons.
Starting Quagga
Quagga does not start by default in Cumulus Linux 2.0 and later versions.
Before you start quagga, modify /etc/quagga/daemons to enable the corresponding daemons:
If you disable the integrated configuration mode, quagga saves each daemon-specific configuration file
in a separate file. At a minimum for a daemon to start, that daemon must be specified in the daemons
file and the daemon-specific configuration file must be present, even if that file is empty.
For example, to start bgpd, the daemons file needs to be formatted as follows, at minimum:
When the integrated configuration mode is disabled, the output looks like this:
The daemons file is not written using the write mem command.
Interface IP Addresses
Quagga inherits the IP addresses for the network interfaces from the /etc/network/interfaces file.
This is the recommended way to define the addresses. For more information, see Configuring IP
Addresses (see page 92).
quagga#
Launching vtysh brings you into zebra initially. From here, you can log into other protocol daemons,
such as bgpd, ospfd or babeld.
vtysh provides a Cisco-like modal CLI, and many of the commands are similar to Cisco IOS commands.
By modal CLI, we mean that there are different modes to the CLI, and certain commands are only
available within a specific mode. Configuration is available with the configure terminal command,
which is invoked thus:
The prompt displays the mode the CLI is in. For example, when the interface-specific commands are
invoked, the prompt changes to:
When the routing protocol specific commands are invoked, the prompt changes to:
At any level, "?" displays the list of available top-level commands at that level:
quagga(config-if)# ?
babel Babel interface commands
bandwidth Set bandwidth informational parameter
description Interface specific description
end End current mode and change to enable mode
exit Exit current mode and down to previous mode
ip Interface Internet Protocol config commands
ipv6 Interface IPv6 config commands
isis IS-IS commands
link-detect Enable link detection on interface
list Print command list
mpls-te MPLS-TE specific commands
multicast Set multicast flag to interface
no Negate a command or set its defaults
ospf OSPF interface commands
quit Exit current mode and down to previous mode
shutdown Shutdown the selected interface
?-based completion is also available to see the parameters that a command takes:
quagga(config-if)# bandwidth ?
<1-10000000> Bandwidth in kilobits
quagga(config-if)# ip ?
address Set the IP address of an interface
irdp Alter ICMP Router discovery preference this interface
ospf OSPF interface commands
rip Routing Information Protocol
router IP router interface commands
Displaying state can be done at any level, including the top level. For example, to see the routing table
as seen by zebra, you use:
Running single commands with vtysh is possible using the -c option of vtysh:
Notice that the commands also take a partial command name (for example, sh ip route above) as
long as the partial command name is not aliased:
A command or feature can be disabled by prepending the command with no. For example:
Current configuration:
!
hostname quagga
log file /media/node/[Link]
log file /media/node/[Link]
log timestamp precision 6
!
service integrated-vtysh-config
!
password xxxxxx
enable password xxxxxx
!
interface eth0
ipv6 nd suppress-ra
link-detect
!
interface lo
link-detect
!
interface swp1
ipv6 nd suppress-ra
link-detect
!
interface swp2
ipv6 nd suppress-ra
link-detect
!
router bgp 65000
bgp router-id [Link]
bgp log-neighbor-changes
bgp scan-time 20
network [Link]/24
timers bgp 30 90
neighbor tier-2 peer-group
neighbor [Link] remote-as 65000
neighbor [Link] ttl-security hops 1
neighbor [Link] advertisement-interval 30
neighbor [Link] timers 30 90
neighbor [Link] timers connect 30
neighbor [Link] next-hop-self
neighbor [Link] remote-as 65000
neighbor [Link] next-hop-self
neighbor [Link] remote-as 65000
!
ip forwarding
ipv6 forwarding
!
line vty
exec-timeout 0 0
!
end
Command Description
cl-bgp BGP (see page 318) commands. See man cl-bgp for details.
Command Description
cl-rctl Zebra and non-routing protocol-specific commands. See man cl-rctl for details.
To display the routing table with the Cumulus Linux CLI, run:
To redistribute routing information from static route entries into RIP tables with the Cumulus Linux CLI,
run:
Enabling PTM
To enable topology checking (PTM) under Quagga, you would run:
quagga(config)# ptm-enable
To enable topology checking (PTM) with the Cumulus Linux CLI, run:
To configure MTU in IPv6 network discovery for an interface with the Cumulus Linux CLI, run:
To log adjacency changes of OSPF with the Cumulus Linux CLI, run:
To set the OSPF interface priority with the Cumulus Linux CLI, run:
To configure timing for OSPF SPF calculations with the Cumulus Linux CLI, run:
To configure the OSPF Hello packet interval in number of seconds for an interface with the Cumulus
Linux CLI, run:
To display OSPF debugging status with the Cumulus Linux CLI, run:
Useful Links
[Link]
[Link]
[Link]
Contents
Scalability and Areas (see page 306)
Configuring OSPFv2 (see page 307)
Activating the OSPF and Zebra Daemons (see page 307)
Enabling OSPF (see page 307)
Defining (Custom) OSPF Parameters on the Interfaces (see page 309)
Here are some points to note about areas and OSPF behavior:
Routers that have links to multiple areas are called area border routers (ABR). For example,
routers R3, R4, R5, R6 are ABRs in the diagram. An ABR performs a set of specialized tasks, such
as SPF computation per area and summarization of routes across areas.
Most of the LSAs have an area-level flooding scope. These include router LSA, network LSA, and
summary LSA.
In the diagram, we reused the same non-zero area address. This is fine since the area address
is only a scoping parameter provided to all routers within that area. It has no meaning outside
the area. Thus, in the cases where ABRs do not connect to multiple non-zero areas, the same
area address can be used, thus reducing the operational headache of coming up with area
addresses.
Configuring OSPFv2
Configuring OSPF involves the following tasks:
Activating the OSPF daemon
Enabling OSPF
Defining (Custom) OSPF parameters on the interfaces
Enabling OSPF
As we discussed in Introduction to Routing Protocols (see page 287), there are three steps to the
configuration:
There are two ways to achieve (2) and (3) in the Quagga OSPF:
1. The network statement under router ospf does both. The statement is specified with an IP
subnet prefix and an area address. All the interfaces on the router whose IP address matches
the network subnet are put into the specified area. OSPF process starts bringing up peering
adjacency on those interfaces. It also advertises the interface IP addresses formatted into LSAs
(of various types) to the neighbors for proper reachability.
From the Cumulus Linux shell:
The subnet in the network statement can be as coarse as needed to cover the largest number of
interfaces on the router that should run OSPF.
There may be interfaces where it’s undesirable to bring up OSPF adjacency. For example, in a
data center topology, the host-facing interfaces need not run OSPF; however the corresponding
IP addresses should still be advertised to neighbors. This can be achieved using the passive-
interface construct.
From the vtysh/quagga CLI:
Or use the passive-interface default command to put all interfaces as passive and
selectively remove certain interfaces to bring up protocol adjacency:
2. Explicitly enable OSPF for each interface by configuring it under the interface configuration
mode:
If OSPF adjacency bringup is not desired, you should configure the corresponding interfaces as
passive as explained above.
This model of configuration is required for unnumbered interfaces as discussed later in this
guide.
For achieving step (3) alone, the quagga configuration provides another method: redistribution.
For example:
Redistribution, however, unnecessarily loads the database with type-5 LSAs and should be
limited to generating real external prefixes (for example, prefixes learned from BGP). In general,
it is a good practice to generate local prefixes using network and/or passive-interface
statements.
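The way a network statement selects interfaces, as described above, can be illustrated with a short sketch. This is a simplified model that matches on the interface address only; the interface names, addresses and the interfaces_in_area helper are hypothetical and are not part of Quagga.

```python
import ipaddress


def interfaces_in_area(network_prefix, interfaces):
    """Return the names of interfaces whose address falls inside network_prefix."""
    net = ipaddress.ip_network(network_prefix)
    return [
        name
        for name, addr in interfaces.items()
        if ipaddress.ip_interface(addr).ip in net
    ]


# Hypothetical interface addresses, for illustration only.
interfaces = {
    "swp1": "10.0.1.1/30",
    "swp2": "10.0.2.1/30",
    "eth0": "192.168.0.15/24",
}

# A coarse prefix covers both switch ports but not the management port.
print(interfaces_in_area("10.0.0.0/16", interfaces))  # ['swp1', 'swp2']
```

A single coarse prefix is enough to enable OSPF on every matching interface, which is why passive-interface is needed to exclude host-facing ports from forming adjacencies.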
Summarize in the direction to the backbone. The backbone receives summarized routes and
injects them to other areas already summarized.
As shown in the diagram, the ABRs in the right non-zero area summarize the host prefixes as [Link]/16.
The metric for a summary route is the maximum of the metrics of the intra-area routes that the
summary route covers. When the link between R5 and R10 fails, the metric for [Link]/24 goes higher at
R5 because the path becomes R5-R9-R6-R10, so R5 sends a worse metric for the summary route. As a
result, other backbone routers shift traffic destined to [Link]/16 towards R6.
This breaks ECMP and is an under-utilization of network capacity for traffic destined to [Link]/24.
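The effect on the summary metric can be worked through with a small sketch. The metric values and the prefix labels below are hypothetical, chosen only to illustrate the maximum rule described above.

```python
def summary_metric(intra_area_metrics):
    """Metric advertised for a summary: the maximum of the covered intra-area metrics."""
    return max(intra_area_metrics.values())


# Hypothetical metrics from each ABR to two host prefixes covered by the summary.
r5_before = {"prefix-behind-R9": 20, "prefix-behind-R10": 20}
r6 = {"prefix-behind-R9": 20, "prefix-behind-R10": 20}

# After the R5-R10 link fails, R5 reaches R10 through R9 and R6, so that metric rises.
r5_after = {"prefix-behind-R9": 20, "prefix-behind-R10": 40}

print(summary_metric(r5_before) == summary_metric(r6))  # True: ECMP across R5 and R6
print(summary_metric(r5_after) > summary_metric(r6))    # True: backbone now prefers R6
```

Because the summary hides the individual prefixes, traffic for the prefix behind R9 also moves to R6, even though R5 still has an equally good path to it.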
Stub areas still receive information about networks that belong to other areas of the same OSPF
domain. In particular, if summarization is not configured (or is not comprehensive), the information can
be overwhelming for the nodes. Totally stubby areas address this issue. Routers in totally stubby areas
keep in their LSDB only information about routing within their area, plus the default route.
To configure a totally stubby area:
Type                  Behavior
Normal non-zero area  LSA types 1, 2, 3, 4 area-scoped, type 5 externals, inter-area routes summarized
Totally stubby area   LSA types 1, 2 area-scoped, default summary, no type 3, 4, 5 LSA types allowed
Unless the Ethernet media is intended to be used as a LAN with multiple connected routers,
we recommend configuring the interface as point-to-point. It has the additional advantage of
a simplified adjacency state machine; there is no need for DR/BDR election and LSA reflection.
See RFC5309 for a more detailed discussion.
To configure an unnumbered interface, take the IP address of another interface (called the anchor) and
use that as the IP address of the unnumbered interface:
ECMP
During SPF computation for an area, if OSPF finds multiple paths with equal cost (metric), all those
paths are used for forwarding. For example, in the reference topology diagram, R8 uses both R3 and R4
as next hops to reach a destination attached to R9.
For maintenance events, operators typically raise the OSPF administrative weight of the link(s) to
ensure that all traffic is diverted from the link or the node (referred to as costing out). The speed of
reconvergence does not matter here: changing the OSPF cost causes LSAs to be reissued, but the
links remain in service during the SPF computation process of all routers in the network.
For failure events, traffic may be lost during reconvergence; that is, until SPF on all nodes computes
an alternative path around the failed link or node to each of the destinations. Reconvergence time
depends on layer 1 failure detection capabilities and, in the worst case, on the OSPF DeadInterval timer.
Example Configurations
Example configuration for event 1, using vtysh:
Debugging OSPF
OperState lists all the commands to view the operational state of OSPF.
The three most important states while troubleshooting the protocol are:
1. Neighbors, with show ip ospf neighbor. This is the starting point to debug neighbor states
(also see tcpdump below).
2. Database, with show ip ospf database. This is the starting point to verify that the LSDB is, in
fact, synchronized across all routers in the network. For example, sweeping through the output
of show ip ospf database router taken from all routers in an area shows whether the
topology graph building process is complete; that is, whether every node has seen all the other
nodes in the area.
3. Routes, with show ip ospf route. This is the outcome of SPF computation that gets
downloaded to the forwarding table, and is the starting point to debug, for example, why an
OSPF route is not being forwarded correctly.
Compare the route output with the kernel by using show ip route | grep zebra and
with the hardware entries using cl-route-check -V.
Using cl-ospf:
COMMANDs
{ set | clear } (all | event | ism | ism [OBJECT] | lsa | lsa
[OBJECT] |
nsm | nsm [OBJECT] | nssa | packet | packet [OBJECT] |
zebra [OBJECT] | zebra all)
Configuration Files
/etc/quagga/daemons
/etc/quagga/[Link]
Supported RFCs
RFC2328
RFC3137
RFC5309
Useful Links
Bidirectional forwarding detection (see page 339) (BFD) and OSPF
[Link]
[Link]
Perlman, Radia (1999). Interconnections: Bridges, Routers, Switches, and Internetworking
Protocols (2 ed.). Addison-Wesley.
Moy, John T. OSPF: Anatomy of an Internet Routing Protocol. Addison-Wesley.
IETF has defined extensions to OSPFv3 to support multiple address families (that is, both IPv6
and IPv4). Quagga (see page 291) does not support them yet.
Contents
Configuring OSPFv3 (see page 316)
Unnumbered Interfaces (see page 317)
Debugging OSPF (see page 317)
Configuration Files (see page 318)
Supported RFCs (see page 318)
Useful Links (see page 318)
Configuring OSPFv3
Configuring OSPFv3 involves the following tasks:
2. Enabling OSPF6 and mapping interfaces to areas. From Quagga's vtysh shell:
R3# configure terminal
R3(config)# router ospf6
R3(config-router)# router-id 0.0.1
R3(config-router)# log-adjacency-changes detail
R3(config-router)# interface swp1 area [Link]
R3(config-router)# interface swp2 area [Link]
R3(config-router)#
Unnumbered Interfaces
Unlike OSPFv2, OSPFv3 intrinsically supports unnumbered interfaces. Forwarding to the next hop
router is done entirely using IPv6 link-local addresses. Therefore, you are not required to configure any
global IPv6 address on interfaces between routers.
Debugging OSPF
See Debugging OSPF (see page 313) for OSPFv2 for the troubleshooting discussion. The equivalent
commands are:
Another helpful command is show ipv6 ospf6 [area <id>] spf tree. It dumps the node
topology as computed by SPF to help visualize the network view.
Configuration Files
/etc/quagga/daemons
/etc/quagga/[Link]
Supported RFCs
RFC5340
RFC3137
Useful Links
Bidirectional forwarding detection (see page 339) (BFD) and OSPF
[Link]
[Link]
Contents
Commands (see page 319)
Autonomous System Number (ASN) (see page 320)
eBGP and iBGP (see page 320)
Route Reflectors (see page 320)
Commands
Cumulus Linux:
bgp
vtysh
Quagga:
bgp
neighbor
router
show
Route Reflectors
Route reflectors are quite easy to understand in a Clos topology. In a two-tier Clos network, the leaf (or
tier 1) switches are the only ones connected to end stations. Consequently, the spines
themselves do not have any routes to announce. They’re merely reflecting the routes announced by
one leaf to the other leaves. Thus, the spine switches function as route reflectors while the leaf
switches serve as route reflector clients.
In a three-tier network, the tier 2 nodes (or mid-tier spines) act as both route reflector servers and
route reflector clients. They act as route reflectors because they announce the routes learned from the
tier 1 nodes to other tier 1 nodes and to tier 3 nodes. They also act as route reflector clients to the tier
3 nodes, receiving routes learned from other tier 2 nodes. Tier 3 nodes act only as route reflectors.
In the following illustration, tier 2 node 2.1 is acting as a route reflector server, announcing the routes
between tier 1 nodes 1.1 and 1.2 to tier 1 node 1.3. It is also a route reflector client, learning the routes
between tier 2 nodes 2.2 and 2.3 from the tier 3 node, 3.1.
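The role assignment by tier described above can be summarized in a few lines. This is a minimal sketch; the function name and the role labels are illustrative and do not correspond to any Quagga configuration keyword.

```python
def route_reflector_roles(tier, total_tiers):
    """Return the route reflector roles of a node at the given tier (tier 1 is the lowest)."""
    roles = set()
    if tier > 1:
        roles.add("reflector")  # reflects routes for the tier below
    if tier < total_tiers:
        roles.add("client")     # is a client of the tier above
    return roles


# Two-tier network: leaves are clients, spines are reflectors.
print(route_reflector_roles(1, 2) == {"client"})     # True
print(route_reflector_roles(2, 2) == {"reflector"})  # True

# Three-tier network: the mid-tier spines play both roles.
print(route_reflector_roles(2, 3) == {"reflector", "client"})  # True
```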
Configuring BGP
1. Activate the BGP and Zebra daemons:
Add the following lines to /etc/quagga/daemons:
zebra=yes
bgpd=yes
A slightly more useful configuration file would contain the following lines:
hostname R7
password *****
enable password *****
log timestamp precision 6
log file /var/log/quagga/[Link]
!
line vty
exec-timeout 0 0
!
The most important information here is the specification of the location of the log file,
where the BGP process can log debugging and other useful information. A common
convention is to store the log files under /var/log/quagga.
You must restart quagga when a new daemon is enabled:
Specifying the peer’s IP address allows BGP to set up a TCP socket with this peer, but it doesn’t
distribute any prefixes to it, unless it is explicitly told that it must via the activate command:
As you can see, activate has to be specified for each address family that is being announced by
the BGP session.
4. Specify some properties of the BGP session:
It is node R3, the route reflector, on which the peer is specified as a client.
It is assumed that the IPv6 implementation on the peering device will use the MAC address as
the interface ID when assigning the IPv6 link-local address, as suggested by RFC 4291.
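The derivation of a link-local address from a MAC address (the modified EUI-64 interface ID described in RFC 4291) can be sketched as follows. The MAC address used here is a hypothetical example, and link_local_from_mac is an illustrative helper, not a Cumulus Linux or Quagga function.

```python
import ipaddress


def link_local_from_mac(mac):
    """Build the IPv6 link-local address that uses a modified EUI-64 interface ID."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02  # flip the universal/local bit
    interface_id = octets[:3] + [0xFF, 0xFE] + octets[3:]  # insert ff:fe in the middle
    packed = bytes([0xFE, 0x80] + [0] * 6 + interface_id)  # fe80::/64 prefix + interface ID
    return ipaddress.IPv6Address(packed)


# Hypothetical MAC address, for illustration only.
print(link_local_from_mac("00:02:00:00:00:45"))  # fe80::202:ff:fe00:45
```

If the peering device derives its interface ID in some other way (for example, from a randomly generated value), the link-local address cannot be predicted from the MAC address like this.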
interface swp1
no ipv6 nd suppress-ra
ipv6 nd ra-interval 5
!
router bgp 10
neighbor swp1 interface
neighbor swp1 remote-as 20
neighbor swp1 capability extended-nexthop
!
# show ip bgp
BGP table version is 66, local router ID is [Link]
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> [Link]/32 [Link] 0 32768 ?
*= [Link]/32 swp2 0 65534 64503 ?
# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, A - Babel, T - Table,
> - selected route, * - FIB route
K>* [Link]/0 via [Link], eth0
C>* [Link]/32 is directly connected, lo
B>* [Link]/32 [20/0] via fe80::202:ff:fe00:45, swp3, [Link]
* via fe80::202:ff:fe00:35, swp1, [Link]
* via fe80::202:ff:fe00:3d, swp2, [Link]
* via fe80::202:ff:fe00:4d, swp4, [Link]
* via fe80::202:ff:fe00:55, swp5, [Link]
* via fe80::202:ff:fe00:5a, swp6, [Link]
The following commands show how the IPv4 link-local address [Link] is used to install the route
and static neighbor entry to facilitate proper forwarding without having to install an IPv4 prefix with
IPv6 next-hop in the kernel:
# ip neigh
fe80::202:ff:fe00:35 dev swp1 lladdr [Link] router REACHABLE
fe80::202:ff:fe00:5a dev swp6 lladdr [Link] router REACHABLE
fe80::202:ff:fe00:3d dev swp2 lladdr [Link] router REACHABLE
fe80::202:ff:fe00:55 dev swp5 lladdr [Link] router REACHABLE
fe80::202:ff:fe00:45 dev swp3 lladdr [Link] router REACHABLE
fe80::202:ff:fe00:4d dev swp4 lladdr [Link] router REACHABLE
[Link] dev swp5 lladdr [Link] PERMANENT
[Link] dev eth0 lladdr [Link] REACHABLE
[Link] dev swp3 lladdr [Link] PERMANENT
[Link] dev swp1 lladdr [Link] PERMANENT
[Link] dev swp4 lladdr [Link] PERMANENT
[Link] dev swp6 lladdr [Link] PERMANENT
[Link] dev swp2 lladdr [Link] PERMANENT
2. Otherwise (if it's not reflecting the route), two next-hops are sent if explicitly configured
(nexthop-local unchanged) or the peer is directly connected (that is, either peering is
on a link-local address or the global IPv4 or IPv6 address is directly connected) and either
the route is a local/self-originated route or the peer is an eBGP peer.
3. In all other cases, only one next-hop gets sent, unless an outbound route-map adds
another next-hop.
route-map can impose two next-hops in scenarios where Cumulus Linux would only send one
next-hop — by specifying set ipv6 nexthop link-local.
For all routes to eBGP peers and self-originated routes to iBGP peers, the global next-hop (first
value) is the peering address of the local system. If the peering is on the link-local address, this is
the global IPv6 address on the peering interface, if present; otherwise, it is the link-local IPv6
address on the peering interface.
For other routes to iBGP peers (eBGP to iBGP or reflected), the global next-hop will be the global
next-hop in the received attribute.
If this address were a link-local IPv6 address, it would get reset so that the link-local
IPv6 address of the eBGP peer is not passed along to an iBGP peer, which is most likely
on a different link.
route-map and/or the peer configuration can change the above behavior. For example, route-
map can set the global IPv6 next-hop or the peer configuration can set it to self — which is
relevant for iBGP peers. The route-map or peer configuration can also set the next-hop to
unchanged, which ensures the source IPv6 global next-hop is passed around — which is
relevant for eBGP peers.
Whenever two next-hops are being sent, the link-local next-hop (the second value of the two) is
the link-local IPv6 address on the peering interface unless it is due to nh-local-unchanged or
route-map has set the link-local next-hop.
Network administrators cannot set martian values for IPv6 next-hops in route-map. Also, global
and link-local next-hops are validated to ensure they match the respective address types.
In a received update, a martian check is imposed for the IPv6 global next-hop. If the check fails,
it gets treated as an implicit withdraw.
If two next-hops are received in an update and the second next-hop is not a link-local address, it
gets ignored and the update is treated as if only one next-hop was received.
Whenever two next-hops are received in an update, the second next-hop is used to install the
route into zebra. As per the previous point, it is already assured that this is a link-local IPv6
address. Currently, this is assumed to be reachable and is not registered with NHT.
When route-map specifies the next-hop as peer-address, the global IPv6 next-hop as well as
the link-local IPv6 next-hop (if it's being sent) is set to the peering address. If the peering is on a
link-local address, the former could be the link-local address on the peering interface, unless
there is a global IPv6 address present on this interface.
The above rules imply that there are scenarios where a generated update has two IPv6 next-hops, and
both of them are the IPv6 link-local address of the peering interface on the local system. If you are
peering with a switch or router that is not running Cumulus Linux and expects the first next-hop to be a
global IPv6 address, a route-map can be used on the sender to specify a global IPv6 address. This
conforms with the recommendations in the Internet draft [Link], "BGP4+
Peering Using IPv6 Link-local Address".
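The receive-side rules above (a second next-hop that is not link-local is ignored, and a valid second next-hop is the one used to install the route) can be sketched as follows. This is a simplified illustration that leaves out the martian check; the function names are hypothetical and the addresses come from the IPv6 documentation prefix.

```python
import ipaddress


def usable_next_hops(received):
    """Drop the second next-hop if it is not a link-local address."""
    hops = [ipaddress.IPv6Address(hop) for hop in received]
    if len(hops) == 2 and not hops[1].is_link_local:
        hops = hops[:1]
    return hops


def install_next_hop(received):
    """Pick the next-hop used to install the route: the second one when two remain."""
    return usable_next_hops(received)[-1]


# Two next-hops, the second is link-local: the link-local address is installed.
print(install_next_hop(["2001:db8::1", "fe80::1"]))      # fe80::1

# Two next-hops, the second is global: it is ignored and the first is used.
print(install_next_hop(["2001:db8::1", "2001:db8::2"]))  # 2001:db8::1
```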
Limitations
Interface-based peering with separate IPv4 and IPv6 sessions is not supported.
ENHE is sent for IPv6 link-local peerings only.
If an IPv4 /30 or /31 IP address is assigned to the interface, IPv4 peering is used instead of IPv6
link-local peering.
Make sure that IPv6 neighbor discovery router advertisements are supported and not
suppressed. In Quagga, you do this by checking the running configuration. Under the
interface configuration, use no ipv6 nd suppress-ra to remove router advertisement suppression.
Cumulus Networks recommends you adjust the router advertisement's interval to a shorter
value (ipv6 nd ra-interval <interval>) to address scenarios when nodes come up and
miss router advertisement processing to relay the neighbor’s link-local address to BGP. The
interval is measured in seconds and defaults to 600 seconds.
To connect to a different AS using the neighbor command, modify your configuration similar to the
following:
To connect to the same AS using the peer-group command, modify your configuration similar to the
following:
To connect to a different AS using the peer-group command, modify your configuration similar to
the following:
Configuration Tips
If you're using eBGP, besides specifying the neighbor's IP address, you also have to specify the
neighbor's ASN, since it is different for each neighbor. In such a case, you wouldn't specify the remote-
as for the peer-group.
Troubleshooting
The most common starting point for troubleshooting BGP is to view a summary of the connected
neighbors and some information about these connections. A sample output of this command is as
follows:
(Pop quiz: Are these iBGP or eBGP sessions? Hint: Look at the ASNs.)
It is also useful to view the routing table as defined by BGP:
A more detailed breakdown of a specific neighbor can be obtained using show ip bgp neighbor
<neighbor ip address>:
To see the details of a specific route such as from whom it was received, to whom it was sent, and so
forth, use the show ip bgp <ip address/prefix> command:
This shows that the routing table prefix seen by BGP is [Link]/24, that this route was not advertised
to any neighbor, and that it was heard by two neighbors, [Link] and [Link].
Here is another output of the same command, on a different node in the network:
The output is sent to the specified log file, usually /var/log/quagga/[Link], and looks like this:
type 6/3
2013/07/08 [Link].682071 BGP: %ADJCHANGE: neighbor [Link] Up
2013/07/08 [Link].682660 BGP: %ADJCHANGE: neighbor [Link] Up
Instead of the IPv6 address, the peering interface name is displayed in the show ip bgp summary
command and wherever else applicable:
Most of the show commands can take the interface name instead of the IP address, if that level of
specificity is needed:
Protocol Tuning
See Caveats and Errata below for information regarding ttl-security hops.
Here is an example:
BGP itself has a keepalive timer that is exchanged between neighbors. By default, this keepalive timer is set
to 60 seconds. This time can be reduced to a lower number, but this has the disadvantage of increasing
the CPU load, especially in the presence of a lot of neighbors. keepalive-time is the periodicity with
which the keepalive message is sent. hold-time specifies how long BGP waits without receiving a
keepalive message before the connection is considered invalid. It is usually set to 3 times the
keepalive time. Here is an example of reducing these timers:
We can make these the default for all BGP neighbors using a different command:
The following display snippet shows that the default values have been modified for this neighbor:
When you're in a configuration mode, such as when you're configuring BGP parameters, you
can run any show command by adding do to the original command. For example, do show
ip bgp neighbor was shown above. Under a non-configuration mode, you'd simply run:
Reconnecting Quickly
A BGP process attempts to connect to a peer after a failure (or on startup) every connect-time
seconds. By default, this is 120 seconds. To modify this value, use:
This command must be specified for each neighbor; peer-group does not support this option in Quagga.
Advertisement Interval
BGP by default chooses stability over fast convergence. This is very useful when routing for the
Internet. For example, unlike link-state protocols, BGP typically waits for a duration of advertisement-
interval seconds between sending consecutive updates to a neighbor. This ensures that an unstable
neighbor flapping routes won’t be propagated throughout the network. By default, this is set to 30
seconds for an eBGP session and 5 seconds for an iBGP session. For very fast convergence, set the
timer to 0 seconds. You can modify this as follows:
See this IETF draft for more details on the use of this value.
Configuration Files
/etc/quagga/daemons
/etc/quagga/[Link]
Useful Links
Bidirectional forwarding detection (see page 339) (BFD) and BGP
Wikipedia entry for BGP (includes list of useful RFCs)
Quagga online documentation for BGP (may not be up to date)
IETF draft discussing BGP use within data centers
ttl-security Issue
Enabling ttl-security does not cause the hardware to be programmed with the relevant
information. This means that frames will come up to the CPU and be dropped there. It is
recommended that you use the cl-acltool command to explicitly add the relevant entry to
hardware.
For example, you can configure a file, like /etc/cumulus/acl/policy.d/01control_plane_bgp.rules,
with a rule like this for TTL:
INGRESS_INTF = swp1
INGRESS_CHAIN = INPUT, FORWARD
[iptables]
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p tcp --dport bgp -m ttl --ttl 255 -j POLICE --set-mode pkt --set-rate 2000 --set-burst 1000
-A $INGRESS_CHAIN --in-interface $INGRESS_INTF -p tcp --dport bgp -j DROP
For more information about ACLs and cl-acltool, see Netfilter (ACLs) (see page 71).
BFD Parameters
You can configure the following BFD parameters for both IPv4 and IPv6 sessions:
The required minimum interval between the received BFD control packets.
The minimum interval for transmitting BFD control packets.
The detection time multiplier.
Configuring BFD
You configure BFD in one of two ways: by specifying the configuration in the PTM [Link] file (see
page 139), or using Quagga (see page 291).
The Quagga CLI can track IPv4 and IPv6 peer connectivity (both single hop and multihop, and both
link-local IPv6 peers and global IPv6 peers) using BFD sessions without needing the [Link] file.
Use Quagga to register multihop peers with PTM and BFD, as well as to monitor connectivity to the
remote BGP multihop peer. Quagga can dynamically register and unregister both IPv4 and IPv6 peers
with BFD when the BFD-enabled peer connectivity is established or torn down, respectively. Also, you
can configure BFD parameters for each BGP or OSPF peer using Quagga.
The BFD parameters configured in the topology file take precedence over the client-configured
BFD parameters for a BFD session that has been created by both the topology file and the
client (Quagga).
BFD in BGP
For Quagga when using BGP, neighbors are registered and de-registered with PTM (see page 139)
dynamically when you enable BFD in BGP:
You can configure BFD parameters for each BGP neighbor. For example:
BFD in BGP
To see neighbor information in BGP, including BFD status, run show bgp neighbors <IP address>.
Keepalives: 2 1
Route Refresh: 0 0
Capability: 0 0
Total: 5 4
Minimum time between advertisement runs is 30 seconds
Update source is [Link]
BFD in OSPF
For Quagga using OSPF, neighbors are registered and de-registered dynamically with PTM (see page
139) when you enable or disable BFD in OSPF. A neighbor is registered with BFD when two-way
adjacency is established and de-registered when the adjacency goes down, provided BFD is enabled on
the interface. The BFD configuration is per interface and any IPv4 and IPv6 neighbors discovered on
that interface inherit the configuration.
BFD in OSPF
quagga(config)# interface X
quagga(config-if)# ipv6 ospf6 bfd
<2-255> Detect Multiplier
<cr>
quagga(config-if)# ipv6 ospf6 bfd 5
<50-60000> Required min receive interval
quagga(config-if)# ipv6 ospf6 bfd 5 500
<50-60000> Desired min transmit interval
quagga(config-if)# ipv6 ospf6 bfd 5 500 500
<cr>
quagga(config-if)# ipv6 ospf6 bfd 5 500 500
Troubleshooting BFD
To troubleshoot BFD, use ptmctl -b. For more information, see Prescriptive Topology Manager - PTM
(see page 139).
Contents
Contents (see page 345)
Understanding Equal Cost Routing (see page 345)
Understanding ECMP Hashing (see page 346)
Using cl-ecmpcalc to Determine the Hash Result (see page 346)
cl-ecmpcalc Limitations (see page 347)
ECMP Hash Buckets (see page 347)
Resilient Hashing (see page 349)
Resilient Hash Buckets (see page 350)
Removing Next Hops (see page 350)
Adding Next Hops (see page 352)
Configuring Resilient Hashing (see page 352)
Caveats (see page 353)
Useful Links (see page 353)
Have equal cost. If two routes from the same protocol are unequal, only the best route is
installed in the routing table.
BGP does not install multiple routes by default. To do so, use the maximum-paths command.
See the ECMP section (see page 318) of the BGP chapter for more information.
The ECMP hash is computed for each packet. However, to prevent out of order packets, all packets with
the same source and destination IP addresses and the same source and destination ports always hash
to the same next hop. ECMP hashing does not keep a record of flow states.
ECMP hashing does not keep a record of packets that have hashed to each next hop and does not
guarantee that traffic sent to each next hop is equal.
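The flow-based behavior described above can be sketched in Python. This is only an illustration under stated assumptions: the real hash is computed by the switch hardware, and the use of SHA-256, the function name pick_next_hop and the example addresses are all hypothetical choices made for this sketch.

```python
import hashlib

def pick_next_hop(src_ip, dst_ip, src_port, dst_port, next_hops):
    """Map a flow to one next hop using a stateless, deterministic hash."""
    # Assumption: SHA-256 stands in for the hardware hash, which is different.
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    # Reduce the hash value to an index into the list of next hops.
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]

# Hypothetical next hops (documentation addresses, not from this guide).
next_hops = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]

# Two packets of the same flow: no state is stored, yet the result is identical.
first = pick_next_hop("10.0.0.1", "10.0.1.1", 40000, 80, next_hops)
second = pick_next_hop("10.0.0.1", "10.0.1.1", 40000, 80, next_hops)
print(first == second)
```

Because the choice depends only on the packet fields, nothing in this sketch balances load: several busy flows can land on the same next hop.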
cl-ecmpcalc: error: --sport and --dport required for TCP and UDP frames
cl-ecmpcalc Limitations
cl-ecmpcalc can only take input interfaces that can be converted to a single physical port in the port
tab file, like the physical switch ports (swp). Virtual interfaces like bridges, bonds, and subinterfaces are
not supported.
A new next hop is added and a new hash bucket is created. As a result, the hash and hash bucket
assignment change, causing the existing flows to be sent to different next hops.
A next hop fails and the next hop and hash bucket are removed. The remaining next hops may be
reassigned.
In most cases, the modification of hash buckets has no impact on traffic flows as traffic is being
forwarded to a single end host. In deployments where multiple end hosts are using the same IP address
(anycast), resilient hashing must be used.
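To see why a bucket change matters for anycast, consider a minimal sketch in which the next hop is chosen by reducing a flow hash over the current number of next hops. The modulo scheme, the names A to D and the twelve sample flow hashes are assumptions for illustration, not the hardware algorithm.

```python
def assign(flow_hash, next_hops):
    """Pick a next hop by reducing the flow hash over the current next hop count."""
    return next_hops[flow_hash % len(next_hops)]

before = ["A", "B", "C", "D"]  # four next hops
after = ["A", "B", "D"]        # next hop C failed and its bucket was removed

# Flow hashes 0..11 stand in for twelve distinct flows (hypothetical values).
# Collect flows that were on a healthy next hop but moved anyway.
moved = [
    h for h in range(12)
    if assign(h, before) != "C" and assign(h, before) != assign(h, after)
]
print(f"{len(moved)} flows on healthy next hops were reassigned")
```

In this sketch, most flows that never used the failed next hop still move, which is harmless for a single end host but breaks sessions when each next hop leads to a different anycast server.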
Resilient Hashing
In Cumulus Linux, when a next hop fails or is removed from an ECMP pool, the hashing or hash
bucket assignment can change. For deployments where there is a need for flows to always use the
same next hop, like TCP anycast deployments, this can create session failures.
The ECMP hash performed with resilient hashing is exactly the same as the default hashing mode. Only
the method in which next hops are assigned to hash buckets differs.
Resilient hashing supports both IPv4 and IPv6 routes.
Resilient hashing is not enabled by default. See below for steps on configuring it.
Resilient hashing prevents disruptions when next hops are removed. It does not prevent
disruption when next hops are added.
With 12 buckets assigned and four next hops, instead of reducing the number of buckets — which
would impact flows to known good hosts — the remaining next hops replace the failed next hop.
After the failed next hop is removed, the remaining next hops are installed as replacements. This
prevents impact to any flows that hash to working next hops.
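The removal behavior can be sketched as follows, using the 12 buckets and four next hops from the example above. The round-robin layout and the order in which survivors fill the vacated buckets are assumptions made for this sketch; the guide does not specify them.

```python
def build_buckets(next_hops, num_buckets):
    """Assign next hops to a fixed number of buckets (round-robin, by assumption)."""
    return [next_hops[i % len(next_hops)] for i in range(num_buckets)]

def remove_next_hop_resilient(buckets, failed):
    """Refill only the failed next hop's buckets; leave every other bucket alone."""
    survivors = sorted(set(buckets) - {failed})
    if not survivors:
        raise ValueError("no remaining next hops")
    result = list(buckets)
    replaced = 0
    for i, hop in enumerate(buckets):
        if hop == failed:
            # Spread the vacated buckets across the remaining next hops.
            result[i] = survivors[replaced % len(survivors)]
            replaced += 1
    return result

buckets = build_buckets(["A", "B", "C", "D"], 12)
after = remove_next_hop_resilient(buckets, "C")
print(buckets)
print(after)
```

The bucket count stays at 12, so a flow that hashed to a bucket owned by A, B or D keeps its next hop; only flows that were using C are redirected.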
As a result, some flows may hash to new next hops, which can impact anycast deployments.
An ECMP route counts as a single route with multiple next hops. The following example is
considered to be a single ECMP route:
All ECMP routes must use the same number of buckets (the number of buckets cannot be configured
per ECMP route).
The number of buckets can be configured as 64, 128, 256, 512 or 1024; the default is 128:
Number of Hash Buckets    Number of Supported ECMP Routes
64                        1024
128                       512
256                       256
512                       128
1024                      64
A larger number of ECMP buckets reduces the impact of adding new next hops to an ECMP route.
However, the system supports fewer ECMP routes. If the maximum number of ECMP routes has been
installed, new ECMP routes log an error and are not installed.
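The trade-off between bucket count and route count can be checked with a few lines of Python. The values are copied from the table above; the observation that every row multiplies out to the same total is an inference from those values, not a documented hardware figure, and the helper name max_ecmp_routes is hypothetical.

```python
# Bucket counts and ECMP route limits, copied from the table above.
SUPPORTED = {64: 1024, 128: 512, 256: 256, 512: 128, 1024: 64}

def max_ecmp_routes(buckets_per_route):
    """Return the ECMP route limit for a configured bucket count."""
    if buckets_per_route not in SUPPORTED:
        raise ValueError(f"unsupported bucket count: {buckets_per_route}")
    return SUPPORTED[buckets_per_route]

# Every row has the same product of buckets and routes.
products = {buckets * routes for buckets, routes in SUPPORTED.items()}
print(products)
print(max_ecmp_routes(128))
```

Doubling the number of buckets per route therefore halves the number of ECMP routes that can be installed.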
To enable resilient hashing, edit /etc/cumulus/datapath/[Link]:
Caveats
Resilient hashing is only supported on switches with the Trident II chipset. You can run netshow
system to determine the chipset.
Useful Links
[Link]
Management VRF
Management VRF (multiple routing tables and forwarding) provides routing separation between the
out-of-band management network and the in-band data plane network. When management VRF is
enabled, applications running on the control plane processor communicate out through the management
network unless configured otherwise.
Management VRF creates two routing tables within the Linux kernel:
main: This is the routing table for all the data plane switch ports.
mgmt: This is the routing table for eth0.
Cumulus Linux only supports eth0 as the management interface. VLAN subinterfaces, bonds, bridges
and the front panel switch ports are not supported as management interfaces.
Management VRF assumes all traffic generated by the switch (except via Quagga) will exit eth0 by default,
so unless there is application-level intervention, any packet generated by an application on the switch
will only reference the eth0 routing table (the mgmt table). Applications that need to communicate over
the data plane network (the main table) must bind to the loopback IP address.
For example, if the switch is responding to an inbound SSH connection or inbound ping, management
VRF does not force the traffic out through eth0. However, if you attempt to SSH from the switch
outbound, then management VRF will force the traffic to exit eth0, unless you specify otherwise. For
example, when initiating an SSH connection, you can use -b <loopback IP address> to SSH to a
device via the data plane network.
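For an application written in Python, binding to a source address before connecting is the equivalent of ssh -b. The sketch below uses the standard library call socket.create_connection with its source_address parameter; the function name connect_via_source and the addresses in the usage note are hypothetical.

```python
import socket

def connect_via_source(source_ip, host, port, timeout=5.0):
    """Open a TCP connection whose local address is bound to source_ip."""
    # Port 0 lets the kernel choose the local source port.
    return socket.create_connection(
        (host, port), timeout=timeout, source_address=(source_ip, 0)
    )
```

For example, connect_via_source("10.1.1.1", "192.0.2.50", 22) would originate the connection from 10.1.1.1, assuming that address is configured on the loopback interface and the destination is reachable through the main table.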
Management VRF has hooks in the eth0 DHCP client to force the correct mgmt table
routes when the DHCP address is obtained. If you use static IP address assignment on
eth0, you have to manually configure the routes before you execute this step. See the
'Using Static IP Addresses on eth0' section below for more information.
4. Restart Quagga:
You can also bounce adjacency to the peer advertising the default route to get the
default route from the data plane network into the main routing table.
cl-mgmtvrf --status
This will display cl-mgmtvrf is NOT enabled or cl-mgmtvrf is enabled, depending upon
whether management VRF is disabled or enabled.
If management VRF is disabled and the data plane adds a default route, the default route via
the management interface will not be added to the main routing table.
ping -I eth0
or
DNS does not work with traceroute or ping unless you explicitly add support for the DNS
server in the main routing table.
SSH
If you SSH to the switch through a switch port, it works as expected. If you need to SSH from the device
out a switch port, use ssh -b <ip_address_of_swp_port>. For example:
Or:
auto eth0
iface eth0 inet static
address [Link]/24
post-up ip route add [Link]/24 dev eth0 table mgmt
post-up ip route add default via [Link] dev eth0 table mgmt
post-up ip route del [Link]/24 dev eth0 table main
post-down ip route del [Link]/24 dev eth0 table mgmt
post-down ip route del default via [Link] dev eth0 table mgmt
Enabling management VRF via cl-mgmtvrf --enable after this step should lead to the expected
routing behavior.
The post-down commands are there to ensure that no routing race condition can occur on
an interface experiencing route flapping. As a result, the following error messages during a
link flap are harmless and can be ignored:
warning: eth0: post-down cmd 'ip route del [Link]/24 dev eth0
table mgmt' failed (RTNETLINK answers: No such process)
warning: eth0: post-down cmd 'ip route del default [Link] via
eth0 table mgmt' failed (Error: either "to" is duplicate, or
"[Link]" is a garbage.)
If you are using the Cumulus Linux management namespace feature (via the cl-ns-mgmt
utility), you cannot enable management VRF, as the two features are incompatible.
Management VRF does not run if Cumulus Linux detects that you have management
namespaces enabled, and vice versa.
Log Files
/var/log/[Link]
Monitoring and Troubleshooting
Contents
Contents (see page 359)
Commands (see page 359)
Using the Serial Console (see page 359)
Configuring the Serial Console on PowerPC or ARM Switches (see page 359)
Configuring the Serial Console on x86 Switches (see page 360)
Diagnostics Using cl-support (see page 361)
Sending Log Files to a syslog Server (see page 362)
Next Steps (see page 364)
Commands
cl-support
fw_setenv
You must reboot the switch for the baudrate change to take effect.
The valid values for baudrate are:
300
600
1200
2400
4800
9600
19200
38400
115200
Incorrect configuration settings in grub can cause the switch to be inaccessible via the
console. Grub changes should be carefully reviewed before implementation.
2. After you save your changes to the grub configuration, type the following at the command
prompt:
cumulus@switch:~$ update-grub
3. If you plan on accessing your switch's BIOS over the serial console, you need to update the baud
rate in the switch BIOS. For more information, see this knowledge base article.
4. Reboot the switch.
Example output:
cumulus@switch:~$ ls /var/support
cl_support_20130806_032720.[Link]
The directory structure is compressed using LZMA2 compression and can be extracted using the unxz
command:
cumulus@switch:~$ cd /var/support
cumulus@switch:~$ sudo unxz cl_support_20130729_140040.[Link]
cumulus@switch:~$ sudo tar xf cl_support_20130729_140040.tar
cumulus@switch:~$ ls -l cl_support_20130729_140040/
Directory Description
core Contains the core files generated from Cumulus Linux HAL process, switchd.
etc Is a replica of the switch’s /etc directory. /etc contains all the general Linux
configuration files, as well as configurations for the system’s network interfaces, quagga,
jdoo, and other packages.
log Is a replica of the switch's /var/log directory. Most Cumulus Linux log files are located
in this directory. Notable log files include [Link], [Link], quagga log files,
and syslog. For more information, read this knowledge base article.
proc Is a replica of the switch’s /proc directory. In Linux, /proc contains runtime system
information (like system memory, devices mounted, and hardware configuration). These
files are not actual files but the current state of the system.
support Is a set of files containing further system information, which is obtained by cl-support
running commands such as ps -aux, netstat -i, and so forth — even the routing
tables.
cl-support, when untarred, contains a [Link] file. This file indicates what triggered the cl-support run.
When contacting Cumulus Networks technical support, please attach the cl-support file if possible.
For more information about cl-support, read Understanding and Decoding the cl-support Output File
(see page 392).
1.
*.* @[Link]:514
2. Restart rsyslog.
Starting with Cumulus Linux 2.5.4, all Cumulus Linux rules have been moved from
/etc/[Link] into separate files in /etc/rsyslog.d/, which are called at the end of the
GLOBAL DIRECTIVES section of /etc/[Link]. As a result, the RULES section at the
end of [Link] is ignored because the messages have to be processed by the rules in
/etc/rsyslog.d and then dropped by the last line in /etc/rsyslog.d/[Link].
If you need to send other log files to a syslog server, configure a new file in /etc/rsyslog.d, as
above, and add the following lines:
$ModLoad imfile
$InputFileName /var/log/[Link]
$InputFileStateFile logfile-log
$InputFileTag switchd:
$InputFileSeverity info
$InputFileFacility local7
$InputFilePollInterval 5
$InputRunFileMonitor
Setting Description
$InputFileName The file to be sent to the syslog server. In this example, you are going to
send changes made to /var/log/[Link] to the syslog server.
$InputFileStateFile This is used by rsyslog to track state of the file being monitored. This must
be unique for each file being monitored.
$InputFileTag Defines the syslog tag that will precede the syslog messages. In this
example, all logs are prefaced with switchd.
$InputFileSeverity Defines the logging severity level sent to the syslog server.
$InputFilePollInterval Defines how frequently in seconds rsyslog looks for new information in the
file. Lower values provide faster updates but create slightly more load on the
CPU.
$InputRunFileMonitor Enables the file monitor module with the configured settings.
Finally, the if $programname line is what sends the log files to the syslog server. It follows the same
syntax as the /var/log/syslog file, where @ indicates UDP, [Link] is the IP address of the
syslog server, and 514 is the UDP port. The value switchd must match the value in $InputFileTag.
Next Steps
The links below discuss more specific monitoring topics.
Contents
Contents (see page 364)
Entering Single User Mode on a PowerPC or ARM Switch (see page 365)
Entering Single User Mode on an x86 Switch (see page 365)
2. After the system boots, the shell command prompt appears. In this mode, you can change the
root password or test a boot service that is hanging the boot process.
3. Reboot the system.
In this example, you are selecting the slot2 image. Under the linux option, add init=/bin/bash:
|^
| insmod ext2 |
| set root='(hd0,gpt3)' |
| search --no-floppy --fs-uuid --set=root c42be287-5321-4e77-975f-54e237a\|
| d72b0 |
| echo 'Loading Linux ...' |
| linux /cl-vmlinuz-3.2.60-1+deb7u1+cl2.5-slot-2 root=UUID=f01a2d40-d2fe-\|
| 435b-b3d1-7edc1eb0c42f console=ttyS0,115200n8 cl_platform=dell_s6000_s1\|
| 220 quiet active=2 init=/bin/bash |
| echo 'Loading initial ramdisk ...' A |
| initrd /[Link]-3.2.60-1+deb7u1+cl2.5-slot-2 |
|
|
+-------------------------------------------------------------------------+
Installing netshow
Starting with Cumulus Linux 2.5.5, netshow is included in the main repository for Cumulus Linux.
However, it is not installed by default if you upgraded to this version using apt-get dist-upgrade.
You install netshow in Cumulus Linux in one of two ways:
By doing a binary image install (see page 17) of Cumulus Linux 2.5.5 using cl-img-install
By installing the netshow package using apt-get install netshow
Debian and Red Hat packages will be available in the near future.
Using netshow
Running netshow with no arguments displays all available command line arguments usable by
netshow. (Running netshow --help gives you the same information.) The output looks like this:
cumulus@leaf1$ netshow
Usage:
netshow system [--json | -j ]
netshow counters [errors] [all] [--json | -j | -l | --legend ]
netshow lldp [--json | -j | -l | --legend ]
netshow interface [<iface>] [all] [--mac | -m ] [--oneline | -1 | --json | -j | -l | --legend ]
netshow access [all] [--mac | -m ] [--oneline | -1 | --json | -j | -l | --legend ]
netshow bridges [all] [--mac | -m ] [--oneline | -1 | --json | -j | -l | --legend ]
netshow bonds [all] [--mac | -m ] [--oneline | -1 | --json | -j | -l | --legend ]
netshow bondmems [all] [--mac | -m ] [--oneline | -1 | --json | -j | -l | --legend ]
netshow mgmt [all] [--mac | -m ] [--oneline | -1 | --json | -j | -l | --legend ]
netshow l2 [all] [--mac | -m ] [--oneline | -1 | --json | -j | -l | --legend ]
netshow l3 [all] [--mac | -m ] [--oneline | -1 | --json | -j | -l | --legend ]
netshow trunks [all] [--mac | -m ] [--oneline | -1 | --json | -j | -l | --legend ]
netshow (--version | -V)
Help:
* default is to show intefaces only in the UP state.
counters summary of physical port counters.
interface summary info of all interfaces
Options:
all show all ports include those are down or admin down
--mac show inteface MAC in output
--version netshow software version
--oneline output each entry on one line
-1 alias for --oneline
--json print output in json
-l alias for --legend
--legend print legend key explaining abbreviations
cumulus@leaf1$
A Linux administrator can quickly see the few options available with the tool. One core tenet of
netshow is for it to have a small number of command options. netshow is not designed to solve your
network problem, but to help answer this simple question: "What is the basic network setup of my
Linux device?" By helping to answer that question, a Linux administrator can spend more time
troubleshooting the specific network problem instead of spending most of their time understanding
the basic network state.
Originally developed for Cumulus Linux, netshow works on Debian-based servers and switches and
Red Hat-based Linux systems.
netshow is designed by network operators, which has rarely occurred in the networking industry,
where most command troubleshooting tools are designed by developers and are most useful in the
network application development process.
Showing Interfaces
To show all available interfaces that are physically UP, run netshow interface:
You can get information about the switch itself by running netshow system:
UpTime: [Link]
cumulus@leaf1$
For server2, netshow can help us see the OpenStack network configuration. The netshow output
below shows a summary of a Kilo-based OpenStack server running 3 tenants.
OpenStack interface numbering is not the easiest read, but here netshow can quickly show you:
A list of all the interfaces in admin UP state and carrier UP state
3 bridges
That STP is disabled for all the bridges
An uplink trunk interface with 3 VLANs configured on it
Many tap interfaces, most likely the virtual machines
This output took about 5 seconds to get and another 1 minute to analyze. To get this same level of
understanding using traditional tools such as:
ip link show
brctl show
ip addr show
... could take about 10 minutes. This is a significant improvement in productivity!
netshow uses a plugin architecture and can be easily expanded. An OpenStack interface discovery
module is currently in development. If netshow is run on a hypervisor with OpenStack Keystone login
environment variables like OS_TENANT_NAME, netshow should show the above output with a better
interface discovery state, where netshow collects OpenStack information from libvirt, nova
and neutron to overlay the virtual machine and tenant subnet information over the interface kernel
state information.
Interface discovery is one of the most powerful features of netshow. The ability to expand its interface
discovery capabilities further simplifies understanding basic network troubleshooting, making the
Linux administrator more productive and improving time to resolution while investigating network
problems.
Contributions Welcome!
netshow is an open source project licensed under GPLv2. To contribute please contact Cumulus
Networks through the Cumulus Community Forum or the Netshow Linux Provider Github Repository
Home. You can find developer documentation at [Link]. The documentation is still
under development.
Contents
Contents (see page 372)
Commands (see page 372)
Monitoring Interfaces Using ethtool (see page 372)
Viewing and Clearing Interface Counters (see page 374)
Monitoring Switch Port SFP/QSFP Using ethtool (see page 375)
Commands
cl-netstat
ethtool
HwIfOutQDrops: 0
HwIfOutNonQDrops: 0
SoftOutErrors: 0
SoftOutDrops: 0
SoftOutTxFifoFull: 0
HwIfOutQLen: 0
Option Description
-c Copies and clears statistics. It does not clear counters in the kernel or hardware.
routes: 8092 <<<< if all routes are IPv6, or 16384 if all routes are IPv4
long mask routes 2048 <<<< these are routes with a mask longer than the
route mask limit
route mask limit 64
host_routes: 8192
ecmp_nhs: 16346
ecmp_nhs_per_route: 52
This translates to about 314 routes with ECMP next hops, if every route has the maximum ECMP NHs.
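The figure of about 314 routes follows directly from the two limits listed above, as this short check shows. The helper name max_full_ecmp_routes is hypothetical.

```python
def max_full_ecmp_routes(ecmp_nhs, ecmp_nhs_per_route):
    """Number of routes that fit if every route uses the maximum ECMP next hops."""
    return ecmp_nhs // ecmp_nhs_per_route

# Limits taken from the listing above.
routes = max_full_ecmp_routes(16346, 52)
print(routes)
```

Routes with fewer next hops consume fewer entries, so the practical limit is usually higher than this worst case.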
For systems with a Trident+ chipset, the limits are as follows:
This translates to about 77 routes with ECMP next hops, if every route has the maximum ECMP NHs.
You can monitor this in Cumulus Linux with the cl-resource-query command. Results vary between
switches running on Trident+ and Trident II chipsets.
cl-resource-query results for a Trident II switch:
IPv6 neighbors: 0
IPv4/IPv6 entries: 33, 0% of maximum value 16284
Long IPv6 entries: 0, 0% of maximum value 256
IPv4 Routes: 29
IPv6 Routes: 2
Total Routes: 31, 0% of maximum value 32768
ECMP nexthops: 0, 0% of maximum value 4041
MAC entries: 0, 0% of maximum value 131072
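Because the percentages in this output are rounded to whole numbers, it can be useful to compute utilization from the raw counts. The following sketch parses lines of the form shown above; the regular expression and the function name parse_usage are assumptions based only on this sample, and other lines or platforms may be formatted differently.

```python
import re

# Sample lines copied from the output above.
SAMPLE = """\
IPv4/IPv6 entries: 33, 0% of maximum value 16284
Total Routes: 31, 0% of maximum value 32768
ECMP nexthops: 0, 0% of maximum value 4041
MAC entries: 0, 0% of maximum value 131072
"""

LINE = re.compile(
    r"^(?P<name>[^:]+):\s+(?P<used>\d+),\s+\d+% of maximum value\s+(?P<limit>\d+)$"
)

def parse_usage(text):
    """Return {resource name: (used, limit)} for lines that report a maximum."""
    usage = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            usage[match["name"]] = (int(match["used"]), int(match["limit"]))
    return usage

usage = parse_usage(SAMPLE)
for name, (used, limit) in usage.items():
    print(f"{name}: {used / limit:.2%}")
```

Lines without a maximum value, such as the IPv4 Routes count, are skipped by this sketch.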
Contents
Contents (see page 377)
Commands (see page 377)
Monitoring Hardware Using decode-syseeprom (see page 378)
Command Options (see page 378)
Related Commands (see page 379)
Monitoring Hardware Using sensors (see page 379)
Command Options (see page 380)
Monitoring Switch Hardware Using SNMP (see page 380)
Starting SNMP daemon (see page 380)
Managing the Switch (see page 381)
Public Community Disabled (see page 383)
Monitoring System Units Using smond (see page 383)
Command Options (see page 384)
Keeping the Switch Alive Using the Hardware Watchdog (see page 384)
Configuration Files (see page 385)
Useful Links (see page 385)
Commands
decode-syseeprom
dmidecode
lshw
sensors
smond
cumulus@switch:~# decode-syseeprom
TlvInfo Header:
Id String: TlvInfo
Version: 1
Total Length: 114
TLV Name Code Len Value
-------------------- ---- --- -----
Product Name 0x21 4 4804
Part Number 0x22 14 R0596-F0009-00
Device Version 0x26 1 2
Serial Number 0x23 19 D1012023918PE000012
Manufacture Date 0x25 19 10/09/2013 [Link]
Base MAC Address 0x24 6 [Link]
MAC Addresses 0x2A 2 53
Vendor Name 0x2D 17 Penguin Computing
Label Revision 0x27 4 4804
Manufacture Country 0x2C 2 CN
CRC-32 0xFE 4 0x96543BC5
(checksum valid)
Command Options
Usage: /usr/cumulus/bin/decode-syseeprom [-a][-r][-s [args]][-t]
Option Description
-s Sets the EEPROM content if the EEPROM is writable. args can be supplied in command line
in a comma separated list of the form '<field>=<value>, ...'. ',' and '=' are
illegal characters in field names and values. Fields that are not specified will default to their
current values. If args are supplied in the command line, they will be written without
confirmation. If args is empty, the values will be prompted interactively.
-t Selects the target EEPROM (board, psu2, psu1) for the read or write operation; default is
TARGET board.
Related Commands
You can also use the dmidecode command to retrieve hardware configuration information that’s been
populated in the BIOS.
You can use apt-get to install the lshw program on the switch, which also retrieves hardware
configuration information.
cumulus@switch:~$ sensors
tmp75-i2c-6-48
Adapter: i2c-1-mux (chan_id 0)
temp1: +39.0 C (high = +75.0 C, hyst = +25.0 C)
tmp75-i2c-6-49
Adapter: i2c-1-mux (chan_id 0)
temp1: +35.5 C (high = +75.0 C, hyst = +25.0 C)
ltc4215-i2c-7-40
Adapter: i2c-1-mux (chan_id 1)
in1: +11.87 V
in2: +11.98 V
power1: 12.98 W
curr1: +1.09 A
max6651-i2c-8-48
Adapter: i2c-1-mux (chan_id 2)
Output from the sensors command varies depending upon the switch hardware you use, as
each platform ships with a different type and number of sensors.
Command Options
Usage: sensors [OPTION]... [CHIP]...
Option Description
-c, --config-file Specify a config file; use - after -c to read the config file from stdin; by default,
sensors references the configuration file in /etc/sensors.d/.
-s, --set Executes set statements in the config file (root only); sensors -s is run once at boot
time and applies all the settings to the boot drivers.
If [CHIP] is not specified in the command, all chip info will be printed. Example chip names include:
lm78-i2c-0-2d *-i2c-0-2d
lm78-i2c-0-* *-i2c-0-*
lm78-i2c-*-2d *-i2c-*-2d
lm78-i2c-*-* *-i2c-*-*
lm78-isa-0290 *-isa-0290
lm78-isa-* *-isa-*
lm78-*
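The wildcard chip names above can be approximated offline with shell-style pattern matching. The following is a minimal sketch, assuming a list of chip names copied from sensors output; the select_chips helper is hypothetical, and Python's fnmatch is only an approximation of how sensors itself parses chip names (for instance, a * in fnmatch may span hyphens).

```python
from fnmatch import fnmatchcase


def select_chips(chip_names, patterns):
    """Return the chip names matching any pattern; all names if no pattern is given.

    Hypothetical helper: approximates the [CHIP] argument with fnmatch wildcards.
    """
    if not patterns:
        # Mirrors the documented behaviour: no [CHIP] means every chip is printed.
        return list(chip_names)
    return [
        name
        for name in chip_names
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


# Chip names taken from the sample sensors output earlier in this chapter.
CHIPS = [
    "tmp75-i2c-6-48",
    "tmp75-i2c-6-49",
    "ltc4215-i2c-7-40",
    "max6651-i2c-8-48",
]

if __name__ == "__main__":
    print(select_chips(CHIPS, ["tmp75-*"]))
    print(select_chips(CHIPS, ["*-i2c-*-48"]))
```

A sketch like this can help decide which pattern to pass to sensors before running it on the switch.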
jdoo and monit are mutually exclusive, so the monit package is not installed on Cumulus
Linux 2.5.2 and later. If you would prefer to use monit, installing it uninstalls jdoo from
Cumulus Linux. However, Cumulus Networks will not provide support for issues with monit. Read this
knowledge base article for more information about upgrading to jdoo.
##############################################################################
## Services
##############################################################################
check process snmpd with pidfile /var/run/[Link]
every 6 cycles
group networking
start program = "/etc/init.d/snmpd start"
stop program = "/etc/init.d/snmpd stop"
jdoo takes care of monitoring snmpd and starts the service, if it is not already running.
UDP
UCD-SNMP (For information on exposing CPU and memory information via SNMP, see this
knowledge base article.)
IF-MIB
LLDP (note, you need to enable the SNMP subagent (see page 138) in LLDP)
LM-SENSORS MIB
NET-SNMP-EXTEND-MIB (See also this knowledge base article on extending NET-SNMP in
Cumulus Linux to include data from power supplies, fans and temperature sensors.)
Resource utilization: Cumulus Linux includes its own resource utilization MIB, which is similar to
using cl-resource-query. It monitors L3 entries by host, route, nexthops, ECMP groups and
L2 MAC/BPDU entries. The MIB is defined in /usr/share/snmp/Cumulus-Resource-Query-[Link].
Discard counters: Cumulus Linux also includes its own counters MIB, defined in /usr/share
/snmp/[Link].
The overall Cumulus Linux MIB is defined in /usr/share/snmp/[Link].
Some MIBs, like storage information, are not included by default in [Link] in Cumulus
Linux. This causes some default views in common network tools (like librenms) to return
less than optimal data.
To include more of these MIBs, consider enabling all of the .[Link].2.1 range. This provides
for a very simple configuration file with little worry of any "default" MIBs being missed by the
monitoring system. However, this grants access to a large number of MIBs (all of the MIB2
MIBs), which could reveal more data than expected and consume more CPU resources.
To enable the .[Link].2.1 range, replace lines 39 - 71 in [Link] with the following code
snippet:
###############################################################################
#
# ACCESS CONTROL
#
# system
view systemonly included .[Link].2.1
# quagga ospf6
view systemonly included .[Link].3.102
# lldpd
view systemonly included .1.0.8802.1.1.2
#lmsensors
view systemonly included .[Link].4.1.2021.13.16
# Cumulus specific
view systemonly included .[Link].4.1.40310.1
view systemonly included .[Link].4.1.40310.2
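To reason about which OIDs a set of "view ... included" lines exposes, it can help to model each line as a subtree prefix. The following is a minimal sketch, assuming only plain included subtrees (no masks and no excluded entries, both of which snmpd also supports); the helper names are hypothetical, and the lldpd subtree .1.0.8802.1.1.2 is taken from the snippet above.

```python
def parse_oid(oid):
    """Convert a dotted OID string such as '.1.0.8802' into a tuple of integers."""
    return tuple(int(part) for part in oid.strip(".").split("."))


def oid_in_view(oid, included_subtrees):
    """Return True if the OID falls under any included subtree.

    Comparison is done component by component, so '.1.0.8802.1.1.20' is not
    treated as being inside '.1.0.8802.1.1.2' (a plain string prefix test would
    get this wrong).
    """
    target = parse_oid(oid)
    for subtree in included_subtrees:
        prefix = parse_oid(subtree)
        if target[: len(prefix)] == prefix:
            return True
    return False


if __name__ == "__main__":
    view = [".1.0.8802.1.1.2"]  # the lldpd subtree from the snippet
    print(oid_in_view(".1.0.8802.1.1.2.1.3.1", view))
    print(oid_in_view(".1.0.8802.1.1.20", view))
```

Widening a view to a shorter prefix exposes every OID below it, which is why enabling a broad range reveals more data than a list of narrow subtrees.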
cumulus@switch:~$ smonctl
Board : OK
Fan : OK
PSU1 : OK
PSU2 : BAD
Temp1 (Networking ASIC Die Temp Sensor ): OK
Temp10 (Right side of the board ): OK
Temp2 (Near the CPU (Right) ): OK
Temp3 (Top right corner ): OK
Temp4 (Right side of Networking ASIC ): OK
Temp5 (Middle of the board ): OK
Temp6 (P2020 CPU die sensor ): OK
Command Options
Usage: smonctl [OPTION]... [CHIP]...
Option Description
run_watchdog=1
To disable the watchdog, edit the /etc/watchdog.d/<your_platform> file and set run_watchdog
to 0:
run_watchdog=0
You can modify the settings for the watchdog — like the timeout setting and scheduler priority — in its
configuration file, /etc/[Link].
Configuration Files
/etc/cumulus/[Link]
/etc/cumulus/[Link]
/etc/sensors.d/<switch>.conf - sensor configuration file (do not edit it!)
/etc/[Link]
Useful Links
[Link]
[Link]
Net-SNMP tutorials
Contents
Contents (see page 385)
Installing hsflowd (see page 385)
Configuring sFlow (see page 385)
Configuring sFlow via DNS-SD (see page 386)
Manually Configuring /etc/[Link] (see page 386)
Configuring sFlow Visualization Tools (see page 387)
Configuration Files (see page 387)
Useful Links (see page 387)
Installing hsflowd
To download and install the hsflowd package, use apt-get:
Configuring sFlow
You can configure hsflowd to send to the designated collectors via two methods:
DNS service discovery (DNS-SD)
Manually configuring /etc/[Link]
The above snippet instructs hsflowd to send sFlow data to collector1 on port 6343 and to collector2
on port 6344. hsflowd will poll counters every 20 seconds and sample 1 out of every 2048 packets.
After the initial configuration is ready, bring up the sFlow daemon by running:
DNSSD = off
sampling.1G=2048
sampling.10G=4096
sampling.40G=8192
collector {
ip = [Link]
udpport = 6343
}
collector {
ip = [Link]
udpport = 6344
}
This configuration polls the counters every 20 seconds, samples 1 of every 2048 packets and sends this
information to a collector at [Link] on port 6343 and to another collector at [Link] on port
6344.
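To double-check which collectors and sampling rates a configuration like the one above defines, the settings can be pulled out with a few lines of parsing. This is a minimal sketch, assuming the simple "key = value" and "collector { ... }" layout shown in the snippet; the parse_hsflowd_snippet function is hypothetical, and the collector addresses 192.0.2.1 and 192.0.2.2 are placeholder documentation addresses, not values from this guide.

```python
def parse_hsflowd_snippet(text):
    """Split a flat hsflowd-style snippet into top-level settings and collectors.

    Returns (settings, collectors): a dict of top-level key/value strings and a
    list of dicts, one per collector block. Unrecognized lines are ignored.
    """
    settings = {}
    collectors = []
    block = None  # dict for the collector block being read, if any
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("collector") and line.endswith("{"):
            block = {}
        elif line == "}":
            if block is not None:
                collectors.append(block)
                block = None
        elif "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            target = block if block is not None else settings
            target[key] = value
    return settings, collectors


SAMPLE = """
DNSSD = off
sampling.1G=2048
collector {
ip = 192.0.2.1
udpport = 6343
}
collector {
ip = 192.0.2.2
udpport = 6344
}
"""

if __name__ == "__main__":
    settings, collectors = parse_hsflowd_snippet(SAMPLE)
    print(settings)
    print(collectors)
```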
Some collectors require each source to transmit on a different port; others may listen on only
one port. Please refer to the documentation for your collector for more information.
Configuration Files
/etc/[Link]
Useful Links
sFlow Collectors
sFlow Wikipedia page
Contents
Contents (see page 388)
Sample VXLAN Statistics (see page 388)
Sample VLAN Statistics (see page 389)
For VLANs Using the non-VLAN-aware Bridge Driver (see page 389)
For VLANs Using the VLAN-aware Bridge Driver (see page 390)
Configuring the Counters in switchd (see page 390)
Configuring the Poll Interval (see page 391)
Configuring Internal VLAN Statistics (see page 391)
Clearing Statistics (see page 391)
Caveats and Errata (see page 391)
23201498 227514 0 0 0 0
TX: bytes packets errors dropped carrier collsns
18198262 178443 0 0 0 0
If you change one of these settings on the fly, the new configuration applies only to those
VNIs or VLANs set up after the configuration changed; previously allocated counters remain
as is.
#[Link].show_internal_vlans = FALSE
Clearing Statistics
Since ethtool is not supported for virtual devices, you cannot clear the statistics cache maintained by
the kernel. You can clear the hardware statistics via switchd:
by the default ACLs in Cumulus Linux, so the CPU might receive fewer than the 500 packets if
the incoming packet rate is too high. The TX counter for the bridge should be equal to 500*
(number of ports in the bridge - incoming port + CPU port) or just 500 * number of ports in the
bridge.
You cannot use ethtool -S for virtual devices. This is because the counters available via
netdev are sufficient to display the vlan/vxlan counters currently supported in the hardware
(only rx/tx packets/bytes are supported currently).
Example output:
cumulus@switch:~$ ls /var/support
cl_support__switch_20141204_203833
The cl-support command generates a tar archive of useful information for troubleshooting that
can be auto-generated or manually created. To manually create it, run the cl-support command.
The cl-support file is automatically generated when: (see page 392)
Understanding the File Naming Scheme (see page 393)
cl_support_ This is always prepended to the [Link] output.
switch This is the hostname of the switch where cl-support was executed.
20141204 The date in year, month, day; so 20141204 is December 4th, 2014.
203833 The time in hours, minutes, seconds; so 203833 is 20, 38, 33 ([Link]) or the equivalent of
8:38:33 PM.
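The naming scheme described above can be decoded mechanically, which is handy when sorting many archives by switch or by time. This is a minimal sketch, assuming names shaped like the cl_support__switch_20141204_203833 example; the parse_cl_support_name function is hypothetical and is not part of cl-support.

```python
import re
from datetime import datetime

# Prefix, hostname, YYYYMMDD date and HHMMSS time, separated by underscores.
# One or more underscores are accepted after the prefix, since the example
# directory name in this guide shows two.
_NAME_PATTERN = re.compile(
    r"^cl_support_+(?P<host>.+)_(?P<date>\d{8})_(?P<time>\d{6})$"
)


def parse_cl_support_name(name):
    """Return (hostname, datetime) decoded from a cl_support directory name."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError("not a cl_support name: %r" % name)
    stamp = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")
    return match["host"], stamp


if __name__ == "__main__":
    host, stamp = parse_cl_support_name("cl_support__switch_20141204_203833")
    print(host, stamp.isoformat())
```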
Option Description
cumulus@switch:~$ ls -l cl_support__switch_20141204_203834/
The cl_support file, when untarred, contains a [Link] file. This file indicates what reason
triggered the event. When contacting Cumulus Networks technical support, please attach the cl-
support file if possible.
The directory contains the following elements:
Directory Description
cl-support This is a copy of the cl-support script that generated the cl_support file. It is copied
so Cumulus Networks knows exactly which files were included and which weren't. This
helps to fix future cl-support requests.
core Contains the core files generated from the Cumulus Linux HAL (hardware abstraction
layer) process, switchd.
etc etc is the core system configuration directory. cl-support replicates the switch’s /etc
directory. /etc contains all the general Linux configuration files, as well as
configurations for the system’s network interfaces, quagga, jdoo, and other packages.
var/log /var is the "variable" subdirectory, where programs record runtime information. System
logging, user tracking, caches and other files that system programs create and monitor
go into /var. cl-support includes only the log subdirectory of the var system-level
directory and replicates the switch’s /var/log directory. Most Cumulus Linux log files
are located in this directory. Notable log files include [Link], [Link],
quagga log files, and syslog. For more information, read this knowledge base article.
proc proc (short for processes) provides system statistics through a directory-and-file
interface. In Linux, /proc contains runtime system information (like system memory,
devices mounted, and hardware configuration). cl-support simply replicates the switch’s
/proc directory to determine the current state of the system.
support support is not a replica of the Linux file system like the other folders listed above.
Instead, it is a set of files containing the output of commands from the command line.
Examples include the output of ps -aux, netstat -i, and so forth; even the routing
tables are included.
This guide on NixCraft is amazing for understanding how /var/log works. The green highlighted rows
below are the most important logs and are usually looked at first when debugging.
/var/log/alternatives.log Information from update-alternatives is logged into this log file.
/var/log/apt The apt utility can send log information here; for example, from
apt-get install and apt-get remove.
/var/log/audit/ Contains log information stored by the Linux audit daemon, auditd.
/var/log/btmp This file contains information about failed login attempts. Use the
last command to view the btmp file. For example:
/var/log/dmesg Contains kernel ring buffer information. When the system boots up,
it prints a number of messages on the screen that display information
about the hardware devices that the kernel detects during the boot
process. These messages are available in the kernel ring buffer, and
whenever a new message arrives, the old message gets overwritten.
You can also view the content of this file using the dmesg command.
dmesg is one of the few places to determine hardware errors.
/var/log/faillog Contains failed user login attempts. Use the faillog command to
display the contents of this file.
/var/log/fsck/* The fsck utility is used to check and optionally repair one or more
Linux filesystems.
/var/log/news/* The news command keeps you informed of news concerning the
system.
/var/log/syslog The main system log, which logs everything except auth-related
messages. The primary log; it's easiest to grep this file to see what
occurred during a problem.
File Description
/etc/nologin nologin prevents unprivileged users from logging into the system.
This is the alphabetical listing of the output from running ls -l on the /etc directory structure
created by cl-support. The green highlighted rows are the ones Cumulus Networks finds most
important when troubleshooting problems.
[Link]
lsb-release Shows the current version of Linux on the system. Run cat /etc/lsb-release for output.
This shows you the version of the operating system you are running; also compare this to
the output of cl-img-select.
[Link]
network Contains the network interface configuration for ifup and ifdown. The main configuration
file is under /etc/network/interfaces. This is where you configure L2 and L3 information
for all of your front panel ports (swp interfaces). Settings like MTU, link speed, IP address
information, VLANs are all done here.
[Link]
ptm.d The directory containing scripts that are run if PTM (see page 139) passes or fails.
Cumulus Linux-specific folder for PTM (prescriptive topology manager).
[Link] Resolver configuration file, which is where DNS is set (domain, nameserver and search).
You need DNS to reach the Cumulus Linux repository.
rmt
securetty This file lists terminals into which the root user can
log in.
timezone If this file exists, it is read and its contents are used
as the time zone name.
[Link]
support/[Link] This shows you all the interfaces (including swp front panel ports), IP
address information, admin state and physical state.
cumulus@sw $ ip addr show
Contents
Contents (see page 408)
Identifying Active Listener Ports for IPv4 and IPv6 (see page 409)
Identifying Daemons Currently Active or Stopped (see page 409)
For example:
For example:
Contents
Contents (see page 411)
Enabling Logging for Networking (see page 411)
Using ifquery to Validate and Debug Interface Configurations (see page 412)
Debugging Mako Template Errors (see page 414)
ifdown Cannot Find an Interface that Exists (see page 415)
Removing All References to a Child Interface (see page 415)
MTU Set on a Logical Interface Fails with Error: "Numerical result out of range" (see page 416)
Interpreting iproute2 batch Command Failures (see page 416)
Understanding the "RTNETLINK answers: Invalid argument" Error when Adding a Port to a Bridge
(see page 417)
$cat /etc/default/networking
#
#
# Parameters for the /etc/init.d/networking script
#
#
# Exclude interfaces
EXCLUDE_INTERFACES=
Use ifquery --check to check the current running state of an interface within the interfaces file. It
returns exit code 0 if the running state matches the configuration, or 1 if it does not. The line
bond-xmit-hash-policy layer3+7 below fails because it should read bond-xmit-hash-policy layer3+4.
Use ifquery --running to print the running state of interfaces in the interfaces file format:
ifquery --syntax-help provides help on all possible attributes supported in the interfaces file.
For complete syntax on the interfaces file, see man interfaces and man ifupdown-addons-
interfaces.
You can use ifquery --print-savedstate to check the ifupdown2 state database. ifdown works
only on interfaces present in this state database.
# ssim2 added
auto swp45
iface swp45
auto swp46
iface swp46
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet dhcp
auto bond1
iface bond1
bond-miimon 100
bond-slaves swp2 swp1
bond-mode 802.3ad
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
auto bond3
iface bond3
bond-miimon 100
bond-slaves swp8 swp6 swp7
bond-mode 802.3ad
bond-lacp-rate 1
bond-min-links 1
bond-xmit-hash-policy layer3+4
auto br0
iface br0
bridge-ports swp3 swp5 bond1 swp4 bond3
bridge-pathcosts swp3=4 swp5=4 swp4=4
address [Link]/24
address 2001::10/64
Notice that bond1 is a member of br0. If you comment out or simply delete bond1 from /etc/network
/interfaces, you must remove the reference to it from the br0 configuration. Otherwise, if you
reload the configuration with ifreload -a, bond1 is still part of br0.
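Before deleting an interface stanza, it can be useful to list every other stanza that still mentions it. The following is a minimal sketch, assuming a plain interfaces file without port globs, ranges or Mako templates; the find_references helper is hypothetical and is not part of ifupdown2.

```python
def find_references(interfaces_text, child):
    """List (parent_interface, attribute) pairs whose attribute line names child.

    Only exact, whitespace-separated names are matched, so entries such as
    'swp3=4' in bridge-pathcosts are not reported for 'swp3'.
    """
    hits = []
    current = None  # interface whose stanza is currently being read
    for raw in interfaces_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        words = line.split()
        if words[0] == "iface" and len(words) > 1:
            current = words[1]
        elif words[0] == "auto":
            continue
        elif current is not None and current != child and child in words[1:]:
            hits.append((current, words[0]))
    return hits


SAMPLE = """
auto bond1
iface bond1
    bond-slaves swp2 swp1
    bond-mode 802.3ad

auto br0
iface br0
    bridge-ports swp3 swp5 bond1 swp4 bond3
    bridge-pathcosts swp3=4 swp5=4 swp4=4
"""

if __name__ == "__main__":
    # Shows that br0 still refers to bond1 through its bridge-ports line.
    print(find_references(SAMPLE, "bond1"))
```

Each reported pair points at an attribute line to edit before running ifreload -a.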
MTU Set on a Logical Interface Fails with Error: "Numerical result out of range"
This error occurs when the MTU (see page 106) you are trying to set on an interface is higher than the
MTU of the lower interface or dependent interface. Linux expects the upper interface to have an MTU
less than or equal to the MTU on the lower interface.
In the example below, the swp1.100 VLAN interface is an upper interface to physical interface swp1. If
you want to change the MTU to 9000 on the VLAN interface, you must include the new MTU on the
lower interface swp1 as well.
auto swp1.100
iface swp1.100
mtu 9000
auto swp1
iface swp1
mtu 9000
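The rule that an upper interface's MTU must not exceed the MTU of its lower interface can be checked before applying a configuration. This is a minimal sketch, assuming VLAN subinterfaces named lower.vlan (such as swp1.100) and a hand-built dictionary of planned MTU values; the mtu_violations helper and the 1500 value in the usage example are illustrative assumptions.

```python
def mtu_violations(mtus):
    """Return (upper, upper_mtu, lower, lower_mtu) for each subinterface whose
    MTU is larger than the MTU of the interface it is built on.

    mtus maps interface names to planned MTU values. Subinterfaces whose lower
    interface is missing from the mapping are skipped.
    """
    problems = []
    for name, mtu in sorted(mtus.items()):
        if "." not in name:
            continue  # not a VLAN subinterface
        lower = name.rsplit(".", 1)[0]
        lower_mtu = mtus.get(lower)
        if lower_mtu is not None and mtu > lower_mtu:
            problems.append((name, mtu, lower, lower_mtu))
    return problems


if __name__ == "__main__":
    # Raising only the VLAN interface to 9000 is the failing case.
    print(mtu_violations({"swp1": 1500, "swp1.100": 9000}))
    # Raising both interfaces, as in the example above, passes.
    print(mtu_violations({"swp1": 9000, "swp1.100": 9000}))
```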
error: failed to execute cmd 'ip -force -batch - [link set dev host2 master
bridge
addr flush dev host2
link set dev host1 master bridge
addr flush dev host1
]'(RTNETLINK answers: Invalid argument
Command failed -:1)
warning: bridge configuration failed (missing ports)
Network Troubleshooting
Cumulus Linux contains a number of command line and analytical tools to help you troubleshoot
issues with your network.
Contents
Contents (see page 417)
Commands (see page 418)
Checking Reachability Using ping (see page 418)
Printing Route Trace Using traceroute (see page 418)
Manipulating the System ARP Cache (see page 419)
Generating Traffic Using mz (see page 419)
Creating Counter ACL Rules (see page 420)
Configuring SPAN and ERSPAN (see page 421)
Configuring SPAN for Switch Ports (see page 422)
Configuring SPAN for Bonds (see page 425)
Configuring ERSPAN (see page 426)
Removing SPAN Rules (see page 427)
Monitoring Control Plane Traffic with tcpdump (see page 427)
Configuration Files (see page 428)
Useful Links (see page 428)
Commands
arp
cl-acltool
ip
mz
ping
tcpdump
traceroute
cumulus@switch:~$ arp -a
? ([Link]) at [Link] [ether] on swp3
? ([Link]) at [Link] [ether] on swp4
? ([Link]) at [Link] [ether] on swp1
For example, to send two sets of packets to TCP port 23 and 24, with source IP [Link] and destination
[Link], do the following:
IP: ver=4, len=40, tos=0, id=0, frag=0, ttl=255, proto=6, sum=0, SA=11.0.0.1, DA=[Link],
payload=[see next layer]
TCP: sp=0, dp=24, S=42, A=42, flags=0, win=10000, len=20, sum=0,
payload=
IP: ver=4, len=40, tos=0, id=0, frag=0, ttl=255, proto=6, sum=0, SA=11.0.0.1, DA=[Link],
payload=[see next layer]
TCP: sp=0, dp=23, S=42, A=42, flags=0, win=10000, len=20, sum=0,
payload=
IP: ver=4, len=40, tos=0, id=0, frag=0, ttl=255, proto=6, sum=0, SA=11.0.0.1, DA=[Link],
payload=[see next layer]
TCP: sp=0, dp=24, S=42, A=42, flags=0, win=10000, len=20, sum=0,
payload=
[iptables]
-A FORWARD -p tcp --dport 80 -j ACCEPT
The -p option clears out all other rules, and the -i option is used to reinstall all the rules.
----------------------------------------------------------
| MAC_HEADER | IP_HEADER | GRE_HEADER | L2_Mirrored_Packet |
----------------------------------------------------------
SPAN and ERSPAN are configured via cl-acltool, the same utility for security ACL configuration (see
page 71). The match criteria for SPAN and ERSPAN can only be an interface; more granular match
terms are not supported. The interface can be a port, a subinterface or a bond interface. Both ingress
and egress interfaces can be matched.
Cumulus Linux supports a maximum of 2 SPAN destinations. Multiple rules can point to the same SPAN
destination. The MTP interface can be a physical port, a subinterface, or a bond interface. The SPAN
/ERSPAN action is independent of security ACL actions. If packets match both a security ACL rule and a
SPAN rule, both actions will be carried out.
Using cl-acltool with the --out-interface rule applies to transit traffic only; it does not
apply to traffic sourced from the switch.
Running the following command is incorrect and will remove all existing control-plane rules
or other installed rules and only install the rules defined in [Link]:
Configuring ERSPAN
This section describes how to configure ERSPAN for all packets coming in from swp1 to [Link]:
First, create a rules file in /etc/cumulus/acl/policy.d/:
The src-ip option can be any IP address, whether it exists in the routing table or not. The dst-ip
option must be an IP address reachable via the routing table. The destination IP address must be
reachable from a front-panel port, and not the management port. Use ping or ip route get <ip>
to verify that the destination IP address is reachable. Setting the --ttl option is recommended.
When using Wireshark to review the ERSPAN output, Wireshark may report the message
"Unknown version, please report or test to use fake ERSPAN preference", and the trace is
unreadable. To resolve this, go into the General preferences for Wireshark, then go to
Protocols > ERSPAN and check the Force to decode fake ERSPAN frame option.
Configuration Files
/etc/cumulus/acl/[Link]
Useful Links
[Link]/sec/mz/[Link]
[Link]/wiki/Ping
[Link]
[Link]/wiki/Traceroute
SNMP Monitoring
Cumulus Linux 2.5.x utilizes the open source Net-SNMP agent snmpd, v5.4.3, which provides support
for most of the common industry-wide MIBs, including interface counters and TCP/UDP IP stack data.
Cumulus Linux does not prevent customers from extending SNMP features. However,
Cumulus Networks encourages the use of higher performance monitoring environments,
rather than SNMP.
Contents
Contents (see page 429)
Starting the SNMP Daemon (see page 429)
Configuring SNMP (see page 431)
Set up the Custom Cumulus MIBs (see page 431)
Enable the .[Link].2.1 Range (see page 432)
Enable Public Community (see page 432)
Generate Event Notification Traps (see page 433)
Enable MIB to OID Translation (see page 433)
Configure Trap Events (see page 434)
Supported MIBs (see page 437)
jdoo is a fork of monit version 5.2.5, and is included in Cumulus Linux 2.5.2 and later. For
more information about upgrading from monit to jdoo, see the jdoo upgrade knowledge
base article.
jdoo and monit are mutually exclusive. If you would prefer to use monit, the installation
process will uninstall jdoo. Cumulus Networks will not provide support for issues with monit.
1. Open /etc/default/snmpd to verify that SNMPDRUN=yes. If it does not, update the file to the
correct value.
2. Create an *.rc configuration file in the /etc/jdoo/jdoorc.d/ directory.
3. Add the following content to the [Link] file created in step 2, under the Services banner, and
save the file:
##############################################################################
## Services
##############################################################################
check process snmpd with pidfile /var/run/[Link]
every 6 cycles
group networking
start program = "/etc/init.d/snmpd start"
stop program = "/etc/init.d/snmpd stop"
5. Reload jdoo:
Natively:
Once the service is started, SNMP can be used to manage various components on the Cumulus Linux
switch.
Configuring SNMP
Cumulus Linux ships with a production usable default [Link] file included. This section covers a
few basic configuration options in [Link]. For more information regarding further configuring
this file, refer to the [Link] man page.
The default [Link] file does not include all supported MIBs or OIDs that can be
exposed.
Customers are encouraged to at least change the default community string for v1 or v2c
environments. SNMPv3 is encouraged where unencrypted SNMP data traversing the network is a
security concern.
However, several files need to be copied to the server, in order for the custom Cumulus MIB to be
recognized on the destination NMS server.
/usr/share/snmp/[Link]
/usr/share/snmp/[Link]
/usr/share/snmp/[Link]
This configuration grants access to a large number of MIBs, including all MIB2 MIBs, which
could reveal more data than expected, and consume more CPU resources.
###############################################################################
#
# ACCESS CONTROL
#
# system
view systemonly included .[Link].2.1
# quagga ospf6
view systemonly included .[Link].3.102
# lldpd
view systemonly included .1.0.8802.1.1.2
#lmsensors
view systemonly included .[Link].4.1.2021.13.16
# Cumulus specific
view systemonly included .[Link].4.1.40310.1
view systemonly included .[Link].4.1.40310.2
3. Restart snmpd:
apt-get install snmp-mibs-downloader
5. Open the /etc/snmp/[Link] file to verify that the mibs : line is commented out:
#
# As the snmp packages come without MIB files due to license reasons, loading
6. Open the /etc/default/snmpd file to verify that the export MIBS= line is commented out:
7. Once the configuration has been confirmed, remove or comment out the non-free repository
in /etc/apt/[Link].
createUser cumulusUser
iquerySecName cumulusUser
rouser cumulusUser
Although the traps are sent to an SNMPv2 receiver, the SNMPv3 user is still required.
It is possible to define multiple trap receivers, and to use the domain name instead of IP
address in the trap2sink directive.
linkUpDownNotifications yes
The default frequency for checking link up/down is 60 seconds. The default frequency can be
changed using the monitor directive directly instead of the linkUpDownNotifications
directive. See man [Link] for details.
Alternatively, temperature sensors may be monitored individually. To monitor the sensors individually,
first use the sensors command to determine which sensors are available to be monitored on the
platform.
#sensors
CY8C3245-i2c-4-2e
Adapter: i2c-0-mux (chan_id 2)
fan5: 7006 RPM (min = 2500 RPM, max = 23000 RPM)
fan6: 6955 RPM (min = 2500 RPM, max = 23000 RPM)
fan7: 6799 RPM (min = 2500 RPM, max = 23000 RPM)
fan8: 6750 RPM (min = 2500 RPM, max = 23000 RPM)
Configure a monitor command for the specific sensor using the -I option. The -I option indicates
that the monitored expression is applied to a single instance. In this example, there are five
temperature sensors available. The following monitor directive can be used to monitor only
temperature sensor three at five minute intervals.
load 12 10 5
monitor -r 60 -o laNames -o laErrMessage "laTable" laErrorFlag !=0
includeAllDisks 1%
monitor -r 60 -o dskPath -o dskErrorMsg "dskTable" dskErrorFlag !=0
authtrapenable 1
Supported MIBs
Below are the MIBs supported by Cumulus Linux 2.5.4, as well as suggested uses for them. The overall
Cumulus Linux MIB is defined in /usr/share/snmp/[Link].
CUMULUS-COUNTERS-MIB Discard counters: Cumulus Linux also includes its own counters MIB, defined in
/usr/share/snmp/[Link]. It has the OID .[Link].4.1.40310.2
CUMULUS-RESOURCE-QUERY-MIB Cumulus Linux includes its own resource utilization MIB, which is similar
to using cl-resource-query (see page 375). It monitors L3 entries by host, route, nexthops,
ECMP groups and L2 MAC/BPDU entries. The MIB is defined in /usr/share/snmp/[Link], and has
the OID .[Link].4.1.40310.1.
IF-MIB Interface description, type, MTU, speed, MAC, admin, operation status, counters
LLDP L2 neighbor info from lldpd (note, you need to enable the SNMP subagent (see page
138) in LLDP)
NET-SNMP-EXTEND-MIB (See also this knowledge base article on extending NET-SNMP in Cumulus Linux to
include data from power supplies, fans and temperature sensors.)
SNMP-TARGET
SNMPv2 SNMP counters (For information on exposing CPU and memory information via
SNMP, see this knowledge base article.)
Index
4
40G ports 110
logical limitations 110
8
802.1p 113
class of service 113
802.3ad link aggregation 200
A
ABRs 306
area border routers 306
access control lists 71
access ports 169
ACL policy files 75
ACL rules 115
ACLs 71
active-active mode 209, 266
VRR 209
VXLAN 266
active image slot 31
active listener ports 409
active-standby mode 209
VRR 209
Algorithm Longest Prefix Match 286
routing 286
ALPM mode 286
routing 286
alternate image slot 25, 30, 31, 35
accessing 35
installing a new image 25
selecting 30
AOC cables 11
apt-get 41
area border routers 306
ABRs 306
arp cache 419
B
bestpath 330
BGP 330
BFD 139, 143, 145
Bidirectional Forwarding Detection 139, 143
echo function 145
BGP 318, 321
Border Gateway Protocol 318
ECMP 321
BGP peering relationships 329, 329
external 329
internal 329
Bidirectional Forwarding Detection 139
bonds 151, 200
LACP Bypass 200
boot recovery 364
bpdufilter 132
and STP 132
BPDU guard 130
and STP 130
brctl 13, 119, 156, 157, 277, 277
and STP 119
IGMP snooping 277
MLD snooping 277
bridge assurance 129
and STP 129
bridges 154, 154, 155, 156, 157, 158, 158, 161, 164, 169, 169, 175, 183
access ports 169
adding interfaces 156, 157
adding IP addresses 161
IGMP snooping 183
C
cable connectivity 11
cabling 139
Prescriptive Topology Manager 139
cl-acltool 71, 115, 420
CLAG 209
and VRR 209
clagctl 193
class of service 113
cl-bgp 300
cl-cfg 84, 390
cl-ecmpcalc 346
cl-img-clear-overlay 37, 37
cl-img-install 25
cl-img-pkg 39
cl-img-select 30, 37, 38, 39
cl-license 11
cl-netstat 374
cl-ospf 300, 308
cl-ospf6 301, 316
Clos topology 289
cl-ra 301
cl-rctl 301
cl-resource-query 85, 375
cl-route-check 314
cl-support 361
convergence 288
routing 288
Cumulus Linux 7, 8, 17, 37, 37, 38, 183, 226
installing 7, 17
reprovisioning 37
reserved VLAN ranges 183
reverting 37
uninstalling 38
upgrading 8
VXLAN 226
cumulus user 61
D
DAC cables 11
daemons 408
datapath 113
[Link] 113
date 58
setting 58
deb 45
debugging 359
decode-syseeprom 378
differentiated services code point 113
dmidecode 379
dpkg 43
dpkg-reconfigure 57
DSCP 113
differentiated services code point 113
DSCP marking 115
dual-connected hosts 186
duplex interfaces 105
dynamic routing 146, 291
and PTM 146
quagga 291
E
eBGP 320
external BGP 320
ebtables 71, 74
memory spaces 74
echo function 145, 145
BFD 145
PTM 145
ECMP 291, 312, 321, 1
BGP 321
equal cost multi-pathing 291
monitoring 1
OSPF 312
ECMP hashing 346, 349
resilient hashing 349
EGP 292
F
fast convergence 328
BGP 328
fast leave 280
IGMP/MLD snooping 280
First Hop Redundancy Protocol 209
VRR 209
G
globs 100
Graphviz 139
H
hardware 377
monitoring 377
hardware compatibility list 7
hash distribution 154
HCL 7
head end replication 235
LNV 235
high availability 183, 290
host entries 375
monitoring 375
Host HA 183
hostname 9
hsflowd 385
hwclock 58
I
iBGP 320
internal BGP 320
ifdown 91
ifplugd 210
VRR 210
ifquery 95, 412
ifup 91
ifupdown 90
ifupdown2 99, 167, 411, 411, 412
excluding interfaces 412
logging 411
purging IP addresses 99
troubleshooting 411
VLAN tagging 167
IGMP snooping 183, 196, 275
MLAG 196
VLAN-aware bridges 183
IGP 292
Interior Gateway Protocol 292
image contents 39
image slots 31, 32, 33, 34
PowerPC 32
resizing 34
x86 33
installing 7
Cumulus Linux 7
interface counters 374
interface dependencies 94
interfaces 103, 111
statistics 111
internal BGP 320
iBGP 320
ip6tables 71
IP addresses 99
purging 99
iproute2 416
failures 416
iptables 71
IPv4 routes 321
BGP 321
IPv6 routes 321
BGP 321
J
jdoo 197, 380
L
LACP 151, 183
MLAG 183
LACP Bypass 200
layer 3 access ports 13
configuring 13
LDAP 69
leaf-spine topology 289
license 10
installing 10
lightweight network virtualization 232, 235, 235, 259
head end replication 235
service node replication 235
link aggregation 151
Link Layer Discovery Protocol 133
link-local IPv6 addresses 334
BGP 334
link pause 116
datapath 116
link-state advertisement 305
link state monitoring 210
VRR 210
LLDP 133, 138
SNMP 138
lldpcli 135
lldpd 133, 140
LNV 232, 232, 235, 235, 259, 259
head end replication 235
service node replication 235
VXLAN 232, 259
load balancing 291
logging 362, 411, 411
ifupdown2 411
networking service 411
logging neighbor state changes 333
BGP 333
M
MAC entries 375
monitoring 375
Mako templates 101, 414
debugging 414
mangle table 116
ACL rules 116
memory spaces 74
ebtables 74
MLAG 183, 194, 194, 195, 196, 198, 199, 274, 1
backup link 195
IGMP snooping 196
MTU 198
peer link states 194, 1
PROTO_DOWN state 194, 274
STP 199
MLD snooping 276
monitoring 57, 359, 372, 375, 384, 385, 387, 429
hardware watchdog 384
Net-SNMP 429
network traffic 385
mount points 33
mstpctl 119, 171
MTU 106, 158, 198, 416
bridges 158
failures 416
MLAG 198
Multi-Chassis Link Aggregation 183
MLAG 183
multiple bridges 159
mz 419
traffic generator 419
N
name service switch 68
Netfilter 71
Net-SNMP 380, 429
networking service 411
logging 411
network interfaces 90, 103
ifupdown 90
network traffic 385
monitoring 385
network troubleshooting 427
tcpdump 427
network virtualization 211, 212, 226
VMware NSX 212
no-as-set 330
BGP 330
nonatomic updates 73
switchd 73
non-blocking networks 290
NSS 68
name service switch 68
NTP 59
time 59
ntpd 59
O
ONIE 7, 39
rescue mode 39
Open Network Install Environment 7
Open Shortest Path First Protocol 305, 315
OSPFv2 305
OSPFv3 315
open source contributions 6
OSPF 1, 311, 312, 315
ECMP 312
reconvergence 312
summary LSA 1
supported RFCs 315
unnumbered interfaces 311
[Link] 317
OSPFv2 305
P
packages 40
managing 40
packet buffering 113
datapath 113
packet filtering 72
packet queueing 113
datapath 113
packet scheduling 113
datapath 113
PAM 68
pluggable authentication modules 68
parent interfaces 97
password 61
default 61
passwordless access 61
passwords 9
peer-groups 330
BGP 330
Per VLAN Spanning Tree 119
PVST 119
ping 418
pluggable authentication modules 68
[Link] 77
port lists 100
port speeds 105
Prescriptive Topology Manager 139
primary image slot 31
priority groups 113
datapath 113
privileged commands 62
PROTO_DOWN state 194, 274
MLAG 194, 274
protocol tuning 288, 336
BGP 336
routing 288
PTM 139, 145
Q
QSFP 375
Quagga 146, 284, 291, 293
and PTM 146
configuring 293
dynamic routing 291
static routing 284
quality of service 117
querier 280
IGMP/MLD snooping 280
R
Rapid PVST 119
PVRST 119
read-only mode 335
BGP 335
recommended configuration 26
reconvergence 312
OSPF 312
remote access 60
repositories 45
other packages 45
rescue mode 39
reserved VLAN ranges 183
resilient hashing 349
restart 85
switchd 85
root user 9, 61
route advertisements 320
BGP 320
S
sensors command 379
serial console management 9
service node replication 235
LNV 235
services 408
sFlow 385
sFlow visualization tools 387
SFP 112, 375
switch ports 112
single user mode 364
smonctl 383
smond 383
SNMP 380
snmpd 380, 429
[Link] 45
SPAN 421
network troubleshooting 421
spanning tree parameters 122
Spanning Tree Protocol 118, 175
STP 118
VLAN-aware bridges 175
SSH 60
SSH keys 60
static routing 282, 284
with ip route 282
with Quagga 284
STP 118, 129, 199
and bridge assurance 129
MLAG 199
Spanning Tree Protocol 118
T
tcpdump 427
network troubleshooting 427
templates 101
time 58
setting 58
time zone 10, 57
topology 139, 289
data center 139
traceroute 418
[Link] 113
traffic distribution 154
traffic generator 419
mz 419
traffic marking 115
datapath 115
troubleshooting 359, 364, 427
single user mode 364
tcpdump 427
trunk ports 164, 169
tzdata 57
U
U-Boot 7, 359
unnumbered interfaces 311, 317
OSPF 311
OSPFv3 317
untagged frames 164
bridges 164
upgrading 8
Cumulus Linux 8
user accounts 61
cumulus 61
root 61
user authentication 68
user commands 99
interfaces 99
V
virtual device counters 387, 391
monitoring 387
poll interval 391
VLAN statistics 391
virtual router redundancy 206
visudo 62
VLAN 188, 387
statistics 387
switched virtual interface 188
VLAN-aware bridges 154, 175, 176, 183
configuring 176
IGMP snooping 183
Spanning Tree Protocol 175
VLAN tagging 167, 169
advanced example 169
basic example 167
VLAN translation 174
VRR 206
virtual router redundancy 206
VTEP 211, 213
vtysh 296
Quagga CLI 296
W
watchdog 384
monitoring 384
Z
zebra 292
routing 292
zero touch provisioning 47, 50
USB 50
ZTP 47