Learning EVPN VXLAN First

I’ve found that learning resources for EVPN are much more plentiful for the data center than for the service provider. The drawback is that data center deployments typically use VXLAN for the data plane. However, as I’ve gone through some coursework for EVPN/VXLAN, I’ve found that the data plane is such a minor section that it really doesn’t make much difference. If your goal is to learn EVPN, whether you learn the EVPN/VXLAN data center flavor or EVPN/MPLS, the principles and operations of EVPN are the same. In fact, EVPN is specifically data-plane agnostic, so whether you are using MPLS or VXLAN, the EVPN operations fundamentally do not change. In your mind you can sort of substitute the VXLAN header and VNI for an MPLS header and MPLS label.

Therefore, I would strongly recommend taking advantage of some of the useful guides that are out there for EVPN/VXLAN, even if you will only ever operate an EVPN/MPLS network. This will give you the benefit of better writing and guides when learning EVPN. In general, there is a wider audience for EVPN/VXLAN, so there seems to be better introductory learning material out there. It will also make you a more well-rounded engineer.

Learn theory first

https://www.youtube.com/playlist?list=PLtO_OYBiEo6l0z5cQ1eaiFC6QAnFeVRcf

Follow the above YouTube playlist from Packet Pushers. It is an excellent resource for learning EVPN: it is only an hour and a half long, yet it covers the basics of EVPN in great detail. It purposefully avoids any configuration and covers EVPN only from a theory perspective.

Practice in the lab with Arista vEOS

Don’t worry if you’ve never used Arista before. If you know Cisco, you know Arista’s CLI. In fact, the CLI is so similar that Cisco sued Arista over it.

Download vEOS. I used version 4.24.11M. You will need to sign up for a free user account to get access to Arista’s free vEOS downloads.

From Arista’s website:

All that is required to get a copy of vEOS is a guest user registration and acceptance of end user licensing agreement, visit our software download page to get started. The vEOS directory contains the latest vEOS-lab vmdk and Aboot iso images.

Download the .vmdk file from Arista and convert it to .qcow2 using qemu-img. I performed the following steps on an Ubuntu Linux server to create the .qcow2 file.

sudo apt-get install qemu-utils
qemu-img convert -f vmdk -O qcow2 vEOS64-lab-4.24.11M.vmdk vEOS64-lab-4.24.11M.qcow2
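As a quick sanity check after the conversion, you can confirm that the output really is a qcow2 image. Running qemu-img info on the file is the usual way; the sketch below is just a scriptable alternative that looks at the first four bytes of the file, which for qcow2 are the magic value "QFI" followed by 0xfb. The function names are made up for this example.

```python
QCOW2_MAGIC = b"QFI\xfb"  # first four bytes of a qcow2 image


def has_qcow2_magic(header):
    """Return True if the given bytes start with the qcow2 magic value."""
    return header[:4] == QCOW2_MAGIC


def file_is_qcow2(path):
    """Read the first four bytes of a file and check them against the qcow2 magic."""
    with open(path, "rb") as image:
        return has_qcow2_magic(image.read(4))
```

For example, file_is_qcow2("vEOS64-lab-4.24.11M.qcow2") should return True for the converted image, while the original .vmdk should fail the check.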

Import the vEOS image into CML. I used the image and node definitions from this git repo, just changing the version number: https://github.com/colinmacgiolla/vEOS-on-CML/tree/main

ZTP

One thing to note about vEOS is that by default ZTP will continuously try to obtain a DHCP address on all ports. You will see these messages in the console over and over.

Oct  2 15:57:22 localhost ZeroTouch: %ZTP-6-RETRY: Retrying Zero Touch Provisioning from the beginning (attempt 1)
Oct  2 15:57:37 localhost ZeroTouch: %ZTP-6-DHCPv4_QUERY: Sending DHCPv4 request on  [ Ethernet1, Ethernet2, Ethernet3, Ethernet4, Ethernet5, Ethernet6, Ethernet7, Ethernet8, Ethernet9, Ethernet10, Ethernet11, Ethernet12, Management1 ]

To prevent this, you can turn off ZTP, as the banner in the running configuration explains. However, doing so reloads the switch.

localhost#show run | be Zero
The device is in Zero Touch Provisioning mode and is attempting to 
download the startup-config from a remote system. The device will not  
be fully functional until either a valid startup-config is downloaded 
from a remote system or Zero Touch Provisioning is cancelled.

To cancel Zero Touch Provisioning, login as admin and type 
'zerotouch cancel' at the CLI. Alternatively, to disable Zero Touch  
Provisioning permanently, type 'zerotouch disable' at the CLI.  
Note: The device will reload when these commands are issued. 

EOF
!
end

Issuing zerotouch disable prompts an immediate reload of the device.

EVPN Lab

Follow this EVPN Deployment Guide: http://allvpc.net/EVPN_Deployment_Guide.pdf

If the link is dead, you should be able to find the deployment guide on Arista’s website.

This is an excellent lab guide that goes into depth about EVPN and VXLAN theory, and then walks you through the process of setting up a basic EVPN/VXLAN data center in your lab. The guide describes the purpose of every command and goes over verification steps.

You will not be using the four switches towards the right-hand side of the topology. You do not need to create these, which will save you some setup time.

The guide does not cover VLAN 50, but I would recommend using this as an opportunity to try to set it up on your own at the end.

MLAG

Follow this guide to set up MLAG between A-LEAF2A and A-LEAF2B:

https://aristanetworks.force.com/AristaCommunity/s/article/mlag-basic-configuration

By default, MLAG would not work for me in CML. This appears to be due to an issue with the MAC addresses that CML assigns to ports. When they start with 52, vEOS runs into a problem when it attempts to create the virtual MAC for MLAG. See here: https://github.com/GNS3/gns3-gui/issues/2475. I believe if you use EVE-NG it may work by default.

You cannot manually change the MAC address on interfaces with the mac-address command under ports. Changing the MAC address on vEOS with the CLI simply doesn’t work due to the virtual switch’s lack of a physical data plane. Instead, the workaround is to change the system MAC of each switch using the following method: https://ztpserver.readthedocs.io/en/latest/tips.html#how-do-i-override-the-default-system-mac-in-veos

#A-LEAF2A

A-LEAF2A#bash

Arista Networks EOS shell

[admin@A-LEAF2A ~]$ echo 0411.1111.1111 > /mnt/flash/system_mac_address
[admin@A-LEAF2A ~]$ exit
logout
A-LEAF2A#reload now

#A-LEAF2B
A-LEAF2B#bash

Arista Networks EOS shell

[admin@A-LEAF2B ~]$ echo 0422.2222.2222 > /mnt/flash/system_mac_address
[admin@A-LEAF2B ~]$ exit
logout
A-LEAF2B#reload now

Now that the system MAC does not have the 7th or 8th bit of its first octet set to 1, the switches can properly generate the virtual MAC for MLAG, and you should see the MLAG session as up.

A-LEAF2A#show mlag
MLAG Configuration:               
domain-id                          :               mlag1
local-interface                    :            Vlan4094
peer-address                       :            10.0.0.2
peer-link                          :      Port-Channel10
peer-config                        :          consistent
                                                        
MLAG Status:                      
state                              :              Active
negotiation status                 :           Connected
peer-link status                   :                  Up
local-int status                   :                  Up
system-id                          :   06:11:11:11:11:11
dual-primary detection             :            Disabled
dual-primary interface errdisabled :               False
                                                        
MLAG Ports:                       
Disabled                           :                   0
Configured                         :                   0
Inactive                           :                   0
Active-partial                     :                   0
Active-full                        :                   2
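To make the "7th and 8th bit" remark concrete, here is a minimal sketch that checks those two bits of the first octet of a MAC address: the locally administered (U/L) bit, value 0x02, and the multicast (I/G) bit, value 0x01. The address starting with 52 is a made-up example of a CML-style MAC. The last comment reflects a reading of the show mlag output above (system-id 06:11:11:11:11:11) and is an assumption, not something taken from Arista documentation.

```python
def first_octet(mac):
    """Return the first octet of a MAC written as 0411.1111.1111 or 04:11:11:11:11:11."""
    digits = mac.replace(".", "").replace(":", "").replace("-", "")
    return int(digits[:2], 16)


def is_locally_administered(mac):
    """U/L bit: the 7th bit of the first octet counting from the left (0x02)."""
    return bool(first_octet(mac) & 0x02)


def is_multicast(mac):
    """I/G bit: the 8th bit of the first octet counting from the left (0x01)."""
    return bool(first_octet(mac) & 0x01)


# Hypothetical port MAC starting with 52, plus the two system MACs set above.
for mac in ("52:54:00:aa:bb:cc", "0411.1111.1111", "0422.2222.2222"):
    print(mac, is_locally_administered(mac), is_multicast(mac))

# Assumption: the MLAG system-id looks like the system MAC with the U/L bit set,
# since 0x04 | 0x02 == 0x06.
```

An address starting with 52 has the locally administered bit set, while the 04xx addresses written to /mnt/flash/system_mac_address have neither bit set.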

If for some reason you cannot get MLAG to work in your own lab environment, you could simply skip this step and just use a single switch. It doesn’t prevent you from learning EVPN. In fact, you will later remove MLAG and configure multihoming with standalone switches to learn type-1 and type-4 routes.

vEOS changes to the deployment guide

Do not use update wait-install under BGP. This tells the switch to wait to send a BGP update for a prefix until it is installed in hardware. As you can imagine, this is problematic with a virtual switch image. In my own lab, my spines would never send updates from one leaf to another until I realized that I needed to turn this feature off.

Do not use BFD. You may be able to set BFD timers high enough for it to not be an issue, but you don’t really need to use BFD to achieve the learning objectives of this lab guide.

In 4.24 the syntax for a few commands was different. However, I never got stuck, because the OS would let you know if you were trying to use a deprecated command, and what the new command was.

A-LEAF2A(config)#vrf definition TEST
% Unavailable command (This command is deprecated by 'vrf instance [VRF_ID]')
A-LEAF2A(config)#vrf instance TEST
A-LEAF2A(config-vrf-TEST)#

Also, make sure you enable IP routing for every VRF you define using ip routing vrf VRF-NAME.

Learning goals

At the end of this lab you should be able to explain the differences between an EVPN type 2, 3, and 5 route. You should be able to explain the data plane operations of how a host pings another host in the same subnet off a different switch, and how a host pings another host in a different subnet off a different switch.

Overall this lab took me around 4 hours. You might want to break it up into two days.

To challenge yourself, after completing the lab, you could tear it down and re-build the underlay with an IGP.

Takeaways from the EVPN Deployment Guide

This lab experience gave me a much deeper understanding of EVPN type 2, 3 and 5 routes.

Type 2

Type 2 routes are generated in response to traffic received from a host, that is, when the local VTEP learns the host in the data plane via MAC learning. The local VTEP generates an EVPN type 2 route with information about this host, including its MAC address and L2VNI value. If you have routing enabled for the VLAN, the type 2 route also includes the host’s IP address (/32), L3VNI value and the VRF export RT(s).

Notes:

  • Route type is called mac-ip

  • If L2 only:

    • Contains L2VNI, MAC address, L2 RD/RT, ESI

      • The RD and RT are defined under the vlan-aware-bundle under BGP

  • If L3:

    • Also contains the /32 IP address of the host, the L3 VNI, the L3 RD/RT, the EvpnRouterMac

    • This is signaled due to the presence of the L3 SVI

    • The EvpnRouterMac is necessary so that the egress VTEP processes the frame locally in order to route it to the correct subnet. The ingress VTEP puts the EvpnRouterMac into the frame’s destination MAC when sending the traffic to the egress VTEP. The routing into the destination subnet happens on the egress VTEP. The ingress VTEP only needs to match on the destination /32 address; it does not perform that routing itself.

  • The Type-2 route is how a remote VTEP learns where a host in the same VLAN is located. It is also how a remote VTEP learns where a host in another subnet is located and how to route to it. If the host is in another subnet, the VTEP simply has to send the packet to the remote VTEP; it doesn’t actually perform any L3 routing. The remote VTEP receives the packet and then routes it to the appropriate subnet.
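The notes above can be sketched as a small model of a type 2 route and of the choice the ingress VTEP makes when it encapsulates a frame. This is a simplified illustration, not how EOS stores routes: the class, field names, VTEP address, MACs and VNI values are all made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MacIpRoute:
    """Simplified EVPN type 2 (mac-ip) route; field names are illustrative."""
    next_hop_vtep: str
    mac: str
    l2vni: int
    ip: Optional[str] = None          # host /32, present only with an L3 SVI
    l3vni: Optional[int] = None       # present only with an L3 SVI
    router_mac: Optional[str] = None  # EvpnRouterMac of the advertising VTEP


def encap_for(route, routed):
    """Return (remote VTEP, VNI, inner destination MAC) for a frame to this host."""
    if not routed:
        # Same subnet: bridge in the L2VNI; the inner destination is the host MAC.
        return route.next_hop_vtep, route.l2vni, route.mac
    if route.ip is None or route.l3vni is None or route.router_mac is None:
        raise ValueError("route carries no L3 information for this host")
    # Different subnet: send in the L3VNI, addressed to the egress VTEP's router MAC.
    return route.next_hop_vtep, route.l3vni, route.router_mac


# Hypothetical host behind a remote leaf.
host = MacIpRoute("192.0.2.12", "00:00:5e:00:53:0a", 10010,
                  ip="10.10.0.10", l3vni=50001, router_mac="00:00:5e:00:53:ff")
print(encap_for(host, routed=False))
print(encap_for(host, routed=True))
```

The two calls at the end correspond to the two ping scenarios: a host in the same subnet behind another switch, and a host in a different subnet behind another switch.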

Type 3

Type 3 routes were a bit confusing to me before I began this lab. The “multicast” keyword in IMET (Inclusive Multicast Ethernet Tag) threw me off. It seems that the idea behind this route is simply to advertise interest in receiving BUM traffic for a given VNI. So when a leaf is participating in a VNI, it advertises this route so that other VTEPs will send BUM traffic for that VNI to this particular leaf. The confusion for me was that multicast is not necessary to achieve BUM forwarding. You can use HER (head end replication), and in fact this is what Arista uses. Arista does not even support multicast replication.

Notes:

  • Route type is called imet

  • Contains the L2 RD and RT (defined under vlan-aware-bundle), and the VNI

  • Simply advertises the fact that the switch is participating in the VNI and wishes to receive BUM traffic for this VNI. All other VTEPs will add this leaf switch into their flood list.
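Here is a minimal sketch of the flood-list idea, assuming head end replication. Each received type 3 route is reduced to a (VTEP, VNI) pair, and the local VTEP sends one unicast copy of a BUM frame to every remote VTEP that advertised the VNI. The function names and addresses are hypothetical.

```python
from collections import defaultdict


def build_flood_lists(imet_routes, local_vtep):
    """Map each VNI to the sorted remote VTEPs that advertised an IMET route for it."""
    flood = defaultdict(set)
    for vtep, vni in imet_routes:
        if vtep != local_vtep:  # never flood back to ourselves
            flood[vni].add(vtep)
    return {vni: sorted(vteps) for vni, vteps in flood.items()}


def replicate(vni, frame, flood_lists):
    """Head end replication: one unicast copy per remote VTEP in the flood list."""
    return [(vtep, vni, frame) for vtep in flood_lists.get(vni, [])]


# Hypothetical IMET routes as (originating VTEP, VNI) pairs.
routes = [("192.0.2.11", 10010), ("192.0.2.12", 10010),
          ("192.0.2.13", 10010), ("192.0.2.13", 10020)]
flood_lists = build_flood_lists(routes, local_vtep="192.0.2.11")
print(flood_lists)
print(replicate(10010, "arp-request", flood_lists))
```

A VTEP that never advertised a given VNI never appears in that VNI's flood list, so it receives no BUM traffic for it.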

Type 5

Type 5 routes advertise a layer 3 prefix. To me, these seem very much related to L3VPN. In fact, the VNID sort of replaces the MPLS VPN (service) label. A VTEP advertises a type 5 route when it is performing routing for the subnet. Instead of the leaf learning a route from a CE like in classic L3VPN, the leaf simply has the prefix locally configured as a connected route. The leaf advertises this prefix and the associated VNID. In both L3VPN and EVPN IP VPN, these routes are part of a VRF. It is also important to note that the VNI used for type 5 is a different VNI value: it is the L3VNI value. The VNI value for type 2 and type 3 routes is the L2VNI value. The L3VNI is mapped to an IP VRF and the L2VNI is mapped to a VLAN (MAC-VRF).

Notes:

  • Route type is called ip-prefix

  • Contains the RD and RT for the VRF, the L3VNI, the prefix, and the EvpnRouterMac

    • Just like a VLAN is mapped to a L2VNI, a VRF is mapped to a L3VNI
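Since the /32 host routes from type 2 and the prefixes from type 5 both land in the same IP VRF, it can help to see how a lookup chooses between them. The sketch below assumes ordinary longest-prefix matching and uses made-up prefixes and VTEP addresses; it is an illustration of the idea, not a description of the EOS forwarding table.

```python
import ipaddress


def lookup(vrf_routes, destination):
    """Longest-prefix match over (prefix, next-hop VTEP, route type) entries."""
    dst = ipaddress.ip_address(destination)
    matches = [r for r in vrf_routes if dst in ipaddress.ip_network(r[0])]
    if not matches:
        return None
    return max(matches, key=lambda r: ipaddress.ip_network(r[0]).prefixlen)


# Hypothetical contents of one IP VRF on a leaf.
vrf = [
    ("10.50.0.0/24", "192.0.2.12", "ip-prefix"),  # learned from a type 5 route
    ("10.50.0.10/32", "192.0.2.13", "mac-ip"),    # learned from a type 2 route
]
print(lookup(vrf, "10.50.0.10"))  # the /32 host route is more specific
print(lookup(vrf, "10.50.0.99"))  # only the type 5 prefix matches
```

A host that has already been learned is reached through its own /32, while anything else in the subnet falls back to the prefix.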

VLAN-aware versus VLAN-based

This was a concept I had seen but didn’t really understand before going through this guide. I now see that a VLAN-aware bundle literally “bundles” multiple VLANs into a single MAC-VRF. The Ethernet Tag ID is used to put traffic into the correct VLAN table, thereby differentiating between the multiple VLANs bundled into the single MAC-VRF.

The VLAN-based MAC-VRF uses a one-to-one mapping of VLANs to MAC-VRFs. A single VLAN is mapped to a single MAC-VRF with its own RT value. This does not require an Ethernet Tag ID, but it does require many more MAC-VRFs to be defined.

In the VLAN-based approach, the Type-2 and Type-3 routes do not need an Ethernet Tag ID to select the VLAN, so the field is set to 0. This is because the RT value(s) alone can import routes into the correct VLAN. Each switch participating in the VLAN has it locally mapped to the VNI. The VNI itself is still carried in the route, but it is not needed to tell VLANs apart.

With VLAN-aware, the Ethernet Tag ID needs to be carried in the type-2 and type-3 routes to differentiate between multiple VLANs all using the same RT value(s).
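The difference between the two service types can be sketched as two import functions. In the VLAN-based case the RT alone picks the VLAN; in the VLAN-aware case the RT picks the bundle and the Ethernet Tag ID picks the VLAN inside it. The function names, RT strings and tag values are invented for the example.

```python
def import_vlan_based(rt_to_vlan, route_rt):
    """VLAN-based: one MAC-VRF per VLAN, so the RT alone selects the VLAN."""
    return rt_to_vlan.get(route_rt)


def import_vlan_aware(bundle_rt, tag_to_vlan, route_rt, ethernet_tag):
    """VLAN-aware bundle: the RT selects the MAC-VRF, the Ethernet Tag ID selects the VLAN."""
    if route_rt != bundle_rt:
        return None  # route belongs to a different MAC-VRF
    return tag_to_vlan.get(ethernet_tag)


# Hypothetical VLAN-based setup: one RT per VLAN.
print(import_vlan_based({"10:10": 10, "20:20": 20}, "20:20"))

# Hypothetical VLAN-aware bundle: one RT, Ethernet Tag ID separates the VLANs.
print(import_vlan_aware("1:1", {10010: 10, 10020: 20}, "1:1", 10020))
```

Both calls resolve to VLAN 20, but the first needs a dedicated RT per VLAN while the second relies on the tag carried in the route.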

Multihoming Design Guide

The multihoming in the previous design guide was based on MLAG. This simplified multihoming operations in a certain sense, because multihoming happened at the chassis level rather than at the EVPN level. It used a mechanism most engineers are familiar with, MLAG, instead of introducing the complexity of EVPN multihoming while you are trying to learn the fundamentals of EVPN operations.

EVPN multihoming uses type 1 and type 4 routes, and it is important to learn how these work.

I suggest first reading through this: https://www.arista.com/en/support/toi/eos-4-22-0f/14236-evpn-vxlan-all-active-multihoming

Then I suggest following the lab guide found on Arista’s website titled “EVPN Multihoming in Data Center Networks.”

Try to break the MLAG between LEAF2A and LEAF2B and set it up using EVPN multihoming.

In summary, the Type 1 route is used for auto-discovery. Every leaf that participates in the ethernet segment originates a Type 1. This allows remote VTEPs to learn the list of leafs participating in this ethernet segment.

The Type 4 route is used by all leafs participating in the ethernet segment to elect a designated forwarder. The DF will be the leaf that sends BUM traffic onto the segment.
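As an illustration of how an election can work, here is a minimal sketch of the default procedure described in RFC 7432 (service carving): the leafs discovered through type 4 routes are ordered by IP address, and for VLAN V with N leafs the DF is the one at position V mod N. Whether a given EOS release uses exactly this algorithm is an assumption you should check against the multihoming guide; the addresses are made up.

```python
import ipaddress


def elect_df(es_members, vlan_id):
    """RFC 7432 default: sort the leafs by IP, the DF is the one at index (V mod N)."""
    ordered = sorted(es_members, key=ipaddress.ip_address)
    return ordered[vlan_id % len(ordered)]


# Hypothetical leafs that advertised a type 4 route for the same Ethernet segment.
members = ["192.0.2.12", "192.0.2.11"]
for vlan in (10, 11):
    print(vlan, elect_df(members, vlan))
```

With two leafs, even and odd VLANs end up with different designated forwarders, which spreads the BUM forwarding duty across the segment.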

These three short videos do a great job of covering multi-homing basics:

https://www.youtube.com/watch?v=F7jm90YGRpk&ab_channel=AnuradhaKaruppiah

https://www.youtube.com/watch?v=L5ZXpG4FOEs&ab_channel=AnuradhaKaruppiah

https://www.youtube.com/watch?v=saiYcFvNPHQ&ab_channel=AnuradhaKaruppiah

Conclusion

Now that you’ve had some practice with EVPN/VXLAN and learned the basic operations of EVPN, it’s time to use EVPN in a service provider network.

Further Reading

If you are interested in more reading, check out this great series: https://overlaid.net/2018/08/27/arista-bgp-evpn-overview-and-concepts/
