Tidbit – An Overlay Love Story

I happen to have a bit of a crush on TRILL/FabricPath, yet I think both are dead and/or dying. I can’t understand why we as network engineers would purposely want to implement pervasive L2 in a modern data center (or network in general). I feel like it inexorably ties far too much of what’s in the network directly into the hardware in your data center. Of course we’ve been talking about abstraction on Twitter and blogs and in the industry in general, but mostly about abstracting away the element of human error and vendor-centric configurations. Why shouldn’t we abstract more, though? I think that’s the point of the overlay/underlay model, and that’s why I love it. I want the underlay network to be stupid stupid stupid simple. I want it to just work. I want to be able to throw one of the core or spine switches off the roof of the data center and have no users be any wiser (I suppose you can do that in a TRILL/FP environment, but I think you get my meaning!). I also want to be able to put my networks any damn place I please (in all the clouds!), and yet retain all of my normal data center policies and services.

As with anything, there’s always a time and a place for technologies, and perhaps saying that TRILL/FabricPath is dead is a bit over the top… especially since I have a net-new FabricPath install coming up soon… but I love me some overlays, and you should too.

1000v BGP VXLAN Control Plane

The latest software release for the Cisco 1000v dropped in early August to much fanfare and applause. Oh wait… no, it definitely didn’t do that. Other than one or two people on Twitter talking about it, you could easily have missed it. It turns out this release is actually pretty interesting and worth a bit of time to investigate!

Things that look cool:

  • Support for VM VXLAN Gateway – up till now you had to have the 1110x appliance to do this in VMware 1kv
  • VSUM — magical software thing to do installs and upgrades on the 1000v. I used this in a demo and it was pretty slick
  • Distributed Netflow  — VEMs can send Netflow information directly — no need to pipe things back to the VSM
  • BPDU Guard — whoa, about time? No more bridges in VMs causing problems
  • Possibly interesting TrustSec stuff… not something I’m very familiar with, but it seems like this is another good one
  • Storm Control! Yay! I’ve been wanting this for a while. VXLAN lets us do sometimes less than intelligent things with stretching bridge-domains all around… I feel like adding Storm Control is a bit of a safety net to prevent your bridge-domain from falling over from excessive broadcasts or multicast (or I guess unicast too)

Annnnnd the one I’m most interested in: BGP Control Plane for VXLAN. VXLAN has been around for some time now, but it still doesn’t have a standard control plane. We’ve been relying on multicast, or proprietary black-magic unicast VXLAN, to figure out which MACs live behind which VTEPs. This obviously works, but scale will likely become a serious consideration for any reasonably large deployment. What do we as network engineers know that scales pretty well? RIPv1!! Oh wait, no, not that… BGP seems to do okay though. This latest release includes support for essentially doing unicast VXLAN in the 1000v, and for extending that functionality across multiple (up to 8 for now) VSMs.
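For reference, here’s a quick sketch of what the traditional multicast-mode approach looks like on the 1000v; the bridge-domain name, segment ID, and group address are all made up for illustration:

```
bridge-domain VXLAN_MCAST
  segment id 500100
  group 239.1.1.1
```

In this mode the multicast group carries BUM traffic and doubles as the MAC-learning mechanism, which is exactly the dependency a BGP control plane removes.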

Why does BGP matter at all for this? Strictly in the realm of Cisco and the 1000v, it means that we can now scale much, much better. Each VSM is tied to a single vCenter and a single data center within that vCenter, and it also has limits on the total number of VEMs supported per VSM (or HA VSM pair). By adding BGP, we now have a non-multicast way (yay!) of sharing VXLAN information across multiple VSMs (multicast VXLAN does work across multiple VSMs, though, so that’s still an option).

From a more holistic standpoint, I believe this is an important step forward in the maturity of VXLAN as a technology. It’s interesting that there seems to be so much support and desire among vendors to implement BGP with VXLAN, yet it seems that nobody is actually doing it. It’s entirely possible I’m missing something, but other than a few slide decks and IETF drafts, I haven’t seen any vendors implementing this – please tell me if I am missing something here! NSX and ACI seem to have stolen (at least in the Enterprise-type segments I normally work in) a lot of the thunder from VXLAN in general by covering up the underpinnings and replacing it all with proprietary software and shiny GUIs.

In any case, the BGP Control Plane is here, and it even works! After getting my two VSMs/vCenters upgraded to the latest code, I jumped right in. The configuration is almost exactly what you would expect: enable the feature, run BGP off the control interface of the VSM, use the L2VPN EVPN address family, and advertise VXLANs (kind of). Here is a complete working config off of my ‘main’ VSM:

interface control0
 ip address

N1kv# sh run bgp
!Command: show running-config bgp
!Time: Sat Oct 4 11:03:28 2014

version 5.2(1)SV3(1.1)
feature bgp

router bgp 65535
  address-family l2vpn evpn
  neighbor remote-as 65535
    address-family l2vpn evpn
      send-community extended

bridge-domain VXLAN_666
  segment id 500666
  segment control-protocol bgp

Obviously the other VSM looks pretty similar. The ‘segment control-protocol’ command can be configured globally or on a per-bridge-domain basis. Since I’m primarily using multicast at home, I configured it on the individual bridge-domain. All that’s needed beyond the above configuration is, of course, a port-profile to present your bridge-domain to VMware.
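If it helps, a minimal port-profile for that last step might look something like the following; the profile name is a placeholder I made up, and it assumes the VXLAN_666 bridge-domain from the config above:

```
port-profile type vethernet VXLAN_666_PROFILE
  vmware port-group
  switchport mode access
  switchport access bridge-domain VXLAN_666
  no shutdown
  state enabled
```

Once the profile is enabled it shows up as a port-group in vCenter, and any VM attached to it lands in the bridge-domain.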

Verification is also about what you would expect. BGP commands are pretty much the same as BGP on any other Cisco device, and will show VTEP information for each bridge-domain:

N1kv# show bgp l2vpn evpn evi all vtep
BGP routing table information for VRF default, address family L2VPN EVPN
BGP table version is 12, local router ID is
Status: s-suppressed, x-deleted, S-stale, d-dampened, h-history, *-valid, >-best
Path type: i-internal, e-external, c-confed, l-local, a-aggregate, r-redist
Origin codes: i - IGP, e - EGP, ? - incomplete, | - multipath
Network Next Hop Metric LocPrf Weight Path
Route Distinguisher: (EVI 500666)
*>l10.10.10.183 100 32768 i
*>l10.10.11.254 100 32768 i
*>i10.10.13.253 100 0 i

All other ‘normal’ VXLAN show commands still do basically the same thing. It’s not a very sexy thing to look at configurations for, and it’s not very difficult to get going, but it is certainly a welcome development for VXLAN as a technology.
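If you want to poke around a bit more, the standard NX-OS-style BGP commands should also be available; I’m assuming the usual command set here, so double-check against your release:

```
N1kv# show bgp l2vpn evpn summary
N1kv# show bgp sessions
N1kv# show bridge-domain VXLAN_666
```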

You can check out the release notes here: http://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus1000/sw/5_2_1_s_v_3_1_1/release/notes/n1000v_rn.html