Open Question: What is it that we are supposed to automate?

This post, as the title suggests, is an open question to anyone out there who can answer it or have a dialogue about it with me.

Unless you’ve been living under a rock lately you have undoubtedly heard about Puppet, Chef, Ansible, and any number of other automation/orchestration engines. The goal of all these tools, of course, is to simplify how we deploy and manage different pieces of equipment across our network. Tons of ‘stuff’ is happening in server-land, and as always, there are gripes that the network isn’t adapting quickly enough etc. To that end, Puppet in particular seems to be making a lot of noise about getting into more of the network space (in fact there are already Puppet agents for some Cisco Nexus platforms).

So that’s cool right — Puppet can do some network stuff, Ansible can do even more stuff since it’s just SSH basically, but what the hell are we supposed to do with them? Ansible makes a lot of sense to me from a template-ing perspective; build the template and just change variables, and even deploy it to live gear with a bit of intervention from Python (supposedly that’s how to do it, I’ve not done it first hand though). That is some super cool stuff, but again, it’s not ‘the SDNs’ (note sarcasm please).
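To make the "build the template and just change variables" idea a bit more concrete, here is a minimal sketch. Ansible itself uses Jinja2 templates; this uses Python's built-in `string.Template` as a stand-in so it runs with no extra packages. The interface names, descriptions, and VLAN numbers are made up for illustration, and pushing the result to live gear is left out entirely.

```python
from string import Template

# Hypothetical access-port template; only the $variables change per port.
INTERFACE_TEMPLATE = Template(
    "interface $name\n"
    "  description $description\n"
    "  switchport access vlan $vlan\n"
    "  no shutdown\n"
)


def render_interfaces(ports):
    """Render one config stanza per port dictionary and join them together."""
    return "\n".join(INTERFACE_TEMPLATE.substitute(port) for port in ports)


# Made-up variables; in Ansible these would live in inventory or vars files.
ports = [
    {"name": "Ethernet1/1", "description": "web01", "vlan": 10},
    {"name": "Ethernet1/2", "description": "web02", "vlan": 10},
]

print(render_interfaces(ports))
```

The point is just that the config logic lives in one place and the per-device data lives somewhere else, which is the part of this that makes sense to me.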

So in terms of the dynamic network of the future, what are we automating? Take this example that is very similar to a real life scenario of a customer I was recently at:

We host pictures of famous cats. We have different tiers of storage. Cats that are less popular are stored in a slower storage tier and/or have fewer resources dedicated to serving those pictures up. A cat that was not that popular dies/gets married/has a new viral cat video/etc. and is now super popular. I want to pull that cat’s images from the slow storage to put into my fastest tier of storage and/or allocate more resources to serving this cat up.

In this example, we could automatically provision more VMs to serve these images up (vCAC/vApp+Puppet or something), or possibly shuffle the storage to the faster tier. We could do this based on some external input — such as top 100 Google searches or something; when our favorite cat is popular we unfreeze the storage, when his star falls we put it in mothballs. This is a bit outside my realm, but this could conceivably all be accomplished with tools that are available today with a bit of extra programming work.
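As a rough sketch of that "bit of extra programming work", the decision logic could look something like the following. Everything here is hypothetical: the cat names, the tier labels, and the idea that popularity arrives as a simple rank from some external feed. Actually moving the storage or spinning up VMs would be the job of whatever tooling sits behind it.

```python
def choose_tier(search_rank, hot_cutoff=100):
    """Return 'fast' if the cat ranks within the top `hot_cutoff`, else 'slow'.

    search_rank is a 1-based position in a popularity list, or None if unranked.
    """
    if search_rank is not None and search_rank <= hot_cutoff:
        return "fast"
    return "slow"


def plan_moves(current_tiers, ranks):
    """Compare each cat's current tier with the wanted tier and list the moves."""
    moves = []
    for cat, tier in sorted(current_tiers.items()):
        wanted = choose_tier(ranks.get(cat))
        if wanted != tier:
            moves.append((cat, tier, wanted))
    return moves


# Made-up state: where each cat's pictures live today, and today's ranks.
current_tiers = {"grumpy": "slow", "maru": "fast", "tom": "fast"}
ranks = {"grumpy": 3, "maru": 40}  # "tom" has dropped off the list entirely

for cat, src, dst in plan_moves(current_tiers, ranks):
    print(f"move {cat}: {src} -> {dst}")
```

Notice that nothing in there touches the network at all, which is kind of my point.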

So now we’ve finally got to my question: what the hell needs to be automated in the network? What is it that the app guys are complaining about? That’s a sincere question. Let’s take a look at what I think at this point is a modern (maybe not bleeding edge Facebook type) data center. What does it look like? If you asked me, it would be almost entirely virtualized, employ L3 ECMP all the way to the top of rack, use VxLAN via OVS/Nexus 1000v/NSX to provide L2 adjacency over the fabric if required, and possibly include some service chaining type functionality.

My position is that if we, as network engineers, have done our job well, the network should be 100% transparent to the app guys. It almost pains me to say it, but the network should just be plumbing, it should just be the highway… it’s not sexy, but it is necessary. You can have the fastest car money can buy, but if you can’t drive it anywhere because there are no roads, what’s the point?

Taking the 1000v as an example, we should be able to provision port-profiles, either VLAN or VxLAN, on day-1. These VxLANs or VLANs would be then available to whatever tools are reaching into the hypervisor to automate deployment of virtual machines. Going a step further, we should also be able to take tenants or resource pools, or other VM or network attributes and define security policies surrounding those attributes. Again, as new virtual machines are deployed these automation tools should be able to use the network policies already in place.

All this is well and good, but what happens when you need to define NEW policies, or NEW VxLANs? I don’t know! That’s a great question! Deploying a VxLAN can probably be automated fairly easily (VMware NSX does this for sure), but what about security policies? I gotta think that there is enough complexity here that abstracting the process won’t really help… at least I don’t think so. My thought would be that you will still have to understand and define the ports/protocols/VM attributes/etc. that need to be matched and acted on in any policy, and therefore it would be basically impossible to automate since you are going to have to type/click things no matter what. Thoughts?

Outside of the network edge configurations, what else would need to be automated in a network? If the network has been defined as I outlined above, it really is just like a service provider network, or voice networks before that (isn’t technology cyclical??), where the complexity gets pushed as far to the edge as possible. So what the hell needs to happen in the core? In the service provider network the core would be nothing but ISIS/OSPF and MPLS labels; in our example the core would be nothing but an underlay with some routing protocol that does ECMP. Why would you need to change it? You wouldn’t, unless it was something that would require manual intervention anyway, like hardware changes/re-cabling etc.

This post is getting a little long-winded, so I’ll try and wrap up. At the end of the day I still have the open question about what the hell it is we are going to automate in the network. I think the network is just a road. The road should be awesome. It shouldn’t have a bunch of potholes, but it doesn’t need to do anything exciting. It should just allow the overlay (if required; it may not even be necessary if the applications are capable of being distributed and don’t rely on any silly L2 requirements) to do what it needs to do. We can do service chaining type stuff with 1kv/NSX/OVS/etc. So what needs to be automated from your perspective? I’d love to have some good dialogue about this since it’s fascinating stuff!

Spanning-tree is hard! (for Nexus 93128s)

This is a blog post about how I thought I lost my mind this weekend and forgot how to do the spanning-trees.

The project couldn’t have been any simpler — a single 6509 core (I know, I know… but dual Sups at least) connected to some Nexus 93128 switches for the servers to land on. The 9ks function as the default gateway for the servers, and have vPCs to servers where applicable. A static default route (no licensing for dynamic routing… again I know…) pointed back to the 6k. The routing was happening over a VLAN since there was a requirement for some L2 between the 6k/9ks while servers migrated. Easy right?

Physical connectivity to each 9k was a pair of 10g ports from the SUP-2Ts; these connections landed on a 40g QSFP port with the QSFP->SFP+ adapter module. Not super relevant, but I can say I’ve implemented “40g” now 🙂

The uplink was a simple vPC, trunking a native VLAN and the VLAN that the routing is happening on. vPC Peer-Keepalive was up and happy, the Peer-Link was up and happy, and everything was looking good. Once the uplinks were connected, the 6k saw the 9ks as CDP neighbors and things were trucking along… but… the next hop for the default route was not reachable. So at this point I’m like okay, I suck and fat fingered a VLAN or left it shutdown or something, so I go through and check it all out. L2 VLAN is created on both the 9ks and the 6k, the L3 SVI is created and is up/up on both sides. The 9ks are looking good: they can ping the standby IP and both of the real IPs there, but still no dice on pinging the 6k. So I take a look at the ARP table for my next hop and see that I’m getting an incomplete entry. At this point it’s pretty obvious that it’s an L2 thing, so I double and triple check that the VLANs are created and all that looks good. Next, I check spanning-tree and see that everything is operating in the same mode, and that the 6k is set up to be the root for this VLAN I’m sending between the boxes. At first glance, the 6k side seems cool (he knows he’s the root for this VLAN and the ports/port-channel are all up), so I moved on…

Over on the shiny new 9ks, though, things are not so awesome! The 9ks apparently thought because they were new and cool that they needed to be the center of the universe, and showed that they were in fact the root for this VLAN. Well that sure ain’t cool. Interestingly, for the native VLAN, the 9ks seem to recognize that the 6k is the root. Weird. Configs look fine however, and just to be sure the 9ks REALLY aren’t the root, I bumped the priority for that VLAN to the max. Bump the vPC, and things are still weird. 9ks still think they are the root (well only one of them obviously, but the point being that they still don’t see the 6k as root). Jumping back over to the 6k and taking a closer look, I see that the 6k port-channel that goes to the 9ks is in blocking state for that VLAN. Okay… furthermore it says it’s in blocking state and lists “P2p Dispute.”

After staring at this and shutting and un-shutting the ports for a few minutes to see if it’s going to magically start working, I say screw it and kill the second 9k, and whittle things down to a single trunk port without any port-channel/vPC to muddy the waters. Same exact results… okay, what the hell is happening here? Now I’m starting to wonder if there is something silly going on with the QSFP ports/adapters, so I change things up to a routed link and try to ping across. 100% success for however many pings I want to throw at it. So this yet again confirms that L1 is good to go. At this point I used the phone a friend option thinking I’m just blind and/or completely forgot how to do anything with spanning-tree. Thankfully my awesome co-worker picked up and we hopped on a webex and after about 30 minutes he verified that I was in fact not losing my mind! Big relief, but obviously things were still broken.

As a final test before calling TAC to cry we moved the 6k/9k connectivity off the QSFP ports and onto a copper port on the 9k (96 10g copper ports on the 93128TX). Boom — 9ks see the 6k as root and everything starts working as expected.

Turns out there is a nifty feature (not a bug of course) on the 93128 running this particular version of NX-OS where the 40g ports just decide that they don’t want to receive PVST BPDUs… which of course caused the 6k to not be seen as root, and the 9k to advertise itself as root, causing the dispute on the 6k side and moving that port to a blocking state.
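If the chain of events is hard to picture, here is a toy model of it. This is a heavily simplified sketch and not how the real protocol state machines work: it only captures "lowest bridge ID wins" and "the real root blocks a port when the neighbor keeps claiming a worse root." The 6k priority and both MAC addresses are made up; 61440 is the highest priority value you can configure, i.e. the least likely to win.

```python
from typing import NamedTuple


class BridgeId(NamedTuple):
    priority: int  # lower wins the root election
    mac: str       # tie-breaker, lower wins (same-format lowercase hex)


def elect_root(own, received):
    """A switch believes the root is the lowest bridge ID it knows about:
    its own, or any root ID it has actually received in a BPDU."""
    return min([own, *received])


def designated_port_state(own, neighbor_claimed_root):
    """Toy dispute check: a neighbor that keeps advertising a worse root is
    evidently not hearing us, so the port is blocked rather than risk a loop."""
    if neighbor_claimed_root > own:
        return "blocking (dispute)"
    return "forwarding"


six_k = BridgeId(priority=4096, mac="0000.0000.0001")    # hypothetical values
nine_k = BridgeId(priority=61440, mac="0000.0000.0002")  # priority set to max

# The 9k's 40g port drops the 6k's BPDUs, so it never hears a better root.
root_seen_by_9k = elect_root(nine_k, received=[])
# The 6k does hear the 9k, but the 9k's claim is worse than its own ID.
root_seen_by_6k = elect_root(six_k, received=[root_seen_by_9k])

print("9k thinks root is:", root_seen_by_9k)
print("6k thinks root is:", root_seen_by_6k)
print("6k port toward 9k:", designated_port_state(six_k, root_seen_by_9k))
```

Feed the 9k the 6k's BPDU (`received=[six_k]`) and it picks the 6k as root, which is what happened as soon as we moved to the copper ports.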

While troubleshooting this one my good friend Google was not very helpful, so I figured I’d write about it. Hopefully if you’ve found this post I can save ya some time 🙂

Bug ID: CSCup33000
Description: PVST BPDUs dropped on Nexus 9300 40G ports