DMVPN/MPLS/PfR Part 2: Finally… Some Routing!

Picking up right where we left off in Part 1 of the “DMVPN/MPLS/PfR” series, we are going to drop some dynamic routing on top of our setup and see where this takes us.

Frank&Co is using OSPF as their IGP of choice, but their MPLS provider required them (rightfully so!) to peer with the provider via BGP. So, internal to each of the Frank&Co locations we have some basic OSPF for internal route propagation, but also a bit of BGP to send and learn prefixes over the MPLS. As I’m sure will become painfully evident over the course of future blog posts, I absolutely despise redistribution, and thankfully Frank&Co shares my opinion on that matter. Referring to the diagram: at the “HQ” sites there is of course BGP running on the “MPLS” routers to deal with the provider, and in addition, the “MPLS” routers also peer with the “Core” device via iBGP.

Frank&Co Routing Overview

This is of course a pretty common deployment, but it really is quite elegant. Ignoring for the moment the added requirement for the DMVPN terminating on the “WWW” router, we can see that prefixes learnt via the MPLS provider never need to be redistributed back into the IGP. Any routers “south” of the HQ “Core” device that do not have an exact match for a route that exists in the MPLS cloud would fall back to a default route (being originated via the “WWW” device)… which in turn would pipe traffic through the “Core,” where there is a longer match via iBGP! Voila! No redistribution necessary (the other clean fix is MPLS, but that tends to become a licensing/feature support issue).
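For reference, originating that default from the “WWW” device into the IGP can be as simple as the following (a sketch only — assuming OSPF process 1 and an existing default route on the “WWW” router; the process number is my placeholder):

router ospf 1
 default-information originate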

So why is this at all relevant for our DMVPN discussion? Well it totally matters, that’s why! DMVPN is a little less dynamic without a dynamic routing protocol running over the top of it. So what is the logical routing protocol to use? Well, Cisco may tell you to use EIGRP (even though it sucks), and OSPF would work too (better than EIGRP). EIGRP (besides sucking) seems like a poor choice because then we would end up with OSPF, BGP, AND EIGRP to deal with. OSPF would be okay, but then we have some interesting issues of path selection — the “Core” (without redistribution) would ALWAYS prefer the OSPF routes via the DMVPN over the MPLS routes (because the prefixes in BGP will be “iBGP” at the “Core” and as such have an AD of 200 vs OSPF’s AD of 110). Alright, so other options: RIP – LOL, NOPE! Static – again, not very dynamic. ISIS – cool, but not exactly its primary use case. BGP it is!

Next up – eBGP or iBGP?? Well we could do either really, but there are some things to seriously consider.

  • Full mesh requirement if using iBGP; could configure the hub(s) as route-reflectors to ease this requirement (sketched below)
  • Administrative distance; AD 20 vs AD 200 / eBGP vs iBGP
  • Next hop reachability; are we updating this or not, and if not, how are we learning the next hop
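For contrast, had we gone iBGP, easing the full mesh requirement would mean making the hub(s) route-reflectors — something along these lines (a sketch only, with everything collapsed into AS 100 and our lab tunnel addresses reused):

router bgp 100
 neighbor 100.100.100.180 remote-as 100
 neighbor 100.100.100.180 route-reflector-client
 neighbor 100.100.100.190 remote-as 100
 neighbor 100.100.100.190 route-reflector-client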

My money (and consequently Frank&Co’s) is on eBGP. I think that it provides the simplest and most scalable solution, eliminates the need for route reflection, and gives you the least amount of headache with respect to path selection. So, enough talk, let’s set it up.

This is going to be a super simple BGP setup; for now all we care about is getting our basic peering up. On the HQ “DMVPN” router, we configure the BGP process, set a static router-id, send communities for kicks, and tack on soft-reconfiguration.

router bgp 100
 bgp router-id 1.1.1.3
 bgp log-neighbor-changes
 neighbor 1.1.1.1 remote-as 100
 neighbor 1.1.1.1 update-source Loopback0
 neighbor 1.1.1.1 send-community both
 neighbor 1.1.1.1 soft-reconfiguration inbound
 neighbor 100.100.100.180 remote-as 180
 neighbor 100.100.100.180 send-community both
 neighbor 100.100.100.180 soft-reconfiguration inbound
 neighbor 100.100.100.190 remote-as 190
 neighbor 100.100.100.190 send-community both
 neighbor 100.100.100.190 soft-reconfiguration inbound

Excuse my silly lab IP allocations 🙂 Frank&Co is using 1.1.1.x for loopbacks at the “HQ” site, and as seen in the previous post, 100.100.100.x for tunnel IPs for connecting to the primary DMVPN hub. Also please excuse the public AS numbers. As you can see, nothing special here; Hub “1” is peering with both of our branch routers, and with the “Core” at the “HQ” site. As we have already established that we can ping across the tunnels, we have everything we need to establish our eBGP sessions. The spoke side is configured similarly, but since the spoke compresses the DMVPN and MPLS terminations into a single router, that’s all happening in one place.

router bgp 180
 bgp router-id 1.1.80.1
 bgp log-neighbor-changes
 neighbor 100.100.100.100 remote-as 100
 neighbor 100.100.100.100 send-community both
 neighbor 100.100.100.100 soft-reconfiguration inbound
 neighbor 172.16.80.2 remote-as 1
 neighbor 172.16.80.2 send-community both
 neighbor 172.16.80.2 soft-reconfiguration inbound

Nothing exciting here either — just landing the two eBGP sessions, one for the MPLS peer and one for the DMVPN peer. We can of course verify that our BGP peerings have established with the “show ip bgp summary” command. Here is the output of this command on the Branch router:

Neighbor         V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
100.100.100.100  4   100      13      14        5    0    0 00:09:09        0
172.16.80.2      4     1      12      14        5    0    0 00:09:23        0

Here we can see the two neighbors: 100.100.100.100 being the DMVPN hub router, and the 172.16.80.2 peer being our itty bitty MPLS “cloud” (hate that word…). We aren’t receiving (or sending) any prefixes yet, so let’s fix that and see how this works.

router bgp 100
 network 10.10.10.10 mask 255.255.255.255
 aggregate-address 10.10.0.0 255.255.0.0 summary-only

On the HQ Core router, we are going to advertise one of our loopbacks, 10.10.10.10/32, into BGP. In real life we probably would not be advertising single /32 prefixes, so we will go ahead and advertise the whole 10.10.0.0/16 into BGP as well. Furthermore, since summarization is… ya know… a thing… and an important thing at that, we will tack on the “summary-only” keyword. The aggregate-address command will advertise the indicated prefix (assuming at least a subset of that prefix is in the BGP RIB), and adding the “summary-only” keyword suppresses the component routes of the aggregate.
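On the Core itself, we can sanity-check the suppression in the local BGP table: the /32 component should show up flagged with an “s” (suppressed) while the aggregate is advertised. Illustrative output (trimmed from a comparable lab; exact metrics may vary):

 s> 10.10.10.10/32   0.0.0.0                  0         32768 i
 *> 10.10.0.0/16     0.0.0.0                            32768 i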

Back out on the Branch router, we can see that we are receiving this new prefix, and that we are receiving it via two possible paths:

    Network          Next Hop         Metric LocPrf Weight Path
 *> 10.10.0.0/16     100.100.100.100                  0     100 i
 *                   172.16.80.2                      0     1 100 i

As we can see, we are receiving the prefix both via the MPLS and over the DMVPN. We can also see that we prefer the path via the DMVPN due to a shorter AS-Path — “100” vs “1 100” — the added hop of the MPLS provider makes it less desirable. Since it’s likely that the DMVPN connection is a “fallback” or secondary path, we probably want to make sure that the MPLS is the preferred path. We have several ways to attack this; probably the simplest option, since we are using Cisco devices, is to set the weight on the Branch device. We could set the weight on a per-neighbor basis, or on specific prefixes by matching with a route-map and access-list/prefix-list. This is a good option, but I would like to have the possibility of multi-path later, so we will instead keep all of the attributes for the prefix equal down to the IGP cost step of the path selection process.
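For reference, the per-neighbor weight knob we just talked ourselves out of would have been a one-liner on the Branch (40000 is an arbitrary value; anything above the default of 0 for learned routes does the trick, and weight is Cisco-proprietary and local to the router):

router bgp 180
 neighbor 172.16.80.2 weight 40000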

So the simplest way to make things equal is to prepend an extra hop onto prefixes advertised via the DMVPN router at HQ. That way, when the prefix is received at the Branch, the AS-Path length is equal. Here’s how we make that happen:

ip prefix-list PL_Any seq 5 permit 0.0.0.0/0 le 32
!
route-map RM_DMVPN_Prepend permit 10
 match ip address prefix-list PL_Any
 set as-path prepend 100
!
router bgp 100
 neighbor 100.100.100.180 route-map RM_DMVPN_Prepend out

Above you can see the prefix-list “PL_Any” matching any prefix; that prefix-list is used in the route-map, and the route-map then prepends a single hop of “100” — our local AS — onto the AS-Path. Lastly, we attach this route-map outbound on the Branch-facing neighbor.

Now we have equal AS-Path lengths, so BGP has to go further down the path selection process to pick a best path — and in our case we would still end up preferring the path via the DMVPN, this time due to the lower router-id. To take control, we can step in higher up the selection process: a route-map and prefix-list matching all prefixes received via the MPLS, setting their local preference above the default, will force the Branch to prefer the MPLS connection.

ip prefix-list PL_Any seq 5 permit 0.0.0.0/0 le 32
!
route-map RM_MPLS_In permit 10
 match ip address prefix-list PL_Any
 set local-preference 9999
!
router bgp 180
 neighbor 172.16.80.2 route-map RM_MPLS_In in

Now we will prefer the MPLS provider, since the AS-Paths are equal and the local-preference of that peer is higher than the DMVPN peer’s.

    Network          Next Hop         Metric LocPrf Weight Path
 *  10.10.0.0/16     100.100.100.100                  0     100 100 i
 *>                  172.16.80.2             9999     0     1 100 i

Yay! So finally we are utilizing the MPLS to reach back to HQ. I’ve also advertised the loopback of the Branch into BGP, so now we should have full reachability to HQ from the Branch.
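For completeness, the Branch-side advertisement follows the same pattern as the Core’s (a sketch — 10.80.80.80/32 stands in for the actual Lo10 address, which isn’t shown in this post):

router bgp 180
 network 10.80.80.80 mask 255.255.255.255

With that in place, a traceroute sourced from the Branch loopback proves out the path: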

traceroute 10.10.10.10 source lo10
Type escape sequence to abort.
Tracing the route to 10.10.10.10
VRF info: (vrf in name/id, vrf out name/id)
 1 172.16.80.2 0 msec 0 msec 0 msec
 2 172.16.1.1 1 msec 1 msec 0 msec
 3 10.10.1.1 [AS 100] 1 msec 1 msec *

Above we can see that we reach HQ via the MPLS when sourcing from the Branch router’s Loopback10 (using Lo10 to simulate the Branch “LAN”). If we shut down the interface that we peer with the MPLS provider on, we automagically re-route to our next best path, which is of course via the DMVPN.

interface e0/1
shutdown
!
traceroute 10.10.10.10 source lo10
Type escape sequence to abort.
Tracing the route to 10.10.10.10
VRF info: (vrf in name/id, vrf out name/id)
 1 100.100.100.100 5 msec 5 msec 5 msec
 2 10.10.1.5 [AS 100] 5 msec 5 msec *

At this point we have a pretty solid setup — we can use local-pref/weight and other BGP attributes to control which path we prefer, and we can automagically fail over to a secondary path. In this post we didn’t outline the configuration of the second hub; however, once we start playing with PfR we will be working from the dual hub DMVPN setup, and the magic of PfR will be able to take advantage of having three different paths between the HQ sites and the Branch.

One last note: at the HQ Core device, we are recursing for reachability to the DMVPN network. We could use next-hop-self on the DMVPN router’s peering session to the Core to eliminate that kind of wonky recursion — giving the Core a next hop for the Branch (via the DMVPN) that is already in its routing table. Either way, in this case things work.
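If we wanted to clean that up, it would be a single knob on the DMVPN router’s iBGP session to the Core (1.1.1.1 being the Core peering we configured earlier):

router bgp 100
 neighbor 1.1.1.1 next-hop-self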

DMVPN/MPLS/PfR Part 1: Basic DMVPN/NHRP

This series will tackle the basics of a current pet project/side lab I’ve got going on at the moment. The scenario is essentially your “standard” enterprise network with one or two “main” sites, and any number of “branches” connected via MPLS. The idea here is that our customer “Frank&Co.” has this MPLS provider, but wants some additional redundancy. The easy answer is some affordable commercial broadband at the remotes, but how do you securely tie that into the rest of the enterprise network? Well, you’ve got some options of course: site-to-site VPNs, VTI tunnels, DMVPN; you could bring in a second MPLS provider too if you want to be spendy, or you could get really nutty and look at possibly introducing LISP and GET-VPN (hopefully I’ll be lab-ing that up later; it looks like Cisco has a few good docs on this).

All of these options could provide you with some redundancy options, and hopefully we will take a look at all of them in time, but for now “Frank&Co” is interested in DMVPN. On top of the DMVPN and MPLS combo, they are interested in leveraging PfR for some more intelligent routing control (SDN lite perhaps?).

Let’s take a look at the basic lab layout:

Frank&Co Overview

As you can see, nothing fancy here! Two “HQ” type locations, each with a dedicated router for terminating the MPLS and another for the Internet connection, plus a single core device (let’s hope it’s VSS and not literally just one box). For the remote locations, a single router terminates both the MPLS and the Internet connections. We will pile on ZBFW for some security at the remote branches in a later post in this series; for now, though, Frank&Co is just not that concerned with security.

So, now that all that is out-of-the-way, let’s get down to the fun part. DMVPN is often described in terms of the three different phases that I’m sure you have read about before. Simply put: Phase 1 = hub to spoke connectivity, and Phase 2 = Phase 1 + dynamic spoke to spoke tunnels. Phase 3 is a whole new animal that we won’t worry about for now! In addition to the different phases, DMVPN is almost always associated with IPSEC in order to encrypt traffic, but strictly speaking IPSEC is not a mandatory component of DMVPN. So how do these tunnels magically form over the Internet (or any NBMA network)?

Next Hop Resolution Protocol, of course! NHRP is basically a protocol that functions sort of like ARP, just over the Internet. Your spoke routers, or Clients, are configured to map to the Hub(s), or Server(s). The Clients register with the Server, so the Server does not have to be pre-configured with a potentially dynamic Client address. Let’s take a look at how this is configured:

Hub/Server:

interface Tunnel0
 ip address 100.100.100.100 255.255.255.0
 tunnel source Ethernet0/0
 tunnel mode gre multipoint
 ip nhrp network-id 1

Spoke/Client:

interface Tunnel0
 ip address 100.100.100.180 255.255.255.0
 ip nhrp map 100.100.100.100 192.168.10.1
 ip nhrp network-id 1
 ip nhrp nhs 100.100.100.100
 tunnel source Ethernet0/0
 tunnel mode gre multipoint

That’s really it. On the Hub/Server side, you can see that we’ve completed the basics to spin up a tunnel – source/ip/mode – but we did not need to configure a destination; the “tunnel mode gre multipoint” command is what removes the need for a statically defined destination. The “ip nhrp network-id [number]” command enables NHRP on the interface. The actual number is locally significant to the router – you could use “1” on the hub and “9999” on the spoke; the mapping and next hop server commands on the spoke are what tie the tunnels together.

The Spoke/Client clearly has a little bit more going on, but still, not a whole lot. Again, we have the basic tunnel stuff, but still no destination. In addition to the network-id, we have two very important lines:

ip nhrp nhs 100.100.100.100
ip nhrp map 100.100.100.100 192.168.10.1

The first line above defines the (in our case) private IP address of the endpoint that is the Next Hop Server (NHS). The second line essentially just tells the Spoke how to get to that server. Here we are mapping the NHS address to a “public” (just pretend with me, it’s just a lab after all) IP address that is reachable from the Spoke.
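As an aside, newer IOS releases let you collapse the map and NHS statements into a single line, optionally with the multicast mapping you would want if you were running an IGP over the tunnel (BGP, being unicast, doesn’t need it). A sketch — verify availability on your version:

ip nhrp nhs 100.100.100.100 nbma 192.168.10.1 multicast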

So at this point, assuming that you have reachability to the address that NHRP is mapping the NHS to, you should have basic DMVPN connectivity! Well, how do you know it’s working!? The easy peasy verification for this is, simply and intuitively, “show dmvpn.”

Legend: Attrb --> S - Static, D - Dynamic, I - Incomplete
        N - NATed, L - Local, X - No Socket
        # Ent --> Number of NHRP entries with same NBMA peer
        NHS Status: E --> Expecting Replies, R --> Responding, W --> Waiting
        UpDn Time --> Up or Down Time for a Tunnel
==========================================================================

Interface: Tunnel1, IPv4 NHRP Details
Type:Spoke, NHRP Peers:1,

 # Ent  Peer NBMA Addr Peer Tunnel Add State  UpDn Tm Attrb
 ----- --------------- --------------- ----- -------- -----
     1 192.168.10.1    100.100.100.100    UP 00:09:34     S

We get a nifty little legend that makes this output nice and easy to read. We can clearly see the peer’s NBMA address, the tunnel address, and whether the peer is up or down. Since this output was taken from a Spoke, you can see the “S” for static under the peer attributes; on the Hub side, this will be a “D” for dynamic.

Okay, so show commands are cool, but how do we see what’s really going on? Debugs are fun, right? — “debug nhrp packet” will give us some good output on the registration process. Here we see the process from the Hub side.

*Nov 28 18:01:41.619: NHRP: Receive Registration Request via Tunnel0 vrf 0, packet size: 92
*Nov 28 18:01:41.619: (F) afn: AF_IP(1), type: IP(800), hop: 255, ver: 1
*Nov 28 18:01:41.619: shtl: 4(NSAP), sstl: 0(NSAP)
*Nov 28 18:01:41.619: pktsz: 92 extoff: 52
*Nov 28 18:01:41.619: (M) flags: "unique nat ", reqid: 65688
*Nov 28 18:01:41.619: src NBMA: 192.168.80.1
*Nov 28 18:01:41.619: src protocol: 100.100.100.180, dst protocol: 100.100.100.100
*Nov 28 18:01:41.619: (C-1) code: no error(0)
*Nov 28 18:01:41.619: prefix: 32, mtu: 17916, hd_time: 7200
*Nov 28 18:01:41.619: addr_len: 0(NSAP), subaddr_len: 0(NSAP), proto_len: 0, pref: 0
*Nov 28 18:01:41.619: NHRP: Send Registration Reply via Tunnel0 vrf 0, packet size: 112
*Nov 28 18:01:41.619: src: 100.100.100.100, dst: 100.100.100.180
*Nov 28 18:01:41.619: (F) afn: AF_IP(1), type: IP(800), hop: 255, ver: 1
*Nov 28 18:01:41.619: shtl: 4(NSAP), sstl: 0(NSAP)
*Nov 28 18:01:41.619: pktsz: 112 extoff: 52
*Nov 28 18:01:41.619: (M) flags: "unique nat ", reqid: 65688
*Nov 28 18:01:41.619: src NBMA: 192.168.80.1
*Nov 28 18:01:41.620: src protocol: 100.100.10.11, dst protocol: 100.100.10.10
*Nov 28 18:01:41.620: (C-1) code: no error(0)
*Nov 28 18:01:41.620: prefix: 32, mtu: 17916, hd_time: 7200
*Nov 28 18:01:41.620: addr_len: 0(NSAP), subaddr_len: 0(NSAP), proto_len: 0, pref: 0

TL;DR — You can see in the output above where the interesting parts are (for humans to read, at least). The router receives the request on the tunnel, you can see the source NBMA address (this is the “public” IP on our Spoke router), and you can see the mappings for the actual “private” tunnel addresses. “show ip nhrp” can give us a little further verification that the registration has been successful:

100.100.100.180/32 via 100.100.100.180
 Tunnel0 created 00:19:03, expire 01:43:13
 Type: dynamic, Flags: unique registered
 NBMA address: 192.168.80.1

So does it ping!? Sure does, from the hub:

HQ_WWW#ping 100.100.100.180 
Type escape sequence to abort. 
Sending 5, 100-byte ICMP Echos to 100.100.100.180, timeout is 2 seconds: 
!!!!!

One last cool piece to all of this; since we have already setup the tunnels as multi-point GRE, we are already up to snuff for DMVPN “Phase 2.” I’ve gone ahead and added the same basic tunnel/nhrp configurations on the second Spoke router. Now from the Hub we can see two NHRP entries:

100.100.100.180/32 via 100.100.100.180
 Tunnel10 created 00:23:40, expire 01:38:36
 Type: dynamic, Flags: unique registered
 NBMA address: 192.168.80.1
100.100.100.190/32 via 100.100.100.190
 Tunnel10 created 00:00:10, expire 01:59:50
 Type: dynamic, Flags: unique registered used
 NBMA address: 192.168.90.1

It gets a little cooler though! At first glance, we have no spoke to spoke connectivity. Here is some output from Spoke “1” showing that we only have the NHRP entry for the Hub router:

100.100.100.100/32 via 100.100.100.100
 Tunnel10 created 00:22:38, never expire
 Type: static, Flags: used
 NBMA address: 192.168.10.1

Once we start to send some traffic toward our Spoke “2” site, some magic starts to happen! Below we can see that a traceroute from Spoke “1” goes “directly” to Spoke “2” — this is all due to the magic of NHRP. The NHS already knows about the second Spoke site, and it shares this information with Spoke “1.” Spoke “1” can then build its own NHRP entry for Spoke “2” and forward traffic directly across the NBMA network. This spoke-to-spoke tunnel will stay “alive” for a time determined by the “ip nhrp holdtime,” which defaults to 2 hours.

Branch_1#traceroute 100.100.100.190
 Type escape sequence to abort.
 Tracing the route to 100.100.100.190
 VRF info: (vrf in name/id, vrf out name/id)
 1 100.100.100.190 1 msec 1 msec *
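If two hours is longer than you want these dynamic tunnels hanging around, the holdtime is tunable per tunnel interface (600 seconds here is just an example value):

interface Tunnel0
 ip nhrp holdtime 600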

At the moment this is a little less than impressive, since we are just pinging from tunnel IP to tunnel IP, but do not fear — shortly we will be adding some dynamic routing and IPSEC on top of our lovely basic DMVPN configuration!
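As a teaser for the IPSEC piece, protecting the mGRE tunnel is typically just a matter of applying a tunnel protection profile to the interface. A minimal sketch (policy numbers, the pre-shared key, and all names here are my own placeholders, not Frank&Co’s eventual config):

crypto isakmp policy 10
 encryption aes 256
 authentication pre-share
 group 14
crypto isakmp key FrankCoKey address 0.0.0.0 0.0.0.0
!
crypto ipsec transform-set TS_DMVPN esp-aes 256 esp-sha-hmac
 mode transport
!
crypto ipsec profile IPSEC_DMVPN
 set transform-set TS_DMVPN
!
interface Tunnel0
 tunnel protection ipsec profile IPSEC_DMVPN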

Lastly, here are all the relevant configurations for the Hub and two Spokes that were referenced in this post:

Hub:

hostname HQ_WWW
!
interface Tunnel0
 ip address 100.100.100.100 255.255.255.0
 ip nhrp network-id 1
 tunnel source e0/0
 tunnel mode gre multipoint
 ! GRE tunnel keys must match end to end -- added here to match the spokes
 tunnel key 1
!
interface e0/0
 description To WWW
 ip address 192.168.10.1 255.255.255.252
!
ip route 0.0.0.0 0.0.0.0 192.168.10.2

Spoke1:

hostname Branch_1
!
interface Tunnel0
 ip address 100.100.100.180 255.255.255.0
 ip nhrp map 100.100.100.100 192.168.10.1
 ip nhrp nhs 100.100.100.100
 ip nhrp network-id 1
 tunnel source e0/0
 tunnel mode gre multipoint
 tunnel key 1
!
interface e0/0
 description To WWW
 ip address 192.168.80.1 255.255.255.252
!
ip route 0.0.0.0 0.0.0.0 192.168.80.2

Spoke2:

hostname Branch_2
!
interface Tunnel0
 ip address 100.100.100.190 255.255.255.0
 ip nhrp map 100.100.100.100 192.168.10.1
 ip nhrp nhs 100.100.100.100
 ip nhrp network-id 1
 tunnel source e0/0
 tunnel mode gre multipoint
 tunnel key 1
!
interface e0/0
 description To WWW
 ip address 192.168.90.1 255.255.255.252
!
ip route 0.0.0.0 0.0.0.0 192.168.90.2