BGP Communities Pt. 1

BGP communities are awesome. I could end the post there, but I’m too long-winded for that…

I make the same recommendation to every customer I visit who runs BGP with their provider: start tagging routes with communities (I’ll get into which routes in a minute). Most customers will never have a real use case for these tags, but when you do have to do something interesting with routes, it’s great to already have a mechanism in place to work with.

Before going any further, I’ll point out that every problem has many solutions, and with BGP that’s particularly true: there are many ways to skin the poor cats. My love affair with communities is one way to skin some cats. In the particular case of ensuring that your AS does not become a transit AS, you should probably add a second layer of protection (an AS-path filter list or similar) in addition to the community-based filtering.

So which routes need to be tagged, what does an organization’s community policy look like, and what the hell else do we need to care about with regard to communities? Well, we can start by blatantly plagiarizing the way service providers have been using communities for years. One Step Consulting has an excellent list of ISPs and the communities those ISPs honor: http://onesc.net/communities/. Let’s pick on L3 (Level 3, AS 3356) since they are a lovable giant. Some of their more commonly used communities let a customer change local preference or prepend prefixes. The local preference ones look like this:

customer traffic engineering communities – LocalPref
---------------------------------------------------
3356:70 – set local preference to 70
3356:80 – set local preference to 80
3356:90 – set local preference to 90

Note that this is all standards based and has been around FOREVER (1996!!). The approach is described in RFC 1998: http://tools.ietf.org/html/rfc1998

So why bring this up? Because customers should be aware of these options, and because it gives us a framework for putting communities to work for us. So for an enterprise customer, where would you want to put communities to work? The obvious answer is at the edge, so here is my ‘go-to’ community strategy for the CE routers.

Where 1234 = Customer ASN, 5678 = ISP1 ASN, and 9012 = ISP2 ASN

1234:5678
1234:5679

The above communities correspond to ISP1. The first is associated with ALL routes learned from ISP1; the second marks ISP1 provider-local routes (ISP1+1 AS path). ISP2 gets a similar pair:

1234:9012
1234:9013

A third community would be assigned to ALL ISP learned routes:

1234:999
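
The numbering scheme above can be sketched in a few lines of Python. This is only an illustration: the function name `community_plan` and the dictionary keys are made up for this example, and 1234, 5678, and 9012 are the placeholder ASNs from above. It also assumes standard communities, where each half is a 16-bit number, so the "ISP ASN + 1" trick only works when that value still fits in 16 bits.

```python
def community_plan(customer_asn, isp_asn, any_provider_value=999):
    """Return the three communities used to tag routes learned from one ISP.

    Hypothetical helper for illustration; not part of any library.
    """
    # Each half of a standard community is 16 bits, so every number must fit.
    for value in (customer_asn, isp_asn + 1, any_provider_value):
        if not 0 <= value <= 65535:
            raise ValueError(f"{value} does not fit in a 16-bit community field")
    return {
        "all_routes": f"{customer_asn}:{isp_asn}",          # every route from this ISP
        "provider_local": f"{customer_asn}:{isp_asn + 1}",  # ISP-local routes only
        "any_provider": f"{customer_asn}:{any_provider_value}",  # learned from any ISP
    }


isp1 = community_plan(1234, 5678)
isp2 = community_plan(1234, 9012)
print(isp1)
print(isp2)
```

With the placeholder ASNs this reproduces the values listed above: 1234:5678 and 1234:5679 for ISP1, 1234:9012 and 1234:9013 for ISP2, and 1234:999 for both.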

So why bother with all this? Firstly, we now have a super simple way to keep any prefixes learned from ISP1 from being advertised to ISP2, and vice versa. Inbound from the providers, we apply our communities to the desired routes along with whatever else we have going on. Here’s an example in XR and IOS.
XR (note that RPLs can call other RPLs, so the first one here is basically the ‘parent’ RPL; these reference community sets, a prefix set, and an AS-path set that I’ll leave out so this doesn’t take up a million lines):

 route-policy RPL_ISP1_Inbound
   apply RPL_Deny_Bogons
   apply RPL_All_Provider_Inbound
   apply RPL_ISP1_All_Prefixes
   apply RPL_ISP1_Provider_Local_Prefixes
 end-policy
 !
 route-policy RPL_Deny_Bogons
   if destination in PS_Bogons then
     drop
   else
     pass
   endif
 end-policy
 !
 route-policy RPL_All_Provider_Inbound
   set community COMM_Any_Provider additive
   set local-preference 90
   pass
 end-policy
 !
 route-policy RPL_ISP1_All_Prefixes
   set community COMM_ISP1_All_Prefixes additive
   pass
 end-policy
 !
 route-policy RPL_ISP1_Provider_Local_Prefixes
   if as-path in AS_ISP1_Local then
     set community COMM_ISP1_Provider_Local additive
     pass
   endif
 end-policy

IOS:

 route-map RM_ISP1_Inbound deny 10
 match ip address prefix-list PL_Bogons
 !
 route-map RM_ISP1_Inbound permit 20
 match as-path 1
 set local-preference 90
 set community 1234:999 1234:5678 1234:5679
 !
 route-map RM_ISP1_Inbound permit 30
 set local-preference 90
 set community 1234:999 1234:5678

Basically, all this is just doing what we’ve discussed: plop some communities on some routes so that you can do stuff with them later. The obvious use, as mentioned, is to prevent yourself from becoming transit. Here is a simple RPL/route-map to do just that:
XR:

 route-policy RPL_ISP1_Outbound
   if community matches-any COMM_Any_Provider then
     drop
   endif
   pass
 end-policy

IOS:

 route-map RM_ISP1_Outbound deny 10
 match community COMM_All_Providers
 !
 route-map RM_ISP1_Outbound permit 1000
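
The outbound side boils down to one test per route: does it carry the any-provider community? A minimal Python sketch of that check follows. The function name and the dictionary layout are made up for illustration, and the prefixes are documentation ranges rather than anything from a real table.

```python
ANY_PROVIDER = "1234:999"  # the "learned from any ISP" tag used in this post


def outbound_to_isp(routes):
    """Keep only routes that do not carry the any-provider community."""
    return [r for r in routes if ANY_PROVIDER not in r["communities"]]


rib = [
    {"prefix": "192.0.2.0/24", "communities": set()},                         # our own prefix
    {"prefix": "198.51.100.0/24", "communities": {"1234:999", "1234:5678"}},  # learned from ISP1
    {"prefix": "203.0.113.0/24", "communities": {"1234:999", "1234:9012"}},   # learned from ISP2
]

advertised = [r["prefix"] for r in outbound_to_isp(rib)]
print(advertised)  # ['192.0.2.0/24']
```

Only the locally originated prefix survives, so nothing learned from one provider leaks to the other.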

We could also have skipped the 1234:999 tag and instead just used the 1234:5678 and 1234:9012 communities to deny advertising those routes outbound. I find it simpler to use a single community, though. The communities applied to provider-local routes can be used to pass that smaller subset of prefixes to routers that maybe don’t have enough memory to run full tables, or just don’t have a real business case for full tables, but still want to be able to make semi-intelligent outbound routing decisions. This sounds kind of hokey offhand, but I promise it’s a real thing, and I will elaborate on it in the future 🙂

I think I’ll wrap this up at this point since it’s already rather long. I’ll try to get a write-up done of a topology I’ve deployed a few times at the edge of mid-to-large-sized enterprise customers. It takes advantage of communities to keep route tables smaller where needed while still providing the best possible outbound load balancing across multiple ISPs.

Side note, cats are cool. Do NOT skin real cats. Here is my cat, his name is Luca and he is the greatest cat on the entire planet:

[Image: Luca the cat]


Tidbit: IOS-XR BGP Allocate Label + Some Inter-AS VPNv4

I ran into a problem while doing an INE mock lab this morning…. it basically kicked my ass, so I figured I’d post about it!

The overall scenario is that there are two BGP domains, AS 1000 and AS 2000. Within each AS, there is some standard IGP routing, and IPv4 BGP — including eBGP between the two domains. There is also some route reflection and some other fun stuff, but that’s mostly irrelevant for the purposes of this post. Below is the INE topology drawing.

[Figure: INE lab topology (INE-Lab4-Snip)]

After the tasks that set up the basics, the lab rolls into some inter-AS VPN. Essentially, the routers in AS 1000 and 2000 also have loopbacks in the same VRF that AS 3000 lives in. The initial VPNv4 task asks you to configure the domains so that the loopbacks in this VRF are reachable from both domains.

So the first thing to consider is that BGP labels will need to be sent between the domains. That’s pretty simple: just send-label in IOS or labeled-unicast in IOS-XR. In addition, IOS requires the “mpls bgp forwarding” command on the interfaces between the domains in order to send the labels. For IOS-XR, since the neighbor is on a physical interface and its address is not a /32 (obviously), a static host route to the neighbor pointing out the connected interface is required. This is because IOS-XR will not install labels into the forwarding table when the next hop is anything other than a /32.

After this, we need to ensure that each domain has reachability to the ‘PE’ routers’ loopbacks, so that we have a label switched path the whole way through to the end. (Basically every router is a ‘PE’ here, since they all have a loopback in the ‘customer’ VRF.) However we learn about those loopbacks, we also need labels for them. The important piece is that whatever mechanism teaches us the prefix (the /32 for the PE) must also give us the label. If we learn those /32s via BGP, we need a BGP label. If we learn the PE loopbacks via IGP, we need a label for them via IGP/LDP.
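
As a way to sanity-check that rule on paper, here is a small Python sketch that flags loopbacks whose route and label come from different mechanisms. Everything in it is hypothetical: the function name, the dictionary layout, and every address other than 10.0.0.5/32 are invented for illustration, and it models the rule as stated here rather than anything a router actually computes.

```python
def lsp_breaks(loopbacks):
    """Return the loopbacks whose route and label come from different mechanisms."""
    broken = []
    for prefix, info in loopbacks.items():
        # A /32 learned via BGP needs a BGP label; one learned via IGP needs LDP.
        needed = "bgp" if info["learned_via"] == "bgp" else "ldp"
        if needed not in info["labels_from"]:
            broken.append(prefix)
    return broken


pe_loopbacks = {
    "10.0.0.5/32": {"learned_via": "igp", "labels_from": {"ldp"}},  # fine
    "10.0.0.2/32": {"learned_via": "bgp", "labels_from": {"ldp"}},  # BGP route, no BGP label
    "10.0.0.3/32": {"learned_via": "bgp", "labels_from": {"bgp"}},  # fine
}

print(lsp_breaks(pe_loopbacks))  # ['10.0.0.2/32']
```

Any prefix this returns is a spot where the label switched path would break.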

This leads us to the point of the post! In the course of the lab, I was advertising the loopback of each PE device into BGP on each router individually, i.e. on R5 I advertised 10.0.0.5/32 (loopback0) into BGP locally, on R2 I advertised R2’s loopback locally, etc. This totally worked: R5 and XR1 both had these prefixes in BGP, and while things were configured for normal IPv4 unicast (not labeled) they were advertised across to AS 2000.

Things got a little dicey for me, though, when I moved the eBGP to labeled unicast. R5 was sending prefixes and labels across to AS 2000, but when I shut down that peering session to test that the inter-AS VPNv4 setup also worked across XR1/R3, I was met with crushing defeat!!

Thankfully somebody on the IEOC forums (INE’s forum) had this same problem, and Mr. Brian McGahan was there to save the day… here’s what he said:

Only the originator of the BGP route can allocate the label.  This means that whoever you have the network statement or the redistribute statement on you need to do the allocation there.  In your case if you don’t originate the network on XR1 you’d have to go to R5 and then send-label to XR1, and on XR1 send label back to R5.  That’s why in most designs you just have your edge routers originate the BGP networks on behalf of the IGP network, because then you have a single point of control for them.  You can do it either way but it’s good to know that the problem exists in the first place.

So basically IOS-XR, which was configured for ‘allocate-label all’ in order to send BGP labels across to AS 2000, was NOT actually sending any labels!! This was due to the way I was getting the loopback prefixes piped into BGP. Killing the advertisements on the other routers, and then advertising them into BGP on XR1 instead allowed XR1 to send the labels across.
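
Here is a toy Python model of the rule from the quote, just to show why moving the network statements fixed things. It is a deliberate simplification and an assumption on my part: a router can send a label for a prefix if it originated the prefix into BGP, or if it received the prefix with a label from a peer (the send-label round trip mentioned in the quote). The function and variable names are invented for the example.

```python
def sends_label(router, prefix, originated_on, got_label_from_peer=frozenset()):
    """True if `router` can advertise `prefix` with a label under the rule above."""
    return (
        originated_on[prefix] == router                # it originated the route itself
        or (router, prefix) in got_label_from_peer     # or a peer sent it a labeled copy
    )


before = {"10.0.0.5/32": "R5"}   # loopback advertised into BGP on R5 itself
after = {"10.0.0.5/32": "XR1"}   # loopback advertised into BGP on XR1 instead

print(sends_label("XR1", "10.0.0.5/32", before))  # False
print(sends_label("XR1", "10.0.0.5/32", after))   # True
# The alternative from the quote: R5 sends a labeled copy to XR1.
print(sends_label("XR1", "10.0.0.5/32", before, {("XR1", "10.0.0.5/32")}))  # True
```

The first case is what I originally had: XR1 carried the prefix in BGP but had no label to hand to AS 2000.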

So lesson learned! I’m pretty glad that I ‘messed’ up and was able to come across this because I could totally see Cisco doing something like this on the lab — guess I’ll find out in a few weeks when I sit my first attempt!! 😀