BGP Communities Pt. 1

BGP communities are awesome. I could end the post there, but I’m too long-winded for that…

To every customer I visit that runs BGP with their provider, I make the same recommendation: start tagging routes with communities (I’ll get into which routes in a minute). Most customers will never have a real use case for these tags, but in those cases where you have to do some interesting things with routes, it’s great to already have a mechanism in place to do them with.

Before going any further I’ll point out that every problem has many solutions, and with BGP that’s particularly true — many ways to skin the poor cats. My love affair with communities is one way to skin some cats. In the particular case of ensuring that your AS does not become a transit AS you should probably be doing a secondary layer of protection (filter list or something) in addition to community stuff.
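
For what that secondary layer might look like, here is a minimal IOS sketch of an AS-path filter list that only lets locally originated routes out to a provider. The ASNs (1234 for the customer, 5678 for the ISP), the neighbor address 192.0.2.1, and the list number 10 are placeholders I made up for illustration, and it assumes you have no downstream BGP customers of your own:

```
! Assumed placeholders: local AS 1234, ISP1 AS 5678, ISP1 peer 192.0.2.1
! ^$ matches an empty AS path, i.e. only routes originated inside our own AS.
! Anything learned from another AS carries a non-empty path and is not advertised.
ip as-path access-list 10 permit ^$
!
router bgp 1234
 neighbor 192.0.2.1 remote-as 5678
 neighbor 192.0.2.1 filter-list 10 out
```

This works independently of any community tagging, so a missed tag on the inbound side does not turn you into transit.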

So which routes need to be tagged, what does an organization’s community policy look like, and what the hell else do we need to care about with regards to communities? Well, we can start by blatantly plagiarizing the way service providers have been using communities for years. One Step Consulting has an excellent list of ISPs and the communities those ISPs honor: http://onesc.net/communities/. Let’s pick on Level 3 (AS3356), since they are a lovable giant. Among their more commonly used communities are ones that let a customer change local preference or prepend on their prefixes, for example:

customer traffic engineering communities - LocalPref
----------------------------------------------------
3356:70 – set local preference to 70
3356:80 – set local preference to 80
3356:90 – set local preference to 90

Note that none of this is new. The community attribute itself is defined in RFC 1997, and this way of using it has been around FOREVER (1996!!) and can be found in RFC 1998: http://tools.ietf.org/html/rfc1998

So why bring this up? Because customers should be aware of these options, and because it gives us a framework for putting communities to work for us. So for an enterprise customer, where would you want to put communities to work? The obvious answer is at the edge, so here is my ‘go-to’ community strategy at the CE routers.

Where 1234 = Customer ASN, 5678 = ISP1 ASN, and 9012 = ISP2 ASN

1234:5678
1234:5679

The above communities correspond to ISP1. The first is attached to ALL routes learned from ISP1; the second is attached only to ISP1’s provider-local routes, meaning routes whose AS path is ISP1 alone or ISP1 plus one more AS (ISP1+1 AS path). ISP2 would have similar communities assigned:

1234:9012
1234:9013

A third community would be assigned to ALL ISP learned routes:

1234:999
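
On IOS, these tags are easiest to work with as named community lists. A minimal sketch follows; COMM_All_Providers is the name the IOS outbound route-map later in this post matches on, while the other list names are just my own illustrative picks (borrowed from the XR community-set names):

```
! Display communities as AA:NN instead of a single 32-bit number
ip bgp-community new-format
!
! Every route learned from any provider
ip community-list standard COMM_All_Providers permit 1234:999
! ISP1: all routes, and ISP1 provider-local routes
ip community-list standard COMM_ISP1_All_Prefixes permit 1234:5678
ip community-list standard COMM_ISP1_Provider_Local permit 1234:5679
! ISP2: same idea (list names are hypothetical)
ip community-list standard COMM_ISP2_All_Prefixes permit 1234:9012
ip community-list standard COMM_ISP2_Provider_Local permit 1234:9013
```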

So why bother with all this? Firstly, we now have a super simple way to deny any prefixes learned from ISP1 from being advertised to ISP2 and vice versa. Inbound from the providers we apply our communities to our desired routes along with whatever else we have going on. Here’s an example in XR and IOS:
XR (note that RPLs can call other RPLs, so the first one here is basically the ‘parent’ RPL; these obviously reference community-sets, a prefix-set, and an as-path-set that I’ll leave out so this doesn’t take up a million lines):

route-policy RPL_ISP1_Inbound
  apply RPL_Deny_Bogons
  apply RPL_All_Provider_Inbound
  apply RPL_ISP1_All_Prefixes
  apply RPL_ISP1_Provider_Local_Prefixes
end-policy
!
route-policy RPL_Deny_Bogons
  if destination in PS_Bogons then
    drop
  else
    pass
  endif
end-policy
!
route-policy RPL_All_Provider_Inbound
  set community COMM_Any_Provider additive
  set local-preference 90
  pass
end-policy
!
route-policy RPL_ISP1_All_Prefixes
  set community COMM_ISP1_All_Prefixes additive
  pass
end-policy
!
route-policy RPL_ISP1_Provider_Local_Prefixes
  if as-path in AS_ISP1_Local then
    set community COMM_ISP1_Provider_Local additive
    pass
  endif
end-policy

IOS:

route-map RM_ISP1_Inbound deny 10
 match ip address prefix-list PL_Bogons
!
route-map RM_ISP1_Inbound permit 20
 match as-path 1
 set local-preference 90
 set community 1234:999 1234:5678 1234:5679
!
route-map RM_ISP1_Inbound permit 30
 set local-preference 90
 set community 1234:999 1234:5678
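
The IOS route-map above leans on an AS-path list and a prefix-list that aren’t shown. As a rough sketch of what they might contain (the regexes are my reading of ‘ISP1+1 AS path’, the bogon entries are just a few well-known examples rather than a complete list, and 192.0.2.1 is a made-up neighbor address):

```
! AS-path list 1: ISP1 itself, or ISP1 plus exactly one more AS
ip as-path access-list 1 permit ^5678$
ip as-path access-list 1 permit ^5678_[0-9]+$
!
! Bogons are PERMITted here so that the route-map's deny 10 clause drops them
ip prefix-list PL_Bogons seq 5 permit 0.0.0.0/8 le 32
ip prefix-list PL_Bogons seq 10 permit 10.0.0.0/8 le 32
ip prefix-list PL_Bogons seq 15 permit 127.0.0.0/8 le 32
ip prefix-list PL_Bogons seq 20 permit 172.16.0.0/12 le 32
ip prefix-list PL_Bogons seq 25 permit 192.168.0.0/16 le 32
!
router bgp 1234
 neighbor 192.0.2.1 remote-as 5678
 neighbor 192.0.2.1 route-map RM_ISP1_Inbound in
```

One thing to watch with these regexes: a customer of ISP1 that prepends its own AS shows up as 5678 followed by several copies of the same AS, which the second line will not match, so those routes would land in the permit 30 clause instead.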

Basically all this is just doing what we’ve discussed: plop some communities on some routes so that you can do stuff with them later. The obvious use, as mentioned, is to prevent yourself from becoming transit; here is a simple RPL/route-map to do just that:
XR:

route-policy RPL_ISP1_Outbound
  if community matches-any COMM_Any_Provider then
    drop
  endif
  pass
end-policy

IOS:

route-map RM_ISP1_Outbound deny 10
 match community COMM_All_Providers
!
route-map RM_ISP1_Outbound permit 1000
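
For this to work across more than one edge router, the tags have to survive the iBGP hop between them, and IOS does not send communities to a neighbor unless told to. A sketch of the remaining glue, assuming a second edge router reachable over iBGP at the made-up address 10.0.0.2 and ISP1 at the made-up address 192.0.2.1:

```
! Assumed placeholders: 10.0.0.2 = other edge router (iBGP), 192.0.2.1 = ISP1
! send-community carries our internal tags to the other edge router,
! so its own outbound policy toward ISP2 can match on them.
router bgp 1234
 neighbor 10.0.0.2 remote-as 1234
 neighbor 10.0.0.2 send-community
 neighbor 192.0.2.1 remote-as 5678
 neighbor 192.0.2.1 route-map RM_ISP1_Outbound out
```

Running `show ip bgp community 1234:999` on the second router is a quick way to check that the tags are actually arriving.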

We could also have forgone the 1234:999 tag and instead just used the 1234:5678 and 1234:9012 communities to deny advertising those outbound. I find it simpler to just use a single community though. The communities applied to provider-local routes can be used to send just that smaller subset of prefixes to routers that maybe don’t have enough memory to run full tables, or just don’t have a real business case for full tables, but still want to be able to make semi-intelligent outbound routing decisions. This sounds kind of hokey off-hand, but I promise it’s a real thing, and I will elaborate on it in the future 🙂
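
As a taste of that, here is a hedged IOS sketch of an outbound policy toward a smaller internal router that passes only the provider-local subset of Internet routes. The route-map name, the COMM_Provider_Local list, and the iBGP neighbor address 10.0.0.3 are all hypothetical, COMM_All_Providers is assumed to be the list matching 1234:999 used by RM_ISP1_Outbound, and the small router would still need a default route from somewhere to reach everything else:

```
! Either provider's provider-local tag (separate entries in a list are ORed)
ip community-list standard COMM_Provider_Local permit 1234:5679
ip community-list standard COMM_Provider_Local permit 1234:9013
!
! 10: keep provider-local routes
route-map RM_To_Small_Router permit 10
 match community COMM_Provider_Local
! 20: drop the rest of what the providers sent us
route-map RM_To_Small_Router deny 20
 match community COMM_All_Providers
! 30: everything else (our own internal routes) is allowed
route-map RM_To_Small_Router permit 30
!
router bgp 1234
 neighbor 10.0.0.3 remote-as 1234
 neighbor 10.0.0.3 route-map RM_To_Small_Router out
```

The order of the clauses matters: provider-local routes also carry 1234:999, so they have to be permitted before the broader deny is evaluated.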

I think I’ll wrap this up at this point since it’s already rather long. I’ll try to get a write-up done of a topology I’ve deployed a few times at the edge of mid-to-large enterprise customers, which takes advantage of communities to keep route tables smaller where needed and yet provide the best possible outbound load balancing across multiple ISPs.

Side note, cats are cool. Do NOT skin real cats. Here is my cat, his name is Luca and he is the greatest cat on the entire planet:

[Image: Luca the cat]

Tidbit: Hide Routers from Traceroute

Do you have that one guy in desktop support that always blames the network? Does he constantly ask why his trace route is going through device XYZ!? Then I have a solution for you! Hide your network from that sketchy desktop support guy!

In MPLS it’s easy squeezy (‘no mpls ip propagate-ttl’), but what if you don’t have MPLS? Well, here’s a quick way to do it:

ip access-list extended ACL_Traceroute
 permit icmp any any time-exceeded
 permit icmp any any port-unreachable

First, make an extended ACL to match (permit) ICMP time-exceeded and port-unreachable. Time-exceeded is the message a router sends back to the original sender when it decrements (yes, decremented is a real word) a packet’s TTL to zero and has to drop it; this is what makes each hop show up in a trace. Port-unreachable is what the final destination sends back when a UDP probe lands on a closed port, which is how a UNIX-style trace knows it has reached the end. A router only generates it when the router itself is the target of the trace.

So what are we doing with this ACL? Well we match it in a route-map, point it to the bit bucket, and apply it as local policy:

route-map RM_Traceroute permit 10
 match ip address ACL_Traceroute
 set interface Null0
!
ip local policy route-map RM_Traceroute

The outcome is that the devices configured with this local policy do not show up in a trace route. Below is the initial ‘baseline’ test from an OSX client to an OSX client across three routers:

Carls-MacBook-Pro:~ carl$ traceroute 10.10.200.5
 traceroute to 10.10.200.5 (10.10.200.5), 64 hops max, 52 byte packets
 1 10.10.100.1 (10.10.100.1) 4.144 ms 1.497 ms 2.159 ms <--- Gateway for my first laptop
 2 10.10.1.2 (10.10.1.2) 8.056 ms 6.473 ms 6.217 ms <--- R1
 3 10.10.1.6 (10.10.1.6) 10.247 ms 10.345 ms 10.267 ms <--- R2
 4 10.10.200.5 (10.10.200.5) 12.336 ms 12.538 ms 12.312 ms <--- Final host connected to R3

Now, here is that exact same trace route after adding our local policy:

Carls-MacBook-Pro:~ carl$ traceroute 10.10.200.5
 traceroute to 10.10.200.5 (10.10.200.5), 64 hops max, 52 byte packets
 1 * * *
 2 * * *
 3 * * *
 4 10.10.200.5 (10.10.200.5) 12.807 ms 11.450 ms 12.440 ms

As you can see this doesn’t ‘break’ the trace route, it just kind of hides our routers. Note that a (UDP) trace route TO one of these routers will not complete, because local policy applies to packets the router itself generates, so the port-unreachable reply that would normally end the trace is dropped before it is ever sent. As long as your destination is connected to a router that does NOT have the local policy, though, your traces will complete as per normal.

My understanding is that there are essentially three different ‘flavors’ of trace route — UNIX trace route (UDP), Windows trace route (tracert) which is strictly ICMP, and TCPTraceroute. So I tested this policy with a Windows host as well with basically the same results.

The baseline test:

C:\Users\Carl>tracert 10.10.200.6
Tracing route to Carl-PC [10.10.200.6]
over a maximum of 30 hops:
 1 3 ms 2 ms 2 ms 10.10.100.1
 2 7 ms 6 ms 6 ms 10.10.1.2
 3 11 ms 11 ms 10 ms 10.10.1.6
 4 12 ms 11 ms 12 ms Carl-PC [10.10.200.6]

And after adding the policy:

C:\Users\Carl>tracert 10.10.200.6
Tracing route to Carl-PC [10.10.200.6]
over a maximum of 30 hops:
 1 * * * Request timed out.
 2 * * * Request timed out.
 3 * * * Request timed out.
 4 12 ms 12 ms 13 ms Carl-PC [10.10.200.6]

So everything worked as planned!

I’m fairly confident that this will work for TCP traceroutes as well, since the intermediate hops still answer with ICMP time-exceeded, but I didn’t test it as I’m not 100% sure how to go about doing that… perhaps that’s a post for another time.