Tidbit: The basics are Kinda Important, see also: Being Dumb with napalm-ansible

Hello friends!

This is a tiny post to remind you that the basics (and the obvious stuff!) are kinda important, or so it turns out! I was messing around using napalm-ansible to push templatized (templetized? template-ized?) configurations to some devices. Everything was going great until it wasn’t. The configurations rendered by the Ansible template module were looking good, but for whatever reason napalm-ansible kept timing out and leaving me with a nasty-gram like this:

"msg": "cannot install config: Search pattern never detected in send_command_expect: [>##]\\s*$"

Well then… that clearly isn’t ideal. The extra fun part was that if I ran the exact same playbook again immediately after the failure, it would complete and everything would be great. I hate when you reboot a switch (or anything) and the thing you were troubleshooting suddenly works… this is kinda how this felt for me. So, determined to get to the bottom of it, I cloned my CSR a bajillion times and started testing.

I found this GitHub issue: https://github.com/ktbyers/netmiko/issues/555 where Kirk suggested setting the global delay factor. As I understand it, this basically stretches out the timeouts, so that if the underlying napalm config merge is taking a long time for one or more configs we gain some buffer time. I originally set it to 2 as Kirk suggested, then I tried 4, and when that didn’t work I tried 20. Needless to say that took a *really* long time, but it still eventually failed 😦
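For reference, here is a rough sketch of where that knob lives if you drive NAPALM straight from Python instead of through napalm-ansible. I'm assuming the `ios` driver here, which (as far as I know) hands `global_delay_factor` from `optional_args` down to Netmiko. The helper names, hostname, credentials, and file name are all made up for illustration.

```python
def build_optional_args(delay_factor=2):
    """Return NAPALM optional_args asking Netmiko to stretch its timing."""
    if delay_factor < 1:
        raise ValueError("delay factor should be 1 or greater")
    return {"global_delay_factor": delay_factor}


def merge_config(hostname, username, password, config_file, delay_factor=2):
    """Merge config_file into the running config (assumes the 'ios' driver)."""
    # Imported here so build_optional_args works without NAPALM installed.
    from napalm import get_network_driver

    driver = get_network_driver("ios")
    device = driver(
        hostname=hostname,
        username=username,
        password=password,
        optional_args=build_optional_args(delay_factor),
    )
    device.open()
    try:
        device.load_merge_candidate(filename=config_file)
        device.commit_config()
    finally:
        device.close()


# Hypothetical usage, not run here:
# merge_config("csr1.lab.example", "admin", "secret", "rendered.cfg", delay_factor=4)
```

A bigger factor only buys more waiting time. It does nothing about whatever is making the device slow to answer in the first place.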

Doing as Kirk suggested in that same issue and manually running the config replace didn’t really work since 1) I wasn’t doing a config replace, and 2) my configuration template did not contain management access stuff, so doing a replace would gut my connectivity (because it was a CSR I still had access via console, but still).

Eventually, I realized I was just being a noob and not paying attention to the little obvious stuff we always overlook (because you would think it would just work). Turns out that the router I was poking had no public interweb access. Why would this matter, you may ask yourself? If napalm/ansible can get to it, shouldn’t life be all rainbows and puppy dogs? Oh, one would think, but yours truly was doing a dumb thing and pointing not one, but FOUR NTP servers at DNS names. Yeah… without internet access that whole resolving DNS thing doesn’t work out so well. I have no idea what the actual timeouts were (and with the global delay factor set to 20 or whatever I used, it took forever to time out), but it clearly was angering things! Flipping those NTP servers to dummy IPs immediately solved my problem, duh 🙂
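If you want to catch this before pushing, a quick pre-flight check on the rendered config is one option. This is just a sketch under my own assumptions: it expects IOS-style `ntp server` lines (with an optional `vrf NAME`), and the function name and sample hostnames are hypothetical. It only flags names; it does not try to resolve anything, since what matters is whether the router can resolve them, not your Ansible box.

```python
import ipaddress


def is_ip(value):
    """True if value is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def ntp_hostnames(config_text):
    """Return NTP server targets in the config that are names, not IPs."""
    names = []
    for line in config_text.splitlines():
        words = line.split()
        if len(words) < 3 or words[:2] != ["ntp", "server"]:
            continue
        # Handle the assumed "ntp server vrf NAME target" form.
        if words[2] == "vrf" and len(words) >= 5:
            target = words[4]
        else:
            target = words[2]
        if not is_ip(target):
            names.append(target)
    return names


rendered = """\
hostname csr1
ntp server 0.pool.ntp.org
ntp server 10.0.0.1
ntp server vrf Mgmt time.example.com
"""

for name in ntp_hostnames(rendered):
    print(f"heads up: NTP server '{name}' needs working DNS on the device")
```

Anything it prints is a line worth a second look if the device has no DNS or no path to the internet.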

This has been your friendly public service reminder that it’s always something obvious and simple!


