For the past few years I’ve been filling the role of a part-time sysadmin for a small development lab.

My team defines virtual machines using Packer and hosts the images on vSphere. We then manage them with Ansible. Things generally work pretty well, and I can focus on my day job: development.

But over the past few months the rate at which we provision new VMs and decommission old ones has been increasing, and answering the following questions efficiently has become more important:

1. What is the IP address of a certain machine?
2. How can we reduce friction for daily dev tasks on these machines?
3. Which IP addresses are being used and which are available for assignment?
4. Are our machines healthy?
5. How do we know there are no unaccounted-for machines in our subnet?

And then along with these questions I wanted to stick to these guidelines:

1. No DBs. Let the source of truth be in version control.
2. Automatic not magic. Automate without hiding important information.
3. APIs not UIs. I don't want to learn another dashboard.
4. Cattle not pets. Servers should be killed and rebuilt without any issue.
5. DRY. Try to minimize the number of places information is duplicated.

Up until this point, we’ve been using an Excel spreadsheet to manage our IP addresses. But as you might imagine, this solution wasn’t keeping up: I was constantly encountering out-of-date or just plain missing information. So what to do?

The Solution

  1. Pull the subnet DNS config directly from the Ansible inventory.
  2. Feed the Ansible inventory directly into the Nagios host list.
  3. Automate scans of the subnet and compare to our monitored host list.
  4. Use SSH config and DNS to make it easy for developers to access machines without IP addresses.
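
To make the first three steps concrete, here is a rough Python sketch of the idea, not the exact tooling my team runs. It assumes a simple INI-style Ansible inventory where each host line sets `ansible_host`. The host names, the `lab.example` domain, the `generic-host` Nagios template, and the list of scanned addresses are all made up for illustration.

```python
import ipaddress

# Hypothetical inventory; names and addresses are invented for this example.
INVENTORY = """\
[build]
build01 ansible_host=10.0.0.11
build02 ansible_host=10.0.0.12

[db]
db01 ansible_host=10.0.0.21
"""


def parse_inventory(text):
    """Return {hostname: ip} for every host line that sets ansible_host."""
    hosts = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and [group] headers.
        if not line or line.startswith(("#", ";", "[")):
            continue
        name, *pairs = line.split()
        for pair in pairs:
            key, _, value = pair.partition("=")
            if key == "ansible_host":
                hosts[name] = value
    return hosts


def dns_records(hosts, domain="lab.example"):
    """Render one zone-file style A record per inventory host."""
    return [f"{name}.{domain}. IN A {ip}" for name, ip in sorted(hosts.items())]


def nagios_hosts(hosts, template="generic-host"):
    """Render one Nagios host object definition per inventory host."""
    blocks = []
    for name, ip in sorted(hosts.items()):
        blocks.append(
            "define host {\n"
            f"    use        {template}\n"
            f"    host_name  {name}\n"
            f"    address    {ip}\n"
            "}"
        )
    return "\n\n".join(blocks)


def audit(hosts, scanned, subnet):
    """Compare addresses seen in a scan against the inventory.

    Returns (unaccounted, free): addresses that answered the scan but are
    not in the inventory, and subnet addresses that are neither in the
    inventory nor seen in the scan.
    """
    known = set(hosts.values())
    seen = set(scanned)
    unaccounted = sorted(seen - known, key=ipaddress.ip_address)
    free = [
        str(ip)
        for ip in ipaddress.ip_network(subnet).hosts()
        if str(ip) not in known and str(ip) not in seen
    ]
    return unaccounted, free


if __name__ == "__main__":
    hosts = parse_inventory(INVENTORY)
    print("\n".join(dns_records(hosts)))
    print(nagios_hosts(hosts))
    # The scanned list would come from a real subnet scan; it is hard-coded here.
    print(audit(hosts, ["10.0.0.11", "10.0.0.99"], "10.0.0.0/24"))
```

The point of the sketch is that the inventory stays the single source of truth: DNS records, the monitoring host list, and the "who is on my subnet" check are all derived from it rather than maintained by hand.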

You might wonder, why not set up IPAM? That may very well be a good solution, but it would violate two of my preferences: no new UI in my workflow, and everything in version control rather than in a DB.