How to use ansible with vagrant environments

Ansible and vagrant are popular tools for automating infrastructure provisioning, but getting them to work together requires some extra setup. Once understood, the combination lets engineers and operators create, provision and manage the lifecycle of virtual machine instances fully automatically.

Using a hardcoded inventory

While not exactly standard, if all you need is to access a single virtual machine created by vagrant, you can blindly use the default configuration for vagrant's ssh. First create and start a vm:

vagrant init debian/bookworm64
vagrant up

Then fill the defaults into an ansible inventory:

inventory.ini

default ansible_host=127.0.0.1 ansible_port=2222 ansible_user='vagrant' ansible_ssh_private_key_file='.vagrant/machines/default/virtualbox/private_key'

That should be all you need to run playbooks against a single vm instance. Test your configuration with:

ansible -i inventory.ini -m ping all

Note that this approach only works for a single vm instance, and only if the local vagrant installation is using defaults for port-forwarding and ssh setup.
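If the defaults do not apply, the same inventory line can be derived from the output of vagrant ssh-config instead of being typed by hand. The following Ruby sketch assumes that output uses the usual OpenSSH "Key value" layout with a single Host block; the helper name inventory_line and the SAMPLE text are made up for illustration and are not part of vagrant.

```ruby
# Sketch: turn `vagrant ssh-config` output into an ansible inventory line.
# Assumption: one "Host" block with HostName, Port, User and IdentityFile keys.
# The function name and SAMPLE text are illustrative, not part of vagrant.

def inventory_line(ssh_config)
  settings = {}
  ssh_config.each_line do |line|
    key, value = line.strip.split(/\s+/, 2)
    next if key.nil? || value.nil?
    settings[key] ||= value # keep the first occurrence of each key
  end

  required = %w[Host HostName Port User IdentityFile]
  missing = required.reject { |k| settings.key?(k) }
  raise ArgumentError, "missing keys: #{missing.join(', ')}" unless missing.empty?

  format(
    "%s ansible_host=%s ansible_port=%s ansible_user='%s' ansible_ssh_private_key_file='%s'",
    settings["Host"], settings["HostName"], settings["Port"],
    settings["User"], settings["IdentityFile"]
  )
end

# Sample input mirroring the defaults used in inventory.ini above.
SAMPLE = <<~CONFIG
  Host default
    HostName 127.0.0.1
    User vagrant
    Port 2222
    IdentityFile .vagrant/machines/default/virtualbox/private_key
CONFIG

puts inventory_line(SAMPLE)
```

In practice the input text would come from running vagrant ssh-config in the project directory rather than from a hardcoded sample, and the key path it reports may differ from the one shown here.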

Simple provisioning

A more robust approach to running ansible playbooks on vms is to use the ansible provisioner directly inside the Vagrantfile:

Vagrantfile

Vagrant.configure("2") do |config|
  config.vm.box = "debian/bookworm64"
  config.vm.provision "ansible" do |ansible|
    ansible.playbook = "playbook.yml"
  end
end

Using the provisioner allows running a playbook directly after the vagrant machine is created with vagrant up.

This approach also works for vagrant setups involving more than one vm, and can even invoke different playbooks for different hosts:

Vagrant.configure("2") do |config|
  vms = {
    "vm1" => "playbook1.yml",
    "vm2" => "playbook2.yml"
  }
  vms.each do |name, playbook|
    config.vm.define name do |machine|
      machine.vm.box = "debian/bookworm64"
      machine.vm.hostname = name
      machine.vm.provision "ansible" do |ansible|
        ansible.playbook = playbook
      end
    end
  end
end

While playbooks can be run for each host this way, it may not be sufficient for more complex infrastructures.

Optimizing playbook execution

Be careful where you invoke the ansible playbook: when defining vms in a loop in your Vagrantfile, the ansible provisioner will run the playbook for each vm immediately after that vm is created:

Vagrant.configure("2") do |config|
  config.vm.box = "debian/bookworm64"
  (1..3).each do |machine_id|
    config.vm.define "machine#{machine_id}" do |machine|
      machine.vm.provision :ansible do |ansible|
        ansible.playbook = "playbook.yml"
      end
    end
  end
end

This can be inefficient for setups involving multiple vms. A more efficient approach would be to first let vagrant create and start all vms, then run the playbook once for all of them:

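Vagrant.configure("2") do |config|
  config.vm.box = "debian/bookworm64"
  # Local variable instead of a constant, so reloading the Vagrantfile
  # does not redefine a constant.
  machine_count = 3
  (1..machine_count).each do |machine_id|
    config.vm.define "machine#{machine_id}" do |machine|
      # Attach the provisioner to the last vm only, so it runs once
      # after every machine is up.
      if machine_id == machine_count
        machine.vm.provision :ansible do |ansible|
          # Override the default limit (the current machine) to target all vms.
          ansible.limit = "all"
          ansible.playbook = "playbook.yml"
        end
      end
    end
  end
end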

The adjusted Vagrantfile above includes an if condition to only run the ansible provisioner after the last vm is created, and sets ansible.limit = "all" to ensure the playbook runs against all vms, not just the last one.

Delaying the playbook execution to run it once against the entire created inventory of vms results in much faster completion times for high vm counts or large playbooks.

The guarded loop is essential for combining all playbook executions into one, because setting ansible.limit = "all" on a provisioner outside such a loop has undesirable effects. Consider this Vagrantfile:

Vagrant.configure("2") do |config|
  config.vm.box = "debian/bookworm64"
  config.vm.define "vm1"
  config.vm.define "vm2"
  config.vm.provision "ansible" do |ansible|
    ansible.playbook = "playbook.yml"
    ansible.limit = "all"
  end
end

You might expect this to run the playbook once against all hosts. Because a top-level provisioner is attached to every vm, it actually runs the playbook against all hosts once per created vm, executing it on the same hosts several times in a row. Do not use ansible.limit = "all" outside of a controlled loop!
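The difference between the three layouts in this section can be made concrete with a small plain-Ruby model that only counts provisioner invocations. It does not call vagrant or ansible; the invocations helper and the strategy names are made up for illustration, and the model assumes that vagrant runs a machine's provisioners right after that machine comes up, with the limit defaulting to the machine's own name.

```ruby
# Toy model: which ansible-playbook invocations does `vagrant up` trigger?
# Each entry is [machine that triggered the run, value used as the limit].
# Strategy names are hypothetical labels for the three Vagrantfiles above.

def invocations(machine_names, strategy)
  machine_names.each_with_object([]) do |name, runs|
    case strategy
    when :per_machine      # provisioner in every define block, default limit
      runs << [name, name]
    when :last_machine_all # provisioner only on the last machine, limit "all"
      runs << [name, "all"] if name == machine_names.last
    when :top_level_all    # top-level provisioner with limit "all"
      runs << [name, "all"]
    else
      raise ArgumentError, "unknown strategy: #{strategy}"
    end
  end
end

machines = %w[machine1 machine2 machine3]

%i[per_machine last_machine_all top_level_all].each do |strategy|
  runs = invocations(machines, strategy)
  puts "#{strategy}: #{runs.length} run(s) -> #{runs.inspect}"
end
```

Only the guarded loop yields a single invocation with limit "all"; the top-level variant triggers one such invocation per machine, which is the repeated execution warned about above.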

Advanced inventories

Most of ansible's inventory features can be mapped onto vagrant's ansible provisioner, allowing the creation of more complex inventories. You can easily create host variables:

Vagrant.configure("2") do |config|
  config.vm.box = "debian/bookworm64"
  config.vm.define "vm1"
  config.vm.define "vm2"
  config.vm.provision "ansible" do |ansible|
    ansible.playbook = "playbook.yml"
    ansible.host_vars = {
      "vm1" => {
        "hostname" => "vm1.example.com",
        "enable_backups" => true
      },
      "vm2" => {
        "hostname" => "vm2.example.com",
        "enable_backups" => false
      }
    }
  end
end

Or groups and group variables:

Vagrant.configure("2") do |config|
  config.vm.box = "debian/bookworm64"
  config.vm.define "vm1"
  config.vm.define "vm2"
  config.vm.provision "ansible" do |ansible|
    ansible.playbook = "playbook.yml"
    ansible.groups = {
      "webservers" => ["vm1"],
      "databases" => ["vm2", "vm3"],
      "databases:vars" => {
        "user" => "root",
        "pass" => "secret"
      },
      "all_groups:children" => [
        "webservers",
        "databases",
        "timeservers"
      ]
    }
  end
end

One thing to note here is that entries which cannot be resolved are not written to the inventory: since the timeservers group has no members, it is omitted, and the nonexistent vm3 in the databases group is quietly ignored. Check your Vagrantfile for typos when debugging issues with ansible playbooks and vagrant.
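Since such entries disappear silently, a quick consistency check of the groups hash can help spot typos before running vagrant up. The sketch below is plain Ruby and independent of vagrant; the helper name undefined_references is hypothetical, and it assumes group members are literal machine names, so host patterns would be reported as unknown.

```ruby
# Sketch: report group members and child groups that are never defined.
# `machines` lists the names passed to config.vm.define; `groups` mirrors
# the hash assigned to ansible.groups. The helper name is hypothetical.

def undefined_references(groups, machines)
  # Plain host groups are keys without a ":vars" or ":children" suffix.
  host_groups = groups.reject { |name, _| name.include?(":") }
  unknown_hosts = host_groups.values.flatten.uniq - machines

  # Groups that exist because they are declared via "<name>:children".
  parent_groups = groups.keys
    .select { |name| name.end_with?(":children") }
    .map { |name| name.delete_suffix(":children") }

  child_groups = groups
    .select { |name, _| name.end_with?(":children") }
    .values.flatten.uniq
  unknown_groups = child_groups - host_groups.keys - parent_groups

  { hosts: unknown_hosts, groups: unknown_groups }
end

machines = %w[vm1 vm2]
groups = {
  "webservers" => ["vm1"],
  "databases" => ["vm2", "vm3"],
  "databases:vars" => { "user" => "root", "pass" => "secret" },
  "all_groups:children" => %w[webservers databases timeservers]
}

result = undefined_references(groups, machines)
puts "unknown hosts:  #{result[:hosts].join(', ')}"
puts "unknown groups: #{result[:groups].join(', ')}"
```

For the example above, this flags vm3 and timeservers, the two entries that are missing from the generated inventory.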

Supporting multiple playbooks

The ansible provisioner for vagrant is useful, but has its limits. Consider a more real-world use case: you need to create a database, then manage its lifecycle (updates, backups, restoring backups etc). While the vagrant provisioner can take care of the initial installation playbook, the other playbooks have to be run repeatedly over the lifetime of the vm, not just once after installation.

To handle such scenarios, you need to refer to the generated inventory file from external ansible commands.

Start with a simple Vagrantfile:

Vagrant.configure("2") do |config|
  config.vm.box = "debian/bookworm64"
  config.vm.define "vm1"
  config.vm.define "vm2"
  config.vm.provision "ansible" do |ansible|
    ansible.playbook = "dummy_playbook.yml"
  end
end

The ansible provisioner needs a valid playbook to run, even if you don't intend to execute any real playbooks from vagrant itself, because you need it to generate the ansible inventory file for you. This ensures that the ansible inventory is always in sync with the actual vagrant infrastructure currently in place.

Create a simple dummy playbook to use from the Vagrantfile:

dummy_playbook.yml

- hosts: all
  gather_facts: false
  tasks:
    - ping:

Note the gather_facts: false setting here, to keep the playbook as lightweight as possible.

After running vagrant up, a new inventory will be generated at .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory. You can link ansible to this file with a config file:

ansible.cfg

[defaults]
inventory = .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory
host_key_checking = False

Take note of the host_key_checking = False line in the file. Since the SSH host keys change every time a vm is recreated, your SSH client will likely complain about a "known" remote server having changed identity. Vagrant automatically disables this check for its ansible provisioner, but when running playbooks from outside vagrant you need to specify this setting manually.

The inventory contains all necessary information to connect and authenticate to each host, including host, port, username, SSH key etc. All defined host vars and group information will also be retained in the generated inventory file.

With all this configuration in place, you can simply run ansible commands as normal from the working directory: