Ansible is a general-purpose automation tool that may be used for configuration management or workflow automation. Configuration management is an “infrastructure as code” practice that codifies the desired state of a system, e.g. which packages and versions should be installed, or which daemons should be running. Workflow automation may be anything from provisioning cloud infrastructure to deploying software.
This article starts with a feature overview, a description of how Ansible fits with Chef or Puppet, and technical examples of configuration management and workflow automation. This article will give you a feel for what Ansible can do, and the high-level topics for further learning.
Features and Overview
Ansible is written in Python and uses SSH to execute commands on different machines. Ansible is agentless, which makes it much easier to get started: all you need is SSH access and Python installed on the relevant machines. Ansible uses declarative YAML “playbooks” to map a group of hosts (from “inventory”) to well-defined roles. The declarations tell Ansible how things should be set up, and Ansible makes the required changes. For example, you may run a “configure web server” playbook that installs Nginx and connects each machine to a load balancer.
The “inventory” is a static or dynamic (e.g. if you’re using EC2) list of machines with associated tags and other metadata (e.g. which user to connect as, or which SSH port to use). Playbooks list “plays”. A “play” declares which machines to execute “tasks” on. Tasks use Ansible “modules” to do the individual work. Ansible has modules to install apt packages, generate files, manage users, connect to cloud providers, and much more. There are hundreds of modules, and new modules are added with every release. It’s important that you understand this hierarchy. Here’s an example workflow represented as an Ansible pseudo playbook:
- Create web server infrastructure (first “play”)
  - Run from local host (machines from “inventory”)
  - Do the following: (“tasks”)
    - Create a new Elastic IP address,
    - Create a new EC2 instance and associate the Elastic IP address, and
    - Add the EC2 instance to the inventory.
- Configure the web server (second “play”)
  - Run on all the web server instances
  - Do the following: (“tasks”)
    - Install Nginx,
    - Clone the latest code from GitHub,
    - Configure Nginx to serve the code,
    - Restart Nginx if required,
    - Make a test request to localhost to see if things are working, and
    - Send a deployment notification.
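The hierarchy of plays, hosts, and tasks maps onto YAML roughly as follows. This is only a skeleton under stated assumptions: every task is a debug placeholder that prints a message instead of doing real work, and the webservers group is assumed to exist in inventory.

```yaml
---
# First play: runs on the control machine itself
- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    # Placeholder tasks: the debug module only prints a message
    - name: Create a new Elastic IP address
      debug:
        msg: "placeholder for a cloud module call"
    - name: Create a new EC2 instance and associate the Elastic IP address
      debug:
        msg: "placeholder for a cloud module call"
    - name: Add the EC2 instance to the inventory
      debug:
        msg: "placeholder for an inventory update"

# Second play: runs on every host in the (assumed) "webservers" group
- hosts: webservers
  tasks:
    - name: Install Nginx
      debug:
        msg: "placeholder for a package module call"
    - name: Make a test request to localhost
      debug:
        msg: "placeholder for an HTTP check"
```

Each top-level list item is one play, and each play pairs a set of hosts with its own list of tasks.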
Notice how this playbook mixes different sets of machines and what should be done with them. We’ll create real playbooks as we progress. You can check the official playbook introduction as well. Let’s focus on how Ansible fits with Chef and Puppet.
Ansible, Chef, and Puppet
Chef and Puppet are two other popular automation tools. All three may be used to solve similar problems and have similar feature sets. There are two major differences between Ansible and Chef/Puppet. The first is the execution model. Both Chef and Puppet are primarily agent based: machines managed by Chef/Puppet run an agent, and the agent checks back with the control machine to see what changes need to happen. This doesn’t require SSH, but it does require infrastructure to run the Puppet/Chef server. Ansible’s agentless, push-based model means that it’s easy to start with and works well for smaller inventories. However, pushing changes from one machine becomes a problem with hundreds of machines. In this case, ansible-pull is one of the options. Ansible also relies on SSH for connecting to machines, so distributing keys is another facet to consider.
The second difference is the language used to describe what to do. Chef uses Ruby code, and Puppet created an entirely new DSL. Ansible uses YAML, so if you already know YAML, then you’re ready to start writing Ansible playbooks.
There are many more in-depth comparisons between these three tools. We won’t spend more time talking about the differences. Here’s a quick summary before we move on to actually using Ansible to get things done:
- Ansible uses YAML to describe work,
- Chef uses Ruby code to describe work,
- Puppet uses a custom DSL to describe work,
- Chef/Puppet are agent based, and
- Ansible is SSH and push based.
Ultimately, these three tools may be used to solve the same set of problems with different technical trade-offs. The best tool for your job depends on your requirements and team.
Configuration Management
For this section, let’s assume that “configuration management” refers to configuring state on existing machines. Common activities are:
- Installing packages,
- Generating files,
- Managing file system permissions,
- Setting environment variables,
- Restarting daemons/services,
- Updating code releases, and
- Managing firewall rules.
Ansible can do all of this and more. Let’s examine the first use case with an example playbook to install nginx. Playbooks are YAML files that list plays with their associated inventory and tasks. All playbooks start off the same way: they define the first play.
---
# localhost is always available
- hosts: localhost
  # Execute commands directly instead of over SSH
  connection: local
  # Tasks: what to do in this play
  tasks:
    # Ansible prints this name when running the playbook
    - name: Install nginx
      # Now declare the module for this task. The "module" defines
      # what to do.
      apt:
        state: present # This package should be installed
        name: nginx
        cache_valid_time: 18000 # 5 hours in seconds
      # Use sudo
      become: true
The apt module manages apt packages. It can install, upgrade, and remove packages. Ansible commonly uses the state argument to define the expected state. Common values are present (to install in this case) or absent (to uninstall in this case). You can find all the allowed values in the apt module docs. The name argument defines the apt package. Note that you can specify versions like foo=VERSION for apt packages. The cache_valid_time option will trigger an apt-get update if the last refresh is older than the TTL. This is useful when you run playbooks on fresh machines. It’s also a great way to avoid refreshing the cache multiple times in the same play/playbook.
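The other options mentioned here can be sketched as task fragments. These are not a complete playbook: they would sit under a play’s tasks key. The package name foo and the version 1.2.3 are placeholders, not a real package or release.

```yaml
# Task fragments only: place these under a play's "tasks:" key
- name: Install a pinned version of a package
  apt:
    # "foo" and "1.2.3" are placeholders
    name: foo=1.2.3
    state: present
    # Refresh the apt cache only if it is older than 5 hours
    cache_valid_time: 18000
  become: true

- name: Make sure a package is not installed
  apt:
    name: foo
    state: absent
  become: true
```

Because both tasks describe a desired state rather than an action, running them repeatedly should report ok once the system already matches.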
The nginx playbook uses localhost and a local connection, so we can run it without needing a remote machine. Save it as install-nginx.yml, then run it using ansible-playbook.
ansible-introduction $ ansible-playbook install-nginx.yml
[WARNING]: Host file not found: /etc/ansible/hosts
[WARNING]: provided hosts list is empty, only localhost is available
PLAY [localhost] ***************************************************************
TASK [setup] *******************************************************************
ok: [localhost]
TASK [Install nginx] ***********************************************************
changed: [localhost]
PLAY RECAP *********************************************************************
localhost : ok=2 changed=1 unreachable=0 failed=0
Note the various ok and changed items. Ansible knows if an individual task is already as expected or something must be changed. We can re-run the playbook to see what happens. This time everything is simply ok.
PLAY RECAP *********************************************************************
localhost : ok=2 changed=0 unreachable=0 failed=0
Let’s consider a case where we may have non-apt-based systems. We’d like to reuse the same playbook but use a different module depending on the operating system. Ansible supports conditional execution. This works well when combined with Ansible’s built-in variables. Let’s refactor this playbook to work against apt and yum distributions.
---
# Assume this is defined in inventory
- hosts: webservers
  gather_facts: true
  tasks:
    - name: Install nginx
      apt:
        # This package should be installed
        state: present
        name: nginx
        # 5 hours in seconds
        cache_valid_time: 18000
      become: true
      # Check the built-in ansible_distribution variable
      when: 'ansible_distribution == "Ubuntu"'
    - name: Install nginx
      yum:
        state: present
        name: nginx
        # The yum module does not have an explicit TTL flag,
        # so set update_cache and Ansible will update the cache
        # if needed
        update_cache: true
      become: true
      # Check the built-in ansible_distribution variable
      when: 'ansible_distribution == "Fedora"'
Note the new when keys. These are Jinja2 expressions, written much like Python, that evaluate to true or false. The playbook also introduces variables generated by facts. Facts are information about remote systems, collected when gather_facts: true is set. Examples include the Linux kernel version, number of disks, partition information, amount of physical memory, or Linux distribution. The ansible_distribution variable is a fact. It’s used to skip the tasks that don’t match the host’s Linux distribution.
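To get a feel for facts, a small playbook can print a few of them and use one in a condition. This is a sketch that assumes the same webservers inventory group as above. The fact names are standard Ansible facts, but the values depend entirely on the hosts you run it against.

```yaml
---
- hosts: webservers
  gather_facts: true
  tasks:
    - name: Show a few gathered facts
      debug:
        msg: "{{ ansible_distribution }} {{ ansible_distribution_version }}, kernel {{ ansible_kernel }}, {{ ansible_memtotal_mb }} MB RAM"

    # ansible_os_family groups related distributions, e.g. Ubuntu
    # and Debian both report "Debian"
    - name: Run only on Debian-family systems
      debug:
        msg: "This host uses apt"
      when: ansible_os_family == "Debian"
```

Conditions on ansible_os_family are one way to avoid listing every distribution by name, at the cost of being less specific than ansible_distribution.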
Let’s wrap up this section with links to relevant documentation and follow up material:
- file module – Manage files and their permissions,
- git module – Clone git repos,
- copy module – Copy files to remote machines,
- template module – Generate Jinja2 templates,
- service module – Control system daemons,
- Package management modules – List of all package management modules,
- Command modules – For when you need to execute low level commands,
- File management modules – Modules for creating new files, manipulating their content, and managing permissions,
- Working with Variables – Everything you ever wanted to know about using variables in playbooks,
- Working with Templates – Guide to using Jinja2 templates and Ansible’s custom additions, and
- Working with Conditionals – More on using conditionals (when) and other task control structures.
Workflow Automation
Remember the pseudo playbook from earlier where we described creating EC2 infrastructure? Ansible can automate that as well. This falls into the “workflow automation” bucket. Processes commonly mix creating or manipulating some infrastructure with some sort of configuration management. Let’s consider other use cases. You may create an Ansible playbook that temporarily removes a machine from a load balancer, pulls code with the git module, runs a smoke test, then adds the machine back to the load balancer. You may also use Ansible to create compute infrastructure in the cloud, configure DNS, then connect to those machines and configure them to run your application. The common thread here is a mix of traditional configuration management and infrastructure manipulation. Ansible provides the best of both worlds through its large module library and flexibility.
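The load balancer use case might be sketched like this. Several things here are assumptions made for illustration: the lb-remove and lb-add scripts are hypothetical stand-ins for whatever your load balancer provides, the repository URL and checkout path are placeholders, and the /ping endpoint is assumed to exist in the application.

```yaml
---
- hosts: webservers
  # Update one machine at a time so the rest keep serving traffic
  serial: 1
  tasks:
    - name: Remove the machine from the load balancer
      # Hypothetical script, run on the control machine
      command: "/usr/local/bin/lb-remove {{ inventory_hostname }}"
      delegate_to: localhost

    - name: Pull the latest code
      git:
        repo: https://example.com/demo/app.git  # placeholder URL
        dest: /srv/app                          # placeholder path
        version: master

    - name: Run a smoke test on the machine itself
      uri:
        url: "http://localhost/ping"
        status_code: 200

    - name: Add the machine back to the load balancer
      # Hypothetical script, run on the control machine
      command: "/usr/local/bin/lb-add {{ inventory_hostname }}"
      delegate_to: localhost
```

If the smoke test fails, Ansible stops processing that host, so the machine stays out of the load balancer until someone investigates. Whether that is the right behaviour depends on your setup.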
Let’s look at an example that deploys changes to a Heroku application. The playbook creates a new Heroku application if required, finds its URL, pushes a new deploy via git, then does a test HTTP request with the uri module. This could be done with a small Bash script, but that would not scale up as nicely as an Ansible playbook. It’s possible to extend this playbook. For example, we could add a custom domain and automatically create its DNS records using Ansible’s built-in modules for providers like CloudFlare or AWS Route53.
---
# This play should only execute _once_. Target localhost and
# run modules directly.
- hosts: localhost
  connection: local
  # Speed up execution by disabling unused facts
  gather_facts: false
  # Define variables used in this play
  vars:
    app_name: demo
  tasks:
    - name: Get existing heroku apps
      command: heroku apps
      changed_when: false
      # Save output in a variable
      register: apps
    - name: Create heroku app
      command: "heroku create {{ app_name }}"
      # Only create the application if it does not exist yet
      when: 'apps.stdout.find(app_name) == -1'
    # Use the shell module to pick out the generated application URL
    # and save it in a variable
    - name: Get the app url
      shell: |
        heroku apps:info '{{ app_name }}' --shell \
          | grep -F 'web_url' \
          | cut -d '=' -f 2
      changed_when: false
      register: app_url
    # Trigger a deploy with a git push
    - name: Deploy to Heroku
      command: git push heroku master
    # Use the URI module to make a test HTTP request
    # to the application URL to verify the workflow.
    - name: Test deploy
      uri:
        url: "{{ app_url.stdout }}/ping"
        status_code: 200
You may find more examples in our previous tutorials:
- Continuous Deployment with Docker, AWS, and Ansible
- Continuous Deployment with Google Container Engine
- Continuous Deployment with Golden Images
- Continuous Deployment for Static Sites
These tutorials cover using Ansible to automate a mix of infrastructure creation, provisioning, configuration management, and verification.
You may also like the list of cloud modules.
Running Ansible Playbooks on Semaphore
Using Ansible in the CI and deployment environment on Semaphore is straightforward. As Ansible is preinstalled, you can run ansible-playbook path/to/your/playbook.yml or anything else that you need, and Semaphore will execute it.
What’s Next?
We’ve covered two large Ansible use cases, but there is so much more. Let’s highlight some important areas so you know what to look into next.
Ansible has a robust role system. Roles are self-contained, reusable bundles of tasks and their supporting files that may be applied to different machines. You may have a “web server”, “mysql-replica”, or “redis” role. Roles may declare dependencies and customize variables. This allows you to keep small, focused units of code separate and compose them into larger, more useful roles. Ansible Galaxy is a repository of community-contributed roles. Odds are you will rely on some of these in your day-to-day work. You’ll find roles to set up databases, web servers, and monitoring systems.
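Applying roles from a playbook could look like the sketch below. The role names, the nginx_port variable, and the common dependency are all placeholders invented for illustration. They assume matching directories exist under roles/ next to the playbook.

```yaml
---
# site.yml: apply roles to the (assumed) "webservers" group
- hosts: webservers
  become: true
  roles:
    # Override one of the role's variables for this play
    - role: web-server
      vars:
        nginx_port: 8080
    - role: redis

# A role declares its dependencies in its own meta file, e.g.
# roles/web-server/meta/main.yml might contain:
#
# dependencies:
#   - role: common
```

Keeping the playbook down to a list of roles leaves the details inside each role, which is what makes roles easy to reuse across different groups of machines.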
Ansible is also extremely customizable. It’s easy to write your own Ansible modules. An Ansible module is simply an executable file that receives its arguments from Ansible (typically as JSON) and prints a JSON result to stdout. This means you may create your own modules in any language, be it Bash, Ruby, Python, Node.js, or Go. Here are some simple examples that wrap commands that do not have Ansible modules. We have not touched on lookups yet. Lookups allow us to pull data from various external systems into Ansible. Lookups are plugins that run on the control machine, and you can write your own in Python. Here are some lookup examples.
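As a small illustration of lookups, the sketch below uses two that ship with Ansible, env and pipe. Both are evaluated on the control machine. The variable names home_dir and today are arbitrary choices for this example.

```yaml
---
- hosts: localhost
  connection: local
  gather_facts: false
  vars:
    # The env lookup reads an environment variable on the control machine
    home_dir: "{{ lookup('env', 'HOME') }}"
    # The pipe lookup returns the output of a command run on the control machine
    today: "{{ lookup('pipe', 'date +%Y-%m-%d') }}"
  tasks:
    - name: Show values fetched through lookups
      debug:
        msg: "Home directory is {{ home_dir }}, and today is {{ today }}"
```

The same pattern applies to lookups that talk to external systems: the lookup fetches a value, and the playbook uses it like any other variable.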
We discussed how Ansible is agentless. This works well for small setups but becomes problematic at a larger scale. There are a few ways to overcome this problem, and ansible-pull is one of them. This command polls a git repo, clones it, and executes the playbooks it contains on the local machine. This is a good solution for auto-scaling cloud infrastructure which needs configuration after it’s created. Ansible Tower is another approach. It’s a full scale mission control system for Ansible installations. It’s currently a paid product only, but rumour has it that it will be open sourced soon.
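One way to set up the ansible-pull approach is to have Ansible itself install a cron job on each machine. This is a sketch under assumptions: the repository URL is a placeholder, the 15-minute schedule is arbitrary, and the repository is assumed to contain a playbook named local.yml.

```yaml
---
- hosts: all
  become: true
  tasks:
    - name: Run ansible-pull every 15 minutes
      cron:
        name: ansible-pull
        minute: "*/15"
        # -U sets the repository URL (placeholder here) and -o runs the
        # playbook only if the repository has changed
        job: "ansible-pull -U https://example.com/demo/ansible-config.git -o local.yml"
```

After this first push, each machine fetches and applies its own configuration, so the control machine no longer has to connect to every host.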
These getting started guides are geared towards specific use cases, and offer more than we could cover together. There are guides for all the common cloud providers, and a few specific use cases.
That’s all. Good luck out there and happy shipping!