If you are new to VM management, it is good to do certain things manually and learn how the system works. However, once you know the basics, you quickly realise there is much more utility in automating mundane, repetitive tasks. Ansible is a tool that enables us to automate server setup in a flexible, error-resistant way. It has certain benefits over writing your own scripts in POSIX shell or Bash, and we will get to these benefits over the course of this tutorial.
Prerequisites
- A Control Node where Ansible will be installed. This can be your desktop or another VPS. We will be using a VM running Ubuntu 20.04 LTS as our Control Node.
- One or more Targets, or Hosts. We will be using another VM running Ubuntu 20.04 LTS as our Host, which Ansible will configure for us.
- A basic understanding of SSH and how to connect to a remote VPS and use it.
Goals
Before we get into the specific details, it is important to state what we are trying to accomplish here. The playbook we are about to write will:
- Add a public SSH key for the root user, allowing us to log in as the root user using our public-private SSH key pair. Here's an introduction to SSH and SSH keys.
- Disable password-based authentication and allow only key-based logins, which are much more secure.
- Update all the packages on the system. Equivalent to running apt update; apt upgrade on Ubuntu or dnf update on CentOS and Fedora.
So let's get started.
Ansible Installation and Basics
On your Control Node, Ansible can be installed using your system's package manager or Python's package manager pip, since Ansible is written in Python. On macOS, it is recommended that you install it using pip or pip3:
$ pip install -U ansible
On Linux, you can get it straight from your system's package manager:
$ apt install ansible # For Debian or Ubuntu based systems
$ dnf install ansible # For RedHat, Fedora or CentOS based systems
On your Target Host, no prior installation is necessary. As long as the Host has an SSH daemon running and Python3 installed, you are good. All the Linux VMs that you can get on SSDNodes (or any other cloud provider) should readily work with Ansible without any manual intervention. For this reason, Ansible is called an agentless automation engine: you don't have to install any specific software on the target.
So now we also know how Ansible works.
- It uses SSH to authenticate and take control of a Host, which means using it is as secure as our SSH connection, and we don't have to worry about additional security threats.
- It uses Python3 (Python2 works but is deprecated, and not recommended) to run all the automation, checks and data collection on the hosts.
Configuring Ansible
There are three key files needed on the control node:
- Ansible Playbook(s) describing what automation to run on your hosts. This will be our main focus.
- An inventory file listing all your hosts and grouping them together in logical ways. On most Linux distros this file is /etc/ansible/hosts
- Ansible's configuration file. On most Linux distros this file is /etc/ansible/ansible.cfg
For the sake of consistency we would like to have everything, the configuration, the inventory and playbooks, in one folder. So we create a folder called playbooks and create the inventory and configuration files inside it:
$ mkdir playbooks
$ touch playbooks/ansible.cfg playbooks/inventory
When run from inside the playbooks folder, Ansible will automatically pick up that directory's ansible.cfg file and use it in place of the main configuration. Edit the ansible.cfg file and add the following contents to it:
[defaults]
inventory = ./inventory
This will set the current directory's inventory file to be the inventory for our playbooks. Because we are starting small, with just one VPS, we will add just one line to the inventory: the IP address (or domain name) of your VPS. Make sure to use your actual IP address and not what is shown below:
127.0.0.1
The official documentation shows how you can create a more complicated inventory capable of organizing hundreds of servers into dozens of categories.
We are targeting only one server, so we just added that one line here. If you want to save the playbooks to a git repo, make sure that you don't include the inventory file with it, especially if it contains sensitive information such as the IP Addresses of all your servers.
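As a rough sketch of what a larger inventory could look like, here is the same idea in Ansible's YAML inventory format, with hosts sorted into two groups. The group names (webservers, databases) and the addresses are made up for illustration; 203.0.113.x is a range reserved for documentation. One assumption to note: by default Ansible's YAML inventory plugin only reads files ending in .yaml, .yml or .json, so this would be saved as inventory.yaml and the inventory line in ansible.cfg changed to match.

```yaml
# inventory.yaml: hypothetical hosts and group names, for illustration only
all:
  children:
    webservers:
      hosts:
        203.0.113.10:
        203.0.113.11:
    databases:
      hosts:
        203.0.113.20:
```

A play could then target a single group with hosts: webservers instead of hosts: all.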
Writing the playbook
A playbook is essentially a description of how you desire the host system to be, also known as the desired state of the system. It is written in YAML, which, if you are unfamiliar, is a language similar to JSON or XML but much more human readable while simultaneously being unambiguous to a computer program. Think of it as a way of describing and structuring data, rather than writing a set of instructions like in a script or a program.
Create a file called initial-setup.yaml and we can start building our playbook. It begins with the following few lines:
---
- name: Initial Setup
  hosts: all
  remote_user: root
The leading --- marks the start of a YAML document, and is optional. Next, we create a list, with each element of the list starting with -. There are going to be lists within lists in our YAML file, and this is the outermost list, with just one item in it.
The first element of this outermost list contains an Object (objects are like dictionaries in Python3) and this object has the following attributes:
- name: Initial Setup sets the name of our playbook as Initial Setup.
- hosts: all selects which hosts from our inventory file this playbook will act upon. We have decided to act on all the hosts inside the inventory.
- remote_user: root sets which user we wish to SSH as into the remote host.
The next item in this Object will be tasks, and herein will lie the bulk of our "configurations". I will show the tasks below as part of the larger playbook, because it is important to note that the indentation level of tasks should be the same as that of name. However, tasks itself has a list of, well, tasks within it. Each task is another object, one indentation level below tasks and name. And each task comes with a name and a module, along with the action that the module is supposed to take.
---
- name: Initial Setup
  hosts: all
  remote_user: root
  tasks:
    - name: Add SSH key for root
      lineinfile:
        path: ~/.ssh/authorized_keys
        create: yes
        state: present
        line: <COPY YOUR PUBLIC SSH KEY HERE>
So the first task here is named "Add SSH key for root" and it uses an Ansible module, lineinfile, which makes sure that a specific line is present (or absent) in a given file. Here, the lineinfile module also contains a dictionary of the following parameters within it:
- path: ~/.ssh/authorized_keys specifies which file this particular task is concerned with.
- create: yes says that, if the file is absent, create it!
- state: present specifies that we want a given line to be present.
- line: <COPY YOUR PUBLIC SSH KEY HERE> specifies what line we want there to be.
Different modules will have different parameters. For example, the package module will not have path or line parameters, because the installation of a package has nothing to do with paths or lines.
Obviously, no one can remember how each module works and what parameters it takes. So, when you are writing your playbooks, the Ansible documentation will be your best guide. For example, the lineinfile module has all its various parameters described, along with the default values and examples, here. The documentation is clean, easy to follow, and full of relevant examples.
You can see in the docs that state: present is already the default value. So you can skip that line from your playbook if you just want to ensure that the given line exists in the file.
Ansible's automation system has hundreds of built-in modules, and you can also access many more community provided modules from ansible-galaxy.
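To give a feel for what such a module looks like, here is a hedged sketch of the same "add a key for root" task written with authorized_key, a module from the ansible.posix collection that is built for managing authorized_keys files. Two assumptions: the ansible.posix collection is available on your Control Node (it ships with the full ansible package from pip, and can otherwise be installed with ansible-galaxy collection install ansible.posix), and your public key lives at ~/.ssh/id_ed25519.pub on the Control Node. Adjust the file name to whatever key you actually use.

```yaml
# Alternative to the lineinfile task; the key file name is an assumption.
- name: Add SSH key for root with a purpose-built module
  ansible.posix.authorized_key:
    user: root
    state: present
    # lookup() runs on the Control Node, so this reads your local public key
    key: "{{ lookup('file', lookup('env', 'HOME') + '/.ssh/id_ed25519.pub') }}"
```

The rest of this tutorial sticks with lineinfile, which works just as well for a single key.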
Running the playbook
To run the above playbook, switch to the directory where the playbook lives and use the ansible-playbook command:
$ cd playbooks/
$ ansible-playbook initial-setup.yaml
If you have not set up your SSH keys and are going to log in using a password, install sshpass on your local machine and use ansible-playbook with the flag --ask-pass to allow Ansible to log in using a plain text password:
$ sudo apt install sshpass
$ ansible-playbook --ask-pass initial-setup.yaml
Enter the password for your VPS when prompted.
The above command will be required only for the first time, since we are adding SSH keys as part of our server configuration.
Idempotent
Why not write a bash script to automate something like that? Well, bash scripts don't have the same reliability as Ansible. If there is a bug in your bash script, even if it is the most minor of bugs, it can potentially clobber your VPS and can lock you out of it, or corrupt its data.
Whereas with Ansible it is much harder to unintentionally clobber the system. For example, if you run the above playbook again, it won't add the same SSH key twice to your authorized_keys file. But if you just do a cat keyfile.pub >> authorized_keys in bash, it will keep adding the line again and again each time you run the script. With more complicated setups, a home-made bash script is unlikely to check for every error and edge case, or to stay idempotent. Idempotent operations are those that can be applied to a system one or more times and leave it in the same state as a single application would. Only the first run changes anything.
So if your inventory grows, or if you add a new task to your playbook, you can simply rerun the playbook, and not worry about the pre-existing tasks, or hosts being affected.
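To make the difference concrete, here is a small sketch of two tasks that try to add the same key. Both use the same placeholder as our playbook. The first simply wraps the shell append in Ansible's shell module, so it behaves exactly like the bash one-liner; the second is the lineinfile approach we have been using.

```yaml
# Two ways to add the same (placeholder) key; only the second is idempotent.
- name: Append key with a raw shell command (adds a duplicate on every run)
  ansible.builtin.shell: echo "<YOUR PUBLIC SSH KEY HERE>" >> ~/.ssh/authorized_keys

- name: Ensure the key is present (changes the file at most once)
  ansible.builtin.lineinfile:
    path: ~/.ssh/authorized_keys
    create: yes
    line: <YOUR PUBLIC SSH KEY HERE>
```

Run these twice and the first task reports changed both times, while the second reports changed on the first run and ok on the second.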
Adding more stuff to our playbook and using Conditionals
Let's get the ball rolling with a slightly more complicated task: upgrading all the system packages.
For Debian/Ubuntu hosts we will use the apt module.
- name: Update all packages for Debian-like Systems
  apt:
    update_cache: yes
    name: '*'
    force_apt_get: yes
    state: latest
Consult the documentation for the apt module to understand the various parameters here.
Obviously, the same module won't work on CentOS, RedHat and Fedora like systems, so we need to ensure that this task runs only for the Debian family of distributions. To do that, we will use the fact that Ansible gathers certain facts about the host, like what operating system it is running, which package manager it uses, etc.
We can use this information to create a condition statement like so:
- name: Update all packages for Debian-like Systems
  apt:
    update_cache: yes
    name: '*'
    force_apt_get: yes
    state: latest
  when: ansible_facts['os_family'] == 'Debian'
The when needs to be at the same indentation level as apt and name. This will ensure that this task is skipped for CentOS, RedHat and Fedora like systems. Let's write another task to update packages on those systems.
- name: Update all packages for RHEL like systems
  dnf:
    update_cache: yes
    name: "*"
    state: latest
  when: ansible_facts['os_family'] == 'RedHat'
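If you are not sure which value a host reports, a quick way to find out is to print the fact. The task below is a minimal sketch using the debug module; the task name and message text are my own, while os_family and distribution are fact names that Ansible gathers by default.

```yaml
- name: Show the facts used in the conditions above
  ansible.builtin.debug:
    msg: "os_family={{ ansible_facts['os_family'] }}, distribution={{ ansible_facts['distribution'] }}"
```

Put it before the update tasks while you are experimenting, and remove it once the when conditions behave the way you expect.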
Looping through a list of items in a task
The above two tasks of updating and upgrading the systems are the only ones that are OS specific. Ansible has a generic module called package that can be used to install packages on top of most package managers. So we can use this to install a whole list of packages now:
- name: Install Packages
  package:
    name: "{{packages}}"
    state: present
  vars:
    packages:
      - vim
      - curl
      - wget
      - nginx
Notice, we did something more intricate here. Instead of writing out a list of tasks, all of which use the package module, like below:
- name: Install vim
  package:
    name: vim
    state: present
We instead created a list of packages that we need and passed the whole list to a single task. This is the power of Ansible and YAML. The little bits and pieces that we learned earlier, like lists and objects, are now used to extend really simple tasks. The vars keyword is special to Ansible and is used to declare variables; in this case the variable is called packages, which itself has a list of package names inside it. Ansible sees the variable name inside curly braces, like this {{packages}}, and substitutes the list in its place. Strictly speaking this is not a loop: the package manager receives all the names at once and installs them in one go.
Ansible also has dedicated looping keywords, loop and with_items.
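For comparison, here is a sketch of the same installation written with the loop keyword. Inside a looped task, Ansible makes the current element available as item, so the module runs once per package rather than once for the whole list.

```yaml
- name: Install packages one at a time with loop
  ansible.builtin.package:
    name: "{{ item }}"
    state: present
  loop:
    - vim
    - curl
    - wget
    - nginx
```

For package installation, handing the module the full list is usually the faster choice; loop earns its keep with modules that only accept a single value, such as lineinfile.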
Editing Configuration file
We initially promised that we would automatically configure SSH to accept only keys and not plain text passwords. To do this, we will again use the lineinfile module to rewrite our /etc/ssh/sshd_config file, and this time we will add a handler (which is another concept in Ansible) that will restart the SSHD service every time the configuration file changes. Remember that Ansible playbooks are idempotent, so this doesn't mean that the SSHD service will restart every time you run the playbook.
    - name: Configure SSH Daemon
      lineinfile:
        path: /etc/ssh/sshd_config
        regexp: '^#?\s*PasswordAuthentication\s+(yes|no)\s*$'
        line: 'PasswordAuthentication no'
      notify: Restart SSHD
  handlers:
    - name: Restart SSHD
      service:
        name: sshd
        state: restarted
This needs a bit of explanation. The lineinfile module, here, searches for a given regexp (a regular expression), which is essentially a pattern. Here, the pattern says: look for a line that may start with # (that is ^#?), followed by optional whitespace (\s*), followed by PasswordAuthentication, followed by at least one space (\s+) and then either yes or no, followed by optional trailing whitespace and the end of the line (\s*$). Note the parentheses in (yes|no): they form a group that matches one of the two words, whereas square brackets would match individual characters. The pattern matches the following lines:
#PasswordAuthentication yes
# PasswordAuthentication yes
PasswordAuthentication yes
And the same three patterns are repeated with no at the end.
Once such a line is found, the lineinfile module replaces it with PasswordAuthentication no (if several lines match, only the last one is replaced) and then, if the file actually changed, it notifies the handler Restart SSHD to restart the sshd service so that the new configuration takes effect. If the pattern is not found, the line is added at the end of the file.
If all of this seems a bit heavy, just go through the documentation. You don't need regular expressions to work with ansible. It is only a small part of it. But if you are interested in knowing more here is a really great video on the topic.
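Since a broken sshd_config can lock you out of the server, it can be worth letting Ansible check the file before it is written. The sketch below adds lineinfile's validate parameter, which runs a command against a temporary copy of the edited file (passed in as %s) and aborts the change if the command fails. The pattern shown is one possible way to match the commented and uncommented forms of the line, and the path /usr/sbin/sshd is an assumption that holds on typical Ubuntu installs but may differ elsewhere.

```yaml
- name: Configure SSH Daemon, checking the file before it is saved
  ansible.builtin.lineinfile:
    path: /etc/ssh/sshd_config
    regexp: '^#?\s*PasswordAuthentication\s+(yes|no)\s*$'
    line: 'PasswordAuthentication no'
    # sshd -t only tests the configuration; -f points it at the temporary copy
    validate: /usr/sbin/sshd -t -f %s
  notify: Restart SSHD
```

If the validation fails, the task fails, the real file is left untouched, and the handler is never notified.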
Conclusion
Here is a complete playbook for you to run, with different syntax for loops illustrated, and with a few additional handlers to clean up unused old packages from the system. Be sure to add your public SSH key in the appropriate spot.
---
- name: Initial Setup
  hosts: all
  remote_user: root
  tasks:
    - name: Add SSH key for root
      lineinfile:
        path: ~/.ssh/authorized_keys
        create: yes
        state: present
        line: <YOUR PUBLIC SSH KEY HERE>
    # Each item pairs a pattern for the existing line with its replacement
    - name: Configure SSH Daemon
      lineinfile:
        path: /etc/ssh/sshd_config
        regexp: "{{ item.regexp }}"
        line: "{{ item.line }}"
      with_items:
        - { regexp: '^#?\s*Port\s+[0-9]+\s*$', line: 'Port 22' }
        - { regexp: '^#?\s*PermitRootLogin\s+(yes|no|without-password|prohibit-password)\s*$', line: 'PermitRootLogin without-password' }
        - { regexp: '^#?\s*PasswordAuthentication\s+(yes|no)\s*$', line: 'PasswordAuthentication no' }
      notify: Restart SSHD
    - name: Update all packages for Debian-like Systems
      apt:
        update_cache: yes
        name: '*'
        force_apt_get: yes
        state: latest
      when: ansible_facts['os_family'] == 'Debian'
      notify: Autoremove Packages using APT
    - name: Update all packages for RHEL like systems
      dnf:
        update_cache: yes
        name: "*"
        state: latest
      when: ansible_facts['os_family'] == 'RedHat'
      notify: Autoremove Packages using DNF
    - name: Install Packages
      package:
        name: "{{packages}}"
        state: present
      vars:
        packages:
          - vim
          - curl
          - wget
  # Handlers run only when a task that notifies them reports a change
  handlers:
    - name: Autoremove Packages using DNF
      dnf:
        autoremove: yes
    - name: Autoremove Packages using APT
      apt:
        autoremove: yes
    - name: Restart SSHD
      service:
        name: sshd
        state: restarted
I hope that this playbook will save you a lot of time and hassle in the future!