Setting Up A Docker Swarm Cluster
Important Notice!
Since the 1.12 release of Docker, swarm is built into the Docker engine, and you should read this tutorial instead.
Introduction
Docker Swarm is native clustering for Docker, which turns multiple Docker hosts into a single virtual Docker host. This tutorial uses etcd as the backend discovery service, but this is not the only way to deploy a cluster.
Terms of Reference
- node - a computer, virtual machine, or server.
- master/manager - the node or "gateway" that will be used to control the cluster. E.g. you will send your deployment commands to this node.
Prerequisites
- 4 virtual machines
  - 1 for running the discovery service (etcd).
  - 1 for acting as the cluster master/manager. We will call this swarm-master.
  - 2 for deploying docker containers to. We will call these swarm1 and swarm2.
Steps
Install etcd on the etcd server and start the process by executing:
cd /path/to/etcd-v2.2.2-linux-amd64
MY_IP="[server IP]"
./etcd \
  -name infra0 \
  -initial-advertise-peer-urls http://$MY_IP:2380 \
  -listen-peer-urls="http://0.0.0.0:2380,http://0.0.0.0:7001" \
  -listen-client-urls="http://0.0.0.0:2379,http://0.0.0.0:4001" \
  -advertise-client-urls="http://$MY_IP:2379" \
  -initial-cluster-token etcd-cluster-1 \
  -initial-cluster infra0=http://$MY_IP:2380 \
  -initial-cluster-state new
Install docker on each of the docker nodes (swarm1, swarm2, and swarm-master).
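Before moving on, it can be worth confirming that etcd is answering on its client port. The following is a minimal sketch, assuming curl is installed and that the /version endpoint of etcd 2.x is reachable on port 2379. The function names etcd_version_url and etcd_is_up are made up for this example.

```shell
# Build the URL of etcd's version endpoint on the client port.
# Usage: etcd_version_url HOST [PORT]   (PORT defaults to 2379)
etcd_version_url() {
    printf 'http://%s:%s/version\n' "$1" "${2:-2379}"
}

# Return success if etcd answers on its client port within 5 seconds.
# Assumption: curl is installed on the machine running the check.
etcd_is_up() {
    curl --silent --fail --max-time 5 "$(etcd_version_url "$@")" > /dev/null
}

# Example (replace the placeholder with your etcd server's IP):
#   etcd_is_up "[server IP]" && echo "etcd is reachable"
```

Running the check from one of the docker nodes rather than from the etcd server itself also confirms that the client port is reachable across the network.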
On each of the docker nodes, stop the docker daemon...
sudo service docker stop
... and then start the daemon so that it listens on TCP port 2375 as well as on the local Unix socket.
docker daemon -H tcp://0.0.0.0:2375 \
  -H unix:///var/run/docker.sock
Pull the Docker Swarm image on each of the swarm nodes (including the master).
docker pull swarm:latest
On each of the docker nodes except the master, join the swarm by executing:
docker run swarm join \
  --advertise=[node IP or hostname]:2375 \
  etcd://[etcd server IP or hostname]:2379/swarm
On the management node, execute the following to start managing the cluster, replacing etcd.programster.org with the address of your own etcd server.
docker run -p [management port]:2375 swarm manage \
  etcd://etcd.programster.org:2379/swarm
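Since the join command differs on each node only by the advertised address, a small helper can print it for review before you run it. This is just a sketch: swarm_join_cmd is a hypothetical name, and the example.org hostnames are placeholders rather than real servers.

```shell
# Print the join command for one node so it can be reviewed before running it.
# Usage: swarm_join_cmd NODE_ADDRESS ETCD_ADDRESS
swarm_join_cmd() {
    printf 'docker run swarm join --advertise=%s:2375 etcd://%s:2379/swarm\n' "$1" "$2"
}

# Example with placeholder hostnames (not real servers):
#   for node in swarm1.example.org swarm2.example.org; do
#       swarm_join_cmd "$node" etcd.example.org
#   done
```

The helper only prints text, so nothing is started until you copy the line onto the relevant node.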
List Nodes In Discovery
At this point it's a good idea to check that your nodes have registered with the discovery service correctly by running the following command on any node:
docker run --rm swarm list etcd://etcd.programster.org:2379/swarm
You should get output similar to:
swarm1.programster.org:2375
swarm2.programster.org:2375
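If you have more than a couple of nodes, comparing the list by eye gets tedious. The sketch below assumes the list command prints one address per line and reports any expected node that is absent. The function name missing_nodes is made up for this example.

```shell
# Print every expected node that is absent from the `swarm list` output
# supplied on standard input. Returns 0 only when no node is missing.
# Usage: <swarm list command> | missing_nodes NODE_ADDRESS...
missing_nodes() {
    registered=$(cat)
    rc=0
    for expected in "$@"; do
        # -x matches whole lines only, -F treats the address as a literal string
        if ! printf '%s\n' "$registered" | grep -qxF "$expected"; then
            printf '%s\n' "$expected"
            rc=1
        fi
    done
    return "$rc"
}

# Example, using the addresses from this tutorial:
#   docker run --rm swarm list etcd://etcd.programster.org:2379/swarm \
#     | missing_nodes swarm1.programster.org:2375 swarm2.programster.org:2375
```

No output and a zero exit status mean every expected node has registered.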
Check Cluster Running Correctly
The cluster should now be up and running, so we can query the master for information about it. Execute this command from any node, or even from a computer outside the cluster.
docker -H tcp://[management node]:[management port] info
For example:
docker -H tcp://swarm-master.programster.org:4000 info
This should output something like the following:
Containers: 65
Images: 2
Role: primary
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 2
 swarm1.programster.org: swarm1.programster.org:2375
  └ Status: Healthy
  └ Containers: 56
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 783.7 MiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.16.0-4-amd64, operatingsystem=Debian GNU/Linux 8 (jessie), storagedriver=aufs
 swarm2.programster.org: swarm2.programster.org:2375
  └ Status: Healthy
  └ Containers: 9
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 783.7 MiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.16.0-4-amd64, operatingsystem=Debian GNU/Linux 8 (jessie), storagedriver=aufs
CPUs: 2
Total Memory: 1.531 GiB
Name: 920b14ff2d9f
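The Nodes line is the quickest thing to check in that output. As a rough sketch, assuming the output has a line of the form "Nodes: N" as in the sample above, the following helpers pull that number out and compare it with the count you expect. Both function names are hypothetical.

```shell
# Read swarm `docker info` output on standard input and print the number
# reported on the first "Nodes:" line.
swarm_node_count() {
    awk -F': *' '$1 == "Nodes" && !seen { print $2; seen = 1 }'
}

# Compare the reported node count with the expected one.
# Usage: <docker info command> | check_node_count EXPECTED
check_node_count() {
    expected_count="$1"
    actual_count=$(swarm_node_count)
    if [ "$actual_count" = "$expected_count" ]; then
        echo "OK: $actual_count nodes"
    else
        echo "MISMATCH: expected $expected_count nodes, manager reports ${actual_count:-none}" >&2
        return 1
    fi
}

# Example, using the manager address from this tutorial:
#   docker -H tcp://swarm-master.programster.org:4000 info | check_node_count 2
```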
Debugging
If your cluster is showing fewer nodes than you actually have, it might be because two of your nodes have the same docker ID. This is usually the case if you installed docker on a virtual machine and then cloned that machine for your other nodes, rather than installing docker on each node individually. You will be able to tell if this is the case because your management node will output something similar to:
time="2016-01-24T16:17:20Z" level=error msg="ID duplicated. OYOB:QJZT:BP46:BLOA:OTTD:ORHZ:2Z6X:PJZM:BLWD:MYTA:YOCK:NWUR shared by swarm2.programster.org:2375 and swarm1.programster.org:2375"
To fix this issue, simply delete the file at /etc/docker/key.json before restarting docker.
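To find out which nodes need this fix, you can pull the two addresses out of the error message. The sketch below assumes the log line has exactly the format of the sample above, and the function name duplicated_id_nodes is made up for this example.

```shell
# Read swarm manager log lines on standard input and print, for each
# "ID duplicated" error, the two node addresses that share a docker ID.
duplicated_id_nodes() {
    sed -n 's/.*ID duplicated\. .* shared by \([^ ]*\) and \([^ "]*\)"*.*/\1 \2/p'
}

# Example, if the manager runs in a container whose logs you can read
# (the container name is a placeholder):
#   docker logs [manager container] 2>&1 | duplicated_id_nodes
```

Lines that do not contain the error are ignored, so the whole manager log can be piped through the function.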
Deployment
Now that you have a working cluster, you can deploy something simple by executing:
docker -H tcp://swarm-master.programster.org:4000 run -d nginx
This will connect to the master and tell it to deploy nginx on one of the nodes it manages.
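Typing the -H option on every command gets repetitive. The docker client also reads the manager address from the DOCKER_HOST environment variable, so a small helper can set it once per shell session. This is a sketch, and use_swarm is a hypothetical name.

```shell
# Point the docker client in this shell session at the swarm manager.
# Usage: use_swarm HOST PORT
use_swarm() {
    DOCKER_HOST="tcp://$1:$2"
    export DOCKER_HOST
}

# Example, using the manager address from this tutorial:
#   use_swarm swarm-master.programster.org 4000
#   docker run -d nginx   # same as passing -H tcp://swarm-master.programster.org:4000
```

Run `unset DOCKER_HOST` to make the client talk to the local docker daemon again.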
Conclusion
We now have a working swarm. Next time we will look into:
- securing our cluster with TLS.
- exploring docker networking.
- scheduling and filtering for deployment.
First published: 16th August 2018