Programster's Blog

Tutorials focusing on Linux, programming, and open source

Setting Up A Docker Swarm Cluster

Notice

As of the 1.12 release of docker, swarm is built into the docker engine, so you should read this tutorial instead.

Introduction

Docker Swarm is native clustering for Docker, which turns multiple Docker hosts into a single virtual Docker host. This tutorial uses etcd as the backend discovery service, but this is not the only way to deploy a cluster.

Terms of Reference

  • node - a computer, virtual machine, or server.
  • master/manager - the node or "gateway" that is used to control the cluster. You will send your deployment commands to this node.

Prerequisites

  • 4 virtual machines
    • 1 for running the discovery service (etcd).
    • 1 for acting as the cluster master/manager. We will call this swarm-master.
    • 2 for deploying docker containers to. We will call these swarm1 and swarm2.

Steps

Install etcd on the etcd server and start the process by executing:

cd /path/to/etcd-v2.2.2-linux-amd64

MY_IP="[server IP]"

# Start a single-member etcd cluster. Ports 2379/2380 are the current
# client/peer ports; 4001/7001 are the legacy ones.
./etcd \
--name infra0 \
--initial-advertise-peer-urls "http://$MY_IP:2380" \
--listen-peer-urls "http://0.0.0.0:2380,http://0.0.0.0:7001" \
--listen-client-urls "http://0.0.0.0:2379,http://0.0.0.0:4001" \
--advertise-client-urls "http://$MY_IP:2379" \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster "infra0=http://$MY_IP:2380" \
--initial-cluster-state new
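
Before moving on, it can help to confirm that etcd is answering on its client port (2379). The following is a minimal sketch, assuming curl is installed and relying on etcd's /version HTTP endpoint. The function names etcd_client_url and etcd_is_up are made up for this example.

```shell
# Build the client URL for an etcd host. Port 2379 is the client port
# that etcd was told to listen on and advertise.
etcd_client_url() {
    printf 'http://%s:2379' "$1"
}

# Return success if etcd answers on /version within 5 seconds.
etcd_is_up() {
    curl --silent --fail --max-time 5 "$(etcd_client_url "$1")/version" > /dev/null
}

# Example (run on any machine that can reach the etcd server):
#   etcd_is_up "$MY_IP" && echo "etcd is reachable"
```

If the check fails, look at the etcd process output and any firewall between the machines before continuing.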

Install docker on each of the docker nodes (swarm1, swarm2, and swarm-master).

On each of the docker nodes, stop the docker daemon...

sudo service docker stop

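... and then start the daemon so that it listens on TCP port 2375 on all interfaces as well as on the local Unix socket. The TCP socket is unencrypted and unauthenticated, so only do this on a trusted network.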

sudo docker daemon -H tcp://0.0.0.0:2375 \
-H unix:///var/run/docker.sock
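
To confirm that each daemon is reachable over TCP, you can point a docker client at it and ask for its version. This is only a sketch: the function names docker_tcp_endpoint and daemon_reachable are hypothetical, and the short hostnames in the usage comment are placeholders for your own.

```shell
# Build the -H value for a node's TCP socket (port 2375, as started above).
docker_tcp_endpoint() {
    printf 'tcp://%s:2375' "$1"
}

# Ask the remote daemon for its version; fails if nothing is listening.
daemon_reachable() {
    docker -H "$(docker_tcp_endpoint "$1")" version > /dev/null 2>&1
}

# Example: report which nodes answer.
#   for node in swarm1 swarm2 swarm-master; do
#       if daemon_reachable "$node"; then echo "$node: ok"; else echo "$node: unreachable"; fi
#   done
```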

Pull the Docker Swarm image on each of the swarm nodes (including the master).

docker pull swarm:latest

On each of the docker nodes except the master, join the swarm by executing:

docker run swarm join \
--advertise=[node IP or hostname]:2375 \
etcd://[etcd server IP or hostname]:2379/swarm
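
With more than a couple of nodes it is easy to mistype an address, so it may be worth generating the join commands and reviewing them before running each one on its node. The sketch below only prints commands and does not execute anything. The function name print_join_commands is made up for this example.

```shell
# Print one join command per node.
# First argument: etcd address; remaining arguments: node addresses.
print_join_commands() {
    etcd_host=$1
    shift
    for node in "$@"; do
        printf 'docker run swarm join --advertise=%s:2375 etcd://%s:2379/swarm\n' \
            "$node" "$etcd_host"
    done
}

print_join_commands etcd.programster.org swarm1.programster.org swarm2.programster.org
```

Each printed line is the command to run on the node named in its --advertise value.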

On the management node, execute the following to start managing the cluster.

docker run -p [management port]:2375 swarm manage \
etcd://etcd.programster.org:2379/swarm

Use a unique port for management; you will send deployment commands to it later. It can't be 2375, because the docker daemon on this node is already listening on that port.
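
The port rule can be checked before starting the manager. The following sketch validates a candidate port and prints the matching manage command. The function names valid_management_port and manage_command are hypothetical, and port 4000 is just an example value.

```shell
# Reject management ports that are not numeric, are out of range, or clash
# with the docker daemon's own port (2375).
valid_management_port() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ] && [ "$1" -ne 2375 ]
}

# Print the manage command for a port, or an error if the port is unusable.
# Arguments: management port, etcd address.
manage_command() {
    if valid_management_port "$1"; then
        printf 'docker run -p %s:2375 swarm manage etcd://%s:2379/swarm\n' "$1" "$2"
    else
        echo "unusable management port: $1" >&2
        return 1
    fi
}

manage_command 4000 etcd.programster.org
```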

List Nodes In Discovery

At this point it's a good idea to check that your nodes have registered with the discovery service correctly by running the following command on any node:

docker run --rm swarm list etcd://etcd.programster.org:2379/swarm

You should get output similar to:

swarm1.programster.org:2375
swarm2.programster.org:2375
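
If you have many nodes, comparing the list by eye gets tedious. As a rough sketch, the helper below reads the list output and prints any node you expected but did not find. The function name missing_nodes is made up for this example.

```shell
# Read `swarm list` output on stdin and print every expected node
# (given as arguments, in host:port form) that is absent from it.
missing_nodes() {
    listed=$(cat)
    for expected in "$@"; do
        printf '%s\n' "$listed" | grep -qxF "$expected" || printf '%s\n' "$expected"
    done
}

# Example:
#   docker run --rm swarm list etcd://etcd.programster.org:2379/swarm \
#       | missing_nodes swarm1.programster.org:2375 swarm2.programster.org:2375
```

No output means every expected node has registered.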

Check The Cluster Is Running Correctly

The cluster should now be up and running, so we should be able to get information about it from the master. Execute this command from any node, or even from a computer outside the cluster that has the docker client installed.

docker -H tcp://[management node]:[management port] info

For example:

docker -H tcp://swarm-master.programster.org:4000 info

This should output something like the following:

Containers: 65
Images: 2
Role: primary
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 2
 swarm1.programster.org: swarm1.programster.org:2375
  └ Status: Healthy
  └ Containers: 56
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 783.7 MiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.16.0-4-amd64, operatingsystem=Debian GNU/Linux 8 (jessie), storagedriver=aufs
 swarm2.programster.org: swarm2.programster.org:2375
  └ Status: Healthy
  └ Containers: 9
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 783.7 MiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.16.0-4-amd64, operatingsystem=Debian GNU/Linux 8 (jessie), storagedriver=aufs
CPUs: 2
Total Memory: 1.531 GiB
Name: 920b14ff2d9f
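
The line to look at first is "Nodes:", which should match the number of nodes you joined. A minimal sketch for checking it in a script, assuming the output format shown above, follows. The function names node_count and expect_nodes are made up for this example.

```shell
# Read `docker info` output from the swarm manager on stdin and print
# the value of its "Nodes:" line.
node_count() {
    awk '/^Nodes:/ { print $2; exit }'
}

# Succeed only if the manager reports the expected number of nodes.
expect_nodes() {
    actual=$(node_count)
    [ "$actual" = "$1" ] || {
        echo "expected $1 nodes, manager reports ${actual:-none}" >&2
        return 1
    }
}

# Example:
#   docker -H tcp://swarm-master.programster.org:4000 info | expect_nodes 2
```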

Debugging

If your cluster is showing fewer nodes than you actually have, then it might be because two of your nodes have the same docker ID. This is usually the case if you installed docker on a virtual machine and then cloned that machine for your other nodes, rather than installing docker on each node individually. You will be able to tell if this is the case because your management node will output something similar to:

time="2016-01-24T16:17:20Z" level=error msg="ID duplicated. OYOB:QJZT:BP46:BLOA:OTTD:ORHZ:2Z6X:PJZM:BLWD:MYTA:YOCK:NWUR shared by swarm2.programster.org:2375 and swarm1.programster.org:2375"

To fix this issue, delete the file at /etc/docker/key.json on each cloned node before restarting docker on that node. Docker generates a new key, and with it a new ID, when it starts.
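
To find out which nodes are affected, you can pull the addresses out of the manager's log. The sketch below assumes the log line format shown above. The function name duplicated_nodes is made up for this example.

```shell
# Read manager log lines on stdin and print the two node addresses named in
# each "ID duplicated" error, one pair per line.
duplicated_nodes() {
    sed -n 's/.*ID duplicated\. .* shared by \([^ ]*\) and \([^ "]*\).*/\1 \2/p'
}

# Example: save the manager's output to a file, then run
#   duplicated_nodes < manager.log
#
# On a reported node, the fix described above amounts to:
#   sudo service docker stop
#   sudo rm /etc/docker/key.json
#   sudo service docker start
```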

Deployment

Now that you have a working cluster, you can deploy something simple by executing:

docker -H tcp://swarm-master.programster.org:4000 run -d nginx

This will connect to the master and tell it to deploy nginx on one of the nodes it manages.
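
To see where containers ended up, you can list them through the master. The sketch below rests on two assumptions that you should verify on your own setup: that your docker client supports ps --format, and that the swarm manager reports container names in "node/container" form. The function name containers_per_node is made up for this example.

```shell
# Read container names (one per line, assumed to be "node/container") on
# stdin and print how many containers each node is running.
containers_per_node() {
    awk -F/ 'NF >= 2 { count[$1]++ } END { for (n in count) print n, count[n] }' | sort
}

# Example:
#   docker -H tcp://swarm-master.programster.org:4000 ps --format '{{.Names}}' \
#       | containers_per_node
```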

Conclusion

We now have a working swarm. Next time we will look into:

  • securing our cluster with TLS.
  • exploring docker networking.
  • scheduling and filtering for deployment.
