Deploying a Bare Metal Kubernetes Cluster


Seems my blog needs a few updates in the buzz department. I like taking the shotgun approach, so we're going to deploy a 5 node bare metal docker cluster using ansible for host configuration and kubernetes for container management.

I wish I could have squeezed mesos into this story. But the kubernetes-mesos project isn't ready for prime time yet. So that'll be a post for another day.


  • ansible
  • librarian-ansible
  • git
  • CentOS 7, RHEL 7, Fedora 20+...whatever
  • some servers

Get the Ansible Work Environment

I've thrown together an ansible-sandbox project to automate the configuration of the clients. These playbooks aren't "production ready" by any means and would almost certainly require tweaking for a different target environment. However, I did make a best-effort first stab at generalizing them and making them moderately configurable. As such, they should serve more as a programmatic walkthrough than a turnkey solution.

Caveats aside, they should simplify the deployment of the environment.

% git clone
% cd ansible-sandbox
% librarian-ansible install

There's an inventory.example file included. You'll probably want to update the FQDNs to your local domain. The primary docker host is named 'dock1' by default. If you wish to change this, it's done in group_vars/dockers.yml.

Hostnames and Ansible

Ansible assumes it can resolve the hostnames in your inventory unless the ansible_ssh_host argument is set for each host. For example, if you don't have a DNS server set up, you might edit your inventory to look something like

[dockers]
dock1 ansible_ssh_host=

Ansible general usage is beyond the scope of this walkthrough, so please see the upstream docs for any needed clarifications.
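Before running any playbooks, it can save some head-scratching to verify connectivity with ansible's ping module. This is just a sketch; the group name matches the example inventory, and --ask-pass is only needed until key-based auth is in place.

```shell
# Ad-hoc connectivity check against every host in the [dockers] group.
ansible dockers -i inventories/inventory.example -m ping --ask-pass
```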

Configuring the Bootstrap Node

We're going to work from the assumption of a green field deployment. If the environment already has existing support services like PXE, DNS, DHCP, a file server, etc., I trust you already know how to adapt that environment accordingly.


Initial install is manual. In my particular configuration, I have a root drive in a RAID 1 and a very large storage array. I installed the OS with the following disk configuration.

  • /dev/sda1 : /boot
  • /dev/sda2 : lvm volgroup main - swap - /
  • /dev/sdb : lvm volgroup data - /var/lib/docker, btrfs - /var/docker/registry, ext4

The exact layout doesn't really matter, though. If you want to use the btrfs driver for docker (these scripts do), you'll want to spin off a separate drive for mounting on /var/lib/docker.
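As a rough sketch, carving out that volume by hand might look like the following. The device name, volume group, and size are assumptions drawn from my layout above; adjust for your own hardware.

```shell
# Assumes /dev/sdb is dedicated to the 'data' volume group, per the layout above.
pvcreate /dev/sdb
vgcreate data /dev/sdb
lvcreate -n docker -L 200G data   # size is an arbitrary example
mkfs.btrfs /dev/data/docker
mkdir -p /var/lib/docker
mount /dev/data/docker /var/lib/docker
# persist the mount across reboots
echo '/dev/data/docker /var/lib/docker btrfs defaults 0 0' >> /etc/fstab
```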

This only needs to be done manually for the first host; our kickstart scripts will take care of the workers.

Configure My User Account

I'd recommend hacking together your own user role (or at least replacing my key); regardless, it's really convenient to have passwordless ssh and sudo.

I use my 'jkyle' role for this and put it in a separate playbook from the site example called jkyle.yml. The first time we run, I have to provide a password.

% ansible-playbook --limit dock1 \
                   -i inventories/inventory.example \
                   --ask-pass \
                   --ask-sudo \
                   jkyle.yml

You might see something like this

PLAY [all]

ok: [dock1]

TASK: [jkyle | Install YUM Packages]
changed: [dock1] =>

TASK: [jkyle | Install APT Packages]
skipping: [dock1]

TASK: [jkyle | Create James]
changed: [dock1]

TASK: [jkyle | Configure jkyle sudoers]
changed: [dock1]

TASK: [jkyle | Deploy jkyle User Key]
changed: [dock1]

TASK: [jkyle | Setup James Home Directory]
changed: [dock1]

TASK: [jkyle | Link James configuration files]
changed: [dock1] => (item=zlogin)
changed: [dock1] => (item=zlogout)
changed: [dock1] => (item=zpreztorc)
changed: [dock1] => (item=zprofile)
changed: [dock1] => (item=zshenv)
changed: [dock1] => (item=zshrc)

dock1   : ok=7    changed=6    unreachable=0    failed=0

Services Configuration On Primary Host


The common role does some general configuration of networking, hostnames, etc. Make sure your inventory file includes the necessary group and host variables for your network. See the inventory.example file.


Our docker deployment provides both the local socket connection and a remote API with server/client certificates. Some example certificates are found in ansible-sandbox/librarian_roles/docker/files. However, you should probably generate your own, as these are obviously insecure...particularly if the servers have public ports. I generated the certs using a script and an openssl template found in my CertGen github repo.
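If you'd rather roll the certs by hand than use the CertGen script, a minimal self-signed sketch looks something like this. The CNs and lifetimes are placeholders; in a real setup the server CN should match the docker host's name.

```shell
# Generate a throwaway CA, then a server cert signed by it.
openssl genrsa -out ca.key 2048
openssl req -new -x509 -days 365 -key ca.key -out ca.crt -subj "/CN=docker-ca"
openssl genrsa -out server.key 2048
openssl req -new -key server.key -out server.csr -subj "/CN=dock1"
openssl x509 -req -days 365 -in server.csr -CA ca.crt -CAkey ca.key \
        -CAcreateserial -out server.crt
# Sanity check: the server cert should chain back to the CA.
openssl verify -CAfile ca.crt server.crt
```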

% ansible-playbook --limit dock1 \
                   -i inventories/inventory.example

You might see something like this

PLAY [dockers]

ok: [dock1]

TASK: [common | sudoers]
ok: [dock1]

TASK: [common | Yum Fastest Mirror]
ok: [dock1]

TASK: [common | Update System]
ok: [dock1]

TASK: [common | Install Packages]
ok: [dock1] => (item=bridge-utils,policycoreutils-python)

TASK: [common | disalbe firewalld]
ok: [dock1]

TASK: [common | disable network.service]
changed: [dock1]

TASK: [common | Configure Management Interface]
ok: [dock1]

TASK: [common | config sshd_config]
ok: [dock1]

TASK: [common | Base /etc/hosts Template]
changed: [dock1]

TASK: [common | Set Hostname]
ok: [dock1]

TASK: [common | Build hosts file]
changed: [dock1] => (item=dock1)
skipping: [dock1] => (item=dock2)
skipping: [dock1] => (item=dock3)
skipping: [dock1] => (item=dock4)
skipping: [dock1] => (item=dock5)

TASK: [docker | disable selinux]
changed: [dock1]

TASK: [docker | Install EPEL Repo]
ok: [dock1]

TASK: [docker | Install Packages]
ok: [dock1] => (item=docker,btrfs-progs,bridge-utils)

TASK: [docker | Install Packages]
skipping: [dock1]

TASK: [docker | Deploy ca.crt]
ok: [dock1]

TASK: [docker | Deploy server.crt]
ok: [dock1]

TASK: [docker | Deploy server.key]
ok: [dock1]

TASK: [docker | Create docker.socket directory path]
changed: [dock1]

TASK: [docker | Create docker.socket unit]
ok: [dock1]

TASK: [docker | Link docker.socket to standard location]
skipping: [dock1]

TASK: [docker | docker.service]
ok: [dock1]

TASK: [docker | Add Docker Users to Docker Group]
ok: [dock1] => (item=docker_users)

TASK: [docker | Enable & Start Services]
ok: [dock1] => (item=docker.socket)
ok: [dock1] => (item=docker.service)

dock1                      : ok=23   changed=5    unreachable=0    failed=0

After which, the following changes would have been applied:

  • A sudoers file deployed that allows individual configs in /etc/sudoers.d and passwordless sudo for wheel group members.
  • Installation of the yum fastest mirror plugin
  • A full system update
  • Installation of bridge-utils and policycoreutils-python
  • The firewalld daemon disabled. [*]
  • The network.service disabled. [†]
  • selinux disabled. [‡]
  • configuration of the management interface...probably already done, but this enforces it
  • a common sshd_config file deployed
  • /etc/hosts populated with entries for all the docker hosts
  • the target host's hostname set
  • A fully functional docker server with server/client certificates.

If you used the default certs, make sure to copy over the client certificate and certificate authorities to your ~/.docker directory.
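The docker client looks for ca.pem, cert.pem, and key.pem under ~/.docker, so copying things over might look like the following. The source paths assume you're in the ansible-sandbox checkout, and the exact client-cert filenames in the role are an assumption; check the files directory for what's actually there.

```shell
mkdir -p ~/.docker
# Hypothetical source filenames; adjust to match the role's files directory.
cp librarian_roles/docker/files/ca.crt     ~/.docker/ca.pem
cp librarian_roles/docker/files/client.crt ~/.docker/cert.pem
cp librarian_roles/docker/files/client.key ~/.docker/key.pem
chmod 600 ~/.docker/key.pem
```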

You can interact with the docker server over tcp via

% export DOCKER_HOST=tcp://
% docker --tlsverify run -i -t --rm ubuntu /bin/bash
% docker --tlsverify images

PXE Services

Next, we're going to deploy a container with pxe related services to bring up the worker hosts. I've created a project that builds a pxe server container given a specific environment context. You can use my docker-pxe-server project, or build your own...or even just install the services on the host itself.

The readme for the project should be sufficient to get going.

After pxe/dns/dhcp services are up for your environment, bounce the servers and bring them up. I typically do this with ipmitool. A script to do so might look something like

for i in 11 12 13 14;do
    ipmitool -I lanplus \
             -f ~/.racpasswd \
             -H 192.168.19.${i} \
             -U root chassis bootdev pxe
    ipmitool -I lanplus \
             -f ~/.racpasswd \
             -H 192.168.19.${i} \
             -U root chassis power cycle
done

Of course, you'll need your actual iLO/iDRAC/etc. IPs. Make a cup of coffee, catch up on the news. Whatever.


That completes the initial host configuration. You should have all the key services up such as an initial docker server & registry, PXE, DHCP, & DNS services, and your initial user account. Finally you should have a number of client hosts with fresh installs awaiting configuration.

Finalizing the Cluster.

All that's left is configuring the client hosts and installing kubernetes. Assuming our inventory is correctly configured, we can just skip to what's needed for kubernetes and then run ansible on all nodes.


The kubernetes role assumes the required binaries and etcd are located in the roles/kubernetes/files directory. These aren't bundled, and since we're deploying the bleeding edge, it's probably best to build a fresh set anyway. Both projects have their own build systems that are covered in the docs, so please refer to the upstream kubernetes and etcd projects for details.

A couple of hints though.

For Kubernetes, you'll want to read the README in kubernetes's ./build directory.
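A rough sketch of that build, assuming the dockerized release script the ./build docs describe (script names can shift between releases, so double-check before running):

```shell
git clone https://github.com/GoogleCloudPlatform/kubernetes.git
cd kubernetes
# Cross-compiles the binaries inside a build container; the script
# prints where the resulting binaries land when it finishes.
./build/release.sh
```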

For etcd, you'll probably just want to run these commands

# build the image and start a throwaway container
docker build -t coreos/etcd .
docker run -d coreos/etcd
# copy the compiled binary out of the container, then clean up
docker cp <container_id>:/opt/etcd/bin/etcd .
docker stop <container_id> && docker rm <container_id>

Then you have an amd64 etcd binary you can copy to roles/kubernetes/files.
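Before copying it into the role, it's worth a quick sanity check that the extracted binary actually targets the right architecture and runs:

```shell
file ./etcd        # should report a 64-bit x86-64 ELF executable
./etcd --version
```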


No more configuration should be necessary. Assuming everything's gone well so far, the following should be sufficient

ansible-playbook --ask-pass \
                 --ask-sudo \
                 -i inventories/inventory.example \

Notice we're no longer limiting to the dock1 host or passing specific tags to run. Once done, you should be able to query your kubernetes minions. You can do so from any host where kubecfg is installed, which includes all of your hosts (it lives at /usr/local/bin/kubecfg). For example

% kubecfg -h list minions
Minion identifier

Next Steps.

Next up, you can deploy the Kubernetes GuestBook example. Or build your own!


[*]I had a hell of a time getting firewalld to work as advertised. Zones wouldn't persist, interfaces were added to multiple zones, etc. Several upstream bugs were filed. Much easier to just write your own iptables rules.
[†]The network.service & NetworkManager seem to be mortal enemies in this release. You could tell NetworkManager to ignore a config, but it would happily still muck with it...such as adding it to the default zone if firewalld was enabled. In the end, leaving only NM running seemed to work best.
[‡]As of the time of this tutorial, the btrfs driver for docker does not support selinux

