Deploy systemd units in your Nomad cluster
Back in 2014, before Kubernetes even existed, CoreOS included a simple cluster scheduler called Fleet. With Fleet you could aggregate your individual machines into a pool of resources and deploy systemd unit files to them. You could choose to either run your units globally on all machines at the same time or limit them to a set of hosts. The idea behind it was to treat your machines as if they shared a single init system.
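Fleet units were plain systemd units with an extra [X-Fleet] section that controlled the scheduling. A from-memory sketch (not taken from the Fleet documentation) of what that looked like:

[Unit]
Description=Example service scheduled by Fleet

[Service]
ExecStart=/usr/bin/example-service

[X-Fleet]
# run the unit on every machine in the cluster ...
Global=true
# ... or restrict it to machines with matching metadata instead:
# MachineMetadata=region=europe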
In 2018 Fleet was removed from CoreOS in favor of Kubernetes and has not been maintained since. Nevertheless, the idea of being able to define a systemd unit and deploy it to a set of machines still seems useful. So after some tinkering I came up with a way to do exactly that using Nomad and my systemd-nspawn driver.
In this post I am going to show you how to deploy a simple systemd unit that runs Consul inside a vanilla Debian image on your Nomad cluster.
The unit file
Using the template stanza inside a Nomad job file, we will render a systemd unit for Consul in the local task directory.
template {
  data = <<EOH
[Unit]
Description="HashiCorp Consul - A service mesh solution"
Documentation=https://www.consul.io/
Requires=network-online.target
After=network-online.target

[Service]
ExecStart=[[ env "NOMAD_TASK_DIR" ]]/consul/consul agent -dev -bind '{{ GetInterfaceIP "host0" }}' -client '{{ GetInterfaceIP "host0" }}'
ExecReload=/bin/kill --signal HUP $MAINPID
KillMode=process
KillSignal=SIGTERM
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOH

  destination     = "local/systemd/consul.service"
  left_delimiter  = "[["
  right_delimiter = "]]"
}
The above generates a simple unit file which runs Consul in dev mode and places it in the local/systemd directory inside the started task. Consul is instructed to bind its addresses to the host0 interface, which is available inside a systemd-nspawn container running with private networking enabled. The {{ GetInterfaceIP "host0" }} expressions are go-sockaddr templates that Consul itself resolves at runtime. To avoid interfering with them, we tell Nomad to use [[ and ]] to delimit its templating commands instead of the usual {{ and }}.
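Once Nomad has rendered the template inside the container, the ExecStart line ends up looking roughly like this (the /local path matches the task directory visible in the systemctl output further down; the {{ }} parts are left untouched for Consul to resolve):

ExecStart=/local/consul/consul agent -dev -bind '{{ GetInterfaceIP "host0" }}' -client '{{ GetInterfaceIP "host0" }}'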
To download the Consul binary, we make use of the artifact stanza.
artifact {
  source      = "https://releases.hashicorp.com/consul/1.9.0/consul_1.9.0_linux_amd64.zip"
  destination = "local/consul"
}
With this unit file rendered and the necessary binary downloaded, we need a way to enable the unit on startup. We also need to figure out how to make systemd load a custom unit file from the local task directory instead of /etc/systemd/system.
Enabling the unit
The usual way to enable a systemd unit is to run systemctl enable <unit name>. This creates a symbolic link inside the /etc/systemd/system/multi-user.target.wants directory pointing to your unit file.
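In other words, it boils down to something like this (a sketch, assuming the unit lives in /etc/systemd/system):

# what `systemctl enable consul.service` effectively does
ln -s /etc/systemd/system/consul.service \
      /etc/systemd/system/multi-user.target.wants/consul.service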
Another way to enable a unit file, without running a command, is to create a drop-in file for the multi-user.target. In this file you define a Wants= entry in the [Unit] section which contains your unit name. This pulls your unit in whenever the multi-user.target is activated, the same way systemctl enable does.
Using another template stanza in our job file, we can use this second method to enable the unit file we rendered to the local task directory.
template {
  data = <<EOH
[Unit]
Wants=consul.service
EOH

  destination = "local/systemd/multi-user.target.d/wants.conf"
}
Loading systemd units from a different path
Systemd includes a nice little feature which allows you to specify additional paths from which unit files are loaded on startup. All you need to do is set the environment variable SYSTEMD_UNIT_PATH to the directory containing your files. If it ends with a :, the usual load paths will be appended to the content of the variable. This is similar to how you set the PATH variable inside your shell.
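A quick sketch of how the variable is interpreted (the directory is just an example):

# only /local/systemd is searched for unit files
SYSTEMD_UNIT_PATH=/local/systemd
# /local/systemd is searched first, then the default load paths
# (/etc/systemd/system, /lib/systemd/system, ...) are appended
SYSTEMD_UNIT_PATH=/local/systemd: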
To set this, we simply need to make sure our systemd-nspawn container is started in boot mode. Then all environment variables we pass to it will be available to systemd on startup. Boot mode is the default behavior of the systemd-nspawn task driver, so we only need to define the image we want to use and the mentioned environment variable.
config {
  image = "consul"

  image_download {
    url   = "https://cloud.debian.org/images/cloud/buster/20201214-484/debian-10-generic-amd64-20201214-484.qcow2"
    force = true
    type  = "raw"
  }

  environment = {
    SYSTEMD_UNIT_PATH = "${NOMAD_TASK_DIR}/systemd:"
  }
}
The complete job
With all of the above in place, the complete job file now looks like this:
job "consul" {
datacenters = ["dc1"]
type = "service"
group "linux" {
count = 1
task "consul" {
driver = "nspawn"
config {
image = "consul"
image_download {
url = "https://cloud.debian.org/images/cloud/buster/20201214-484/debian-10-generic-amd64-20201214-484.qcow2"
force = true
type = "raw"
}
environment = {
SYSTEMD_UNIT_PATH = "${NOMAD_TASK_DIR}/systemd:"
}
}
artifact {
source = "https://releases.hashicorp.com/consul/1.9.0/consul_1.9.0_linux_amd64.zip"
destination = "local/consul"
}
template {
data = <<EOH
[Unit]
Description="HashiCorp Consul - A service mesh solution"
Documentation=https://www.consul.io/
Requires=network-online.target
After=network-online.target
[Service]
ExecStart=[[ env "NOMAD_TASK_DIR" ]]/consul/consul agent -dev -bind '' -client ''
ExecReload=/bin/kill --signal HUP $MAINPID
KillMode=process
KillSignal=SIGTERM
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOH
destination = "local/systemd/consul.service"
left_delimiter = "[["
right_delimiter = "]]"
}
template {
data = <<EOH
[Unit]
Wants=systemd-networkd.service systemd-resolved.service consul.service
EOH
destination = "local/systemd/multi-user.target.d/wants.conf"
}
}
}
}
If you deploy the job inside your Nomad cluster and spawn a shell inside the task, you can see that the unit file defined in the job file is properly loaded on startup.
➜ ~ nomad exec -job consul /bin/bash
root@buster:/# systemctl status consul
● consul.service - "HashiCorp Consul - A service mesh solution"
   Loaded: loaded (/local/systemd/consul.service; disabled; vendor preset: enabled)
   Active: active (running) since Sun 2021-01-17 19:20:59 CET; 2min 15s ago
     Docs: https://www.consul.io/
 Main PID: 36 (consul)
   CGroup: /system.slice/consul.service
           └─36 /local/consul/consul agent -dev -bind {{ GetInterfaceIP "host0" }} -client {{ GetInterfaceIP "host0" }}
Jan 17 19:20:59 buster consul[36]: 2021-01-17T19:20:59.750+0100 [INFO] agent.server: member joined, marking health alive: member=buster
Jan 17 19:20:59 buster consul[36]: 2021-01-17T19:20:59.791+0100 [INFO] agent.server: federation state anti-entropy synced
Jan 17 19:20:59 buster consul[36]: 2021-01-17T19:20:59.826+0100 [DEBUG] agent: Skipping remote check since it is managed automatically: check=serfHealth
Jan 17 19:20:59 buster consul[36]: 2021-01-17T19:20:59.826+0100 [INFO] agent: Synced node info
Jan 17 19:21:01 buster consul[36]: 2021-01-17T19:21:01.771+0100 [DEBUG] agent: Skipping remote check since it is managed automatically: check=serfHealth
Jan 17 19:21:01 buster consul[36]: 2021-01-17T19:21:01.771+0100 [DEBUG] agent: Node info in sync
Jan 17 19:22:53 buster consul[36]: 2021-01-17T19:22:53.841+0100 [DEBUG] agent: Skipping remote check since it is managed automatically: check=serfHealth
Jan 17 19:22:53 buster consul[36]: 2021-01-17T19:22:53.841+0100 [DEBUG] agent: Node info in sync
Jan 17 19:22:59 buster consul[36]: 2021-01-17T19:22:59.699+0100 [DEBUG] agent.router.manager: Rebalanced servers, new active server: number_of_servers=1 active_server="buster.dc1 (Addr: tcp/192.168.74.222:8300) (DC: dc1)"
Jan 17 19:22:59 buster consul[36]: 2021-01-17T19:22:59.700+0100 [DEBUG] agent.router.manager: Rebalanced servers, new active server: number_of_servers=1 active_server="buster (Addr: tcp/192.168.74.222:8300) (DC: dc1)"
root@buster:/#
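If you want to double-check Consul itself, you can query the agent from the same shell. Since the HTTP API was bound to the host0 interface rather than localhost, you have to point the CLI at that address (the IP below is the one from the log output above; yours will differ). In dev mode the member list should contain just this single node:

root@buster:/# /local/consul/consul members -http-addr=http://192.168.74.222:8500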
And that’s all there is to it :-). I hope you find this as useful as I do.
Jan