How to start etcd in docker from systemd?

Question

Score:1

Server

How to start etcd in docker from systemd?

Jonas

12/18/22, 9:07 PM

I want to start etcd (single node) in docker from systemd, but something seems to go wrong - it gets terminated about 30 seconds after start.

It looks like the service starts in status "activating" but get terminated after about 30 seconds without reaching the status "active". Perhaps there are any missing signalling between docker container and systemd?

Update (see bottom of post): systemd service status reaches failed (Result: timeout) - when I remove the Restart=on-failure instruction.

When I check the status of the etcd service after boot, I get this result:

$ sudo systemctl status etcd● etcd.service - etcd   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: activating (auto-restart) (Result: exit-code) since Wed 2021-08-18 20:13:30 UTC; 4s ago
  Process: 2971 ExecStart=/usr/bin/docker run -p 2380:2380 -p 2379:2379 --volume=etcd-data:/etcd-data --name etcd my-aws-account.dkr.ecr.eu-north-1.amazonaws.com/etcd:v3.5.0 /usr/local/bin/etcd --data-dir=/etcd-data --name etcd0 --advertise-client-urls http://10.0.0.11:2379 --listen-client-urls http://0.0.0.0:2379 --initial-advertise-peer-urls http://10.0.0.11:2380 --listen-peer-urls http://0.0.0.0:2380 --initial-cluster etcd0=http://10.0.0.11:2380 (code=exited, status=125)
 Main PID: 2971 (code=exited, status=125)

I run this on an Amazon Linux 2 machine, with a user data script to run at launch. I have confirmed that docker.service and docker_ecr_login.service run successfully.

And short after launch of the machine, I can see that the etcd is running:

 sudo systemctl status etcd
● etcd.service - etcd
   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: activating (start) since Wed 2021-08-18 20:30:07 UTC; 1min 20s ago
 Main PID: 1573 (docker)
    Tasks: 9
   Memory: 24.3M
   CGroup: /system.slice/etcd.service
           └─1573 /usr/bin/docker run -p 2380:2380 -p 2379:2379 --volume=etcd-data:/etcd-data --name etcd my-aws-account.dkr.ecr.eu-north-1.amazonaws.com...

Aug 18 20:30:17 ip-10-0-0-11.eu-north-1.compute.internal docker[1573]: {"level":"info","ts":"2021-08-18T20:30:17.690Z","logger":"raft","caller":"...rm 2"}
Aug 18 20:30:17 ip-10-0-0-11.eu-north-1.compute.internal docker[1573]: {"level":"info","ts":"2021-08-18T20:30:17.691Z","caller":"etcdserver/serve..."3.5"}
Aug 18 20:30:17 ip-10-0-0-11.eu-north-1.compute.internal docker[1573]: {"level":"info","ts":"2021-08-18T20:30:17.693Z","caller":"membership/clust..."3.5"}
Aug 18 20:30:17 ip-10-0-0-11.eu-north-1.compute.internal docker[1573]: {"level":"info","ts":"2021-08-18T20:30:17.693Z","caller":"etcdserver/server.go:2...
Aug 18 20:30:17 ip-10-0-0-11.eu-north-1.compute.internal docker[1573]: {"level":"info","ts":"2021-08-18T20:30:17.693Z","caller":"api/capability.g..."3.5"}
Aug 18 20:30:17 ip-10-0-0-11.eu-north-1.compute.internal docker[1573]: {"level":"info","ts":"2021-08-18T20:30:17.693Z","caller":"etcdserver/serve..."3.5"}
Aug 18 20:30:17 ip-10-0-0-11.eu-north-1.compute.internal docker[1573]: {"level":"info","ts":"2021-08-18T20:30:17.693Z","caller":"embed/serve.go:9...ests"}
Aug 18 20:30:17 ip-10-0-0-11.eu-north-1.compute.internal docker[1573]: {"level":"info","ts":"2021-08-18T20:30:17.695Z","caller":"etcdmain/main.go...emon"}
Aug 18 20:30:17 ip-10-0-0-11.eu-north-1.compute.internal docker[1573]: {"level":"info","ts":"2021-08-18T20:30:17.695Z","caller":"etcdmain/main.go...emon"}
Aug 18 20:30:17 ip-10-0-0-11.eu-north-1.compute.internal docker[1573]: {"level":"info","ts":"2021-08-18T20:30:17.702Z","caller":"embed/serve.go:1...2379"}
Hint: Some lines were ellipsized, use -l to show in full.

I get the same behavior wether etcd listen to the Node IP (10.0.0.11) or 127.0.0.1.

I can run etcd locally, started from command line (and it does not terminate after 30 seconds), with:

sudo docker run -p 2380:2380 -p 2379:2379 --volume=etcd-data:/etcd-data --name etcd-local \
my-aws-account.dkr.ecr.eu-north-1.amazonaws.com/etcd:v3.5.0 \
/usr/local/bin/etcd --data-dir=/etcd-data \
--name etcd0 \
--advertise-client-urls http://127.0.0.1:2379 \
--listen-client-urls http://0.0.0.0:2379 \
--initial-advertise-peer-urls http://127.0.0.1:2380 \
--listen-peer-urls http://0.0.0.0:2380 \
--initial-cluster etcd0=http://127.0.0.1:2380

The parameters to etcd is similar to Running a single node etcd - ectd 3.5 documentation.

This is the relevant part of the startup script that is intended to launch etcd:

sudo docker volume create --name etcd-data

cat <<EOF | sudo tee /etc/systemd/system/etcd.service
[Unit]
Description=etcd
After=docker_ecr_login.service

[Service]
Type=notify
ExecStart=/usr/bin/docker run -p 2380:2380 -p 2379:2379 --volume=etcd-data:/etcd-data \
 --name etcd my-aws-account.dkr.ecr.eu-north-1.amazonaws.com/etcd:v3.5.0 \
 /usr/local/bin/etcd --data-dir=/etcd-data \
 --name etcd0 \
 --advertise-client-urls http://10.0.0.11:2379 \
 --listen-client-urls http://0.0.0.0:2379 \
 --initial-advertise-peer-urls http://10.0.0.11:2380 \
 --listen-peer-urls http://0.0.0.0:2380 \
 --initial-cluster etcd0=http://10.0.0.11:2380
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl enable etcd
sudo systemctl start etcd

When listing all containers on the machine, I can see that it has been running:

sudo docker ps -a
CONTAINER ID   IMAGE                                                       COMMAND                  CREATED          STATUS                      PORTS                          NAMES
a744aed0beb1   my-aws-account.dkr.ecr.eu-north-1.amazonaws.com/etcd:v3.5.0   "/usr/local/bin/etcd…"   25 minutes ago   Exited (0) 24 minutes ago                          etcd

but I suspect that it cannot be restarted since the container name already exists.

Why does the etcd container get terminated after ~30 seconds, when started from systemd? It appears like it successfully start, but systemd only shows it in status "activating" but never in status "active" and it seem to be terminated after about 30 seconds. Is there some missing signalling from the etcd docker container to systemd? If so, how do I get that signalling correct?

UPDATE:

After removing the Restart=on-failure instruction in the service unit file, I now get status: failed (Result: timeout):

$ sudo systemctl status etcd
● etcd.service - etcd
   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: failed (Result: timeout) since Wed 2021-08-18 21:35:54 UTC; 5min ago
  Process: 1567 ExecStart=/usr/bin/docker run -p 2380:2380 -p 2379:2379 --volume=etcd-data:/etcd-data --name etcd my-aws-account.dkr.ecr.eu-north-1.amazonaws.com/etcd:v3.5.0 /usr/local/bin/etcd --data-dir=/etcd-data --name etcd0 --advertise-client-urls http://127.0.0.1:2379 --listen-client-urls http://0.0.0.0:2379 --initial-advertise-peer-urls http://127.0.0.1:2380 --listen-peer-urls http://0.0.0.0:2380 --initial-cluster etcd0=http://127.0.0.1:2380 (code=exited, status=0/SUCCESS)
 Main PID: 1567 (code=exited, status=0/SUCCESS)

Aug 18 21:35:54 ip-10-0-0-11.eu-north-1.compute.internal docker[1567]: {"level":"info","ts":"2021-08-18T21:35:54.332Z","caller":"osutil/interrupt...ated"}
Aug 18 21:35:54 ip-10-0-0-11.eu-north-1.compute.internal docker[1567]: {"level":"info","ts":"2021-08-18T21:35:54.333Z","caller":"embed/etcd.go:36...379"]}
Aug 18 21:35:54 ip-10-0-0-11.eu-north-1.compute.internal docker[1567]: WARNING: 2021/08/18 21:35:54 [core] grpc: addrConn.createTransport failed ...ing...
Aug 18 21:35:54 ip-10-0-0-11.eu-north-1.compute.internal docker[1567]: {"level":"info","ts":"2021-08-18T21:35:54.335Z","caller":"etcdserver/serve...6a6c"}
Aug 18 21:35:54 ip-10-0-0-11.eu-north-1.compute.internal docker[1567]: {"level":"info","ts":"2021-08-18T21:35:54.337Z","caller":"embed/etcd.go:56...2380"}
Aug 18 21:35:54 ip-10-0-0-11.eu-north-1.compute.internal docker[1567]: {"level":"info","ts":"2021-08-18T21:35:54.338Z","caller":"embed/etcd.go:56...2380"}
Aug 18 21:35:54 ip-10-0-0-11.eu-north-1.compute.internal docker[1567]: {"level":"info","ts":"2021-08-18T21:35:54.339Z","caller":"embed/etcd.go:36...379"]}
Aug 18 21:35:54 ip-10-0-0-11.eu-north-1.compute.internal systemd[1]: Failed to start etcd.
Aug 18 21:35:54 ip-10-0-0-11.eu-north-1.compute.internal systemd[1]: Unit etcd.service entered failed state.
Aug 18 21:35:54 ip-10-0-0-11.eu-north-1.compute.internal systemd[1]: etcd.service failed.
Hint: Some lines were ellipsized, use -l to show in full.

557

1 + 0

linux

systemd

docker

etcd

amazon-linux-2

How to start etcd in docker from systemd?

Post an answer