I administer a few Debian servers with Docker CE installed and docker-compose orchestrating several services (roughly 20 containers per machine).
Every single service is configured with restart: always in docker-compose. However, random services (usually 1-2 per machine) shut down correctly but do not restart after a host reboot. The behaviour is completely random: sometimes all services start, sometimes a single service from a docker-compose.yml file does not restart.
The following is an example where Traefik shut down correctly but did not come back up:
- Service configured to restart automatically:
$ cat docker-compose.yml
version: '3'
services:
  reverse-proxy:
    image: traefik:1.7
    restart: always
    command: --web
    ports:
      - "80:80"
      - "443:443"
      - "8080:8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./traefik.toml:/traefik.toml
    networks:
      - web
- The exit code is 128:
$ docker-compose ps
Name                      Command          State      Ports
-----------------------------------------------------------
traefik_reverse-proxy_1   /traefik --web   Exit 128
- The logs show that the service shut down correctly, but there is no sign of activity since then:
$ docker-compose logs --tail 6 -t
Attaching to traefik_reverse-proxy_1
reverse-proxy_1 | 2022-01-21T14:05:28.042399112Z time="2022-01-21T14:05:28Z" level=info msg="Stopping server gracefully"
reverse-proxy_1 | 2022-01-21T14:05:28.042450915Z time="2022-01-21T14:05:28Z" level=debug msg="Waiting 10s seconds before killing connections on entrypoint http..."
reverse-proxy_1 | 2022-01-21T14:05:28.042463326Z time="2022-01-21T14:05:28Z" level=debug msg="Waiting 10s seconds before killing connections on entrypoint api..."
reverse-proxy_1 | 2022-01-21T14:05:28.053256515Z time="2022-01-21T14:05:28Z" level=debug msg="Entrypoint api closed"
reverse-proxy_1 | 2022-01-21T14:05:28.053283046Z time="2022-01-21T14:05:28Z" level=debug msg="Entrypoint http closed"
reverse-proxy_1 | 2022-01-21T14:05:28.059721498Z time="2022-01-21T14:05:28Z" level=info msg="Shutting down"
- Server uptime corresponds with the shutdown message:
$ uptime
11:21:31 up 29 days, 20:15, 1 user, load average: 0.46, 0.43, 0.44
- My Docker version is as follows:
$ docker --version
Docker version 19.03.12, build 48a66213fe
Let's not focus on Traefik alone because it is completely random which container does not start and when.
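Since the affected container changes from reboot to reboot, it may help to check all containers on a machine at once instead of one compose project at a time. The following is only a rough sketch of how that could be done. The helper name filter_not_restarted is made up for illustration, and the script assumes the docker CLI is on the PATH and that the current user is allowed to talk to the daemon.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative, not part of Docker):
# reads lines of "<name> <restart-policy> <status> <exit-code>" on stdin
# and prints containers that have policy "always" but are not running.
filter_not_restarted() {
  awk '$2 == "always" && $3 != "running" { print $1 " exit=" $4 }'
}

# Only query Docker when the CLI is actually available.
if command -v docker >/dev/null 2>&1; then
  # IDs of all containers, including stopped ones.
  ids=$(docker ps -aq 2>/dev/null)
  if [ -n "$ids" ]; then
    # $ids is intentionally unquoted so each ID becomes its own argument.
    docker inspect \
      --format '{{.Name}} {{.HostConfig.RestartPolicy.Name}} {{.State.Status}} {{.State.ExitCode}}' \
      $ids | filter_not_restarted
  fi
fi
```

Run right after a reboot, this should list every container that was supposed to come back but did not, together with its exit code, which makes it easier to compare the failures across machines.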