Score:2

Nginx Docker Container stops working irregularly

The server:

I use Nginx as an ingress proxy for my server. Nginx runs inside a Docker container.

docker-compose.yml:

  nginx_ingress:
    image: nginx:latest
    ports:
      - "80:80"
      - "443:443"
    networks:
      front-tier: {}
      back-tier:
        ipv4_address: 172.28.1.1
    restart: always
    volumes:
      - /var/lib/my-server/config/nginx_ingress:/etc/nginx/conf.d
      - /var/lib/my-server/data/certbot/conf:/etc/letsencrypt
      - /var/lib/my-server/data/certbot/www:/var/www/certbot
    command: "/bin/sh -c 'while :; do sleep 6h & wait $${!}; nginx -s reload; done & nginx -g \"daemon off;\"'"

Since I manage multiple certificates with another container, I want Nginx to gracefully reload the config every 6 hours.
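For readability, the inline `command:` above can be unrolled into named steps. This is only a sketch of the same logic, not a replacement for it: `RELOAD_INTERVAL`, `NGINX`, `reload_loop`, and `start_nginx_with_reload` are hypothetical names introduced for illustration. Note that `$${!}` in docker-compose.yml is just the shell's `$!` (the PID of the last background job); the doubled `$` keeps Compose from interpolating it.

```shell
#!/bin/sh
# Sketch of the one-line compose command, split into two functions.
# RELOAD_INTERVAL and NGINX are hypothetical variables, not part of the original setup.
RELOAD_INTERVAL="${RELOAD_INTERVAL:-6h}"
NGINX="${NGINX:-nginx}"

# Sleep in the background, wait for that sleep to finish,
# then signal the running nginx master to reload its configuration.
reload_loop() {
    while :; do
        sleep "$RELOAD_INTERVAL" &
        wait $!
        "$NGINX" -s reload
    done
}

# Start the reload loop in the background, then run nginx in the foreground,
# mirroring `while ...; done & nginx -g "daemon off;"` from the compose file.
start_nginx_with_reload() {
    reload_loop &
    "$NGINX" -g "daemon off;"
}
```

Calling `start_nginx_with_reload` at the end of such a script would reproduce what the compose command does. It makes one design point visible: the reload loop and Nginx are two separate processes, so one can exit without the other.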

The idea is that I can manage all my certificates independently in another container. I don't want anything running on my host (no cron job), and I don't want to combine my Nginx proxy container with my certificate container. I want every piece of the puzzle to run independently. (I got the idea from this tutorial: https://pentacent.medium.com/nginx-and-lets-encrypt-with-docker-in-less-than-5-minutes-b4b8a60d3a71)

The problem:

Sometimes the proxy (Nginx) stops working while the Docker container itself keeps running.

The logs show the following:

...
2021/10/25 05:51:42 [notice] 1#1: signal 3 (SIGQUIT) received, shutting down
2021/10/25 05:51:42 [notice] 12#12: gracefully shutting down
2021/10/25 05:51:42 [notice] 12#12: exiting
2021/10/25 05:51:42 [notice] 13#13: gracefully shutting down
2021/10/25 05:51:42 [notice] 12#12: exit
2021/10/25 05:51:42 [notice] 13#13: exiting
2021/10/25 05:51:42 [notice] 13#13: exit
2021/10/25 05:51:42 [notice] 1#1: signal 17 (SIGCHLD) received from 12
2021/10/25 05:51:42 [notice] 1#1: worker process 12 exited with code 0
2021/10/25 05:51:42 [notice] 1#1: worker process 13 exited with code 0
2021/10/25 05:51:42 [notice] 1#1: exit
2021/10/25 05:51:44 [notice] 1#1: using the "epoll" event method
2021/10/25 05:51:44 [notice] 1#1: nginx/1.21.3
2021/10/25 05:51:44 [notice] 1#1: built by gcc 8.3.0 (Debian 8.3.0-6) 
2021/10/25 05:51:44 [notice] 1#1: OS: Linux 5.4.0-86-generic
2021/10/25 05:51:44 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2021/10/25 05:51:44 [notice] 1#1: start worker processes
2021/10/25 05:51:44 [notice] 1#1: start worker process 9
2021/10/25 05:51:44 [notice] 1#1: start worker process 10

By "sometimes" I really mean sometimes: I couldn't make out any pattern so far. At first I thought Nginx was shutting down every 6 hours because of the command, but that does not seem to be the case. I reduced the sleep time to 2 minutes and Nginx kept running fine for hours. Then I set the sleep time back to 6 hours, and exactly 6 hours later Nginx stopped working (see the log above). I restarted the Docker container, and Nginx has been running ever since (more than 24 hours have passed now). As the following logs show, in most cases the Nginx reload works perfectly fine:

...
2021/10/25 11:51:44 [notice] 19#19: signal process started
2021/10/25 11:51:44 [notice] 1#1: signal 1 (SIGHUP) received from 19, reconfiguring
2021/10/25 11:51:44 [notice] 1#1: reconfiguring
2021/10/25 11:51:44 [notice] 1#1: using the "epoll" event method
2021/10/25 11:51:44 [notice] 1#1: start worker processes
2021/10/25 11:51:44 [notice] 1#1: start worker process 21
2021/10/25 11:51:44 [notice] 1#1: start worker process 22
2021/10/25 11:51:44 [notice] 10#10: gracefully shutting down
2021/10/25 11:51:44 [notice] 9#9: gracefully shutting down
2021/10/25 11:51:44 [notice] 10#10: exiting
2021/10/25 11:51:44 [notice] 9#9: exiting
2021/10/25 11:51:44 [notice] 9#9: exit
2021/10/25 11:51:44 [notice] 10#10: exit
2021/10/25 11:51:44 [notice] 1#1: signal 17 (SIGCHLD) received from 10
2021/10/25 11:51:44 [notice] 1#1: worker process 9 exited with code 0
2021/10/25 11:51:44 [notice] 1#1: worker process 10 exited with code 0
2021/10/25 11:51:44 [notice] 1#1: signal 29 (SIGIO) received
...
...
2021/10/25 17:51:44 [notice] 23#23: signal process started
2021/10/25 17:51:44 [notice] 1#1: signal 1 (SIGHUP) received from 23, reconfiguring
2021/10/25 17:51:44 [notice] 1#1: reconfiguring
2021/10/25 17:51:44 [notice] 1#1: using the "epoll" event method
2021/10/25 17:51:44 [notice] 1#1: start worker processes
2021/10/25 17:51:44 [notice] 1#1: start worker process 25
2021/10/25 17:51:44 [notice] 1#1: start worker process 26
2021/10/25 17:51:44 [notice] 22#22: gracefully shutting down
2021/10/25 17:51:44 [notice] 21#21: gracefully shutting down
2021/10/25 17:51:44 [notice] 21#21: exiting
2021/10/25 17:51:44 [notice] 22#22: exiting
2021/10/25 17:51:44 [notice] 21#21: exit
2021/10/25 17:51:44 [notice] 22#22: exit
2021/10/25 17:51:44 [notice] 1#1: signal 17 (SIGCHLD) received from 21
2021/10/25 17:51:44 [notice] 1#1: worker process 21 exited with code 0
2021/10/25 17:51:44 [notice] 1#1: signal 29 (SIGIO) received
2021/10/25 17:51:44 [notice] 1#1: signal 17 (SIGCHLD) received from 22
2021/10/25 17:51:44 [notice] 1#1: worker process 22 exited with code 0
2021/10/25 17:51:44 [notice] 1#1: signal 29 (SIGIO) received
...

My questions:

  1. Is there anything wrong with the command I use? Shouldn't the container stop as soon as Nginx stops? (Maybe something is wrong with the main process?)
  2. Why does Nginx stop only irregularly? Why not every 6 hours? Do you see any difference between the first log and the second and third?
  3. Do you have any other suggestions for how I could make Nginx reload itself? (As mentioned above, I don't want anything on the host, and I don't want to combine the Nginx and certbot containers unless really necessary.)

Thank you for your help!

nulldevops:
It seems to work now. I just need to pass `--force-recreate` to my `docker-compose up`. If I don't force-recreate, something throws Nginx off from time to time... I hope this helps someone.
nulldevops:
@lonix No, I'm not using Docker Swarm.
lonix:
I also get this problem: sometimes it works for months and sometimes it stops every few days. I'll try your force-recreate trick, thanks! But since you posted that, did you find the cause? This is a really baffling problem...
nulldevops:
@lonix I haven't had any problems since. But I'm still unsure what caused this behavior.
lonix:
Glad to hear that. PS: you're not using Docker stack/swarm, are you? I am, but my config is similar to yours. I upgraded to the latest Nginx, which has some bug fixes, so maybe that'll help!