Score:0

Figuring out why a Docker container becomes unresponsive


I'm using AWS Elastic Container Service (ECS) to start and stop Docker containers as demand dictates. The problem is that occasionally, in the middle of the day, a subset of employees loses the ability to access this containerized website. Killing the Docker containers one by one, which forces new ones to be spun up, seems to resolve the issue.

What I don't understand is what's causing the Docker container to become unresponsive. If the container had simply died, a new one would be created to meet demand, but in this case the container isn't dying and I'm not seeing errors on AWS either. But maybe I'm just not looking in the right place?

Tim
Are your containers behind an ALB? Are sticky sessions defined on the ALB? What is your ALB health check to the containers? Is there any scaling before the problem happens, particularly scaling in? Is the container actually unresponsive, or is the request not getting to the container?
@Tim - the containers are indeed behind an ALB. Stickiness is disabled. I'm not sure if CloudWatch is an ALB health check but nothing looks suspicious in the CloudWatch graphs for the affected period.
"_Is there any scaling before the problem happens, particularly scaling in?_" Not that I've noticed. Minimum tasks is 3 and maximum tasks is 10. The desired count right now is 3, as is the running count. I don't know what the point of the desired count is. I mean, I desire the minimum number of containers to minimize my monthly spend. That's the whole point of containers anyway, isn't it?
"_Is the container actually unresponsive or is the request not getting to the container?_" It is not clear to me how I would make that determination. Each container has its own public and private IP address. Maybe pinging each one on one of those IP addresses would be sufficient if I'm connected to an OpenVPN instance that's part of the same network?
Tim
VPC flow logs might help you understand whether the request is getting to the container, or maybe CloudWatch Logs if it's integrated. I suggest you look into ALB health checks so the ALB knows for sure whether your container is available to service requests.
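To see what the ALB currently believes about each container, one option is to query target health for the target group. The sketch below assumes `boto3` is installed and credentials are configured, and that you know your target group's ARN; the helper names `fetch_target_health` and `unhealthy_targets` are made up for illustration. The summarising function works on any dict shaped like the `describe_target_health` response, so it can be tried without AWS access.

```python
from typing import Any, Dict, List


def fetch_target_health(target_group_arn: str) -> Dict[str, Any]:
    """Ask the ALB what it currently thinks of each registered target."""
    # Third-party dependency, imported here so the helper below works without it.
    import boto3

    client = boto3.client("elbv2")
    return client.describe_target_health(TargetGroupArn=target_group_arn)


def unhealthy_targets(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Summarise every target whose state is anything other than 'healthy'."""
    problems = []
    for desc in response.get("TargetHealthDescriptions", []):
        health = desc.get("TargetHealth", {})
        if health.get("State") != "healthy":
            problems.append({
                "id": desc["Target"]["Id"],
                "port": desc["Target"].get("Port"),
                "state": health.get("State"),
                "reason": health.get("Reason"),
            })
    return problems
```

Running `unhealthy_targets(fetch_target_health(arn))` while the outage is happening would show whether the ALB has marked any task as unhealthy or draining. If every target reports healthy while users are locked out, the health check path is probably too shallow to notice the failure.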
Tim
Another way to check container health is to create an EC2 instance in the same subnet and make a request directly to the container, and another via the ALB.
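A rough way to run that comparison from the EC2 instance is a small script that sends the same GET to a container's private address and to the ALB, then reports status and latency for each. This is only a sketch using the Python standard library; `probe` is a hypothetical helper name, and the addresses in the usage note are placeholders.

```python
import time
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 5.0) -> dict:
    """Issue one GET and report the status (if any), the error (if any), and the latency."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status, error = resp.status, None
    except urllib.error.HTTPError as exc:
        # Something answered, just with a 4xx/5xx status.
        status, error = exc.code, None
    except (urllib.error.URLError, OSError) as exc:
        # Nothing answered: refused, timed out, DNS failure, and so on.
        status, error = None, str(exc)
    return {
        "url": url,
        "status": status,
        "error": error,
        "seconds": round(time.monotonic() - start, 3),
    }
```

For example, call `probe("http://10.0.1.23:8080/")` for a task's private IP and `probe("http://my-alb.example.com/")` for the load balancer (both addresses are made up). If the direct request succeeds while the ALB request fails or hangs, the problem is more likely in routing or health checks than in the container itself; if the direct request also hangs, the application inside the container is the likelier culprit.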