I have an AWS Application Load Balancer (ALB) in front of an Auto Scaling group of EC2 instances. The instances run a Windows + IIS web server, and the web server connects to a database.
Every couple of months, the ALB health checks start reporting the application as unhealthy and the EC2 instances are taken down. There are always at least 2 instances running, and it happens to all of them at the same time. I am trying to understand why, but I cannot find any useful logs or any indication of where this is coming from.
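The termination reason and the per-target health-check failure reason are usually recorded by AWS even when the application logs show nothing. A sketch of where to look, using boto3 (the group name and target group ARN below are placeholders):

```python
import boto3

# The Auto Scaling activity history records why each instance was
# terminated ("ELB health check failed", spot interruption, scale-in, ...).
asg = boto3.client("autoscaling")
for activity in asg.describe_scaling_activities(
        AutoScalingGroupName="my-asg",  # placeholder name
        MaxRecords=20)["Activities"]:
    print(activity["StartTime"], activity["Cause"])

# The ALB reports a failure reason per target, e.g. Target.Timeout
# when the page did not answer within the health-check timeout.
elb = boto3.client("elbv2")
health = elb.describe_target_health(
    TargetGroupArn="arn:aws:elasticloadbalancing:...")  # placeholder ARN
for desc in health["TargetHealthDescriptions"]:
    print(desc["Target"]["Id"],
          desc["TargetHealth"].get("Reason"),
          desc["TargetHealth"].get("Description"))
```

The `Cause` field in the scaling activities in particular should say whether the ASG terminated the instances because the ELB health check failed, or for some other reason (spot reclaim, scale-in, etc.).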
See how the instance count suddenly drops to zero on 12/6:
Zoomed-in:
The EC2 instances are terminated as follows:
The health check is configured to ping a page that does not query the database, so a database bottleneck does not seem to be the likely cause.
When that happens, the response time skyrockets:
And also as measured by NewRelic:
Note a few things:
- all phases of the response are slower (Redis time, .NET time, etc.)
- it happens to all servers at the same time, so it is unlikely to be a problem within a single server
- it always happens outside of business hours, when load is low
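Since the slowdowns happen off-hours when nobody is watching, one cheap way to capture them is a standalone probe that polls the health-check page and logs status and latency (the URL below is a placeholder):

```python
import time
import urllib.request
from datetime import datetime, timezone

URL = "https://example.com/health"  # placeholder for the health-check page

# Poll the page every 10 s and log timestamp, HTTP status, and latency,
# so a slowdown outside business hours leaves a trace to correlate
# against the ALB health-check failures.
while True:
    start = time.monotonic()
    try:
        with urllib.request.urlopen(URL, timeout=20) as resp:
            status = resp.status
    except Exception as exc:
        status = f"ERR {exc!r}"
    elapsed = time.monotonic() - start
    print(f"{datetime.now(timezone.utc).isoformat()} {status} {elapsed:.2f}s",
          flush=True)
    time.sleep(10)
```

Running this from outside AWS (and, separately, from inside the VPC) would also help distinguish a network-path problem from a server-side one.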
Auto-Scaling configurations:
Minimum capacity=2
Maximum capacity=15
Instances distribution= 50% On-Demand, 50% Spot
Include On-Demand base capacity=Designate the first 1 instances as On-Demand
On-Demand allocation strategy=Prioritized
Spot allocation strategy=Lowest price - diversified across the 10 lowest priced pools
Capacity rebalance=Off
Instance scale-in protection=Not protected from scale in
Termination policies=Default
Default cooldown=300
Target Group Configurations:
Protocol=HTTPS
Path=/path/to/login/page
Port=Traffic port
Healthy threshold=2 consecutive health check successes
Unhealthy threshold=4 consecutive health check failures
Timeout=20 seconds
Interval=25 seconds
Success codes=200
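For reference, a quick sanity check of the detection windows implied by the Timeout/Interval/threshold values above:

```python
# Health-check settings from the target group configuration above
interval_s = 25            # seconds between health checks
unhealthy_threshold = 4    # consecutive failures to mark unhealthy
healthy_threshold = 2      # consecutive successes to mark healthy

# Worst case, a target must fail 4 checks in a row, 25 s apart,
# before the ALB marks it unhealthy.
time_to_unhealthy = unhealthy_threshold * interval_s
print(f"time to unhealthy: ~{time_to_unhealthy} s")   # ~100 s

# A recovered target needs 2 consecutive successes to be healthy again.
time_to_healthy = healthy_threshold * interval_s
print(f"time to healthy: ~{time_to_healthy} s")       # ~50 s
```

So whatever is happening must be making the page exceed the 20 s timeout (or return non-200) continuously for at least ~100 seconds on every instance at once, which points away from a transient blip on a single server.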