We are seeing a very strange issue using Azure Load Balancer with AKS.
We have a website that accepts WebSocket connections. Traffic goes from the client to the Azure Load Balancer and into the website itself inside AKS.
In our stress test app, we spawn 10k WebSocket connections. They all connect.
If we then stop the connections non-gracefully, we see an initial drop in connections between the load balancer and the website.
But the count then sticks at a seemingly random number of connections for about 15-20 minutes before all connections disappear.
Like so:
(The graph shows the load balancer's inbound flows, with the test app events drawn onto it.)
If we instead run the load test app against the website directly, bypassing the Azure Load Balancer, it all works as expected: killing the test app drops all the connections in the website instantly.
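To illustrate the direct case, here is a minimal sketch, not the actual stress test app. It uses plain TCP sockets as a stand-in for WebSocket clients, and the helper names (`open_connections`, `abort_connections`, `count_dropped`) are made up for this example. It assumes a Linux or macOS host for the `SO_LINGER` struct layout. Aborting a socket this way sends a TCP RST, which is only one form of non-graceful stop; killing a process or cutting the network may look different on the wire.

```python
import socket
import struct


def open_connections(host, port, count):
    """Open `count` plain TCP connections (stand-ins for WebSocket clients)."""
    return [socket.create_connection((host, port)) for _ in range(count)]


def abort_connections(conns):
    """Close every socket abruptly, without a graceful shutdown.

    SO_LINGER enabled with a zero timeout makes close() send a TCP RST
    instead of the normal FIN exchange. The "ii" struct layout is an
    assumption that holds on Linux and macOS, not on Windows.
    """
    for s in conns:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
        s.close()


def count_dropped(server_side_conns):
    """Count server-side sockets whose peer is known to be gone."""
    dropped = 0
    for s in server_side_conns:
        s.setblocking(False)
        try:
            # Peek so that no application data is consumed by the check.
            if s.recv(1, socket.MSG_PEEK) == b"":
                dropped += 1  # peer closed with FIN
        except BlockingIOError:
            pass  # nothing received yet: the connection still looks alive
        except ConnectionError:
            dropped += 1  # peer aborted with RST
    return dropped
```

Run against a listener on the same machine, `count_dropped` should go from 0 to the full connection count right after `abort_connections`, which matches the instant drop described for the case without the load balancer in the path.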
The Azure Load Balancer does not have that many settings. We have tried sticky sessions set to both None and Client IP, with the same behavior.
We have also verified that there are no hidden client connections from the test app machine by disabling its network completely, so that is not the issue.
We have no need for resuming sockets in that sense. The client can just make a full reconnect if disconnected.
Our impression is that the load balancer tries to be clever somehow, in case a connection might reconnect later.
If that is the case, can it somehow be disabled?
Any tips on what we should try are welcome.