I occasionally experience massive performance issues when accessing one of my systems over the network (internet). I have tried to narrow down the cause and actually made some progress, but I am stuck because I simply do not know much about low-level network technology. That is why I could use some help here: ideas, hints ...
The system is a private VPS (operated by a service provider on top of KVM). It offers services such as web and email, and at no point can I observe the system hitting limits such as memory or processing power. I sometimes observe ridiculously low download speeds (~80kb/sec) compared to normal requests (10mb/sec). The problem can be observed only on some client systems accessing that server, not on others. If the issue occurs on a client at all, then it is reproducible there. The issue appears to be independent of factors such as local network uplink, system load, time of day, and size of the requested resource. As far as I can see it is also independent of the high-level protocol: I see the same issue with http, https, imaps and scp, if it occurs at all on that client system. As a test I also replaced the Apache HTTP server with an nginx installation: same behavior. The issue is independent of the operating system used on the client side.
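To compare clients and uplinks with the same yardstick, the download speed can be measured with a small script rather than read off a browser. The following is only a minimal sketch: the function names and the URL in the usage note are placeholders I made up, and it measures nothing more than how fast a response body can be read.

```python
import time
from urllib.request import urlopen


def measure_throughput(stream, chunk_size=64 * 1024, clock=time.monotonic):
    """Read `stream` to the end and return (total_bytes, bytes_per_second)."""
    start = clock()
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
    elapsed = clock() - start
    # Guard against a zero interval on very small or purely local reads.
    rate = total / elapsed if elapsed > 0 else float("inf")
    return total, rate


def download_rate(url, timeout=30):
    """Fetch `url` (a placeholder for a large static file) and measure it."""
    with urlopen(url, timeout=timeout) as response:
        return measure_throughput(response)
```

Calling `download_rate("https://vps.example.org/testfile.bin")` (a hypothetical URL) once with and once without the VPN on the same client would give two directly comparable numbers.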
I have now been able for the first time to reproduce the issue on one of my development systems, with somewhat surprising insights:
I experience the issue at a certain location, not at others (!). Interestingly, it occurs there when using the local network uplink (WiFi and DSL), but likewise when using my mobile phone as the uplink. However, the issue does not occur over the same uplink routes when I enable my VPN tunnel! With the VPN I immediately see an improvement by a factor of 1000, regardless of the network route. The important detail here: the VPN peer I connect to is located on exactly that server; I operate the VPN service myself (WireGuard). This does indeed make a difference to the routing inside the server system.
Without the VPN I experience massive packet loss (I can see it using tcpdump and wireshark). I see no packet loss when I enable the VPN. Certainly a VPN can in some cases compensate for packet loss, but the massively increased throughput tells me that this is not the only benefit here. The difference seems to be either the local routing inside the peer system (the server), with packets traveling out through different interfaces, or maybe some kind of problem at network boundaries that depends on packet size and content (I know little about such details).
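One test I could imagine for the packet size hypothesis is to send pings of different sizes with the don't-fragment bit set and see which sizes get through without the VPN. The sketch below assumes a Linux client with iputils `ping` (its `-M do`, `-s` and `-c` options) and plain IPv4. The list of sizes is my own guess at interesting values: 1500 for Ethernet, 1492 for PPPoE on DSL, 1420 as the usual wg-quick default for WireGuard, and 1280 as a small fallback. The function names are made up for illustration.

```python
import subprocess

IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one IPv4 packet of `mtu` bytes."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def ping_command(host, mtu, count=3):
    """Build an iputils ping call that forbids fragmentation (-M do)."""
    payload = icmp_payload_for_mtu(mtu)
    return ["ping", "-M", "do", "-c", str(count), "-s", str(payload), host]


def probe(host, mtus=(1500, 1492, 1420, 1280)):
    """Return {mtu: True/False}; True means at least one echo reply arrived."""
    results = {}
    for mtu in mtus:
        proc = subprocess.run(ping_command(host, mtu), capture_output=True, text=True)
        # iputils ping exits with 0 only if replies were received.
        results[mtu] = proc.returncode == 0
    return results
```

If, for example, `probe("vps.example.org")` (hypothetical host name) showed the larger sizes failing and the smaller ones succeeding, that would at least hint at a size-dependent problem somewhere on the path. A failure could also just mean that ICMP is filtered, so the result would need to be read with care.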
You can see that I am at a loss here. I would like to understand what possible issues I might be looking at, so that I can approach the provider's support people (they actually do have in-depth knowledge; I talked to one of them months ago about a different issue and was impressed by his skills and the time he invested).
Are there more tests I could apply to further narrow down the issue?
Does anyone have an idea what the actual cause of the issue might be?