This appears to be a network issue between a Redis cluster (deployed on an on-prem worker node at our site) and a Redis client (on a GCP VM connected to our site via Cloud VPN). Specifically, Redis clients (redis-py and redis-cli) become unresponsive when running the info stats / info memory commands, whose responses are large TCP packets, while connected to the Redis cluster deployed on our site's node.
(https://i.stack.imgur.com/GNRMF.png)
As the screenshot above shows, when the Redis cluster is deployed on Google Compute Engine VMs in GCP (i.e. in the same place as the Redis client), the info stats / info memory commands work without any issues. (Note that the info memory response is 1,491 bytes.)
(https://i.stack.imgur.com/LLhaT.png)
However, when the Redis cluster is deployed on the site's worker node, the Redis client hangs indefinitely on the same commands, and the packet dump shows the server's response as [TCP Previous segment not captured] Response: [fragment] [fragment] (see the screenshot above).
(https://i.stack.imgur.com/WUuMC.png)
After reading the VPC MTU documentation (https://cloud.google.com/vpc/docs/mtu#handling_of_packets_that_exceed_mtu) (screenshot above), I first suspected a problem with the MTU on the Google VPC, because the documentation mentions that IP fragmentation is not supported for TCP.
My reasoning was roughly: the VPN adds more network layers to go through, which means more bytes per packet -> the packet exceeds the MTU limit -> the client hangs. To be honest, I am not even sure whether I am on the right track. The valid MTUs (https://cloud.google.com/vpc/docs/mtu#valid_mtus) are already specified; still, just to see whether it would change anything, I tried changing the VPC MTU to jumbo frames (8,896 bytes), but that did not help at all.
Shouldn't fragmentation happen automatically?
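To make the MTU reasoning concrete, here is a rough sketch of the arithmetic. It rests on several assumptions: the 20-byte IPv4 and TCP header sizes hold only when no options are present, the 1,460-byte MTU is just an example value, and tunnel_overhead is a hypothetical placeholder, because the real VPN encapsulation overhead depends on the ciphers in use. The helper names are made up for illustration.

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def max_segment_size(path_mtu: int, tunnel_overhead: int = 0) -> int:
    """Largest TCP payload that fits in one packet on the path.

    tunnel_overhead is a hypothetical number of bytes consumed by
    VPN encapsulation; it is not taken from any documentation.
    """
    return path_mtu - tunnel_overhead - IPV4_HEADER - TCP_HEADER


def segments_needed(payload_bytes: int, mss: int) -> int:
    """Number of TCP segments needed to carry payload_bytes."""
    return math.ceil(payload_bytes / mss)


# Example with assumed values: a 1,460-byte MTU, no tunnel overhead,
# and the 1,491-byte info memory response mentioned above.
mss = max_segment_size(1460)
print(mss, segments_needed(1491, mss))  # prints: 1420 2
```

If the sender sizes its segments for a larger MTU than the tunnel can actually carry, and the oversized packets cannot be fragmented, the large replies would be lost while small ones get through. That would be consistent with the symptoms, but it is unconfirmed.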
It is worth noting that all other info sections and all other commands (which produce much smaller packets) work fine in the same environment.
The issue therefore seems to be network-related, but I have been unable to troubleshoot it.
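One way to compare small and large replies without the client hanging indefinitely is to send each command over a plain socket with a timeout and record how many reply bytes arrive. The sketch below uses only the standard library and makes several assumptions: the server needs no authentication or TLS, replies use the default RESP2 format, and the reply is either a single line or a single bulk string (as with PING and INFO). The function names and the host in the usage comment are hypothetical.

```python
import socket


def encode_command(*parts: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [f"*{len(parts)}\r\n".encode()]
    for part in parts:
        data = part.encode()
        out.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(out)


def reply_complete(buf: bytes) -> bool:
    """True once buf holds a whole single-line or bulk-string reply."""
    header, sep, rest = buf.partition(b"\r\n")
    if not sep:
        return False  # first line not fully received yet
    if header[:1] != b"$":
        return True   # simple string, error, or integer: one line only
    length = int(header[1:])
    # A bulk string is <length> bytes followed by a trailing CRLF.
    return length < 0 or len(rest) >= length + 2


def probe(host: str, port: int, *command: str, timeout: float = 5.0):
    """Send one command; return (completed, bytes_received).

    A connection failure raises; a stalled reply returns (False, n).
    """
    received = b""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_command(*command))
        try:
            while not reply_complete(received):
                chunk = sock.recv(4096)
                if not chunk:
                    break  # server closed the connection
                received += chunk
        except socket.timeout:
            return False, len(received)
    return reply_complete(received), len(received)


# Hypothetical usage (replace host and port with the real node):
# for cmd in (("PING",), ("INFO", "server"), ("INFO", "memory")):
#     print(cmd, probe("10.0.0.5", 6379, *cmd))
```

If the small commands complete and the large ones time out after zero or only a few bytes, that points at packet size rather than at Redis itself.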
I would appreciate any help!