This is a follow-up to a prior question I asked, but with a different ask/approach. In case it matters, I'm on GKE, but I'm hoping there's a cloud-agnostic answer.
I'm trying to run the container factoriotools/factorio, but the application has some particular requirements due to the use of an application-specific public server listing function, as seen in many video games that use user-hosted servers.
So far, I've been able to get everything working with host networking, and I've gotten direct connections working with a NodePort Service. However, the application's public server listing function remains an issue for unprivileged containers.
Here's how Factorio figures out how to manage the public server listing:
- The container listens on one socket, say UDP Port 34197.
- The NodePort Service routes that traffic to 20635 publicly.
- The container sends a ping over its listening port (34197) to a ping-pong server, and the ping-pong server replies with the IP address and port it received the ping from. Outside of k8s, this would be port 34197 still.
- The container then uses this information to register with the server listing. IP Address, port, server name, and some other information.
- In GKE, the outbound ping does not leave from 34197 or 20635; it goes out from an arbitrary unused port (say, 40792).
- The container believes it is reachable on a port other than the one I set up (20635), and then registers with the public server listing using the wrong port (40792, because the ping-pong server said so).
- Any attempt to connect from the public server listing then fails; the client believes that the server is listening on the port the ping-pong server witnessed (40792), but the container has been interacting with its listening port (34197) the entire time.
(I've been told this process is a variation of how ICE/STUN work)
So that means that if the container listens on 34197, but Kubernetes routes that to 20635 externally, both inbound and outbound traffic need to go through 20635 on the public side in order for the application's built-in server listing function to work.
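To make the mismatch concrete, here is a toy model of what I believe is happening. It is only a sketch: the port numbers are the example values from above, PINGPONG_PORT and the client port 50000 are made up, and the two translation functions are my assumption about how the inbound NodePort path and the outbound NAT path behave, not something I've read out of the GKE configuration.

```python
from dataclasses import dataclass

LISTEN_PORT = 34197    # port the container binds (from the example above)
NODE_PORT = 20635      # port the NodePort Service exposes (from the example above)
SNAT_PORT = 40792      # arbitrary port the outbound ping ends up on (example value)
PINGPONG_PORT = 9999   # hypothetical; the real ping-pong server's port is not known here


@dataclass(frozen=True)
class Packet:
    src_port: int
    dst_port: int


def inbound_via_nodeport(pkt: Packet) -> Packet:
    """Assumed inbound path: node:NODE_PORT is rewritten to container:LISTEN_PORT."""
    if pkt.dst_port != NODE_PORT:
        raise ValueError("no forwarding rule for port %d" % pkt.dst_port)
    return Packet(pkt.src_port, LISTEN_PORT)


def outbound_via_snat(pkt: Packet) -> Packet:
    """Assumed outbound path: the source port is replaced by an unrelated one."""
    return Packet(SNAT_PORT, pkt.dst_port)


def advertised_port() -> int:
    """Port the ping-pong server reports back, i.e. the source port it saw."""
    ping = Packet(src_port=LISTEN_PORT, dst_port=PINGPONG_PORT)
    return outbound_via_snat(ping).src_port


def client_can_connect(port: int) -> bool:
    """True if a client packet aimed at `port` on the node reaches the container."""
    try:
        inbound_via_nodeport(Packet(src_port=50000, dst_port=port))
        return True
    except ValueError:
        return False
```

In this model, advertised_port() returns 40792, client_can_connect(40792) is False, and client_can_connect(20635) is True, which matches what I see: direct connections work, connections via the listing do not.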
If I bypass the public server listing, and connect directly to the container's node's public IP address with port 20635, it works flawlessly. But that's a pretty massive compromise for what I'm doing.
Host networking bypasses this entire issue by allowing the container to directly open whatever port it wants to on the host. For hosts that are already publicly exposed, this means nothing on the host (especially not k8s) can re-route the container traffic through extra layers and change the port numbers. So when the container opens port 34197, it gets port 34197. When it sends on port 34197, that traffic is sent on port 34197. The ping-pong server sees the port it's supposed to see. And because it's a public UDP port, it doesn't matter who sent the traffic first; traffic is traffic, the port is the port.
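For comparison, this is the behaviour I get when nothing rewrites the source port, sketched on loopback. The "ping-pong server" here is a stand-in I wrote for illustration; the message format (a plain decimal port number) is made up and is not Factorio's actual protocol.

```python
import socket
from typing import Tuple


def run_local_demo() -> Tuple[int, int]:
    """Return (port the game socket is bound to, port the ping-pong peer saw)."""
    # Stand-in ping-pong server (hypothetical, loopback only).
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server.settimeout(2)

    # Stand-in for the game's listening socket; port 0 lets the OS pick a free one.
    game = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    game.bind(("127.0.0.1", 0))
    game.settimeout(2)

    try:
        # The ping is sent FROM the listening socket, as described above.
        game.sendto(b"ping", server.getsockname())

        # The server replies with the source port it observed.
        _, src_addr = server.recvfrom(64)
        server.sendto(str(src_addr[1]).encode(), src_addr)

        reply, _ = game.recvfrom(64)
        return game.getsockname()[1], int(reply.decode())
    finally:
        server.close()
        game.close()
```

With no translation in between, the two numbers returned by run_local_demo() are equal, which is exactly the property the server listing relies on and the one I lose behind the NodePort Service.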
However, if I understand the docs correctly, running a container on the host's network stack requires a privileged container, which is effectively root access on the host, which is Very Bad In Production. So, for unprivileged containers, there needs to be a solution other than relying on host networking. I cannot find that "other solution".
I cannot find documentation anywhere about how to do this, or even evidence that anyone is thinking about it. How do I make this work?