
How to whitelist egress traffic with a NetworkPolicy that doesn't prevent Apache Ignite from starting up?


I have a more or less complex microservice architecture where Apache Ignite is used as a stateless database/cache. The Ignite Pod is the only Pod in its Namespace, and the architecture has to pass a security audit, which it won't pass unless I apply the most restrictive NetworkPolicy possible for egress traffic. It has to restrict all traffic that is not needed by Ignite itself.

At first, I thought: nice, Ignite does not push any traffic to other Pods (there are no other Pods in that Namespace), so this can easily be done by restricting all egress traffic in the Namespace where Ignite is the only Pod! ...

Well, that didn't actually work out great:
Any egress rule, even one that allows traffic to all of the ports mentioned in the Ignite documentation, causes startup to fail with an IgniteSpiException that says Failed to retrieve Ignite pods IP addresses, Caused by: java.net.ConnectException: Operation timed out (Connection timed out).

The problem seems to be the TcpDiscoveryKubernetesIpFinder, especially the method getRegisteredAddresses(...), which apparently produces some egress traffic inside the Namespace in order to register the IP addresses of Ignite nodes. The discovery port 47500 is of course allowed, but that does not change the situation. The functionality of Ignite together with the Pods from other Namespaces works without any egress rules applied, which tells me that the configuration concerning the ClusterRole, the ClusterRoleBinding, the Service in the Namespace, the XML configuration of Ignite itself and so on is correct. Even ingress rules restricting traffic from other Namespaces work as expected, allowing exactly the desired traffic.
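
For reference, the RBAC side follows the standard pattern from the Ignite documentation; a minimal sketch (the resource names are illustrative, my actual manifests differ only in naming):

## Lets the IP finder read Pods/Endpoints via the Kubernetes API
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ignite
rules:
  - apiGroups: [""]
    resources: ["pods", "endpoints"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ignite
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ignite
subjects:
  - kind: ServiceAccount
    name: ignite
    namespace: cache-ns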

These are the policies I applied:

[WORKING, blocking undesired traffic only]:

## Denies all Ingress traffic to all Pods in the Namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-ingress-in-cache-ns
  namespace: cache-ns
spec:
  # an empty podSelector selects all Pods in the Namespace; since no ingress rules are given, all incoming traffic to them is denied
  podSelector:
    matchLabels: {}
  # traffic routes to be considered, here: incoming exclusively
  policyTypes:
    - Ingress
---
## Allows necessary ingress traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: netpol-cache-ns
  namespace: cache-ns
# defines the pod(s) that this policy is targeting
spec:
  policyTypes:
    - Ingress
  podSelector:
    matchLabels:
      app: ignite
  # <----incoming traffic----
  ingress:
    - from:
      - namespaceSelector:
          matchLabels:
            zone: somewhere-else
        podSelector:
          matchExpressions:
            - key: app
              operator: In
              values: [some-pod, another-pod]  # dummy names, these Pods don't matter at all
      ports:
        - port: 11211   # JDBC
          protocol: TCP
        - port: 47100   # SPI communication
          protocol: TCP
        - port: 47500   # SPI discovery (CRITICAL, most likely...)
          protocol: TCP
        - port: 10800   # SQL
          protocol: TCP
# ----outgoing traffic---->
# NONE AT ALL

With these two applied, everything is working fine, but the security audit will say something like
Where are the restrictions for egress? What if this node is hacked via the allowed routes because one of the Pods using these routes was hacked before? It may call a C&C server then! This configuration will not be permitted, harden your architecture!

[BLOCKING desired/necessary traffic]:

Generally deny all traffic...

## Denies all traffic to all Pods in the Namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-traffic-in-cache-ns
  namespace: cache-ns
spec:
  # an empty podSelector selects all Pods in the Namespace; since no rules are given, all of their traffic is denied
  podSelector:
    matchLabels: {}
  # traffic directions to be considered, here: incoming and outgoing
  policyTypes:
    - Ingress
    - Egress   # <------ THIS IS THE DIFFERENCE TO THE WORKING ONE ABOVE

... and allow specific routes afterwards

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: netpol-cache-ns-egress
  namespace: cache-ns
# defines the pod(s) that this policy is targeting
spec:
  policyTypes:
    - Egress
  podSelector:
    matchLabels:
      app: ignite
  # ----outgoing traffic---->
  egress:
    # [NOT SUFFICIENT]
    # allow egress to this namespace at specific ports
    - to:
      - namespaceSelector:
          matchLabels:
            zone: cache-zone
      ports:
        - protocol: TCP
          port: 10800
        - protocol: TCP
          port: 47100   # SPI communication
        - protocol: TCP
          port: 47500
    # [NOT SUFFICIENT]
    # allow dns resolution in general (no namespace or pod restriction)
    - ports:
      - protocol: TCP
        port: 53
      - protocol: UDP
        port: 53
    # [NOT SUFFICIENT]
    # allow egress to the kube-system (label is present!)
    - to:
      - namespaceSelector:
          matchLabels:
            zone: kube-system
    # [NOT SUFFICIENT]
    # allow egress in this namespace and for the ignite pod
    - to:
      - namespaceSelector:
          matchLabels:
            zone: cache-zone
        podSelector:
          matchLabels:
            app: ignite
    # [NOT SUFFICIENT]
    # allow traffic to the IP address of the ignite pod
    - to:
      - ipBlock:
          cidr: 172.21.70.49/32  # won't work well since those addresses are dynamic
      ports:
        - port: 11211   # JDBC
          protocol: TCP
        - port: 47100   # SPI communication
          protocol: TCP
        - port: 47500   # SPI discovery (CRITICAL, most likely...)
          protocol: TCP
        - port: 49112   # JMX
          protocol: TCP
        - port: 10800   # SQL
          protocol: TCP
        - port: 8080    # REST
          protocol: TCP
        - port: 10900   # thin clients
          protocol: TCP
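
What I suspect is still missing: as far as I understand it, getRegisteredAddresses(...) asks the Kubernetes API server for the Endpoints of the Ignite Service, so besides DNS the Pod needs egress to the API server itself, and the API server is neither a Pod that a podSelector can match nor inside the Namespace. A sketch of such a rule (untested; the address below is an assumption):

    # [UNTESTED SKETCH]
    # allow egress to the Kubernetes API server; the ClusterIP below is an
    # assumption, look it up with `kubectl get endpoints kubernetes -n default`
    - to:
      - ipBlock:
          cidr: 10.96.0.1/32   # assumption: ClusterIP of the `kubernetes` Service
      ports:
        - protocol: TCP
          port: 443            # assumption: some clusters serve on 6443 instead

Depending on the CNI plugin, traffic to the Service's ClusterIP may already be rewritten to the endpoint addresses by the time the policy is evaluated, so it may be the Endpoints' IP addresses that actually need whitelisting.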

The Apache Ignite version used is 2.10.0.

Now the question to all readers is:

How can I restrict Egress to an absolute minimum inside the Namespace so that Ignite starts up and works correctly? Would it be sufficient to just deny Egress to outside the cluster?
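
Regarding the last part of the question, I imagine that "deny egress to outside the cluster only" would boil down to an ipBlock rule covering the cluster-internal address ranges, roughly like this (both CIDRs are made up and depend on how the cluster network was set up):

  egress:
    # [SKETCH] allow all egress that stays inside the cluster;
    # both CIDRs are assumptions taken from the cluster's network config
    - to:
      - ipBlock:
          cidr: 172.21.0.0/16   # assumption: Pod network CIDR
      - ipBlock:
          cidr: 10.96.0.0/12    # assumption: Service CIDR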

If you need any more yamls for an educated guess or hint, please feel free to request them in a comment.
And sorry if the tagging seems inappropriate; I couldn't find the tag kubernetes-networkpolicy that exists on Stack Overflow.

UPDATE:

Executing nslookup -debug kubernetes.default.svc.cluster.local from inside the Ignite Pod without any policy restricting egress shows

BusyBox v1.29.3 (2019-01-24 07:45:07 UTC) multi-call binary.

Usage: nslookup HOST [DNS_SERVER]

Query DNS about HOST

As soon as any NetworkPolicy is applied that restricts egress to specific ports, Pods and Namespaces, the Ignite Pod refuses to start and the lookup does not reach kubernetes.default.svc.cluster.local anymore.

Egress to DNS was allowed (UDP 53 to k8s-app: kube-dns) ⇒ still no IP lookup possible
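
Spelled out, the DNS rule that was applied looks essentially like this (a sketch; k8s-app: kube-dns is the label the kube-dns/CoreDNS Pods carry by default, and TCP 53 is included for completeness since DNS can fall back to TCP):

  egress:
    - to:
      - namespaceSelector:
          matchLabels:
            zone: kube-system
        podSelector:
          matchLabels:
            k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53   # assumption: added for completeness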

Mikołaj Głodziak: Which version of Kubernetes did you use, and how did you set up the cluster? Did you use a bare-metal installation or some cloud provider?

deHaar: Cloud provider, and I am currently updating the cluster and the workers to 1.21.