I'm running a cluster with kops on AWS. Since I needed to have instances in the same VPC as the cluster, I reused an existing subnet:
kops create cluster --cloud=aws --zones=us-east-2a --node-size=t3.small --master-size=t3.small --name=${KOPS_CLUSTER_NAME} --subnets=subnet-c717c9ae --yes
However, I got frequent errors from the aws-cloud-controller-manager:
$ kubectl logs -n kube-system -f aws-cloud-controller-manager-gv9bkgg
...
E1004 19:20:17.261728 1 route_controller.go:124] Couldn't reconcile node routes: error listing routes: unable to find route table for AWS cluster: cluster.mydomain.com
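For what it's worth, the route table associated with the reused subnet can be inspected with something like this (using the subnet ID from the create command above; if the subnet has no explicit association, it falls back to the VPC's main route table):

aws ec2 describe-route-tables \
  --filters Name=association.subnet-id,Values=subnet-c717c9ae \
  --query 'RouteTables[].{Id:RouteTableId,Tags:Tags}'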
Then I added a tag KubernetesCluster=cluster.mydomain.com to the subnet's route table, which solved this problem but created another (logs below, after the tagging command).
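The tagging itself was along these lines (rtb-0123456789abcdef0 is a placeholder for the actual route table ID):

aws ec2 create-tags --resources rtb-0123456789abcdef0 \
  --tags Key=KubernetesCluster,Value=cluster.mydomain.com

After that, the controller started creating routes: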
I1004 20:44:37.606138 1 route_controller.go:199] Creating route for node i-06ac28fc2ced86895 100.96.2.0/24 with hint 05fd5d46-9513-4e89-8e50-1f44227549d4, throttled 15.465µs
I1004 20:44:37.606214 1 route_controller.go:199] Creating route for node i-03466cd9918eb7781 100.96.3.0/24 with hint d231da9b-b14d-48be-8ab4-ad3a18594071, throttled 4.837µs
I1004 20:44:38.343828 1 route_controller.go:219] Created route for node i-03466cd9918eb7781 100.96.3.0/24 with hint d231da9b-b14d-48be-8ab4-ad3a18594071 after 737.608324ms
I1004 20:44:38.360854 1 route_controller.go:219] Created route for node i-06ac28fc2ced86895 100.96.2.0/24 with hint 05fd5d46-9513-4e89-8e50-1f44227549d4 after 754.723911ms
I1004 20:44:38.361182 1 route_controller.go:313] Patching node status i-03466cd9918eb7781 with true previous condition was:nil
I1004 20:44:38.361301 1 route_controller.go:313] Patching node status i-06ac28fc2ced86895 with true previous condition was:nil
I1004 20:44:47.719493 1 route_controller.go:304] set node i-06ac28fc2ced86895 with NodeNetworkUnavailable=false was canceled because it is already set
Since then, this message has been spamming the aws-cloud-controller-manager logs around 10 times per minute:
I1004 22:53:26.151322 1 route_controller.go:304] set node i-01c013ae44a04b63b with NodeNetworkUnavailable=false was canceled because it is already set
What can I do to solve this issue? I already terminated the instances and updated the cluster so kops would recreate them, without luck. Maybe I could set NodeNetworkUnavailable to true so the controller can set it back to false, but I don't know how to do that, nor whether it really makes sense (some sources also say to avoid changing a node's state manually).
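If forcing the condition is actually the right approach, the only mechanism I can think of is patching the node's status subresource, along these lines (untested sketch; requires a kubectl new enough to support --subresource):

kubectl patch node i-01c013ae44a04b63b --subresource=status --type=strategic \
  -p '{"status":{"conditions":[{"type":"NodeNetworkUnavailable","status":"True","reason":"ManualReset","message":"reset so the cloud controller can set it back to false"}]}}'

But again, given the warnings about touching node state, I'm not sure this is advisable.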