I'm running a cluster with kops on AWS. Since I needed to have instances in the same VPC as the cluster, I reused an existing subnet:
kops create cluster --cloud=aws --zones=us-east-2a --node-size=t3.small --master-size=t3.small --name=${KOPS_CLUSTER_NAME} --subnets=subnet-c717c9ae --yes
However, I got frequent errors from the aws-cloud-controller-manager:
$ kubectl logs -n kube-system -f aws-cloud-controller-manager-gv9bkgg
...
E1004 19:20:17.261728 1 route_controller.go:124] Couldn't reconcile node routes: error listing routes: unable to find route table for AWS cluster: cluster.mydomain.com
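For what it's worth, the route table associated with the reused subnet can be inspected with something like this (using the subnet ID from the create command above; if the subnet has no explicit association, it falls back to the VPC's main route table):

aws ec2 describe-route-tables \
  --filters Name=association.subnet-id,Values=subnet-c717c9ae \
  --query 'RouteTables[].{Id:RouteTableId,Tags:Tags}'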
Then I added a tag KubernetesCluster=cluster.mydomain.com to the subnet's route table, which solved this problem but created another (logs below, after the tagging command).
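The tagging itself was along these lines (rtb-0123456789abcdef0 is a placeholder for the actual route table ID):

aws ec2 create-tags --resources rtb-0123456789abcdef0 \
  --tags Key=KubernetesCluster,Value=cluster.mydomain.com

After that, the controller started creating routes: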
I1004 20:44:37.606138 1 route_controller.go:199] Creating route for node i-06ac28fc2ced86895 100.96.2.0/24 with hint 05fd5d46-9513-4e89-8e50-1f44227549d4, throttled 15.465µs
I1004 20:44:37.606214 1 route_controller.go:199] Creating route for node i-03466cd9918eb7781 100.96.3.0/24 with hint d231da9b-b14d-48be-8ab4-ad3a18594071, throttled 4.837µs
I1004 20:44:38.343828 1 route_controller.go:219] Created route for node i-03466cd9918eb7781 100.96.3.0/24 with hint d231da9b-b14d-48be-8ab4-ad3a18594071 after 737.608324ms
I1004 20:44:38.360854 1 route_controller.go:219] Created route for node i-06ac28fc2ced86895 100.96.2.0/24 with hint 05fd5d46-9513-4e89-8e50-1f44227549d4 after 754.723911ms
I1004 20:44:38.361182 1 route_controller.go:313] Patching node status i-03466cd9918eb7781 with true previous condition was:nil
I1004 20:44:38.361301 1 route_controller.go:313] Patching node status i-06ac28fc2ced86895 with true previous condition was:nil
I1004 20:44:47.719493 1 route_controller.go:304] set node i-06ac28fc2ced86895 with NodeNetworkUnavailable=false was canceled because it is already set
Since then, this message has been spamming the aws-cloud-controller-manager logs around 10 times per minute:
I1004 22:53:26.151322 1 route_controller.go:304] set node i-01c013ae44a04b63b with NodeNetworkUnavailable=false was canceled because it is already set
What can I do to solve this issue? I already terminated the instances and updated the cluster so kops would recreate them, without luck. Maybe I could set NodeNetworkUnavailable to true so the controller can set it back to false, but I don't know how to do that, nor whether it really makes sense (some sources also say to avoid changing a node's state manually).
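If forcing the condition is actually the right approach, the only mechanism I can think of is patching the node's status subresource, along these lines (untested sketch; requires a kubectl new enough to support --subresource):

kubectl patch node i-01c013ae44a04b63b --subresource=status --type=strategic \
  -p '{"status":{"conditions":[{"type":"NodeNetworkUnavailable","status":"True","reason":"ManualReset","message":"reset so the cloud controller can set it back to false"}]}}'

But again, given the warnings about touching node state, I'm not sure this is advisable.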