
GKE Arm-based cluster starts in invalid state

After I create a new GKE cluster on Arm-based VMs, it immediately starts in a failing state. Specifically, `antrea-controller-horizontal-autoscaler` cannot be scheduled because it has no toleration for the Arm architecture.

Has this happened to anyone else? The official docs don't mention anything like this, and HPA should be allowed on Arm-based clusters.

EDIT: I'm running a GKE Standard cluster. By trial and error I've found that the setting that brings in antrea-controller is Dataplane V2. Unfortunately, there isn't a single line in the GKE docs about Antrea. Since GKE presents Dataplane V2 as the preferred option and nothing says it isn't ready for Arm, I consider this a GKE issue. Antrea itself seems to support Arm, but GKE doesn't seem to make use of that?

I suppose it would be wrong for me to change the managed resources, since I have no guarantee that my changes would keep working or that GKE wouldn't replace them with a new version of its Antrea configuration. Am I right?
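
For reference, the scheduling rule behind this failure can be sketched in a few lines. This is only an illustration, not GKE's or the scheduler's actual code. It assumes the taint that GKE documents for Arm nodes, `kubernetes.io/arch=arm64:NoSchedule`, and follows the Kubernetes toleration matching rules as I understand them. The names `Taint`, `Toleration`, `tolerates` and `can_schedule` are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


@dataclass(frozen=True)
class Toleration:
    key: str = ""             # empty key + operator "Exists" matches every taint key
    operator: str = "Equal"   # Kubernetes defaults the operator to "Equal"
    value: str = ""
    effect: str = ""          # empty effect matches every taint effect


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Return True if this single toleration matches this single taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key == "":
        return tol.operator == "Exists"
    if tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        return True
    return tol.operator == "Equal" and tol.value == taint.value


def can_schedule(tolerations: list[Toleration], taints: list[Taint]) -> bool:
    """A pod fits on a node only if every scheduling-blocking taint is tolerated."""
    blocking = [t for t in taints if t.effect in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in tolerations) for t in blocking)


# Assumption: the taint GKE places on Arm nodes, per the GKE Arm docs.
ARM_TAINT = Taint("kubernetes.io/arch", "arm64", "NoSchedule")

# A pod spec with no tolerations, which is the situation described above.
print(can_schedule([], [ARM_TAINT]))

# The same pod with a toleration matching the Arm taint.
arm_toleration = Toleration("kubernetes.io/arch", "Equal", "arm64", "NoSchedule")
print(can_schedule([arm_toleration], [ARM_TAINT]))
```

On an Arm-only cluster every node carries that taint, so a managed pod without a matching toleration (or an arm64 node selector, which GKE is documented to translate into a toleration) has nowhere to go.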

Veera Nagireddy
You're right, as per the official GCP doc on [Use toleration for scheduling multi-arch workloads to any architecture](https://cloud.google.com/kubernetes-engine/docs/how-to/prepare-arm-workloads-for-deployment). Also refer to [Troubleshooting Arm VMs](https://cloud.google.com/compute/docs/troubleshooting/troubleshooting-arm-vms#instance_boot_looping_with_default_debian_image) and [Troubleshooting Arm workloads](https://cloud.google.com/kubernetes-engine/docs/troubleshooting/troubleshooting-arm-workloads), which may help resolve your issue.
Veera Nagireddy
Check whether you can set the tolerance value on the individual HPA object; it needs to be set for the HPA in the Kubernetes global configuration. Change the tolerance value by modifying the controller manager's configuration file and then restarting the controller manager, which runs on the Kubernetes control plane.
Martin D
Thanks for your comment, @VeeraNagireddy. The articles you reference explain how to prepare my own deployments for Arm. The issue here, however, is with a `kube-system` deployment that is created and managed by GKE. I don't think I should change managed deployments, as any change will be overridden by a future cluster update, or even reconciled away before then. So unless I'm missing something, this is a bug in GKE?