I operate an internal Kubernetes platform, and we are upgrading Gatekeeper from v3.9.0 to v3.11.0. This upgrade adds version v1 to the CRDs of several custom resources in the mutations.gatekeeper.sh group (including Assign).
In order to allow us to downgrade Gatekeeper, I have patched the storage version of these CRDs to be v1beta1 instead of v1; this means that the storage version remains at v1beta1 after the upgrade and allows us to roll back if we find an issue after the Gatekeeper upgrade.
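For illustration, the storage-version patch described above amounts to something like the following sketch of the Assign CRD. It uses the standard CustomResourceDefinition fields; only the two versions discussed here are shown, and any other versions, schemas, or fields in the real Gatekeeper manifest are omitted, so treat it as an assumption about the shape rather than the exact manifest.

```yaml
# Abbreviated sketch of the patched Assign CRD (not the full Gatekeeper manifest).
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: assign.mutations.gatekeeper.sh
spec:
  group: mutations.gatekeeper.sh
  names:
    kind: Assign
    plural: assign
  scope: Cluster
  versions:
    - name: v1
      served: true
      storage: false   # patched: v1 is served but not used for storage
    - name: v1beta1
      served: true
      storage: true    # patched: v1beta1 remains the storage version
```

Exactly one entry in `versions` may have `storage: true`, which is why the patch has to flip both flags together.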
The upgrade works fine and the new version of Gatekeeper works as expected. However, when I try to downgrade (which starts by applying the previous CRDs to remove version v1), I get the following error from a Flux Kustomization (which we use to install cluster configuration):
Assign/[ASSIGN_NAME] dry-run failed, error: request to convert CR to an invalid group/version: mutations.gatekeeper.sh/v1
I do not understand why the API server is attempting that conversion, since all of the Assign manifests specify apiVersion: mutations.gatekeeper.sh/v1beta1 and so should not need to be converted, because they match the storage version.
In addition, if I upgrade the Gatekeeper CRDs but do not upgrade to the new Gatekeeper image, the downgrade works fine. The error therefore seems to be caused by Gatekeeper interacting with version v1 after the upgrade, even though the storage version remains v1beta1. Of course, this means that we could perform the upgrade across two releases (and still be able to roll back after each one), but I do not understand why this is necessary.
I suspect that this issue is related somehow to version priority.
I am running on GKE v1.24.8-gke.2000 with Flux v0.37.0.