Steps to reproduce:
Kuberhealthy runs the deployment check on a regular schedule.
Although the deployment appears to complete, the check fails to report a successful status to the Kuberhealthy service.
$ k get events -nkuberhealthy | grep deployment | tail
12m Normal ScalingReplicaSet deployment/deployment-deployment Scaled down replica set deployment-deployment-XXX to 2
12m Normal ScalingReplicaSet deployment/deployment-deployment Scaled up replica set deployment-deployment-XXX to 4
12m Normal ScalingReplicaSet deployment/deployment-deployment Scaled down replica set deployment-deployment-XXX to 0
3m31s Normal ScalingReplicaSet deployment/deployment-deployment Scaled up replica set deployment-deployment-XXX to 4
3m9s Normal ScalingReplicaSet deployment/deployment-deployment Scaled up replica set deployment-deployment-XXX to 2
3m9s Normal ScalingReplicaSet deployment/deployment-deployment Scaled down replica set deployment-deployment-69459d778b to 2
3m9s Normal ScalingReplicaSet deployment/deployment-deployment Scaled up replica set deployment-deployment-XXX to 4
3m Normal ScalingReplicaSet deployment/deployment-deployment Scaled down replica set deployment-deployment-XXX to 0
63m Warning FailedToUpdateEndpoint endpoints/deployment-svc Failed to update endpoint kuberhealthy/deployment-svc: Operation cannot be fulfilled on endpoints "deployment-svc": the object has been modified; please apply your changes to the latest version and try again
53m Warning FailedToUpdateEndpoint endpoints/deployment-svc Failed to update endpoint kuberhealthy/deployment-svc: Operation cannot be fulfilled on endpoints "deployment-svc": the object has been modified; please apply your changes to the latest version and try again
Alerts:
Expected Behavior:
The deployment check should report success.
Affected Branch:
Affected Build:
Affected Product Language:
Debug logs:
$ k logs deployment-XXX -nkuberhealthy
time="2022-12-16T12:36:43Z" level=info msg="Found instance namespace: kuberhealthy"
time="2022-12-16T12:36:43Z" level=info msg="Kuberhealthy is located in the kuberhealthy namespace."
time="2022-12-16T12:36:43Z" level=info msg="Debug logging enabled."
time="2022-12-16T12:36:43Z" level=debug msg="[/app/deployment-check]"
time="2022-12-16T12:36:43Z" level=info msg="Parsed CHECK_IMAGE: XXXX"
time="2022-12-16T12:36:43Z" level=info msg="Parsed CHECK_IMAGE_ROLL_TO: XXX"
time="2022-12-16T12:36:43Z" level=info msg="Found pod namespace: kuberhealthy"
time="2022-12-16T12:36:43Z" level=info msg="Performing check in kuberhealthy namespace."
time="2022-12-16T12:36:43Z" level=info msg="Parsed CHECK_DEPLOYMENT_REPLICAS: 2"
time="2022-12-16T12:36:43Z" level=info msg="Parsed CHECK_SERVICE_ACCOUNT: default"
time="2022-12-16T12:36:43Z" level=info msg="Check time limit set to: 14m46.760673918s"
time="2022-12-16T12:36:43Z" level=info msg="Parsed CHECK_DEPLOYMENT_ROLLING_UPDATE: true"
time="2022-12-16T12:36:43Z" level=info msg="Check deployment image will be rolled from [XXX] to [XXXX]"
time="2022-12-16T12:36:43Z" level=debug msg="Allowing this check 14m46.760673918s to finish."
time="2022-12-16T12:36:43Z" level=info msg="Kubernetes client created."
time="2022-12-16T12:36:43Z" level=info msg="Waiting for node to become ready before starting check."
time="2022-12-16T12:36:43Z" level=debug msg="Checking if the kuberhealthy endpoint: XXX is ready."
time="2022-12-16T12:36:43Z" level=debug msg="XXX."
time="2022-12-16T12:36:43Z" level=debug msg="Kuberhealthy endpoint: XXX is ready. Proceeding to run check."
time="2022-12-16T12:36:43Z" level=info msg="Starting check."
time="2022-12-16T12:36:43Z" level=info msg="Wiping all found orphaned resources belonging to this check."
time="2022-12-16T12:36:43Z" level=info msg="Attempting to find previously created service(s) belonging to this check."
time="2022-12-16T12:36:43Z" level=debug msg="Found 1 service(s)."
time="2022-12-16T12:36:43Z" level=debug msg="Service: kuberhealthy"
time="2022-12-16T12:36:43Z" level=info msg="Did not find any old service(s) belonging to this check."
time="2022-12-16T12:36:43Z" level=info msg="Attempting to find previously created deployment(s) belonging to this check."
time="2022-12-16T12:36:44Z" level=debug msg="Found 1 deployment(s)"
time="2022-12-16T12:36:44Z" level=debug msg=kuberhealthy
time="2022-12-16T12:36:44Z" level=info msg="Did not find any old deployment(s) belonging to this check."
time="2022-12-16T12:36:44Z" level=info msg="Successfully cleaned up prior check resources."
time="2022-12-16T12:36:44Z" level=info msg="Creating deployment resource with 2 replica(s) in kuberhealthy namespace using image [XXX]"
time="2022-12-16T12:36:44Z" level=info msg="Creating container using image [XXX]"
time="2022-12-16T12:36:44Z" level=info msg="Created deployment resource."
time="2022-12-16T12:36:44Z" level=info msg="Creating deployment in cluster with name: deployment-deployment"
time="2022-12-16T12:36:44Z" level=info msg="Watching for deployment to exist."
time="2022-12-16T12:36:44Z" level=debug msg="Received an event watching for deployment changes: deployment-deployment got event ADDED"
time="2022-12-16T12:36:47Z" level=debug msg="Received an event watching for deployment changes: deployment-deployment got event MODIFIED"
time="2022-12-16T12:36:48Z" level=debug msg="Received an event watching for deployment changes: deployment-deployment got event MODIFIED"
time="2022-12-16T12:36:53Z" level=debug msg="Received an event watching for deployment changes: deployment-deployment got event MODIFIED"
time="2022-12-16T12:36:53Z" level=info msg="Deployment is reporting Available with True."
time="2022-12-16T12:36:53Z" level=info msg="Created deployment in kuberhealthy namespace: deployment-deployment"
time="2022-12-16T12:36:53Z" level=info msg="Creating service resource for kuberhealthy namespace."
time="2022-12-16T12:36:53Z" level=info msg="Created service resource."
time="2022-12-16T12:36:53Z" level=info msg="Creating service in cluster with name: deployment-svc"
time="2022-12-16T12:36:53Z" level=info msg="Watching for service to exist."
time="2022-12-16T12:36:53Z" level=debug msg="Received an event watching for service changes: ADDED"
time="2022-12-16T12:36:53Z" level=info msg="Cluster IP found:XXX"
time="2022-12-16T12:36:53Z" level=info msg="Created service in kuberhealthy namespace: deployment-svc"
time="2022-12-16T12:36:53Z" level=debug msg="Retrieving a cluster IP belonging to: deployment-svc"
time="2022-12-16T12:36:53Z" level=info msg="Found service cluster IP address: XXX"
time="2022-12-16T12:36:53Z" level=info msg="Looking for a response from the endpoint."
time="2022-12-16T12:36:53Z" level=debug msg="Setting timeout for backoff loop to: 3m0s"
time="2022-12-16T12:36:53Z" level=info msg="Beginning backoff loop for HTTP GET request."
time="2022-12-16T12:36:53Z" level=debug msg="Making GET to XXX"
time="2022-12-16T12:36:53Z" level=debug msg="Got a 401"
time="2022-12-16T12:36:53Z" level=info msg="Retrying in 5 seconds."
time="2022-12-16T12:36:58Z" level=error msg="error occurred making request to service in cluster: could not get a response from the given address: XXX"
time="2022-12-16T12:36:58Z" level=info msg="Cleaning up deployment and service."
time="2022-12-16T12:36:58Z" level=info msg="Attempting to delete service deployment-svc in kuberhealthy namespace."
time="2022-12-16T12:36:58Z" level=debug msg="Checking if service has been deleted."
time="2022-12-16T12:36:58Z" level=debug msg="Delete service and wait has not yet timed out."
time="2022-12-16T12:36:58Z" level=debug msg="Waiting 5 seconds before trying again."
time="2022-12-16T12:37:03Z" level=info msg="Attempting to delete deployment in kuberhealthy namespace."
time="2022-12-16T12:37:03Z" level=debug msg="Checking if deployment has been deleted."
time="2022-12-16T12:37:03Z" level=debug msg="Delete deployment and wait has not yet timed out."
time="2022-12-16T12:37:03Z" level=debug msg="Waiting 5 seconds before trying again."
time="2022-12-16T12:37:08Z" level=info msg="Finished clean up process."
time="2022-12-16T12:37:08Z" level=error msg="Reporting errors to Kuberhealthy: [could not get a response from the given address: XXX]"
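
The log above shows the check's GET to the service returning a 401 before the check gives up. For anyone who wants to confirm by hand what the service returns, here is a minimal Python sketch of a retrying probe. It is not Kuberhealthy's implementation. The names `http_status` and `probe` are hypothetical, and the 3 attempts and 5-second delay are assumptions that only loosely mirror the "Retrying in 5 seconds." message.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, List


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for a GET to url (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is still available.
        return err.code


def probe(url: str, attempts: int = 3, delay: float = 5.0,
          fetch: Callable[[str], int] = http_status,
          sleep: Callable[[float], None] = time.sleep) -> List[int]:
    """GET url up to `attempts` times, stopping at the first 200.

    Returns every status code seen, in order. The attempt count and
    delay are assumptions, not values taken from Kuberhealthy.
    """
    seen: List[int] = []
    for attempt in range(attempts):
        code = fetch(url)
        seen.append(code)
        if code == 200:
            break
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next attempt
    return seen
```

Run from a pod in the kuberhealthy namespace, `probe("http://<service cluster IP>")` would list the status codes returned by deployment-svc (the address is redacted in the log, so the placeholder has to be filled in). A result such as `[401, 401, 401]` would match the "Got a 401" line. Connection-level failures (`urllib.error.URLError`) are deliberately not caught, so they surface as exceptions.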