Description of problem:

While the upgrade is in progress, pods in the `openshift-vertical-pod-autoscaler` namespace fail to get created, with `no endpoints available for service "vpa-webhook"` messages. The vpa-admission-plugin-default pod itself fails to get created with the same error message, which results in a deadlock.

Steps to Reproduce:
1. Install VPA operator.
2. Start an upgrade from OCP v4.6.x to v4.6.y

Actual results:

Observed the following while upgrading:

~~~
$ oc get pod -n openshift-vertical-pod-autoscaler
NAME                                                    READY   STATUS    RESTARTS   AGE
pod/vertical-pod-autoscaler-operator-6c64cd877b-46rmd   1/1     Running   0          14h
pod/vpa-recommender-default-649f9f4479-jd4jx            1/1     Running   0          13h
pod/vpa-updater-default-59bf95f4db-bvwld                1/1     Running   0          13h
~~~

- The endpoints of the vpa-webhook svc are populated with the IP of the vpa-admission-plugin-default pod. Since that pod is not running, the svc has no endpoints, and the 'no endpoints available for service "vpa-webhook"' error is encountered.

[*] The description of the replicaset for the vpa-admission-plugin-default pod:

~~~
$ oc describe replicaset.apps/vpa-admission-plugin-default-7d4c654465
Name:           vpa-admission-plugin-default-7d4c654465
Namespace:      openshift-vertical-pod-autoscaler
Selector:       app=vpa-admission-controller,pod-template-hash=7d4c654465,vertical-pod-autoscaler=default
...
Controlled By:  Deployment/vpa-admission-plugin-default
Replicas:       0 current / 1 desired
Pods Status:    0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Events:
  Type     Reason            Age                  From                   Message
  ----     ------            ----                 ----                   -------
  Normal   SuccessfulCreate  145m                 replicaset-controller  Created pod: vpa-admission-plugin-default-7d4c654465-nq9ht
  Warning  FailedCreate      105s (x21 over 24m)  replicaset-controller  Error creating: Internal error occurred: failed calling webhook "vpa.k8s.io": Post "https://vpa-webhook.openshift-vertical-pod-autoscaler.svc:443/?timeout=10s": no endpoints available for service "vpa-webhook"
~~~

- The replicaset itself fails to create the replica because no endpoints are available for the vpa-webhook svc.
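To confirm the condition described above, one could check whether the vpa-webhook svc currently has any endpoint addresses. The following is only a sketch: the function names `webhook_endpoint_ips` and `webhook_has_no_endpoints` are hypothetical helpers, not part of `oc`, and the sketch assumes `oc` is logged in to the affected cluster.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the ready endpoint IPs behind a Service.
# Prints an empty string when the Service has no ready endpoints.
webhook_endpoint_ips() {
  local ns="$1" svc="$2"
  oc get endpoints "$svc" -n "$ns" \
    -o jsonpath='{.subsets[*].addresses[*].ip}'
}

# Hypothetical helper: succeed (exit 0) when the Service has no ready endpoints,
# which is the state reported in this bug.
webhook_has_no_endpoints() {
  local ns="$1" svc="$2"
  local ips
  ips="$(webhook_endpoint_ips "$ns" "$svc")"
  [ -z "$ips" ]
}
```

For example, `webhook_has_no_endpoints openshift-vertical-pod-autoscaler vpa-webhook && echo "vpa-webhook has no endpoints"` would print the message only while the svc is empty.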
Expected results:

The pods should be created without these messages and the upgrade should complete.

[ Workaround ]

Deleting the mutatingwebhookconfigurations helps to overcome the issue:

$ oc delete mutatingwebhookconfigurations vpa-webhook-config
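If the workaround is applied, it may be worth saving the webhook configuration before deleting it. A minimal sketch, assuming `oc` is logged in with sufficient privileges; `remove_webhook_config` and the backup path are hypothetical names chosen for illustration:

```shell
#!/usr/bin/env bash

# Hypothetical helper: back up a MutatingWebhookConfiguration, then delete it.
remove_webhook_config() {
  local name="$1" backup="$2"
  # Save the current definition first so it can be inspected or restored later.
  oc get mutatingwebhookconfigurations "$name" -o yaml > "$backup" || return 1
  # Same command as the workaround in this report.
  oc delete mutatingwebhookconfigurations "$name"
}
```

Usage would look like `remove_webhook_config vpa-webhook-config /tmp/vpa-webhook-config.yaml`. The delete is skipped if the backup step fails.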
*** Bug 1909982 has been marked as a duplicate of this bug. ***
This should be working in 4.7 already. We have a 4.6 PR open to fix it which we're working to merge: https://github.com/openshift/kubernetes-autoscaler/pull/186