Bug 1909982 - Cluster upgrade fails because of vpa webhook
Summary: Cluster upgrade fails because of vpa webhook
Keywords:
Status: CLOSED DUPLICATE of bug 1909983
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Joel Smith
QA Contact: Sunil Choudhary
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-12-22 08:47 UTC by Apoorva Jagtap
Modified: 2024-06-13 23:47 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-01-04 19:35:08 UTC
Target Upstream Version:
Embargoed:


Description Apoorva Jagtap 2020-12-22 08:47:54 UTC
Description of problem:

While the upgrade is in progress, pods in the `openshift-vertical-pod-autoscaler` namespace fail to be created, with the message `no endpoints available for service "vpa-webhook"`. The vpa-admission-plugin-default pod itself fails to be created with the same error, which results in a deadlock: the webhook cannot be served until that pod exists, and the pod cannot be created until the webhook is served.


Steps to Reproduce:
1. Install VPA operator.
2. Start an upgrade from OCP v4.6.x to v4.6.y

Actual results:

During the upgrade, the pod listing for the namespace shows no vpa-admission-plugin-default pod:
~~~
$ oc get pod -n openshift-vertical-pod-autoscaler 
NAME                                                    READY   STATUS    RESTARTS   AGE
pod/vertical-pod-autoscaler-operator-6c64cd877b-46rmd   1/1     Running   0          14h
pod/vpa-recommender-default-649f9f4479-jd4jx            1/1     Running   0          13h
pod/vpa-updater-default-59bf95f4db-bvwld                1/1     Running   0          13h
~~~ 
- The endpoints of the vpa-webhook service are the IPs of the vpa-admission-plugin-default pods. Since no such pod is running, the service has no endpoints, which produces the 'no endpoints available for service "vpa-webhook"' error.
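
One way to confirm that the service has no backing pods is to query its Endpoints object directly. The following is a minimal sketch, assuming `oc` is logged in to the affected cluster; the helper names `service_endpoint_ips` and `has_endpoints` are hypothetical and not part of any tool.

```shell
# Namespace taken from the report above.
NS=openshift-vertical-pod-autoscaler

# Print the ready endpoint IPs behind a service (empty output means none).
# Hypothetical helper name.
service_endpoint_ips() {
  oc get endpoints "$1" -n "$NS" -o jsonpath='{.subsets[*].addresses[*].ip}'
}

# Exit 0 only if the service has at least one ready endpoint.
# Hypothetical helper name.
has_endpoints() {
  [ -n "$(service_endpoint_ips "$1")" ]
}
```

For example, `has_endpoints vpa-webhook || echo "vpa-webhook has no endpoints"` would be expected to print the message while the cluster is in the state described here.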

[*] Description of the ReplicaSet for the vpa-admission-plugin-default pod:
~~~
$ oc describe replicaset.apps/vpa-admission-plugin-default-7d4c654465
Name:           vpa-admission-plugin-default-7d4c654465
Namespace:      openshift-vertical-pod-autoscaler
Selector:       app=vpa-admission-controller,pod-template-hash=7d4c654465,vertical-pod-autoscaler=default
...
Controlled By:  Deployment/vpa-admission-plugin-default
Replicas:       0 current / 1 desired
Pods Status:    0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Events:
  Type     Reason            Age                  From                   Message
  ----     ------            ----                 ----                   -------
  Normal   SuccessfulCreate  145m                 replicaset-controller  Created pod: vpa-admission-plugin-default-7d4c654465-nq9ht
  Warning  FailedCreate      105s (x21 over 24m)  replicaset-controller  Error creating: Internal error occurred: failed calling webhook "vpa.k8s.io": Post "https://vpa-webhook.openshift-vertical-pod-autoscaler.svc:443/?timeout=10s": no endpoints available for service "vpa-webhook"
~~~
- The ReplicaSet cannot create its replica because the vpa-webhook service has no endpoints, yet that replica is the pod that would provide them.
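
To see which webhook service is blocking pod creation without reading the full event table, the error text can be filtered out of the `oc describe` output. This is a minimal sketch, assuming the message format shown in the events above; the function name `blocked_services` is hypothetical.

```shell
# Read `oc describe` output on stdin and print each distinct service named
# in a "no endpoints available for service" error. Hypothetical helper name.
blocked_services() {
  sed -n 's/.*no endpoints available for service "\([^"]*\)".*/\1/p' | sort -u
}
```

Piping the describe output through it, as in `oc describe replicaset.apps/vpa-admission-plugin-default-7d4c654465 -n openshift-vertical-pod-autoscaler | blocked_services`, should reduce the events above to the single service name they mention.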


Expected results:

The pods should be created without these errors, and the upgrade should complete.

Comment 1 Neelesh Agrawal 2021-01-04 19:35:08 UTC

*** This bug has been marked as a duplicate of bug 1909983 ***

