Bug 2024900 - Operator upgrade kube-apiserver
Summary: Operator upgrade kube-apiserver
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.10.0
Assignee: Arda Guclu
QA Contact: Eldar Weiss
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-11-19 13:12 UTC by Devan Goodwin
Modified: 2022-03-10 16:30 UTC (History)
CC List: 8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-10 16:29:41 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-baremetal-operator pull 217 | 0 | None | open | Bug 2024900: Not enable CBO webhook in unsupported platform | 2021-11-19 15:07:16 UTC
Red Hat Product Errata | RHSA-2022:0056 | 0 | None | None | None | 2022-03-10 16:30:04 UTC

Description Devan Goodwin 2021-11-19 13:12:48 UTC
The test "Operator upgrade kube-apiserver" has begun failing frequently in CI; see:
https://sippy.ci.openshift.org/sippy-ng/tests/4.10/analysis?test=Operator%20upgrade%20kube-apiserver

Aggregated jobs on CI payloads appear to have caught a regression in this test: it historically passed 100% of the time and now fails 20-30% of the time.

A good sample prow job would be:

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-azure-upgrade/1461619400548290560

Test failure looks as follows:

Failed to upgrade kube-apiserver, operator was degraded (ValidatingAdmissionWebhookConfiguration_WebhookServiceConnectionError): ValidatingAdmissionWebhookConfigurationDegraded: vprovisioning.kb.io: dial tcp 172.30.203.253:443: connect: connection refused

It is often accompanied by:

operator conditions kube-apiserver (0s)
Operator degraded (ValidatingAdmissionWebhookConfiguration_WebhookServiceConnectionError): ValidatingAdmissionWebhookConfigurationDegraded: vprovisioning.kb.io: dial tcp 172.30.203.253:443: connect: connection refused
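
For triage across many job runs, it can help to pull the individual fields out of a message like the one above. The following is only a minimal sketch: the `WebhookFailure` type, the `parse_webhook_failure` helper, and the regular expression are hypothetical and assume the message always has the shape seen in this report (condition reason in parentheses, condition type, webhook name, then a `dial tcp` error). It is not part of any OpenShift tooling.

```python
import re
from typing import NamedTuple, Optional


class WebhookFailure(NamedTuple):
    """Fields extracted from a degraded-operator message (hypothetical type)."""
    reason: str      # e.g. ..._WebhookServiceConnectionError
    condition: str   # e.g. ValidatingAdmissionWebhookConfigurationDegraded
    webhook: str     # e.g. vprovisioning.kb.io
    address: str     # service IP the dial was attempted against
    port: int
    error: str       # trailing error text, e.g. "connect: connection refused"


# Assumed message shape, based only on the two samples in this report.
_PATTERN = re.compile(
    r"\((?P<reason>[A-Za-z_]+)\):\s*"
    r"(?P<condition>\w+):\s*"
    r"(?P<webhook>[\w.\-]+):\s*"
    r"dial tcp (?P<address>[\d.]+):(?P<port>\d+):\s*"
    r"(?P<error>.+)$"
)


def parse_webhook_failure(message: str) -> Optional[WebhookFailure]:
    """Return the parsed fields, or None if the message has another shape."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    return WebhookFailure(
        reason=match["reason"],
        condition=match["condition"],
        webhook=match["webhook"],
        address=match["address"],
        port=int(match["port"]),
        error=match["error"].strip(),
    )
```

Grouping parsed results by `webhook` would show whether every failure points at vprovisioning.kb.io or whether other webhooks are affected as well.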

The problem appears to have begun last night, somewhere around this CI release:

https://amd64.ocp.releases.ci.openshift.org/releasestream/4.10.0-0.ci/release/4.10.0-0.ci-2021-11-19-045525

This payload did contain a kube apiserver operator change:

cluster-kube-apiserver-operator

    set kube-apiserver degraded=true if a webhook service is missing or down #1245
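
To illustrate the kind of check that change describes, here is a rough sketch, not the operator's actual code (the operator is written in Go and its implementation is not shown in this report). It assumes a check that simply attempts a TCP connection to each webhook's service and reports a degraded condition if any connection fails. The function names, the `webhooks` mapping, and the returned dictionary layout are all hypothetical.

```python
import socket


def webhook_service_reachable(host, port, timeout=2.0):
    """Try a plain TCP connect; return (ok, error_text)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, ""
    except OSError as exc:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False, f"dial tcp {host}:{port}: {exc}"


def degraded_condition(webhooks, timeout=2.0):
    """Build a condition from a mapping of webhook name -> (host, port).

    Status is "True" (degraded) if any webhook service cannot be reached.
    """
    errors = []
    for name, (host, port) in sorted(webhooks.items()):
        ok, err = webhook_service_reachable(host, port, timeout)
        if not ok:
            errors.append(f"{name}: {err}")
    return {
        "type": "ValidatingAdmissionWebhookConfigurationDegraded",
        "status": "True" if errors else "False",
        "message": "; ".join(errors),
    }
```

Under this assumption, a webhook registered for a service that has no running backend (as the linked cluster-baremetal-operator pull request suggests for unsupported platforms) would fail the connect and mark the operator degraded, which matches the failures seen during upgrade.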

Comment 3 Devan Goodwin 2021-11-19 13:43:16 UTC
The problem has likely been around for a while, but new checks went in last night and caught it. https://github.com/openshift/cluster-kube-apiserver-operator/pull/1256 has been opened to revert the new checks while a proper fix is pursued.

We are reverting so we can get payloads flowing again. The checks look great; we just need to solve this problem before they can go back in.

Comment 10 errata-xmlrpc 2022-03-10 16:29:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

