Description of problem:
During the upgrade from 4.7.0-0.nightly-2021-05-17-040457 to 4.8.0-0.nightly-2021-05-19-092807, the upgrade fails with the authentication operator degraded.

Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2021-05-17-040457 to 4.8.0-0.nightly-2021-05-19-092807

How reproducible:
Not sure

Steps to Reproduce:
1.
2.
3.

Actual results:
The upgrade from 4.7.0-0.nightly-2021-05-17-040457 to 4.8.0-0.nightly-2021-05-19-092807 hangs with authentication degraded. oc describe co authentication shows:

Conditions:
  Last Transition Time:  2021-05-19T14:41:33Z
  Message:               OAuthServiceEndpointsCheckEndpointAccessibleControllerDegraded: Get "https://10.129.0.17:6443/healthz": context canceled
  Reason:                OAuthServiceEndpointsCheckEndpointAccessibleController_SyncError
  Status:                True
  Type:                  Degraded
  Last Transition Time:  2021-05-19T14:42:48Z
  Message:               All is well
  Reason:                AsExpected
  Status:                False
  Type:                  Progressing
  Last Transition Time:  2021-05-19T14:44:48Z
  Message:               All is well
  Reason:                AsExpected
  Status:                True
  Type:                  Available

According to the must-gather log, 10.129.0.17:6443 is the IP of a pod that belonged to openshift-authentication, but that pod was deleted at "May 19 14:40:35.293321", and new pods were created around "2021-05-19T14:40+" with new IPs (10.130.0.38, 10.128.0.53, 10.129.0.56). According to the upgrade CI log, the health check ran at '[2021-05-19T17:12:11.721Z]', more than 2 hours later, but it still used the previous pod IP (10.129.0.17) rather than one of the new pod IPs (10.130.0.38, 10.128.0.53, 10.129.0.56). That is why the health check fails.

must-gather log link: http://file.rdu.redhat.com/~xxia/bug_1967398_must-gather.local.5095653185111688673.tar.gz

Expected results:
The upgrade from 4.7.0-0.nightly-2021-05-17-040457 to 4.8.0-0.nightly-2021-05-19-092807 succeeds.

Additional info:
matrix: 27_UPI on GCP with RHCOS && XPN
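
The stale-IP diagnosis above can be checked mechanically. The following is a minimal sketch, not something taken from the operator or from the must-gather tooling: stale_endpoint_ip is a hypothetical helper that pulls the first IPv4 address out of a Degraded message and reports it if it is not among the current pod IPs. The commented-out oc calls are one assumed way of collecting the inputs on a live cluster.

```shell
# Hypothetical helper (an illustration, not part of oc or the operator).
# Usage: stale_endpoint_ip "<Degraded message>" <current pod IP>...
# Prints the IP found in the message and returns 0 if that IP is NOT among
# the current pod IPs (stale); returns 1 if it is current; returns 2 if the
# message contains no IPv4 address.
stale_endpoint_ip() {
  msg=$1
  shift
  # Take the first IPv4-looking token from the message.
  ip=$(printf '%s\n' "$msg" | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' | head -n 1)
  [ -n "$ip" ] || return 2
  for current in "$@"; do
    [ "$ip" = "$current" ] && return 1
  done
  printf '%s\n' "$ip"
}

# Assumed way to gather the inputs on a live cluster:
# msg=$(oc get co authentication -o jsonpath='{.status.conditions[?(@.type=="Degraded")].message}')
# ips=$(oc get pods -n openshift-authentication -o jsonpath='{.items[*].status.podIP}')
# stale_endpoint_ip "$msg" $ips && echo "health check targets a pod IP that no longer exists"
```

With the values from this report (message IP 10.129.0.17, current pod IPs 10.130.0.38, 10.128.0.53 and 10.129.0.56), the helper would flag 10.129.0.17 as stale.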
Tested the upgrade from 4.7.0-0.nightly-2021-06-07-095830 to 4.8.0-0.nightly-2021-06-07-180258:

$ oc adm upgrade --to-image=registry.ci.openshift.org/ocp/release:4.8.0-0.nightly-2021-06-07-180258 --force=true --allow-explicit-upgrade=true

During the upgrade, force-updated the oauth configuration 5 times to redeploy new pods with new IPs. The original issue, hanging on the old pod's IP, is gone.

$ oc edit oauth cluster

Checked the cluster version after the upgrade finished:

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-06-07-180258   True        False         123m    Cluster version is 4.8.0-0.nightly-2021-06-07-180258
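
The final verification step could also be scripted. This is a rough sketch under the assumption that the default table output of oc get clusterversion keeps the column order shown in this comment (NAME, VERSION, AVAILABLE, PROGRESSING). upgrade_complete is a hypothetical helper name, not an oc subcommand.

```shell
# Hypothetical helper (illustration only).
# Usage: upgrade_complete <target version> "<output of oc get clusterversion>"
# Returns 0 when some row reports the target version with AVAILABLE=True and
# PROGRESSING=False, and 1 otherwise. The header row never matches because
# its second column is the literal word VERSION.
upgrade_complete() {
  target=$1
  table=$2
  printf '%s\n' "$table" |
    awk -v t="$target" '
      $2 == t && $3 == "True" && $4 == "False" { ok = 1 }
      END { exit !ok }
    '
}

# Assumed usage on a live cluster:
# upgrade_complete 4.8.0-0.nightly-2021-06-07-180258 "$(oc get clusterversion)" \
#   && echo "upgrade finished"
```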
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438