Bug 1868376
Summary: cloud-credential operator pod is in CrashLoopBackOff and blocking downgrade from 4.6 to 4.5

Product: OpenShift Container Platform
Component: Cloud Credential Operator
Version: 4.5
Target Release: 4.6.0
Hardware: All
OS: All
Status: CLOSED ERRATA
Severity: high
Priority: high
Keywords: TestBlocker
Reporter: pmali
Assignee: Joel Diaz <jdiaz>
QA Contact: wang lin <lwan>
CC: gshereme, jdiaz, lwan, sandeepredhat, sdodson, wking, xxia
Doc Type: Bug Fix
Doc Text:
  Cause: When moving from 4.5 to 4.6, some fields that were left at their defaults in 4.5 are explicitly specified in 4.6.
  Consequence: This breaks the ability to downgrade from 4.6 to 4.5.
  Fix: Rather than leaving the fields unspecified in 4.5, explicitly specify the default values so that on a downgrade attempt those fields are restored to their correct 4.5 values.
  Result: Downgrading from 4.6 to 4.5 can succeed.
Clones: 1873345 (view as bug list)
Bug Blocks: 1860922, 1873345
Type: Bug
Last Closed: 2020-10-27 16:28:06 UTC
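The fix described in the Doc Text amounts to pinning the previously-defaulted field in the 4.5 manifest so that a downgrade rewrites it. A minimal sketch of the idea, not the actual release-4.5 manifest: the file path, container name, and image are illustrative; `serviceAccountName: default` is the value the 4.5 operator actually uses, per the discussion in this report.

```shell
# Illustrative fragment of a 4.5-style Deployment manifest with the
# previously-implicit ServiceAccount pinned explicitly (not the real release file).
cat > /tmp/cco-deployment-4.5.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cloud-credential-operator
  namespace: openshift-cloud-credential-operator
spec:
  template:
    spec:
      # Explicitly pin the 4.5 default so a 4.6 -> 4.5 downgrade
      # overwrites the 4.6-only "cloud-credential-operator" ServiceAccount.
      serviceAccountName: default
      containers:
      - name: cloud-credential-operator
        image: quay.io/example/cloud-credential-operator:4.5  # placeholder image
EOF

# Because the field is now present in the manifest, applying it during a
# downgrade resets serviceAccountName instead of leaving the 4.6 value behind.
grep 'serviceAccountName:' /tmp/cco-deployment-4.5.yaml
```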
Description (pmali, 2020-08-12 12:50:37 UTC)
will investigate next sprint

Moving to 4.5. The issue appears to be that the Deployment in the release-4.5 branch for cloud-cred-operator doesn't specify a ServiceAccount, while the one in 4.6 does specify one (a ServiceAccount that is new to 4.6). After a downgrade to 4.5, the cloud-credential-operator Deployment still references the orphaned ServiceAccount (named "cloud-credential-operator") instead of the ServiceAccount named "default" (the one actually used in 4.5).

Can you please confirm that this is a test case where you started with 4.5, upgraded to 4.6, then downgraded back to 4.5?

To keep Eric's bot happy, we'll probably want to move this bug to MODIFIED so we can VERIFY that a 4.6 -> 4.6 downgrade does not crash-loop the cred operator. Then we can clone back to a bug targeting 4.5.z and actually fix it.

When downgrading 4.6 -> 4.6, the CCO does not crash. Downgrade from 4.6.0-0.nightly-2020-08-26-215737 to 4.6.0-0.nightly-2020-08-21-084833:

```
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-08-26-215737   True        True          9s      Working towards registry.svc.ci.openshift.org/ocp/release:4.6.0-0.nightly-2020-08-21-084833: downloading update

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-08-21-084833   True        False         6m23s   Cluster version is 4.6.0-0.nightly-2020-08-21-084833

$ oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.6.0-0.nightly-2020-08-21-084833   True        False         False      8m56s
cloud-credential                           4.6.0-0.nightly-2020-08-21-084833   True        False         False      139m
cluster-autoscaler                         4.6.0-0.nightly-2020-08-21-084833   True        False         False      127m
config-operator                            4.6.0-0.nightly-2020-08-21-084833   True        False         False      131m
console                                    4.6.0-0.nightly-2020-08-21-084833   True        False         False      11m
csi-snapshot-controller                    4.6.0-0.nightly-2020-08-21-084833   True        False         False      11m
dns                                        4.6.0-0.nightly-2020-08-21-084833   True        False         False      22m
etcd                                       4.6.0-0.nightly-2020-08-21-084833   True        False         False      129m
image-registry                             4.6.0-0.nightly-2020-08-21-084833   True        False         False      122m
ingress                                    4.6.0-0.nightly-2020-08-21-084833   True        False         False      122m
insights                                   4.6.0-0.nightly-2020-08-21-084833   True        False         False      127m
kube-apiserver                             4.6.0-0.nightly-2020-08-21-084833   True        False         False      129m
kube-controller-manager                    4.6.0-0.nightly-2020-08-21-084833   True        False         False      128m
kube-scheduler                             4.6.0-0.nightly-2020-08-21-084833   True        False         False      128m
kube-storage-version-migrator              4.6.0-0.nightly-2020-08-21-084833   True        False         False      11m
machine-api                                4.6.0-0.nightly-2020-08-21-084833   True        False         False      123m
machine-approver                           4.6.0-0.nightly-2020-08-21-084833   True        False         False      127m
machine-config                             4.6.0-0.nightly-2020-08-21-084833   True        False         False      8m16s
marketplace                                4.6.0-0.nightly-2020-08-21-084833   True        False         False      12m
monitoring                                 4.6.0-0.nightly-2020-08-21-084833   True        False         False      121m
network                                    4.6.0-0.nightly-2020-08-21-084833   True        False         False      122m
node-tuning                                4.6.0-0.nightly-2020-08-21-084833   True        False         False      32m
openshift-apiserver                        4.6.0-0.nightly-2020-08-21-084833   True        False         False      9m5s
openshift-controller-manager               4.6.0-0.nightly-2020-08-21-084833   True        False         False      125m
openshift-samples                          4.6.0-0.nightly-2020-08-21-084833   True        False         False      32m
operator-lifecycle-manager                 4.6.0-0.nightly-2020-08-21-084833   True        False         False      130m
operator-lifecycle-manager-catalog         4.6.0-0.nightly-2020-08-21-084833   True        False         False      130m
operator-lifecycle-manager-packageserver   4.6.0-0.nightly-2020-08-21-084833   True        False         False      9m7s
service-ca                                 4.6.0-0.nightly-2020-08-21-084833   True        False         False      131m
storage                                    4.6.0-0.nightly-2020-08-21-084833   True        False         False      11m

$ oc get pods -n openshift-cloud-credential-operator
NAME                                         READY   STATUS    RESTARTS   AGE
cloud-credential-operator-869b565fc5-gcws4   2/2     Running   0          14m
pod-identity-webhook-7f99757f4c-nj7tq        1/1     Running   0          14m
```

(In reply to Scott Dodson from comment #5)
> Can you please confirm that this is a test case where you started with 4.5, upgraded to 4.6, then downgraded back to 4.5?

Yes.
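The orphaned-ServiceAccount state described above shows up directly in the Deployment spec. On a live cluster the check would be `oc get deployment/cloud-credential-operator -n openshift-cloud-credential-operator -o jsonpath='{.spec.template.spec.serviceAccountName}'`; the offline sketch below uses a stand-in manifest (field values illustrative) to show what to look for after a 4.6 -> 4.5 downgrade.

```shell
# Stand-in for the Deployment as it looks after a 4.6 -> 4.5 downgrade
# (illustrative; on a cluster, use the oc jsonpath query instead).
cat > /tmp/cco-after-downgrade.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cloud-credential-operator
spec:
  template:
    spec:
      serviceAccountName: cloud-credential-operator
EOF

# Extract the referenced ServiceAccount; anything other than "default"
# on a 4.5 cluster indicates the orphaned 4.6 reference.
sa=$(awk '/serviceAccountName:/ {print $2}' /tmp/cco-after-downgrade.yaml)
echo "ServiceAccount in use: $sa"
if [ "$sa" != "default" ]; then
  echo "orphaned 4.6 ServiceAccount reference detected"
fi
```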
I tried today, still hit. I launched a 4.5.0-0.nightly-2020-08-31-101523 IPI GCP env, upgraded successfully to 4.6.0-0.nightly-2020-09-01-042030, then tried the downgrade back to 4.5.0-0.nightly-2020-08-31-101523 and hit it. The fix hasn't been backported to 4.5, so downgrading from 4.6 to 4.5 still hits this issue.

Isn't this a verified 4.6 -> 4.6 downgrade? Refer to https://bugzilla.redhat.com/show_bug.cgi?id=1868376#c6. And this one (https://bugzilla.redhat.com/show_bug.cgi?id=1873345) is the actual fix for this downgrade issue. Is my understanding wrong?

Sorry, I didn't notice carefully that there was already a 4.5 clone.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196
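For reference, the CrashLoopBackOff symptom named in the summary can be spotted mechanically in pod listings like the ones pasted in this report. An offline sketch, with sample output modeled on those listings (pod names and ages illustrative):

```shell
# Sample `oc get pods -n openshift-cloud-credential-operator` output,
# shaped like the listings in this report (names/ages illustrative).
pods='NAME                                         READY   STATUS             RESTARTS   AGE
cloud-credential-operator-869b565fc5-gcws4   1/2     CrashLoopBackOff   7          14m
pod-identity-webhook-7f99757f4c-nj7tq        1/1     Running            0          14m'

# Flag any pod whose STATUS column reports CrashLoopBackOff
# (skip the header row, test the third whitespace-separated field).
echo "$pods" | awk 'NR > 1 && $3 == "CrashLoopBackOff" {print $1 " is crash-looping"}'
# prints: cloud-credential-operator-869b565fc5-gcws4 is crash-looping
```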