Bug 1847351
| Summary: | Upgrade hanging due to unexpected on-disk state validating against rendered... | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | huirwang |
| Component: | Machine Config Operator | Assignee: | Antonio Murdaca <amurdaca> |
| Status: | CLOSED DUPLICATE | QA Contact: | Michael Nguyen <mnguyen> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.5 | CC: | walters |
| Target Milestone: | --- | Keywords: | TestBlocker |
| Target Release: | 4.5.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-06-16 10:30:22 UTC | Type: | Bug |

So, yesterday https://github.com/openshift/machine-config-operator/pull/1822 merged in 4.5, and your cluster is experiencing what that PR was trying to solve. Is this consistent and happening all the time?

The payload you're upgrading to contains the following MCO commit: https://github.com/openshift/machine-config-operator/commit/908117045fe9ef32662554ed9ed557b3c1e1a965

The fix I referenced above (PR https://github.com/openshift/machine-config-operator/pull/1822) is fixing this behavior.

Please use a newer payload and also take a look at some testing that went into the duplicate BZ http://bugzilla.redhat.com/show_bug.cgi?id=1846690 and https://bugzilla.redhat.com/show_bug.cgi?id=1842906#c55

*** This bug has been marked as a duplicate of bug 1846690 ***

Removing UpgradeBlocker from this older bug, to remove it from the suspect queue described in [1]. If you feel like this bug still needs to be a suspect, please add keyword again.

[1]: https://github.com/openshift/enhancements/pull/475

Description:
The upgrade from 4.4.8 to 4.5.0-0.nightly-2020-06-16-014907 gets stuck due to "unexpected on-disk state validating against rendered...".

Steps to Reproduce:
1. Install OCP 4.4.8 on bare metal.
2. Upgrade to 4.5.0-0.nightly-2020-06-16-014907 with the command:
   oc adm upgrade --to-image=registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-06-16-014907 --force=true --allow-explicit-upgrade=true

Result:
Two nodes are stuck in the SchedulingDisabled state.

oc get machineconfigpools.machineconfiguration.openshift.io
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-6d45441c2be99b0d129ec345f6b6e114   False     True       True       3              0                   0                     1                      6h23m
worker   rendered-worker-e4e32d87b18decf05a89b1af84fe9075   False     True       True       3              0                   0                     1                      6h23m

oc describe machineconfigpools.machineconfiguration.openshift.io master
Name:         master
Namespace:
Labels:       custom-kubelet=small-pods
              machineconfiguration.openshift.io/mco-built-in=
              operator.machineconfiguration.openshift.io/required-for-upgrade=
Annotations:  <none>
API Version:  machineconfiguration.openshift.io/v1
Kind:         MachineConfigPool
Metadata:
  Creation Timestamp:  2020-06-16T02:36:16Z
  Generation:          4
  Resource Version:    178817
  Self Link:           /apis/machineconfiguration.openshift.io/v1/machineconfigpools/master
  UID:                 4f35005c-1297-4952-8444-95003516e5ca
Spec:
  Configuration:
    Name:  rendered-master-9a5283ce51b09cb5b76bc796e74e5940
    Source:
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         00-master
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-master-container-runtime
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-master-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-master-4f35005c-1297-4952-8444-95003516e5ca-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-master-4f35005c-1297-4952-8444-95003516e5ca-registries
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-master-fips
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-master-ssh
  Machine Config Selector:
    Match Labels:
      machineconfiguration.openshift.io/role:  master
  Node Selector:
    Match Labels:
      node-role.kubernetes.io/master:
  Paused:  false
Status:
  Conditions:
    Last Transition Time:  2020-06-16T02:37:02Z
    Message:
    Reason:
    Status:                False
    Type:                  RenderDegraded
    Last Transition Time:  2020-06-16T08:02:36Z
    Message:
    Reason:
    Status:                False
    Type:                  Updated
    Last Transition Time:  2020-06-16T08:02:36Z
    Message:               All nodes are updating to rendered-master-9a5283ce51b09cb5b76bc796e74e5940
    Reason:
    Status:                True
    Type:                  Updating
    Last Transition Time:  2020-06-16T08:09:30Z
    Message:               Node huir-upg-jlqd2-control-plane-0 is reporting: "unexpected on-disk state validating against rendered-master-513bb8d25aeba1f69d4ccf1708ce5a6e"
    Reason:                1 nodes are reporting degraded status on sync
    Status:                True
    Type:                  NodeDegraded
    Last Transition Time:  2020-06-16T08:09:30Z
    Message:
    Reason:
    Status:                True
    Type:                  Degraded
  Configuration:
    Name:  rendered-master-6d45441c2be99b0d129ec345f6b6e114
    Source:
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         00-master
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-master-container-runtime
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-master-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-master-4f35005c-1297-4952-8444-95003516e5ca-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-master-4f35005c-1297-4952-8444-95003516e5ca-registries
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-master-fips
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-master-ssh
  Degraded Machine Count:     1
  Machine Count:              3
  Observed Generation:        4
  Ready Machine Count:        0
  Unavailable Machine Count:  1
  Updated Machine Count:      0
Events:  <none>

oc describe machineconfigpools.machineconfiguration.openshift.io worker
Name:         worker
Namespace:
Labels:       machineconfiguration.openshift.io/mco-built-in=
Annotations:  <none>
API Version:  machineconfiguration.openshift.io/v1
Kind:         MachineConfigPool
Metadata:
  Creation Timestamp:  2020-06-16T02:36:16Z
  Generation:          3
  Resource Version:    177371
  Self Link:           /apis/machineconfiguration.openshift.io/v1/machineconfigpools/worker
  UID:                 690b3f0d-6b01-433d-89a1-76ecae732a0e
Spec:
  Configuration:
    Name:  rendered-worker-5b16814212becca0a0e2432af89fbeb5
    Source:
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         00-worker
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-worker-container-runtime
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-worker-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-690b3f0d-6b01-433d-89a1-76ecae732a0e-registries
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-fips
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-ssh
  Machine Config Selector:
    Match Labels:
      machineconfiguration.openshift.io/role:  worker
  Node Selector:
    Match Labels:
      node-role.kubernetes.io/worker:
  Paused:  false
Status:
  Conditions:
    Last Transition Time:  2020-06-16T02:37:02Z
    Message:
    Reason:
    Status:                False
    Type:                  RenderDegraded
    Last Transition Time:  2020-06-16T08:02:36Z
    Message:
    Reason:
    Status:                False
    Type:                  Updated
    Last Transition Time:  2020-06-16T08:02:36Z
    Message:               All nodes are updating to rendered-worker-5b16814212becca0a0e2432af89fbeb5
    Reason:
    Status:                True
    Type:                  Updating
    Last Transition Time:  2020-06-16T08:07:49Z
    Message:               Node huir-upg-jlqd2-compute-2 is reporting: "unexpected on-disk state validating against rendered-worker-e4e32d87b18decf05a89b1af84fe9075"
    Reason:                1 nodes are reporting degraded status on sync
    Status:                True
    Type:                  NodeDegraded
    Last Transition Time:  2020-06-16T08:07:49Z
    Message:
    Reason:
    Status:                True
    Type:                  Degraded
  Configuration:
    Name:  rendered-worker-e4e32d87b18decf05a89b1af84fe9075
    Source:
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         00-worker
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-worker-container-runtime
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-worker-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-690b3f0d-6b01-433d-89a1-76ecae732a0e-registries
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-fips
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-ssh
  Degraded Machine Count:     1
  Machine Count:              3
  Observed Generation:        3
  Ready Machine Count:        0
  Unavailable Machine Count:  1
  Updated Machine Count:      0
Events:  <none>

Expected results:
The upgrade succeeds.
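
For anyone triaging a similar hang, the pool status in the report can be checked mechanically. The following is only a rough sketch, not part of the MCO or of `oc`: it parses the text of `oc get machineconfigpools` as pasted in this report and lists the pools whose DEGRADED column is True. The function names `parse_pools` and `degraded_pools` are hypothetical, and the sample text is copied from the output in this report.

```python
# Sample copied from the `oc get machineconfigpools` output in this report.
SAMPLE = """\
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-6d45441c2be99b0d129ec345f6b6e114   False     True       True       3              0                   0                     1                      6h23m
worker   rendered-worker-e4e32d87b18decf05a89b1af84fe9075   False     True       True       3              0                   0                     1                      6h23m
"""


def parse_pools(text):
    """Turn the whitespace-separated table into a list of dicts keyed by column name.

    Assumption: no cell is empty, so splitting on whitespace keeps columns aligned.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def degraded_pools(pools):
    """Return the names of pools whose DEGRADED column reads True."""
    return [pool["NAME"] for pool in pools if pool["DEGRADED"] == "True"]


if __name__ == "__main__":
    pools = parse_pools(SAMPLE)
    for name in degraded_pools(pools):
        print(f"pool {name} is degraded")
```

Splitting on whitespace is the simplest choice for pasted output; a script with live cluster access would more reliably read structured output instead of the table.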