Bug 2076793
Summary: | CVO exits upgrade immediately rather than waiting for etcd backup | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Jack Ottofaro <jack.ottofaro> |
Component: | Etcd | Assignee: | W. Trevor King <wking> |
Status: | CLOSED ERRATA | QA Contact: | Yang Yang <yanyang> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 4.10 | CC: | alray, emoss, jack.ottofaro, jhou, lmohanty, lxia, wking, yanyang |
Target Milestone: | --- | Keywords: | Upgrades |
Target Release: | 4.11.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | No Doc Update |
Doc Text: | | Story Points: | --- |
Clone Of: | 2072389 | Environment: | |
Last Closed: | 2022-08-10 11:07:44 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 2072389, 2083370 | ||
Bug Blocks: | 2079660 |
Description
Jack Ottofaro
2022-04-19 20:32:09 UTC
As of https://github.com/openshift/cluster-version-operator/pull/683, CVO no longer sets Failing=true when the preconditions, including the etcd backup precondition, fail. Instead, CVO now sets the ReleaseAccepted condition to indicate whether the payload has been successfully loaded. Etcd should therefore check for ReleaseAccepted!=true instead.

Verified with 4.11.0-0.nightly-2022-04-26-030643: the upgrade from 4.10.11 to 4.11.0-0.nightly-2022-04-26-030643 succeeded.

Verifying with 4.11.0-0.nightly-2022-04-26-181148 by patching the cv status to change ReleaseAccepted to False.

Before patching the cv status:

# oc get co/etcd -oyaml
  - lastTransitionTime: "2022-04-27T05:50:20Z"
    reason: ControllerStarted
    status: Unknown
    type: RecentBackup

Patching the cv to change ReleaseAccepted to False:

# oc scale --replicas 0 -n openshift-cluster-version deployments/cluster-version-operator
deployment.apps/cluster-version-operator scaled
# oc proxy &
# curl -k -XPATCH -H "Accept: application/json" -H "Content-Type: application/json-patch+json" 'http://127.0.0.1:8001/apis/config.openshift.io/v1/clusterversions/version/status' -d '[{"op": "add", "path": "/status/conditions", "value": [{"type":"ReleaseAccepted", "status": "False", "reason": "UpgradePreconditionCheckFailed", "message": "EtcdRecentBackup failed", "lastTransitionTime": "2022-04-27T18:25:51Z"}]}]'
{
  "apiVersion": "config.openshift.io/v1",
  "kind": "ClusterVersion",
  "metadata": {
    "creationTimestamp": "2022-04-27T05:47:06Z",
    "generation": 4,
    "managedFields": [
      {
        "apiVersion": "config.openshift.io/v1",
        "fieldsType": "FieldsV1",
        "fieldsV1": {
          "f:spec": {
            ".": {},
            "f:channel": {},
            "f:clusterID": {}
          }
        },
        "manager": "cluster-bootstrap",
        "operation": "Update",
        "time": "2022-04-27T05:47:06Z"
      },
      {
        "apiVersion": "config.openshift.io/v1",
        "fieldsType": "FieldsV1",
        "fieldsV1": {
          "f:status": {
            ".": {},
            "f:availableUpdates": {},
            "f:capabilities": {
              ".": {},
              "f:enabledCapabilities": {},
              "f:knownCapabilities": {}
            },
            "f:desired": {
              ".": {},
              "f:image": {},
              "f:version": {}
            },
            "f:history": {},
            "f:observedGeneration": {},
            "f:versionHash": {}
          }
        },
        "manager": "cluster-version-operator",
        "operation": "Update",
        "subresource": "status",
        "time": "2022-04-27T05:47:10Z"
      },
      {
        "apiVersion": "config.openshift.io/v1",
        "fieldsType": "FieldsV1",
        "fieldsV1": {
          "f:status": {
            "f:conditions": {}
          }
        },
        "manager": "curl",
        "operation": "Update",
        "subresource": "status",
        "time": "2022-04-27T12:42:23Z"
      }
    ],
    "name": "version",
    "resourceVersion": "165197",
    "uid": "f1924212-4134-4bfb-a860-b24d8e084bad"
  },
  "spec": {
    "channel": "stable-4.11",
    "clusterID": "09edcc03-502b-4f63-81f3-d307a002253f"
  },
  "status": {
    "availableUpdates": null,
    "capabilities": {
      "enabledCapabilities": [
        "baremetal",
        "marketplace",
        "openshift-samples"
      ],
      "knownCapabilities": [
        "baremetal",
        "marketplace",
        "openshift-samples"
      ]
    },
    "conditions": [
      {
        "lastTransitionTime": "2022-04-27T18:25:51Z",
        "message": "EtcdRecentBackup failed",
        "reason": "UpgradePreconditionCheckFailed",
        "status": "False",
        "type": "ReleaseAccepted"
      }
    ],
    "desired": {
      "image": "registry.ci.openshift.org/ocp/release@sha256:30452e14cbefed21f883ac38652b9dbaf653a922a1ca0efd6f3a1a10acfc2e1c",
      "version": "4.11.0-0.nightly-2022-04-26-181148"
    },
    "history": [
      {
        "completionTime": "2022-04-27T06:07:32Z",
        "image": "registry.ci.openshift.org/ocp/release@sha256:30452e14cbefed21f883ac38652b9dbaf653a922a1ca0efd6f3a1a10acfc2e1c",
        "startedTime": "2022-04-27T05:47:10Z",
        "state": "Completed",
        "verified": false,
        "version": "4.11.0-0.nightly-2022-04-26-181148"
      }
    ],
    "observedGeneration": 2,
    "versionHash": "QNLRulmodCo="
  }
}

After patching the cv status:

# oc get co/etcd -oyaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  annotations:
    exclude.release.openshift.io/internal-openshift-hosted: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
  creationTimestamp: "2022-04-27T05:47:10Z"
  generation: 1
  name: etcd
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: f1924212-4134-4bfb-a860-b24d8e084bad
  resourceVersion: "165237"
  uid: 1ebee225-51a9-4de3-9b2d-1a1c9d240a4c
spec: {}
status:
  conditions:
  - lastTransitionTime: "2022-04-27T05:59:50Z"
    message: |-
      NodeControllerDegraded: All master nodes are ready
      EtcdMembersDegraded: No unhealthy members found
    reason: AsExpected
    status: "False"
    type: Degraded
  - lastTransitionTime: "2022-04-27T06:10:04Z"
    message: |-
      EtcdMembersProgressing: No unstarted etcd members found
      NodeInstallerProgressing: 3 nodes are at revision 8
    reason: AsExpected
    status: "False"
    type: Progressing
  - lastTransitionTime: "2022-04-27T05:52:50Z"
    message: |-
      EtcdMembersAvailable: 3 members are available
      StaticPodsAvailable: 3 nodes are active; 3 nodes are at revision 8
    reason: AsExpected
    status: "True"
    type: Available
  - lastTransitionTime: "2022-04-27T05:50:19Z"
    message: All is well
    reason: AsExpected
    status: "True"
    type: Upgradeable
  - lastTransitionTime: "2022-04-27T12:42:29Z"
    message: UpgradeBackup pre 4.9 located at path /etc/kubernetes/cluster-backup/upgrade-backup-2022-04-27_124223 on node "yanyang-0427a-j7zrw-master-0.c.openshift-qe.internal"
    reason: UpgradeBackupSuccessful
    status: "True"
    type: RecentBackup
  extension: null
  relatedObjects:
  - group: operator.openshift.io
    name: cluster
    resource: etcds
  - group: ""
    name: openshift-config
    resource: namespaces
  - group: ""
    name: openshift-config-managed
    resource: namespaces
  - group: ""
    name: openshift-etcd-operator
    resource: namespaces
  - group: ""
    name: openshift-etcd
    resource: namespaces
  versions:
  - name: raw-internal
    version: 4.11.0-0.nightly-2022-04-26-181148
  - name: etcd
    version: 4.11.0-0.nightly-2022-04-26-181148
  - name: operator
    version: 4.11.0-0.nightly-2022-04-26-181148

Etcd RecentBackup goes to True. Looks good to me.
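
For readers who want to reason about the check outside a cluster, here is a minimal Python sketch of the condition logic described above. It is not the etcd operator's actual code. The helper names (find_condition, release_not_accepted, release_accepted_patch) are hypothetical, and treating a missing ReleaseAccepted condition as "nothing to react to" is an assumption rather than something this report states. The patch builder only reproduces the shape of the JSON Patch body passed to curl in the verification steps.

```python
import json
from typing import Dict, List, Optional


def find_condition(conditions: List[Dict], cond_type: str) -> Optional[Dict]:
    """Return the first condition whose "type" matches, or None."""
    for cond in conditions:
        if cond.get("type") == cond_type:
            return cond
    return None


def release_not_accepted(conditions: List[Dict]) -> bool:
    """True when ReleaseAccepted is present and its status is not "True".

    Assumption: an absent condition returns False (no reaction).
    """
    cond = find_condition(conditions, "ReleaseAccepted")
    return cond is not None and cond.get("status") != "True"


def release_accepted_patch(status: str, reason: str, message: str, when: str) -> str:
    """Build a JSON Patch body shaped like the one used in the curl call."""
    patch = [{
        "op": "add",
        "path": "/status/conditions",
        "value": [{
            "type": "ReleaseAccepted",
            "status": status,
            "reason": reason,
            "message": message,
            "lastTransitionTime": when,
        }],
    }]
    return json.dumps(patch)


if __name__ == "__main__":
    body = release_accepted_patch(
        "False",
        "UpgradePreconditionCheckFailed",
        "EtcdRecentBackup failed",
        "2022-04-27T18:25:51Z",
    )
    # The patch replaces the whole conditions list with this single entry.
    patched_conditions = json.loads(body)[0]["value"]
    print(release_not_accepted(patched_conditions))
    print(release_not_accepted([]))
```

Run as a script, this prints True for the patched conditions and False for an empty list. Note that an "add" on /status/conditions replaces the entire list, which is why the patched ClusterVersion in the verification output carries only the ReleaseAccepted condition.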
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069