Bug 2076793
| Summary: | CVO exits upgrade immediately rather than waiting for etcd backup | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Jack Ottofaro <jack.ottofaro> |
| Component: | Etcd | Assignee: | W. Trevor King <wking> |
| Status: | CLOSED ERRATA | QA Contact: | Yang Yang <yanyang> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 4.10 | CC: | alray, emoss, jack.ottofaro, jhou, lmohanty, lxia, wking, yanyang |
| Target Milestone: | --- | Keywords: | Upgrades |
| Target Release: | 4.11.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 2072389 | Environment: | |
| Last Closed: | 2022-08-10 11:07:44 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 2072389, 2083370 | ||
| Bug Blocks: | 2079660 | ||
Description
Jack Ottofaro
2022-04-19 20:32:09 UTC
As of https://github.com/openshift/cluster-version-operator/pull/683, CVO no longer sets Failing=true when a precondition, including the etcd backup precondition, fails. CVO now sets the ReleaseAccepted condition to indicate whether the payload has been successfully loaded. Etcd should now check for ReleaseAccepted!=true instead.
Verified with 4.11.0-0.nightly-2022-04-26-030643: the upgrade from 4.10.11 to 4.11.0-0.nightly-2022-04-26-030643 succeeded.
Verifying with 4.11.0-0.nightly-2022-04-26-181148 by patching the cv status to set ReleaseAccepted to false.
Before patching cv status
# oc get co/etcd -oyaml
  - lastTransitionTime: "2022-04-27T05:50:20Z"
    reason: ControllerStarted
    status: Unknown
    type: RecentBackup
Patching cv to change ReleaseAccepted to false
# oc scale --replicas 0 -n openshift-cluster-version deployments/cluster-version-operator
deployment.apps/cluster-version-operator scaled
# oc proxy &
# curl -k -XPATCH -H "Accept: application/json" -H "Content-Type: application/json-patch+json" 'http://127.0.0.1:8001/apis/config.openshift.io/v1/clusterversions/version/status' -d '[{"op": "add", "path": "/status/conditions", "value": [{"type":"ReleaseAccepted", "status": "False", "reason": "UpgradePreconditionCheckFailed", "message": "EtcdRecentBackup failed", "lastTransitionTime": "2022-04-27T18:25:51Z"}]}]'
{
"apiVersion": "config.openshift.io/v1",
"kind": "ClusterVersion",
"metadata": {
"creationTimestamp": "2022-04-27T05:47:06Z",
"generation": 4,
"managedFields": [
{
"apiVersion": "config.openshift.io/v1",
"fieldsType": "FieldsV1",
"fieldsV1": {
"f:spec": {
".": {},
"f:channel": {},
"f:clusterID": {}
}
},
"manager": "cluster-bootstrap",
"operation": "Update",
"time": "2022-04-27T05:47:06Z"
},
{
"apiVersion": "config.openshift.io/v1",
"fieldsType": "FieldsV1",
"fieldsV1": {
"f:status": {
".": {},
"f:availableUpdates": {},
"f:capabilities": {
".": {},
"f:enabledCapabilities": {},
"f:knownCapabilities": {}
},
"f:desired": {
".": {},
"f:image": {},
"f:version": {}
},
"f:history": {},
"f:observedGeneration": {},
"f:versionHash": {}
}
},
"manager": "cluster-version-operator",
"operation": "Update",
"subresource": "status",
"time": "2022-04-27T05:47:10Z"
},
{
"apiVersion": "config.openshift.io/v1",
"fieldsType": "FieldsV1",
"fieldsV1": {
"f:status": {
"f:conditions": {}
}
},
"manager": "curl",
"operation": "Update",
"subresource": "status",
"time": "2022-04-27T12:42:23Z"
}
],
"name": "version",
"resourceVersion": "165197",
"uid": "f1924212-4134-4bfb-a860-b24d8e084bad"
},
"spec": {
"channel": "stable-4.11",
"clusterID": "09edcc03-502b-4f63-81f3-d307a002253f"
},
"status": {
"availableUpdates": null,
"capabilities": {
"enabledCapabilities": [
"baremetal",
"marketplace",
"openshift-samples"
],
"knownCapabilities": [
"baremetal",
"marketplace",
"openshift-samples"
]
},
"conditions": [
{
"lastTransitionTime": "2022-04-27T18:25:51Z",
"message": "EtcdRecentBackup failed",
"reason": "UpgradePreconditionCheckFailed",
"status": "False",
"type": "ReleaseAccepted"
}
],
"desired": {
"image": "registry.ci.openshift.org/ocp/release@sha256:30452e14cbefed21f883ac38652b9dbaf653a922a1ca0efd6f3a1a10acfc2e1c",
"version": "4.11.0-0.nightly-2022-04-26-181148"
},
"history": [
{
"completionTime": "2022-04-27T06:07:32Z",
"image": "registry.ci.openshift.org/ocp/release@sha256:30452e14cbefed21f883ac38652b9dbaf653a922a1ca0efd6f3a1a10acfc2e1c",
"startedTime": "2022-04-27T05:47:10Z",
"state": "Completed",
"verified": false,
"version": "4.11.0-0.nightly-2022-04-26-181148"
}
],
"observedGeneration": 2,
"versionHash": "QNLRulmodCo="
}
}
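The patch body used in the curl command above can be generated rather than typed by hand. The following is a minimal sketch, assuming a POSIX shell; the helper name `release_accepted_patch` is hypothetical and is not part of `oc` or the CVO. Note that a JSON Patch `add` on an existing `/status/conditions` member replaces the whole array, which is why the response above lists only the ReleaseAccepted condition.

```shell
#!/bin/sh
# Hypothetical helper: print a JSON Patch document that sets the
# ClusterVersion ReleaseAccepted condition, mirroring the curl body above.
# Arguments: status, reason, message, lastTransitionTime (RFC 3339).
# Values are inserted verbatim, so they must not contain double quotes.
release_accepted_patch() {
  printf '[{"op": "add", "path": "/status/conditions", "value": [{"type": "ReleaseAccepted", "status": "%s", "reason": "%s", "message": "%s", "lastTransitionTime": "%s"}]}]\n' \
    "$1" "$2" "$3" "$4"
}
```

Passing `"$(release_accepted_patch False UpgradePreconditionCheckFailed 'EtcdRecentBackup failed' 2022-04-27T18:25:51Z)"` as the `-d` argument should send a body equivalent to the one in the command above.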
# oc get co/etcd -oyaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  annotations:
    exclude.release.openshift.io/internal-openshift-hosted: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
  creationTimestamp: "2022-04-27T05:47:10Z"
  generation: 1
  name: etcd
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: f1924212-4134-4bfb-a860-b24d8e084bad
  resourceVersion: "165237"
  uid: 1ebee225-51a9-4de3-9b2d-1a1c9d240a4c
spec: {}
status:
  conditions:
  - lastTransitionTime: "2022-04-27T05:59:50Z"
    message: |-
      NodeControllerDegraded: All master nodes are ready
      EtcdMembersDegraded: No unhealthy members found
    reason: AsExpected
    status: "False"
    type: Degraded
  - lastTransitionTime: "2022-04-27T06:10:04Z"
    message: |-
      EtcdMembersProgressing: No unstarted etcd members found
      NodeInstallerProgressing: 3 nodes are at revision 8
    reason: AsExpected
    status: "False"
    type: Progressing
  - lastTransitionTime: "2022-04-27T05:52:50Z"
    message: |-
      EtcdMembersAvailable: 3 members are available
      StaticPodsAvailable: 3 nodes are active; 3 nodes are at revision 8
    reason: AsExpected
    status: "True"
    type: Available
  - lastTransitionTime: "2022-04-27T05:50:19Z"
    message: All is well
    reason: AsExpected
    status: "True"
    type: Upgradeable
  - lastTransitionTime: "2022-04-27T12:42:29Z"
    message: UpgradeBackup pre 4.9 located at path /etc/kubernetes/cluster-backup/upgrade-backup-2022-04-27_124223
      on node "yanyang-0427a-j7zrw-master-0.c.openshift-qe.internal"
    reason: UpgradeBackupSuccessful
    status: "True"
    type: RecentBackup
  extension: null
  relatedObjects:
  - group: operator.openshift.io
    name: cluster
    resource: etcds
  - group: ""
    name: openshift-config
    resource: namespaces
  - group: ""
    name: openshift-config-managed
    resource: namespaces
  - group: ""
    name: openshift-etcd-operator
    resource: namespaces
  - group: ""
    name: openshift-etcd
    resource: namespaces
  versions:
  - name: raw-internal
    version: 4.11.0-0.nightly-2022-04-26-181148
  - name: etcd
    version: 4.11.0-0.nightly-2022-04-26-181148
  - name: operator
    version: 4.11.0-0.nightly-2022-04-26-181148
Etcd RecentBackup goes to True. Looks good to me.
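The manual checks in this verification (reading ReleaseAccepted on the ClusterVersion and RecentBackup on the etcd ClusterOperator) can be scripted instead of inspecting full YAML dumps. The following is a rough sketch, assuming `oc` is logged in to the cluster; the function names `condition_status` and `wait_for_recent_backup` and the default retry values are illustrative assumptions, not part of any OpenShift tooling.

```shell
#!/bin/sh
# Print the status of one condition on a cluster resource.
# $1: resource, e.g. clusterversion/version or clusteroperator/etcd
# $2: condition type, e.g. ReleaseAccepted or RecentBackup
condition_status() {
  oc get "$1" -o jsonpath="{.status.conditions[?(@.type==\"$2\")].status}"
}

# Return 0 once the etcd operator reports RecentBackup=True.
# Polls up to $1 times (default 30), sleeping $2 seconds (default 10)
# after each unsuccessful attempt; returns 1 if the limit is reached.
wait_for_recent_backup() {
  attempts=${1:-30}
  delay=${2:-10}
  while [ "$attempts" -gt 0 ]; do
    [ "$(condition_status clusteroperator/etcd RecentBackup)" = "True" ] && return 0
    attempts=$((attempts - 1))
    sleep "$delay"
  done
  return 1
}
```

After patching the ClusterVersion status, `condition_status clusterversion/version ReleaseAccepted` would be expected to print `False`, and `wait_for_recent_backup` would be expected to return success once the backup completes.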
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069