Bug 1807615
Summary: | [4.6] Upgrade OCP 4.3.3 to OCP 4.4.0 with alpha snapshot CRDs should print better error message | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Guy Inger <ginger> |
Component: | Storage | Assignee: | Jan Safranek <jsafrane> |
Storage sub component: | Operators | QA Contact: | Chao Yang <chaoyang> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | high | ||
Priority: | high | CC: | alitke, andcosta, aos-bugs, cnv-qe-bugs, danken, eparis, fdeutsch, ginger, jokerman, jsafrane, lxia, ncredi, ngavrilo, sreber, talayan |
Version: | 4.4 | Keywords: | Reopened, Upgrades |
Target Milestone: | --- | ||
Target Release: | 4.6.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | No Doc Update | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2020-10-27 15:55:31 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1845433 |
Description Guy Inger 2020-02-26 18:12:07 UTC
```
[cnv-qe-jenkins@cnv-executor-ginger4 ~]$ oc describe co csi-snapshot-controller
Name:         csi-snapshot-controller
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:         ClusterOperator
Metadata:
  Creation Timestamp:  2020-02-26T16:55:17Z
  Generation:          1
  Resource Version:    331744
  Self Link:           /apis/config.openshift.io/v1/clusteroperators/csi-snapshot-controller
  UID:                 01b8335a-d68a-45b8-9c44-61723a887d40
Spec:
Status:
  Conditions:
    Last Transition Time:  2020-02-26T16:57:17Z
    Message:               Degraded: failed to sync CRDs: CustomResourceDefinition.apiextensions.k8s.io "volumesnapshots.snapshot.storage.k8s.io" is invalid: status.storedVersions[0]: Invalid value: "v1alpha1": must appear in spec.versions
    Reason:                _OperatorSync
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2020-02-26T16:55:17Z
    Reason:                NoData
    Status:                Unknown
    Type:                  Progressing
    Last Transition Time:  2020-02-26T16:55:17Z
    Reason:                NoData
    Status:                Unknown
    Type:                  Available
    Last Transition Time:  2020-02-26T16:55:17Z
    Reason:                NoData
    Status:                Unknown
    Type:                  Upgradeable
  Extension:  <nil>
  Related Objects:
    Group:
    Name:      openshift-csi-snapshot-controller
    Resource:  namespaces
    Group:
    Name:      openshift-csi-snapshot-controller-operator
    Resource:  namespaces
    Group:     operator.openshift.io
    Name:      cluster
    Resource:  csisnapshotcontrollers
Events:  <none>
```

Hi Adam, is the `volumesnapshots.snapshot.storage.k8s.io" is invalid: status.storedVersions[0]: Invalid value: "v1alpha1": must appear in spec.versions` error something that CNV brings in, and can it then break the OCP upgrade?

@Jan Safranek, I think the v1alpha1 version of the volumesnapshots CRD should be fixed to use the beta API version of Kubernetes; this will be done by the CNV developers. However, I think OCP shouldn't fail because of that. What do you think?

@Guy, can you please first try to upgrade CNV 2.2 to CNV 2.3, and after that try to upgrade OCP 4.3 to OCP 4.4?

Hey Adam, I tried what you suggested and deployed a 4.3.0 cluster without any storage classes.
I first tried to upgrade it to 4.3.5 (`oc adm upgrade --force=true --allow-explicit-upgrade --to-image quay.io/openshift-release-dev/ocp-release:4.3.5-x86_64`), which failed. `oc get clusterversion` returns "Unable to apply 4.3.5: the cluster operator monitoring has not yet successfully rolled out". More info:

```
[cnv-qe-jenkins@cnv-executor-ginger1 ~]$ oc describe co monitoring
Name:         monitoring
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:         ClusterOperator
Metadata:
  Creation Timestamp:  2020-03-11T14:56:07Z
  Generation:          1
  Resource Version:    586313
  Self Link:           /apis/config.openshift.io/v1/clusteroperators/monitoring
  UID:                 323dbdde-fe28-496b-a704-f2a74c50d1fe
Spec:
Status:
  Conditions:
    Last Transition Time:  2020-03-12T11:14:59Z
    Message:               Failed to rollout the stack. Error: running task Updating node-exporter failed: reconciling node-exporter DaemonSet failed: updating DaemonSet object failed: waiting for DaemonSetRollout of node-exporter: daemonset node-exporter is not ready. status: (desired: 6, updated: 1, ready: 6, unavailable: 0)
    Reason:                UpdatingnodeExporterFailed
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2020-03-12T13:52:29Z
    Message:               Rollout of the monitoring stack is in progress. Please wait until it finishes.
    Reason:                RollOutInProgress
    Status:                True
    Type:                  Upgradeable
    Last Transition Time:  2020-03-12T11:14:59Z
    Status:                False
    Type:                  Available
    Last Transition Time:  2020-03-12T13:52:29Z
    Message:               Rolling out the stack.
    Reason:                RollOutInProgress
    Status:                True
    Type:                  Progressing
  Extension:  <nil>
  Related Objects:
    Group:
    Name:      openshift-monitoring
    Resource:  namespaces
    Group:
    Name:      openshift-monitoring
    Resource:  all
    Group:     monitoring.coreos.com
    Name:
    Resource:  servicemonitors
    Group:     monitoring.coreos.com
    Name:
    Resource:  prometheusrules
    Group:     monitoring.coreos.com
    Name:
    Resource:  alertmanagers
    Group:     monitoring.coreos.com
    Name:
    Resource:  prometheuses
  Versions:
    Name:     operator
    Version:  4.3.0-0.nightly-2020-03-09-200240
Events:  <none>
```

OK, that is a completely different error, not related to storage at all. Let's see if we can get a successful upgrade without any storage classes; then we'll have this isolated a bit more. In the meantime I would not promote this to a blocker, since our official HCO installation does not configure the snapshot alpha components that were blocking your upgrade tests.

I tried upgrading 4.3.5 to 4.4, which worked. There might still be an issue with upgrading from 4.3.3 to 4.4.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196
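The Degraded message in this bug comes from a Kubernetes apiextensions validation rule: every entry in a CRD's `status.storedVersions` must also appear in `spec.versions`, because the apiserver may still hold stored objects in those versions. The sketch below is an illustrative reproduction of that rule, not the actual apiserver code; the function name and the data shapes are assumptions chosen to mirror the CRD fields.

```python
# Illustrative sketch (not OpenShift/Kubernetes source) of the apiextensions
# rule behind the error in this bug: every version recorded in a CRD's
# status.storedVersions must also be present in spec.versions, otherwise
# the CRD update is rejected.

def check_stored_versions(spec_versions, stored_versions):
    """Return a list of error strings, one per stored version missing
    from spec.versions; an empty list means the CRD would be accepted."""
    declared = {v["name"] for v in spec_versions}
    errors = []
    for i, stored in enumerate(stored_versions):
        if stored not in declared:
            errors.append(
                f'status.storedVersions[{i}]: Invalid value: "{stored}": '
                "must appear in spec.versions"
            )
    return errors

# The 4.4 snapshot CRD declares only v1beta1, but v1alpha1 objects from the
# old CNV-installed CRD are still recorded as stored -- this reproduces the
# message seen in the Degraded condition above:
print(check_stored_versions(
    spec_versions=[{"name": "v1beta1"}],
    stored_versions=["v1alpha1"],
))
```

This also suggests why upgrading CNV first (so the CRD gains the beta version before the old one is dropped) was proposed as a workaround in the thread: once `spec.versions` contains every stored version, the check passes.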