Bug 2086506
Summary: | OVN-Kubernetes CNO upgrade logic broken for Hypershift | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Casey Callendrello <cdc>
Component: | Networking | Assignee: | zenghui.shi <zshi>
Networking sub component: | ovn-kubernetes | QA Contact: | Ross Brattain <rbrattai>
Status: | VERIFIED | CC: | rbrattai
Severity: | high | Priority: | high
Version: | 4.11 | Keywords: | TestBlocker
Hardware: | Unspecified | OS: | Unspecified
Type: | Bug | |
Description
Casey Callendrello
2022-05-16 11:14:16 UTC
Management cluster upgrade from 4.11.0-fc.3 to 4.11.0-rc.0 stuck after 11h17m

version 4.11.0-fc.3 True True 11h Working towards 4.11.0-rc.0: 647 of 802 done (80% complete), waiting on network

[
  {
    "completionTime": null,
    "image": "quay.io/openshift-release-dev/ocp-release:4.11.0-rc.0-x86_64",
    "startedTime": "2022-06-30T14:13:44Z",
    "state": "Partial",
    "verified": false,
    "version": "4.11.0-rc.0"
  },
  {
    "completionTime": "2022-06-30T13:15:52Z",
    "image": "quay.io/openshift-release-dev/ocp-release@sha256:af2fc44a39aaef937ce2eb895c61f2c40d8ec721c99eb866cc8e6d1a4c1b0401",
    "startedTime": "2022-06-30T12:47:58Z",
    "state": "Completed",
    "verified": false,
    "version": "4.11.0-fc.3"
  }
]

blocking Hypershift OVN upgrading.

Err, wait, the StatefulSet is in the hostedcluster, not the management cluster. So the original mgmt upgrade test was wrong.

oc patch -n clusters hostedcluster $(oc get -n clusters hostedcluster -o jsonpath='{.items[0].metadata.name}') -p='{"spec": {"release": {"image": "quay.io/openshift-release-dev/ocp-release:4.12.0-ec.4-x86_64"}}}' --type=merge

oc get -n clusters hostedcluster -o jsonpath='{.items[*].status.version.history}' | jq '. | sort_by(.startedTime)'

Hosted cluster upgrade from 4.11.9 to 4.12.0-ec.4 failed in ovnkube-node, so this may be a different issue, but it is blocking full upgrade success.

kube-scheduler 4.12.0-ec.4 True False False 8h
kube-storage-version-migrator 4.12.0-ec.4 True False False 8h
monitoring 4.12.0-ec.4 True False False 5h49m
network 4.11.9 True True True 8h DaemonSet "/openshift-ovn-kubernetes/ovnkube-node" rollout is not making progress - last change 2022-10-13T22:14:09Z
node-tuning 4.12.0-ec.4 True True False 3h38m Waiting for 1/3 Profiles to be applied
openshift-apiserver 4.12.0-ec.4 True False False 8h
openshift-controller-manager 4.12.0-ec.4 True False False 8h

ovnkube-node-lnnxh 4/5 CrashLoopBackOff 21 (3m2s ago) 85m 10.0.141.120 ip-10-0-141-120.compute.internal <none> <none>

Will gather logs and file new bug.
Intermittent failures when upgrading the hostedCluster and then the nodepool. If I just upgrade the mgmt cluster, then the hostedCluster upgrade seems to succeed.

Verified on 4.11.9 to 4.12.0-0.nightly-2022-10-15-094115

oc get -n clusters hostedcluster -o jsonpath='{.items[*].status.version.history}' | jq '. | sort_by(.startedTime)'

[
  {
    "completionTime": "2022-10-16T20:46:50Z",
    "image": "quay.io/openshift-release-dev/ocp-release:4.11.9-x86_64",
    "startedTime": "2022-10-16T20:34:57Z",
    "state": "Completed",
    "verified": false,
    "version": "4.11.9"
  },
  {
    "completionTime": "2022-10-17T00:15:14Z",
    "image": "registry.ci.openshift.org/ocp/release:4.12.0-0.nightly-2022-10-15-094115",
    "startedTime": "2022-10-16T23:58:21Z",
    "state": "Completed",
    "verified": false,
    "version": "4.12.0-0.nightly-2022-10-15-094115"
  }
]
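
The version history check used in this report can also be scripted instead of read by eye. The following is a minimal sketch, not part of the original verification: it assumes the JSON array printed by the `oc get ... -o jsonpath='{.items[*].status.version.history}'` command has been captured as a string and that only one hostedcluster exists in the namespace. The function names `latest_upgrade` and `upgrade_completed` are hypothetical helpers introduced here for illustration; the sample data is copied from the history output in this report.

```python
import json

# Sample history copied from the output in this report (4.11.9 -> 4.12 nightly).
SAMPLE_HISTORY = """
[
  {"completionTime": "2022-10-16T20:46:50Z",
   "image": "quay.io/openshift-release-dev/ocp-release:4.11.9-x86_64",
   "startedTime": "2022-10-16T20:34:57Z",
   "state": "Completed", "verified": false, "version": "4.11.9"},
  {"completionTime": "2022-10-17T00:15:14Z",
   "image": "registry.ci.openshift.org/ocp/release:4.12.0-0.nightly-2022-10-15-094115",
   "startedTime": "2022-10-16T23:58:21Z",
   "state": "Completed", "verified": false,
   "version": "4.12.0-0.nightly-2022-10-15-094115"}
]
"""


def latest_upgrade(history_json: str) -> dict:
    """Return the most recently started entry of a version history array.

    Hypothetical helper: assumes every entry has a 'startedTime' in the same
    UTC ISO 8601 format, so plain string comparison orders them by time.
    """
    history = json.loads(history_json)
    return max(history, key=lambda entry: entry["startedTime"])


def upgrade_completed(history_json: str, target_version: str) -> bool:
    """True if the newest history entry is the target version and is Completed."""
    latest = latest_upgrade(history_json)
    return latest["version"] == target_version and latest["state"] == "Completed"


if __name__ == "__main__":
    target = "4.12.0-0.nightly-2022-10-15-094115"
    print(upgrade_completed(SAMPLE_HISTORY, target))
```

An entry in the "Partial" state, such as the stuck 4.11.0-rc.0 management cluster upgrade earlier in this report, would make `upgrade_completed` return False.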