All of the CNO OVN-Kubernetes upgrade logic assumes the presence of a master DaemonSet. This is clearly wrong -- Hypershift uses a StatefulSet.
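As a quick illustration of the mismatch (resource names assume the usual OVN-Kubernetes layout; the grep is just to avoid guessing the HyperShift namespace):

  # Standalone OCP deploys ovnkube-master as a DaemonSet:
  oc get daemonset -n openshift-ovn-kubernetes ovnkube-master
  # Under HyperShift it is a StatefulSet instead:
  oc get statefulset -A | grep ovnkube-master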
Management cluster upgrade from 4.11.0-fc.3 to 4.11.0-rc.0 is stuck after 11h17m:

  version   4.11.0-fc.3   True   True   11h   Working towards 4.11.0-rc.0: 647 of 802 done (80% complete), waiting on network

  [
    {
      "completionTime": null,
      "image": "quay.io/openshift-release-dev/ocp-release:4.11.0-rc.0-x86_64",
      "startedTime": "2022-06-30T14:13:44Z",
      "state": "Partial",
      "verified": false,
      "version": "4.11.0-rc.0"
    },
    {
      "completionTime": "2022-06-30T13:15:52Z",
      "image": "quay.io/openshift-release-dev/ocp-release@sha256:af2fc44a39aaef937ce2eb895c61f2c40d8ec721c99eb866cc8e6d1a4c1b0401",
      "startedTime": "2022-06-30T12:47:58Z",
      "state": "Completed",
      "verified": false,
      "version": "4.11.0-fc.3"
    }
  ]

This is blocking HyperShift OVN upgrades.
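For reference, the status above was presumably gathered with something like the following (the clusterversion object is named "version" by default; these exact commands are an assumption, not a paste from the original report):

  oc get clusterversion version
  oc get clusterversion version -o jsonpath='{.status.history}' | jq .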
Err, wait, the StatefulSet is in the hostedcluster, not the management cluster, so the original mgmt-cluster upgrade test was wrong. Triggering the hosted cluster upgrade instead:

  oc patch -n clusters hostedcluster $(oc get -n clusters hostedcluster -o jsonpath='{.items[0].metadata.name}') -p='{"spec": {"release": {"image": "quay.io/openshift-release-dev/ocp-release:4.12.0-ec.4-x86_64"}}}' --type=merge

  oc get -n clusters hostedcluster -o jsonpath='{.items[*].status.version.history}' | jq '. | sort_by(.startedTime)'
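To follow the rollout after the patch, a plain watch should also do (assuming the HostedCluster printer columns expose the current version and progress, which recent HyperShift builds do):

  oc get -n clusters hostedcluster -w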
Hosted cluster upgrade from 4.11.9 to 4.12.0-ec.4 failed in ovnkube-node, so this may be a different issue, but it blocks full upgrade success:

  kube-scheduler                  4.12.0-ec.4   True   False   False   8h
  kube-storage-version-migrator   4.12.0-ec.4   True   False   False   8h
  monitoring                      4.12.0-ec.4   True   False   False   5h49m
  network                         4.11.9        True   True    True    8h      DaemonSet "/openshift-ovn-kubernetes/ovnkube-node" rollout is not making progress - last change 2022-10-13T22:14:09Z
  node-tuning                     4.12.0-ec.4   True   True    False   3h38m   Waiting for 1/3 Profiles to be applied
  openshift-apiserver             4.12.0-ec.4   True   False   False   8h
  openshift-controller-manager    4.12.0-ec.4   True   False   False   8h

  ovnkube-node-lnnxh   4/5   CrashLoopBackOff   21 (3m2s ago)   85m   10.0.141.120   ip-10-0-141-120.compute.internal   <none>   <none>

Will gather logs and file a new bug.
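For the log gathering, something along these lines against the hosted cluster's kubeconfig should identify the crashing container (only 4/5 are ready) and pull its logs from before the last restart; the container name below is a placeholder:

  oc describe pod -n openshift-ovn-kubernetes ovnkube-node-lnnxh
  oc logs -n openshift-ovn-kubernetes ovnkube-node-lnnxh -c <crashing-container> --previous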
Intermittent failures when upgrading the hostedCluster and then the nodepool. If I first upgrade the mgmt cluster, the hostedCluster upgrade then seems to succeed. Verified on 4.11.9 to 4.12.0-0.nightly-2022-10-15-094115:

  oc get -n clusters hostedcluster -o jsonpath='{.items[*].status.version.history}' | jq '. | sort_by(.startedTime)'

  [
    {
      "completionTime": "2022-10-16T20:46:50Z",
      "image": "quay.io/openshift-release-dev/ocp-release:4.11.9-x86_64",
      "startedTime": "2022-10-16T20:34:57Z",
      "state": "Completed",
      "verified": false,
      "version": "4.11.9"
    },
    {
      "completionTime": "2022-10-17T00:15:14Z",
      "image": "registry.ci.openshift.org/ocp/release:4.12.0-0.nightly-2022-10-15-094115",
      "startedTime": "2022-10-16T23:58:21Z",
      "state": "Completed",
      "verified": false,
      "version": "4.12.0-0.nightly-2022-10-15-094115"
    }
  ]
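For the nodepool step, the analogous patch would look roughly like this (NodePool carries the same spec.release.image field; the nodepool name is a placeholder, so list them first):

  oc get -n clusters nodepool
  oc patch -n clusters nodepool <nodepool-name> -p='{"spec": {"release": {"image": "registry.ci.openshift.org/ocp/release:4.12.0-0.nightly-2022-10-15-094115"}}}' --type=merge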
OCP is no longer using Bugzilla and this bug appears to have been left in an orphaned state. If the bug is still relevant, please open a new issue in the OCPBUGS Jira project: https://issues.redhat.com/projects/OCPBUGS/summary