+++ This bug was initially created as a clone of Bug #1887040 +++

Description of problem:
Upgrade from 4.5 to 4.6 with RHEL workers and the SDN plugin. The ovs pod crashed with the following logs:

    oc logs ovs-r4sd8 -n openshift-sdn
    openvswitch is running in systemd
    id: openvswitch: no such user

Version-Release number of selected component (if applicable):
4.5.0-0.nightly-2020-10-08-190330 --> 4.6.0-rc.2

How reproducible:
always

Steps to Reproduce:
1. Upgrade a cluster from 4.5 to 4.6 with RHEL workers and the SDN plugin.

Actual results:
The ovs pod on the RHEL worker crashed with these logs:

    oc logs ovs-r4sd8 -n openshift-sdn
    openvswitch is running in systemd
    id: openvswitch: no such user

and it blocked the upgrade process.

Expected results:

Additional info:

--- Additional comment from zhaozhanqi on 2020-10-10 10:22:24 UTC ---

This issue happens because the RHEL worker has not been upgraded to 4.6 yet and the openvswitch2.13 package is not installed on it.

When I hit the ovs pod crash, I upgraded the RHEL worker to 4.6. After the RHEL worker upgrade finished, the remaining workers in the cluster could continue upgrading, and the cluster finally upgraded successfully.

Is there a way to avoid the ovs pod crash before the RHEL worker is upgraded? If not, we should at least tell customers about this situation: when the ovs pod crashes on a RHEL worker during an upgrade to 4.6, this is normal, and upgrading the RHEL worker resolves the issue.

--- Additional comment from zhaozhanqi on 2020-10-12 01:49:44 UTC ---

From the documentation:

https://docs.openshift.com/container-platform/4.5/updating/updating-cluster-rhel-compute.html#rhel-compute-updating_updating-cluster-rhel-compute

>> After you update your cluster, you must update the Red Hat Enterprise Linux (RHEL) compute machines in your cluster

So the RHEL workers are upgraded only after the cluster has been upgraded. If so, this is an issue.

--- Additional comment from Tim Rozet on 2020-10-12 14:35:51 UTC ---

It looks like openvswitch is upgraded with a playbook post upgrade.
This is the order of operations with a UPI install, so moving it to the installer team.

--- Additional comment from Scott Dodson on 2020-10-12 17:15:50 UTC ---

We're going to have to make sure that the OVS pods in 4.6 maintain compatibility until OVS can be installed on the RHEL Workers as part of the RHEL worker upgrade playbooks. I assume that the reason this is working in RHCOS is because OVS was actually installed in RHCOS 4.5 whereas that wasn't done for RHEL 7 workers.

--- Additional comment from Tim Rozet on 2020-10-12 22:45:56 UTC ---

Zhanqi, can you please provide the systemd journal from one of your nodes, or provide a setup? If you didn't have openvswitch installed, I don't see how ovs-configuration.service would have executed and written /var/run/ovs-config-executed, which we use to determine whether OVS is running in systemd.

--- Additional comment from Feng Pan on 2020-10-12 22:55:58 UTC ---

Moving this to 4.7 with a 4.6.z backport, as this does not actually affect overall upgrade success.
*** Bug 1890652 has been marked as a duplicate of this bug. ***
Verified this bug on 4.6.0-0.nightly-2020-10-27-154553
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6.3 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4339