Description of problem:
The upgrade fails: OVS is not running, so the network is not functional. Upgraded from 4.5.0-0.nightly-2020-09-28-124031 to 4.6.0-0.nightly-2020-09-30-145011 on Azure.

The change in https://bugzilla.redhat.com/show_bug.cgi?id=1874696 altered the fallback of running OVS in a container, so we now fail if openvswitch.service is not enabled. During the upgrade, some of the nodes that switch to host OVS get stuck.

Upgrading from 4.5.0-0.nightly-2020-09-28-124031 to 4.6.0-0.nightly-2020-09-30-091659 succeeded on AWS.

Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-09-30-145011

How reproducible:

Steps to Reproduce:
1. Upgrade from 4.5.0-0.nightly-2020-09-28-124031 to 4.6.0-0.nightly-2020-09-30-145011 on Azure
2.
3.

Actual results:
Nodes are stuck in SchedulingDisabled; openvswitch.service is not enabled and ovs-vswitchd is not running.

Expected results:
openvswitch.service is enabled and ovs-vswitchd is running.

Additional info:

sh-4.4# systemctl status openvswitch
● openvswitch.service - Open vSwitch
   Loaded: loaded (/usr/lib/systemd/system/openvswitch.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

sh-4.4# systemctl status ovs-vswitchd
● ovs-vswitchd.service - Open vSwitch Forwarding Unit
   Loaded: loaded (/usr/lib/systemd/system/ovs-vswitchd.service; static; vendor preset: disabled)
  Drop-In: /etc/systemd/system/ovs-vswitchd.service.d
           └─10-ovs-vswitchd-restart.conf
   Active: inactive (dead)

sh-4.4# ls -l /etc/systemd/system/multi-user.target.wants/openvswitch.service
ls: cannot access '/etc/systemd/system/multi-user.target.wants/openvswitch.service': No such file or directory

ovs container logs:

openvswitch is running in container
Starting ovsdb-server.
PMD: net_mlx4: cannot load glue library: libibverbs.so.1: cannot open shared object file: No such file or directory
PMD: net_mlx4: cannot initialize PMD due to missing run-time dependency on rdma-core libraries (libibverbs, libmlx4)
net_mlx5: cannot load glue library: libibverbs.so.1: cannot open shared object file: No such file or directory
net_mlx5: cannot initialize PMD due to missing run-time dependency on rdma-core libraries (libibverbs, libmlx5)
Configuring Open vSwitch system IDs.
Enabling remote OVSDB managers.
PMD: net_mlx4: cannot load glue library: libibverbs.so.1: cannot open shared object file: No such file or directory
PMD: net_mlx4: cannot initialize PMD due to missing run-time dependency on rdma-core libraries (libibverbs, libmlx4)
net_mlx5: cannot load glue library: libibverbs.so.1: cannot open shared object file: No such file or directory
net_mlx5: cannot initialize PMD due to missing run-time dependency on rdma-core libraries (libibverbs, libmlx5)
Starting ovs-vswitchd.
Enabling remote OVSDB managers.
2020-09-30 21:11:30 info: Loading previous flows ...
2020-09-30 21:11:30 info: Adding br0 if it doesn't exist ...
2020-09-30 21:11:30 info: Created br0, now adding flows ...
+ ovs-ofctl add-tlv-map br0 ''
2020-09-30T21:11:30Z|00001|vconn|WARN|unix:/var/run/openvswitch/br0.mgmt: version negotiation failed (we support version 0x01, peer supports version 0x04)
ovs-ofctl: br0: failed to connect to socket (Broken pipe)
+ ovs-ofctl -O OpenFlow13 add-groups br0 /var/run/openvswitch/ovs-save.nVSt9McrJW/br0.groups.dump
+ ovs-ofctl -O OpenFlow13 replace-flows br0 /var/run/openvswitch/ovs-save.nVSt9McrJW/br0.flows.dump
+ rm -rf /var/run/openvswitch/ovs-save.nVSt9McrJW
2020-09-30 21:11:30 info: Done restoring the existing flows ...
2020-09-30 21:11:30 info: Remove other config ...
2020-09-30 21:11:30 info: Removed other config ...
2020-09-30T21:11:29.736Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovsdb-server.log
2020-09-30T21:11:29.741Z|00002|ovsdb_server|INFO|ovsdb-server (Open vSwitch) 2.11.5
2020-09-30T21:11:29.748Z|00003|jsonrpc|WARN|unix#0: receive error: Connection reset by peer
2020-09-30T21:11:29.748Z|00004|reconnect|WARN|unix#0: connection dropped (Connection reset by peer)
2020-09-30T21:11:30.131Z|00031|bridge|INFO|bridge br0: added interface vethec1140a0 on port 4
2020-09-30T21:11:30.132Z|00032|bridge|INFO|bridge br0: added interface br0 on port 65534
2020-09-30T21:11:30.132Z|00033|bridge|INFO|bridge br0: added interface vetha0b45f6d on port 6
2020-09-30T21:11:30.132Z|00034|bridge|INFO|bridge br0: added interface vethe77d62ce on port 10
2020-09-30T21:11:30.132Z|00035|bridge|INFO|bridge br0: using datapath ID 00001a0aabc20744
2020-09-30T21:11:30.132Z|00036|connmgr|INFO|br0: added service controller "punix:/var/run/openvswitch/br0.mgmt"
2020-09-30T21:11:30.135Z|00037|bridge|INFO|ovs-vswitchd (Open vSwitch) 2.11.5
2020-09-30T21:11:30.197Z|00038|vconn|WARN|unix#0: version negotiation failed (we support version 0x04, peer supports version 0x01)
2020-09-30T21:11:30.197Z|00039|rconn|WARN|br0<->unix#0: connection dropped (Protocol error)
2020-09-30T21:11:30.252Z|00040|connmgr|INFO|br0<->unix#6: 111 flow_mods in the last 0 s (111 adds)
2020-09-30T21:11:39.747Z|00005|memory|INFO|7496 kB peak resident set size after 10.0 seconds
2020-09-30T21:11:39.747Z|00006|memory|INFO|cells:652 json-caches:1 monitors:2 sessions:2
2020-09-30T21:11:40.138Z|00041|memory|INFO|59596 kB peak resident set size after 10.3 seconds
2020-09-30T21:11:40.138Z|00042|memory|INFO|handlers:1 ports:10 revalidators:1 rules:115 udpif keys:132
2020-09-30T21:18:15.278Z|00043|connmgr|INFO|br0<->unix#58: 2 flow_mods in the last 0 s (2 deletes)
2020-09-30T21:18:15.309Z|00044|connmgr|INFO|br0<->unix#61: 4 flow_mods in the last 0 s (4 deletes)
2020-09-30T21:18:15.339Z|00045|bridge|INFO|bridge br0: deleted interface veth956fb903 on port 3
2020-09-30T21:18:26.104Z|00046|bridge|INFO|bridge br0: added interface vethd3e3323a on port 12
2020-09-30T21:18:26.142Z|00047|connmgr|INFO|br0<->unix#64: 5 flow_mods in the last 0 s (5 adds)
2020-09-30T21:18:26.183Z|00048|connmgr|INFO|br0<->unix#67: 2 flow_mods in the last 0 s (2 deletes)
2020-09-30T21:28:05.860Z|00049|connmgr|INFO|br0<->unix#132: 2 flow_mods in the last 0 s (2 deletes)
2020-09-30T21:28:06.011Z|00050|connmgr|INFO|br0<->unix#137: 4 flow_mods in the last 0 s (4 deletes)
2020-09-30T21:28:06.121Z|00051|bridge|INFO|bridge br0: deleted interface vetha0b45f6d on port 6
2020-09-30T21:28:06.256Z|00052|connmgr|INFO|br0<->unix#141: 2 flow_mods in the last 0 s (2 deletes)
2020-09-30T21:28:06.400Z|00053|connmgr|INFO|br0<->unix#144: 4 flow_mods in the last 0 s (4 deletes)
2020-09-30T21:28:06.725Z|00054|bridge|INFO|bridge br0: deleted interface veth7add96e2 on port 9
2020-09-30T21:28:06.878Z|00055|connmgr|INFO|br0<->unix#147: 2 flow_mods in the last 0 s (2 deletes)
2020-09-30T21:28:07.031Z|00056|connmgr|INFO|br0<->unix#150: 4 flow_mods in the last 0 s (4 deletes)
2020-09-30T21:28:07.334Z|00057|bridge|INFO|bridge br0: deleted interface vethec1140a0 on port 4
2020-09-30T21:28:07.471Z|00058|connmgr|INFO|br0<->unix#153: 2 flow_mods in the last 0 s (2 deletes)
2020-09-30T21:28:07.594Z|00059|connmgr|INFO|br0<->unix#156: 4 flow_mods in the last 0 s (4 deletes)
2020-09-30T21:28:07.675Z|00060|bridge|INFO|bridge br0: deleted interface vethe77d62ce on port 10
2020-09-30T21:28:08.166Z|00061|connmgr|INFO|br0<->unix#159: 2 flow_mods in the last 0 s (2 deletes)
2020-09-30T21:28:08.249Z|00062|connmgr|INFO|br0<->unix#162: 4 flow_mods in the last 0 s (4 deletes)
2020-09-30T21:28:08.376Z|00063|bridge|INFO|bridge br0: deleted interface vethac02a791 on port 8
2020-09-30 21:28:16 info: Saving flows ...
ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (No such file or directory)
rm: cannot remove '/var/run/openvswitch/ovs-vswitchd.pid': No such file or directory
openvswitch is running in systemd
(objectpath '/org/freedesktop/systemd1/job/796',)
tail: cannot open '/host/var/log/openvswitch/ovs-vswitchd.log' for reading: No such file or directory
tail: cannot open '/host/var/log/openvswitch/ovsdb-server.log' for reading: No such file or directory
tail: '/host/var/log/openvswitch/ovsdb-server.log' has appeared; following new file
2020-09-30T21:28:56.511Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovsdb-server.log
2020-09-30T21:28:56.518Z|00002|ovsdb_server|INFO|ovsdb-server (Open vSwitch) 2.13.2
2020-09-30T21:28:58.661Z|00003|jsonrpc|WARN|unix#4: receive error: Connection reset by peer
2020-09-30T21:28:58.661Z|00004|reconnect|WARN|unix#4: connection dropped (Connection reset by peer)
2020-09-30T21:29:00.177Z|00005|jsonrpc|WARN|unix#7: receive error: Connection reset by peer
2020-09-30T21:29:00.177Z|00006|reconnect|WARN|unix#7: connection dropped (Connection reset by peer)
2020-09-30T21:29:06.526Z|00007|memory|INFO|7640 kB peak resident set size after 10.0 seconds
2020-09-30T21:29:06.526Z|00008|memory|INFO|cells:122 monitors:2 sessions:1
2020-09-30T21:29:44.579Z|00009|jsonrpc|WARN|unix#19: receive error: Connection reset by peer
2020-09-30T21:29:44.579Z|00010|reconnect|WARN|unix#19: connection dropped (Connection reset by peer)
2020-09-30T21:29:47.487Z|00011|jsonrpc|WARN|unix#21: receive error: Connection reset by peer
2020-09-30T21:29:47.487Z|00012|reconnect|WARN|unix#21: connection dropped (Connection reset by peer)
2020-09-30T21:29:52.488Z|00013|jsonrpc|WARN|unix#22: receive error: Connection reset by peer
2020-09-30T21:29:52.488Z|00014|reconnect|WARN|unix#22: connection dropped (Connection reset by peer)
2020-09-30T21:29:57.488Z|00015|jsonrpc|WARN|unix#23: receive error: Connection reset by peer
2020-09-30T21:29:57.488Z|00016|reconnect|WARN|unix#23: connection dropped (Connection reset by peer)
2020-09-30T21:30:02.484Z|00017|jsonrpc|WARN|unix#24: receive error: Connection reset by peer
2020-09-30T21:30:02.484Z|00018|reconnect|WARN|unix#24: connection dropped (Connection reset by peer)
2020-09-30T21:30:07.487Z|00019|jsonrpc|WARN|unix#25: receive error: Connection reset by peer
2020-09-30T21:30:07.487Z|00020|reconnect|WARN|unix#25: connection dropped (Connection reset by peer)
2020-09-30T21:30:12.494Z|00021|jsonrpc|WARN|unix#26: receive erro
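The `ls -l /etc/systemd/system/multi-user.target.wants/openvswitch.service` check above works because that is where systemd records enablement for multi-user.target: `systemctl enable` creates the symlink, and its absence is exactly the failing condition here. A minimal sketch of that on-disk check follows; `check_unit_enabled` and its parameters are illustrative and not from the report, and on a live node `systemctl is-enabled openvswitch.service` is the authoritative query:

```shell
# Illustrative helper (not from the report): a unit is "enabled" for
# multi-user.target when its symlink exists in the wants directory.
# The wants directory is a parameter so the check can be exercised
# outside a live host; it defaults to the path seen in this bug.
check_unit_enabled() {
  unit="$1"
  wants_dir="${2:-/etc/systemd/system/multi-user.target.wants}"
  if [ -e "$wants_dir/$unit" ]; then
    echo "$unit is enabled"
  else
    echo "$unit is not enabled"
  fi
}
```

On the affected node this reports "not enabled", matching the `ls: cannot access ...` output above, which is why the host OVS units never start after the switch away from the containerized fallback.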
Not sure why this is assigned to MCO, since it has been assessed that this is a networking/OVN issue; moving it there.
*** Bug 1883521 has been marked as a duplicate of this bug. ***
Verified the upgrade from 4.5.14 to 4.6.0-0.nightly-2020-10-08-043318 on upi-on-vsphere/versioned-installer-vsphere_slave and ipi-on-osp/versioned-installer-https_proxy-etcd_encryption-ci.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196