Description of problem:
OVN CI jobs started to fail: https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/upgrades/view/ffu/

The root cause is that during the upgrade the controller moves to the latest version, where the OVN container runs on a host with OVS 2.13, but the computes need "hybrid" mode, where only the OVN container is switched to the latest version and the host stays on OVS 2.11. This worked well until the new version of OVN.

Error in logs:
2021-08-11T14:44:54.158Z|00038|reconnect|INFO|tcp:172.17.1.46:6642: continuing to reconnect in the background but suppressing further logging
2021-08-11T14:46:26.727Z|00039|fatal_signal|WARN|terminating with signal 15 (Terminated)
2021-08-11T14:46:31.546Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovn-controller.log
2021-08-11T14:46:31.548Z|00002|reconnect|INFO|unix:/run/openvswitch/db.sock: connecting...
2021-08-11T14:46:31.548Z|00003|reconnect|INFO|unix:/run/openvswitch/db.sock: connected
2021-08-11T14:46:31.550Z|00004|ovsdb_idl|WARN|Open_vSwitch database lacks Datapath table (database needs upgrade?)
2021-08-11T14:46:31.550Z|00005|ovsdb_idl|WARN|Open_vSwitch table in Open_vSwitch database lacks datapaths column (database needs upgrade?)
2021-08-11T14:46:31.550Z|00006|ovsdb_idl|WARN|Open_vSwitch database lacks Datapath table (database needs upgrade?)
2021-08-11T14:46:31.551Z|00007|ovsdb_idl|WARN|Open_vSwitch table in Open_vSwitch database lacks datapaths column (database needs upgrade?)
2021-08-11T14:46:31.553Z|00008|main|INFO|OVN internal version is : [21.06.0-20.17.0-56.0]
2021-08-11T14:46:31.553Z|00009|main|INFO|OVS IDL reconnected, force recompute.
2021-08-11T14:46:31.553Z|00010|reconnect|INFO|tcp:172.17.1.46:6642: connecting...
2021-08-11T14:46:31.553Z|00011|main|INFO|OVNSB IDL reconnected, force recompute.
2021-08-11T14:46:31.553Z|00012|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"datapaths\"]"}
2021-08-11T14:46:31.554Z|00013|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"datapaths\"]"}
2021-08-11T14:46:31.554Z|00014|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"datapaths\"]"}
2021-08-11T14:46:31.555Z|00015|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"datapaths\"]"}
2021-08-11T14:46:31.555Z|00016|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"datapaths\"]"}
2021-08-11T14:46:32.215Z|00017|reconnect|INFO|tcp:172.17.1.60:6642: connecting...
2021-08-11T14:46:33.215Z|00018|reconnect|INFO|tcp:172.17.1.60:6642: connection attempt timed out
2021-08-11T14:46:34.215Z|00019|reconnect|INFO|tcp:172.17.1.60:6642: connecting...
2021-08-11T14:46:35.215Z|00020|reconnect|INFO|tcp:172.17.1.60:6642: connection attempt timed out
2021-08-11T14:46:35.215Z|00021|reconnect|INFO|tcp:172.17.1.60:6642: waiting 2 seconds before reconnect
2021-08-11T14:46:37.215Z|00022|reconnect|INFO|tcp:172.17.1.60:6642: connecting...
2021-08-11T14:46:39.057Z|00023|reconnect|INFO|tcp:172.17.1.60:6642: connection attempt failed (No route to host)
2021-08-11T14:46:39.057Z|00024|reconnect|INFO|tcp:172.17.1.60:6642: waiting 4 seconds before reconnect
2021-08-11T14:46:41.545Z|00025|memory|INFO|4488 kB peak resident set size after 10.0 seconds
2021-08-11T14:46:43.057Z|00026|reconnect|INFO|tcp:172.17.1.60:6642: connecting...
2021-08-11T14:46:45.069Z|00027|reconnect|INFO|tcp:172.17.1.60:6642: connection attempt failed (No route to host)
2021-08-11T14:46:45.069Z|00028|reconnect|INFO|tcp:172.17.1.60:6642: continuing to reconnect in the background but suppressing further logging
2021-08-11T14:47:31.553Z|00029|ovsdb_idl|WARN|Dropped 269083 log messages in last 60 seconds (most recently, 0 seconds ago) due to excessive rate
2021-08-11T14:47:31.553Z|00030|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"datapaths\"]"}
2021-08-11T14:48:31.553Z|00031|ovsdb_idl|WARN|Dropped 271389 log messages in last 60 seconds (most recently, 0 seconds ago) due to excessive rate
2021-08-11T14:48:31.553Z|00032|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"datapaths\"]"}
2021-08-11T14:49:31.553Z|00033|ovsdb_idl|WARN|Dropped 266958 log messages in last 60 seconds (most recently, 0 seconds ago) due to excessive rate
2021-08-11T14:49:31.553Z|00034|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"datapaths\"]"}
2021-08-11T14:50:31.553Z|00035|ovsdb_idl|WARN|Dropped 270291 log messages in last 60 seconds (most recently, 0 seconds ago) due to excessive rate
2021-08-11T14:50:31.553Z|00036|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"datapaths\"]"}
2021-08-11T14:51:31.553Z|00037|ovsdb_idl|WARN|Dropped 267861 log messages in last 60 seconds (most recently, 0 seconds ago) due to excessive rate
2021-08-11T14:51:31.553Z|00038|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"datapaths\"]"}

Version-Release number of selected component (if applicable):
docker ps | grep ovn_controller
c9aa688033bf undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-ovn-controller:16.2_20210804.1 "dumb-init --singl..." 45 minutes ago Up 45 minutes ovn_controller

[root@compute-0 ~]# docker exec -u root -it ovn_controller rpm -qa | grep openv
network-scripts-openvswitch2.15-2.15.0-26.el8fdp.x86_64
openvswitch2.15-2.15.0-26.el8fdp.x86_64
python3-openvswitch2.15-2.15.0-26.el8fdp.x86_64
rhosp-network-scripts-openvswitch-2.15-4.el8ost.1.noarch
openvswitch-selinux-extra-policy-1.0-28.el8fdp.noarch
rhosp-openvswitch-2.15-4.el8ost.1.noarch
python3-rhosp-openvswitch-2.15-4.el8ost.1.noarch

[root@compute-0 ~]# docker exec -u root -it ovn_controller rpm -qa | grep ovn
ovn-2021-21.06.0-17.el8fdp.x86_64
rhosp-ovn-2021-4.el8ost.1.noarch
ovn-2021-host-21.06.0-17.el8fdp.x86_64
rhosp-ovn-host-2021-4.el8ost.1.noarch

[root@compute-0 ~]# rpm -qa | grep openvswitch
rhosp-openvswitch-2.11-0.7.el7ost.noarch
rhosp-openvswitch-ovn-central-2.11-0.7.el7ost.noarch
python-openvswitch2.11-2.11.3-86.el7fdp.x86_64
rhosp-openvswitch-ovn-host-2.11-0.7.el7ost.noarch
openvswitch-selinux-extra-policy-1.0-17.el7fdp.noarch
python-rhosp-openvswitch-2.11-0.7.el7ost.noarch
openstack-neutron-openvswitch-12.1.1-42.el7ost.noarch
openvswitch2.11-2.11.3-86.el7fdp.x86_64
The impact is that OVN can't talk to the local ovsdb and therefore can't, for example, create patch ports between br-int and the provider bridges.
FYI, it appears that the "datapaths" column was added in August 2019 to OVS, which means it would be present in OVS 2.12+. In general, the OVN team recommends using whatever version of OVS is contemporary with the current version of OVN. When it comes to versions of OVN from the 20.XX series onward (ovn2.13 and ovn-2021 in RHEL), those should be compatible with OVS 2.13+. In this case, trying to upgrade OVN but using OVS 2.11 is going to cause this issue. OVS needs to be at 2.13+.
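For anyone scripting a pre-upgrade check, the version rule above could be sketched roughly as follows. This is only an illustration: both function names are made up for this example, the 2.12 threshold is taken from the comment above rather than from the OVS release notes, and `sort -V` assumes GNU coreutils.

```shell
#!/bin/sh
# Return 0 when version $1 is greater than or equal to version $2.
# Relies on the version sort (-V) option of GNU sort.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: treats OVS 2.12 as the first release whose schema
# carries the "datapaths" column (assumption based on the comment above).
ovs_has_datapaths_column() {
    version_at_least "$1" "2.12"
}
```

With this, `ovs_has_datapaths_column 2.11.3` fails and `ovs_has_datapaths_column 2.13.0` succeeds. Extracting the installed version string from the rpm output is left to the caller.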
Question for OSP: does 21.06.0-13 work? A bunch of the datapath related code in 21.06 was added to 21.06.0-14 for "ovn-controller: Detect OVS datapath capabilities."
@Dan, I don't think that will help in this case. The problem here is a schema mismatch, and that causes errors upon connection when ovn-controller sends its "monitor_cond" (or "monitor_cond_since", I'm not sure which exactly) request. ovn-controller is interested in these columns that do not exist in the current version of OVS, and there's not a way for ovn-controller to fall back to requesting monitoring based on an older schema.
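To confirm the mismatch on a given host, one option is to check whether the local database exposes the Datapath table at all. A minimal sketch, assuming `ovsdb-client list-tables` prints one table name per line in the first column; `has_table` is a hypothetical helper, not part of OVS:

```shell
#!/bin/sh
# Read a table listing on stdin and succeed if the table named in $1 is
# present in the first column. Header and separator lines never match a
# real table name, so they are skipped implicitly.
has_table() {
    awk -v t="$1" '$1 == t { found = 1 } END { exit !found }'
}

# Intended use on the hypervisor (socket path as seen in the log above):
#   ovsdb-client list-tables unix:/run/openvswitch/db.sock Open_vSwitch \
#       | has_table Datapath || echo "schema predates Datapath table"
```

The helper only parses text, so it can be tried against saved output from another host without a running ovsdb-server.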
Changing issue to MODIFIED. I was wrong with my reply to Dan earlier. It turns out the probing for capabilities relied on a database table that was added during the OVS 2.12 cycle. As such, this exposed the incompatibility with OVS 2.11. The offending commits have been reverted so as to provide a build. A longer-term solution will be created to allow for us to check datapath capabilities without relying on assumptions about the schema.
reproduce steps:

1. install openvswitch2.11-2.11.3-86.el8fdp and ovn-2021-21.06.0-17.el8fdp on client

2. install openvswitch2.15-2.15.0-26.el8fdp and ovn-2021-21.06.0-17.el8fdp on server

3. start ovn on server:

systemctl start openvswitch
systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
ovs-vsctl set open . external_ids:system-id=hv1 external_ids:ovn-remote=tcp:20.0.175.25:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=20.0.175.25
systemctl restart ovn-controller
ovn-nbctl ls-add ls1
ovn-nbctl lsp-add ls1 ls1p1
ovn-nbctl lsp-set-addresses ls1p1 "00:00:00:01:01:01 192.168.1.1"
ovn-nbctl lsp-add ls1 ls1p2
ovn-nbctl lsp-set-addresses ls1p2 "00:00:00:01:01:02 192.168.1.2"
ovs-vsctl add-port br-int ls1p1 -- set interface ls1p1 type=internal external_ids:iface-id=ls1p1
ip netns add ls1p1
ip link set ls1p1 netns ls1p1
ip netns exec ls1p1 ip link set ls1p1 address 00:00:00:01:01:01
ip netns exec ls1p1 ip link set ls1p1 up
ip netns exec ls1p1 ip addr add 192.168.1.1/24 dev ls1p1

4. start ovn on client

systemctl start openvswitch
systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
ovs-vsctl set open . external_ids:system-id=hv0 external_ids:ovn-remote=tcp:20.0.175.25:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=20.0.175.26
systemctl restart ovn-controller
ovs-vsctl set bridge br-int protocols=OpenFlow13,OpenFlow15
ovs-vsctl add-port br-int ls1p2 -- set interface ls1p2 type=internal external_ids:iface-id=ls1p2
ip netns add ls1p2
ip link set ls1p2 netns ls1p2
ip netns exec ls1p2 ip link set ls1p2 address 00:00:00:01:01:02
ip netns exec ls1p2 ip link set ls1p2 up
ip netns exec ls1p2 ip addr add 192.168.1.2/24 dev ls1p2

result on client:

+ ovs-vsctl set bridge br-int protocols=OpenFlow13,OpenFlow15
ovs-vsctl: no row "br-int" in table Bridge
+ ovs-vsctl add-port br-int ls1p2 -- set interface ls1p2 type=internal external_ids:iface-id=ls1p2
ovs-vsctl: no bridge named br-int
+ ip netns add ls1p2
+ ip link set ls1p2 netns ls1p2
Cannot find device "ls1p2"
+ ip netns exec ls1p2 ip link set ls1p2 address 00:00:00:01:01:02
Cannot find device "ls1p2"
+ ip netns exec ls1p2 ip link set ls1p2 up
Cannot find device "ls1p2"
+ ip netns exec ls1p2 ip addr add 192.168.1.2/24 dev ls1p2
Cannot find device "ls1p2"

[root@wsfd-advnetlab17 bz1992705]# grep WARN /var/log/ovn/ovn-controller.log
2021-09-01T08:04:24.020Z|00004|ovsdb_idl|WARN|Open_vSwitch database lacks Datapath table (database needs upgrade?)
2021-09-01T08:04:24.020Z|00005|ovsdb_idl|WARN|Open_vSwitch table in Open_vSwitch database lacks datapaths column (database needs upgrade?)
2021-09-01T08:04:24.020Z|00006|ovsdb_idl|WARN|Open_vSwitch database lacks Datapath table (database needs upgrade?)
2021-09-01T08:04:24.020Z|00007|ovsdb_idl|WARN|Open_vSwitch table in Open_vSwitch database lacks datapaths column (database needs upgrade?)
2021-09-01T08:04:24.024Z|00013|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"bridges\",\"datapaths\"]"}
2021-09-01T08:04:24.024Z|00014|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"bridges\",\"datapaths\"]"}
2021-09-01T08:04:24.026Z|00015|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"bridges\",\"datapaths\"]"}
2021-09-01T08:04:24.027Z|00016|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"bridges\",\"datapaths\"]"}
2021-09-01T08:04:24.027Z|00017|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"bridges\",\"datapaths\"]"}
2021-09-01T08:04:24.030Z|00020|rconn|WARN|unix:/run/openvswitch/br-int.mgmt: connection failed (No such file or directory)

[root@wsfd-advnetlab17 bz1992705]# rpm -qa | grep -E "openvswitch2.11|ovn-2021"
ovn-2021-21.06.0-17.el8fdp.x86_64
openvswitch2.11-2.11.3-86.el8fdp.x86_64
ovn-2021-central-21.06.0-17.el8fdp.x86_64
ovn-2021-host-21.06.0-17.el8fdp.x86_64

Verified on ovn-2021-21.06.0-24:

[root@wsfd-advnetlab17 bz1992705]# bash -x client.sh
+ systemctl start openvswitch
+ systemctl start ovn-northd
+ ovn-nbctl set-connection ptcp:6641
+ ovn-sbctl set-connection ptcp:6642
+ ovs-vsctl set open . external_ids:system-id=hv0 external_ids:ovn-remote=tcp:20.0.175.25:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=20.0.175.26
+ systemctl restart ovn-controller
+ ovs-vsctl set bridge br-int protocols=OpenFlow13,OpenFlow15
+ ovs-vsctl add-port br-int ls1p2 -- set interface ls1p2 type=internal external_ids:iface-id=ls1p2
+ ip netns add ls1p2
+ ip link set ls1p2 netns ls1p2
+ ip netns exec ls1p2 ip link set ls1p2 address 00:00:00:01:01:02
+ ip netns exec ls1p2 ip link set ls1p2 up
+ ip netns exec ls1p2 ip addr add 192.168.1.2/24 dev ls1p2

[root@wsfd-advnetlab17 bz1992705]# rpm -qa | grep -E "openvswitch2.11|ovn-2021"
ovn-2021-central-21.06.0-24.el8fdp.x86_64
openvswitch2.11-2.11.3-86.el8fdp.x86_64
ovn-2021-21.06.0-24.el8fdp.x86_64
ovn-2021-host-21.06.0-24.el8fdp.x86_64

[root@wsfd-advnetlab16 bz1992705]# ovn-sbctl show
Chassis hv1
    hostname: wsfd-advnetlab16.anl.lab.eng.bos.redhat.com
    Encap geneve
        ip: "20.0.175.25"
        options: {csum="true"}
    Port_Binding ls1p1
Chassis hv0
    hostname: wsfd-advnetlab17.anl.lab.eng.bos.redhat.com
    Encap geneve
        ip: "20.0.175.26"
        options: {csum="true"}
    Port_Binding ls1p2

[root@wsfd-advnetlab16 bz1992705]# ip netns exec ls1p1 ping 192.168.1.2 -c 1
PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data.
64 bytes from 192.168.1.2: icmp_seq=1 ttl=64 time=2.21 ms

--- 192.168.1.2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.205/2.205/2.205/0.000 ms
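The failing and verified runs differ in whether the ovsdb_idl schema warnings show up in ovn-controller.log, so a quick regression check could grep for those two messages. A rough sketch; `has_schema_mismatch` is a name invented here, and the patterns are simply the message fragments seen in the logs in this report:

```shell
#!/bin/sh
# Succeed if the log file in $1 contains either of the schema-mismatch
# messages seen in this report; fail otherwise.
has_schema_mismatch() {
    grep -q -e 'lacks Datapath table' \
            -e 'is not a valid column name' "$1"
}

# Intended use on the client (log path as used in the reproducer):
#   has_schema_mismatch /var/log/ovn/ovn-controller.log && echo "affected"
```

Because ovn-controller rate-limits these warnings, the check reports only presence and makes no attempt to count occurrences.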
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (ovn bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:3450
*** Bug 2016343 has been marked as a duplicate of this bug. ***
1518 packets transmitted, 1518 received, 0% packet loss, time 1547210ms

ovn-2021-21.09.0-5.el8fdp.x86_64