+++ This bug was initially created as a clone of Bug #1787360 +++

Description of problem:

If ovn-controller is woken up (e.g., a SB DB update needs to be processed), it will try to run the incremental processing engine. If the current SB DB transaction is still in progress, this will fail and trigger a full recompute. However, in some cases processing the changes incrementally doesn't require write operations to the SB DB. To fix this, the engine should try to run even when no SB DB txn is available.

Version-Release number of selected component (if applicable):

How reproducible:

Sometimes, based on how long it takes to process the MAC_Binding row update in ovsdb-server.

Steps to Reproduce:
1. Configure a logical switch attached to a logical router. Configure an IP subnet on the logical router port.
2. Enable debug traces in ovn-controller:
   ovn-appctl -t ovn-controller vlog/set DBG
3. Send a GARP from a VM attached to the logical switch.
4. This should not trigger a full recompute of the database, i.e., no "engine did not run, force recompute next time" log should be seen in ovn-controller.log.

Actual results:

If the transaction issued by ovn-controller is still in progress when the update is received back from ovsdb-server, ovn-controller will trigger a full recompute.

Expected results:

MAC_Binding updates should be processed incrementally even when SB DB txn is NULL.

Additional info:

Fixed by upstream commit:
https://github.com/ovn-org/ovn/commit/e2ab60e3a7c60f3adb8da40e4d1cfeb890d6f80e

reproduced on ovn2.11.1-24 with reproducer in https://bugzilla.redhat.com/show_bug.cgi?id=1787360#c3:

[root@dell-per740-12 bz1787360]# rm /var/log/openvswitch/ovn-controller.log -f
[root@dell-per740-12 bz1787360]#
[root@dell-per740-12 bz1787360]# bash -x rep.sh
+ systemctl restart openvswitch
+ systemctl restart ovn-northd
+ ovn-nbctl set-connection ptcp:6641
+ ovn-sbctl set-connection ptcp:6642
+ ovs-vsctl set open . external-ids:system_id=hv1 external-ids:ovn-remote=tcp:20.0.30.25:6642 external-ids:ovn-encap-type=geneve external-ids:ovn-encap-ip=20.0.30.25
+ systemctl restart ovn-controller
+ ovn-nbctl lr-add lr1
+ ovn-nbctl lrp-add lr1 lrp1 00:01:02:00:02:01 192.168.0.254/24 2001::a/64
+ ovn-nbctl ls-add ls1
+ ovn-nbctl lsp-add ls1 ls1-lr1
+ ovn-nbctl lsp-set-options ls1-lr1 router-port=lrp1
+ ovn-nbctl lsp-set-addresses ls1-lr1 '00:01:02:00:02:01 192.168.0.254 2001::a'
+ ovn-nbctl lsp-add ls1 lsp1
+ ovn-nbctl set Logical-Switch ls1 other_config:subnet=192.168.0.0/16
+ ovn-nbctl set Logical-switch ls1 other_config:ipv6_prefix=2001::0
+ ovn-nbctl lsp-set-addresses lsp1 '00:01:02:00:02:02 192.168.0.1 2001::1'
+ ovs-vsctl add-port br-int vm1 -- set interface vm1 type=internal
+ ip netns add server0
+ ip link set vm1 netns server0
+ ip netns exec server0 ip link set lo up
+ ip netns exec server0 ip link set vm1 up
+ ip netns exec server0 ip link set vm1 address 00:01:02:00:02:02
+ ip netns exec server0 ip addr add 192.168.0.1/24 dev vm1
+ ip netns exec server0 ip addr add 2001::1/64 dev vm1
+ ovs-vsctl set Interface vm1 external_ids:iface-id=lsp1
+ ovn-nbctl lsp-add ls1 lsp2
+ ovn-nbctl lsp-set-addresses lsp2 '00:01:02:00:02:03 192.168.0.2 2001::2'
+ ovs-vsctl add-port br-int vm2 -- set interface vm2 type=internal
+ ip netns add server1
+ ip link set vm2 netns server1
+ ip netns exec server1 ip link set lo up
+ ip netns exec server1 ip link set vm2 up
+ ip netns exec server1 ip link set vm2 address 00:01:02:00:02:03
+ ip netns exec server1 ip addr add 192.168.0.2/24 dev vm2
+ ip netns exec server1 ip addr add 2001::2/64 dev vm2
+ ovs-vsctl set Interface vm2 external_ids:iface-id=lsp2
+ ovs-appctl -t ovn-controller vlog/set DBG
+ ip netns exec server0 python garp.py
.
Sent 1 packets.
[root@dell-per740-12 bz1787360]# grep "engine did not run, force recompute next time" /var/log/openvswitch/ovn-controller.log -c
559    <=== several lines
[root@dell-per740-12 bz1787360]# rpm -qa | grep ovn
ovn2.11-central-2.11.1-24.el7fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-common-1.0-7.noarch
ovn2.11-2.11.1-24.el7fdp.x86_64
ovn2.11-host-2.11.1-24.el7fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-basic-1.0-22.noarch

Verified on ovn2.11.1-38:

[root@dell-per740-12 bz1787360]# grep "engine did not run, force recompute next time" /var/log/openvswitch/ovn-controller.log -c
1
[root@dell-per740-12 bz1787360]# rpm -qa | grep ovn
ovn2.11-host-2.11.1-38.el7fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-common-1.0-7.noarch
ovn2.11-central-2.11.1-38.el7fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-basic-1.0-22.noarch
ovn2.11-2.11.1-38.el7fdp.x86_64

also verified on rhel8 version:

[root@hp-dl380pg8-12 bz1787360]# grep "engine did not run, force recompute next time" /var/log/openvswitch/ovn-controller.log -c
1
[root@hp-dl380pg8-12 bz1787360]# rpm -qa | grep -E "openvswitch|ovn"
ovn2.11-central-2.11.1-38.el8fdp.x86_64
openvswitch-selinux-extra-policy-1.0-22.el8fdp.noarch
openvswitch2.11-2.11.0-50.el8fdp.x86_64
ovn2.11-2.11.1-38.el8fdp.x86_64
ovn2.11-host-2.11.1-38.el8fdp.x86_64
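
For anyone repeating the verification, the grep check used above could be wrapped in a small helper so that it returns a pass/fail status. This is only a sketch: the function names count_forced_recomputes and check_forced_recomputes are made up for illustration and are not part of OVN, and the default threshold of 1 is an assumption taken from the counts observed on the fixed builds above, not a documented limit.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with OVN: counts the forced-recompute
# message in an ovn-controller log and compares it against an allowed maximum.
MSG='engine did not run, force recompute next time'

count_forced_recomputes() {
    # $1: path to ovn-controller.log
    # -c prints the number of matching lines, -F matches MSG as a fixed string.
    # grep exits non-zero when the count is 0, so its status is ignored here.
    grep -cF -- "$MSG" "$1" || true
}

check_forced_recomputes() {
    # $1: log path
    # $2: maximum allowed count (assumed default: 1, as seen on the fixed builds)
    log=$1
    max=${2:-1}
    n=$(count_forced_recomputes "$log")
    echo "forced recomputes: $n (allowed: $max)"
    # Succeeds only if the count is within the allowed maximum.
    [ "$n" -le "$max" ]
}
```

After running the reproducer, something like `check_forced_recomputes /var/log/openvswitch/ovn-controller.log 1` would then succeed on a fixed build and fail on an affected one.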
All these bugs have been verified and have shipped in FDP 20.G or earlier.