Bug 2196286 - tug-of-war between ovn-controllers for external gateway port causes havoc for ml2-ovn
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: ovn-2021
Version: FDP 20.H
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Ales Musil
QA Contact: Ehsan Elahi
URL:
Whiteboard:
Duplicates: 2189267
Depends On: 1974898
Blocks: 1728282 1994427 2081631 2189267
Reported: 2023-05-08 15:54 UTC by Ihar Hrachyshka
Modified: 2024-07-15 16:28 UTC (History)
CC List: 15 users

Fixed In Version: ovn-2021-21.12.0-136.el8fdp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1974898
Environment:
Last Closed: 2023-11-30 00:16:18 UTC
Target Upstream Version:
Embargoed:



Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   FD-2842         0        None      None    None     2023-05-08 15:57:19 UTC
Red Hat Product Errata  RHBA-2023:7591  0        None      None    None     2023-11-30 00:16:34 UTC

Comment 5 Mark Michelson 2023-07-28 18:26:28 UTC
*** Bug 2189267 has been marked as a duplicate of this bug. ***

Comment 6 Ales Musil 2023-09-15 05:32:20 UTC
Backport posted: https://patchwork.ozlabs.org/project/ovn/list/?series=372659

Comment 9 Jianlin Shi 2023-10-25 02:24:40 UTC
Hi Ales,

The link to the patch in comment 6 has expired. Which patch fixes the issue? Is there a reproducer?

Comment 10 Jianlin Shi 2023-10-26 07:43:33 UTC
Thanks to Dumitru's guidance, I found the patches: https://patchwork.ozlabs.org/project/ovn/list/?series=372659&state=*

Comment 11 Ehsan Elahi 2023-10-27 11:50:29 UTC
Reproduced on: 
[root@~ bz2196286]# rpm -qa | grep -E 'ovn|openvswitch'
openvswitch-selinux-extra-policy-1.0-31.el8fdp.noarch
openvswitch2.17-2.17.0-50.el8fdp.x86_64
ovn-2021-21.12.0-103.el8fdp.x86_64
ovn-2021-central-21.12.0-103.el8fdp.x86_64
ovn-2021-host-21.12.0-103.el8fdp.x86_64

Here is the reproducer:
####### HV1 ########
systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
systemctl start openvswitch
ovs-vsctl set open . external_ids:system-id=hv1

ovs-vsctl set open . external_ids:ovn-remote=tcp:192.168.20.1:6642
ovs-vsctl set open . external_ids:ovn-encap-type=geneve
ovs-vsctl set open . external_ids:ovn-encap-ip=192.168.20.1
ovs-vsctl set open . external_ids:ovn-monitor-all=true
systemctl start ovn-controller

ovn-nbctl ls-add ls
ovn-nbctl lsp-add ls lsp0
ovn-nbctl lsp-set-addresses lsp0 "00:00:00:00:20:10 192.168.20.10"

ip netns add vm0
ovs-vsctl add-port br-int vm0 -- set interface vm0 type=internal
ip netns exec vm0 ip link set lo up
ip link set vm0 netns vm0
ip netns exec vm0 ip link set vm0 address 00:00:00:00:20:10
ip netns exec vm0 ip link set vm0 up
ip netns exec vm0 ip addr add 192.168.20.10/24 dev vm0
ovs-vsctl set interface vm0 external_ids:iface-id=lsp0

############ HV2 ###############
systemctl start ovn-northd
systemctl start openvswitch
ovs-vsctl set open . external_ids:system-id=hv0

ovs-vsctl set open . external_ids:ovn-remote=tcp:192.168.20.1:6642
ovs-vsctl set open . external_ids:ovn-encap-type=geneve
ovs-vsctl set open . external_ids:ovn-encap-ip=192.168.20.2
ovs-vsctl set open . external_ids:ovn-monitor-all=true
systemctl start ovn-controller

ip netns add vm0
ovs-vsctl add-port br-int vm0 -- set interface vm0 type=internal
ip netns exec vm0 ip link set lo up
ip link set vm0 netns vm0
ip netns exec vm0 ip link set vm0 address 00:00:00:00:20:10
ip netns exec vm0 ip link set vm0 up
ip netns exec vm0 ip addr add 192.168.20.10/24 dev vm0
ovs-vsctl set interface vm0 external_ids:iface-id=lsp0

########## Results on unfixed release ##################
[root@~ bz2196286]# grep -c 'Claiming\|Changing chassis' /var/log/ovn/ovn-controller.log && sleep 3 && grep -c 'Claiming\|Changing chassis' /var/log/ovn/ovn-controller.log
30460
84534

<<< ############ Within 3 seconds, the log gains 54074 port-claiming and chassis-changing lines (84534 - 30460) ########
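
The before/after measurement above can be wrapped in a small helper. This is only a sketch: the function names `count_claims` and `claim_delta` are hypothetical and not part of the original reproducer, and the pattern is the same one used in the grep commands above, written with -E.

```shell
# Hypothetical helpers, not part of the original reproducer.

# Print the number of "Claiming" / "Changing chassis" lines in log file $1.
count_claims() {
    # grep -c prints the match count; it exits 1 when the count is 0,
    # so "|| true" keeps the function usable under "set -e".
    grep -c -E 'Claiming|Changing chassis' "$1" || true
}

# Print how many matching lines were appended to log file $1
# during a window of $2 seconds (default: 3, as in the test above).
claim_delta() {
    log=$1
    window=${2:-3}
    before=$(count_claims "$log")
    sleep "$window"
    after=$(count_claims "$log")
    echo $((after - before))
}
```

For example, `claim_delta /var/log/ovn/ovn-controller.log 3` would print the difference between the two counts directly instead of the two raw counts.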


Verified on:
[root@~ bz2196286]# rpm -qa | grep -E 'ovn|openvswitch'
openvswitch-selinux-extra-policy-1.0-31.el8fdp.noarch
openvswitch2.17-2.17.0-134.el8fdp.x86_64
ovn-2021-21.12.0-137.el8fdp.x86_64
ovn-2021-central-21.12.0-137.el8fdp.x86_64
ovn-2021-host-21.12.0-137.el8fdp.x86_64

########## Results on fixed release ##################
[root@~ bz2196286]# grep -c 'Claiming\|Changing chassis' /var/log/ovn/ovn-controller.log && sleep 3 && grep -c 'Claiming\|Changing chassis' /var/log/ovn/ovn-controller.log
1874
1886

<<< ############ Within 3 seconds, the log gains only 12 lines (1886 - 1874): fewer than 10 port-claiming and fewer than 10 chassis-changing entries ########

<<======== Last few lines of ovn-controller.log on HV1. They show that the port is now claimed/changed only once every 0.5 seconds.

2023-10-27T10:21:45.133Z|02023|binding|INFO|Changing chassis for lport lsp0 from hv0 to hv1.
2023-10-27T10:21:45.133Z|02024|binding|INFO|lsp0: Claiming 00:00:00:00:20:10 192.168.20.10
2023-10-27T10:21:45.633Z|02027|binding|INFO|Changing chassis for lport lsp0 from hv0 to hv1.
2023-10-27T10:21:45.633Z|02028|binding|INFO|lsp0: Claiming 00:00:00:00:20:10 192.168.20.10
2023-10-27T10:21:46.134Z|02030|binding|INFO|Changing chassis for lport lsp0 from hv0 to hv1.
2023-10-27T10:21:46.134Z|02031|binding|INFO|lsp0: Claiming 00:00:00:00:20:10 192.168.20.10
2023-10-27T10:21:46.634Z|02033|binding|INFO|Changing chassis for lport lsp0 from hv0 to hv1.
2023-10-27T10:21:46.634Z|02034|binding|INFO|lsp0: Claiming 00:00:00:00:20:10 192.168.20.10
2023-10-27T10:21:47.134Z|02036|binding|INFO|Changing chassis for lport lsp0 from hv0 to hv1.
2023-10-27T10:21:47.134Z|02037|binding|INFO|lsp0: Claiming 00:00:00:00:20:10 192.168.20.10
2023-10-27T10:21:47.635Z|02039|binding|INFO|Changing chassis for lport lsp0 from hv0 to hv1.
2023-10-27T10:21:47.635Z|02040|binding|INFO|lsp0: Claiming 00:00:00:00:20:10 192.168.20.10
2023-10-27T10:21:48.136Z|02041|binding|INFO|Changing chassis for lport lsp0 from hv0 to hv1.
2023-10-27T10:21:48.136Z|02042|binding|INFO|lsp0: Claiming 00:00:00:00:20:10 192.168.20.10
2023-10-27T10:21:48.637Z|02043|binding|INFO|Changing chassis for lport lsp0 from hv0 to hv1.
2023-10-27T10:21:48.637Z|02044|binding|INFO|lsp0: Claiming 00:00:00:00:20:10 192.168.20.10
2023-10-27T10:21:49.138Z|02045|binding|INFO|Changing chassis for lport lsp0 from hv0 to hv1.
2023-10-27T10:21:49.138Z|02046|binding|INFO|lsp0: Claiming 00:00:00:00:20:10 192.168.20.10
2023-10-27T10:21:49.638Z|02047|binding|INFO|Changing chassis for lport lsp0 from hv0 to hv1.
2023-10-27T10:21:49.638Z|02048|binding|INFO|lsp0: Claiming 00:00:00:00:20:10 192.168.20.10
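
The 0.5-second spacing can also be checked mechanically from the timestamps. The sketch below is an illustration only: `claim_intervals` is a hypothetical name, it assumes the `timestamp|seq|module|level|message` layout seen in the log lines above, and it ignores rollover at midnight.

```shell
# Hypothetical helper: print the gap in milliseconds between successive
# "<port>: Claiming" lines of an ovn-controller log.
#   $1 = log file, $2 = logical port name (for example lsp0)
claim_intervals() {
    awk -F'|' -v port="$2" '
        $3 == "binding" && index($5, port ": Claiming") == 1 {
            # $1 looks like 2023-10-27T10:21:45.133Z
            split($1, dt, "T")
            sub(/Z$/, "", dt[2])
            split(dt[2], hms, ":")
            # Milliseconds since midnight, rounded to the nearest integer.
            ms = int((hms[1] * 3600 + hms[2] * 60 + hms[3]) * 1000 + 0.5)
            if (seen) print ms - prev
            prev = ms
            seen = 1
        }
    ' "$1"
}
```

Running `claim_intervals /var/log/ovn/ovn-controller.log lsp0` on a fixed build should print values close to 500, matching the spacing visible in the tail above.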

Comment 13 errata-xmlrpc 2023-11-30 00:16:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn-2021 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:7591

