Bug 2003719

Summary: OVN controller constantly reporting transaction violation for MAC_Binding table
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Maysa Macedo <mdemaced>
Component: ovn-2021
Assignee: OVN Team <ovnteam>
Status: MODIFIED ---
QA Contact: Ehsan Elahi <eelahi>
Severity: high
Docs Contact:
Priority: medium
Version: FDP 21.G
CC: apevec, ctrautma, echaudro, fyanac, jiji, jishi, jlibosva, lhh, lmartins, majopela, mdulko, mmichels, scohen
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ovn-2021-21.12.0-94.el9fdp
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2002121    

Description Maysa Macedo 2021-09-13 14:13:57 UTC
Description of problem:

After running several OpenShift installations in parallel, with and without Kuryr as the SDN, we noticed that first networks could not be created, and then that router external gateways could not be set. The ovn-controller logs showed the following warning being triggered constantly:

2021-09-10T14:21:46.314Z|104284|ovsdb_idl|WARN|transaction error: {"details":"Transaction causes multiple rows in \"MAC_Binding\" table to have identical values (lrp-5929ab56-6070-46c7-9700-bc9fb91e636e and \"38.x.x.x\") for index on columns \"logical_port\" and \"ip\".  First row, with UUID c20de7f7-b5ad-4a1d-bfc0-864eba110aeb, existed in the database before this transaction and was not modified by the transaction.  Second row, with UUID ce8406ab-f425-411c-a39d-a377f71460ed, was inserted by this transaction.","error":"constraint violation"}
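
For triage, the warning above can be broken down into the table, the duplicated index values, and the two row UUIDs. The following is only a minimal sketch: the function name `parse_constraint_violation` is hypothetical, and the regular expressions assume the message wording and the two-column index seen in this one log line, so other violations may not match.

```python
import json
import re

# Sample warning copied from this report (split across literals for readability).
LOG_LINE = (
    r'2021-09-10T14:21:46.314Z|104284|ovsdb_idl|WARN|transaction error: '
    r'{"details":"Transaction causes multiple rows in \"MAC_Binding\" table '
    r'to have identical values (lrp-5929ab56-6070-46c7-9700-bc9fb91e636e and '
    r'\"38.x.x.x\") for index on columns \"logical_port\" and \"ip\".  '
    r'First row, with UUID c20de7f7-b5ad-4a1d-bfc0-864eba110aeb, existed in '
    r'the database before this transaction and was not modified by the '
    r'transaction.  Second row, with UUID ce8406ab-f425-411c-a39d-a377f71460ed, '
    r'was inserted by this transaction.","error":"constraint violation"}'
)


def parse_constraint_violation(line):
    """Hypothetical helper: return the fields of a violation, or None."""
    # Log layout assumed from the sample: timestamp|seq|module|level|message
    parts = line.split("|", 4)
    if len(parts) != 5 or parts[2] != "ovsdb_idl":
        return None
    prefix = "transaction error: "
    if not parts[4].startswith(prefix):
        return None
    payload = json.loads(parts[4][len(prefix):])
    if payload.get("error") != "constraint violation":
        return None
    details = payload["details"]
    # Patterns below are assumptions based on the wording of this one message.
    table = re.search(r'in "([^"]+)" table', details)
    values = re.search(r"identical values \((.*?) and (.*?)\) for index", details)
    uuids = re.findall(r"with UUID ([0-9a-f-]{36})", details)
    if not (table and values and len(uuids) == 2):
        return None
    return {
        "table": table.group(1),
        "logical_port": values.group(1).strip('"'),
        "ip": values.group(2).strip('"'),
        "existing_uuid": uuids[0],   # row already in the database
        "inserted_uuid": uuids[1],   # row the failed transaction tried to add
    }


result = parse_constraint_violation(LOG_LINE)
print(result)
```

Grouping the parsed results by `(logical_port, ip)` across a whole log file would show whether the same binding is rejected repeatedly or whether many different bindings are affected.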

We tried setting neutron_sync_mode=repair to fix it, but the problem eventually recurred.

A mailing-list discussion [0] reports similar issues, but since the workaround [1] is already in place (we were using TripleO master), I assume these warnings might be caused by some other corner case.

[0] https://mail.openvswitch.org/pipermail/ovs-discuss/2018-October/047604.html
[1] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py#L500-L518


Version-Release number of selected component (if applicable):
[stack@mecha-central tmp]$ sudo podman exec -it ovn_controller sh
sh-4.4# rpm -qa|grep ovn
rdo-ovn-2021-2.el8.noarch
ovn-2021-host-21.03.0-40.el8s.x86_64
ovn-2021-21.03.0-40.el8s.x86_64
rdo-ovn-host-2021-2.el8.noarch

sh-4.4# rpm -qa|grep openvswitch
network-scripts-openvswitch2.15-2.15.0-24.el8s.x86_64
python3-openvswitch2.15-2.15.0-24.el8s.x86_64
rdo-openvswitch-2.15-2.el8.noarch
openvswitch-selinux-extra-policy-1.0-28.el8.noarch
openvswitch2.15-2.15.0-24.el8s.x86_64
rdo-network-scripts-openvswitch-2.15-2.el8.noarch
python3-rdo-openvswitch-2.15-2.el8.noarch


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info: