Bug 1954659 - With OVN SB DB being busy, not all claims by ovn-controller are written to the DB
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: ovn2.13
Version: FDP 21.B
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: OVN Team
QA Contact: Jianlin Shi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-04-28 14:22 UTC by Jakub Libosvar
Modified: 2023-07-13 07:25 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:


Links
System: Red Hat Issue Tracker
ID: FD-1281
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2021-09-20 13:51:55 UTC

Description Jakub Libosvar 2021-04-28 14:22:01 UTC
Description of problem:
We have a flaky environment in which a VIP fails over between 3 VMs on different chassis. The events, in chronological order:

Node 1 claims the VIP:
2021-04-28T11:25:56.885Z|05691|pinctrl|INFO|Claiming virtual lport 560d0404-d880-436c-85a1-6d837a5ee656 for this chassis with the virtual parent 20e3c9f2-2146-48ec-aa64-670ed3e89878

The SB DB is updated:
record 33173: 2021-04-28 11:26:14.693
  table Port_Binding row 6cc4860e (6cc4860e):
    chassis=24e23da4-822b-49ff-b130-46825b732baf
    virtual_parent="20e3c9f2-2146-48ec-aa64-670ed3e89878"

Node 2 claims the VIP:
2021-04-28T11:27:57.488Z|03372|pinctrl|INFO|Claiming virtual lport 560d0404-d880-436c-85a1-6d837a5ee656 for this chassis with the virtual parent 73983cf1-3d3b-4660-ad15-95baffd9a4d3

Node 3 claims the VIP, and the VIP stays there:
2021-04-28T11:28:31.156Z|09316|pinctrl|INFO|Claiming virtual lport 560d0404-d880-436c-85a1-6d837a5ee656 for this chassis with the virtual parent 22ccfce2-5170-40aa-a29b-6ec0654edf4c

The SB DB is then updated to point to node 2:
record 33181: 2021-04-28 11:28:53.843
  table Port_Binding row 6cc4860e (6cc4860e):
    chassis=707f372a-5e47-41bb-9c80-fa25c609e739
    virtual_parent="73983cf1-3d3b-4660-ad15-95baffd9a4d3"

There are no further updates to the port binding, so it stays in this state. From this point on, traffic is delivered to node 2 even though the VIP is on node 3.
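
The timeline above can be cross-checked mechanically. The following is a minimal sketch, not part of OVN: it parses the three pinctrl log lines quoted in this report, compares the most recent claim with the last Port_Binding update, and computes how long each recorded update lagged behind its claim. It assumes the ovn-controller log and the SB DB records use the same time zone; the names `CLAIM_LOGS`, `SB_UPDATES`, and `parse_claim` are made up for illustration.

```python
import re
from datetime import datetime

# pinctrl log lines copied from this report (nodes 1, 2, 3).
CLAIM_LOGS = [
    "2021-04-28T11:25:56.885Z|05691|pinctrl|INFO|Claiming virtual lport 560d0404-d880-436c-85a1-6d837a5ee656 for this chassis with the virtual parent 20e3c9f2-2146-48ec-aa64-670ed3e89878",
    "2021-04-28T11:27:57.488Z|03372|pinctrl|INFO|Claiming virtual lport 560d0404-d880-436c-85a1-6d837a5ee656 for this chassis with the virtual parent 73983cf1-3d3b-4660-ad15-95baffd9a4d3",
    "2021-04-28T11:28:31.156Z|09316|pinctrl|INFO|Claiming virtual lport 560d0404-d880-436c-85a1-6d837a5ee656 for this chassis with the virtual parent 22ccfce2-5170-40aa-a29b-6ec0654edf4c",
]

# (timestamp, virtual_parent) of the two Port_Binding updates in this report.
SB_UPDATES = [
    ("2021-04-28 11:26:14.693", "20e3c9f2-2146-48ec-aa64-670ed3e89878"),
    ("2021-04-28 11:28:53.843", "73983cf1-3d3b-4660-ad15-95baffd9a4d3"),
]

CLAIM_RE = re.compile(
    r"^(?P<ts>\S+?)Z\|\d+\|pinctrl\|INFO\|Claiming virtual lport (?P<lport>\S+) "
    r"for this chassis with the virtual parent (?P<parent>\S+)$"
)


def parse_claim(line):
    """Return (timestamp, lport, virtual_parent) for one claim log line."""
    m = CLAIM_RE.match(line)
    if m is None:
        raise ValueError(f"not a claim log line: {line!r}")
    ts = datetime.strptime(m.group("ts"), "%Y-%m-%dT%H:%M:%S.%f")
    return ts, m.group("lport"), m.group("parent")


# Order claims by time; the last one is where the VIP should end up.
claims = sorted(parse_claim(line) for line in CLAIM_LOGS)
claim_time = {parent: ts for ts, _lport, parent in claims}

# Seconds between each claim and the SB DB update that recorded it.
delays = {}
for sb_ts, parent in SB_UPDATES:
    written = datetime.strptime(sb_ts, "%Y-%m-%d %H:%M:%S.%f")
    delays[parent] = (written - claim_time[parent]).total_seconds()

last_parent = claims[-1][2]
sb_parent = SB_UPDATES[-1][1]
# Claims that never appear in the SB DB records.
unwritten = [p for _ts, _lport, p in claims if p not in delays]
stale = sb_parent != last_parent

print("last claim:", last_parent)
print("SB DB virtual_parent:", sb_parent)
print("stale binding:", stale)
print("claims never written:", unwritten)
```

Under the same-time-zone assumption, the second update is recorded close to a minute after node 2's claim, which is later than node 3's claim at 11:28:31, and node 3's claim has no matching update at all.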

Version-Release number of selected component (if applicable):
ovn2.13-20.12.0-104.el8fdp.x86_64

How reproducible:
Not sure

Steps to Reproduce:
1. Have a busy SB DB
2. Migrate the VIP between the nodes
3. Observe the virtual parent port of the VIP and check whether it is on the correct node
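
The check in step 3 can be scripted against SB DB transaction records in the layout quoted in the description. This is a minimal sketch, assuming the records keep exactly that layout; `latest_rows` and `vip_is_stale` are hypothetical helper names, not part of OVN, and the expected parent is the one from node 3's claim in this report.

```python
import re

# The two Port_Binding records quoted in this report.
SB_LOG = """\
record 33173: 2021-04-28 11:26:14.693
  table Port_Binding row 6cc4860e (6cc4860e):
    chassis=24e23da4-822b-49ff-b130-46825b732baf
    virtual_parent="20e3c9f2-2146-48ec-aa64-670ed3e89878"
record 33181: 2021-04-28 11:28:53.843
  table Port_Binding row 6cc4860e (6cc4860e):
    chassis=707f372a-5e47-41bb-9c80-fa25c609e739
    virtual_parent="73983cf1-3d3b-4660-ad15-95baffd9a4d3"
"""

RECORD_RE = re.compile(r"^record (\d+): (.+)$")
ROW_RE = re.compile(r"^\s+table (\w+) row (\w+)")
FIELD_RE = re.compile(r"^\s+(\w+)=(.*)$")


def latest_rows(text):
    """Replay the records and return {(table, row): {column: value}}.

    Later records override the columns set by earlier ones.
    """
    state = {}
    current = None
    for line in text.splitlines():
        if RECORD_RE.match(line):
            current = None  # a new record starts; wait for its table line
            continue
        m = ROW_RE.match(line)
        if m:
            current = state.setdefault((m.group(1), m.group(2)), {})
            continue
        m = FIELD_RE.match(line)
        if m and current is not None:
            current[m.group(1)] = m.group(2).strip('"')
    return state


def vip_is_stale(text, row, expected_parent):
    """True if the row's final virtual_parent differs from the expected one."""
    binding = latest_rows(text)[("Port_Binding", row)]
    return binding.get("virtual_parent") != expected_parent


# Node 3 made the last claim, so its parent port is the expected value.
EXPECTED_PARENT = "22ccfce2-5170-40aa-a29b-6ec0654edf4c"
print(vip_is_stale(SB_LOG, "6cc4860e", EXPECTED_PARENT))
```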

Actual results:


Expected results:


Additional info:
NB and SB DBs are attached

