Bug 2158626

Summary: Provider patch port is re-created on restart
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Jakub Libosvar <jlibosva>
Component: ovn-2021
Assignee: OVN Team <ovnteam>
Status: CLOSED WONTFIX
QA Contact: Jianlin Shi <jishi>
Severity: medium
Priority: high
Version: FDP 21.A
CC: ctrautma, dceara, jamsmith, jiji, jveiraca, kgilliga, mmichels, sathlang, twilson
Target Milestone: ---
Flags: kgilliga: needinfo? (sathlang)
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Story Points: ---
Last Closed: 2023-07-28 18:01:38 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---
Cloudforms Team: ---
Bug Blocks: 2154344

Description Jakub Libosvar 2023-01-05 21:33:43 UTC
Description of problem:
If a chassis has a bound port that uses a provider network, a patch port is created between the provider bridge and the integration bridge. This patch port has an ofport number on br-int. When conditional monitoring is on (ovn-monitor-all is not set or is set to false), the patch port is re-created on every ovn-controller start, which changes its ofport number on the integration bridge. In combination with deferred flow installation (the ovn-ofctrl-wait-before-clear parameter), this disrupts the data plane of the provider network: the old flows still refer to the old ofport number until the wait time runs out and new flows are installed. Until then, traffic doesn't flow between the provider bridge and the integration bridge.
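
The failure mode described above can be illustrated with a toy model. This is only a sketch and not OVN or OVS code: the Flow class, the forward() helper, the port numbers 5 and 6, and the action name are all hypothetical. It shows that a flow matching on the old port number stops matching once the patch port comes back with a different number on br-int.

```python
# Toy model only: none of these names exist in OVN or OVS.
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    in_port: int   # port number on br-int that the flow matches on
    action: str    # placeholder for whatever the flow would do


def forward(flows, ingress_port):
    """Return the action of the first flow matching the ingress port, else 'drop'."""
    for flow in flows:
        if flow.in_port == ingress_port:
            return flow.action
    return "drop"


# Flows installed before the restart refer to the patch port as port 5 (hypothetical).
old_flows = [Flow(in_port=5, action="to-logical-pipeline")]

# After the restart the patch port is re-created with a new number (hypothetical).
new_port = 6

# While flow installation is deferred, only the old flows are present,
# so traffic arriving on the re-created port matches nothing.
stale_result = forward(old_flows, new_port)

# Once the wait expires, flows referring to the new port number are installed.
new_flows = [Flow(in_port=new_port, action="to-logical-pipeline")]
fresh_result = forward(new_flows, new_port)

print(stale_result, fresh_result)
```

The model ignores everything except the in_port match, which is the part that matters for this bug.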

Version-Release number of selected component (if applicable):
ovn-2021-21.12.0-93.el8fdp.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Have a patch port between br-int and provider bridge
2. Unset ovn-monitor-all and set ovn-ofctrl-wait-before-clear to some number (e.g. 60000)
3. Start traffic to the bound port
4. Restart ovn-controller
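
The steps above could be scripted roughly as follows. This is a sketch under assumptions: the helper names are made up for illustration, the patch port interface name has to be supplied by the caller, and how ovn-controller is restarted depends on the deployment, so the restart is passed in as a callable. The ovs-vsctl invocations are the usual database commands (remove, set, get).

```python
import subprocess


def config_commands(wait_ms=60000):
    """ovs-vsctl commands for step 2: unset monitor-all, set the flow clear delay."""
    return [
        ["ovs-vsctl", "remove", "Open_vSwitch", ".",
         "external_ids", "ovn-monitor-all"],
        ["ovs-vsctl", "set", "Open_vSwitch", ".",
         f"external_ids:ovn-ofctrl-wait-before-clear={wait_ms}"],
    ]


def get_ofport(interface, run=subprocess.check_output):
    """Read the ofport of an interface; 'run' is injectable for testing."""
    out = run(["ovs-vsctl", "get", "Interface", interface, "ofport"])
    return int(out.decode().strip())


def ofport_changed(interface, restart, run=subprocess.check_output):
    """Return True if the ofport differs before and after calling restart().

    'restart' is a caller-supplied callable (deployment specific) that should
    return only once ovn-controller is running again.
    """
    before = get_ofport(interface, run)
    restart()
    after = get_ofport(interface, run)
    return before != after
```

On an affected chassis, ofport_changed() would be expected to return True for the patch port on br-int; with the expected behaviour it would return False.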

Actual results:
Traffic doesn't flow between the patch port and br-int

Expected results:
The ofport on br-int is not changed

Additional info:
The patch port is not re-created when ovn-monitor-all is set to true. Note that monitor-all=true has been the default in OSP since 16.2.2 and the ovn-ofctrl-wait-before-clear parameter was introduced in OSP 16.2.4, so the reproducer is specific to updates from versions in which monitor-all wasn't set.
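
A quick way to reason about whether a chassis falls into this combination is sketched below. The function name is hypothetical, the input is assumed to be the Open_vSwitch external_ids already parsed into a dict of strings, and treating a missing or zero ovn-ofctrl-wait-before-clear as "no deferral" is an assumption of this sketch.

```python
def disruption_expected(external_ids):
    """Return True for the combination described in this bug.

    external_ids: dict of Open_vSwitch external_ids, values as strings.
    Assumption: an absent or zero wait value means flows are not deferred.
    """
    monitor_all = external_ids.get("ovn-monitor-all", "false").strip().lower() == "true"
    wait_ms = int(external_ids.get("ovn-ofctrl-wait-before-clear", "0"))
    # The patch port is re-created only with conditional monitoring
    # (monitor-all not true); the outage needs deferred flow installation too.
    return (not monitor_all) and wait_ms > 0


print(disruption_expected({"ovn-ofctrl-wait-before-clear": "60000"}))
```

This only classifies configuration; it does not detect whether the patch port was actually re-created.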

Comment 1 Dumitru Ceara 2023-01-09 10:20:31 UTC
I think this is addressed by upstream commit https://github.com/ovn-org/ovn/commit/2b27eb3482a33136581796389d331d3be496a643

This is present on branch-22.09 and available downstream in all ovn22.09 builds.

Comment 2 Jakub Libosvar 2023-01-09 16:44:50 UTC
Is it possible to backport the patch to ovn-2021 that's used in OSP 16.2?

Comment 3 Dumitru Ceara 2023-01-10 11:34:02 UTC
This is the series of commits that would need to be backported:

https://github.com/ovn-org/ovn/commit/46135d00233c91ea17764307896c7494daa03b4c
https://github.com/ovn-org/ovn/commit/50b3af8938c93491d429dcabe8f9902f0aa43426
https://github.com/ovn-org/ovn/commit/5d733bbe7b4f0750b3110e6073106dfdf4d9d5e4
https://github.com/ovn-org/ovn/commit/2b27eb3482a33136581796389d331d3be496a643

They don't apply cleanly to ovn-2021 because there were changes done to the binding I-P code in the meantime.  Backporting these seems risky.  Is it an option for OSP 16 to move to OVN 22.x instead?

Thanks,
Dumitru

Comment 4 Jakub Libosvar 2023-01-10 15:13:33 UTC
(In reply to Dumitru Ceara from comment #3)

No, I don't think we'd get an approval from release delivery to change the major OVN version in OSP.

Comment 5 Jakub Libosvar 2023-01-10 15:15:00 UTC
(In reply to Jakub Libosvar from comment #4)

However, we have enabled monitor-all by default since OSP 16.2.2. If we change our update process to run puppet earlier, before ovn-controller is updated, we can work around this bug.

Comment 18 Mark Michelson 2023-07-28 18:01:38 UTC
Closing this since backporting the fixes was determined to be too risky. The issue is also now documented in OSP.