Bug 2224199
| Summary: | ovn-controller replace CT zone UUID names with LR/LS names | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux Fast Datapath | Reporter: | Surya Seetharaman <surya> |
| Component: | ovn23.09 | Assignee: | Ales Musil <amusil> |
| Status: | MODIFIED --- | QA Contact: | Jianlin Shi <jishi> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | FDP 23.A | CC: | amusil, ctrautma, dcbw, dceara, jiji, mmichels, ovn-bot |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | ovn23.09-23.09.0-alpha.102.el9fdp | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Surya Seetharaman
2023-07-20 07:35:00 UTC
FWIW, I think the title is misleading. There's nothing blocking the host from using *any* CT zone it wants. IMO ovn-controller should flush all zones it uses. There is one special case, when ovn-controller shares a zone with the host. That's normally when LR.options:snat-ct-zone=<ZONE> is set in the NB. Only in that case it *might* be acceptable to not flush the zone, if the CMS explicitly requests that.

Summarizing what we discussed offline:

The real underlying problem is that ovn-controller was flushing CT zone 0 during an ovn-kubernetes migration from non-IC to IC deployments. That's because the SB database was being reconstructed and the datapath associated to the gateway router that had snat-ct-zone=0 set was changing UUID; the ovn-controller mechanism that avoids flushing zones that were already in use by OVN matches on datapath UUIDs and not names so ovn-controller was incorrectly assuming that the logical datapath had changed.

It's not desirable to add more configuration knobs for avoiding CT zone flush (those would have to be per switch/router and would over-complicate the code); instead we can change ovn-controller to try matching both on UUID and on switch/router name when mapping required CT zones to already existing ones.

Once ovn-kubernetes upgrades to a version of OVN that supports both mappings no flush will happen anymore when the SB UUID changes unless the SB datapath name changes too. That should properly fix any traffic disruption issues caused by conntrack flush when upgrading from non-IC to IC deployments.
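The dual lookup described above can be illustrated with a minimal Python sketch. This is not the ovn-controller code (which is written in C) and not the upstream patch; the `Datapath` class, the `<id>_<suffix>` key format, and the helper names are hypothetical and exist only to show why recording a zone under both the UUID and the name avoids a flush after the SB database is rebuilt.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Datapath:
    """Hypothetical stand-in for an SB datapath."""
    uuid: str  # changes when the SB database is reconstructed
    name: str  # logical switch/router name, stable across a rebuild


def record_zone(dp: Datapath, suffix: str, zone: int, known: Dict[str, int]) -> None:
    """Remember a zone under both the UUID-based and the name-based key."""
    known[f"{dp.uuid}_{suffix}"] = zone
    known[f"{dp.name}_{suffix}"] = zone


def find_existing_zone(dp: Datapath, suffix: str, known: Dict[str, int]) -> Optional[int]:
    """Look up an already assigned zone, trying the UUID key first, then the name key."""
    for key in (f"{dp.uuid}_{suffix}", f"{dp.name}_{suffix}"):
        if key in known:
            return known[key]
    return None


def needs_flush(dp: Datapath, suffix: str, zone: int, known: Dict[str, int]) -> bool:
    """A zone is flushed only if it was not already mapped to this datapath."""
    return find_existing_zone(dp, suffix, known) != zone


# Before the rebuild: the gateway router uses zone 0 for SNAT.
known: Dict[str, int] = {}
record_zone(Datapath("uuid-old", "GR_node1"), "snat", 0, known)

# After the rebuild the UUID differs but the name is unchanged, so no flush.
same_router = Datapath("uuid-new", "GR_node1")
print(needs_flush(same_router, "snat", 0, known))

# With a UUID-only mapping (the old behaviour) the same router looks new.
uuid_only = {"uuid-old_snat": 0}
print(needs_flush(same_router, "snat", 0, uuid_only))
```

In this sketch a flush is still requested when both the UUID and the name change, which matches the behaviour described in the comment above.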
ovn23.09 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2227121

ovn23.06 fast-datapath-rhel-8 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2227122

ovn23.06 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2227123

Upstream patch applied: http://patchwork.ozlabs.org/project/ovn/patch/20230726124239.66275-1-amusil@redhat.com/

*** Bug 2227121 has been marked as a duplicate of this bug. ***