Bug 2314429
Summary: | [ML2/OVN] ovn_emit_need_to_frag should be enabled by default | |||
---|---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Lucas Alvares Gomes <lmartins> | |
Component: | openstack-neutron | Assignee: | OSP Team <rhos-maint> | |
Status: | CLOSED MIGRATED | QA Contact: | Fiorella Yanac <fyanac> | |
Severity: | urgent | Docs Contact: | ||
Priority: | high | |||
Version: | 17.1 (Wallaby) | CC: | bcafarel, bmv, chrisw, gregraka, ihrachys, mburns, mtomaska, scohen, ykarel | |
Target Milestone: | async | Keywords: | Triaged | |
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Known Issue | ||
Doc Text: |
Currently, when MTUs mismatch, the communicating peers are unaware of the discrepancy, and the Networking service (neutron) can silently drop the packets. OVN is the cause of this problem because it fails to emit the message, `ICMP Fragmentation Needed`. *Workaround:* the preferred method is to adjust the MTU value to prevent packets that are too large from being transmitted. An alternative method is to set the `OVNEmitNeedToFrag` option in the tripleo templates. For more information, see the Knowledgebase solution, link:https://access.redhat.com/solutions/7092922[Neutron ML2/OVN packet fragmentation problems].
|
Story Points: | --- | |
Clone Of: | ||||
: | 2322938 (view as bug list) | Environment: | ||
Last Closed: | 2025-01-10 09:51:38 UTC | Type: | Bug | |
Bug Blocks: | 2322938 |
Description
Lucas Alvares Gomes
2024-09-24 13:03:19 UTC
@Lucas, so what's the implication of the TRAC discussion? Let me write down what I think we should do, but please confirm I am not making things up.

1. Since the blocker is not accepted, and changing the default in 17.x may have unintended consequences, we are not going to backport the patch to 17.x. (There is also a kernel version concern, though I don't think it's valid.)
2. Instead, for 17.x, we are going to document the option in a configuration guide.
3. For 18.x, we are going to set the option in neutron-operator in the generated config. We are not going to backport the neutron patch.
4. We are not going to backport it upstream either (it's not backportable, as per policy).

---

That said, shouldn't the argument from (1) about unintended consequences apply to (3) too, then? We have released 18 GA. Is it OK to change behavior in the 18.x line? Or is the 18.x case somehow different from 17.x? Is it because of the age of 17.x?

(In reply to Ihar Hrachyshka from comment #4)
Whoever gets to change this default should check the kernel version delivered by our product to see if this is supported. The kernel change was merged upstream in 2019, so I think we are safe, but it needs to be double-checked anyway. AFAIK there's no problem with having this option enabled as long as the kernel supports it.

For point 2, I think we should document this in the migration guide, since ML2/OVS and ML2/OVN behave differently for this specific feature.

I agree with 3 and 4. I don't think we can backport a change of default upstream, so we need to change it in OSP directly.

For documentation purposes, the kernel patch that is required for this feature to work is:

```
commit 4d5ec89fc8d14dcdab7214a0c13a1c7321dc6ea9
Author: Numan Siddique <nusiddiq>
Date: Tue Mar 26 06:13:46 2019 +0530

    net: openvswitch: Add a new action check_pkt_len

    This patch adds a new action - 'check_pkt_len' which checks the packet length and executes a set of actions if the packet length is greater than the specified length or executes another set of actions if the packet length is lesser or equal to.
```

The kernel patch was backported back in 2019 for the rhel8 branch, as:

```
* Tue Oct 08 2019 Phillip Lougher <plougher> [4.18.0-147.5.el8]
```

So it should be safe to enable the feature for both branches.

Greg, I provided a known issue text. Please adjust if needed.

---

The current plan for the issue is:

1. Docs.
   a) deliver the Known Issue (with proposed workarounds) to 17.1.x customers.
   b) add a KCS on how and when to enable OVNEmitNeedToFrag.
   c) update https://docs.redhat.com/en/documentation/red_hat_openstack_platform/17.1/html-single/overcloud_parameters/index#ref_networking-neutron-parameters_overcloud_parameters so that it does NOT claim that OVNEmitNeedToFrag is for "host kernel (version >= 5.2)". That claim is not valid for the RHEL kernel, which has extensive backports to older versions. In reality, all RHEL8/9 kernels support this feature.
2. 17.x.
   a) backport the fix to flip the default for ovn_emit_need_to_frag to True in neutron.
   b) set the default for tripleo OVNEmitNeedToFrag to True in wallaby.
3. 18.x.
   a) backport the fix to flip the default for ovn_emit_need_to_frag to True in neutron.
   b) set ovn_emit_need_to_frag=True in the neutron-operator default config template.

This bz will be used for the following from the plan above:

```
2. 17.x.
   a) backport the fix to flip the default for ovn_emit_need_to_frag to True in neutron.
```

The rest will get their own bzs / jiras.

Final tally:

1. docs: known issue doc text updated in this bz; kcs request: https://bugzilla.redhat.com/show_bug.cgi?id=2318544 ; fix OVNEmitNeedToFrag description in docs: https://bugzilla.redhat.com/show_bug.cgi?id=2318545
2. neutron backport: this BZ; tripleo enable by default: https://bugzilla.redhat.com/show_bug.cgi?id=2318546
3. 18.x Jira to enable it in operator and in neutron: https://issues.redhat.com/browse/OSPRH-10684

I'm moving this to NEW since the backport was not posted. I am also unassigning myself to give the team a chance to consider it for planning, or for someone else to pick it up.
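
For readers who want the behavior change in concrete terms, here is a toy Python model of what the option decides. It is only a sketch of the semantics described in this bug (the `check_pkt_len` split into "greater than" and "lesser or equal" branches, and the silent drop versus `ICMP Fragmentation Needed`). It is not OVN or neutron code, and the names `Verdict` and `handle_packet` are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Verdict:
    """Outcome of the toy forwarding decision (hypothetical type)."""
    action: str                      # "forward", "drop" or "icmp_frag_needed"
    next_hop_mtu: Optional[int] = None


def handle_packet(packet_len: int, egress_mtu: int,
                  emit_need_to_frag: bool) -> Verdict:
    """Toy model of the decision; emit_need_to_frag stands in for
    ovn_emit_need_to_frag. Not the real datapath logic."""
    # "lesser or equal" branch of check_pkt_len: the packet fits.
    if packet_len <= egress_mtu:
        return Verdict("forward")
    # "greater than" branch: the packet is too large for the egress MTU.
    if emit_need_to_frag:
        # The sender is told about the smaller MTU and can adapt.
        return Verdict("icmp_frag_needed", next_hop_mtu=egress_mtu)
    # Option disabled: the peers never learn about the mismatch.
    return Verdict("drop")


if __name__ == "__main__":
    # A 1500-byte packet towards a network with MTU 1400.
    print(handle_packet(1500, 1400, emit_need_to_frag=False))
    print(handle_packet(1500, 1400, emit_need_to_frag=True))
```

With the option off, the oversized packet in this model is dropped with no feedback, which matches the known issue text. With it on, the sender gets the information it needs to lower its packet size.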