Bug 2077016
| Summary: | [OSP16.1] HA L3 router/keepalived stability issues (ML2/OVS) | |||
|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | ggrimaux | |
| Component: | openstack-neutron | Assignee: | Slawek Kaplonski <skaplons> | |
| Status: | CLOSED ERRATA | QA Contact: | Fiorella Yanac <fyanac> | |
| Severity: | urgent | Docs Contact: | ||
| Priority: | urgent | |||
| Version: | 16.1 (Train) | CC: | ahyder, alolivei, averdagu, bcafarel, bdobreli, bperkins, bsawyers, bshephar, ccamposr, chrisw, cluster-maint, dalvarez, dhill, ekuris, eolivare, fleitner, jdolling, jhardee, jschluet, ldenny, ltamagno, mflusche, oblaut, pveiga, ralonsoh, rdiwakar, rohara, scohen, skaplons, sputhenp, sukar, takirby | |
| Target Milestone: | z9 | Keywords: | TestCannotAutomate, Triaged, ZStream | |
| Target Release: | 16.1 (Train on RHEL 8.2) | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | openstack-neutron-15.2.1-1.20220421073454.40d217c.el8ost | Doc Type: | No Doc Update | |
| Doc Text: | | Story Points: | --- | |
| Clone Of: | 1869355 | |||
| : | 2096223 (view as bug list) | Environment: | ||
| Last Closed: | 2022-12-07 20:28:59 UTC | Type: | --- | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 1869355 | |||
| Bug Blocks: | 2096223 | |||
Hello Lewis:
Please use "git log --pretty=fuller" and check the "CommitDate" rather than the "AuthorDate"; the CommitDate is the date the commit was actually merged:
commit 9d1a942729b7ea03c042bdceb161f3145cfac8c1
Author: Rodolfo Alonso Hernandez <ralonsoh>
AuthorDate: Tue Sep 15 16:04:45 2020 +0000
Commit: Rodolfo Alonso Hernandez <ralonsoh>
CommitDate: Thu Apr 21 07:22:31 2022 +0000
Add "vrrp_garp_master_delay" and "vrrp_garp_master_repeat" parameters
In 16.1 (which is the target of this BZ), the patch was merged in April 2022. "openstack-neutron-15.2.1-1.20220112133420.el8ost.noarch" carries a 20220112 timestamp in its release string, which predates that merge, so it can't have the patch.
Regards.
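The date comparison described above can be sketched in Python. This is only an illustration, not tooling used in this BZ: the helper names are hypothetical, and it assumes the 14-digit field in the package release string is a `YYYYMMDDHHMMSS` timestamp in UTC. A timestamp later than the CommitDate is a necessary condition for the build to include the commit, not proof that it does.

```python
import re
from datetime import datetime, timezone

# Output of `git log --pretty=fuller` as quoted in this bug.
FULLER_LOG = """\
commit 9d1a942729b7ea03c042bdceb161f3145cfac8c1
Author:     Rodolfo Alonso Hernandez <ralonsoh>
AuthorDate: Tue Sep 15 16:04:45 2020 +0000
Commit:     Rodolfo Alonso Hernandez <ralonsoh>
CommitDate: Thu Apr 21 07:22:31 2022 +0000
"""

GIT_DATE_FORMAT = "%a %b %d %H:%M:%S %Y %z"


def parse_git_date(log_text, field):
    """Return the datetime stored in `field` ("AuthorDate" or "CommitDate")."""
    match = re.search(rf"^{field}:\s*(.+)$", log_text, re.MULTILINE)
    if match is None:
        raise ValueError(f"{field} not found in log output")
    return datetime.strptime(match.group(1).strip(), GIT_DATE_FORMAT)


def build_timestamp(nvr):
    """Extract the 14-digit timestamp from a release string.

    Assumption: the field is YYYYMMDDHHMMSS in UTC.
    """
    match = re.search(r"\.(\d{14})\.", nvr)
    if match is None:
        raise ValueError(f"no timestamp field in {nvr!r}")
    parsed = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)


def build_could_contain(nvr, log_text):
    """True if the build timestamp is not earlier than the CommitDate."""
    return build_timestamp(nvr) >= parse_git_date(log_text, "CommitDate")


if __name__ == "__main__":
    for nvr in (
        "openstack-neutron-15.2.1-1.20220112133420.el8ost.noarch",
        "openstack-neutron-15.2.1-1.20220421073454.40d217c.el8ost",
    ):
        print(nvr, build_could_contain(nvr, FULLER_LOG))
```

Comparing against the AuthorDate (September 2020) instead would wrongly suggest that the January 2022 build could already include the patch.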
Oh great, thanks for the tip Rodolfo, that makes much more sense... sorry for the confusion!
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenStack Platform 16.1.9 (openstack-neutron) security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2022:8870
Hi Team,
For now we should be able to tune the `ha_vrrp_advert_int` param to 15 seconds [1] to avoid this issue [2].
Checking downstream, I can see the patch has been part of 16.1-trunk-patches for a long time now:
```
[ldenny@redhat-jumpbox neutron]$ git log -S 'vrrp_garp_master_delay'
commit 9d1a942729b7ea03c042bdceb161f3145cfac8c1
Author: Rodolfo Alonso Hernandez <ralonsoh>
Date:   Tue Sep 15 16:04:45 2020 +0000
```
But checking the latest 16.1 neutron-server container, we are shipping `openstack-neutron-15.2.1-1.20220112133420.el8ost.noarch`, which is higher than the Fixed In Version of `openstack-neutron-12.1.1-38.el7ost`, yet the patch is indeed missing:
```
❯ podman create registry.redhat.io/rhosp-rhel8/openstack-neutron-server:16.1.8-10
❯ containerfs=$(podman mount -l)
❯ grep -A8 'def _init_keepalived_manager' $containerfs/usr/lib/python3.6/site-packages/neutron/agent/l3/ha_router.py
    def _init_keepalived_manager(self, process_monitor):
        self.keepalived_manager = keepalived.KeepalivedManager(
            self.router['id'],
            keepalived.KeepalivedConf(),
            process_monitor,
            conf_path=self.agent_conf.ha_confs_path,
            namespace=self.ha_namespace,
            throttle_restart_value=(
                self.agent_conf.ha_vrrp_advert_int * THROTTLER_MULTIPLIER))
```
@ralonsoh is this expected?
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1869355#c46
[2] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html-single/networking_guide/index#tune-keepalived_common-network-tasks
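The manual `grep` check above could also be scripted. The following is a minimal sketch, assuming a mounted container filesystem and assuming that the presence of both parameter names from the commit subject anywhere in the Python sources is a good enough signal that the patch was applied. The function names are hypothetical, and the sketch does not assume which file the patch touches.

```python
from pathlib import Path

# Parameter names taken from the commit subject quoted in this bug.
PATCH_MARKERS = ("vrrp_garp_master_delay", "vrrp_garp_master_repeat")


def find_marker_hits(root, markers=PATCH_MARKERS):
    """Map each marker to the .py files under `root` that mention it."""
    hits = {marker: [] for marker in markers}
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for marker in markers:
            if marker in text:
                hits[marker].append(str(path.relative_to(root)))
    return hits


def patch_looks_present(root, markers=PATCH_MARKERS):
    """True only if every marker appears in at least one file."""
    return all(find_marker_hits(root, markers).values())
```

With the mount point from the session above, a call might look like `patch_looks_present(containerfs + "/usr/lib/python3.6/site-packages/neutron")`, where `containerfs` holds the path printed by `podman mount -l`.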