Bug 1792160
| Summary: | keepalived 2.0.10 goes into FAULT STATE when an interface is renamed | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Gregory Thiemonge <gthiemon> | |
| Component: | keepalived | Assignee: | Ryan O'Hara <rohara> | |
| Status: | CLOSED ERRATA | QA Contact: | Brandon Perkins <bperkins> | |
| Severity: | high | Docs Contact: | ||
| Priority: | high | |||
| Version: | 8.1 | CC: | atragler, cfeist, cgoncalves, cluster-maint, redhat-bugzilla, rohara | |
| Target Milestone: | rc | Keywords: | ZStream | |
| Target Release: | 8.0 | Flags: | pm-rhel: mirror+ | |
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | keepalived-2.0.10-9.el8 | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1801895 | Environment: | ||
| Last Closed: | 2020-04-28 16:05:02 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1792157, 1801895 | |||
Here is a scratch build with the recommended patch. Please test ASAP. Assuming it works, I will do a proper build for 8.2 and 8.1.z.

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=26135578

Ryan, I've tested the package and I confirm that this scratch build fixes the issue. Thanks

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1753
Description of problem:
The OpenStack Octavia project uses keepalived with VRRP in VMs to enable HA in load balancers. keepalived runs in a namespace and uses eth1 as the VRRP port. The Octavia amphora-agent updates the namespace by adding new interfaces and renaming interfaces (these interfaces are used by haproxy). These interfaces are not related to VRRP and are not part of the keepalived configuration. But when keepalived detects that an interface has been renamed, it goes into FAULT STATE.

Version-Release number of selected component (if applicable):
RHEL 8.1/keepalived 2.0.10

How reproducible:
100%

Steps to Reproduce:
I reproduced the issue using one Octavia Load Balancer with HA enabled, one listener, and some iproute commands that simulate what amphora-agent does:

Connect to the MASTER amphora, check the VRRP IP address and the VRRP state in the logs -> everything is OK

bash-4.4# ip -n amphora-haproxy a
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc fq_codel state UP group default qlen 1000
    link/ether fa:16:3e:1f:f5:b5 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.189/26 brd 10.0.0.191 scope global eth1
       valid_lft forever preferred_lft forever
    inet 10.0.0.161/32 scope global eth1
       valid_lft forever preferred_lft forever
bash-4.4# journalctl -le | grep Keep
[..]
Jan 15 14:56:25 amphora-f3af9d57-790c-4471-ae1f-82ee48ec2f68.novalocal Keepalived_vrrp[1346]: (2877477fd523485ebb7edbb2bf0967e4) Entering BACKUP STATE
Jan 15 14:56:28 amphora-f3af9d57-790c-4471-ae1f-82ee48ec2f68.novalocal Keepalived_vrrp[1346]: (2877477fd523485ebb7edbb2bf0967e4) received lower priority (90) advert from 10.0.0.173 - discarding
Jan 15 14:56:29 amphora-f3af9d57-790c-4471-ae1f-82ee48ec2f68.novalocal Keepalived_vrrp[1346]: (2877477fd523485ebb7edbb2bf0967e4) Entering MASTER STATE

Create a dummy interface, move it to the amphora-haproxy namespace, check the VRRP IP address -> still OK

bash-4.4# ip link add veth0 type veth peer veth1
bash-4.4# ip link set veth1 up
bash-4.4# ip link set veth1 netns amphora-haproxy
bash-4.4# ip -n amphora-haproxy a
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc fq_codel state UP group default qlen 1000
    link/ether fa:16:3e:1f:f5:b5 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.189/26 brd 10.0.0.191 scope global eth1
       valid_lft forever preferred_lft forever
    inet 10.0.0.161/32 scope global eth1
       valid_lft forever preferred_lft forever
4: veth1@if5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether fa:0f:5b:55:44:f5 brd ff:ff:ff:ff:ff:ff link-netnsid 0

Rename the dummy interface, check the VRRP IP address and the logs -> the address has been removed, keepalived is in FAULT STATE

bash-4.4# ip -n amphora-haproxy link set veth1 name foo0
bash-4.4# ip -n amphora-haproxy a
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc fq_codel state UP group default qlen 1000
    link/ether fa:16:3e:1f:f5:b5 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.189/26 brd 10.0.0.191 scope global eth1
       valid_lft forever preferred_lft forever
4: foo0@if5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether fa:0f:5b:55:44:f5 brd ff:ff:ff:ff:ff:ff link-netnsid 0
bash-4.4# journalctl -le | grep Keep
[..]
Jan 15 16:31:27 amphora-f3af9d57-790c-4471-ae1f-82ee48ec2f68.novalocal Keepalived_vrrp[1346]: Interface name has changed from veth1 to foo0
Jan 15 16:31:27 amphora-f3af9d57-790c-4471-ae1f-82ee48ec2f68.novalocal Keepalived_vrrp[1346]: (2877477fd523485ebb7edbb2bf0967e4) Entering FAULT STATE
Jan 15 16:31:27 amphora-f3af9d57-790c-4471-ae1f-82ee48ec2f68.novalocal Keepalived_vrrp[1346]: (2877477fd523485ebb7edbb2bf0967e4) sent

Actual results:
keepalived goes into FAULT STATE and the VRRP IP address is removed from eth1.

Expected results:
Renaming an interface that keepalived does not use should not trigger anything in keepalived.

Additional info:
I found out that the bug is not present in keepalived>=2.0.11 and that the commit that fixes the issue is https://github.com/acassen/keepalived/commit/30eeb48b1a0737dc7443fd421fd6613e0d55fd17

Can we backport this commit to 2.0.10?
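
For anyone re-testing a build, the rename step and the log check from this report could be wrapped roughly as follows. This is only a sketch: the function names `rename_unused_iface` and `fault_after_rename` are hypothetical, the defaults are the namespace and interface names used in this report, and the log check matches only the two message strings quoted in the journal output here.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, defaults come from this report.
# Nothing runs on load; call the functions explicitly.

# Create a veth pair, move one end into the namespace, then rename it.
# Needs root and an existing namespace (default: amphora-haproxy).
rename_unused_iface() {
    local ns="${1:-amphora-haproxy}" old="${2:-veth1}" new="${3:-foo0}"
    ip link add veth0 type veth peer name "$old" &&
    ip link set "$old" up &&
    ip link set "$old" netns "$ns" &&
    ip -n "$ns" link set "$old" name "$new"
}

# Read log lines on stdin. Exit 0 if an "Interface name has changed" message
# is followed by "Entering FAULT STATE", exit 1 otherwise.
fault_after_rename() {
    awk '
        /Interface name has changed from/ { renamed = 1; next }
        renamed && /Entering FAULT STATE/ { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example (as root, on the MASTER amphora):
#   rename_unused_iface
#   journalctl -le | grep Keep | fault_after_rename && echo "still affected"
```

On a fixed build, `fault_after_rename` is expected to return non-zero after the rename, and 10.0.0.161/32 should still be listed on eth1.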