Bug 1348929 - keepalived VIP becomes unreachable after a switch and a following higher prio advert
Summary: keepalived VIP becomes unreachable after a switch and a following higher prio advert
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: keepalived
Version: 7.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Assignee: Ryan O'Hara
QA Contact: Brandon Perkins
URL:
Whiteboard:
Depends On:
Blocks: 1420851
 
Reported: 2016-06-22 10:44 UTC by andrea
Modified: 2021-12-01 03:21 UTC
CC List: 8 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-09-27 04:26:55 UTC
Target Upstream Version:
Embargoed:




Links
Red Hat Knowledge Base (Solution) 2978661 — Last Updated: 2017-06-19 00:03:24 UTC

Description andrea 2016-06-22 10:44:37 UTC
Description of problem:

In a two-node keepalived cluster, when the backup node becomes master (probably because of a network problem), sends gratuitous ARP, and right afterwards receives a higher-priority advert and falls back to backup, it releases the VIP. However, the master node does not send gratuitous ARP, which causes an L3 problem and leaves the VIP unreachable from outside the network.


Version-Release number of selected component (if applicable):

keepalived-1.2.13-7.el7.x86_64


How reproducible:

Difficult to reproduce. It happens when, because of a network problem, the backup node becomes master and right afterwards receives a higher-priority advert.

Actual results:

logs on backup node
--------------------
Jun 21 19:56:13 srv-node1.mydomain.local Keepalived_vrrp[1385]: VRRP_Instance(prodsrv) Transition to MASTER STATE
Jun 21 19:56:14 srv-node1.mydomain.local Keepalived_vrrp[1385]: VRRP_Instance(prodsrv) Entering MASTER STATE
Jun 21 19:56:14 srv-node1.mydomain.local Keepalived_vrrp[1385]: VRRP_Instance(prodsrv) setting protocol VIPs.
Jun 21 19:56:14 srv-node1.mydomain.local Keepalived_vrrp[1385]: VRRP_Instance(prodsrv) Sending gratuitous ARPs on eno16777984 for 10.123.5.69
Jun 21 19:56:14 srv-node1.mydomain.local Keepalived_vrrp[1385]: VRRP_Instance(prodsrv) Received higher prio advert
Jun 21 19:56:14 srv-node1.mydomain.local Keepalived_vrrp[1385]: VRRP_Instance(prodsrv) Entering BACKUP STATE
Jun 21 19:56:14 srv-node1.mydomain.local Keepalived_vrrp[1385]: VRRP_Instance(prodsrv) removing protocol VIPs.


logs on current master node
----------------------------
no action logged


Other information:

Workaround:

Manually switching the cluster by stopping/starting keepalived resolves the issue.
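
For illustration, a minimal sketch of that manual switch (assuming a systemd-managed keepalived service, as on RHEL 7):

  # on the node currently holding the VIP: stop keepalived to force a failover to the peer
  systemctl stop keepalived
  # once the peer has taken over (and sent its gratuitous ARP), start keepalived again
  systemctl start keepalived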

keepalived.conf

********************************************************

global_defs {
  router_id haproxy1
}
vrrp_script haproxy {
  script "killall -0 haproxy"
  interval 2
  weight 2
}
vrrp_instance prodsrv {
  virtual_router_id 50
  advert_int 1
  priority 100
  state MASTER
  interface eno16777984
  virtual_ipaddress {
    10.123.5.69 dev eno16777984
  }
  authentication {
        auth_type PASS
        auth_pass mypassword
    }
  track_script {
    haproxy
  }
}
********************************************************

According to http://www.keepalived.org/changelog.html, release 1.2.22 seems to address similar issues.

Comment 1 Ryan O'Hara 2016-06-22 12:19:56 UTC
(In reply to andrea from comment #0)
> Description of problem:
> 
> In a 2 nodes keepalived cluster, (probably for a network problem) when a
> backup node becomes master and sends the gratuitous arp and right after it
> receive an higher prio advert falling back as backup, it releases the VIP,
> however the master node doesn't sends gratuitous arp causing a L3 problem
> resulting in VIP unreachable from outside the network.

Can you show that the new master is not sending the gratuitous ARP? The logs you provided do not show a problem.

> on http://www.keepalived.org/changelog.html release 1.2.22 seems to solve
> similar issues

Can you please be specific? Which entry in the changelog seems to solve this problem? Just to be clear, this behavior is occurring on RHEL 7.2 with keepalived-1.2.13-7.el7.x86_64, correct? Is there a support case open for this issue? Have you attempted to use garp_master_delay or garp_master_refresh?

Comment 3 andrea 2016-06-22 13:28:49 UTC
(In reply to Ryan O'Hara from comment #1)

> Can you show that the new master is not sending the gratuitous ARP? The logs
> you provided do not show a problem.

I will try to catch it with tcpdump; anyway, the problem is really sporadic, so it's difficult to track.

> 
> > on http://www.keepalived.org/changelog.html release 1.2.22 seems to solve
> > similar issues
> 
> Can you please be specific? Which entry in the changelog seems to solve this
> problem? 
I'm not sure but I thought that this could be related to

"vrrp: Fix transition to backup when receive equal priority advert from
  higher address.
  When a vrrp instance in master mode received an advert from another master
  that had equal priority, it wasn't comparing the addresses to determine
  whether it should treat the advert as higher priority, and hence the
  instance should fall back into backup state.
  When checking whether the advert is from a lower priority master, it now
  checks if the priorities are equal and then compares the addresses."

> Just to be clear, this behavior is occurring on RHEL 7.2 with
> keepalived-1.2.13-7.el7.x86_64, correct? Is there a support case open for
> this issue? Have you attempted to use garp_master_delay or
> garp_master_refresh?

No, there are no support cases open. The problem is on CentOS 7, where keepalived comes from the base repo.

Comment 4 Ryan O'Hara 2016-06-22 13:45:35 UTC
(In reply to andrea from comment #3)
> (In reply to Ryan O'Hara from comment #1)
> 
> > Can you show that the new master is not sending the gratuitous ARP? The logs
> > you provided do not show a problem.
> 
> I will try to catch it by tcpdump, anyway the problem is really sporadic so
> it's difficult to track
> 
> > 
> > > on http://www.keepalived.org/changelog.html release 1.2.22 seems to solve
> > > similar issues
> > 
> > Can you please be specific? Which entry in the changelog seems to solve this
> > problem? 
> I'm not sure but I thought that this could be related to
> 
> "vrrp: Fix transition to backup when receive equal priority advert from
>   higher address.
>   When a vrrp instance in master mode received an advert from another master
>   that had equal priority, it wasn't comparing the addresses to determine
>   whether it should treat the advert as higher priority, and hence the
>   instance should fall back into backup state.
>   When checking whether the advert is from a lower priority master, it now
>   checks if the priorities are equal and then compares the addresses."

Do your keepalived nodes have the same priority? If so, try changing the priority on one of them and see if that resolves the problem.

> > Just to be clear, this behavior is occurring on RHEL 7.2 with
> > keepalived-1.2.13-7.el7.x86_64, correct? Is there a support case open for
> > this issue? Have you attempted to use garp_master_delay or
> > garp_master_refresh?

You might consider the garp_* settings mentioned above.

Comment 5 andrea 2016-06-22 14:23:04 UTC
> You might consider the garp_* settings mentioned above.
Thanks a lot, I will try to play with garp_*! However, I would expect that before the master goes to backup because of a higher-priority advert, it would send a last advert to solicit the master to send gratuitous ARP. The logs show no activity tracked on that master, but I don't know if this is normal. In my case it is true that everything occurs in a small amount of time.

Comment 6 andrea 2016-07-29 13:46:09 UTC
I've configured garp_master_refresh to 60 seconds in global_defs, but it does not seem to take effect.

Listening with tcpdump on the interface does not show gratuitous ARP refreshes.
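
For illustration, a minimal capture sketch (assuming the interface eno16777984 from the posted config; IP protocol 112 is VRRP):

  # show gratuitous ARP and VRRP adverts on the keepalived interface, including MAC addresses
  tcpdump -n -e -i eno16777984 'arp or ip proto 112'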

Comment 7 Ryan O'Hara 2016-07-29 16:24:59 UTC
I cannot recreate this problem. Using RHEL 7.2 with keepalived-1.2.13-7.el7.x86_64 on two nodes, I used the exact same configuration as posted in comment #0 (sans interface name and virtual IP address), and the nodes correctly transition from master to backup.

I'm not sure where to go from here. Are you sure that your nodes are passing VRRP traffic? Do you have iptables set up correctly on both nodes to allow VRRP? It seems to me that your master node is not properly receiving VRRP advertisements. Also, you only posted the config for your master node. Is the backup node identical? Are you using "state BACKUP" and/or a different priority? See comment #4 for some questions I asked about the config.

Comment 8 andrea 2016-07-29 16:59:54 UTC
firewalld is configured correctly, because everything works fine until a node switch occurs, probably caused by a network problem.
If the service bounces back after a switch, the idea is that the gratuitous ARPs are not sent again. Example: node A has the VIP; because of a network problem, node B takes the VIP and sends gratuitous ARP; then node A comes back and takes back the VIP, but this time it does not send gratuitous ARP again, so, with the ARP entries not being refreshed, the VIP becomes unreachable.

The parameter garp_master_refresh was used to mitigate this, but it does not seem to take effect, at least judging from the network traffic captured with tcpdump.

The configuration is the same on both nodes; what I've done now is add garp_master_refresh (on both nodes)

in this way:

global_defs {
  router_id haproxy1
  garp_master_refresh 60

}
 
Note that everything works well if I switch the VIP from one node to the other, either by killing haproxy or by using systemctl stop/start keepalived.

Comment 9 Ryan O'Hara 2016-07-29 17:18:03 UTC
(In reply to andrea from comment #8)
> the firewalld is configured correctly because everything works fine since a
> node switch caused probably for a network problem.

Explain how you have firewalld/iptables setup for VRRP.

> If the service bounce back after a switch, the idea is that the gratuitous
> arp aren't sent again. example: node A has got the vip, for a network
> problem node B take the VIP and send gratuitous arp, then the node A came
> back and take back the VIP; but this time it doesn't send back the
> gratuitous arp so, due to not refreshing arps VIP become unreachable. 

It works fine in my environment. Perhaps you need to explain how you are creating this "network problem".

> the parameter garp_master_refresh was used to mitigate this, but it seems
> that doesn't enter in action, almost checking the newtork traffic by tcpdump.
> 
> the configuration is the same on both nodes, what I've do now is adding (on
> both nodes) garp_master_refresh

You should consider setting one node to 'state BACKUP'.
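
For illustration, a minimal sketch of what the peer's instance could look like (a hypothetical backup-node config mirroring comment #0; the priority value 90 is only an example, the point being that it is lower than the master's 100):

vrrp_instance prodsrv {
  state BACKUP
  priority 90
  virtual_router_id 50
  interface eno16777984
  advert_int 1
  virtual_ipaddress {
    10.123.5.69 dev eno16777984
  }
}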

> in this way:
> 
> global_defs {
>   router_id haproxy1
>   garp_master_refresh 60
> 
> }
>  
> note that everything work well if a swicth the VIP from one node to the
> other either killing haproxy or using systemctl stop/start keepalived

OK, then as I stated above, I need details about the "network problem" that you are using the trigger this problem.

Comment 10 Ryan O'Hara 2016-07-29 17:23:45 UTC
(In reply to andrea from comment #8)
> the parameter garp_master_refresh was used to mitigate this, but it seems
> that doesn't enter in action, almost checking the newtork traffic by tcpdump.
> 
> the configuration is the same on both nodes, what I've do now is adding (on
> both nodes) garp_master_refresh
> 
> in this way:
> 
> global_defs {
>   router_id haproxy1
>   garp_master_refresh 60
> 
> }

According to the documentation for v1.2.13, the garp_* settings are only valid in a vrrp_instance block.
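
For illustration, a minimal sketch of that placement (assuming the vrrp_instance from comment #0; the 60-second value is the one used earlier in this thread):

vrrp_instance prodsrv {
  state MASTER
  priority 100
  virtual_router_id 50
  interface eno16777984
  advert_int 1
  garp_master_refresh 60    # periodically re-send gratuitous ARP while in MASTER state (seconds)
  virtual_ipaddress {
    10.123.5.69 dev eno16777984
  }
}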

Comment 11 andrea 2016-07-29 20:50:26 UTC
> Explain how you have firewalld/iptables setup for VRRP.

firewall-cmd --list-all
public (default, active)
  interfaces: eno16777984
  sources: 
  services: dhcpv6-client ftp samba ssh
  ports: 443/tcp 80/tcp
  masquerade: no
  forward-ports: 
  icmp-blocks: 
  rich rules: 
	rule family="ipv4" source address="10.123.11.41" accept
	rule family="ipv4" source address="10.123.11.42" accept
	rule family="ipv4" source address="10.123.11.43" accept

10.123.11.41 is the node A ip
10.123.11.42 is the node B ip
10.123.11.43 is the VIP

> OK, then as I stated above, I need details about the "network problem" that
> you are using the trigger this problem.

It's a really rare issue that comes up less than once per month. I have been monitoring it since the last occurrence on 23 June, and so far it hasn't shown up again.

But it is shown in the logs, as I wrote at the beginning. In summary: node A has the VIP and sends a VRRP advertisement every second. When (because of this network problem :) ) node B doesn't get this advertisement, node B becomes master and sends gratuitous ARP, but as node A comes back again (1 or 2 seconds later) node A takes ownership again and node B becomes slave again. The VIP is correctly set on node A, but my doubt is that node A doesn't send gratuitous ARP again, so the network is fooled by the old MAC of node B.

So there are two problems:

1) There is a network problem (probably not caused by keepalived) that causes the VRRP advertisement to be missed and the VIP to bounce from A to B and back to A again.

2) This switch causes gratuitous ARP from node B, but when the VIP comes back to node A, my doubt is that no gratuitous ARP is sent from node A (I have had the hosts under tcpdump since 23 June, but the problem hasn't recurred, so for now it remains a doubt).

> According to the documentation for v1.2.13, the garp_* settings are only
> valid in a vrrp_instance block.

Thanks a lot, you were right. Under vrrp_instance, garp_master_refresh works as it should. This will work as a workaround.

Comment 12 Ryan O'Hara 2016-07-29 21:52:56 UTC
(In reply to andrea from comment #11)
> > Explain how you have firewalld/iptables setup for VRRP.
> 
> firewall-cmd --list-all
> public (default, active)
>   interfaces: eno16777984
>   sources: 
>   services: dhcpv6-client ftp samba ssh
>   ports: 443/tcp 80/tcp
>   masquerade: no
>   forward-ports: 
>   icmp-blocks: 
>   rich rules: 
> 	rule family="ipv4" source address="10.123.11.41" accept
> 	rule family="ipv4" source address="10.123.11.42" accept
> 	rule family="ipv4" source address="10.123.11.43" accept
> 
> 10.123.11.41 is the node A ip
> 10.123.11.42 is the node B ip
> 10.123.11.43 is the VIP

I typically recommend accepting the VRRP protocol as a rich rule, but this is entirely up to you.
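
For illustration, a hypothetical rich rule of that kind (assuming firewalld and the default zone; the protocol name "vrrp" comes from /etc/protocols, i.e. IP protocol 112):

  firewall-cmd --permanent --add-rich-rule='rule protocol value="vrrp" accept'
  firewall-cmd --reload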

> > OK, then as I stated above, I need details about the "network problem" that
> > you are using the trigger this problem.
> 
> it'a really rare issue that cames out less than 1 time per month. I was
> monitoring it from the last issue on 23 June and at now it didn't show up
> again
> 
> but it is shown in the logs as I wrote at begining. In summary.. Node A has
> the VIP and it sends VRRP Advertisement every second. When happens that (for
> this network problem :) ) node B didn't get this Advertisement, node B
> becomes master sending gratuitous arp, but as the node A cames back again 
> (1 or 2 second after) node A take again ownership and node B become slave
> again. The VIP is correctly set on node A but my doubt is that node A
> doesn't send gratuitous ARP again, so the network is fooled by the old mac
> of node B. 

We are talking about two different things. I am asking about the "network problem" that causes this subsequent problem in keepalived. Can you explain what that is?

> so the problems are 2
> 
> 1) there is a network problem (probably not depending from keepalived) that
> cause the VRRP Advertisement to fail and the VIP bouncing from A to B and to
> A again

This is the network problem I am asking about -- the cause, not the symptom.

> 2) this switch causes the gratuitous ARP from node B but when the VIP cames
> back to node A, my doubt is that no gratuitous arp are sent from node A (I
> was having under tcpdump the hosts from 23 June but the problems didn't
> raised so at now it remains a doubt)

I can't reproduce this. I need a method to reproduce this, else I can't determine if this is a problem with your network or keepalived.

> > According to the documentation for v1.2.13, the garp_* settings are only
> > valid in a vrrp_instance block.
> 
> Thanks a lot, you was right. Under the vrrp_instance garp_master_refresh
> works as it should. This will work as workaround

I recommend you ask upstream for advice on this issue since I cannot reproduce it.

Comment 19 dearfriend 2017-06-16 15:09:42 UTC
It looks like BZ1425828

Comment 20 Jonathan Maxwell 2017-06-18 22:22:50 UTC
Thanks for your help on this Ryan. We worked around this as per the resolution in:

https://access.redhat.com/solutions/2978661

It worked for my customer.

Regards

Jon

Comment 22 Ryan O'Hara 2017-08-28 15:53:17 UTC
Can we close this?

Comment 23 John Ruemker 2017-08-28 17:55:00 UTC
(In reply to Ryan O'Hara from comment #22)
> Can we close this?

Is it accurate to say that garp_master_refresh is the recommended solution in environments where there is a risk of VRRP traffic being disrupted for any length of time?  

In that case: is the current default appropriate, or should we consider having this set to a value that accommodates these situations automatically?

Also: does this need a docs update to recommend such a setting be considered?

Comment 24 John Ruemker 2017-08-28 17:56:09 UTC
Also: is the logging around this situation adequate to inform users what is happening if they don't have garp_master_refresh set?  Would a user easily be able to find their way to setting that to solve this?

Comment 25 Jonathan Maxwell 2017-08-30 04:49:13 UTC
(In reply to John Ruemker from comment #24)
> Also: is the logging around this situation adequate to inform users what is
> happening if they don't have garp_master_refresh set?  Would a user easily
> be able to find their way to setting that to solve this?

Yes, it should be fairly obvious what is happening and should therefore lead customers to the KCS. Also, the client should usually re-ARP again based on a timer, and at that point it will find the correct VIP MAC address. In my test the client timed out only until it re-ARPed; then it started working again after a few seconds. There may be another issue in the customer's case where the client did not re-ARP and just continuously used the stale MAC address. That probably makes this issue quite rare. In most cases it will recover.

> In that case: is the current default appropriate, or should we consider having this set to a value that accommodates these situations automatically?

I don't think we want to periodically send GARPs by default. A message is logged each time a GARP is sent, which pollutes the messages file and may break something else. Of course this can be filtered, but still.

TBH I reproduced this so long ago I'd forgotten the details. So I reproduced it again today and here is what happens:

1) Start firewalld on both nodes (see the command sketch after this list). The VRRP breaks but the network is otherwise operational. So the client has the ARP entry for the MASTER node2 (higher priority).
2) Node 1 thinks node2 is unavailable, takes the VIP, and sends a GARP. Now both have the VIP (split brain).
3) The client gets node 1's MAC address in its ARP table due to the GARP.
4) Stop firewalld; now node 1 gets a higher-priority VRRP advert from node2 and drops its IP address. As node 2 always had the VIP, it does NOT GARP again.
5) Now the client is stuck with node 1's MAC address until it re-ARPs.
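
For illustration, a minimal command sketch of that reproduction sequence (the hostnames node1 and node2 are placeholders; it assumes keepalived is already running on both nodes and that the firewalld configuration blocks VRRP, as described above):

  # step 1: break VRRP between the nodes while leaving the rest of the network up
  ssh node1 systemctl start firewalld
  ssh node2 systemctl start firewalld
  # wait for node1 to transition to MASTER and send its GARP (split brain)
  # step 4: restore VRRP; node1 receives the higher-priority advert and drops the VIP
  ssh node1 systemctl stop firewalld
  ssh node2 systemctl stop firewalld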

So really for this to be a major issue the following combination is required:

1) A VIP failover due to a higher-priority VRRP message after some partial network outage. Most network outages break things so that the client never gets node 1's GARP in the first place, avoiding this issue.
2) The client then does not refresh its ARP entry for some reason, or the switch does not update its FIB table.

The above seems like quite a rare scenario to me and can be fixed by making the client re-ARP, or by setting garp_master_refresh to avoid this as per the KCS.
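
For illustration, a minimal sketch of forcing a client to re-ARP (a hypothetical recovery step on a Linux client; the VIP address is the one from comment #0 and the interface name eth0 is a placeholder):

  # drop the stale neighbour entry for the VIP so the next packet triggers a fresh ARP request
  ip neigh del 10.123.5.69 dev eth0
  # or flush all entries learned on the interface
  ip neigh flush dev eth0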

keepalived does not foresee the above scenario, and adding functionality to make it do so would be invasive and probably not worth the risk.

I agree that this can be closed.

Regards

Jon

Comment 34 Parth Patel 2021-12-01 03:21:05 UTC
For anyone stumbling on this thread from a Google search: this happened to us even in 2021, with keepalived version 2.1.5 on RHEL 8.4. For some reason, this exact scenario was triggered by backups from Cohesity (our VM backup and restore platform), and the issue lasted 4 hours.

