Bug 1734560

Summary: GARP should be sent when one of the keepalived servers stops being master
Product: Red Hat Enterprise Linux 7
Reporter: mvillagran
Component: keepalived
Assignee: Ryan O'Hara <rohara>
Status: CLOSED WONTFIX
QA Contact: Brandon Perkins <bperkins>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 7.6
CC: cluster-maint, mauricio.villagran
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-03-15 07:38:01 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description mvillagran 2019-07-30 21:14:35 UTC
Description of problem:

In a keepalived master/backup configuration running on VMware, taking a snapshot of the master node sometimes freezes it for a few seconds. While the master is frozen, the keepalived backup node enters the MASTER role for a few seconds, but it never informs the original master that a failover happened, so the master does not resend gratuitous ARP (GARP) for the VIPs.


Version-Release number of selected component (if applicable):

keepalived-1.3.5-8.el7_6.x86_64

How reproducible:

server1 => Master
server2 => Backup

Steps to Reproduce:
1. Take a snapshot of server1.
2. Watch syslog on server2; it becomes master.
3. server2 becomes backup again.

Actual results:
server1 does not resend GARP.

Expected results:
server2 should notify server1: "I was master, please send GARP for your VIPs again."

Additional info: https://github.com/acassen/keepalived/commit/da3fd4bd15be70cf6d2c24a76726dce20d971ce5

Comment 2 Ryan O'Hara 2019-08-26 14:19:37 UTC
Has customer support been contacted about this issue? That is the first step.

Second, you should be able to find a workaround by using the vrrp_garp_* settings. There are several that will allow you to have the master send multiple gratuitous ARPs, either after transition to MASTER or repeatedly while in MASTER state. There are also options to control the frequency, etc.
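
For reference, a minimal sketch of the kind of global_defs settings this refers to (the values below are illustrative only; see keepalived.conf(5) for the full list of vrrp_garp_* options):

global_defs {
  vrrp_garp_master_delay 5            # delay before the second set of GARPs after transition to MASTER
  vrrp_garp_master_repeat 5           # number of GARPs sent at a time on transition to MASTER
  vrrp_garp_master_refresh 60         # resend GARPs periodically (every 60s) while in MASTER state
  vrrp_garp_master_refresh_repeat 2   # number of GARPs sent per refresh interval
}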

Comment 3 Ryan O'Hara 2020-01-02 18:38:20 UTC
Are you using VRRP sync groups? The patch you referenced only affects VRRP sync groups. Please attach your keepalived.conf file.

Comment 4 Klaus Brombeere 2020-01-27 18:01:25 UTC
(In reply to Ryan O'Hara from comment #3)
> Are you using VRRP sync groups? The patch you referenced only affects VRRP
> sync groups. Please attach your keepalived.conf file.

Yes I do...

This is my master cfg:

vrrp_script chk_haproxy {
  script "killall -0 haproxy" # check the haproxy process
  interval 2 # every 2 seconds
  weight 2 # add 2 points if OK
}

vrrp_instance VI_1 {
  interface ens192 # interface to monitor
  state MASTER # MASTER on haproxy-01, BACKUP on haproxy-02
  virtual_router_id 51
  priority 101 # 101 on haproxy-01, 100 on haproxy-02
  virtual_ipaddress {
    172.20.1.76
    172.20.1.106
  }
  track_script {
    chk_haproxy
  }
}

And this is my slave cfg:

vrrp_script chk_haproxy {
  script "killall -0 haproxy" # check the haproxy process
  interval 2 # every 2 seconds
  weight 2 # add 2 points if OK
}

vrrp_instance VI_1 {
  interface eno16780032 # interface to monitor
  state MASTER # MASTER on haproxy-01, MASTER on haproxy-01
  virtual_router_id 51
  priority 100 # 101 on haproxy-01, 100 on haproxy-02
  virtual_ipaddress {
    172.20.1.76
    172.20.1.106
  }
  track_script {
    chk_haproxy
  }
}


Thanks!

Comment 5 Ryan O'Hara 2020-02-11 18:53:47 UTC
I don't see any VRRP sync groups in the configuration file posted in comment #4. Check the man page for "VRRP synchronization group(s)" (keyword: vrrp_sync_group). My understanding is that the upstream commit referenced in comment #1 only changes the behavior if you are using VRRP sync groups.

https://github.com/acassen/keepalived/commit/da3fd4bd15be70cf6d2c24a76726dce20d971ce5

Check lines 1923-1926 in that commit. Without VRRP sync groups, this patch will not change the behavior.
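
For illustration, a minimal vrrp_sync_group stanza looks roughly like this (the group name GRP_1 and the second instance VI_2 are hypothetical; the configuration in comment #4 only defines VI_1, and sync groups are typically used to fail over multiple instances together):

vrrp_sync_group GRP_1 {
  group {
    VI_1    # existing instance from comment #4
    VI_2    # hypothetical second instance
  }
}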

Comment 8 RHEL Program Management 2021-03-15 07:38:01 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.