Bug 1477587

Summary: Upgrade incompatibility: Can no longer mix IPv4/IPv6 in virtual_ipaddress when using IPv6 VRRP instance
Product: Red Hat Enterprise Linux 7 Reporter: Robert Scheck <redhat-bugzilla>
Component: keepalived Assignee: Ryan O'Hara <rohara>
Status: CLOSED ERRATA QA Contact: Brandon Perkins <bperkins>
Severity: high Docs Contact:
Priority: high    
Version: 7.4 CC: aherr, anshockm, cfeist, cluster-maint, djansa, mlinden, robert.scheck, rohara
Target Milestone: rc Keywords: Regression
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: keepalived-1.3.5-5.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-10 18:15:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Robert Scheck 2017-08-02 12:26:58 UTC
Description of problem:
Since updating from RHEL 7.3 to 7.4 (thus keepalived 1.2.x to 1.3.x), the
following configuration no longer works:

-- snipp --
  native_ipv6
  unicast_src_ip 2001:db8::1
  unicast_peer {
    2001:db8::2
  }
  virtual_ipaddress {
    192.0.2.1/30 dev bond1.1000
    fe80::1/64 dev bond1.1000
    2001:db8:0:1000::1/64 dev bond1.1000
    192.0.2.250/29 dev bond0
    2001:db8:0:4003::2/64 dev bond0
  }
-- snapp --

Above leads to the following log output:

Aug  2 14:23:19 tux1 Keepalived_vrrp[7664]: (VRRP_INSTANCE): address family must match VRRP instance [192.0.2.1/30] - ignoring
Aug  2 14:23:19 tux1 Keepalived_vrrp[7664]: (VRRP_INSTANCE): address family must match VRRP instance [192.0.2.250/29] - ignoring

As a result, no IPv4 addresses are set up on bond0, whereas this worked
properly with keepalived 1.2.x before.
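
A quick way to spot affected instances after the upgrade is to scan the log
for this message. The following Python sketch assumes the message format
quoted above; the function name and the sample text are illustrative only.

```python
import re

# Matches the keepalived 1.3.x message quoted above, for example:
# "(VRRP_INSTANCE): address family must match VRRP instance [192.0.2.1/30] - ignoring"
IGNORED_VIP = re.compile(
    r"\((?P<instance>[^)]+)\): address family must match VRRP instance "
    r"\[(?P<vip>[^\]]+)\] - ignoring"
)


def find_ignored_vips(log_text):
    """Return (instance, vip) pairs for every ignored virtual IP in log_text."""
    return [(m.group("instance"), m.group("vip")) for m in IGNORED_VIP.finditer(log_text)]


sample = """\
Aug  2 14:23:19 tux1 Keepalived_vrrp[7664]: (VRRP_INSTANCE): address family must match VRRP instance [192.0.2.1/30] - ignoring
Aug  2 14:23:19 tux1 Keepalived_vrrp[7664]: (VRRP_INSTANCE): address family must match VRRP instance [192.0.2.250/29] - ignoring
"""

for instance, vip in find_ignored_vips(sample):
    print(f"{instance}: {vip} was ignored")
```

Any address reported this way is not configured on its interface by
keepalived-1.3.5-1.el7.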

The important point is that our keepalived-internal communication happens
only via IPv6, while keepalived shall maintain IPv4 and IPv6 floating IPs;
to get this working "native_ipv6" was required for keepalived 1.2.x, but it
does not make any difference for 1.3.x if that option is (still) set or not.

Version-Release number of selected component (if applicable):
keepalived-1.3.5-1.el7.x86_64

How reproducible:
Every time, see above.

Actual results:
virtual_ipaddress no longer accepts IPv4 addresses for IPv6 VRRP instance.

Expected results:
Working stuff with keepalived 1.3.x, like it was in 1.2.x.

Additional info:
This feels like an incompatibility/regression and should IMHO not happen
within a stable RHEL release.

Comment 2 Robert Scheck 2017-08-02 13:32:46 UTC
Cross-filed ticket 01903060 on the Red Hat customer portal.

Comment 3 Robert Scheck 2017-08-02 20:33:08 UTC
Configuration below (minus enable_script_security) works with keepalived
1.2.x, but not with 1.3.x (leads to errors mentioned before), thus all the
IPv4 addresses are not set up (obfuscated 192.0.2.1/30, 192.0.2.250/29):

--- snipp ---
global_defs {
    router_id tux1
    enable_script_security  # Keepalived yells about scripts?!
#    script_user root root  # Keepalived yells anyway?! RHBZ#1477563
    vrrp_iptables  # Empty to avoid iptables rules
#    vrrp_ipset  # Empty to avoid ipsets; does not work, RHBZ#1477572
#    vrrp_version 2  # tux2 still believes this is VRRP 3, RHBZ#1477552
}

vrrp_sync_group VRRP_GROUP {
    group {
        VRRP_INSTANCE
    }
    notify_master "/etc/conntrackd/primary-backup.sh primary"
    notify_backup "/etc/conntrackd/primary-backup.sh backup"
    notify_fault "/etc/conntrackd/primary-backup.sh fault"
}

vrrp_instance VRRP_INSTANCE {
    interface em2
    state BACKUP
    virtual_router_id 51
    priority 150
    track_interface {
        bond0
        bond1
    }
    native_ipv6  # keepalived 1.2.x hates IPv6 unicast_* w/o this option?!
    unicast_src_ip 2001:db8::1
    unicast_peer {
        2001:db8::2
    }
    virtual_ipaddress {
        192.0.2.1/30 dev bond1.1000
        fe80::1/64 dev bond1.1000
        2001:db8:0:1000::1/64 dev bond1.1000
        192.0.2.250/29 dev bond0
        2001:db8:0:4003::2/64 dev bond0
    }
    virtual_routes {
        blackhole 192.0.2.0/24
        blackhole 2001:db8::/32
    }
    advert_int 1
    nopreempt
    garp_master_delay 0
    dont_track_primary
}
--- snapp ---

Comment 4 Robert Scheck 2017-08-03 15:42:33 UTC
Bug #1477552 comment #8 led to findings that also apply here:

https://github.com/acassen/keepalived/commit/485847cd30503c1ec2370713c2593a2216f19bb1#diff-bb37771a5dd629fb6332c05768e92a95R1606

keepalived-1.2.13-9.el7_3.x86_64 (RHEL 7.3) allowed this:

--- snipp ---
vrrp_instance VRRP_INSTANCE {
    # …
    native_ipv6
    unicast_src_ip 2001:db8::1
    unicast_peer {
        2001:db8::2
    }
    virtual_ipaddress {
        192.0.2.1/30 dev bond1.1000
        fe80::1/64 dev bond1.1000
        2001:db8:0:1000::1/64 dev bond1.1000
        192.0.2.250/29 dev bond0
        2001:db8:0:4003::2/64 dev bond0
    }
}
--- snapp ---

Using keepalived-1.3.5-1.el7.x86_64 (RHEL 7.4), the following happens and
applies:

- Keepalived in 1.3.5-1.el7 no longer allows mixing IPv4 and IPv6 addresses
  in virtual_ipaddress section
- In this specific case the inter-keepalived-communication is IPv6, thus
  IPv4 addresses can't be put into virtual_ipaddress section

It is NOT possible to make the configuration above work with both keepalived
1.2.x AND 1.3.x, because of further differences mentioned in bug #1477552.

When IPv6 is used for inter-keepalived-communication, any upgrade from
keepalived 1.2.x to 1.3.x requires a configuration change. It is also not
possible to run a mixed 1.2.x and 1.3.x keepalived cluster when having
IPv6 for inter-keepalived-communication.

Above configuration needs to be rewritten for keepalived-1.3.5-1.el7.x86_64 
(RHEL 7.4) like this:

--- snipp ---
vrrp_sync_group vrrp_group {
    # …
    group {
        vrrp_ipv4
        vrrp_ipv6
    }
}

vrrp_instance vrrp_ipv4 {
    # …
    unicast_src_ip 192.0.2.5
    unicast_peer {
        192.0.2.6
    }
    virtual_ipaddress {
        192.0.2.1/30 dev bond1.1000
        192.0.2.250/29 dev bond0
    }
}

vrrp_instance vrrp_ipv6 {
    # …
    unicast_src_ip 2001:db8::1
    unicast_peer {
        2001:db8::2
    }
    virtual_ipaddress {
        fe80::1/64 dev bond1.1000
        2001:db8:0:1000::1/64 dev bond1.1000
        2001:db8:0:4003::2/64 dev bond0
    }
}
--- snapp ---

Please update the RHEL 7.4 release notes to reflect these findings, so that
other customers are warned about this keepalived upstream incompatibility
when upgrading from RHEL 7.3.

Comment 5 Ryan O'Hara 2017-08-03 16:20:42 UTC
Just curious, in the configuration shown where you have separate VRRP instances (vrrp_ipv4 and vrrp_ipv6), do either have vrrp_version or native_ipv6 set?

Comment 6 Robert Scheck 2017-08-03 16:25:01 UTC
(In reply to Ryan O'Hara from comment #5)
> Just curious, in the configuration shown where you have separate VRRP
> instances (vrrp_ipv4 and vrrp_ipv6), do either have vrrp_version or
> native_ipv6 set?

No and no. And to be more verbose:

- Keyword "native_ipv6" is ignored according to keepalived 1.3.x source
- VRRP instance vrrp_ipv4 uses VRRP 2 (tested/verified)
- VRRP instance vrrp_ipv6 uses VRRP 3 (tested/verified)
- When setting "vrrp_version 3", both instances are using VRRP 3 (also
  tested/verified)

Comment 11 Ryan O'Hara 2017-12-04 17:39:49 UTC
I've been talking with upstream about this problem and now I have a better understanding of why this change was made, but I do agree that it should not have impacted the old IPv6 + VRRPv2 implementation.

With the old keepalived-1.2.13, set up a VRRP instance with both IPv4 and IPv6 addresses. Then use tcpdump to watch the VRRP advertisements:

From keepalived.conf:

    virtual_ipaddress {
        192.168.102.201
        192.168.103.201
        192.168.104.201
        fe80::67d:7bff:fefc:201
    }

# tcpdump -i eth1 -v vrrp

10:46:18.589221 IP (tos 0xc0, ttl 255, id 10, offset 0, flags [none], proto VRRP (112), length 52)
    192.168.102.182 > vrrp.mcast.net: vrrp 192.168.102.182 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 201, prio 182, authtype none, intvl 2s, length 32, addrs(4): 192.168.102.201,192.168.103.201,192.168.104.201,0.0.0.0

So keepalived is sending VRRPv2 advertisements with 4 virtual IP addresses, but notice that the IPv6 address isn't actually in the list. You can further check this by having only IPv6 addresses -- the count will be correct, but there won't actually be any address in the advertisements. Keep in mind that IPv6 behavior was not defined in the RFC for VRRPv2, so keepalived had its own ad hoc implementation of IPv6 for VRRPv2 for many years. When VRRPv3 came around it stated that IPv4 and IPv6 VIPs should not be in the same advertisement. Now, while I agree that the change that introduced this incompatibility is technically correct, I don't think it should apply to the old VRRPv2+IPv6 implementation for backwards compatibility reasons.

There is a workaround but it does require config file changes unfortunately. If you want to mix both IPv4 and IPv6 in a single VRRP instance, move the IPv6 addresses to a "virtual_ipaddress_excluded" block. This will prevent them from being included in the advertisements, but still allow them to be created on the MASTER node. The above configuration snippet would become:

    virtual_ipaddress {
        192.168.102.201
        192.168.103.201
        192.168.104.201
    }
    virtual_ipaddress_excluded {
        fe80::67d:7bff:fefc:201
    }

I'm also looking into a quick fix where we can simply ignore the new mixed IPv4/IPv6 restriction for VRRPv2. I'll post information on this once I run more tests.
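
For configurations with many VIPs, the manual split described above can be sketched in Python. This is only an illustration of the workaround, not part of keepalived: the function name render_vip_blocks and its instance_family parameter are made up here, and the instance family (4 or 6) has to be supplied by the caller.

```python
import ipaddress


def render_vip_blocks(vips, instance_family):
    """Render virtual_ipaddress / virtual_ipaddress_excluded blocks.

    vips: keepalived address lines such as "192.0.2.1/30 dev bond0".
    instance_family: 4 or 6, the address family of the VRRP instance
    (an input assumed to be known by the caller).
    """
    advertised, excluded = [], []
    for line in vips:
        address = line.split()[0]  # first token is the address, options follow
        version = ipaddress.ip_interface(address).version
        (advertised if version == instance_family else excluded).append(line)

    def block(name, lines):
        body = "".join(f"    {entry}\n" for entry in lines)
        return f"{name} {{\n{body}}}\n"

    out = block("virtual_ipaddress", advertised)
    if excluded:
        out += block("virtual_ipaddress_excluded", excluded)
    return out


print(render_vip_blocks(
    ["192.168.102.201", "192.168.103.201", "192.168.104.201",
     "fe80::67d:7bff:fefc:201"],
    instance_family=4,
))
```

For an instance that communicates over IPv6, as in the original report, the same call with instance_family=6 would place the IPv4 addresses in the excluded block instead.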

Comment 12 Ryan O'Hara 2017-12-05 16:06:22 UTC
I think I have a potential fix for this problem. In comment #11 I stated that this can be resolved by moving VIPs to "virtual_ipaddress_excluded" such that IPv4 and IPv6 addresses are not mixed. This is still valid, but we can have keepalived do this for us. Instead of ignoring mixed VIPs and printing "address family must match VRRP instance", print a warning message (TBD) and add the non-matching VIP(s) to the excluded VIP list with alloc_vrrp_evip(vec). This seems to work. Virtual IP addresses in the excluded list will simply not appear in the VRRP advertisements. They will still be created on the specified interface when the node is in MASTER state. My tests are promising, and I have a patch being reviewed.

Note that there is nothing preventing this from working for VRRPv3, but it is technically wrong to do this since the RFC for VRRPv3 states that IPv4 and IPv6 VIPs should not appear in the same advertisement. At this point I see no reason to distinguish VRRPv2/v3 behavior, as this patch will almost definitely not be accepted upstream -- we simply want this behavior in RHEL to maintain compatibility.

Comment 14 Ryan O'Hara 2017-12-11 19:28:17 UTC
Description of old keepalived-1.2.13 behavior:

~~~
# cat /etc/keepalived/keepalived.conf
global_defs {
    router_id MESA_02
}

vrrp_instance VRRP_01 {
    state MASTER
    priority 182
    interface br2
    advert_int 2
    virtual_router_id 200
    virtual_ipaddress {
        192.168.102.201
        192.168.102.202
        192.168.102.203
        fe80::67d:7bff:fefc:201
        fe80::67d:7bff:fefc:202
        fe80::67d:7bff:fefc:203
    }
}
~~~

1. With old keepalived-1.2.13, start keepalived from the command-line such that we dump configuration and run in the foreground. Only VRRP is needed here (no IPVS) so we can also use -P.

# rpm -q keepalived
keepalived-1.2.13-8.el7.x86_64

# keepalived -DPRdln
...
------< VRRP Topology >------
 VRRP Instance = VRRP_01
   Want State = MASTER
   Runing on device = br2
   Virtual Router ID = 200
   Priority = 182
   Advert interval = 2sec
   Virtual IP = 6
     192.168.102.201/32 dev br2 scope global
     192.168.102.202/32 dev br2 scope global
     192.168.102.203/32 dev br2 scope global
     fe80::67d:7bff:fefc:201/128 dev br2 scope global
     fe80::67d:7bff:fefc:202/128 dev br2 scope global
     fe80::67d:7bff:fefc:203/128 dev br2 scope global
...

Parser has correctly read 6 VIPs (3 IPv4 addrs and 3 IPv6 addrs). Use tcpdump to inspect VRRP advertisements:

# tcpdump -i br2 -n -v vrrp
tcpdump: listening on br2, link-type EN10MB (Ethernet), capture size 262144 bytes
13:08:31.805568 IP (tos 0xc0, ttl 255, id 163, offset 0, flags [none], proto VRRP (112), length 60)
    192.168.102.182 > 224.0.0.18: vrrp 192.168.102.182 > 224.0.0.18: VRRPv2, Advertisement, vrid 200, prio 182, authtype none, intvl 2s, length 40, addrs(6): 192.168.102.201,192.168.102.202,192.168.102.203,0.0.0.0,0.0.0.0,0.0.0.0

Note that the advertisement claims to have 6 addresses in the list, but only the IPv4 addresses are valid -- the IPv6 addresses are listed as "0.0.0.0". This is a quirk in the ad hoc IPv6 implementation for VRRPv2 in keepalived. The VRRPv2 RFC did not define IPv6 behavior. Note that in keepalived-1.2.13, IPv6 VIPs would *never* appear in VRRP advertisements. This can be confirmed by using only IPv6 addresses in "virtual_ipaddress" (i.e. comment out the IPv4 addresses) and restarting keepalived. The tcpdump will show an address count of 3, but the address list contains 3 "0.0.0.0" addresses:

# keepalived -DPRdln
...
------< VRRP Topology >------
 VRRP Instance = VRRP_01
   Want State = MASTER
   Runing on device = br2
   Virtual Router ID = 200
   Priority = 182
   Advert interval = 2sec
   Virtual IP = 3
     fe80::67d:7bff:fefc:201/128 dev br2 scope global
     fe80::67d:7bff:fefc:202/128 dev br2 scope global
     fe80::67d:7bff:fefc:203/128 dev br2 scope global
...

# tcpdump -i br2 -v -n vrrp
tcpdump: listening on br2, link-type EN10MB (Ethernet), capture size 262144 bytes
13:25:50.920719 IP (tos 0xc0, ttl 255, id 27, offset 0, flags [none], proto VRRP (112), length 48)
    192.168.102.182 > 224.0.0.18: vrrp 192.168.102.182 > 224.0.0.18: VRRPv2, Advertisement, vrid 200, prio 182, authtype none, intvl 2s, length 28, addrs(3): 0.0.0.0,0.0.0.0,0.0.0.0
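
The lengths reported by tcpdump in both captures are consistent with the VRRPv2 packet layout from RFC 3768, in which every advertised address occupies a 4-byte slot. The Python sketch below only redoes that arithmetic; the constant and function names are illustrative, and an IPv4 header without options is assumed.

```python
VRRP2_FIXED_HEADER = 8  # version/type, VRID, priority, count, auth type, interval, checksum
VRRP2_AUTH_DATA = 8     # two 32-bit authentication data words
IPV4_ADDRESS = 4        # each advertised address slot is 4 bytes wide
IPV4_HEADER = 20        # IPv4 header without options (assumption)


def vrrp2_lengths(address_count):
    """Return (VRRP length, total IP length) for a VRRPv2 advertisement."""
    vrrp_length = VRRP2_FIXED_HEADER + IPV4_ADDRESS * address_count + VRRP2_AUTH_DATA
    return vrrp_length, IPV4_HEADER + vrrp_length


for count in (6, 3):
    vrrp_length, ip_length = vrrp2_lengths(count)
    print(f"addrs({count}): VRRP length {vrrp_length}, IP length {ip_length}")
# addrs(6): VRRP length 40, IP length 60
# addrs(3): VRRP length 28, IP length 48
```

Since a slot is only 4 bytes wide, a 16-byte IPv6 address cannot be carried in it, which fits the "0.0.0.0" placeholders seen in the captures.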

Comment 15 Ryan O'Hara 2017-12-11 19:39:38 UTC
Description of new keepalived-1.3.5 behavior, without patch:

~~~
# cat /etc/keepalived/keepalived.conf
global_defs {
    router_id MESA_02
}

vrrp_instance VRRP_01 {
    state MASTER
    priority 182
    interface br2
    advert_int 2
    virtual_router_id 200
    virtual_ipaddress {
        192.168.102.201
        192.168.102.202
        192.168.102.203
        fe80::67d:7bff:fefc:201
        fe80::67d:7bff:fefc:202
        fe80::67d:7bff:fefc:203
    }
}
~~~

1. With keepalived-1.3.5, start keepalived from the command-line such that we dump configuration and run in the foreground. Only VRRP is needed here (no IPVS) so we can also use -P.

# rpm -q keepalived
keepalived-1.3.5-1.el7.x86_64

# keepalived -DPRdln
...
Opening file '/etc/keepalived/keepalived.conf'.
(VRRP_01): address family must match VRRP instance [fe80::67d:7bff:fefc:201] - ignoring
(VRRP_01): address family must match VRRP instance [fe80::67d:7bff:fefc:202] - ignoring
(VRRP_01): address family must match VRRP instance [fe80::67d:7bff:fefc:203] - ignoring
...
------< VRRP Topology >------
 VRRP Instance = VRRP_01
   Using VRRPv2
   Want State = MASTER
   Running on device = br2
   Skip checking advert IP addresses = no
   Enforcing strict VRRP compliance = no
   Using src_ip = 192.168.102.182
   Gratuitous ARP delay = 5
   Gratuitous ARP repeat = 5
   Gratuitous ARP refresh timer = 0
   Gratuitous ARP refresh repeat = 1
   Gratuitous ARP lower priority delay = 5
   Gratuitous ARP lower priority repeat = 5
   Send advert after receive lower priority advert = true
   Send advert after receive higher priority advert = false
   Virtual Router ID = 200
   Priority = 182
   Advert interval = 2 sec
   Accept enabled
   Promote_secondaries disabled
   Virtual IP = 3
     192.168.102.201/32 dev br2 scope global
     192.168.102.202/32 dev br2 scope global
     192.168.102.203/32 dev br2 scope global
...

Note that keepalived has ignored the 3 IPv6 VIPs because they do not match this instance's address family (IPv4). A log message is emitted for every address that does not match the family. The configuration dump (-d option) shows this instance has 3 IPv4 addresses only. Confirm with tcpdump:

# tcpdump -i br2 -v -n vrrp
tcpdump: listening on br2, link-type EN10MB (Ethernet), capture size 262144 bytes
13:35:13.854594 IP (tos 0xc0, ttl 255, id 94, offset 0, flags [none], proto VRRP (112), length 48)
    192.168.102.182 > 224.0.0.18: vrrp 192.168.102.182 > 224.0.0.18: VRRPv2, Advertisement, vrid 200, prio 182, authtype none, intvl 2s, length 28, addrs(3): 192.168.102.201,192.168.102.202,192.168.102.203

Most importantly, the IPv6 VIPs will *not* be created on the interface because they were ignored completely. In older versions of keepalived (see comment #14) all VIPs would be created on the interface.

# ip addr show dev br2
...
    inet 192.168.102.201/32 scope global br2
       valid_lft forever preferred_lft forever
    inet 192.168.102.202/32 scope global br2
       valid_lft forever preferred_lft forever
    inet 192.168.102.203/32 scope global br2
       valid_lft forever preferred_lft forever
...
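
The same check can be scripted. The Python sketch below compares a list of expected VIPs against captured "ip addr show" output; the function name missing_vips is hypothetical and the parsing only assumes the "inet"/"inet6" line format shown above.

```python
import ipaddress


def missing_vips(expected, ip_addr_output):
    """Return the expected VIPs that do not appear in `ip addr show` output."""
    present = set()
    for line in ip_addr_output.splitlines():
        fields = line.split()
        # Address lines look like "inet 192.168.102.201/32 scope global br2".
        if len(fields) >= 2 and fields[0] in ("inet", "inet6"):
            present.add(ipaddress.ip_interface(fields[1]).ip)
    return [vip for vip in expected if ipaddress.ip_address(vip) not in present]


captured = """\
    inet 192.168.102.201/32 scope global br2
       valid_lft forever preferred_lft forever
    inet 192.168.102.202/32 scope global br2
       valid_lft forever preferred_lft forever
    inet 192.168.102.203/32 scope global br2
       valid_lft forever preferred_lft forever
"""

print(missing_vips(["192.168.102.201", "fe80::67d:7bff:fefc:201"], captured))
```

With the unpatched keepalived-1.3.5-1.el7 output above, the IPv6 VIP is reported as missing.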

Comment 16 Ryan O'Hara 2017-12-11 19:52:47 UTC
Description of new keepalived-1.3.5 behavior, with patch:

~~~
# cat /etc/keepalived/keepalived.conf
global_defs {
    router_id MESA_02
}

vrrp_instance VRRP_01 {
    state MASTER
    priority 182
    interface br2
    advert_int 2
    virtual_router_id 200
    virtual_ipaddress {
        192.168.102.201
        192.168.102.202
        192.168.102.203
        fe80::67d:7bff:fefc:201
        fe80::67d:7bff:fefc:202
        fe80::67d:7bff:fefc:203
    }
}
~~~

1. With keepalived-1.3.5, start keepalived from the command-line such that we dump configuration and run in the foreground. Only VRRP is needed here (no IPVS) so we can also use -P.

# rpm -q keepalived
keepalived-1.3.5-5.el7.x86_64

# keepalived -DPRdln
...
Opening file '/etc/keepalived/keepalived.conf'.
(VRRP_01): address family does not match VRRP instance [fe80::67d:7bff:fefc:201]
(VRRP_01): address family does not match VRRP instance [fe80::67d:7bff:fefc:202]
(VRRP_01): address family does not match VRRP instance [fe80::67d:7bff:fefc:203]
...
------< VRRP Topology >------
 VRRP Instance = VRRP_01
   Using VRRPv2
   Want State = MASTER
   Running on device = br2
   Skip checking advert IP addresses = no
   Enforcing strict VRRP compliance = no
   Using src_ip = 192.168.102.182
   Gratuitous ARP delay = 5
   Gratuitous ARP repeat = 5
   Gratuitous ARP refresh timer = 0
   Gratuitous ARP refresh repeat = 1
   Gratuitous ARP lower priority delay = 5
   Gratuitous ARP lower priority repeat = 5
   Send advert after receive lower priority advert = true
   Send advert after receive higher priority advert = false
   Virtual Router ID = 200
   Priority = 182
   Advert interval = 2 sec
   Accept enabled
   Promote_secondaries disabled
   Virtual IP = 3
     192.168.102.201/32 dev br2 scope global
     192.168.102.202/32 dev br2 scope global
     192.168.102.203/32 dev br2 scope global
   Virtual IP Excluded = 3
     fe80::67d:7bff:fefc:201/128 dev br2 scope global
     fe80::67d:7bff:fefc:202/128 dev br2 scope global
     fe80::67d:7bff:fefc:203/128 dev br2 scope global
...

Note that with the patch, the 3 IPv6 VIPs now appear in the 'Virtual IP Excluded' list, meaning they will not be included in the VRRP advertisements (as before) but they *will* be created on the interface when keepalived is in MASTER state. Use tcpdump to see that mismatched VIPs are not in the advertisements:

# tcpdump -i br2 -v -n vrrp
tcpdump: listening on br2, link-type EN10MB (Ethernet), capture size 262144 bytes
13:50:45.487045 IP (tos 0xc0, ttl 255, id 99, offset 0, flags [none], proto VRRP (112), length 48)
    192.168.102.182 > 224.0.0.18: vrrp 192.168.102.182 > 224.0.0.18: VRRPv2, Advertisement, vrid 200, prio 182, authtype none, intvl 2s, length 28, addrs(3): 192.168.102.201,192.168.102.202,192.168.102.203

Verify that the IPv6 VIPs (excluded, not ignored) *and* the IPv4 VIPs are created on the interface:

# ip addr show dev br2
...
    inet 192.168.102.201/32 scope global br2
       valid_lft forever preferred_lft forever
    inet 192.168.102.202/32 scope global br2
       valid_lft forever preferred_lft forever
    inet 192.168.102.203/32 scope global br2
       valid_lft forever preferred_lft forever
    inet6 fe80::67d:7bff:fefc:203/128 scope link deprecated nodad 
       valid_lft forever preferred_lft 0sec
    inet6 fe80::67d:7bff:fefc:202/128 scope link deprecated nodad 
       valid_lft forever preferred_lft 0sec
    inet6 fe80::67d:7bff:fefc:201/128 scope link deprecated nodad 
       valid_lft forever preferred_lft 0sec
...

This is correct.

Comment 21 errata-xmlrpc 2018-04-10 18:15:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0972