Description of problem:
When the default route is moved from a NIC to a Linux bridge, keepalived does not update its configuration to move its IPs onto the bridge. As a result, we lose connectivity to the API server.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Move IP from NIC to bridge

Actual results:
The bridge doesn't get the keepalived IPs. Connectivity to the API server is lost.

Expected results:
The API server is still reachable. The keepalived IPs are moved to the bridge.

Additional notes:
Antoni Segura Puimedon suggested adding a liveness probe to the keepalived pod, which would restart and reconfigure it if the linked interface no longer has an IP.
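The liveness check suggested in the additional notes could look roughly like the sketch below. It is only an illustration, not the probe that was implemented: the function name is hypothetical, and it assumes the probe is fed the one-line output of `ip -o -4 addr show` and reports failure when the interface keepalived is bound to no longer carries an IPv4 address.

```python
import ipaddress


def interface_has_ipv4(ip_addr_output: str, ifname: str) -> bool:
    """Return True if `ifname` has at least one IPv4 address.

    `ip_addr_output` is assumed to be the text printed by
    `ip -o -4 addr show`, where each line looks like:
        "<index>: <ifname>    inet <addr>/<prefix> ..."
    """
    for line in ip_addr_output.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == ifname and fields[2] == "inet":
            try:
                # Validate that the field really is an address/prefix pair.
                ipaddress.ip_interface(fields[3])
            except ValueError:
                continue
            return True
    return False


if __name__ == "__main__":
    # Hypothetical sample: the address has already moved from ens4 to br10.
    sample = (
        "1: lo    inet 127.0.0.1/8 scope host lo\n"
        "3: br10    inet 192.168.111.23/24 brd 192.168.111.255 scope global br10\n"
    )
    # A probe bound to ens4 would fail here and trigger a restart.
    print(interface_has_ipv4(sample, "ens4"))
    print(interface_has_ipv4(sample, "br10"))
```

A probe built this way would exit non-zero when the function returns False for the configured interface, letting the pod be restarted and its config re-rendered.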
*** Bug 1744560 has been marked as a duplicate of this bug. ***
Bug fix verified by setting up a Linux bridge over the external interface:

  nmcli con add type bridge ifname br10
  nmcli con add type bridge-slave ifname ens4 master br10
  nmcli con up bridge-slave-ens4

As expected, keepalived-monitor observed the network change and rendered a valid config file. Meanwhile, DNS_VIP and INGRESS_VIP migrated to another host in the cluster.
I observed this issue again on OpenShift 4.4. I reconfigured the host so that the interface that was originally carrying the VIPs is attached to an OVS bridge. While keepalived-monitor observed the change and rendered a new config, the keepalived container failed with 'permanent error CONFIG'. I have a cluster available in case you want to debug it there.

keepalived-monitor logs:

time="2020-03-27T10:54:03Z" level=info msg="Config change detected" new config="{{ostest test.metalkube.org 192.168.111.5 14 192.168.111.2 10 192.168.111.4 93 24 0 } {0 0 0 [] } 192.168.111.23 worker-0 brcnv [192.168.111.1]}"
time="2020-03-27T10:54:03Z" level=info msg="Runtimecfg rendering template" path=/etc/keepalived/keepalived.conf

keepalived.conf:

vrrp_script chk_ingress {
    script "/usr/bin/curl -o /dev/null -kLs http://0:1936/healthz"
    interval 1
    weight 50
}

vrrp_instance ostest_INGRESS {
    state BACKUP
    interface brcnv
    virtual_router_id 93
    priority 40
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass cluster_uuid_ingress_vip
    }
    virtual_ipaddress {
        192.168.111.4/24
    }
    track_script {
        chk_ingress
    }
}

keepalived logs:

The client sent: reload
Opening file '/etc/keepalived/keepalived.conf'.
Stopped
Keepalived_vrrp exited with permanent error CONFIG. Terminating
Stopping
Stopped Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
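When debugging a reload failure like this, one quick check is to confirm which interface each vrrp_instance in the rendered file is bound to, and then compare that against the interfaces that actually exist on the host. The sketch below is a hypothetical helper, not part of keepalived or keepalived-monitor. It uses a naive whitespace tokenizer rather than a full keepalived.conf parser, so it only handles simple files like the one above.

```python
def vrrp_instance_interfaces(conf_text: str) -> dict:
    """Map each vrrp_instance name to the value of its `interface` keyword.

    Naive token scan: assumes the first `interface` keyword that follows a
    `vrrp_instance <name>` declaration belongs to that instance.
    """
    tokens = conf_text.split()
    result = {}
    current = None
    for i, tok in enumerate(tokens):
        if i + 1 >= len(tokens):
            break
        if tok == "vrrp_instance":
            current = tokens[i + 1]
        elif tok == "interface" and current is not None:
            # Keep only the first interface seen for each instance.
            result.setdefault(current, tokens[i + 1])
    return result


if __name__ == "__main__":
    conf = """
    vrrp_script chk_ingress { script "/usr/bin/curl -o /dev/null -kLs http://0:1936/healthz" interval 1 weight 50 }
    vrrp_instance ostest_INGRESS { state BACKUP interface brcnv virtual_router_id 93 priority 40 }
    """
    print(vrrp_instance_interfaces(conf))
```

For the config in this comment, the helper reports that ostest_INGRESS is bound to brcnv, which matches the interface name in the keepalived-monitor log. That suggests the template was rendered against the new bridge and the CONFIG error has some other cause.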
Sorry for the noise. I will clone this instead.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409