+++ This bug was initially created as a clone of Bug #1775119 +++

Description of problem:
The keepalived static pod on the bootstrap machine keeps crashlooping. This is because the VRID we calculate occasionally ends up being 0 which keepalived considers invalid.

How reproducible:
random

Actual results:
The keepalived static pod does not start up properly.

Expected results:
The keepalived pod should always start up.

--- Additional comment from Tomas Sedovic on 2019-11-21 12:24:48 UTC ---

Keepalived logs on the bootstrap machine:

Starting Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
Opening file '/etc/keepalived/keepalived.conf'.
Starting VRRP child process, pid=7
Registering Kernel netlink reflector
Registering Kernel netlink command channel
Registering gratuitous ARP shared channel
Opening file '/etc/keepalived/keepalived.conf'.
VRRP Error : VRID not valid - must be between 1 & 255. reconfigure !
Truncating auth_pass to 8 characters
Truncating auth_pass to 8 characters
VRRP_Instance(c3rs517m-90437_API) the virtual id must be set!
Stopped
Keepalived_vrrp exited with permanent error CONFIG. Terminating
Stopping
Stopped Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2

--- Additional comment from Tomas Sedovic on 2019-11-21 12:25:40 UTC ---

Github issue: https://github.com/openshift/baremetal-runtimecfg/issues/21

--- Additional comment from Tomas Sedovic on 2019-11-21 12:26:44 UTC ---

Fixed by: https://github.com/openshift/baremetal-runtimecfg/pull/23

--- Additional comment from Tomas Sedovic on 2019-11-21 12:27:07 UTC ---

We're no longer seeing these issues in our CI.
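For illustration only, here is one way a name-derived VRID can be kept inside the range keepalived accepts (1 to 255). This is a minimal sketch and not the code from the linked pull request: the function name `vrid_from_name`, the use of CRC32 as the checksum, and the `_API` suffix convention are all assumptions made for the example.

```python
import zlib


def vrid_from_name(name: str) -> int:
    """Map a name to a VRRP virtual router ID in the range 1..255.

    Hypothetical helper: CRC32 is used here only as a convenient
    deterministic checksum, not because the real project uses it.
    """
    checksum = zlib.crc32(name.encode("utf-8"))
    # checksum % 255 lies in 0..254; adding 1 shifts it to 1..255,
    # so the result can never be 0 (which keepalived rejects).
    return 1 + (checksum % 255)


if __name__ == "__main__":
    # Instance name taken from the failing log above.
    print(vrid_from_name("c3rs517m-90437_API"))
```

The point of the sketch is the final step: reducing the checksum modulo 255 and adding 1, rather than truncating it to 8 bits, removes 0 from the set of possible results.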
https://github.com/openshift/baremetal-runtimecfg/pull/35
Verified on 4.3.0-0.nightly-2019-12-13-180405

# crictl logs -f 6cc8c7a189023
Starting Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
Opening file '/etc/keepalived/keepalived.conf'.
Starting VRRP child process, pid=7
Registering Kernel netlink reflector
Registering Kernel netlink command channel
Registering gratuitous ARP shared channel
Opening file '/etc/keepalived/keepalived.conf'.
Truncating auth_pass to 8 characters
Truncating auth_pass to 8 characters
VRRP_Instance(mrnd-13-43-no_API) removing protocol VIPs.
VRRP_Instance(mrnd-13-43-no_DNS) removing protocol VIPs.
Using LinkWatch kernel netlink reflector...
VRRP_Instance(mrnd-13-43-no_API) Entering BACKUP STATE
VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(9,10)]
VRRP_Instance(mrnd-13-43-no_DNS) Transition to MASTER STATE
VRRP_Instance(mrnd-13-43-no_DNS) Entering MASTER STATE
VRRP_Instance(mrnd-13-43-no_DNS) setting protocol VIPs.
Sending gratuitous ARP on ens3 for 192.168.0.6
VRRP_Instance(mrnd-13-43-no_DNS) Sending/queueing gratuitous ARPs on ens3 for 192.168.0.6
Sending gratuitous ARP on ens3 for 192.168.0.6
Sending gratuitous ARP on ens3 for 192.168.0.6
Sending gratuitous ARP on ens3 for 192.168.0.6
Sending gratuitous ARP on ens3 for 192.168.0.6
VRRP_Instance(mrnd-13-43-no_API) Transition to MASTER STATE
VRRP_Instance(mrnd-13-43-no_API) Entering MASTER STATE
VRRP_Instance(mrnd-13-43-no_API) setting protocol VIPs.
Sending gratuitous ARP on ens3 for 192.168.0.5
VRRP_Instance(mrnd-13-43-no_API) Sending/queueing gratuitous ARPs on ens3 for 192.168.0.5
Sending gratuitous ARP on ens3 for 192.168.0.5
Sending gratuitous ARP on ens3 for 192.168.0.5
Sending gratuitous ARP on ens3 for 192.168.0.5
Sending gratuitous ARP on ens3 for 192.168.0.5
Sending gratuitous ARP on ens3 for 192.168.0.6
VRRP_Instance(mrnd-13-43-no_DNS) Sending/queueing gratuitous ARPs on ens3 for 192.168.0.6
Sending gratuitous ARP on ens3 for 192.168.0.6
Sending gratuitous ARP on ens3 for 192.168.0.6
Sending gratuitous ARP on ens3 for 192.168.0.6
Sending gratuitous ARP on ens3 for 192.168.0.6
Sending gratuitous ARP on ens3 for 192.168.0.5
VRRP_Instance(mrnd-13-43-no_API) Sending/queueing gratuitous ARPs on ens3 for 192.168.0.5
Sending gratuitous ARP on ens3 for 192.168.0.5
Sending gratuitous ARP on ens3 for 192.168.0.5
Sending gratuitous ARP on ens3 for 192.168.0.5
Sending gratuitous ARP on ens3 for 192.168.0.5
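A verification like this could also be scripted rather than checked by eye. The sketch below is an assumption about how one might do that, not part of the original verification: it scans captured keepalived output for the three error messages seen in the failing log in this report. The helper name `find_vrid_errors` and the marker list are hypothetical, and the log text would have to be captured separately (for example from `crictl logs`).

```python
from typing import List

# Substrings copied from the failing keepalived log in this report.
FATAL_MARKERS = (
    "VRID not valid",
    "the virtual id must be set!",
    "exited with permanent error CONFIG",
)


def find_vrid_errors(log_text: str) -> List[str]:
    """Return the log lines that contain any of the known VRID error markers."""
    return [
        line
        for line in log_text.splitlines()
        if any(marker in line for marker in FATAL_MARKERS)
    ]


if __name__ == "__main__":
    sample = (
        "Opening file '/etc/keepalived/keepalived.conf'.\n"
        "VRRP Error : VRID not valid - must be between 1 & 255. reconfigure !\n"
    )
    for line in find_vrid_errors(sample):
        print(line)
```

An empty result for the log of a running pod would be consistent with the fix, though it only shows that these specific messages did not appear.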
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0014