Description of problem:
The keepalived static pod on the bootstrap machine keeps crashlooping. This is because the VRID we calculate occasionally ends up being 0 which keepalived considers invalid.

How reproducible:
random

Actual results:
The keepalived static pod does not start up properly.

Expected results:
The keepalived pod should always start up.
Keepalived logs on the bootstrap machine:

Starting Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
Opening file '/etc/keepalived/keepalived.conf'.
Starting VRRP child process, pid=7
Registering Kernel netlink reflector
Registering Kernel netlink command channel
Registering gratuitous ARP shared channel
Opening file '/etc/keepalived/keepalived.conf'.
VRRP Error : VRID not valid - must be between 1 & 255. reconfigure !
Truncating auth_pass to 8 characters
Truncating auth_pass to 8 characters
VRRP_Instance(c3rs517m-90437_API) the virtual id must be set!
Stopped
Keepalived_vrrp exited with permanent error CONFIG. Terminating
Stopping
Stopped Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
GitHub issue: https://github.com/openshift/baremetal-runtimecfg/issues/21
Fixed by: https://github.com/openshift/baremetal-runtimecfg/pull/23
We're no longer seeing these issues in our CI.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0062