Description of problem:

The bootstrap node now unconditionally sets the ENABLE_UNICAST environment variable to "yes" on all on-prem platforms, which makes it use the unicast_src_ip and unicast_peer settings in keepalived.conf. The control plane nodes, however, rely on the output of the onPremPlatformKeepalivedEnableUnicast function, which returns "yes" only for the baremetal and kubevirt platforms. On every other on-prem platform the bootstrap node and the control plane nodes therefore end up with different keepalived modes.

This issue was introduced by the template de-duplication effort in MCO: https://github.com/openshift/machine-config-operator/pull/2071

I noticed this issue while looking at why OpenStack CI in 4.7 was failing frequently with "Timeout waiting for bootstrap to initialize", and I'm hoping this is the cause.
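To make the mismatch described above concrete, here is a minimal Go sketch. It is not the MCO code: controlPlaneEnableUnicast is a hypothetical stand-in that only mirrors the behaviour the report attributes to onPremPlatformKeepalivedEnableUnicast, bootstrapEnableUnicast models the unconditional "yes" on the bootstrap node, and the platform list in main is an assumption about which on-prem platforms are involved.

```go
package main

import "fmt"

// controlPlaneEnableUnicast is a hypothetical stand-in for the behaviour the
// report describes for onPremPlatformKeepalivedEnableUnicast: "yes" only for
// baremetal and kubevirt. The real function's signature is not shown here.
func controlPlaneEnableUnicast(platform string) string {
	switch platform {
	case "baremetal", "kubevirt":
		return "yes"
	default:
		return "no"
	}
}

// bootstrapEnableUnicast models the regression: the bootstrap node gets
// ENABLE_UNICAST=yes regardless of the platform.
func bootstrapEnableUnicast(_ string) string {
	return "yes"
}

// mismatched reports whether the bootstrap node and the control plane nodes
// would be configured with different keepalived modes on this platform.
func mismatched(platform string) bool {
	return bootstrapEnableUnicast(platform) != controlPlaneEnableUnicast(platform)
}

func main() {
	// Assumed list of on-prem platforms, for illustration only.
	platforms := []string{"baremetal", "kubevirt", "openstack", "ovirt", "vsphere"}
	for _, p := range platforms {
		fmt.Printf("%-10s bootstrap=%-3s control-plane=%-3s mismatch=%v\n",
			p, bootstrapEnableUnicast(p), controlPlaneEnableUnicast(p), mismatched(p))
	}
}
```

Under these assumptions only baremetal and kubevirt agree; on OpenStack the bootstrap node would run in unicast mode while the masters stay on multicast, which fits the symptom in the report.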
Checked with 4.7.0-0.nightly-2020-12-04-013308, and it's fixed now for OpenStack platform.

$ oc get pods -n openshift-openstack-infra -l app=openstack-infra-vrrp -o wide
NAME                                           READY   STATUS    RESTARTS   AGE     IP              NODE                                NOMINATED NODE   READINESS GATES
keepalived-wj47ios1208b-xn57g-master-0         2/2     Running   0          6h19m   192.168.2.236   wj47ios1208b-xn57g-master-0         <none>           <none>
keepalived-wj47ios1208b-xn57g-master-1         2/2     Running   0          6h19m   192.168.2.90    wj47ios1208b-xn57g-master-1         <none>           <none>
keepalived-wj47ios1208b-xn57g-master-2         2/2     Running   0          6h18m   192.168.3.121   wj47ios1208b-xn57g-master-2         <none>           <none>
keepalived-wj47ios1208b-xn57g-worker-0-9h7xm   1/2     Running   0          6h3m    192.168.0.23    wj47ios1208b-xn57g-worker-0-9h7xm   <none>           <none>
keepalived-wj47ios1208b-xn57g-worker-0-ktskl   1/2     Running   0          6h6m    192.168.3.118   wj47ios1208b-xn57g-worker-0-ktskl   <none>           <none>
keepalived-wj47ios1208b-xn57g-worker-0-vw297   1/2     Running   0          6h7m    192.168.3.119   wj47ios1208b-xn57g-worker-0-vw297   <none>           <none>

$ oc get pods -n openshift-openstack-infra -l app=openstack-infra-vrrp -o json | jq -r '.items[].spec.containers[].env[]|select(.name=="ENABLE_UNICAST")'
{
  "name": "ENABLE_UNICAST",
  "value": "no"
}
{
  "name": "ENABLE_UNICAST",
  "value": "no"
}
{
  "name": "ENABLE_UNICAST",
  "value": "no"
}
{
  "name": "ENABLE_UNICAST",
  "value": "no"
}
{
  "name": "ENABLE_UNICAST",
  "value": "no"
}
{
  "name": "ENABLE_UNICAST",
  "value": "no"
}
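The manual check above (every ENABLE_UNICAST entry should read "no" on OpenStack) could be automated along these lines. This is only a sketch: the envVar type and the countMismatches helper are hypothetical names introduced here, and the sample input is an abbreviated copy of the jq output from this comment rather than live cluster data.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// envVar matches the objects printed by the jq filter in the comment above.
type envVar struct {
	Name  string `json:"name"`
	Value string `json:"value"`
}

// countMismatches decodes a stream of concatenated JSON objects and returns
// how many ENABLE_UNICAST entries were seen and how many differ from want.
func countMismatches(data []byte, want string) (total, bad int, err error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	for {
		var e envVar
		derr := dec.Decode(&e)
		if derr == io.EOF {
			return total, bad, nil
		}
		if derr != nil {
			return total, bad, derr
		}
		if e.Name != "ENABLE_UNICAST" {
			continue // ignore unrelated environment variables
		}
		total++
		if e.Value != want {
			bad++
		}
	}
}

func main() {
	// Abbreviated sample in the same shape as the jq output in this comment.
	sample := []byte(`{ "name": "ENABLE_UNICAST", "value": "no" }
{ "name": "ENABLE_UNICAST", "value": "no" }
{ "name": "ENABLE_UNICAST", "value": "no" }`)

	total, bad, err := countMismatches(sample, "no")
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("checked %d entries, %d unexpected\n", total, bad)
}
```

json.Decoder reads back-to-back JSON values, so the jq output can be passed through unchanged without wrapping it in an array first.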
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633