Bug 1901472

Summary: [OSP] Bootstrap and master nodes use different keepalived unicast setting
Product: OpenShift Container Platform Reporter: Martin André <m.andre>
Component: Machine Config Operator Assignee: Martin André <m.andre>
Status: CLOSED ERRATA QA Contact: weiwei jiang <wjiang>
Severity: high Docs Contact:
Priority: high    
Version: 4.7 CC: bperkins, pprinett
Target Milestone: --- Keywords: UpcomingSprint
Target Release: 4.7.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-02-24 15:35:50 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Martin André 2020-11-25 11:18:11 UTC
Description of problem:

The bootstrap node now unconditionally sets the ENABLE_UNICAST environment variable to yes on all on-prem platforms, causing it to use the unicast_src_ip and unicast_peer settings in keepalived.conf. The control plane nodes instead rely on the output of the onPremPlatformKeepalivedEnableUnicast function, which returns yes only for the baremetal and kubevirt platforms. On every other on-prem platform, the bootstrap node and the control plane nodes therefore end up with different keepalived unicast settings.
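
The mismatch described above can be illustrated with a minimal Go sketch. This is not the MCO source: enableUnicastForPlatform is a hypothetical stand-in for onPremPlatformKeepalivedEnableUnicast, bootstrapEnableUnicastBuggy is a hypothetical model of the bootstrap behaviour, and the platform strings are assumptions chosen for illustration.

```go
package main

import "fmt"

// enableUnicastForPlatform is a hypothetical stand-in for the
// onPremPlatformKeepalivedEnableUnicast function described in this report:
// it returns "yes" only for the baremetal and kubevirt platforms.
func enableUnicastForPlatform(platform string) string {
	switch platform {
	case "baremetal", "kubevirt":
		return "yes"
	default:
		return "no"
	}
}

// bootstrapEnableUnicastBuggy models the reported behaviour (an assumption,
// not the real template): the bootstrap node sets ENABLE_UNICAST=yes
// regardless of the platform it runs on.
func bootstrapEnableUnicastBuggy(platform string) string {
	return "yes"
}

func main() {
	// Platform names are illustrative assumptions.
	platforms := []string{"baremetal", "kubevirt", "openstack", "ovirt", "vsphere"}
	for _, p := range platforms {
		boot := bootstrapEnableUnicastBuggy(p)
		master := enableUnicastForPlatform(p)
		fmt.Printf("%-10s bootstrap=%-3s master=%-3s mismatch=%t\n",
			p, boot, master, boot != master)
	}
}
```

Under these assumptions, every platform other than baremetal and kubevirt reports a mismatch, which is the situation seen on OpenStack.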

This issue was introduced in the template de-duplication effort in MCO by
https://github.com/openshift/machine-config-operator/pull/2071.

I noticed this issue while looking at why OpenStack CI in 4.7 was failing frequently with "Timeout waiting for bootstrap to initialize", and I'm hoping this is the cause.

Comment 3 weiwei jiang 2020-12-08 08:55:07 UTC
Checked with 4.7.0-0.nightly-2020-12-04-013308; it is now fixed on the OpenStack platform.

$ oc get pods -n openshift-openstack-infra  -l app=openstack-infra-vrrp  -o wide
NAME                                           READY   STATUS    RESTARTS   AGE     IP              NODE                                NOMINATED NODE   READINESS GATES
keepalived-wj47ios1208b-xn57g-master-0         2/2     Running   0          6h19m   192.168.2.236   wj47ios1208b-xn57g-master-0         <none>           <none>
keepalived-wj47ios1208b-xn57g-master-1         2/2     Running   0          6h19m   192.168.2.90    wj47ios1208b-xn57g-master-1         <none>           <none>
keepalived-wj47ios1208b-xn57g-master-2         2/2     Running   0          6h18m   192.168.3.121   wj47ios1208b-xn57g-master-2         <none>           <none>
keepalived-wj47ios1208b-xn57g-worker-0-9h7xm   1/2     Running   0          6h3m    192.168.0.23    wj47ios1208b-xn57g-worker-0-9h7xm   <none>           <none>
keepalived-wj47ios1208b-xn57g-worker-0-ktskl   1/2     Running   0          6h6m    192.168.3.118   wj47ios1208b-xn57g-worker-0-ktskl   <none>           <none>
keepalived-wj47ios1208b-xn57g-worker-0-vw297   1/2     Running   0          6h7m    192.168.3.119   wj47ios1208b-xn57g-worker-0-vw297   <none>           <none>
$ oc get pods -n openshift-openstack-infra  -l app=openstack-infra-vrrp -o json  | jq -r '.items[].spec.containers[].env[]|select(.name=="ENABLE_UNICAST")'
{
  "name": "ENABLE_UNICAST",
  "value": "no"
}
{
  "name": "ENABLE_UNICAST",
  "value": "no"
}
{
  "name": "ENABLE_UNICAST",
  "value": "no"
}
{
  "name": "ENABLE_UNICAST",
  "value": "no"
}
{
  "name": "ENABLE_UNICAST",
  "value": "no"
}
{
  "name": "ENABLE_UNICAST",
  "value": "no"
}

Comment 7 errata-xmlrpc 2021-02-24 15:35:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633