Bug 1881182

Summary: dual-stack bare metal install fails to create workers due to CSR approval failure
Product: OpenShift Container Platform
Component: Installer
Installer sub component: OpenShift on Bare Metal IPI
Version: 4.6
Target Release: 4.6.0
Status: CLOSED ERRATA
Severity: urgent
Priority: high
Reporter: Dan Winship <danw>
Assignee: Beth White <beth.white>
QA Contact: Shelly Miron <smiron>
CC: brad, derekh, rbartal, rbryant, stbenjam, zbitter
Keywords: OtherQA, Triaged, UpcomingSprint
Hardware: Unspecified
OS: Unspecified
Type: Bug
Last Closed: 2020-10-27 16:43:35 UTC
Attachments:
  Description                                        Flags
  machine-approver-controller log                    none
  oc get bmh -n openshift-machine-api -o yaml        none
  oc get machine -n openshift-machine-api -o yaml    none
  oc get node -o yaml                                none

Description Dan Winship 2020-09-21 17:36:48 UTC
Bringing up a dual-stack bare metal cluster with dev-scripts master and latest OCP plus https://github.com/openshift/ovn-kubernetes/pull/278 plus https://github.com/openshift/machine-config-operator/pull/2108...

The masters come up and things are progressing, but eventually the install fails because the workers are never created:

  [root@dwinship1 dev-scripts]# oc get machines --all-namespaces
  NAMESPACE               NAME                          PHASE         TYPE   REGION   ZONE   AGE
  openshift-machine-api   ostest-jk6xs-master-0         Running                              88m
  openshift-machine-api   ostest-jk6xs-master-1         Running                              88m
  openshift-machine-api   ostest-jk6xs-master-2         Running                              88m
  openshift-machine-api   ostest-jk6xs-worker-0-f6wfk   Provisioned                          53m
  openshift-machine-api   ostest-jk6xs-worker-0-vt5mk   Provisioned                          53m

The machine-approver logs show, e.g.:

  I0921 16:55:11.547658       1 main.go:147] CSR csr-pqwcl added
  I0921 16:55:11.559958       1 main.go:182] CSR csr-pqwcl not authorized: failed to find machine for node worker-0
  I0921 16:55:11.559980       1 main.go:218] Error syncing csr csr-pqwcl: failed to find machine for node worker-0

CSR approval seems busted:

  [root@dwinship1 dev-scripts]# oc get csr
  NAME        AGE     SIGNERNAME                                    REQUESTOR                                                                   CONDITION
  csr-6rxx9   22m     kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
  csr-jt9n4   53m     kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
  csr-ldzqt   37m     kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
  csr-mbs9w   68m     kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
  csr-nsrpf   6m57s   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
  csr-p27lz   68m     kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
  csr-pqwcl   22m     kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
  csr-q6dp9   6m42s   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
  csr-x9q5m   53m     kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
  csr-xw5rm   37m     kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending

Manually approving the pending CSRs allows the cluster to come up.
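
For anyone needing the same workaround, here is a rough Python sketch that picks the pending CSR names out of `oc get csr` output and builds the matching `oc adm certificate approve` commands. The function names and the last sample row are made up for illustration. The script only prints the commands and does not run them. Note that approving everything by hand skips the check the machine-approver would normally do, so it is worth confirming the requests are expected first.

```python
from typing import List

# The first two data rows are copied from the listing above. The last row is a
# made-up example of an already-approved CSR, added only to show the filtering.
SAMPLE_OUTPUT = "\n".join([
    "NAME        AGE   SIGNERNAME                                    REQUESTOR                                                                   CONDITION",
    "csr-6rxx9   22m   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending",
    "csr-jt9n4   53m   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending",
    "csr-example 90m   kubernetes.io/kube-apiserver-client-kubelet   system:node:master-0                                                        Approved,Issued",
])


def pending_csr_names(oc_get_csr_output: str) -> List[str]:
    """Return the names of CSRs whose CONDITION column is exactly 'Pending'."""
    names = []
    for line in oc_get_csr_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if len(fields) < 2 or fields[0] == "NAME":
            continue
        if fields[-1] == "Pending":
            names.append(fields[0])
    return names


def approve_commands(names: List[str]) -> List[List[str]]:
    """Build one 'oc adm certificate approve <name>' argv list per CSR."""
    return [["oc", "adm", "certificate", "approve", name] for name in names]


if __name__ == "__main__":
    for cmd in approve_commands(pending_csr_names(SAMPLE_OUTPUT)):
        print(" ".join(cmd))
```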

Comment 1 Dan Winship 2020-09-21 17:37:22 UTC
Created attachment 1715574 [details]
machine-approver-controller log

Comment 2 Dan Winship 2020-09-21 17:38:01 UTC
Created attachment 1715575 [details]
oc get bmh -n openshift-machine-api -o yaml

Comment 3 Dan Winship 2020-09-21 17:38:23 UTC
Created attachment 1715576 [details]
oc get machine -n openshift-machine-api -o yaml

Comment 4 Dan Winship 2020-09-21 17:38:54 UTC
Created attachment 1715577 [details]
oc get node -o yaml

(attachments are all from _after_ manually approving the pending CSRs)

Comment 5 Russell Bryant 2020-09-21 18:01:14 UTC
Notes on what I see in the debug info:

BareMetalHosts:
 - we only have 1 IP per network interface.  This is a gap in the metal3 baremetal-operator.  This doesn't seem to be causing this problem, though.  https://github.com/metal3-io/baremetal-operator/issues/458

In the cluster-machine-approver log, we have:

>   I0921 16:55:11.559980       1 main.go:218] Error syncing csr csr-pqwcl: failed to find machine for node worker-0

So it's looking for a Machine resource with an InternalDNS name of "worker-0".  However, the hostnames we have on the worker Machines (which come from the BareMetalHost) are:

 - worker-0.ostest.test.metalkube.org
 - worker-1.ostest.test.metalkube.org

One interesting note is that the hostnames we have for the masters are:

 - master-0
 - master-1
 - master-2

So the root cause of this failure seems to be a mismatch in how we're collecting and reporting the hostname for these workers, and how the hostname is determined by kubelet and put in its CSR.

Comment 6 Zane Bitter 2020-09-21 18:34:40 UTC
The error message appears to be coming from here:
https://github.com/openshift/cluster-machine-approver/blob/master/csr_check.go#L269-L272

Here is the Machine (ostest-jk6xs-worker-0-vt5mk) status:

  status:
    addresses:
    - address: 192.168.111.23
      type: InternalIP
    - address: fd00:1101::e45d:2711:3ff3:5c2b
      type: InternalIP
    - address: worker-0.ostest.test.metalkube.org
      type: Hostname
    - address: worker-0.ostest.test.metalkube.org
      type: InternalDNS
    lastUpdated: "2020-09-21T17:14:16Z"
    nodeRef:
      kind: Node
      name: worker-0
      uid: 6c67cd35-a5d4-4d67-874f-95d66825531f
    phase: Running

It's linked to the correct Node, but the cluster-machine-approver doesn't use the nodeRef to match the node. Instead it uses the internal DNS name, which must match the Node name:
https://github.com/openshift/cluster-machine-approver/blob/master/csr_check.go#L343-L345

So the proximate cause of the issue is that the Machine has its internal DNS name set to the fully-qualified "worker-0.ostest.test.metalkube.org" instead of the usual "worker-0".
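
To make the mismatch concrete, here is a simplified Python model of the matching described above. It is not the approver's actual code (that is Go, at the csr_check.go links), and the function name and dict layout are invented for illustration. The addresses are the ones from the Machine status shown earlier in this comment.

```python
from typing import Dict, List, Optional

# Machine status addresses as shown above (trimmed to the relevant fields).
WORKER_MACHINE = {
    "name": "ostest-jk6xs-worker-0-vt5mk",
    "addresses": [
        {"address": "192.168.111.23", "type": "InternalIP"},
        {"address": "fd00:1101::e45d:2711:3ff3:5c2b", "type": "InternalIP"},
        {"address": "worker-0.ostest.test.metalkube.org", "type": "Hostname"},
        {"address": "worker-0.ostest.test.metalkube.org", "type": "InternalDNS"},
    ],
}


def find_machine_for_node(node_name: str, machines: List[Dict]) -> Optional[str]:
    """Return the name of the Machine whose InternalDNS address equals node_name.

    This models an exact string comparison: a short name does not match
    an FQDN that merely starts with it.
    """
    for machine in machines:
        for addr in machine["addresses"]:
            if addr["type"] == "InternalDNS" and addr["address"] == node_name:
                return machine["name"]
    return None


if __name__ == "__main__":
    # The kubelet's CSR names the node "worker-0", so no Machine is found.
    print(find_machine_for_node("worker-0", [WORKER_MACHINE]))
    # Only the fully-qualified name would have matched.
    print(find_machine_for_node("worker-0.ostest.test.metalkube.org", [WORKER_MACHINE]))
```

Under this model, either side changing (the Machine reporting "worker-0", or the node registering with the FQDN) would make the lookup succeed.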

The hostname is populated from the HardwareDetails in the BareMetalHost:
https://github.com/openshift/cluster-api-provider-baremetal/blob/master/pkg/cloud/baremetal/actuators/machine/actuator.go#L794-L803
which gets it directly from ironic-inspector:
https://github.com/openshift/baremetal-operator/blob/master/pkg/provisioner/ironic/hardwaredetails/hardwaredetails.go#L22
which presumably gets it directly from IPA running during inspection after the Host is created:
https://opendev.org/openstack/ironic-python-agent/src/branch/master/ironic_python_agent/netutils.py#L233-L237
which calls Python's socket.gethostname():
https://docs.python.org/3/library/socket.html#socket.gethostname

It's a reasonable bet that this is set based on DHCP. Notably this is only happening on the workers, so when the masters are inspected (by the installer) it's getting just the hostname instead of a FQDN.
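
A small Python sketch of what that last step implies: socket.gethostname() returns whatever hostname the system currently has, so whether inspection reports a short name or an FQDN depends on how the hostname was set at that moment. The helper names here are made up, and treating "contains a dot" as "is an FQDN" is a simplifying assumption.

```python
import socket


def classify_hostname(name: str) -> str:
    """Label a hostname 'fqdn' if it contains a dot, otherwise 'short'."""
    return "fqdn" if "." in name else "short"


def local_hostname_kind() -> str:
    """Classify whatever socket.gethostname() returns on this machine."""
    return classify_hostname(socket.gethostname())


if __name__ == "__main__":
    # The two forms seen in this bug:
    print(classify_hostname("worker-0"))
    print(classify_hostname("worker-0.ostest.test.metalkube.org"))
    # What this host would report if inspected right now:
    print(socket.gethostname(), local_hostname_kind())
```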

Comment 7 Dan Winship 2020-09-21 20:45:00 UTC
Sep 21 15:55:25 worker-0 NetworkManager[776]: <info>  [1600703725.1629] dhcp4 (enp2s0): option domain_name          => 'ostest.test.metalkube.org'
Sep 21 15:55:25 worker-0 NetworkManager[776]: <info>  [1600703725.1629] dhcp4 (enp2s0): option host_name            => 'worker-0'

[core@worker-0 ~]$ hostname
worker-0
[core@worker-0 ~]$ hostname -f
worker-0.ostest.test.metalkube.org

so, it knows its FQDN, but it's not reporting it by default as the hostname
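
For reference, a rough Python sketch that pulls the DHCPv4 options out of NetworkManager journal lines like the two above and shows how host_name and domain_name combine into the FQDN that `hostname -f` prints. The regular expression is written against the log format seen in this comment only, and the function names are made up.

```python
import re
from typing import Dict

# Matches lines such as:
#   ... dhcp4 (enp2s0): option host_name            => 'worker-0'
DHCP4_OPTION_RE = re.compile(
    r"dhcp4 \((?P<iface>[^)]+)\): option (?P<name>\S+)\s+=> '(?P<value>[^']*)'"
)

# The two journal lines quoted above.
JOURNAL_LINES = [
    "Sep 21 15:55:25 worker-0 NetworkManager[776]: <info>  [1600703725.1629] dhcp4 (enp2s0): option domain_name          => 'ostest.test.metalkube.org'",
    "Sep 21 15:55:25 worker-0 NetworkManager[776]: <info>  [1600703725.1629] dhcp4 (enp2s0): option host_name            => 'worker-0'",
]


def dhcp4_options(lines) -> Dict[str, str]:
    """Collect option name -> value from NetworkManager dhcp4 log lines."""
    options = {}
    for line in lines:
        match = DHCP4_OPTION_RE.search(line)
        if match:
            options[match.group("name")] = match.group("value")
    return options


def fqdn_from_options(options: Dict[str, str]) -> str:
    """Join host_name and domain_name when both are present."""
    host = options.get("host_name", "")
    domain = options.get("domain_name", "")
    return f"{host}.{domain}" if host and domain else host


if __name__ == "__main__":
    opts = dhcp4_options(JOURNAL_LINES)
    print(opts["host_name"])         # the short name the node uses
    print(fqdn_from_options(opts))   # the name that ended up on the Machine
```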

Comment 8 Dan Winship 2020-09-21 20:47:26 UTC
(The dhcpv6 results don't have domain_name or host_name set, which may be why we get different results (FQDN nodenames) with single-stack IPv6.)

Comment 9 Zane Bitter 2020-09-22 16:50:03 UTC
It appears we implemented a hack to make the FQDN show up on IPv6, since that was necessary to make single-stack IPv6 work: bug 1806001

I guess we need to disable this on dual-stack.

Comment 10 Dan Winship 2020-09-22 17:27:49 UTC
AFAICT the code from https://github.com/openshift/machine-config-operator/pull/1494 is not running on either single-stack IPv6 or dual-stack. On single-stack IPv6, the hostname is being set based on DNS:

  Sep 22 16:06:10 localhost NetworkManager[1753]: <info>  [1600790770.1800] manager: startup complete
  Sep 22 16:06:10 localhost NetworkManager[1753]: <info>  [1600790770.1819] policy: set-hostname: set hostname to 'master-2.ostest.test.metalkube.org' (from address lookup)

whereas on single-stack IPv4 and dual-stack, it gets set by some unknown process early during NM startup:

  Sep 22 16:59:30 localhost ignition[794]: GET http://169.254.169.254/openstack/latest/user_data: attempt #2
  Sep 22 16:59:29 master-0 NetworkManager[823]: <info>  [1600793969.9615] dhcp-init: Using DHCP client 'internal'

  Sep 22 16:31:43 localhost NetworkManager[766]: <info>  [1600792303.8342] manager: rfkill: WWAN enabled by radio killswitch; enabled by state file
  Sep 22 16:31:43 worker-1 ignition[742]: GET http://169.254.169.254/openstack/latest/user_data: attempt #2

(It's not clear from the journal output exactly what is causing the hostname to change there)

Also in both single-stack IPv4 and dual-stack, we later get a "host_name" option from the DHCPv4 server, while in single-stack IPv6 (and dual-stack) we don't get any such option from the DHCPv6 server.

The difference in hostname behavior between single-stack IPv4 and single-stack IPv6 seems entirely explained by the fact that our DHCP server is returning a host_name option and our DHCPv6 server is not.

But in the dual-stack case, we _do_ have the host_name option, and the logs show that it's being used, so I'm not sure how the BareMetalHost is ending up wrong. Nothing ever sets the hostname to "worker-1.ostest.test.metalkube.org".

Comment 11 Derek Higgins 2020-09-23 09:42:22 UTC
(In reply to Zane Bitter from comment #9)
> It appears we implemented a hack to make the FQDN show up on IPv6, since
> that was necessary to make single-stack IPv6 work: bug 1806001
> 
> I guess we need to disable this on dual-stack.

(In reply to Dan Winship from comment #10)
> AFAICT the code from
> https://github.com/openshift/machine-config-operator/pull/1494 is not
> running on either single-stack IPv6 or dual-stack. On single-stack IPv6, the
> hostname is being set based on DNS:

We also have a similar hack in IPA to set the hostname (during inspection, I think):
https://github.com/openshift/ironic-ipa-downloader/pull/27/files
Is this what you're looking for?

Comment 12 Dan Winship 2020-09-23 12:46:00 UTC
Hm... does that code run in an earlier boot or in a pre-boot context? The journal that is available when the host comes up shows that it never actually has the hostname "worker-0.ostest.test.metalkube.org". But if the code you pointed out runs at some point that wouldn't be in that journal, then that could be the culprit. I think the fix would be to make it only use the DHCP6 hostname if the hostname hadn't already been set based on the DHCP4 response?
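
A minimal Python sketch of the rule being suggested here, to make the intent explicit. This is not the actual fix, and not the code in the ironic-ipa-downloader PR. The function name and the set of hostnames treated as "not yet set" are assumptions.

```python
from typing import Optional

# Assumption: these values mean no real hostname has been assigned yet.
PLACEHOLDER_HOSTNAMES = {"", "localhost", "localhost.localdomain"}


def choose_hostname(current: str, ipv6_derived_name: Optional[str]) -> str:
    """Keep a hostname that was already set (e.g. from the DHCPv4 response).

    Fall back to the IPv6-derived name only when the current hostname is
    still a placeholder.
    """
    if current not in PLACEHOLDER_HOSTNAMES:
        return current
    return ipv6_derived_name or current


if __name__ == "__main__":
    # Dual-stack: DHCPv4 already set "worker-0", so the FQDN is not applied.
    print(choose_hostname("worker-0", "worker-0.ostest.test.metalkube.org"))
    # Single-stack IPv6: nothing set yet, so the FQDN is used.
    print(choose_hostname("localhost", "master-2.ostest.test.metalkube.org"))
```

With a rule like this, dual-stack hosts would report the same short name during inspection that the kubelet later puts in its CSR, while single-stack IPv6 keeps its current behavior.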

Comment 13 Zane Bitter 2020-09-23 13:56:04 UTC
Thanks Derek, I had the wrong bug. (I should have linked bug 1798272.)

The code does indeed run on an earlier boot (when we do introspection on the host, prior to provisioning it), so it won't show up in the cluster logs.

Comment 17 errata-xmlrpc 2020-10-27 16:43:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196