Bug 1881182
| Summary: | dual-stack bare metal install fails to create workers due to CSR approval failure | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Dan Winship <danw> |
| Component: | Installer | Assignee: | Beth White <beth.white> |
| Installer sub component: | OpenShift on Bare Metal IPI | QA Contact: | Shelly Miron <smiron> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | urgent | | |
| Priority: | high | CC: | brad, derekh, rbartal, rbryant, stbenjam, zbitter |
| Version: | 4.6 | Keywords: | OtherQA, Triaged, UpcomingSprint |
| Target Milestone: | --- | | |
| Target Release: | 4.6.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-10-27 16:43:35 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Dan Winship
2020-09-21 17:36:48 UTC
Created attachment 1715574 [details]
machine-approver-controller log
Created attachment 1715575 [details]
oc get bmh -n openshift-machine-api -o yaml
Created attachment 1715576 [details]
oc get machine -n openshift-machine-api -o yaml
Created attachment 1715577 [details]
oc get node -o yaml
(attachments are all from _after_ manually approving the pending CSRs)
Notes on what I see in the debug info:

BareMetalHosts:

- we only have 1 IP per network interface. This is a gap in the metal3 baremetal-operator. This doesn't seem to be causing this problem, though. https://github.com/metal3-io/baremetal-operator/issues/458

In the cluster-machine-approver log, we have:

> I0921 16:55:11.559980 1 main.go:218] Error syncing csr csr-pqwcl: failed to find machine for node worker-0

so it's looking for a Machine resource with an InternalDNS name of "worker-0". However, the hostnames we have on the worker Machines (which comes from the BareMetalHost) are:

- worker-0.ostest.test.metalkube.org
- worker-1.ostest.test.metalkube.org

One interesting note is that the hostnames we have for the masters are:

- master-0
- master-1
- master-2

So the root cause of this failure seems to be a mismatch in how we're collecting and reporting the hostname for these workers, and how the hostname is determined by kubelet and put in its CSR.

The error message appears to be coming from here:
https://github.com/openshift/cluster-machine-approver/blob/master/csr_check.go#L269-L272

Here is the Machine (ostest-jk6xs-worker-0-vt5mk) status:

    status:
      addresses:
      - address: 192.168.111.23
        type: InternalIP
      - address: fd00:1101::e45d:2711:3ff3:5c2b
        type: InternalIP
      - address: worker-0.ostest.test.metalkube.org
        type: Hostname
      - address: worker-0.ostest.test.metalkube.org
        type: InternalDNS
      lastUpdated: "2020-09-21T17:14:16Z"
      nodeRef:
        kind: Node
        name: worker-0
        uid: 6c67cd35-a5d4-4d67-874f-95d66825531f
      phase: Running

it's linked to the correct Node, but the cluster-machine-approver doesn't use the nodeRef to match the node, it uses the internal DNS name, which must match the Node name:
https://github.com/openshift/cluster-machine-approver/blob/master/csr_check.go#L343-L345

So the proximate cause of the issue is that the Machine has its internal DNS name set to the fully-qualified "worker-0.ostest.test.metalkube.org" instead of the usual "worker-0".
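To make the mismatch described above concrete, here is a minimal Python sketch of a lookup that matches on the InternalDNS address only. It is not the approver's code (that is written in Go and lives at the links above); the function names `find_machine_by_internal_dns` and `find_machine_by_node_ref` are hypothetical, and the Machine data is copied from the status in this report. The nodeRef lookup is included purely for contrast.

```python
from typing import List, Optional

# Machine status copied from this report (only the fields used below).
MACHINE = {
    "name": "ostest-jk6xs-worker-0-vt5mk",
    "status": {
        "addresses": [
            {"address": "192.168.111.23", "type": "InternalIP"},
            {"address": "fd00:1101::e45d:2711:3ff3:5c2b", "type": "InternalIP"},
            {"address": "worker-0.ostest.test.metalkube.org", "type": "Hostname"},
            {"address": "worker-0.ostest.test.metalkube.org", "type": "InternalDNS"},
        ],
        "nodeRef": {"kind": "Node", "name": "worker-0"},
    },
}


def find_machine_by_internal_dns(node_name: str, machines: List[dict]) -> Optional[dict]:
    """Hypothetical stand-in for the lookup: exact string match on InternalDNS."""
    for machine in machines:
        for addr in machine["status"]["addresses"]:
            if addr["type"] == "InternalDNS" and addr["address"] == node_name:
                return machine
    return None


def find_machine_by_node_ref(node_name: str, machines: List[dict]) -> Optional[dict]:
    """Contrast only: match on nodeRef, which the report says is not used."""
    for machine in machines:
        ref = machine["status"].get("nodeRef") or {}
        if ref.get("kind") == "Node" and ref.get("name") == node_name:
            return machine
    return None
```

With the data above, looking up "worker-0" by InternalDNS finds nothing, which corresponds to the "failed to find machine for node worker-0" log line, while the same lookup with the fully-qualified name (or by nodeRef) returns the Machine.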
The hostname is populated from the HardwareDetails in the BareMetalHost:
https://github.com/openshift/cluster-api-provider-baremetal/blob/master/pkg/cloud/baremetal/actuators/machine/actuator.go#L794-L803
which gets it directly from ironic-inspector:
https://github.com/openshift/baremetal-operator/blob/master/pkg/provisioner/ironic/hardwaredetails/hardwaredetails.go#L22
which presumably gets it directly from IPA running during inspection after the Host is created:
https://opendev.org/openstack/ironic-python-agent/src/branch/master/ironic_python_agent/netutils.py#L233-L237
which calls Python's socket.gethostname():
https://docs.python.org/3/library/socket.html#socket.gethostname

It's a reasonable bet that this is set based on DHCP. Notably this is only happening on the workers, so when the masters are inspected (by the installer) it's getting just the hostname instead of a FQDN.

    Sep 21 15:55:25 worker-0 NetworkManager[776]: <info> [1600703725.1629] dhcp4 (enp2s0): option domain_name => 'ostest.test.metalkube.org'
    Sep 21 15:55:25 worker-0 NetworkManager[776]: <info> [1600703725.1629] dhcp4 (enp2s0): option host_name => 'worker-0'

    [core@worker-0 ~]$ hostname
    worker-0
    [core@worker-0 ~]$ hostname -f
    worker-0.ostest.test.metalkube.org

so, it knows its FQDN, but it's not reporting it by default as the hostname

(The dhcpv6 results don't have domain_name or host_name set, which may be why we get different results (FQDN nodenames) with single-stack IPv6.)

It appears we implemented a hack to make the FQDN show up on IPv6, since that was necessary to make single-stack IPv6 work: bug 1806001

I guess we need to disable this on dual-stack.

AFAICT the code from https://github.com/openshift/machine-config-operator/pull/1494 is not running on either single-stack IPv6 or dual-stack.
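The `hostname` versus `hostname -f` difference shown in the shell output above can be reproduced from Python, which is relevant because the inspection path ends in `socket.gethostname()`. The sketch below is only an illustration: `short_name`, `is_fqdn` and `describe_local_host` are hypothetical helpers, not part of ironic-python-agent, and treating "contains a dot" as "fully qualified" is a simplifying assumption.

```python
import socket


def short_name(hostname: str) -> str:
    """Return the first DNS label, e.g. 'worker-0' for a fully-qualified name."""
    return hostname.split(".", 1)[0]


def is_fqdn(hostname: str) -> bool:
    """Simplification: any name with an interior dot counts as fully qualified."""
    return "." in hostname.rstrip(".")


def describe_local_host() -> dict:
    """Collect both views of the local name for comparison.

    socket.gethostname() returns the hostname as currently set on the system;
    socket.getfqdn() additionally consults the resolver, roughly like `hostname -f`.
    """
    return {
        "gethostname": socket.gethostname(),
        "getfqdn": socket.getfqdn(),
    }
```

Calling `describe_local_host()` on a node would show which of the two forms a caller of `socket.gethostname()` gets at that moment; the result depends on when in the boot sequence it runs and on what has set the hostname by then.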
On single-stack IPv6, the hostname is being set based on DNS:

    Sep 22 16:06:10 localhost NetworkManager[1753]: <info> [1600790770.1800] manager: startup complete
    Sep 22 16:06:10 localhost NetworkManager[1753]: <info> [1600790770.1819] policy: set-hostname: set hostname to 'master-2.ostest.test.metalkube.org' (from address lookup)

whereas on single-stack IPv4 and dual-stack, it gets set by some unknown process early during NM startup:

    Sep 22 16:59:30 localhost ignition[794]: GET http://169.254.169.254/openstack/latest/user_data: attempt #2
    Sep 22 16:59:29 master-0 NetworkManager[823]: <info> [1600793969.9615] dhcp-init: Using DHCP client 'internal'

    Sep 22 16:31:43 localhost NetworkManager[766]: <info> [1600792303.8342] manager: rfkill: WWAN enabled by radio killswitch; enabled by state file
    Sep 22 16:31:43 worker-1 ignition[742]: GET http://169.254.169.254/openstack/latest/user_data: attempt #2

(It's not clear from the journal output exactly what is causing the hostname to change there)

Also in both single-stack IPv4 and dual-stack, we later get a "host_name" option from the DHCPv4 server, while in single-stack IPv6 (and dual-stack) we don't get any such option from the DHCPv6 server.

The difference in hostname behavior between single-stack IPv4 and single-stack IPv6 seems entirely explained by the fact that our DHCP server is returning a host_name option and our DHCPv6 server is not. But in the dual-stack case, we _do_ have the host_name option, and the logs show that it's being used, so I'm not sure how the BareMetalHost is ending up wrong. Nothing ever sets the hostname to "worker-1.ostest.test.metalkube.org".

(In reply to Zane Bitter from comment #9)
> It appears we implemented a hack to make the FQDN show up on IPv6, since
> that was necessary to make single-stack IPv6 work: bug 1806001
>
> I guess we need to disable this on dual-stack.
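The reasoning above about which source ends up setting the hostname can be written down as a small model. This is a deliberately simplified sketch of the behavior observed in the journals, not NetworkManager's actual algorithm; `effective_hostname` and `SCENARIOS` are hypothetical names, and the address-lookup values for the IPv4 and dual-stack cases are assumptions (only the single-stack IPv6 lookup result appears in the logs).

```python
from typing import Optional


def effective_hostname(dhcp4_host_name: Optional[str],
                       address_lookup_name: Optional[str]) -> str:
    """Simplified model: a DHCPv4 host_name wins; otherwise fall back to the
    name obtained from address lookup; otherwise stay at 'localhost'."""
    if dhcp4_host_name:
        return dhcp4_host_name
    if address_lookup_name:
        return address_lookup_name
    return "localhost"


# (DHCPv4 host_name option, name from address lookup) per deployment type.
SCENARIOS = {
    # DHCPv4 supplies host_name; lookup result assumed unused here.
    "single-stack IPv4": ("master-0", None),
    # DHCPv6 supplies no host_name; the journal shows the lookup result.
    "single-stack IPv6": (None, "master-2.ostest.test.metalkube.org"),
    # DHCPv4 supplies host_name; the lookup value is an assumption.
    "dual-stack": ("worker-1", "worker-1.ostest.test.metalkube.org"),
}

for label, (dhcp4_name, lookup_name) in SCENARIOS.items():
    print(f"{label}: {effective_hostname(dhcp4_name, lookup_name)}")
```

Under this model the dual-stack node resolves to the short name, which matches the running system's journal and is why the fully-qualified name on the BareMetalHost cannot be explained from those logs alone.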
(In reply to Dan Winship from comment #10)
> AFAICT the code from
> https://github.com/openshift/machine-config-operator/pull/1494 is not
> running on either single-stack IPv6 or dual-stack. On single-stack IPv6, the
> hostname is being set based on DNS:

We also have a similar hack in IPA to set the hostname (during inspection, I think):
https://github.com/openshift/ironic-ipa-downloader/pull/27/files
Is this what you're looking for?

Hm... does that code run in an earlier boot or in a pre-boot context? The journal that is available when the host comes up shows that it never actually has the hostname "worker-0.ostest.test.metalkube.org". But if the code you pointed out runs at some point that wouldn't be in that journal, then that could be the culprit.

I think the fix would be to make it only use the DHCP6 hostname if the hostname hadn't already been set based on the DHCP4 response?

Thanks Derek, I had the wrong bug. (I should have linked bug 1798272.) The code does indeed run on an earlier boot (when we do introspection on the host, prior to provisioning it), so it won't show up in the cluster logs.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196