Hi, I'm opening this BZ following a comment from BZ#1929160 (https://bugzilla.redhat.com/show_bug.cgi?id=1929160#c66).

Description of problem
----------------------
When deploying a BM-IPI cluster with the OVNKubernetes backend (required for dual-stack IPv4/IPv6 support), all worker nodes register as `localhost.localdomain` and their CSRs are never approved.

Version
-------
I'm facing this issue with OCP 4.10 and 4.11 nightlies using OVNKubernetes. OCP 4.9 + OVNKubernetes works fine, and OCP 4.10/4.11 + OpenShiftSDN also works properly.
Please attach NM logs at trace level, thank you.
Right, the problem is that the resolution returns "localhost.localdomain". NetworkManager uses the glibc resolver to resolve the hostname. Please check which NSS modules are configured with "grep hosts: /etc/nsswitch.conf". By default they should be:
    hosts: files dns myhostname
meaning that /etc/hosts is tried first, then DNS, and finally the "myhostname" module returns the current hostname for any local address. Can you also try the following commands:
    getent hosts $addr
    getent -s dns hosts $addr
    dig -x $addr
where $addr is the address on br-ex (e.g. 10.1.156.46). They should all return something other than "localhost.localdomain".
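For anyone who wants to script the checks above, here is a minimal Python sketch. It is not what NetworkManager itself runs. It reads the `hosts:` line from nsswitch.conf text, does a reverse lookup through the system resolver via `socket.getnameinfo`, and flags placeholder names. The helper names (`hosts_modules`, `reverse_lookup`, `is_placeholder`) and the `PLACEHOLDER_NAMES` set are hypothetical and only for illustration.

```python
import re
import socket

# Assumption: names treated as "no real hostname" for this check.
PLACEHOLDER_NAMES = {
    "localhost",
    "localhost.localdomain",
    "localhost6",
    "localhost6.localdomain6",
}


def hosts_modules(nsswitch_text):
    """Return the NSS modules listed on the 'hosts:' line, in order."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            # Drop action items such as [NOTFOUND=return]
            rest = re.sub(r"\[[^\]]*\]", " ", rest)
            return rest.split()
    return []


def is_placeholder(name):
    """True if the name is a localhost-style placeholder."""
    return name.strip().rstrip(".").lower() in PLACEHOLDER_NAMES


def reverse_lookup(addr):
    """Reverse-resolve addr with the system resolver; None if no name is found."""
    try:
        name, _port = socket.getnameinfo((addr, 0), socket.NI_NAMEREQD)
    except socket.gaierror:
        return None
    return name
```

On an affected node, one could pass the contents of /etc/nsswitch.conf to `hosts_modules` and compare the result with the expected `files dns myhostname` order, then call `reverse_lookup` with the br-ex address and check the returned name with `is_placeholder`.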
I made another attempt with OCP-4.11.0-0.nightly-2022-03-08-191358 and the hostname issue didn't happen (however, the CSR is still not approved properly). Attaching the logs for information.
Regarding the localhost issue, I noticed that it now also happens in OCP 4.9. I managed to "bisect" it: the issue was introduced in OCP 4.9.24. The relevant changes between 4.9.23 and 4.9.24 are: https://github.com/openshift/machine-config-operator/compare/b64dc344aa5a202080189abe7b1ec92bac286c06...a4fba0f500ff1fdfced1919d81253f147fea02de @jcaamano I suspect one of the backports to ovs-configure is the cause of this issue.
Okay, I've proposed a backport of the hostname fix based on https://bugzilla.redhat.com/show_bug.cgi?id=2058030#c24 . Can someone give that a /lgtm so I can chase approvers to get it merged? TIA.
QE verified this bug based on https://bugzilla.redhat.com/show_bug.cgi?id=2058030#c61
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069