See also https://bugzilla.redhat.com/show_bug.cgi?id=1828458 for the NM side fix to allow customization of the nm-online timeout.
BUT, I can't think of any reason that we would ever want kubelet in an OCP cluster to attempt to register itself with default hostnames like localhost.localdomain or localhost6.localdomain6.
Perhaps the systemd unit could just fail if that happens and let it get restarted, and eventually it would have the right hostname and succeed?
Bug 1817774 is asking for alerting (or some other automated reporting) around localhost nodes.
Is the bug here that the hostname eventually gets set correctly? I think we could add an ExecStartPre script to check for a `localhost` fqdn, but does the fqdn eventually get set?
(In reply to Ryan Phillips from comment #3)
> Is the bug here that the hostname eventually gets set correctly? I think we
> could add an ExecStartPre script to check for a `localhost` fqdn, but does
> the fqdn eventually get set?
Yes, I believe the FQDN does get set eventually. But kubelet doesn't listen for hostname changes; it tries to register whatever is set when it starts. The sequence is something like:
1) machine starts
2) NM begins DHCP on interfaces
3) DHCP takes longer than the default 30-second nm-online timeout because Enterprise Hardware
4) systemd NetworkManager-wait-online unit times out, boot proceeds
5) kubelet systemd unit is now allowed to proceed
6) kubelet starts, sees hostname of localhost.localdomain
7) shortly thereafter, DHCP completes and NM sets the machine hostname to the correct FQDN
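To illustrate the ExecStartPre idea discussed above, here is a minimal sketch of a pre-start check that returns a non-zero exit code while the hostname is still a default placeholder, so that systemd could fail the unit and retry it later. This is not the fix that was merged. The names `PLACEHOLDER_HOSTNAMES`, `is_placeholder_hostname`, and `exit_code_for` are hypothetical, and the exact set of placeholder names is an assumption based on the ones mentioned in this bug.

```python
import socket
import sys

# Assumption: the default names mentioned in this bug, plus their short forms.
PLACEHOLDER_HOSTNAMES = {
    "localhost",
    "localhost.localdomain",
    "localhost6",
    "localhost6.localdomain6",
}


def is_placeholder_hostname(name: str) -> bool:
    """Return True if name is empty or one of the default localhost names."""
    # Normalize case, surrounding whitespace, and a trailing dot before comparing.
    normalized = name.strip().rstrip(".").lower()
    return normalized == "" or normalized in PLACEHOLDER_HOSTNAMES


def exit_code_for(hostname: str) -> int:
    """Return 1 (fail the pre-start check) for placeholder names, else 0."""
    if is_placeholder_hostname(hostname):
        print(f"hostname is still {hostname!r}; not starting yet", file=sys.stderr)
        return 1
    return 0


def main() -> int:
    # socket.gethostname() reads the hostname currently set on the machine.
    return exit_code_for(socket.gethostname())
```

A real pre-start script would end with `sys.exit(main())`. Whether the unit is then retried depends on its restart settings, which this sketch does not cover.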
Ryan is on leave
Should be fixed by https://github.com/openshift/machine-config-operator/pull/1914
If this is specific to getting this fixed in 4.3.z this should depend on https://bugzilla.redhat.com/show_bug.cgi?id=1855878, correct?
If so, this needs to be detached from the 4.6.0 errata, which was triggered by moving it to MODIFIED.
If this is not specific to getting this fixed in 4.3.z, shouldn't this just be closed as a dupe of 1855878?
*** This bug has been marked as a duplicate of bug 1853584 ***