Description of problem:
After an outage on the hypervisor, the node was rebooted, and the kubelet could not start because the hostname was localhost instead of the actual hostname of the node.

Version-Release number of selected component (if applicable):
OpenShift Container Platform 4.3.18

How reproducible:
Shut down the node and start it again right away; sometimes the kubelet will start without the right hostname.

Steps to Reproduce:
1. Reboot the node and check if the hostname is localhost
2. Kubelet will not work as expected
3. Set the right hostname
4. Restart the kubelet

Actual results:
18 13:54:10 localhost crio[1340]: time="2020-05-18 13:54:10.358249819Z" level=error msg="CNI network \"\" not found"
May 18 13:54:10 localhost systemd[1]: Started Open Container Initiative Daemon.
May 18 13:54:10 localhost systemd[1]: Starting Kubernetes Kubelet...
May 18 13:54:11 localhost hyperkube[1895]: Flag --minimum-container-ttl-duration has been deprecated, Use --eviction-hard or --eviction-soft instead. Will be removed in a future version.
May 18 13:54:11 localhost hyperkube[1895]: I0518 13:54:11.204217 1895 flags.go:33] FLAG: --add-dir-header="false"
May 18 13:54:11 localhost hyperkube[1895]: I0518 13:54:11.204308 1895 flags.go:33] FLAG: --address="0.0.0.0"
May 18 13:54:11 localhost hyperkube[1895]: I0518 13:54:11.204314 1895 flags.go:33] FLAG: --allowed-unsafe-sysctls="[]"
May 18 13:54:11 localhost hyperkube[1895]: I0518 13:54:11.204319 1895 flags.go:33] FLAG: --alsologtostderr="false"
May 18 13:54:11 localhost hyperkube[1895]: I0518 13:54:11.204322 1895 flags.go:33] FLAG: --anonymous-auth="true"

Expected results:
May 18 13:54:13 localhost hyperkube[1895]: E0518 13:54:13.301465 1895 kubelet.go:2278] node "localhost" not found
May 18 13:54:13 localhost NetworkManager[1211]: <info> [1589810053.3015] policy: set-hostname: set hostname to 'server3.example.com' (from address lookup)
May 18 13:54:13 server3.example.com systemd-hostnamed[1243]: Changed host name to 'server3.example.com'

Additional info:
The node is running on VMware.
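The manual workaround from the steps above (set the right hostname, then restart the kubelet) could be sketched as a small shell helper. This is only an illustration, not a supported fix: the function names are hypothetical, the expected hostname must be supplied by the caller, and treating localhost.localdomain as a placeholder is an assumption that does not come from the logs.

```shell
#!/bin/sh
# Sketch of the manual workaround; function names are hypothetical.

# Return 0 if the given name looks like the placeholder the node booted with.
# "localhost" comes from the logs; "localhost.localdomain" and the empty
# string are assumptions added for illustration.
is_placeholder_hostname() {
    case "$1" in
        localhost|localhost.localdomain|"") return 0 ;;
        *) return 1 ;;
    esac
}

# Set the hostname and restart the kubelet, but only when the node
# currently reports a placeholder hostname. Must run as root on the node.
fix_hostname() {
    expected="$1"
    current="$(hostname)"
    if is_placeholder_hostname "$current"; then
        hostnamectl set-hostname "$expected"
        systemctl restart kubelet
    fi
}
```

On an affected node this would be invoked as root, for example `fix_hostname server3.example.com`, after sourcing the file.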
I don't know if this could be related to https://bugzilla.redhat.com/show_bug.cgi?id=1803962.
*** This bug has been marked as a duplicate of bug 1803962 ***
This should be fixed as of e.g. 4.3.22:

```
$ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.3.22-x86_64 | grep machine-config
  machine-config-operator https://github.com/openshift/machine-config-operator c6a1e9b3d022671cef735d55eb277c140556b301
$ cd ~/src/machine-config-operator
$ git shortlog --no-merges c6a1e9b3d022671cef735d55eb277c140556b301 --grep=network-online
Ryan Phillips (1):
      Bug 1763700: kubelet: add dependency on network-online.target

W. Trevor King (1):
      templates/_base/master/units/etcd-member: Block on network-online.target
```

IOW please check that

```
[root@api ~]# grep network-online /etc/systemd/system/kubelet.service
Wants=rpc-statd.service network-online.target crio.service
After=network-online.target crio.service
[root@api ~]#
```
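To run the same check across several nodes, the grep above could be wrapped in a small function. This is a minimal sketch: the function name `unit_waits_for_network` is hypothetical, and it assumes the fix is present when both the `Wants=` and `After=` lines of the unit file mention network-online.target, as in the output shown above.

```shell
#!/bin/sh
# Sketch: succeed only if the unit file both wants and orders itself
# after network-online.target. The function name is hypothetical.
unit_waits_for_network() {
    unit_file="$1"
    grep -Eq '^Wants=.*network-online\.target' "$unit_file" &&
        grep -Eq '^After=.*network-online\.target' "$unit_file"
}
```

On a node, `unit_waits_for_network /etc/systemd/system/kubelet.service && echo ok` would print `ok` only when both lines are present.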