Description of problem:

Trigger a GCP IPI install with a cluster name whose length equals 19, e.g. yunjiang00123456789. The following errors occur:

level=debug msg="Still waiting for the cluster to initialize: Working towards 4.5.0-0.nightly-2020-07-02-021110: 86% complete, waiting on authentication, console, csi-snapshot-controller, image-registry, ingress, kube-storage-version-migrator, monitoring"
level=debug msg="Still waiting for the cluster to initialize: Working towards 4.5.0-0.nightly-2020-07-02-021110: 86% complete"
level=debug msg="Still waiting for the cluster to initialize: Working towards 4.5.0-0.nightly-2020-07-02-021110: 86% complete"
level=debug msg="Still waiting for the cluster to initialize: Some cluster operators are still updating: authentication, console, csi-snapshot-controller, image-registry, ingress, kube-storage-version-migrator, monitoring"
level=debug msg="Still waiting for the cluster to initialize: Working towards 4.5.0-0.nightly-2020-07-02-021110: 86% complete"
level=debug msg="Still waiting for the cluster to initialize: Some cluster operators are still updating: authentication, console, csi-snapshot-controller, image-registry, ingress, kube-storage-version-migrator, monitoring"
level=debug msg="Still waiting for the cluster to initialize: Some cluster operators are still updating: authentication, console, csi-snapshot-controller, image-registry, ingress, kube-storage-version-migrator, monitoring"
level=error msg="Cluster operator authentication Degraded is True with IngressStateEndpoints_MissingEndpoints::RouteStatus_FailedHost: IngressStateEndpointsDegraded: No endpoints found for oauth-server\nRouteStatusDegraded: route is not available at canonical host oauth-openshift.apps.yunjiang00123456789.qe.gcp.devcluster.openshift.com: []"
level=info msg="Cluster operator authentication Progressing is Unknown with NoData: "
level=info msg="Cluster operator authentication Available is Unknown with NoData: "
level=info msg="Cluster operator console Progressing is True with DefaultRouteSync_FailedAdmitDefaultRoute::OAuthClientSync_FailedHost: DefaultRouteSyncProgressing: route \"console\" is not available at canonical host []\nOAuthClientSyncProgressing: route \"console\" is not available at canonical host []"
level=info msg="Cluster operator console Available is Unknown with NoData: "
level=info msg="Cluster operator image-registry Available is False with NoReplicasAvailable: Available: The deployment does not have available replicas\nImagePrunerAvailable: Pruner CronJob has been created"
level=info msg="Cluster operator image-registry Progressing is True with DeploymentNotCompleted: Progressing: The deployment has not completed"
level=info msg="Cluster operator ingress Available is False with IngressUnavailable: Not all ingress controllers are available."
level=info msg="Cluster operator ingress Progressing is True with Reconciling: Not all ingress controllers are available."
level=error msg="Cluster operator ingress Degraded is True with IngressControllersDegraded: Some ingresscontrollers are degraded: default"
level=info msg="Cluster operator insights Disabled is False with : "
level=info msg="Cluster operator kube-storage-version-migrator Available is False with _NoMigratorPod: Available: deployment/migrator.openshift-kube-storage-version-migrator: no replicas are available"
level=info msg="Cluster operator monitoring Progressing is True with RollOutInProgress: Rolling out the stack."
level=error msg="Cluster operator monitoring Degraded is True with UpdatingAlertmanagerFailed: Failed to rollout the stack. Error: running task Updating Alertmanager failed: waiting for Alertmanager Route to become ready failed: waiting for RouteReady of alertmanager-main: no status available for alertmanager-main"
level=info msg="Cluster operator monitoring Available is False with : "
level=fatal msg="failed to initialize the cluster: Some cluster operators are still updating: authentication, console, csi-snapshot-controller, image-registry, ingress, kube-storage-version-migrator, monitoring"

Errors on the worker node:

Jul 03 06:24:36 yunjiang00123456789-cx4m4-worker-a-5ltln.c.openshift-qe.internal hyperkube[12012]: E0703 06:24:36.741154 12012 kubelet_node_status.go:92] Unable to register node "yunjiang00123456789-cx4m4-worker-a-5ltln.c.openshift-qe.internal" with API server: Node "yunjiang00123456789-cx4m4-worker-a-5ltln.c.openshift-qe.internal" is invalid: metadata.labels: Invalid value: "yunjiang00123456789-cx4m4-worker-a-5ltln.c.openshift-qe.internal": must be no more than 63 characters

How reproducible:
Always

OCP version:
4.5.0-0.nightly-2020-07-02-021110

Steps to reproduce:
Trigger an IPI install on GCP and make sure the length of the cluster name equals 19.

Expected results:
Installation succeeds.

Actual results:
Installation fails.

Additional info:
Cluster names shorter or longer than 19 characters both work; the error occurs only on the GCP platform.
Created attachment 1699796 [details] install log
Moving to the RHCOS team, since they are allowing a hostname length of <= 64 while the hostname that becomes the node name is only allowed to be at most 63 characters.

Why does it happen? If you look at the journal logs you will see that NetworkManager accepts hostnames up to 64 characters and only fails when the hostname is longer than that, in which case it falls back to localhost. The MCO truncation logic only triggers when the hostname is set to localhost, so in this case it is skipped. Hopefully this helps the RHCOS team fix it.

Also lowering severity, as this happens ONLY when the total length of the hostname (the internal DNS name for a machine, defined by GCP) is exactly 64 characters. This is not a regression; we have had this bug since the beginning, because the machine name length is controlled by the installer but the project name is not. We had a bigger problem previously and fixed it, except for this exact case of 64 characters.
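For illustration only, a minimal bash sketch of the truncation step described above (this is not the shipped MCO script; the journal excerpts later in this bug show the messages the real 90-long-hostname script emits):

#!/bin/bash
# Hypothetical sketch: shorten an over-long DHCP-provided FQDN so the node
# name fits the 63-character limit. Note the flow described above only reaches
# truncation when NetworkManager has fallen back to localhost, which is why a
# 64-character hostname (still accepted by NetworkManager) slips through.
fqdn=$(</proc/sys/kernel/hostname)
if [ "${#fqdn}" -gt 63 ]; then
    short="${fqdn%%.*}"          # keep only the first DNS label
    echo "setting transient hostname to ${short:0:63}"
    hostnamectl --transient set-hostname "${short:0:63}"
fi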
This looks like it should be handled by https://github.com/openshift/machine-config-operator/pull/1813 and/or https://github.com/openshift/machine-config-operator/pull/1711. Targeting 4.6 with medium priority, pending additional triage.
Ugh. Yeah, this is a bug in the NetworkManager dispatch script.
Actually, after spending way more time than I care to confess, I tracked this down to dhclient _truncating_ the hostname:

# printf $(</proc/sys/kernel/hostname) | wc -m
64
# cat /proc/sys/kernel/hostname
c30dabda1c28ac3a52fecc76a49da356c89c09b08d8c2765752f6dd8926bfe2.

On 4.6, RHCOS uses NetworkManager in the initramfs. Pre-4.6, first boot uses NetworkManager. What's unclear to me is _when_ in the boot this is happening. On second boot the 90-long-hostname NM patch takes effect and would prevent the hostname from being longer than 63 characters.

I spent the afternoon on a deep dive and found that 4.6 is unaffected; this should be targeted at the 4.5.z z-stream.

Can we get a must-gather and a console log?
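One supporting data point (my note, not taken from the logs above): the Linux kernel accepts hostnames up to HOST_NAME_MAX = 64 bytes, while a single DNS label, and hence a node name, maxes out at 63. A 64-character hostname is therefore exactly the value the kernel tolerates but the API server rejects:

$ getconf HOST_NAME_MAX
64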
I uploaded logs to https://drive.google.com/drive/folders/1lXrHiLjv7e1AakDUL5s1-O09b9P1CgLY?usp=sharing
Proposed PR fix: https://github.com/openshift/machine-config-operator/pull/1914

The root cause is not the network dispatcher script itself, but a race condition between the NetworkManager dispatcher script that truncates the hostname and dhclient in the initramfs, which truncates the hostname at 64 characters.
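To make the shape of the fix concrete, a minimal bash sketch of the kind of guard involved (illustrative only; the actual implementation is in the PR above):

#!/bin/bash
# Hypothetical sketch of a node-valid-hostname-style wait: hold off until the
# hostname is non-localhost and short enough to register as a node name,
# regardless of which of dhclient or the dispatcher script ran last.
while true; do
    h=$(</proc/sys/kernel/hostname)
    if [ "$h" != "localhost" ] && [ "${#h}" -le 63 ]; then
        echo "node identified as $h"
        break
    fi
    sleep 1
done

Ordering kubelet after such a unit sidesteps the race: it no longer matters when dhclient's 64-character truncation lands relative to the dispatcher script.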
Verified on 4.6.0-0.nightly-2020-07-15-170241

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-07-15-170241   True        False         44m     Cluster version is 4.6.0-0.nightly-2020-07-15-170241

IPI installed successfully:
-----------------------------------------------
DEBUG Still waiting for the cluster to initialize: Working towards 4.6.0-0.nightly-2020-07-15-170241: 94% complete
DEBUG Still waiting for the cluster to initialize: Some cluster operators are still updating: authentication, cloud-credential, cluster-autoscaler, console, csi-snapshot-controller, image-registry, ingress, insights, kube-storage-version-migrator, machine-api, marketplace, monitoring, openshift-samples, storage
DEBUG Still waiting for the cluster to initialize: Working towards 4.6.0-0.nightly-2020-07-15-170241: 94% complete
DEBUG Still waiting for the cluster to initialize: Some cluster operators are still updating: authentication, cloud-credential, cluster-autoscaler, console, csi-snapshot-controller, image-registry, ingress, insights, kube-storage-version-migrator, machine-api, marketplace, monitoring, openshift-samples, storage
DEBUG Still waiting for the cluster to initialize: Working towards 4.6.0-0.nightly-2020-07-15-170241: 96% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.6.0-0.nightly-2020-07-15-170241: 97% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.6.0-0.nightly-2020-07-15-170241: 97% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.6.0-0.nightly-2020-07-15-170241: 98% complete, waiting on authentication, cluster-autoscaler, console, csi-snapshot-controller, image-registry, ingress, kube-storage-version-migrator, machine-api, monitoring
DEBUG Still waiting for the cluster to initialize: Working towards 4.6.0-0.nightly-2020-07-15-170241: 99% complete
DEBUG Cluster is initialized
INFO Waiting up to 10m0s for the openshift-console route to be created...
DEBUG Route found in openshift-console namespace: console
DEBUG Route found in openshift-console namespace: downloads
DEBUG OpenShift console route is created
INFO Install complete!

Hostname is truncated successfully:
-----------------------------------------------
$ oc get nodes
NAME                                        STATUS   ROLES    AGE   VERSION
yunjiang00123456789-cvndz-master-0          Ready    master   59m   v1.18.3+a34fde4
yunjiang00123456789-cvndz-master-1          Ready    master   59m   v1.18.3+a34fde4
yunjiang00123456789-cvndz-master-2          Ready    master   59m   v1.18.3+a34fde4
yunjiang00123456789-cvndz-worker-a-gbzhl    Ready    worker   33m   v1.18.3+a34fde4
yunjiang00123456789-cvndz-worker-b-p42w5    Ready    worker   33m   v1.18.3+a34fde4
yunjiang00123456789-cvndz-worker-c-8pxv8    Ready    worker   32m   v1.18.3+a34fde4

$ oc debug node/yunjiang00123456789-cvndz-worker-a-gbzhl
Starting pod/yunjiang00123456789-cvndz-worker-a-gbzhl-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# ls
bin boot dev etc home lib lib64 media mnt opt ostree proc root run sbin srv sys sysroot tmp usr var
sh-4.4# journalctl -u node-valid-hostname.service
-- Logs begin at Thu 2020-07-16 02:02:42 UTC, end at Thu 2020-07-16 02:40:57 UTC. --
Jul 16 02:03:00 localhost systemd[1]: Starting Ensure the node hostname is valid for the cluster...
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started Ensure the node hostname is valid for the cluster.
Jul 16 02:04:19 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Stopped Ensure the node hostname is valid for the cluster.
Jul 16 02:04:19 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: node-valid-hostname.service: Consumed 21ms CPU time
-- Reboot --
Jul 16 02:04:46 localhost systemd[1]: Starting Ensure the node hostname is valid for the cluster...
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started Ensure the node hostname is valid for the cluster.
Jul 16 02:38:31 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Stopped Ensure the node hostname is valid for the cluster.
Jul 16 02:38:31 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: node-valid-hostname.service: Consumed 0 CPU time
-- Reboot --
Jul 16 02:39:23 localhost systemd[1]: Starting Ensure the node hostname is valid for the cluster...
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started Ensure the node hostname is valid for the cluster.

sh-4.4# journalctl --no-pager | grep -20 "using trunacated hostname"
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8890] dhcp4 (ens4): option rfc3442_classless_static_routes => '10.0.32.1/32 0.0.0.0 0.0.0.0/0 10.0.32.1'
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8890] dhcp4 (ens4): option routers => '10.0.32.1'
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8890] dhcp4 (ens4): option subnet_mask => '255.255.255.255'
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8890] dhcp4 (ens4): state changed unknown -> bound
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8909] device (ens4): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8945] device (ens4): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8948] device (ens4): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8954] manager: NetworkManager state is now CONNECTED_LOCAL
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8963] manager: NetworkManager state is now CONNECTED_SITE
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8964] policy: set 'Wired Connection' (ens4) as default for IPv4 routing and DNS
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8966] policy: set-hostname: set hostname to 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal' (from DHCPv4)
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8974] device (ens4): Activation: successful, device activated.
Jul 16 02:03:01 localhost dbus-daemon[1402]: [system] Activating via systemd: service name='org.freedesktop.resolve1' unit='dbus-org.freedesktop.resolve1.service' requested by ':1.7' (uid=0 pid=1578 comm="/usr/sbin/NetworkManager --no-daemon " label="system_u:system_r:NetworkManager_t:s0")
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8982] manager: NetworkManager state is now CONNECTED_GLOBAL
Jul 16 02:03:01 localhost dbus-daemon[1402]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.resolve1.service': Unit dbus-org.freedesktop.resolve1.service not found.
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8991] manager: startup complete
Jul 16 02:03:01 localhost NetworkManager[1578]: <warn> [1594864981.8996] hostname: couldn't set the system hostname to 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal' using hostnamed: GDBus.Error:org.freedesktop.DBus.Error.InvalidArgs: Invalid hostname 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal'
Jul 16 02:03:01 localhost NetworkManager[1578]: <warn> [1594864981.8996] policy: set-hostname: couldn't set the system hostname to 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal': (1) Operation not permitted
Jul 16 02:03:01 localhost NetworkManager[1578]: <warn> [1594864981.8996] policy: set-hostname: you should use hostnamed when systemd hardening is in effect!
Jul 16 02:03:01 localhost systemd[1]: Started Network Manager Wait Online.
Jul 16 02:03:01 localhost 90-long-hostname[1648]: yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal is longer than 63 characters, using trunacated hostname
Jul 16 02:03:01 localhost 90-long-hostname[1650]: setting transient hostname to yunjiang00123456789-cvndz-worker-a-gbzhl
Jul 16 02:03:01 yunjiang00123456789-cvndz-worker-a-gbzhl systemd-hostnamed[1589]: Changed host name to 'yunjiang00123456789-cvndz-worker-a-gbzhl'
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl bash[1656]: node identified as yunjiang00123456789-cvndz-worker-a-gbzhl
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started Ensure the node hostname is valid for the cluster.
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Reached target Network is Online.
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting Generate /run/issue.d/console-login-helper-messages.issue...
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting NFS status monitor for NFSv2/3 locking....
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: iscsi.service: Unit cannot be reloaded because it is inactive.
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting Machine Config Daemon Pull...
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting Resize root partition...
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started Generate /run/issue.d/console-login-helper-messages.issue.
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting RPC Bind...
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting Permit User Sessions...
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started Permit User Sessions.
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started Getty on tty1.
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started Serial Getty on ttyS0.
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl rpc.statd[1697]: Version 2.3.3 starting
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl rpc.statd[1697]: Flags: TI-RPC
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Reached target Login Prompts.
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started RPC Bind.
--
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9009] dhcp4 (ens4): option rfc3442_classless_static_routes => '10.0.32.1/32 0.0.0.0 0.0.0.0/0 10.0.32.1'
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9009] dhcp4 (ens4): option routers => '10.0.32.1'
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9009] dhcp4 (ens4): option subnet_mask => '255.255.255.255'
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9009] dhcp4 (ens4): state changed unknown -> bound
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9026] device (ens4): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9047] device (ens4): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9050] device (ens4): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9055] manager: NetworkManager state is now CONNECTED_LOCAL
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9067] manager: NetworkManager state is now CONNECTED_SITE
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9069] policy: set 'Wired Connection' (ens4) as default for IPv4 routing and DNS
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9072] policy: set-hostname: set hostname to 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal' (from DHCPv4)
Jul 16 02:04:47 localhost dbus-daemon[1192]: [system] Activating via systemd: service name='org.freedesktop.resolve1' unit='dbus-org.freedesktop.resolve1.service' requested by ':1.6' (uid=0 pid=1364 comm="/usr/sbin/NetworkManager --no-daemon " label="system_u:system_r:NetworkManager_t:s0")
Jul 16 02:04:47 localhost dbus-daemon[1192]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.resolve1.service': Unit dbus-org.freedesktop.resolve1.service not found.
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9097] device (ens4): Activation: successful, device activated.
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9105] manager: NetworkManager state is now CONNECTED_GLOBAL
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9114] manager: startup complete
Jul 16 02:04:47 localhost NetworkManager[1364]: <warn> [1594865087.9119] hostname: couldn't set the system hostname to 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal' using hostnamed: GDBus.Error:org.freedesktop.DBus.Error.InvalidArgs: Invalid hostname 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal'
Jul 16 02:04:47 localhost NetworkManager[1364]: <warn> [1594865087.9119] policy: set-hostname: couldn't set the system hostname to 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal': (1) Operation not permitted
Jul 16 02:04:47 localhost NetworkManager[1364]: <warn> [1594865087.9119] policy: set-hostname: you should use hostnamed when systemd hardening is in effect!
Jul 16 02:04:47 localhost systemd[1]: Started Network Manager Wait Online.
Jul 16 02:04:47 localhost 90-long-hostname[1438]: yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal is longer than 63 characters, using trunacated hostname
Jul 16 02:04:47 localhost 90-long-hostname[1440]: setting transient hostname to yunjiang00123456789-cvndz-worker-a-gbzhl
Jul 16 02:04:47 yunjiang00123456789-cvndz-worker-a-gbzhl systemd-hostnamed[1379]: Changed host name to 'yunjiang00123456789-cvndz-worker-a-gbzhl'
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl bash[1446]: node identified as yunjiang00123456789-cvndz-worker-a-gbzhl
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started Ensure the node hostname is valid for the cluster.
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Reached target Network is Online.
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting NFS status monitor for NFSv2/3 locking....
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting OpenSSH server daemon...
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: iscsi.service: Unit cannot be reloaded because it is inactive.
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting CRI-O Auto Update Script...
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting RPC Bind...
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl sshd[1449]: Server listening on 0.0.0.0 port 22.
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl sshd[1449]: Server listening on :: port 22.
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started OpenSSH server daemon.
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started RPC Bind.
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl rpc.statd[1452]: Version 2.3.3 starting
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl rpc.statd[1452]: Flags: TI-RPC
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started NFS status monitor for NFSv2/3 locking..
Jul 16 02:04:49 yunjiang00123456789-cvndz-worker-a-gbzhl crio[1450]: time="2020-07-16 02:04:49.071548782Z" level=info msg="Version file /var/run/crio/version not found: open /var/run/crio/version: no such file or directory: triggering wipe of containers"
Jul 16 02:04:49 yunjiang00123456789-cvndz-worker-a-gbzhl crio[1450]: time="2020-07-16 02:04:49.071757721Z" level=info msg="Version file /var/lib/crio/version not found: open /var/lib/crio/version: no such file or directory: triggering wipe of images"
Jul 16 02:04:49 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started CRI-O Auto Update Script.
--
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4740] dhcp4 (ens4): option rfc3442_classless_static_routes => '10.0.32.1/32 0.0.0.0 0.0.0.0/0 10.0.32.1'
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4740] dhcp4 (ens4): option routers => '10.0.32.1'
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4740] dhcp4 (ens4): option subnet_mask => '255.255.255.255'
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4740] dhcp4 (ens4): state changed unknown -> bound
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4764] device (ens4): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4797] device (ens4): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4800] device (ens4): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4807] manager: NetworkManager state is now CONNECTED_LOCAL
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4821] manager: NetworkManager state is now CONNECTED_SITE
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4822] policy: set 'Wired Connection' (ens4) as default for IPv4 routing and DNS
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4824] policy: set-hostname: set hostname to 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal' (from DHCPv4)
Jul 16 02:39:24 localhost dbus-daemon[1140]: [system] Activating via systemd: service name='org.freedesktop.resolve1' unit='dbus-org.freedesktop.resolve1.service' requested by ':1.6' (uid=0 pid=1422 comm="/usr/sbin/NetworkManager --no-daemon " label="system_u:system_r:NetworkManager_t:s0")
Jul 16 02:39:24 localhost dbus-daemon[1140]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.resolve1.service': Unit dbus-org.freedesktop.resolve1.service not found.
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4900] device (ens4): Activation: successful, device activated.
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4908] manager: NetworkManager state is now CONNECTED_GLOBAL
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4915] manager: startup complete
Jul 16 02:39:24 localhost NetworkManager[1422]: <warn> [1594867164.4920] hostname: couldn't set the system hostname to 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal' using hostnamed: GDBus.Error:org.freedesktop.DBus.Error.InvalidArgs: Invalid hostname 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal'
Jul 16 02:39:24 localhost NetworkManager[1422]: <warn> [1594867164.4920] policy: set-hostname: couldn't set the system hostname to 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal': (1) Operation not permitted
Jul 16 02:39:24 localhost NetworkManager[1422]: <warn> [1594867164.4920] policy: set-hostname: you should use hostnamed when systemd hardening is in effect!
Jul 16 02:39:24 localhost systemd[1]: Started Network Manager Wait Online.
Jul 16 02:39:24 localhost 90-long-hostname[1539]: yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal is longer than 63 characters, using trunacated hostname
Jul 16 02:39:24 localhost 90-long-hostname[1541]: setting transient hostname to yunjiang00123456789-cvndz-worker-a-gbzhl
Jul 16 02:39:24 yunjiang00123456789-cvndz-worker-a-gbzhl systemd-hostnamed[1436]: Changed host name to 'yunjiang00123456789-cvndz-worker-a-gbzhl'
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl bash[1547]: node identified as yunjiang00123456789-cvndz-worker-a-gbzhl
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started Ensure the node hostname is valid for the cluster.
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Reached target Network is Online.
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: iscsi.service: Unit cannot be reloaded because it is inactive.
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting OpenSSH server daemon...
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting CRI-O Auto Update Script...
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting NFS status monitor for NFSv2/3 locking....
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl sshd[1549]: Server listening on 0.0.0.0 port 22.
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl sshd[1549]: Server listening on :: port 22.
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started OpenSSH server daemon.
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting RPC Bind...
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started RPC Bind.
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl rpc.statd[1553]: Version 2.3.3 starting
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl rpc.statd[1553]: Flags: TI-RPC
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started NFS status monitor for NFSv2/3 locking..
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl crio[1550]: time="2020-07-16 02:39:25.511604712Z" level=info msg="Version file /var/run/crio/version not found: open /var/run/crio/version: no such file or directory: triggering wipe of containers"
Jul 16 02:39:26 yunjiang00123456789-cvndz-worker-a-gbzhl crio[1550]: time="2020-07-16 02:39:26.192069648Z" level=info msg="Wiping containers"
Jul 16 02:39:26 yunjiang00123456789-cvndz-worker-a-gbzhl crio[1550]: time="2020-07-16 02:39:26.216986174Z" level=info msg="Deleted container 7e95ba6ed3e233d5e4e4bbc6ac0288f0146b8be5d2036825f34400a58eda59e2"

sh-4.4# hostname
yunjiang00123456789-cvndz-worker-a-gbzhl
*** Bug 1829453 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196