Description of problem:

Trigger a GCP IPI install with a cluster name whose length equals 19, e.g. yunjiang00123456789. The following errors occur:

level=debug msg="Still waiting for the cluster to initialize: Working towards 4.5.0-0.nightly-2020-07-02-021110: 86% complete, waiting on authentication, console, csi-snapshot-controller, image-registry, ingress, kube-storage-version-migrator, monitoring"
level=debug msg="Still waiting for the cluster to initialize: Working towards 4.5.0-0.nightly-2020-07-02-021110: 86% complete"
level=debug msg="Still waiting for the cluster to initialize: Working towards 4.5.0-0.nightly-2020-07-02-021110: 86% complete"
level=debug msg="Still waiting for the cluster to initialize: Some cluster operators are still updating: authentication, console, csi-snapshot-controller, image-registry, ingress, kube-storage-version-migrator, monitoring"
level=debug msg="Still waiting for the cluster to initialize: Working towards 4.5.0-0.nightly-2020-07-02-021110: 86% complete"
level=debug msg="Still waiting for the cluster to initialize: Some cluster operators are still updating: authentication, console, csi-snapshot-controller, image-registry, ingress, kube-storage-version-migrator, monitoring"
level=debug msg="Still waiting for the cluster to initialize: Some cluster operators are still updating: authentication, console, csi-snapshot-controller, image-registry, ingress, kube-storage-version-migrator, monitoring"
level=error msg="Cluster operator authentication Degraded is True with IngressStateEndpoints_MissingEndpoints::RouteStatus_FailedHost: IngressStateEndpointsDegraded: No endpoints found for oauth-server\nRouteStatusDegraded: route is not available at canonical host oauth-openshift.apps.yunjiang00123456789.qe.gcp.devcluster.openshift.com: []"
level=info msg="Cluster operator authentication Progressing is Unknown with NoData: "
level=info msg="Cluster operator authentication Available is Unknown with NoData: "
level=info msg="Cluster operator console Progressing is True with DefaultRouteSync_FailedAdmitDefaultRoute::OAuthClientSync_FailedHost: DefaultRouteSyncProgressing: route \"console\" is not available at canonical host []\nOAuthClientSyncProgressing: route \"console\" is not available at canonical host []"
level=info msg="Cluster operator console Available is Unknown with NoData: "
level=info msg="Cluster operator image-registry Available is False with NoReplicasAvailable: Available: The deployment does not have available replicas\nImagePrunerAvailable: Pruner CronJob has been created"
level=info msg="Cluster operator image-registry Progressing is True with DeploymentNotCompleted: Progressing: The deployment has not completed"
level=info msg="Cluster operator ingress Available is False with IngressUnavailable: Not all ingress controllers are available."
level=info msg="Cluster operator ingress Progressing is True with Reconciling: Not all ingress controllers are available."
level=error msg="Cluster operator ingress Degraded is True with IngressControllersDegraded: Some ingresscontrollers are degraded: default"
level=info msg="Cluster operator insights Disabled is False with : "
level=info msg="Cluster operator kube-storage-version-migrator Available is False with _NoMigratorPod: Available: deployment/migrator.openshift-kube-storage-version-migrator: no replicas are available"
level=info msg="Cluster operator monitoring Progressing is True with RollOutInProgress: Rolling out the stack."
level=error msg="Cluster operator monitoring Degraded is True with UpdatingAlertmanagerFailed: Failed to rollout the stack. Error: running task Updating Alertmanager failed: waiting for Alertmanager Route to become ready failed: waiting for RouteReady of alertmanager-main: no status available for alertmanager-main"
level=info msg="Cluster operator monitoring Available is False with : "
level=fatal msg="failed to initialize the cluster: Some cluster operators are still updating: authentication, console, csi-snapshot-controller, image-registry, ingress, kube-storage-version-migrator, monitoring"

Errors on the worker node:

Jul 03 06:24:36 yunjiang00123456789-cx4m4-worker-a-5ltln.c.openshift-qe.internal hyperkube[12012]: E0703 06:24:36.741154 12012 kubelet_node_status.go:92] Unable to register node "yunjiang00123456789-cx4m4-worker-a-5ltln.c.openshift-qe.internal" with API server: Node "yunjiang00123456789-cx4m4-worker-a-5ltln.c.openshift-qe.internal" is invalid: metadata.labels: Invalid value: "yunjiang00123456789-cx4m4-worker-a-5ltln.c.openshift-qe.internal": must be no more than 63 characters

How reproducible:
Always

OCP version:
4.5.0-0.nightly-2020-07-02-021110

Steps to reproduce:
Trigger an IPI install on GCP and make sure the length of the cluster name equals 19.

Expected results:
Installation succeeds.

Actual results:
Installation fails.

Additional info:
Cluster names shorter or longer than 19 characters both work; the error occurs only on the GCP platform.
Created attachment 1699796 [details] install log
Moving to the RHCOS team, since they are allowing a hostname length of <= 64 while the hostname that becomes the node name is only allowed to be at most 63 characters.

Why does it happen? If you look at the journal logs you will see that NetworkManager accepts hostnames up to 64 characters and only fails when the hostname is longer than that, in which case it falls back to localhost. The MCO truncation logic only triggers when the hostname is set to localhost, so in this case it is skipped. Hopefully this helps the RHCOS team fix it.

Also lowering severity, as this happens ONLY when the total length of the hostname (the internal DNS name for a machine, defined by GCP) is exactly 64 characters. This is not a regression; we have had this bug since the beginning, because the machine name length is controlled by the installer but the project name is not. We had a bigger problem previously and fixed it, except for this exact case of 64 characters.
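For illustration only, a minimal bash sketch of the truncation step described above (this is not the shipped MCO script; the journal excerpts later in this bug show the messages the real 90-long-hostname script emits):

#!/bin/bash
# Hypothetical sketch: shorten an over-long DHCP-provided FQDN so the node
# name fits the 63-character limit. Note the flow described above only reaches
# truncation when NetworkManager has fallen back to localhost, which is why a
# 64-character hostname (still accepted by NetworkManager) slips through.
fqdn=$(</proc/sys/kernel/hostname)
if [ "${#fqdn}" -gt 63 ]; then
    short="${fqdn%%.*}"          # keep only the first DNS label
    echo "setting transient hostname to ${short:0:63}"
    hostnamectl --transient set-hostname "${short:0:63}"
fi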
This looks like it should be handled by https://github.com/openshift/machine-config-operator/pull/1813 and/or https://github.com/openshift/machine-config-operator/pull/1711. Targeting 4.6 with medium priority, pending additional triage.
Ugh. Yeah, this is a bug in the NetworkManager dispatch script.
Actually, after spending way more time than I care to confess, I tracked this down to dhclient _truncating_ the hostname:

# printf $(</proc/sys/kernel/hostname) | wc -m
64
# cat /proc/sys/kernel/hostname
c30dabda1c28ac3a52fecc76a49da356c89c09b08d8c2765752f6dd8926bfe2.

On 4.6, RHCOS uses NetworkManager in the initramfs. Pre-4.6, first boot uses NetworkManager. What's unclear to me is _when_ in the boot this is happening. On second boot the 90-long-hostname NM patch takes effect and would prevent the hostname from being longer than 63 characters.

I spent the afternoon on a deep dive and found that 4.6 is unaffected; this should be targeted at the 4.5.z z-stream.

Can we get a must-gather and a console log?
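One supporting data point (my note, not taken from the logs above): the Linux kernel accepts hostnames up to HOST_NAME_MAX = 64 bytes, while a single DNS label, and hence a node name, maxes out at 63. A 64-character hostname is therefore exactly the value the kernel tolerates but the API server rejects:

$ getconf HOST_NAME_MAX
64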
I uploaded logs to https://drive.google.com/drive/folders/1lXrHiLjv7e1AakDUL5s1-O09b9P1CgLY?usp=sharing
Proposed PR fix: https://github.com/openshift/machine-config-operator/pull/1914

The root cause is not the network dispatcher script itself, but a race condition between the NetworkManager dispatcher script that truncates the hostname and dhclient in the initramfs, which truncates the hostname at 64 characters.
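To make the shape of the fix concrete, a minimal bash sketch of the kind of guard involved (illustrative only; the actual implementation is in the PR above):

#!/bin/bash
# Hypothetical sketch of a node-valid-hostname-style wait: hold off until the
# hostname is non-localhost and short enough to register as a node name,
# regardless of which of dhclient or the dispatcher script ran last.
while true; do
    h=$(</proc/sys/kernel/hostname)
    if [ "$h" != "localhost" ] && [ "${#h}" -le 63 ]; then
        echo "node identified as $h"
        break
    fi
    sleep 1
done

Ordering kubelet after such a unit sidesteps the race: it no longer matters when dhclient's 64-character truncation lands relative to the dispatcher script.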
Verified on 4.6.0-0.nightly-2020-07-15-170241

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-07-15-170241   True        False         44m     Cluster version is 4.6.0-0.nightly-2020-07-15-170241

IPI installed successfully:
-----------------------------------------------
DEBUG Still waiting for the cluster to initialize: Working towards 4.6.0-0.nightly-2020-07-15-170241: 94% complete
DEBUG Still waiting for the cluster to initialize: Some cluster operators are still updating: authentication, cloud-credential, cluster-autoscaler, console, csi-snapshot-controller, image-registry, ingress, insights, kube-storage-version-migrator, machine-api, marketplace, monitoring, openshift-samples, storage
DEBUG Still waiting for the cluster to initialize: Working towards 4.6.0-0.nightly-2020-07-15-170241: 94% complete
DEBUG Still waiting for the cluster to initialize: Some cluster operators are still updating: authentication, cloud-credential, cluster-autoscaler, console, csi-snapshot-controller, image-registry, ingress, insights, kube-storage-version-migrator, machine-api, marketplace, monitoring, openshift-samples, storage
DEBUG Still waiting for the cluster to initialize: Working towards 4.6.0-0.nightly-2020-07-15-170241: 96% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.6.0-0.nightly-2020-07-15-170241: 97% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.6.0-0.nightly-2020-07-15-170241: 97% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.6.0-0.nightly-2020-07-15-170241: 98% complete, waiting on authentication, cluster-autoscaler, console, csi-snapshot-controller, image-registry, ingress, kube-storage-version-migrator, machine-api, monitoring
DEBUG Still waiting for the cluster to initialize: Working towards 4.6.0-0.nightly-2020-07-15-170241: 99% complete
DEBUG Cluster is initialized
INFO Waiting up to 10m0s for the openshift-console route to be created...
DEBUG Route found in openshift-console namespace: console
DEBUG Route found in openshift-console namespace: downloads
DEBUG OpenShift console route is created
INFO Install complete!

Hostname is truncated successfully:
-----------------------------------------------
$ oc get nodes
NAME                                        STATUS   ROLES    AGE   VERSION
yunjiang00123456789-cvndz-master-0          Ready    master   59m   v1.18.3+a34fde4
yunjiang00123456789-cvndz-master-1          Ready    master   59m   v1.18.3+a34fde4
yunjiang00123456789-cvndz-master-2          Ready    master   59m   v1.18.3+a34fde4
yunjiang00123456789-cvndz-worker-a-gbzhl    Ready    worker   33m   v1.18.3+a34fde4
yunjiang00123456789-cvndz-worker-b-p42w5    Ready    worker   33m   v1.18.3+a34fde4
yunjiang00123456789-cvndz-worker-c-8pxv8    Ready    worker   32m   v1.18.3+a34fde4

$ oc debug node/yunjiang00123456789-cvndz-worker-a-gbzhl
Starting pod/yunjiang00123456789-cvndz-worker-a-gbzhl-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# ls
bin boot dev etc home lib lib64 media mnt opt ostree proc root run sbin srv sys sysroot tmp usr var
sh-4.4# journalctl -u node-valid-hostname.service
-- Logs begin at Thu 2020-07-16 02:02:42 UTC, end at Thu 2020-07-16 02:40:57 UTC. --
Jul 16 02:03:00 localhost systemd[1]: Starting Ensure the node hostname is valid for the cluster...
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started Ensure the node hostname is valid for the cluster.
Jul 16 02:04:19 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Stopped Ensure the node hostname is valid for the cluster.
Jul 16 02:04:19 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: node-valid-hostname.service: Consumed 21ms CPU time
-- Reboot --
Jul 16 02:04:46 localhost systemd[1]: Starting Ensure the node hostname is valid for the cluster...
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started Ensure the node hostname is valid for the cluster.
Jul 16 02:38:31 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Stopped Ensure the node hostname is valid for the cluster.
Jul 16 02:38:31 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: node-valid-hostname.service: Consumed 0 CPU time
-- Reboot --
Jul 16 02:39:23 localhost systemd[1]: Starting Ensure the node hostname is valid for the cluster...
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started Ensure the node hostname is valid for the cluster.

sh-4.4# journalctl --no-pager | grep -20 "using trunacated hostname"
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8890] dhcp4 (ens4): option rfc3442_classless_static_routes => '10.0.32.1/32 0.0.0.0 0.0.0.0/0 10.0.32.1'
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8890] dhcp4 (ens4): option routers => '10.0.32.1'
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8890] dhcp4 (ens4): option subnet_mask => '255.255.255.255'
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8890] dhcp4 (ens4): state changed unknown -> bound
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8909] device (ens4): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8945] device (ens4): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8948] device (ens4): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8954] manager: NetworkManager state is now CONNECTED_LOCAL
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8963] manager: NetworkManager state is now CONNECTED_SITE
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8964] policy: set 'Wired Connection' (ens4) as default for IPv4 routing and DNS
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8966] policy: set-hostname: set hostname to 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal' (from DHCPv4)
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8974] device (ens4): Activation: successful, device activated.
Jul 16 02:03:01 localhost dbus-daemon[1402]: [system] Activating via systemd: service name='org.freedesktop.resolve1' unit='dbus-org.freedesktop.resolve1.service' requested by ':1.7' (uid=0 pid=1578 comm="/usr/sbin/NetworkManager --no-daemon " label="system_u:system_r:NetworkManager_t:s0")
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8982] manager: NetworkManager state is now CONNECTED_GLOBAL
Jul 16 02:03:01 localhost dbus-daemon[1402]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.resolve1.service': Unit dbus-org.freedesktop.resolve1.service not found.
Jul 16 02:03:01 localhost NetworkManager[1578]: <info> [1594864981.8991] manager: startup complete
Jul 16 02:03:01 localhost NetworkManager[1578]: <warn> [1594864981.8996] hostname: couldn't set the system hostname to 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal' using hostnamed: GDBus.Error:org.freedesktop.DBus.Error.InvalidArgs: Invalid hostname 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal'
Jul 16 02:03:01 localhost NetworkManager[1578]: <warn> [1594864981.8996] policy: set-hostname: couldn't set the system hostname to 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal': (1) Operation not permitted
Jul 16 02:03:01 localhost NetworkManager[1578]: <warn> [1594864981.8996] policy: set-hostname: you should use hostnamed when systemd hardening is in effect!
Jul 16 02:03:01 localhost systemd[1]: Started Network Manager Wait Online.
Jul 16 02:03:01 localhost 90-long-hostname[1648]: yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal is longer than 63 characters, using trunacated hostname
Jul 16 02:03:01 localhost 90-long-hostname[1650]: setting transient hostname to yunjiang00123456789-cvndz-worker-a-gbzhl
Jul 16 02:03:01 yunjiang00123456789-cvndz-worker-a-gbzhl systemd-hostnamed[1589]: Changed host name to 'yunjiang00123456789-cvndz-worker-a-gbzhl'
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl bash[1656]: node identified as yunjiang00123456789-cvndz-worker-a-gbzhl
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started Ensure the node hostname is valid for the cluster.
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Reached target Network is Online.
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting Generate /run/issue.d/console-login-helper-messages.issue...
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting NFS status monitor for NFSv2/3 locking....
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: iscsi.service: Unit cannot be reloaded because it is inactive.
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting Machine Config Daemon Pull...
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting Resize root partition...
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started Generate /run/issue.d/console-login-helper-messages.issue.
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting RPC Bind...
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting Permit User Sessions...
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started Permit User Sessions.
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started Getty on tty1.
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started Serial Getty on ttyS0.
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl rpc.statd[1697]: Version 2.3.3 starting
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl rpc.statd[1697]: Flags: TI-RPC
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Reached target Login Prompts.
Jul 16 02:03:02 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started RPC Bind.
--
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9009] dhcp4 (ens4): option rfc3442_classless_static_routes => '10.0.32.1/32 0.0.0.0 0.0.0.0/0 10.0.32.1'
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9009] dhcp4 (ens4): option routers => '10.0.32.1'
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9009] dhcp4 (ens4): option subnet_mask => '255.255.255.255'
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9009] dhcp4 (ens4): state changed unknown -> bound
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9026] device (ens4): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9047] device (ens4): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9050] device (ens4): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9055] manager: NetworkManager state is now CONNECTED_LOCAL
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9067] manager: NetworkManager state is now CONNECTED_SITE
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9069] policy: set 'Wired Connection' (ens4) as default for IPv4 routing and DNS
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9072] policy: set-hostname: set hostname to 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal' (from DHCPv4)
Jul 16 02:04:47 localhost dbus-daemon[1192]: [system] Activating via systemd: service name='org.freedesktop.resolve1' unit='dbus-org.freedesktop.resolve1.service' requested by ':1.6' (uid=0 pid=1364 comm="/usr/sbin/NetworkManager --no-daemon " label="system_u:system_r:NetworkManager_t:s0")
Jul 16 02:04:47 localhost dbus-daemon[1192]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.resolve1.service': Unit dbus-org.freedesktop.resolve1.service not found.
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9097] device (ens4): Activation: successful, device activated.
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9105] manager: NetworkManager state is now CONNECTED_GLOBAL
Jul 16 02:04:47 localhost NetworkManager[1364]: <info> [1594865087.9114] manager: startup complete
Jul 16 02:04:47 localhost NetworkManager[1364]: <warn> [1594865087.9119] hostname: couldn't set the system hostname to 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal' using hostnamed: GDBus.Error:org.freedesktop.DBus.Error.InvalidArgs: Invalid hostname 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal'
Jul 16 02:04:47 localhost NetworkManager[1364]: <warn> [1594865087.9119] policy: set-hostname: couldn't set the system hostname to 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal': (1) Operation not permitted
Jul 16 02:04:47 localhost NetworkManager[1364]: <warn> [1594865087.9119] policy: set-hostname: you should use hostnamed when systemd hardening is in effect!
Jul 16 02:04:47 localhost systemd[1]: Started Network Manager Wait Online.
Jul 16 02:04:47 localhost 90-long-hostname[1438]: yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal is longer than 63 characters, using trunacated hostname
Jul 16 02:04:47 localhost 90-long-hostname[1440]: setting transient hostname to yunjiang00123456789-cvndz-worker-a-gbzhl
Jul 16 02:04:47 yunjiang00123456789-cvndz-worker-a-gbzhl systemd-hostnamed[1379]: Changed host name to 'yunjiang00123456789-cvndz-worker-a-gbzhl'
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl bash[1446]: node identified as yunjiang00123456789-cvndz-worker-a-gbzhl
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started Ensure the node hostname is valid for the cluster.
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Reached target Network is Online.
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting NFS status monitor for NFSv2/3 locking....
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting OpenSSH server daemon...
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: iscsi.service: Unit cannot be reloaded because it is inactive.
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting CRI-O Auto Update Script...
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting RPC Bind...
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl sshd[1449]: Server listening on 0.0.0.0 port 22.
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl sshd[1449]: Server listening on :: port 22.
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started OpenSSH server daemon.
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started RPC Bind.
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl rpc.statd[1452]: Version 2.3.3 starting
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl rpc.statd[1452]: Flags: TI-RPC
Jul 16 02:04:48 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started NFS status monitor for NFSv2/3 locking..
Jul 16 02:04:49 yunjiang00123456789-cvndz-worker-a-gbzhl crio[1450]: time="2020-07-16 02:04:49.071548782Z" level=info msg="Version file /var/run/crio/version not found: open /var/run/crio/version: no such file or directory: triggering wipe of containers"
Jul 16 02:04:49 yunjiang00123456789-cvndz-worker-a-gbzhl crio[1450]: time="2020-07-16 02:04:49.071757721Z" level=info msg="Version file /var/lib/crio/version not found: open /var/lib/crio/version: no such file or directory: triggering wipe of images"
Jul 16 02:04:49 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started CRI-O Auto Update Script.
--
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4740] dhcp4 (ens4): option rfc3442_classless_static_routes => '10.0.32.1/32 0.0.0.0 0.0.0.0/0 10.0.32.1'
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4740] dhcp4 (ens4): option routers => '10.0.32.1'
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4740] dhcp4 (ens4): option subnet_mask => '255.255.255.255'
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4740] dhcp4 (ens4): state changed unknown -> bound
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4764] device (ens4): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4797] device (ens4): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4800] device (ens4): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4807] manager: NetworkManager state is now CONNECTED_LOCAL
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4821] manager: NetworkManager state is now CONNECTED_SITE
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4822] policy: set 'Wired Connection' (ens4) as default for IPv4 routing and DNS
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4824] policy: set-hostname: set hostname to 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal' (from DHCPv4)
Jul 16 02:39:24 localhost dbus-daemon[1140]: [system] Activating via systemd: service name='org.freedesktop.resolve1' unit='dbus-org.freedesktop.resolve1.service' requested by ':1.6' (uid=0 pid=1422 comm="/usr/sbin/NetworkManager --no-daemon " label="system_u:system_r:NetworkManager_t:s0")
Jul 16 02:39:24 localhost dbus-daemon[1140]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.resolve1.service': Unit dbus-org.freedesktop.resolve1.service not found.
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4900] device (ens4): Activation: successful, device activated.
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4908] manager: NetworkManager state is now CONNECTED_GLOBAL
Jul 16 02:39:24 localhost NetworkManager[1422]: <info> [1594867164.4915] manager: startup complete
Jul 16 02:39:24 localhost NetworkManager[1422]: <warn> [1594867164.4920] hostname: couldn't set the system hostname to 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal' using hostnamed: GDBus.Error:org.freedesktop.DBus.Error.InvalidArgs: Invalid hostname 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal'
Jul 16 02:39:24 localhost NetworkManager[1422]: <warn> [1594867164.4920] policy: set-hostname: couldn't set the system hostname to 'yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal': (1) Operation not permitted
Jul 16 02:39:24 localhost NetworkManager[1422]: <warn> [1594867164.4920] policy: set-hostname: you should use hostnamed when systemd hardening is in effect!
Jul 16 02:39:24 localhost systemd[1]: Started Network Manager Wait Online.
Jul 16 02:39:24 localhost 90-long-hostname[1539]: yunjiang00123456789-cvndz-worker-a-gbzhl.c.openshift-gce-devel-ci.internal is longer than 63 characters, using trunacated hostname
Jul 16 02:39:24 localhost 90-long-hostname[1541]: setting transient hostname to yunjiang00123456789-cvndz-worker-a-gbzhl
Jul 16 02:39:24 yunjiang00123456789-cvndz-worker-a-gbzhl systemd-hostnamed[1436]: Changed host name to 'yunjiang00123456789-cvndz-worker-a-gbzhl'
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl bash[1547]: node identified as yunjiang00123456789-cvndz-worker-a-gbzhl
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started Ensure the node hostname is valid for the cluster.
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Reached target Network is Online.
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: iscsi.service: Unit cannot be reloaded because it is inactive.
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting OpenSSH server daemon...
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting CRI-O Auto Update Script...
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting NFS status monitor for NFSv2/3 locking....
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl sshd[1549]: Server listening on 0.0.0.0 port 22.
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl sshd[1549]: Server listening on :: port 22.
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started OpenSSH server daemon.
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Starting RPC Bind...
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started RPC Bind.
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl rpc.statd[1553]: Version 2.3.3 starting
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl rpc.statd[1553]: Flags: TI-RPC
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl systemd[1]: Started NFS status monitor for NFSv2/3 locking..
Jul 16 02:39:25 yunjiang00123456789-cvndz-worker-a-gbzhl crio[1550]: time="2020-07-16 02:39:25.511604712Z" level=info msg="Version file /var/run/crio/version not found: open /var/run/crio/version: no such file or directory: triggering wipe of containers"
Jul 16 02:39:26 yunjiang00123456789-cvndz-worker-a-gbzhl crio[1550]: time="2020-07-16 02:39:26.192069648Z" level=info msg="Wiping containers"
Jul 16 02:39:26 yunjiang00123456789-cvndz-worker-a-gbzhl crio[1550]: time="2020-07-16 02:39:26.216986174Z" level=info msg="Deleted container 7e95ba6ed3e233d5e4e4bbc6ac0288f0146b8be5d2036825f34400a58eda59e2"

sh-4.4# hostname
yunjiang00123456789-cvndz-worker-a-gbzhl
*** Bug 1829453 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196