Bug 1921937 - During upgrade /etc/hostname becomes a directory, nodes are set with kubernetes.io/hostname=localhost label [NEEDINFO]
Summary: During upgrade /etc/hostname becomes a directory, nodes are set with kubernet...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.5
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: ---
: 4.8.0
Assignee: Peter Hunt
QA Contact: MinLi
URL:
Whiteboard:
: 1960691 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2021-01-28 21:57 UTC by Matt Bargenquast
Modified: 2021-10-12 09:47 UTC (History)
16 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Pods mounting /etc/hosts when there isn't a static /etc/hosts file Consequence: Nodes fail to come up on a reboot, as CRI-O creates non-existent mount sources as directories, which /etc/hosts isn't allowed to be. Fix: Add special handling in CRI-O around /etc/hosts that prevents the creation of the directory if the mount source isn't present Result: Even if a user mounts /etc/hosts (which they shouldn't do), and it doesn't exist on the node, the node is protected by CRI-O, and can reboot
Clone Of:
Environment:
Last Closed: 2021-07-27 22:37:34 UTC
Target Upstream Version:
miabbott: needinfo? (behoward)


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github cri-o cri-o pull 4844 0 None closed add absent_mount_sources_to_reject option 2021-05-06 13:50:30 UTC
Github cri-o cri-o pull 4857 0 None closed [release-1.21] add absent_mount_sources_to_reject option 2021-05-11 12:53:05 UTC
Github openshift machine-config-operator pull 2575 0 None open Bug 1921937: crio: reject /etc/hostname mount if absent 2021-05-11 13:07:36 UTC
Red Hat Product Errata RHSA-2021:2438 0 None None None 2021-07-27 22:37:54 UTC

Description Matt Bargenquast 2021-01-28 21:57:05 UTC
Description of problem:

During a recent cluster upgrade from v4.5.11 to v4.5.16, we have noticed a situation occur which seems to be very similar to BZ 1746968. We have also observed this at least one time in the past on a separate cluster again, again on v4.5.

The upgrade got stuck as etcd was unable to schedule properly, due to multiple master nodes having the label "kubernetes.io/hostname=localhost", which impacted etcd's anti-affinity scheduling.

On further investigation of the nodes with the problem (multiple masters and workers), it was discovered that /etc/hostname was a directory and that systemd-hostname was not running properly.

As a remediation, we were able to follow the steps of BZ1746968:
- Rm /etc/hostname
- Restart systemd-hostnamed
- Restart kubelet

We suspect that bug may be reoccuring due to the similarity of the symptoms.

Version-Release number of selected component (if applicable):

Upgrading from 4.5.11 to 4.5.16


How reproducible:

This problem has occurred in at least two separate cluster upgrades on v4.5.

Comment 3 Jeff Cantrill 2021-01-29 15:29:38 UTC
Moving to the containers team as the collector no longer mounts `/etc/hostname` per https://github.com/openshift/cluster-logging-operator/pull/239 associated with the bug in question.  Additional confirmation of the mounts from the daemonset:

      /etc/docker from dockerdaemoncfg (ro)
      /etc/fluent/configs.d/secure-forward from secureforwardconfig (ro)
      /etc/fluent/configs.d/syslog from syslogconfig (ro)
      /etc/fluent/configs.d/user from config (ro)
      /etc/fluent/keys from certs (ro)
      /etc/fluent/metrics from collector-metrics (rw)
      /etc/localtime from localtime (ro)
      /etc/ocp-forward from secureforwardcerts (ro)
      /etc/ocp-syslog from syslogcerts (ro)
      /etc/pki/ca-trust/extracted/pem/ from fluentd-trusted-ca-bundle (ro)
      /etc/sysconfig/docker from dockercfg (ro)
      /opt/app-root/src/run.sh from entrypoint (ro,path="run.sh")
      /run/log/journal from runlogjournal (rw)
      /var/lib/docker from varlibdockercontainers (ro)
      /var/lib/fluentd from filebufferstorage (rw)
      /var/log from varlog (rw)
      /var/run/ocp-collector/secrets/fluentd from clo-default-output-es (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from logcollector-token-k5vkk (ro)

and volumes:

Volumes:
  runlogjournal:
    Type:          HostPath (bare host directory volume)
    Path:          /run/log/journal
    HostPathType:  
  varlog:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log
    HostPathType:  
  varlibdockercontainers:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/docker
    HostPathType:  
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      fluentd
    Optional:  false
  secureforwardconfig:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      secure-forward
    Optional:  true
  secureforwardcerts:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  secure-forward
    Optional:    true
  syslogconfig:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      syslog
    Optional:  true
  syslogcerts:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  syslog
    Optional:    true
  entrypoint:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      fluentd
    Optional:  false
  certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  fluentd
    Optional:    false
  localtime:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/localtime
    HostPathType:  
  dockercfg:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/sysconfig/docker
    HostPathType:  
  dockerdaemoncfg:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/docker
    HostPathType:  
  filebufferstorage:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/fluentd
    HostPathType:  
  collector-metrics:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  fluentd-metrics
    Optional:    false
  clo-default-output-es:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  fluentd
    Optional:    false
  fluentd-trusted-ca-bundle:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      fluentd-trusted-ca-bundle
    Optional:  false
  logcollector-token-k5vkk:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  logcollector-token-k5vkk
    Optional:    false

Comment 4 Peter Hunt 2021-02-03 20:44:29 UTC
oops wrong one

Comment 5 Peter Hunt 2021-02-03 20:52:27 UTC
For docker compatibility, cri-o needs to create a directory at a volume host path if it does not currently exist. I think the only way to fix this would be to have cri-o start after `systemd-hostnamed.service` (or mandate no containers bind-mount /etc/hostname, which is not as simple to do...)

Comment 8 Michael Gugino 2021-03-03 21:57:16 UTC
I grepped through the must gather, 'etc/hostname' does not show up anywhere.  I grepped through our pods and deployments from a recent AWS CI run, no occurrence there either.

I took a look at a master and worker in 4.7.  Neither of these even has /etc/hostname present, everything is running fine.  systemd-hostnamed was found not running.  I restarted systemd-hostnamed, and no change, now stays running.

So, I created /etc/hostname as an empty dir, and restarted systemd-hostnamed, and that caused it to fail.  I remove the dir, restarted, and it works again.

Unfortunately, I don't think ensuring systemd-hostnamed starts first will accomplish what we want.  We need to ensure we put a file there so a directory doesn't get created instead.  I created an empty file, systemd-hostnamed didn't populate the file with anything, but it also didn't complain either.

Comment 9 Mrunal Patel 2021-03-03 23:49:05 UTC
This is what I see on master:

sh-4.4# cat /etc/hostname 
ci-ln-c02zkjt-f76d1-nksxx-worker-b-bblgc
sh-4.4# ls -l /etc/hostname 
-rw-r--r--. 1 root root 41 Mar  3 20:36 /etc/hostname
sh-4.4# systemctl status systemd-hostnamed
● systemd-hostnamed.service - Hostname Service
   Loaded: loaded (/usr/lib/systemd/system/systemd-hostnamed.service; static; vendor preset: disabled)
   Active: inactive (dead)
     Docs: man:systemd-hostnamed.service(8)
           man:hostname(5)
           man:machine-info(5)
           https://www.freedesktop.org/wiki/Software/systemd/hostnamed

Mar 03 20:36:43 ci-ln-c02zkjt-f76d1-nksxx-worker-b-bblgc.c.openshift-gce-devel-c systemd[1]: Starting Hostname Service...
Mar 03 20:36:43 ci-ln-c02zkjt-f76d1-nksxx-worker-b-bblgc.c.openshift-gce-devel-c systemd[1]: Started Hostname Service.
Mar 03 20:36:43 ci-ln-c02zkjt-f76d1-nksxx-worker-b-bblgc systemd-hostnamed[1310]: Changed static host name to 'ci-ln-c02zkjt-f76d1-nksxx-worker-b-bblgc'
Mar 03 20:37:13 ci-ln-c02zkjt-f76d1-nksxx-worker-b-bblgc systemd[1]: systemd-hostnamed.service: Succeeded.
Mar 03 20:37:13 ci-ln-c02zkjt-f76d1-nksxx-worker-b-bblgc systemd[1]: systemd-hostnamed.service: Consumed 270ms CPU time
sh-4.4# hostname
ci-ln-c02zkjt-f76d1-nksxx-worker-b-bblgc
sh-4.4# uname -a
Linux ci-ln-c02zkjt-f76d1-nksxx-worker-b-bblgc 4.18.0-240.15.1.el8_3.x86_64 #1 SMP Wed Feb 3 03:12:15 EST 2021 x86_64 x86_64 x86_64 GNU/Linux
sh-4.4# exit
exit
sh-4.4# exit
exit

Comment 10 Mrunal Patel 2021-03-03 23:50:05 UTC
Digged into the systemd code a bit and this is the function that is getting called: https://github.com/systemd/systemd/blob/v239/src/hostname/hostnamed.c#L443.

Comment 11 Mrunal Patel 2021-03-03 23:51:36 UTC
https://github.com/systemd/systemd/blob/v239/src/hostname/hostnamed.c#L488 writes to /etc/hostname

Comment 13 Mrunal Patel 2021-03-04 00:52:45 UTC
https://github.com/openshift/machine-config-operator/blob/522f0fa36cc7b952c6e98e120c58c66e6d795544/templates/common/_base/files/usr-local-sbin-set-valid-hostname.yaml#L64 calls hostnamectl --static only if /etc/hostname exists. If instead, we always have it call hostnamectl --static in addition to the transient call then we will have a valid /etc/hostname. Anyone see any downsides to that?

Comment 14 Michael Gugino 2021-03-04 02:47:47 UTC
Did you chroot to /host before checking that file?  It was present in the debug container, but not on the system /etc/hostname

Here's my terminal from the time.  I saw the same on a master and worker:

$ ./node-debug.sh ip-10-0-152-85.us-west-2.compute.internal
Starting pod/ip-10-0-152-85us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.152.85
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# systemctl status systemd-hostnamed
● systemd-hostnamed.service - Hostname Service
   Loaded: loaded (/usr/lib/systemd/system/systemd-hostnamed.service; static; vendor preset: disabled)
   Active: inactive (dead)
     Docs: man:systemd-hostnamed.service(8)
           man:hostname(5)
           man:machine-info(5)
           https://www.freedesktop.org/wiki/Software/systemd/hostnamed

Mar 03 15:15:59 localhost systemd[1]: Starting Hostname Service...
Mar 03 15:15:59 localhost systemd[1]: Started Hostname Service.
Mar 03 15:15:59 ip-10-0-152-85 systemd-hostnamed[1283]: Changed host name to 'ip-10-0-152-85'
Mar 03 15:16:30 ip-10-0-152-85 systemd[1]: systemd-hostnamed.service: Succeeded.
Mar 03 15:16:30 ip-10-0-152-85 systemd[1]: systemd-hostnamed.service: Consumed 228ms CPU time
sh-4.4# ls /etc/host
host.conf  hosts       <---- this is tab complete output to terminal
sh-4.4# ls /etc/host
ls: cannot access '/etc/host': No such file or directory
sh-4.4# mount | grep /etc/host
tmpfs on /var/lib/containers/storage/overlay/bc2382db9bc5d480ba8c5596e0357a80235918339ea88378ee459d020e149db8/merged/etc/hostname type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
# ^^^ this is the debug container
sh-4.4# mkdir /etc/hostname
sh-4.4# systemctl restart systemd-hostnamed
sh-4.4# systemctl status systemd-hostnamed
● systemd-hostnamed.service - Hostname Service
   Loaded: loaded (/usr/lib/systemd/system/systemd-hostnamed.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2021-03-03 21:52:59 UTC; 12s ago
     Docs: man:systemd-hostnamed.service(8)
           man:hostname(5)
           man:machine-info(5)
           https://www.freedesktop.org/wiki/Software/systemd/hostnamed
  Process: 264918 ExecStart=/usr/lib/systemd/systemd-hostnamed (code=exited, status=1/FAILURE)
 Main PID: 264918 (code=exited, status=1/FAILURE)
      CPU: 568ms

Mar 03 21:52:58 ip-10-0-152-85 systemd[1]: Starting Hostname Service...
Mar 03 21:52:59 ip-10-0-152-85 systemd-hostnamed[264918]: Failed to read hostname and machine information: Is a directory
Mar 03 21:52:59 ip-10-0-152-85 systemd[1]: Started Hostname Service.
Mar 03 21:52:59 ip-10-0-152-85 systemd[1]: systemd-hostnamed.service: Main process exited, code=exited, status=1/FAILURE
Mar 03 21:52:59 ip-10-0-152-85 systemd[1]: systemd-hostnamed.service: Failed with result 'exit-code'.
Mar 03 21:52:59 ip-10-0-152-85 systemd[1]: systemd-hostnamed.service: Consumed 568ms CPU time
sh-4.4# hostnamectl 
Failed to query system properties: Message recipient disconnected from message bus without replying
sh-4.4# hostname
ip-10-0-152-85
sh-4.4# rmdir /etc/hostname 
sh-4.4# systemctl restart systemd-hostnamed
sh-4.4# systemctl status systemd-hostnamed
● systemd-hostnamed.service - Hostname Service
   Loaded: loaded (/usr/lib/systemd/system/systemd-hostnamed.service; static; vendor preset: disabled)
   Active: active (running) since Wed 2021-03-03 21:55:27 UTC; 3s ago
     Docs: man:systemd-hostnamed.service(8)
           man:hostname(5)
           man:machine-info(5)
           https://www.freedesktop.org/wiki/Software/systemd/hostnamed
 Main PID: 266420 (systemd-hostnam)
    Tasks: 1 (limit: 100092)
   Memory: 2.4M
      CPU: 531ms
   CGroup: /system.slice/systemd-hostnamed.service
           └─266420 /usr/lib/systemd/systemd-hostnamed

Mar 03 21:55:27 ip-10-0-152-85 systemd[1]: Starting Hostname Service...
Mar 03 21:55:27 ip-10-0-152-85 systemd[1]: Started Hostname Service.
sh-4.4# ls /etc/hostname
ls: cannot access '/etc/hostname': No such file or directory
sh-4.4# touch /etc/hostname
sh-4.4# systemctl restart systemd-hostnamed
sh-4.4# systemctl status systemd-hostnamed
● systemd-hostnamed.service - Hostname Service
   Loaded: loaded (/usr/lib/systemd/system/systemd-hostnamed.service; static; vendor preset: disabled)
   Active: active (running) since Wed 2021-03-03 21:56:01 UTC; 2s ago
     Docs: man:systemd-hostnamed.service(8)
           man:hostname(5)
           man:machine-info(5)
           https://www.freedesktop.org/wiki/Software/systemd/hostnamed
 Main PID: 266792 (systemd-hostnam)
    Tasks: 1 (limit: 100092)
   Memory: 2.4M
      CPU: 589ms
   CGroup: /system.slice/systemd-hostnamed.service
           └─266792 /usr/lib/systemd/systemd-hostnamed

Mar 03 21:56:00 ip-10-0-152-85 systemd[1]: Starting Hostname Service...
Mar 03 21:56:01 ip-10-0-152-85 systemd[1]: Started Hostname Service.
sh-4.4# cat /etc/hostname 
sh-4.4#

Comment 15 Micah Abbott 2021-03-10 22:25:22 UTC
(In reply to Mrunal Patel from comment #13)
> https://github.com/openshift/machine-config-operator/blob/
> 522f0fa36cc7b952c6e98e120c58c66e6d795544/templates/common/_base/files/usr-
> local-sbin-set-valid-hostname.yaml#L64 calls hostnamectl --static only if
> /etc/hostname exists. If instead, we always have it call hostnamectl
> --static in addition to the transient call then we will have a valid
> /etc/hostname. Anyone see any downsides to that?

@behoward Since you did the implementation of this dispatcher script, could you weigh in on the suggestion?

Comment 16 Peter Hunt 2021-03-19 19:02:26 UTC
still working through a solution

Comment 18 Peter Hunt 2021-04-30 20:13:06 UTC
I need to check in with folks about the state of this. EOD Friday is probably not the best time to expect that to happen.

Comment 19 Colin Walters 2021-05-03 19:56:10 UTC
Basically, privileged containers shouldn't mount `/etc/hostname`.  Instead, read `/proc/sys/kernel/hostname` to find the dynamic hostname value.

I could imagine the containers stack special casing the /etc/hostname mount and not making it into an empty directory.  But host-side workarounds for this would involve *another* thing being in the hostname business which we really don't want.

So my recommendation is basically close this WONTFIX.

Comment 20 Peter Hunt 2021-05-04 16:13:51 UTC
I talked with Mrunal about this, and we came up with a compromise. The problem is, we really shouldn't be bricking the nodes if we can avoid it. However, we don't want to add another entity to bicker about the hostname.

We came up with a solution where we special-case /etc/hostname in cri-o. If we find /etc/hostname and see that it does not exist on the host, we fail on container/pod creation. That way, a user gets a helpful message explaining how they messed up, but it doesn't brick the node. They can either mount /proc/sys/kernel/hostname, or create /etc/hostname themselves to work around it.

Comment 21 Peter Hunt 2021-05-05 20:35:50 UTC
attached PR, plus a PR in MCO, will implement the above behavior

Comment 22 Peter Hunt 2021-05-06 13:50:52 UTC
4.8 version attached (will still need a MCO PR once this merges)

Comment 23 Colin Walters 2021-05-17 13:58:34 UTC
*** Bug 1960691 has been marked as a duplicate of this bug. ***

Comment 24 Colin Walters 2021-05-17 15:38:49 UTC
Anyone affected by this should be able to work around it with something like this:

```
$ for node in $(oc get nodes -o name); do (set -x; oc debug -q $node -- rmdir /host/etc/hostname); done
```

`rmdir` will (harmlessly and correctly) fail if it's not a directory, so only broken nodes should be changed.

Comment 25 Colin Walters 2021-05-17 16:42:03 UTC
*** Bug 1960691 has been marked as a duplicate of this bug. ***

Comment 27 Sunil Choudhary 2021-06-11 09:30:49 UTC
Checked on 4.8.0-0.nightly-2021-06-10-224448.

crio conf now have set absent_mount_sources_to_reject = "/etc/hostname"

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-06-10-224448   True        False         3h27m   Cluster version is 4.8.0-0.nightly-2021-06-10-224448


sh-4.4# cat /etc/crio/crio.conf.d/00-default                                         
[crio]
internal_wipe = true

[crio.api]
stream_address = ""
stream_port = "10010"

[crio.runtime]
conmon = "/usr/libexec/crio/conmon"
conmon_cgroup = "pod"
default_env = [
    "NSS_SDB_USE_CACHE=no",
]
log_level = "info"
cgroup_manager = "systemd"
default_sysctls = [
    "net.ipv4.ping_group_range=0 2147483647",
]
hooks_dir = [
    "/etc/containers/oci/hooks.d",
    "/run/containers/oci/hooks.d",
]
manage_ns_lifecycle = true
absent_mount_sources_to_reject = [
    "/etc/hostname",
]

[crio.image]
global_auth_file = "/var/lib/kubelet/config.json"
pause_image = "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d3dc6f2e6de660f92199ac81e9160f28518471fc04faffde6addc3f4998c8a88"
pause_image_auth_file = "/var/lib/kubelet/config.json"
pause_command = "/usr/bin/pod"

[crio.network]
network_dir = "/etc/kubernetes/cni/net.d/"
plugin_dirs = [
    "/var/lib/cni/bin",
    "/usr/libexec/cni",
]

[crio.metrics]
enable_metrics = true
metrics_port = 9537

Comment 30 errata-xmlrpc 2021-07-27 22:37:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438


Note You need to log in before you can comment on or make changes to this bug.