Bug 1802675
Summary: | [IPI][Baremetal] sometimes mdns-publisher (infra pod) advertises the node's name as 'localhost' | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Antoni Segura Puimedon <asegurap> |
Component: | Machine Config Operator | Assignee: | Yossi Boaron <yboaron> |
Status: | CLOSED WONTFIX | QA Contact: | Nataf Sharabi <nsharabi> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.3.0 | CC: | achernet, acomabon, asegurap, bschmaus, kgarriso, kni-bugs, rgregory, rhhi-next-mgmt-qe, rsandu, scuppett, vvoronko, wsun, yboaron |
Target Milestone: | --- | ||
Target Release: | 4.3.z | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | 1790823 | Environment: | |
Last Closed: | 2020-03-24 23:43:53 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1790823 | ||
Bug Blocks: |
Comment 1
Antoni Segura Puimedon
2020-02-17 11:26:38 UTC
Hi,

This bug was verified on OCP 4.4 in BZ-1790823, but the scenario doesn't work on 4.3. After talking to Yossi and trying to understand how to reproduce it, we've seen the following changes in the scenario.

When we make DHCP hand out the name `localhost` to one of the masters, kubelet fails to start:

```
[root@localhost ~]# systemctl status kubelet.service
● kubelet.service - Kubernetes Kubelet
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset:>
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-default-env.conf, 20-nodenet.conf
   Active: activating (auto-restart) (Result: exit-code) since Mon 2020-03-23 1>
  Process: 9547 ExecStart=/usr/bin/hyperkube kubelet --config=/etc/kubernetes/k>
  Process: 9545 ExecStartPre=/bin/rm -f /var/lib/kubelet/cpu_manager_state (cod>
  Process: 9543 ExecStartPre=/bin/mkdir --parents /etc/kubernetes/manifests (co>
 Main PID: 9547 (code=exited, status=255)
      CPU: 278ms

Mar 23 11:02:54 localhost.localdomain systemd[1]: kubelet.service: Consumed 278>
```

We can see that the verify-hostname script is exiting:

```
[root@localhost ~]# sudo crictl ps -a | grep verify
53ad966386e63  fabb83d6707761415d7bc20744a8975a704c1fc61475890473483be31ad27b69  About an hour ago  Exited  verify-hostname  8  3f0d2a756a482
d4e5bfc13c607  quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c21b0b480483bb5f42aeec39a8c6a802d1c940bb15561ee84f9df3317fb24c37  2 days ago  Exited  verify-hostname  0  cad4d8986a2c2
```

```
[root@localhost ~]# crictl logs 53ad966386e63
function get_hostname() {
    if [[ -s $RUNTIMECFG_HOSTNAME_PATH ]]; then
        cat $RUNTIMECFG_HOSTNAME_PATH
    else
        # if hostname wasn't updated by NM script, read hostname
        hostname
    fi
}

while [[ "$(get_hostname)" =~ ^localhost(.localdomain)?$ ]]; do
    echo "XXhostname is still set to a default value"
    sleep 1
done
++ get_hostname
++ [[ -s /etc/mdns/hostname ]]
++ cat /etc/mdns/hostname
+ [[ localhost.ocp-edge-cluster.qe.lab.redhat.com =~ ^localhost(.localdomain)?$ ]]
```

We should get the following output from the above: "hostname is still set to a default value".

In addition:

```
Mar 23 01:00:04 localhost root[1367]: NM mdns-hostname triggered by hostname.
Mar 23 01:00:04 localhost nm-dispatcher[1351]: <13>Mar 23 01:00:04 root: NM mdn>
Mar 23 01:00:04 localhost root[1371]: Hostname changed: localhost
Mar 23 01:00:04 localhost nm-dispatcher[1351]: <13>Mar 23 01:00:04 root: Hostna>
Mar 23 01:00:04 localhost dhclient[1365]: DHCPDISCOVER on enp4s0 to 255.255.255>
Mar 23 01:00:04 localhost dhclient[1368]: DHCPDISCOVER on enp5s0 to 255.255.255>
Mar 23 01:00:07 localhost dhclient[1368]: DHCPDISCOVER on enp5s0 to 255.255.255>
Mar 23 01:00:08 localhost dhclient[1365]: DHCPDISCOVER on enp4s0 to 255.255.255>
Mar 23 01:00:10 localhost dhclient[1368]: DHCPDISCOVER on enp5s0 to 255.255.255>
Mar 23 01:00:14 localhost systemd[1]: NetworkManager-dispatcher.service: Consum>
Mar 23 01:00:16 localhost dhclient[1368]: DHCPDISCOVER on enp5s0 to 255.255.255>
Mar 23 01:00:18 localhost dhclient[1365]: DHCPDISCOVER on enp4s0 to 255.255.255>
Mar 23 01:00:23 localhost dhclient[1368]: DHCPDISCOVER on enp5s0 to 255.255.255>
Mar 23 01:00:25 localhost dhclient[1365]: DHCPDISCOVER on enp4s0 to 255.255.255>
Mar 23 01:00:33 localhost dhclient[1365]: DHCPDISCOVER on enp4s0 to 255.255.255>
Mar 23 01:00:34 localhost systemd[1]: NetworkManager-wait-online.service: Main >
Mar 23 01:00:34 localhost systemd[1]: NetworkManager-wait-online.service: Faile>
Mar 23 01:00:34 localhost systemd[1]: Failed to start Network Manager Wait Onli>
NetworkManager-wait-online.service: Main process exited, code=exited, status=1/FA>
```

Therefore I cannot verify this bug, since I'm not able to reproduce the scenario.
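For context, the hostname check quoted in the `crictl logs` output above can be sketched as a standalone script. This is a minimal sketch, not the actual verify-hostname container script: the `is_default` helper is ours, and we assume `RUNTIMECFG_HOSTNAME_PATH` points at `/etc/mdns/hostname` as shown in the trace. Note that the regex `^localhost(.localdomain)?$` matches only the bare defaults, so an FQDN like `localhost.ocp-edge-cluster.qe.lab.redhat.com` does not match it:

```shell
#!/usr/bin/env bash
# Sketch of the verify-hostname check from the comment above.
# RUNTIMECFG_HOSTNAME_PATH=/etc/mdns/hostname is an assumption from the trace.
RUNTIMECFG_HOSTNAME_PATH=${RUNTIMECFG_HOSTNAME_PATH:-/etc/mdns/hostname}

get_hostname() {
    if [[ -s $RUNTIMECFG_HOSTNAME_PATH ]]; then
        cat "$RUNTIMECFG_HOSTNAME_PATH"
    else
        # If the NM dispatcher script didn't write the file, fall back
        # to the kernel hostname.
        hostname
    fi
}

# is_default is a hypothetical helper wrapping the regex from the script.
is_default() { [[ $1 =~ ^localhost(.localdomain)?$ ]]; }

is_default "localhost"                  && echo "still a default hostname"
is_default "localhost.localdomain"      && echo "still a default hostname"
is_default "localhost.ocp-edge-cluster.qe.lab.redhat.com" \
    || echo "FQDN falls through the check"
```

This matching behavior is consistent with the trace above, where the comparison against `localhost.ocp-edge-cluster.qe.lab.redhat.com` is the last line before the container exits rather than looping with the "default value" message.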