Bug 1838625
Summary: After upgrade from OCP v4.3.0 to v4.3.18 one worker node is NotReady and additional localhost with the same IP is present as a node
Product: OpenShift Container Platform
Component: RHCOS
Version: 4.3.z
Status: CLOSED DUPLICATE
Severity: medium
Priority: medium
Target Release: 4.6.0
Hardware: Unspecified
OS: Unspecified
Reporter: Radomir Ludva <rludva>
Assignee: Ben Howard <behoward>
QA Contact: Michael Nguyen <mnguyen>
CC: aos-bugs, bbreard, dornelas, eparis, imcleod, jligon, jokerman, miabbott, nstielau
Doc Type: If docs needed, set a value
Type: Bug
Last Closed: 2020-06-10 16:58:02 UTC
Bug Blocks: 1186913
Description
Radomir Ludva
2020-05-21 13:11:24 UTC
Reassigning to RHCOS. There is a bug regarding localhost being set in certain situations.

Additional info:
================
Deleting the localhost node and the worker node with oc delete, then restarting the worker node, did not solve the problem. The worker node rejoined the cluster as localhost. This time, however, the localhost node is a regular part of the cluster and has the worker node's IP. DNS and PTR records are set correctly.

After another worker node was restarted, that node is also NotReady and registered as localhost:

    $ openssl x509 -text -in /var/lib/kubelet/pki/kubelet-client-current.pem | grep CN
    Issuer: CN = kube-csr-signer_@1588779788
    Subject: O = system:nodes, CN = system:node:localhost

From the logs of network-online: it looks like the hostname is set correctly, but after 4 seconds it is back to localhost:
--------------------------------
    [debug node] $ journalctl -u network-online.target
    -- Logs begin at Thu 2020-05-21 10:30:01 UTC, end at Fri 2020-05-22 13:43:38 UTC. --
    May 22 10:40:21 worker-04.example.com systemd[1]: Stopped target Network is Online.
    -- Reboot --
    May 22 10:43:11 localhost systemd[1]: Reached target Network is Online.

Setting priority as medium and targeting 4.6. There are a handful of other BZs about how the hostname is handled that may be related to this one. We will investigate and triage this issue more thoroughly when capacity allows.

This is a duplicate of 1809345.

Backport was released via https://github.com/openshift/machine-config-operator/commit/0b2741b3c0d735446cedb3d2494d85a4cbd74b90

*** This bug has been marked as a duplicate of bug 1809345 ***
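
For anyone triaging a similar report, the certificate check from the description can be scripted. The following is only a rough sketch, not part of the original report or of any OpenShift tooling: it assumes the text output of `openssl x509 -text` has already been captured from the node, and the helper names `subject_cn` and `registered_as_localhost` are hypothetical.

```python
import re
from typing import Optional


def subject_cn(openssl_text: str) -> Optional[str]:
    """Return the CN from the 'Subject:' line of `openssl x509 -text` output."""
    for line in openssl_text.splitlines():
        line = line.strip()
        # Match only the 'Subject:' line, not 'Subject Public Key Info:'.
        if line.startswith("Subject:"):
            match = re.search(r"CN\s*=\s*([^,/]+)", line)
            return match.group(1).strip() if match else None
    return None


def registered_as_localhost(openssl_text: str) -> bool:
    """True if the kubelet client certificate names the node 'localhost'."""
    cn = subject_cn(openssl_text)
    if cn is None:
        return False
    # Kubelet client certificates use CN = system:node:<node name>.
    node_name = cn.split(":")[-1]
    return node_name.startswith("localhost")


if __name__ == "__main__":
    # Sample lines copied from the output quoted in this bug.
    sample = (
        "Issuer: CN = kube-csr-signer_@1588779788\n"
        "Subject: O = system:nodes, CN = system:node:localhost\n"
    )
    print(subject_cn(sample))
    print(registered_as_localhost(sample))
```

A node whose certificate subject ends in `localhost` received its client certificate while the hostname was still unset, which matches the symptom described above.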