Bug 1790823
| Summary: | [IPI][Baremetal] sometimes Mdns-publisher (infra pod) advertise node's name as 'localhost' | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Yossi Boaron <yboaron> | |
| Component: | Machine Config Operator | Assignee: | Yossi Boaron <yboaron> | |
| Status: | CLOSED ERRATA | QA Contact: | Nataf Sharabi <nsharabi> | |
| Severity: | unspecified | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 4.4 | CC: | acomabon, asegurap, bschmaus, kgarriso, kni-bugs, obockows, rgregory, rhhi-next-mgmt-qe, rsandu, scuppett, steven.barre, vvoronko, wsun | |
| Target Milestone: | --- | |||
| Target Release: | 4.4.0 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1802675 | Environment: | ||
| Last Closed: | 2020-05-04 11:24:10 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1802675 | |||
|
Description
Yossi Boaron
2020-01-14 10:31:25 UTC
Assigning to baremetal team as this doesn't seem directly MCO related. Actually, I'm not even sure this should be filed under MCO at all. Talked to Brad and he suggested I move this to KNI-Deployment.
The reason mdns-publisher advertises the node's name as 'localhost.ostest.test.metalkube.org' is that the MCO re-configures the DHCP client, so 'hostname' may not be set yet on the first boot. baremetal-runtimecfg also runs on first boot, and in that case it configures mdns-publisher with the wrong hostname.
To test the fix on a virtual environment, I did the following:
1. Install a fully functional environment.
2. Configure root access to one of the master nodes:
from baremetal -> ssh kni@provisionhost -> ssh core@master-0 -> sudo -s -> passwd
3. Via virtual machine manager / virsh console master-0 (from baremetal), shut down ens4.
4. Enter the master console and delete the following: /etc/mdns/hostname, /etc/mdns/config.hcl
5. Edit the baremetal XML (virtual machine manager / CLI) and configure the DHCP to give the master the name "localhost" [1].
6. Re-enable ens4.
7. Restart the required master node.
8. Via console, 'cat /etc/mdns/hostname' -> should be 'localhost'
9. Run the following:
sudo crictl ps -a | grep verify
2f55e57daee90 891ab60d7c8933530ae270a93b54ea6bd7201638c99b528d73bcaedade9198e9 38 seconds ago Running verify-hostname
sudo crictl logs 2f55e57daee90
hostname is still localhost
hostname is still localhost
This shows the fix is working: mdns won't work until the hostname is something other than 'localhost'.
10. Revert the change from step 5 to the original name [2].
11. To apply the new configuration on baremetal, run:
service libvirtd restart
12. To renew the DHCP name and make mdns work, run on the master node:
dhclient -r
dhclient
13. Wait approx. 5 minutes.
14. From the master node:
cat /etc/mdns/hostname -> should be master-X
15. From the master node:
cat /etc/mdns/config.hcl -> should be [3]
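The verify-hostname behaviour seen in step 9 can be approximated with a small polling loop. The following is a minimal sketch, not the actual container code: the function names, the `HOSTNAME_CMD` override, and the retry and delay defaults are all assumptions made for illustration. Only the log message is taken from the output above.

```shell
#!/bin/sh
# Hypothetical sketch of a verify-hostname style check (not the real container code).

# is_placeholder_hostname NAME
# Succeeds when NAME is "localhost" or starts with "localhost."
# (for example localhost.ostest.test.metalkube.org).
is_placeholder_hostname() {
    case "$1" in
        localhost|localhost.*) return 0 ;;
        *) return 1 ;;
    esac
}

# wait_for_real_hostname [MAX_TRIES] [DELAY_SECONDS]
# Polls until the hostname is no longer a placeholder, then prints it.
# HOSTNAME_CMD is an assumed override for testing; it defaults to `hostname`.
wait_for_real_hostname() {
    max_tries="${1:-60}"
    delay="${2:-5}"
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        name="$(${HOSTNAME_CMD:-hostname})"
        if ! is_placeholder_hostname "$name"; then
            echo "$name"
            return 0
        fi
        echo "hostname is still localhost" >&2
        tries=$((tries + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, `wait_for_real_hostname 60 5` would poll for roughly five minutes before giving up, which lines up with the wait in step 13.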
[1]
<dhcp>
<range start="192.168.123.100" end="192.168.123.150"/>
<host mac="52:54:00:30:5f:ea" name="localhost" ip="192.168.123.146"/>
<host mac="52:54:00:eb:0f:4b" name="master-1" ip="192.168.123.121"/>
<host mac="52:54:00:fe:02:78" name="master-2" ip="192.168.123.128"/>
<host mac="52:54:00:9b:b3:1e" name="worker-0" ip="192.168.123.140"/>
<host mac="52:54:00:08:05:59" name="worker-1" ip="192.168.123.118"/>
<host mac="52:54:00:6b:55:89" name="provisionhost-0" ip="192.168.123.141"/>
</dhcp>
</ip>
[2]
<dhcp>
<range start="192.168.123.100" end="192.168.123.150"/>
<host mac="52:54:00:30:5f:ea" name="master-0" ip="192.168.123.146"/>
<host mac="52:54:00:eb:0f:4b" name="master-1" ip="192.168.123.121"/>
<host mac="52:54:00:fe:02:78" name="master-2" ip="192.168.123.128"/>
<host mac="52:54:00:9b:b3:1e" name="worker-0" ip="192.168.123.140"/>
<host mac="52:54:00:08:05:59" name="worker-1" ip="192.168.123.118"/>
<host mac="52:54:00:6b:55:89" name="provisionhost-0" ip="192.168.123.141"/>
</dhcp>
</ip>
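As an alternative to hand-editing the XML between [1] and [2], libvirt's `virsh net-update` can modify a single DHCP host entry. This is an untested sketch: the helper names are hypothetical, the network name is whatever the environment uses (the steps above call it baremetal), and whether this avoids the libvirtd restart in step 11 has not been verified.

```shell
#!/bin/sh
# dhcp_host_xml MAC NAME IP
# Prints a libvirt <host> element in the same shape as the entries in [1] and [2].
dhcp_host_xml() {
    printf "<host mac='%s' name='%s' ip='%s'/>" "$1" "$2" "$3"
}

# set_dhcp_hostname NETWORK MAC NAME IP
# Assumption: NETWORK is an existing libvirt network that already has a
# DHCP host entry for MAC/IP, so "modify" has something to update.
set_dhcp_hostname() {
    virsh net-update "$1" modify ip-dhcp-host \
        "$(dhcp_host_xml "$2" "$3" "$4")" --live --config
}
```

Reverting master-0 from "localhost" back to its original name would then look like `set_dhcp_hostname baremetal 52:54:00:30:5f:ea master-0 192.168.123.146`.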
[3] cat config.hcl
bind_address = "192.168.123.121"
collision_avoidance = "hostname"
service {
name = "ocp-edge-cluster Etcd"
host_name = "etcd-1.local."
type = "_etcd-server-ssl._tcp"
domain = "local."
port = 2380
ttl = 3200
}
service {
name = "ocp-edge-cluster Workstation"
host_name = "master-1.local."
type = "_workstation._tcp"
domain = "local."
port = 42424
ttl = 3200
}
service {
name = "ocp-edge-cluster EtcdWorkstation"
host_name = "etcd-1.local."
type = "_workstation._tcp"
domain = "local."
port = 42424
ttl = 300
}
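The check in step 15 can also be scripted instead of read by eye. The sketch below is an illustration only: the function names are hypothetical, and it assumes each `host_name` entry sits on its own line in the form shown in [3].

```shell
#!/bin/sh
# mdns_hostnames FILE
# Prints the value of every host_name entry in an mdns-publisher config file.
mdns_hostnames() {
    sed -n 's/^[[:space:]]*host_name[[:space:]]*=[[:space:]]*"\(.*\)".*$/\1/p' "$1"
}

# config_has_localhost FILE
# Succeeds if any host_name is "localhost" or begins with "localhost.".
config_has_localhost() {
    mdns_hostnames "$1" | grep -Eq '^localhost(\.|$)'
}
```

On a correctly configured node, `config_has_localhost /etc/mdns/config.hcl` should fail (non-zero exit status), while `mdns_hostnames /etc/mdns/config.hcl` should list names such as master-1.local. and etcd-1.local.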
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0581 |