> After deployment, the machine configured with nmstate in install-config didn't get a proper name (presumably because the naming is done by the DHCP server)

Note that by default NetworkManager will try to derive the hostname from DHCP or, failing that, a reverse DNS lookup. So in the case where it's not provided via DHCP, you'll need to ensure there's a PTR record mapping the static IP to the DNS name (which should then be used by NM to set the hostname, AFAIK).
I was just reminded that last week we were able to log in to the host even after reboot. It seems like the newer version has different behavior. The host doesn't receive an IP address on enp0s4 after reboot:

[root@sealusa3 ~]# virsh console master-0-0
Connected to domain master-0-0
Escape character is ^] (Ctrl + ])
[59351.953782] overlayfs: unrecognized mount option "volatile" or missing value
Password:
Login incorrect

localhost login: [59407.951065] overlayfs: unrecognized mount option "volatile" or missing value
login: timed out after 60 seconds

Red Hat Enterprise Linux CoreOS 410.84.202201081937-0 (Ootpa) 4.10
Ignition: ran on 2022/01/09 10:34:58 UTC (at least 2 boots ago)
Ignition: user-provided config was applied
SSH host key: SHA256:KxzDDMYTb4c6e24AK0Jh7zwhQcV8Szs1w0JS8PQbfsQ (ECDSA)
SSH host key: SHA256:f/aphkZM8I5wva45/fLXeYAzbWoPfAtj1DAdadlnAAk (ED25519)
SSH host key: SHA256:HEPcJSxfynjZkP+aEPS8WUo0PFNgk/OpPVHMeRwNZJI (RSA)
enp0s3: 172.22.0.59 fe80::5054:ff:fe26:d659
enp0s4:

localhost login: [59463.984038] overlayfs: unrecognized mount option "volatile" or missing value
Thanks Zane. Back to the hostname: following the suggestion by @shardy, we ran a deployment with DNS configuration for the host that gets the static IP. This doesn't seem to have changed the behavior. Suggestions or comments on wrong configuration are welcome.

[root@sealusa3 ~]# virsh net-edit baremetal-0
.....
<domain name='ocp-edge-cluster-0.qe.lab.redhat.com' localOnly='yes'/>
<dns enable='yes'>
  <forwarder domain='apps.ocp-edge-cluster-0.qe.lab.redhat.com' addr='127.0.0.1'/>
  <forwarder domain='api.ocp-edge-cluster-0.qe.lab.redhat.com' addr='127.0.0.1'/>
  <host ip='192.168.123.1'>
    <hostname>registry</hostname>
    <hostname>hypervisor</hostname>
  </host>
  <host ip='192.168.123.11'>
    <hostname>master-0-0</hostname>
    <hostname>master-0-0.ocp-edge-cluster-0.qe.lab.redhat.com</hostname>
  </host>
....

install-config.yaml:
.....
networkConfig: |
  routes:
    config:
    - destination: 0.0.0.0/0
      next-hop-address: 192.168.123.1
      next-hop-interface: enp0s4
  dns-resolver:
    config:
      server:
      - 192.168.123.1
  interfaces:
  - name: enp0s4
    type: ethernet
    state: up
    ipv4:
      address:
      - ip: "192.168.123.11"
        prefix-length: 24
      enabled: true
.....
[kni@provisionhost-0-0 ~]$ oc get nodes
NAME                    STATUS   ROLES    AGE   VERSION
localhost.localdomain   Ready    master   54m   v1.22.1+6859754
master-0-1              Ready    master   61m   v1.22.1+6859754
master-0-2              Ready    master   61m   v1.22.1+6859754
worker-0-0              Ready    worker   32m   v1.22.1+6859754
worker-0-1              Ready    worker   30m   v1.22.1+6859754
[kni@provisionhost-0-0 ~]$

sh-4.4# nmcli dev show br-ex
GENERAL.DEVICE:                         br-ex
GENERAL.TYPE:                           ovs-interface
GENERAL.HWADDR:                         52:54:00:EE:99:D8
GENERAL.MTU:                            1500
GENERAL.STATE:                          100 (connected)
GENERAL.CONNECTION:                     ovs-if-br-ex
GENERAL.CON-PATH:                       /org/freedesktop/NetworkManager/ActiveConnection/8
IP4.ADDRESS[1]:                         192.168.123.11/24
IP4.GATEWAY:                            192.168.123.1
IP4.ROUTE[1]:                           dst = 192.168.123.0/24, nh = 0.0.0.0, mt = 800
IP4.ROUTE[2]:                           dst = 169.254.169.0/30, nh = 192.168.123.1, mt = 0
IP4.ROUTE[3]:                           dst = 172.30.0.0/16, nh = 192.168.123.1, mt = 0
IP4.ROUTE[4]:                           dst = 0.0.0.0/0, nh = 192.168.123.1, mt = 800
IP4.DNS[1]:                             192.168.123.1
IP6.GATEWAY:                            --
sh-4.4#
On the host that's coming up as localhost, please can you check the reverse DNS lookup, e.g. dig -x 192.168.123.11 in the example above. We need to confirm there is a PTR record for NM to derive the hostname from the IP. Also, please can you save the journal (either for all services, or at least for NetworkManager) somewhere? Thanks!
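For reference, the checks requested above amount to something like the following, run on the affected host (the IP is taken from the example earlier in the thread; adjust as needed):

```shell
# Check for a PTR record for the static IP. NM can only derive the
# hostname from reverse DNS if this returns the expected FQDN,
# e.g. master-0-0.ocp-edge-cluster-0.qe.lab.redhat.com.
dig -x 192.168.123.11 +short

# Save the NetworkManager journal for attaching to the bug.
journalctl -u NetworkManager > nm-journal.log
```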
Hmm, I tried reproducing this in my dev env and was not able to. The node I assigned a static IP to came back up with the same IP and hostname as before. I'll have to take a closer look at your setup to see if I can find any differences.
I was able to reproduce this today (I'm not positive what changed from yesterday, but looking into that too) and it looks like it's something to do with configure-ovs. In the logs after the reboot, I first see:

Jan 19 17:40:43 master-0.ostest.test.metalkube.org configure-ovs.sh[2516]: + nmcli -g all c show
Jan 19 17:40:43 master-0.ostest.test.metalkube.org configure-ovs.sh[2516]: Wired Connection:fb66f30a-840d-4bc4-aad8-8e3642322600:802-3-ethernet:1642614036:Wed Jan 19 17\:40\:36 2022:yes:0:no:/org/freedesktop/NetworkManager/Settings/1:ye>
[snip]
Jan 19 17:40:43 master-0.ostest.test.metalkube.org configure-ovs.sh[2516]: enp2s0:60ed6e8e-7990-4691-b563-dad469da1faf:802-3-ethernet:1642546018:Tue Jan 18 22\:46\:58 2022:yes:0:no:/org/freedesktop/NetworkManager/Settings/2:no:::::/etc/>
Jan 19 17:40:43 master-0.ostest.test.metalkube.org configure-ovs.sh[2516]: + ip -d address show
[snip]
Jan 19 17:40:43 master-0.ostest.test.metalkube.org configure-ovs.sh[2516]: 3: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ovs-system state UP group default qlen 1000
Jan 19 17:40:43 master-0.ostest.test.metalkube.org configure-ovs.sh[2516]:     link/ether 00:63:9d:e0:32:a8 brd ff:ff:ff:ff:ff:ff promiscuity 1 minmtu 68 maxmtu 65535
Jan 19 17:40:43 master-0.ostest.test.metalkube.org configure-ovs.sh[2516]:     openvswitch_slave numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
[snip]
Jan 19 17:40:43 master-0.ostest.test.metalkube.org configure-ovs.sh[2516]: 8: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
Jan 19 17:40:43 master-0.ostest.test.metalkube.org configure-ovs.sh[2516]:     link/ether 00:63:9d:e0:32:a8 brd ff:ff:ff:ff:ff:ff promiscuity 1 minmtu 68 maxmtu 65535
Jan 19 17:40:43 master-0.ostest.test.metalkube.org configure-ovs.sh[2516]:     openvswitch numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
Jan 19 17:40:43 master-0.ostest.test.metalkube.org configure-ovs.sh[2516]:     inet 192.168.111.30/24 brd 192.168.111.255 scope global noprefixroute br-ex
Jan 19 17:40:43 master-0.ostest.test.metalkube.org configure-ovs.sh[2516]:        valid_lft forever preferred_lft forever

This is pretty much what I would expect to see. Note the .30 address on br-ex, which is the static IP I configured on the node. However, after configure-ovs tears down the existing bridge so it can re-create it, I see:

Jan 19 17:40:43 master-0 configure-ovs.sh[2516]: + nmcli -g all c show
Jan 19 17:40:43 master-0 configure-ovs.sh[2516]: Wired Connection:fb66f30a-840d-4bc4-aad8-8e3642322600:802-3-ethernet:1642614043:Wed Jan 19 17\:40\:43 2022:yes:0:no:/org/freedesktop/NetworkManager/Settings/1:yes:enp1s0:activated:/org/fr>
Jan 19 17:40:43 master-0 configure-ovs.sh[2516]: Wired Connection:fb66f30a-840d-4bc4-aad8-8e3642322600:802-3-ethernet:1642614043:Wed Jan 19 17\:40\:43 2022:yes:0:no:/org/freedesktop/NetworkManager/Settings/1:yes:enp2s0:activated:/org/fr>
Jan 19 17:40:43 master-0 configure-ovs.sh[2516]: enp2s0:60ed6e8e-7990-4691-b563-dad469da1faf:802-3-ethernet:1642546018:Tue Jan 18 22\:46\:58 2022:yes:0:no:/org/freedesktop/NetworkManager/Settings/2:no:::::/etc/NetworkManager/system-conn>
Jan 19 17:40:43 master-0 configure-ovs.sh[2516]: + ip -d address show
[snip]
Jan 19 17:40:43 master-0 configure-ovs.sh[2516]: 3: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
Jan 19 17:40:43 master-0 configure-ovs.sh[2516]:     link/ether 00:63:9d:e0:32:a8 brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
Jan 19 17:40:43 master-0 configure-ovs.sh[2516]:     inet 192.168.111.20/24 brd 192.168.111.255 scope global dynamic noprefixroute enp2s0
Jan 19 17:40:43 master-0 configure-ovs.sh[2516]:        valid_lft 3600sec preferred_lft 3600sec
Jan 19 17:40:43 master-0 configure-ovs.sh[2516]:     inet6 fe80::263:9dff:fee0:32a8/64 scope link tentative noprefixroute
Jan 19 17:40:43 master-0 configure-ovs.sh[2516]:        valid_lft forever preferred_lft forever

Now enp2s0 has reverted to the DHCP address. Also note that in the nmcli output "Wired Connection" has changed and there is now a separate entry for enp1s0 and enp2s0. I think that must be overriding the enp2s0 connection that we created on day 1. Note that this did _not_ happen the first time I rebooted this node, which is why it came up correctly. It's not clear to me yet what is triggering this behavior, though.
Created attachment 1852594 [details]
config and logs from reboot

Okay, I'm still not sure how to fix this, but it's definitely a problem with the process where configure-ovs tears down the old bridge and replaces it with a new one. The node initially comes up with the correct address, but after NetworkManager is restarted it seems to revert the interface to DHCP. I'm going to have to ask for help from the NetworkManager and/or SDN teams to figure this out.
This has been reproduced today on real baremetal.
Changing component to SDN.
@bnemec There are two interfaces on the node: enp1s0 and enp2s0. There is a NM connection profile 60ed6e8e-7990-4691-b563-dad469da1faf for enp2s0 that configures the static IP. The node is booted with karg ip=dhcp, which causes generation of profile fb66f30a-840d-4bc4-aad8-8e3642322600 to configure *ANY* interface with DHCP.

This means that enp2s0 can be indistinctly activated with either profile 60ed6e8e-7990-4691-b563-dad469da1faf or fb66f30a-840d-4bc4-aad8-8e3642322600. When you look at this configuration statically, it makes no sense. You should either configure karg ip=enp1s0:dhcp or configure profile 60ed6e8e-7990-4691-b563-dad469da1faf with more priority than the default.

Looking at things dynamically, something makes the node always boot with enp2s0 activated on the expected profile. Then, when configure-ovs reloads NM, it triggers some kind of round robin which makes the profile for enp2s0 switch from 60ed6e8e-7990-4691-b563-dad469da1faf to fb66f30a-840d-4bc4-aad8-8e3642322600.
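To make the ambiguity concrete, here is a small sketch of the autoconnect tie-break (my simplified model, not NM's actual implementation): candidates are ranked by connection.autoconnect-priority, so while both profiles sit at the default priority 0 the choice between them is ambiguous, and raising the static profile above 0 makes it deterministic. The profile names below are illustrative stand-ins for the two UUIDs from the logs:

```shell
# Two candidate profiles for enp2s0, as name:autoconnect-priority.
# "static-enp2s0" stands in for 60ed6e8e-..., "any-dhcp" for fb66f30a-...
# With both at priority 0 either could win; at priority 1 the static
# profile always sorts first.
candidates='static-enp2s0:1
any-dhcp:0'

# Rank by priority, descending, and take the winner.
winner=$(printf '%s\n' "$candidates" | sort -t: -k2,2nr | head -n1 | cut -d: -f1)
echo "$winner"
```

On a live host, the priority bump would be something like `nmcli con modify 60ed6e8e-7990-4691-b563-dad469da1faf connection.autoconnect-priority 1`.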
I was able to make this work with this machine-config:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: 10-static-workaround-master
spec:
  config:
    ignition:
      version: 3.2.0
    systemd:
      units:
      - contents: |
          [Unit]
          Description=Static IP Workaround
          Wants=NetworkManager-wait-online.service
          After=NetworkManager-wait-online.service
          Before=ovs-configuration.service
          [Service]
          Type=oneshot
          ExecStart=/bin/bash -c "for i in $(nmcli --fields NAME,UUID -t con show | grep 'Wired Connection' | awk -F : '{print $2}'); do nmcli con modify $i match.interface-name '!enp2s0'; done"
          [Install]
          WantedBy=multi-user.target
        enabled: true
        name: static-workaround.service

I _think_ a more proper fix would be to set connection.autoconnect-priority to something >0 on the static connection profile, but I'm not sure nmstate exposes that option. I'm checking with that team to see if there's some way to do it.

Note that we can't change boot parameters because the provisioning layer doesn't understand the network config; it just passes it through to the host. There's no automated way for us to tell which interfaces should be excluded from DHCP.
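For anyone adapting the unit above: the ExecStart one-liner just extracts the UUIDs of the auto-generated "Wired Connection" profiles from nmcli's terse output and excludes the statically configured interface from each of them. The extraction can be checked offline against a captured sample (the sample below is hypothetical, modeled on the terse output format and the UUIDs seen in the logs):

```shell
# Hypothetical sample of `nmcli --fields NAME,UUID -t con show` output.
sample='Wired Connection:fb66f30a-840d-4bc4-aad8-8e3642322600
enp2s0:60ed6e8e-7990-4691-b563-dad469da1faf'

# Same selection the unit performs: UUIDs of "Wired Connection" profiles
# only (an exact first-field match, equivalent to the grep|awk pipeline
# for this data).
uuids=$(printf '%s\n' "$sample" | awk -F: '$1 == "Wired Connection" {print $2}')
echo "$uuids"

# On a real host each UUID would then be passed to:
#   nmcli con modify "$uuid" match.interface-name '!enp2s0'
```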
One thing we could do is not pass any ip= arguments if the user has specified some network config. A corollary of this is that if the user specifies *any* network config, they'd be responsible for providing *all* of the network config necessary for the Node to come up. (So e.g. if they have an IPv6 cluster they'll need to explicitly specify that the interface must wait for IPv6, something that we do for them if they don't provide any network config.) Maybe that is OK? If the alternative is that the user has to manually set the priority (if that is even possible) on each interface then that is not much better. I guess we could manually adjust the generated NetworkManager keyfiles to set the priorities for all interfaces higher than for the default, but that's a last resort.
- How upgrade-able would forcing users to create profiles in a given manner be, when they might have already created them? Are all profiles re-generated on an upgrade procedure?
- Maybe we can handle increasing the priority in configure-ovs automatically?
- Should we handle this for SDN as well? While configure-ovs does not reload NM for SDN, some other workflow could do it and trigger the problem there as well.
AFAIK, NM already generates default DHCP profiles for wired connections without the need for an ip=dhcp kernel argument. It also generates them with -999 priority, and generates a separate one for each device. Would this be sufficient?
(In reply to Ben Nemec from comment #20)
> I was able to make this work with this machine-config:
>
> apiVersion: machineconfiguration.openshift.io/v1
> kind: MachineConfig
> metadata:
>   labels:
>     machineconfiguration.openshift.io/role: master
>   name: 10-static-workaround-master
> spec:
>   config:
>     ignition:
>       version: 3.2.0
>     systemd:
>       units:
>       - contents: |
>           [Unit]
>           Description=Static IP Workaround
>           Wants=NetworkManager-wait-online.service
>           After=NetworkManager-wait-online.service
>           Before=ovs-configuration.service
>           [Service]
>           Type=oneshot
>           ExecStart=/bin/bash -c "for i in $(nmcli --fields NAME,UUID -t con show | grep 'Wired Connection' | awk -F : '{print $2}'); do nmcli con modify $i match.interface-name '!enp2s0'; done"
>           [Install]
>           WantedBy=multi-user.target
>         enabled: true
>         name: static-workaround.service
>
> I _think_ a more proper fix would be to set connection.autoconnect-priority
> to something >0 on the static connection profile, but I'm not sure nmstate
> exposes that option. I'm checking with that team to see if there's some way
> to do it.
>
> Note that we can't change boot parameters because the provisioning layer
> doesn't understand the network config, it just passes it through to the
> host. There's no automated way for us to tell which interfaces should be
> excluded from DHCP.

I have successfully reproduced this in my environment.
Clarification:

[core@master-0-0 ~]$ ip route | grep static
default via 192.168.123.1 dev br-ex proto static metric 800

After applying the machine config, the route is no longer "auto" but static, as it should be. After rebooting the node with the machineconfig applied, the node preserves its IP.

@vpickard There is still a discussion to be had on whether this can be integrated into the installation process, but if not, it needs to be documented for the customers.
The issue is reproduced on a node with day1 networking that is added to a cluster:

networking secret:

Before node reboot:

[core@openshift-worker-3 ~]$ nmcli con show ovs-if-br-ex
..............
ipv4.method:                            manual
ipv4.dns:                               10.46.0.31
ipv4.dns-search:                        --
ipv4.dns-options:                       --
ipv4.dns-priority:                      40
ipv4.addresses:                         10.46.29.136/25
ipv4.gateway:                           --
ipv4.routes:                            { ip = 0.0.0.0/0, nh = 10.46.29.254 table=254 }
ipv4.route-metric:                      -1
ipv4.route-table:                       0 (unspec)
ipv4.routing-rules:                     --

After node reboot:

[core@openshift-worker-3 ~]$ nmcli con show ovs-if-br-ex
..........
ipv4.method:                            auto
ipv4.dns:                               --
ipv4.dns-search:                        --
ipv4.dns-options:                       --
ipv4.dns-priority:                      0
ipv4.addresses:                         --
ipv4.gateway:                           --
ipv4.routes:                            --
ipv4.route-metric:                      49
ipv4.route-table:                       0 (unspec)
ipv4.routing-rules:                     --
It looks like the issue does not reproduce on an environment with no DHCP server:

[core@master-0-0 ~]$ nmcli con show ovs-if-br-ex
connection.id:                          ovs-if-br-ex
connection.uuid:                        34872a8f-3b68-4e30-b34b-66dee095395d
connection.stable-id:                   --
connection.type:                        ovs-interface
connection.interface-name:              br-ex
connection.autoconnect:                 yes
connection.autoconnect-priority:        0
........
ipv4.method:                            manual
ipv4.dns:                               192.168.123.1
ipv4.dns-search:                        --
ipv4.dns-options:                       --
ipv4.dns-priority:                      40
ipv4.addresses:                         192.168.123.11/24
ipv4.gateway:                           --
ipv4.routes:                            { ip = 0.0.0.0/0, nh = 192.168.123.1 table=254 }
ipv4.route-metric:                      -1
ipv4.route-table:                       0 (unspec)
ipv4.routing-rules:                     --
ipv4.ignore-auto-routes:                no
ipv4.ignore-auto-dns:                   no
ipv4.dhcp-client-id:                    mac
ipv4.dhcp-iaid:                         --
[core@master-0-0 ~]$ sudo reboot
Connection to master-0-0.ocp-edge-cluster-0.qe.lab.redhat.com closed by remote host.
Connection to master-0-0.ocp-edge-cluster-0.qe.lab.redhat.com closed.

[kni@provisionhost-0-0 ~]$ ssh core.qe.lab.redhat.com
Red Hat Enterprise Linux CoreOS 410.84.202202142040-0
  Part of OpenShift 4.10, RHCOS is a Kubernetes native operating system managed
  by the Machine Config Operator (`clusteroperator/machine-config`).

WARNING: Direct SSH access to machines is not recommended; instead, make
configuration changes via `machineconfig` objects:
  https://docs.openshift.com/container-platform/4.10/architecture/architecture-rhcos.html

---
Last login: Wed Feb 16 17:04:49 2022 from 192.168.123.56
[systemd]
Failed Units: 1
  NetworkManager-wait-online.service
[core@master-0-0 ~]$ nmcli con show ovs-if-br-ex
connection.id:                          ovs-if-br-ex
connection.uuid:                        be29452f-5b54-40d4-b75f-20fe043e0166
connection.stable-id:                   --
connection.type:                        ovs-interface
connection.interface-name:              br-ex
connection.autoconnect:                 yes
connection.autoconnect-priority:        0
.....................
ipv4.method:                            manual
ipv4.dns:                               192.168.123.1
ipv4.dns-search:                        --
ipv4.dns-options:                       --
ipv4.dns-priority:                      40
ipv4.addresses:                         192.168.123.11/24
ipv4.gateway:                           --
ipv4.routes:                            { ip = 0.0.0.0/0, nh = 192.168.123.1 table=254 }
ipv4.route-metric:                      -1
ipv4.route-table:                       0 (unspec)
ipv4.routing-rules:                     --
[core@master-0-0 ~]$
@bnemec I just tried to apply the networkConfig workaround to a cluster that was deployed with 4.8, then upgraded to 4.10 and scaled up with the new node having a static IP. It doesn't seem to have worked.

must-gather: http://rhos-compute-node-10.lab.eng.rdu2.redhat.com/logs/must-gather.local.2689457161302658140.tar.gz

ClusterID: da2388a3-ca07-43f5-a002-4a47081c6276
ClusterVersion: Stable at "4.10.0-0.nightly-2022-03-01-224543"
ClusterOperators:
  clusteroperator/dns is progressing: DNS "default" reports Progressing=True: "Have 5 available node-resolver pods, want 6."
  clusteroperator/machine-config is not available (Cluster not available for [{operator 4.10.0-0.nightly-2022-03-01-224543}]) because Failed to resync 4.10.0-0.nightly-2022-03-01-224543 because: failed to apply machine config daemon manifests: timed out waiting for the condition during waitForDaemonsetRollout: Daemonset machine-config-daemon is not ready. status: (desired: 6, updated: 6, ready: 5, unavailable: 1)
  clusteroperator/monitoring is not available (Rollout of the monitoring stack failed and is degraded. Please investigate the degraded status error.) because Failed to rollout the stack. Error: updating alertmanager: waiting for Alertmanager object changes failed: waiting for Alertmanager openshift-monitoring/main: expected 2 replicas, got 1 updated replicas
  updating prometheus-k8s: waiting for Prometheus object changes failed: waiting for Prometheus openshift-monitoring/k8s: expected 2 replicas, got 1 updated replicas
  clusteroperator/network is degraded because DaemonSet "openshift-multus/multus" rollout is not making progress - last change 2022-03-06T08:05:38Z
  DaemonSet "openshift-multus/multus-additional-cni-plugins" rollout is not making progress - last change 2022-03-06T08:05:38Z
  DaemonSet "openshift-ovn-kubernetes/ovnkube-node" rollout is not making progress - last change 2022-03-06T08:05:38Z
Regarding Adina's previous comment, note that testing the workaround on a normally provisioned cluster (one not upgraded from 4.8) seemed to have succeeded. Tests were done on IPI on virtual baremetal.
Do you mean rebooting the scaled up node fails, or that the scaleup itself fails?
(In reply to Ben Nemec from comment #36)
> Do you mean rebooting the scaled up node fails, or that the scaleup itself
> fails?

I mean that applying the workaround failed. The error above is seen after trying to create the machine-config for workers. Again, please note that this is a cluster that was deployed with 4.8, then upgraded to 4.10, and then scaled up with day1 network configuration.
I used real baremetal to reproduce this issue. I noticed two points:

1. After the machine is restarted, the static IP address is overwritten by DHCP only on the worker, not on the master.
2. After the worker's static IP address is overwritten by DHCP, its corresponding Machine CR still displayed the manually configured IP in the Status.Addresses.Address field.

In addition, the network set by day1 networking will eventually be saved to [BMH.Spec.Provisioningnetworkdataname](https://github.com/metal3-io/baremetal-operator/blob/main/apis/metal3.io/v1alpha1/baremetalhost_types.go#L376-L380), and I'm not sure if it should continue to work after the BMH is provisioned, as this configuration is no longer present in BMH.Spec.
(In reply to Adina Wolff from comment #34)
> @bnemec I just tried to apply the networkConfig workaround to a
> cluster that was deployed with 4.8, then upgraded to 4.10 and scaled up with
> the new node having static ip.
> It doesn't seem to have worked.

It's not possible to use network config when scaling up clusters installed pre-4.10. These clusters will still be installing the QCOW (not from the live ISO), so they won't get any of the network config applied to the nodes. Manual intervention is needed to change the image type in the MachineSet before scaling up will be able to use network config.

Given that it's not possible to have installed 4.8 (or 4.9) in an environment where static IPs are required, I'm not sure that this is a test case we need to worry about.
@zbitter Indeed, I updated the image in the machineSet before I was able to properly scale up. (I got instructions from @shardy.) If you are saying that this scenario shouldn't be supported at all, we can stop testing it.
(In reply to Adina Wolff from comment #40)
> @zbitter Indeed, I updated the image in the machineSet before I
> was able to properly scale-up. (I got instructions from @shardy)

Ah, OK. If the MachineSet is updated then it should work the same as a fresh 4.10 cluster, so if it doesn't, that is indeed a bug.
Thanks @zbitter, so would you suggest opening a separate bug to track this, or is the comment here sufficient?
Here is probably sufficient if there's no evidence of a separate cause.
*** Bug 2082962 has been marked as a duplicate of this bug. ***
A better workaround for more recent NM versions can be found in https://bugzilla.redhat.com/show_bug.cgi?id=1934122#c24
Closing -- there is a workaround documented.