Hey sperezto. Unfortunately you can't define networking information in multiple ways and have them merged together.

You provide some via Ignition:

```
[   61.790396] ignition[1827]: INFO : files: createFilesystemsFiles: createFiles: op(1c): [started] writing file "/sysroot/etc/sysconfig/network-scripts/route-bond0"
```

and others via the kernel command line (i.e. `ip=`, `bond=`). If the system detects any networking was provided via Ignition, it will not propagate any initramfs networking configuration:

```
[  OK  ] Reached target Switch Root.
[   67.299615] coreos-teardown-initramfs[1983]: info: networking config is defined in the real root
         Starting Switch Root...
[   67.317105] coreos-teardown-initramfs[1983]: info: will not attempt to propagate initramfs networking
```

You'd need to provide the default route information via kernel command line options OR provide the full networking configuration via Ignition files.
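For reference, the kernel command line networking meant here uses dracut's `ip=`/`bond=` syntax. An illustrative example (the addresses, interface names, and bond mode are made up for illustration, not the customer's actual config):

```
bond=bond0:ens3,ens4:mode=802.3ad
ip=10.0.113.102::10.0.113.1:255.255.255.0:worker-1:bond0:none
nameserver=10.0.113.2
```

The gateway field in `ip=` is what provides the default route when this configuration is propagated to the real root.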
(In reply to Dusty Mabe from comment #5)
> Hey sperezto. Unfortunately you can't define networking
> information in multiple ways and have them merged together.
>
> You provide some via Ignition:
>
> ```
> [ 61.790396] ignition[1827]: INFO : files: createFilesystemsFiles:
> createFiles: op(1c): [started] writing file
> "/sysroot/etc/sysconfig/network-scripts/route-bond0"
> ```
>
> and others via kernel command line (i.e. ip= bond=). If the system detects
> any networking was provided via Ignition it will not propagate any initramfs
> networking configuration.
>
> ```
> [ OK ] Reached target Switch Root.
> [ 67.299615] coreos-teardown-initramfs[1983]: info: networking config is
> defined in the real root
> Starting Switch Root...
> [ 67.317105] coreos-teardown-initramfs[1983]: info: will not attempt to
> propagate initramfs networking
> ```
>
> Unfortunately you'd need to provide the default route information via kernel
> command line options OR provide the full networking configuration via
> Ignition files.

Hi @Dusty,

Thanks for the quick reply. We've been checking the policies[1][2] and the code to contrast them with our scenario. We think we might have a chicken-and-egg problem here. I'd like to give you a brief summary of what the customer is facing.

They have a really large cluster, around 90 nodes, and they want to add another 40 nodes. The first 90 nodes have been updated from previous OCP versions to 4.6. These 90 nodes' network configuration was set up using kernel parameters plus the route configuration in the Ignition config, which is in a MachineConfig object.

When trying to add new nodes, they're not able to do it because of what you've mentioned, basically the policy that decides whether to propagate initramfs networking via Ignition or dracut:

    if ignition:
        use ignition
    else if dracut:
        pass through dracut to real root

That being said, what options do we have? How can we provide node-specific network configuration such as IP, NAMESERVER, BOND parameters, etc.?
This is the issue we have:

- If only Ignition files: that means we should create a new role and a new MC. Specific network parameters must be added for each new node and must be unique.
- If only dracut: we need to delete the MC that creates the static route, because if we don't, the kernel parameters won't be taken into account. So, what will happen with the nodes we've created before, the ones that are already in use and are under the MC static route config?

What would be the best and least invasive approach?

Thanks in advance,

[1].- https://github.com/coreos/fedora-coreos-tracker/issues/394#issuecomment-599721173
[2].- https://github.com/coreos/fedora-coreos-config/blob/rhcos-4.6/overlay.d/05core/usr/lib/dracut/modules.d/30ignition-coreos/coreos-teardown-initramfs.sh#L36-L63
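For context, the propagation policy in [2] boils down to roughly the following check. This is a paraphrased sketch to illustrate the logic, not the actual script (the function name and directory handling are my own):

```shell
#!/bin/sh
# Sketch of the propagation check in coreos-teardown-initramfs.sh [2]
# (paraphrased for illustration; not the actual script).
sysroot="${1:-/sysroot}"

# Succeeds if Ignition wrote any networking config into the real root.
has_ignition_networking() {
    for d in "$sysroot/etc/NetworkManager/system-connections" \
             "$sysroot/etc/sysconfig/network-scripts"; do
        # directory exists and contains at least one entry
        if [ -d "$d" ] && [ -n "$(ls -A "$d")" ]; then
            return 0
        fi
    done
    return 1
}

if has_ignition_networking; then
    echo "info: networking config is defined in the real root"
    echo "info: will not attempt to propagate initramfs networking"
else
    echo "info: propagating initramfs networking configuration to the real root"
fi
```

This is why a single `route-bond0` written by Ignition is enough to suppress propagation of all the karg-provided configuration.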
This turns out to be a bigger problem than we originally anticipated.

The scenario here is that you have a cluster that was installed some time ago. After the initial install, you realize you need some slight networking tweak (like an added static route). You add this static route to existing nodes via the MCO (i.e. writing out `/etc/sysconfig/network-scripts/route-bond0`). Later you try to deploy a new worker node. Since the `route-bond0` file gets written out by Ignition during the initramfs, the initramfs network propagation code sees "networking configuration" that was written by Ignition and decides not to propagate any initramfs networking configuration (kernel args) to the real root. So you can't deploy new nodes.

We'd like to not have this problem in the future, so we are discussing it more widely amongst our teams. For now there are a few possible workarounds that shouldn't require manually running commands on individual nodes. Of course, manually running commands on individual nodes is still an option if you prefer. Here is what we came up with:

A. Write a systemd unit that checks for the existence of the `route-bond0` file and creates it if it doesn't exist.
   - The systemd unit is delivered as a machine config.
   - The systemd unit runs in the real root before NetworkManager is started.
   - You'll also remove the existing machine config `route-bond0` file entry.
   - This works around the check for networking configuration that happens at the end of the initramfs (allows karg networking to persist).
   - You'll need to make sure the created file has the appropriate SELinux context.
   - To minimize disruption (fewer reboots), the best way to do this is:
     - pause the corresponding machineconfigpool
     - delete the MC with the file entry
     - add the new MC with the systemd unit
     - unpause the machineconfigpool

B. Machine Config Pool musical chairs: the idea here is to move all existing nodes into a new custom machineconfigpool, which will have the `route-bond0` MC entry.
New nodes joining the cluster can then boot into a worker pool without that MC, such that the karg-provided networking configuration gets propagated. The new nodes can then also be moved into the custom pool to add `route-bond0`.
   - This doesn't require any new scripts, but has more steps.
   - This requires you to move new nodes to the custom pool when you boot them.
   - To minimize disruption, the best way to do this is:
     - Create a custom machineconfigpool (henceforth named custom1).
     - Add the same MC with `route-bond0` to the custom pool.
     - Move all current worker nodes into the custom pool by adding the custom1 role label.
     - Remove the MC with `route-bond0` from the worker pool.
     - When you join a new node to the cluster, join it as a worker, and then move it to the custom pool.
     - For details see https://github.com/openshift/machine-config-operator/blob/master/docs/custom-pools.md
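If you go with option A, the MachineConfig carrying the systemd unit could look something like this. A minimal sketch: the MC name, unit name, and exact ordering are assumptions for illustration, and the route values are copied from the `route-bond0` payload elsewhere in this bug; this is not a tested config.

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-create-route-bond0   # hypothetical name
spec:
  config:
    ignition:
      version: 3.1.0
    systemd:
      units:
        - name: create-route-bond0.service   # hypothetical name
          enabled: true
          contents: |
            [Unit]
            Description=Create route-bond0 if missing (real root, pre-NetworkManager)
            Before=NetworkManager.service
            ConditionPathExists=!/etc/sysconfig/network-scripts/route-bond0

            [Service]
            Type=oneshot
            RemainAfterExit=yes
            # Write the route file, then restore its SELinux context.
            ExecStart=/bin/sh -c 'printf "ADDRESS0=8.8.8.8\nNETMASK0=255.255.255.0\nGATEWAY0=10.0.113.1\n" > /etc/sysconfig/network-scripts/route-bond0; restorecon /etc/sysconfig/network-scripts/route-bond0'

            [Install]
            WantedBy=multi-user.target
```

Because the file is created by a unit in the real root rather than by Ignition in the initramfs, the teardown check no longer sees Ignition-provided networking and the karg configuration persists.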
Hi Dusty,

Thanks again for your suggestions. The environment has another particularity: as you said, the cluster was installed some time ago, so all the network configuration is under the directory /etc/sysconfig/network-scripts (legacy networking), because all of their interfaces and network configuration were created in previous OCP/RHCOS versions. For new node deployments, this configuration will be under /etc/NetworkManager/system-connections/, so the route-bond0 won't have any effect.

After thinking about it, we might have another option here. We'd like to know your opinion. Basically, we'll create a new MachineConfigPool with data taken from the original MCP, leaving out the route network configuration. This option allows us to deploy the new node, and after the node is deployed, it will be under the worker MachineConfigPool. Please check the procedure we've followed to test it.

1.- Export the rendered-worker MachineConfig that is currently being used by the "worker" MachineConfigPool:

$ oc get mc $(oc get mcp worker -ojson|jq -r '.spec.configuration.name') -oyaml > machine_config_pool_custom_worker.yml

2.- Change the MachineConfig render name:

$ sed -i s/$(oc get mcp worker -ojson|jq -r '.spec.configuration.name')/rendered-worker-custom-temporary/g machine_config_pool_custom_worker.yml
$ grep rendered-worker machine_config_pool_custom_worker.yml

3.- Delete or comment out the block that references the route-bond0 file:

$ vim machine_config_pool_custom_worker.yml
...
#      - contents:
#          source: data:text/plain;charset=utf-8;base64,QUREUkVTUzA9OC44LjguOApORVRNQVNLMD0yNTUuMjU1LjI1NS4wCkdBVEVXQVkwPTEwLjAuMTEzLjEK
#        mode: 420
#        path: /etc/sysconfig/network-scripts/route-bond0
...

4.- Create a new dummy MachineConfigPool that points to a non-existent label, so as not to affect any node. The configuration will be taken from the rendered config we created in the last step.
$ cat << EOF | oc apply -f -
---
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: workercustomtemporal
spec:
  configuration:
    name: rendered-worker-temporal
  machineConfigSelector:
    matchExpressions:
      - {key: machineconfiguration.openshift.io/role, operator: In, values: [worker02,worker03]}
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker02: ""
...
EOF

5.- Check that the route network configuration is not in the MachineConfigPool definition by querying the API:

$ export APIURL=api-int.docp4.lab.bcnconsulting.com
$ curl -k https://$APIURL:22623/config/workercustomtemporal |jq -r '.storage.files[]|select(.path=="/etc/sysconfig/network-scripts/ens3-route")'

NOTE: As you can see there is no configuration; you can query the worker MachineConfigPool to see the difference:

$ curl -k https://$APIURL:22623/config/worker |jq -r '.storage.files[]|select(.path=="/etc/sysconfig/network-scripts/ens3-route")'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  185k  100  185k    0     0   942k      0 --:--:-- --:--:-- --:--:--  948k
{
  "filesystem": "root",
  "overwrite": false,
  "path": "/etc/sysconfig/network-scripts/ens3-route",
  "contents": {
    "source": "data:text/plain;charset=utf-8;base64,QUREUkVTUzA9OC44LjguOApORVRNQVNLMD0yNTUuMjU1LjI1NS4wCkdBVEVXQVkwPTEwLjAuMTEzLjEK"
  },
  "mode": 420
}

6.- Change the path in the Ignition file that will be fetched on the new node:

$ cat /var/lib/libvirt/images/openshift/ocp-dworker02.ign | tr '\r\n' ' ' | jq
{
  "ignition": {
    "config": {
      "merge": [
        {
          "source": "https://api-int.docp4.lab.bcnconsulting.com:22623/config/workercustomtemporal"
        }
      ]
    },
    "security": {
      "tls": {
        "certificateAuthorities": [
          {
            "source": "data:text/plain;charset=utf-8;base64,...LS1CRUdJTiBDR.."
          }
        ]
      }
    },
    "version": "3.1.0"
  }
}

7.- Start the new node and check the logs, which should be similar to:

[    6.990992] systemd[1]: Started dracut pre-mount hook.
[    7.261979] ignition[743]: GET https://api-int.docp4.lab.bcnconsulting.com:22623/config/worker03: attempt #3
[    7.265327] ignition[743]: GET error: Get "https://api-int.docp4.lab.bcnconsulting.com:22623/config/workercustomtemporal": dial tcp: lookup api-int.docp4.lab.bcnconsulting.com on [::1]:53: read udp [::1]:37018->[::1]:53: read: connection refused
[    8.063681] ignition[743]: GET https://api-int.docp4.lab.bcnconsulting.com:22623/config/worker03: attempt #4
[    8.066457] ignition[743]: GET error: Get "https://api-int.docp4.lab.bcnconsulting.com:22623/config/workercustomtemporal": dial tcp: lookup api-int.docp4.lab.bcnconsulting.com on [::1]:53: read udp [::1]:53736->[::1]:53: read: connection refused
[    9.665212] ignition[743]: GET https://api-int.docp4.lab.bcnconsulting.com:22623/config/worker03: attempt #5
[    9.668339] ignition[743]: GET error: Get "https://api-int.docp4.lab.bcnconsulting.com:22623/config/workercustomtemporal": dial tcp: lookup api-int.docp4.lab.bcnconsulting.com on [::1]:53: read udp [::1]:59600->[::1]:53: read: connection refused
[**    ] A start job is running for Ignition (fetch) (9s / no limit)
[   12.866524] ignition[743]: GET https://api-int.docp4.lab.bcnconsulting.com:22623/config/workercustomtemporal: attempt #6
[   12.934271] ignition[743]: GET result: OK
[   12.972592] ignition[743]: Adding "root-ca" to list of CAs
[   12.974326] ignition[743]: Adding "root-ca" to list of CAs
[   12.985327] ignition[743]: fetched base config from "system"
[   12.988217] ignition[743]: fetched user config from "qemu"
[   12.989914] ignition[743]: fetch: fetch complete
[  OK  ] Started Ignition (fetch)
[   12.992768] ignition[743]: fetched referenced user config from "/config/workercustomtemporal".
[   12.995119] ignition[743]: fetch: fetch passed
[   12.996686] systemd[1]: Started Ignition (fetch).
         Starting Check for FIPS mode...
[   12.998451] ignition[743]: Ignition finished successfully
[   12.999964] systemd[1]: Starting Check for FIPS mode...
[   13.180125] rhcos-fips[797]: Found /etc/ignition-machine-config-encapsulated.json in Ignition config
[   13.212998] rhcos-fips[797]: FIPS mode not requested
[   13.222819] systemd[1]: Started Check for FIPS mode.
[  OK  ] Started Check for FIPS mode.

Thanks in advance,
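Incidentally, the `data:` URL payloads that appear in these MC file entries can be decoded locally to double-check exactly what the route file would contain. For example, for the route-bond0 payload from the commented-out entry above:

```shell
# Decode the base64 part of the data: URL from the MC file entry
# (payload copied from the route-bond0 entry in this bug).
echo 'QUREUkVTUzA9OC44LjguOApORVRNQVNLMD0yNTUuMjU1LjI1NS4wCkdBVEVXQVkwPTEwLjAuMTEzLjEK' \
  | base64 -d
```

This prints the three-line ADDRESS0/NETMASK0/GATEWAY0 route definition, which is a quick way to confirm you commented out the right block before applying the modified rendered config.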
(In reply to sperezto from comment #13)
> For new nodes deployments, this configuration will be under
> /etc/NetworkManager/system-connections/,

It is true that new deployments (4.6+) will create files under `/etc/NetworkManager/system-connections/` when propagating kernel argument networking configuration forward...

> so the route-bond0 won't have any effect.

...but this isn't true. NetworkManager will still read and use `/etc/sysconfig/network-scripts/route-bond0` if it exists. I just tested this locally. Does this make option A a little more attractive now?

> After thinking about it, we might have another option here. We'd like to
> know your opinion. Basically, we'll create a new MachineConfigPool with data
> taken from the original MCP, taking away the route network configuration.
> This option allow us to deploy the new node and after the node is deployed,
> will be under the worker MachineConfigPool. Please check the procedure we've
> followed to test it.

I've talked with a colleague of mine about this. Basically what you've described seems to be working (quite nicely in fact), but there is no guarantee it will work in the future. For example, pulling from https://$APIURL:22623/config/workercustomtemporal may or may not work in the future, as it wasn't intended to be pullable even now (according to my colleague).

I did mention before that we think this situation is something we want to make better in the future, so hopefully the workaround you end up using is temporary. Either way, we should probably steer you towards something that is a little more future-proof than what you proposed. We think option A or B will probably give you that.
Hi Dusty,

Thanks for the comments. We know that the workaround we've suggested is temporary; it's just a workaround to let them add nodes quickly without modifying their current configuration/environment.

Regarding the options you mentioned, we totally agree with you: option A would be the best option going forward, but there are a few things to consider according to the customer's needs.

- They need to add nodes quickly, by the end of this week, and option A implies that around 90 nodes must be rebooted to get it working.
- As to the route file, I might be doing something wrong with it, because I've tried again and it doesn't work. This could be fixed by adding a systemd unit with nmcli or ip route add, as described in the KCS[1].

I'll explain the scenario I've tested, in which adding the route doesn't work. I think what is happening is that when you have an interface created by the new model, its configuration resides under "/etc/NetworkManager/system-connections/", while the route file we created lives under /etc/sysconfig/network-scripts. I've tried this scenario a few times with the same result.

1) The device ens3 is created by the CoreOS/OCP installation under system-connections with connection.id "Wired Connection".
# nmcli conn
NAME              UUID                                  TYPE      DEVICE
Wired Connection  97cabb7f-1045-41c8-a8f7-4270494fa132  ethernet  ens3

# cat /etc/NetworkManager/system-connections/default_connection.nmconnection
[connection]
id=Wired Connection
uuid=97cabb7f-1045-41c8-a8f7-4270494fa132
type=ethernet
multi-connect=3
permissions=
timestamp=1620663753

[ethernet]
mac-address-blacklist=

[ipv4]
dns-search=
method=auto

[ipv6]
addr-gen-mode=eui64
dns-search=
method=auto

[proxy]

2) I've tried to add a route to the interface ens3 with three different file names under the "/etc/sysconfig/network-scripts" directory:

- "route-Wired Connection"
- route-Wired_Connection
- route-ens3

The file contains a simple route to be added:

ADDRESS0=192.168.1.0
NETMASK0=255.255.255.0
GATEWAY0=10.0.113.1

3) I've tried to load the config file with nmcli, restart NetworkManager, and reboot the server:

# nmcli conn load /etc/sysconfig/network-scripts/route-ens3

4) Check routes:

# ip route
default via 10.0.113.1 dev ens3 proto dhcp metric 100
10.0.113.0/24 dev ens3 proto kernel scope link src 10.0.113.102 metric 100
172.0.0.0/16 dev tun0 scope link
172.255.0.0/16 dev tun0

The result was the same: no routes. However, if I create the interface under /etc/sysconfig/network-scripts, then I'm able to create the route, either with the file under /etc/sysconfig/network-scripts or through nmcli.
[root@dworker01 network-scripts]# nmcli conn down "Wired Connection"
[root@dworker01 network-scripts]# cat ifcfg-ens3
BOOTPROTO=none
DEFROUTE=yes
DEVICE=ens3
NAME=ens3
GATEWAY=10.0.113.1
IPADDR=10.0.113.102
NETMASK=255.255.255.0
ONBOOT=yes
TYPE=Ethernet
USERCTL=no
[root@dworker01 network-scripts]# cat route-ens3
ADDRESS0=192.168.1.0
NETMASK0=255.255.255.0
GATEWAY0=10.0.113.1
[root@dworker01 network-scripts]# nmcli conn up ens3
]# nmcli conn
NAME              UUID                                  TYPE      DEVICE
ens3              21d47e65-8523-1a06-af22-6f121086f085  ethernet  ens3
Wired Connection  97cabb7f-1045-41c8-a8f7-4270494fa132  ethernet  -
[root@dworker01 network-scripts]# ip route
default via 10.0.113.1 dev ens3 proto static metric 100
10.0.113.0/24 dev ens3 proto kernel scope link src 10.0.113.102 metric 100
172.0.0.0/16 dev tun0 scope link
172.255.0.0/16 dev tun0
192.168.1.0/24 via 10.0.113.1 dev ens3 proto static metric 100

Let me know your thoughts,

[1].- https://access.redhat.com/solutions/5876771

Thanks in advance,
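For what it's worth, with the new-style profile the route can also be expressed inside the keyfile itself rather than in a legacy route-* file. A hypothetical addition to the `[ipv4]` section of the "Wired Connection" keyfile, using NetworkManager's keyfile `routeN=` syntax (route values from the test above; I have not verified this specific change on RHCOS):

```
# /etc/NetworkManager/system-connections/default_connection.nmconnection
# (only the [ipv4] section shown; route1 line is the hypothetical addition)
[ipv4]
dns-search=
method=auto
route1=192.168.1.0/24,10.0.113.1
```

After editing the keyfile you would reload it (e.g. `nmcli conn reload`) and re-activate the connection for the route to take effect.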
(In reply to sperezto from comment #15)
> Hi Dusty,
>
> Thanks for comments. We know that the workaround we've suggested is
> temporary, is a workaround, just to let them add nodes quickly without
> modify their current configuration/environments.

Understood.

> Regarding the options you mentioned, we totally agreed with you, option A
> would be the best option thinking forward, but there is a few things to
> consider according customer's needs.

Understood.

> - They need to add nodes quickly, by the end of this week, so option A
> implies that around 90 nodes must be rebooted to getting working.

Understood.

> - As to route file, I might be doing something wrong with the route file
> because I've tried again and it doesn't work. This could be fixed adding a
> systemd unit with nmcli or ip route add as it's described in kcs[1].
>
> I'll explain the scenario I've tested to add the route and it doesn't work:
>
> I think what is happening is that when you have an interface created by the
> new model, its configuration should reside under
> "/etc/NetworkManager/system-connections/" and the route file we created
> under /etc/sysconfig/network-scripts. This casuistry, I've tried a few times
> with same result.

I apologize. I could swear in my local testing this was working yesterday, but I just tried to re-confirm and it does not look like it is working. I just tried with 4.6 and 4.7. I must have been mistaken yesterday.

This makes option A less attractive, as we'd have to change it up a bit. I guess since you've got the constraints you listed above, you'll need to go with the other strategy for now anyway.
Hey Dusty,

Thanks again. I think option A is still the good one going forward. One thing to take into account is that it can be done through "nmcli or ip route add", just like support suggests in the KCS[1], no matter how the interface is created.

As to whether we're going to use that workaround, we'll wait for support.

[1].- https://access.redhat.com/solutions/5876771 - Create static routes post cluster installation for a specific worker pool

Cheers,
For problems like this in the future, we have implemented upstream a `coreos.force_persist_ip` kernel argument that can be used to force propagation of initramfs networking configuration (ip= kernel arguments) even if some network configuration was defined in Ignition. Using that kernel argument (once it is in an OpenShift release) should make it much easier for the customer to get unstuck.

Upstream issue/PR:
- https://github.com/coreos/fedora-coreos-tracker/issues/853
- https://github.com/coreos/fedora-coreos-config/pull/1045
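For example, the new karg would simply be appended alongside the existing networking kargs (illustrative values, and only effective once a release containing the change is in use):

```
coreos.force_persist_ip ip=10.0.113.102::10.0.113.1:255.255.255.0:worker-1:bond0:none bond=bond0:ens3,ens4:mode=802.3ad
```

With `coreos.force_persist_ip` present, the teardown logic propagates the ip=/bond= configuration to the real root even when Ignition also wrote networking files.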
The associated boot image BZ had code merged and moved to VERIFIED; moving this to MODIFIED
Verified on registry.ci.openshift.org/ocp/release:4.8.0-0.nightly-2021-07-01-043852, which has RHCOS 48.84.202106091622-0 as the boot image. Using the steps outlined in https://bugzilla.redhat.com/show_bug.cgi?id=1958930#c29, I was able to get the kernel ip= arguments to persist while concurrently having the Ignition network configuration laid out.

State: idle
Deployments:
* ostree://457db8ff03dda5b3ce1a8e242fd91ddbe6a82f838d1b0047c3d4aeaf6c53f572
              Version: 48.84.202106091622-0 (2021-06-09T16:25:42Z)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days