Bug 1895979
Summary: | Unable to get coreos-installer with --copy-network to work | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Jonas Nordell <jnordell> | ||||||
Component: | RHCOS | Assignee: | Jonathan Lebon <jlebon> | ||||||
Status: | CLOSED ERRATA | QA Contact: | Michael Nguyen <mnguyen> | ||||||
Severity: | medium | Docs Contact: | |||||||
Priority: | medium | ||||||||
Version: | 4.6 | CC: | bbreard, bgilbert, imcleod, jlebon, jligon, miabbott, nstielau, sejug | ||||||
Target Milestone: | --- | ||||||||
Target Release: | 4.7.0 | ||||||||
Hardware: | Unspecified | ||||||||
OS: | Unspecified | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||||
Doc Text: |
Cause: The ordering between network-related service units was not strict enough.
Consequence: Sometimes, network configurations copied using `--copy-network` did not take effect on the first reboot into the installed system.
Fix: The ordering of the relevant service units has been fixed.
Result: Network configurations copied using `--copy-network` now always take effect on the first reboot into the installed system.
|
Story Points: | --- | ||||||
Clone Of: | |||||||||
: | 1899286 (view as bug list) | Environment: | |||||||
Last Closed: | 2021-02-24 15:31:28 UTC | Type: | Bug | ||||||
Bug Blocks: | 1899286 | ||||||||
Attachments: |
|
Description
Jonas Nordell
2020-11-09 15:12:00 UTC
Created attachment 1727860: screenshot picture3.png
I was unable to reproduce this in a local VM test. I used the same `nmcli` commands and observed that the NM file was written correctly. Is it possible that your Ignition config is also writing `/etc/NetworkManager/system-connections/default_connection.nmconnection`? Could you provide the full journal from the host for the boot after the install completed? That would show whether Ignition is writing out the same file. A copy of the Ignition configuration would be useful, too.

The boot logs show that coreos-copy-firstboot-network and coreos-teardown-initramfs are picking up and propagating the injected NM config:

```
[    6.861733] coreos-copy-firstboot-network[698]: info: copying files from /mnt/boot_partition/coreos-firstboot-network to /run/NetworkManager/system-connections/
[    6.870657] coreos-copy-firstboot-network[698]: '/mnt/boot_partition/coreos-firstboot-network/default_connection.nmconnection' -> '/run/NetworkManager/system-connections/default_connection.nmconnection'
...
[   17.888523] coreos-teardown-initramfs[1105]: info: no networking config is defined in the real root
[   17.891753] coreos-teardown-initramfs[1105]: info: propagating initramfs networking config to the real root
[   17.906937] coreos-teardown-initramfs[1105]: /usr/bin/coreos-relabel
[   18.085890] coreos-teardown-initramfs[1105]: Relabeled /sysroot//etc/NetworkManager/system-connections/default_connection.nmconnection from (null) to system_u:object_r:NetworkManager_etc_rw_t:s0
```

(I opened https://github.com/coreos/fedora-coreos-config/pull/732 to make it easier to tell what files coreos-teardown-initramfs actually copied.)

One test worth doing is booting with `rd.break` and inspecting `/sysroot/etc/NetworkManager/system-connections/default_connection.nmconnection`. If it has the correct contents, then something in the real root is modifying the config (possibly NM itself?). If it doesn't, then the problem is in the initrd.
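The `rd.break` check described above comes down to one comparison, which could be scripted roughly as follows. This is only a sketch: `triage_keyfile` is a hypothetical helper name, it assumes you kept a copy of the keyfile as it looked on the live ISO, and it assumes `cmp` is available wherever you run it (the dracut emergency shell may not ship it, so you may need to compare copies elsewhere).

```shell
#!/bin/sh
# Hypothetical helper (not part of RHCOS or coreos-installer).
# $1 = keyfile as it looked on the live ISO before installing
# $2 = keyfile found under /sysroot at the rd.break prompt
triage_keyfile() {
    expected=$1
    actual=$2
    if [ ! -f "$actual" ]; then
        # Nothing was propagated: the initrd side is the suspect.
        echo "initrd: keyfile missing"
    elif cmp -s "$expected" "$actual"; then
        # The initrd did its job; something in the real root changes it later.
        echo "real-root: keyfile intact at switch-root"
    else
        # The file arrived but with different contents.
        echo "initrd: keyfile contents differ"
    fi
}

# Example call (paths are illustrative):
#   triage_keyfile /tmp/saved.nmconnection \
#       /sysroot/etc/NetworkManager/system-connections/default_connection.nmconnection
```

A byte-for-byte comparison is the simplest choice here; if the copy step ever rewrites the file (for example, updating a timestamp field), a key-by-key comparison would be more appropriate.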
As mentioned in https://github.com/coreos/fedora-coreos-config/pull/733#issuecomment-724914891, a workaround for this is to boot with `rd.neednet=1`. You can do this with `coreos-installer install --firstboot-args 'rd.neednet=1'`. Can you verify that this fixes the issue?

I can confirm that adding `--firstboot-args 'rd.neednet=1'` solved my issue: the node booted with the IP I had set up with nmcli before running coreos-installer.

Another confirmation: adding `--firstboot-args 'rd.neednet=1'` fixed this issue for me as well.

Fix for this is in https://github.com/openshift/installer/pull/4414.

This is pending the merge of the installer PR; setting UpcomingSprint to appease the bots.

The installer PR is merged; moving to MODIFIED.

Verified with RHCOS 47.83.202012072242-0

From the Live ISO:
- sudo nmcli con mod "Wired Connection" ipv4.addresses 192.168.122.100/24
- sudo nmcli con mod "Wired Connection" ipv4.gateway 192.168.122.1
- sudo nmcli con mod "Wired Connection" ipv4.dns 192.168.122.1

Confirmed that /etc/NetworkManager/system-connections/default_connection.nmconnection was configured properly.

Installed RHCOS via:
- sudo coreos-installer install --copy-network --insecure-ignition --ignition-url=http://192.168.122.1/ignitionv3.json /dev/vda

Inspected the system after install:

```
[core@localhost ~]$ rpm-ostree status
State: idle
Deployments:
● ostree://d70e44dde4765c2b59cedae6c399c7255a4bb877cc80b1be5c93cbe614b1d395
                   Version: 47.83.202012072242-0 (2020-12-07T22:46:11Z)

[core@localhost ~]$ sudo cat /etc/NetworkManager/system-connections/default_connection.nmconnection
[connection]
id=Wired Connection
uuid=3106acba-86dd-4e6e-b8dd-5676e78df2b6
type=ethernet
multi-connect=3
permissions=
timestamp=1607616494

[ethernet]
mac-address-blacklist=

[ipv4]
address1=192.168.122.100/24,192.168.122.1
dns=192.168.122.1;
dns-search=
method=auto

[ipv6]
addr-gen-mode=eui64
dns-search=
method=auto

[proxy]

$ cat /usr/lib/dracut/modules.d/15coreos-network/coreos-copy-firstboot-network.service
# This unit will run early in boot and detect if the user copied
# in firstboot networking config files into the installed system
# (most likely by using `coreos-installer install --copy-network`).
# Since this unit is modifying network configuration there are some
# dependencies that we have:
#
# - Need to look for networking configuration on the /boot partition
#   - i.e. after /dev/disk/by-label/boot is available
#   - and after the ignition-dracut GPT generator (see below)
# - Need to run before networking is brought up.
#   - This is done in nm-run.sh [1] that runs as part of dracut-initqueue [2]
#   - i.e. Before=dracut-initqueue.service
# - Need to make sure karg networking configuration isn't applied
#   - There are two ways to do this.
#     - One is to run *before* the nm-config.sh [3] that runs as part of
#       dracut-cmdline [4] and `ln -sf /bin/true /usr/libexec/nm-initrd-generator`.
#       - i.e. Before=dracut-cmdline.service
#     - Another is to run *after* nm-config.sh [3] in dracut-cmdline [4]
#       and just delete all the files created by nm-initrd-generator.
#       - i.e. After=dracut-cmdline.service, but Before=dracut-initqueue.service
#     - We'll go with the second option here because the need for the /boot
#       device (mentioned above) means we can't start before dracut-cmdline.service
#
# [1] https://github.com/dracutdevs/dracut/blob/master/modules.d/35network-manager/nm-run.sh
# [2] https://github.com/dracutdevs/dracut/blob/master/modules.d/35network-manager/module-setup.sh#L37
# [3] https://github.com/dracutdevs/dracut/blob/master/modules.d/35network-manager/nm-config.sh
# [4] https://github.com/dracutdevs/dracut/blob/master/modules.d/35network-manager/module-setup.sh#L36
#
[Unit]
Description=Copy CoreOS Firstboot Networking Config
ConditionPathExists=/usr/lib/initrd-release
DefaultDependencies=false
Before=ignition-diskful.target
Before=dracut-initqueue.service
After=dracut-cmdline.service
# Any services looking at mounts need to order after this
# because it causes device re-probing.
After=coreos-gpt-setup.service
# Since we are mounting /boot/, require the device first
Requires=dev-disk-by\x2dlabel-boot.device
After=dev-disk-by\x2dlabel-boot.device
# Need to run after coreos-enable-network since it may re-run the NM cmdline
# hook which will generate NM configs from the network kargs, but we want to
# have precedence.
After=coreos-enable-network.service

[Service]
Type=oneshot
RemainAfterExit=yes
# The MountFlags=slave is so the umount of /boot is guaranteed to happen
# /boot will only be mounted for the lifetime of the unit.
MountFlags=slave
ExecStart=/usr/sbin/coreos-copy-firstboot-network

$ journalctl -b | grep coreos-copy
Dec 10 16:15:52 localhost coreos-copy-firstboot-network[704]: info: copying files from /mnt/boot_partition/coreos-firstboot-network to /run/NetworkManager/system-connections/
Dec 10 16:15:52 localhost coreos-copy-firstboot-network[704]: '/mnt/boot_partition/coreos-firstboot-network/default_connection.nmconnection' -> '/run/NetworkManager/system-connections/default_connection.nmconnection'
Dec 10 16:15:55 localhost systemd[1]: coreos-copy-firstboot-network.service: Succeeded.
Dec 10 16:15:56 localhost systemd[1]: coreos-copy-firstboot-network.service: Succeeded.
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633
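For anyone repeating the verification, the manual inspection of the installed keyfile can be scripted along these lines. This is a minimal sketch, not an official tool: `keyfile_get` and `has_static_address` are hypothetical helper names, and the parser assumes the simple `key=value` layout shown in the keyfile in this report.

```shell
#!/bin/sh
# Hypothetical helpers for checking a NetworkManager keyfile after install.

# keyfile_get FILE SECTION KEY: print the value of KEY inside [SECTION].
keyfile_get() {
    awk -v section="$2" -v key="$3" '
        # A [section] header switches the "are we in the right section" flag.
        /^\[.*\]$/ { in_section = ($0 == ("[" section "]")); next }
        in_section {
            eq = index($0, "=")
            if (eq > 0 && substr($0, 1, eq - 1) == key) {
                print substr($0, eq + 1)
                exit
            }
        }
    ' "$1"
}

# has_static_address FILE ADDR: succeed if [ipv4] address1 is ADDR,
# optionally followed by ",gateway".
has_static_address() {
    value=$(keyfile_get "$1" ipv4 address1)
    case "$value" in
        "$2"|"$2",*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example call (path taken from this report; reading it needs root):
#   has_static_address \
#       /etc/NetworkManager/system-connections/default_connection.nmconnection \
#       192.168.122.100/24 && echo "static address was copied"
```

Checking only `address1` keeps the sketch short; a fuller check would also compare the `dns` value and confirm that the address is actually active on the interface.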