Bug 1899286 - [4.6.z] Unable to get coreos-installer with --copy-network to work
Summary: [4.6.z] Unable to get coreos-installer with --copy-network to work
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: RHCOS
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.6.z
Assignee: Jonathan Lebon
QA Contact: Michael Nguyen
URL:
Whiteboard:
Depends On: 1895979
Blocks: 1899176
Reported: 2020-11-18 20:03 UTC by Micah Abbott
Modified: 2021-03-17 21:50 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1895979
Environment:
Last Closed: 2020-12-14 13:50:58 UTC
Target Upstream Version:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2020:5259 0 None None None 2020-12-14 13:51:16 UTC

Description Micah Abbott 2020-11-18 20:03:50 UTC
+++ This bug was initially created as a clone of Bug #1895979 +++

Description of problem:

The documentation [1] states that changes made with nmcli and/or nmtui in the Live ISO environment can be persisted by passing --copy-network to coreos-installer.

But when I try this, nothing is persisted: after the first reboot, the network configuration contains none of my customizations.

[1] https://docs.openshift.com/container-platform/4.6/installing/installing_bare_metal/installing-bare-metal-network-customizations.html#installation-user-infra-machines-advanced_network_installing-bare-metal-network-customizations



Version-Release number of selected component (if applicable):

rhcos-4.6.1-x86_64-live.x86_64.iso


How reproducible:
Every time

Steps to Reproduce:
1. Load Live ISO image
2. Change Network settings with nmcli
   - sudo nmcli con mod "Wired Connection" ipv4.addresses 10.0.1.123/24
   - sudo nmcli con mod "Wired Connection" ipv4.gateway 10.0.0.1
   - sudo nmcli con mod "Wired Connection" ipv4.dns 10.0.0.1 
3. Verify NetworkManager configuration, see attached screenshot picture2.png
4. Run coreos-installer, see attached screenshot picture3.png
5. reboot
6. Verify NetworkManager configuration
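
The nmcli part of the steps above can be sketched as a small script. This is only an illustration: the connection name and addresses are the ones quoted in step 2, while the `run` helper and the `DRY_RUN` switch are hypothetical additions so the commands are printed rather than executed by default. The exact coreos-installer invocation from step 4 is only visible in the attached screenshot, so it is left out.

```shell
#!/bin/sh
# Values quoted from step 2 of this report.
CON="Wired Connection"
ADDR="10.0.1.123/24"
GATEWAY="10.0.0.1"
DNS="10.0.0.1"

# Hypothetical helper: print the command when DRY_RUN=1 (the default),
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 2: change the network settings of the live environment.
configure_network() {
    run sudo nmcli con mod "$CON" ipv4.addresses "$ADDR"
    run sudo nmcli con mod "$CON" ipv4.gateway "$GATEWAY"
    run sudo nmcli con mod "$CON" ipv4.dns "$DNS"
}

# Steps 3 and 6: show the keyfile that NetworkManager wrote.
show_keyfile() {
    run sudo cat /etc/NetworkManager/system-connections/default_connection.nmconnection
}

configure_network
show_keyfile
```

Running it with `DRY_RUN=0` in the Live ISO environment would apply the settings for real.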



Actual results:

[core@localhost ~]$ sudo cat /etc/NetworkManager/system-connections/default_connection.nmconnection 
[connection]
id=Wired Connection
uuid=3ab5973e-8dfa-41ca-963f-68c1089347f7
type=ethernet
multi-connect=3
permissions=

[ethernet]
mac-address-blacklist=

[ipv4]
dns-search=
method=auto

[ipv6]
addr-gen-mode=eui64
dns-search=
method=auto

[proxy]


Expected results:

/etc/NetworkManager/system-connections/default_connection.nmconnection should have the same contents as before the reboot.
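
One way to tell the two outcomes apart without reading the whole keyfile is to test whether its [ipv4] section carries a static address entry. The function below is a minimal sketch of that check; the name `ipv4_has_static_address` is made up for this example, and it assumes the keyfile layout shown in this report (section headers in square brackets, `addressN=` keys).

```shell
# ipv4_has_static_address FILE
# Succeeds when the [ipv4] section of a NetworkManager keyfile contains at
# least one addressN= entry, and fails otherwise.
ipv4_has_static_address() {
    awk '
        # Track whether the current section is [ipv4]; $1 ignores trailing blanks.
        /^\[/ { in_ipv4 = ($1 == "[ipv4]") }
        in_ipv4 && /^address[0-9]+=/ { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Example use (the keyfile is root-readable only, so run as root):
#   ipv4_has_static_address \
#       /etc/NetworkManager/system-connections/default_connection.nmconnection \
#       && echo "static address present" || echo "no static address"
```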


Additional info:

--- Additional comment from Jonas Nordell on 2020-11-09 15:12:45 UTC ---



--- Additional comment from Micah Abbott on 2020-11-09 16:18:43 UTC ---

I was unable to reproduce this in a local VM test.  I used the same `nmcli` commands and observed that the NM file was correctly written.

Is it possible that your Ignition config is also writing the `/etc/NetworkManager/system-connections/default_connection.nmconnection`?

Could you provide the full journal from the host showing the boot after the install was done?  That would give us insight into whether Ignition is writing out the same file.

A copy of the Ignition configuration would be useful, too.

--- Additional comment from Jonas Nordell on 2020-11-10 08:30:07 UTC ---



--- Additional comment from Jonas Nordell on 2020-11-10 08:32:11 UTC ---

I have attached a complete system boot log.

My Ignition config, http://10.0.0.10/rhcos/worker-3.ign, contains only a certificate, as far as I can tell.

--- Additional comment from Jonas Nordell on 2020-11-10 08:42:27 UTC ---



--- Additional comment from Jonathan Lebon on 2020-11-10 19:01:55 UTC ---

The boot logs show that coreos-copy-firstboot-network and coreos-teardown-initramfs are picking up and propagating the injected NM config:

```
[    6.861733] coreos-copy-firstboot-network[698]: info: copying files from /mnt/boot_partition/coreos-firstboot-network to /run/NetworkManager/system-connections/
[    6.870657] coreos-copy-firstboot-network[698]: '/mnt/boot_partition/coreos-firstboot-network/default_connection.nmconnection' -> '/run/NetworkManager/system-connections/default_connection.nmconnection'
...
[   17.888523] coreos-teardown-initramfs[1105]: info: no networking config is defined in the real root
[   17.891753] coreos-teardown-initramfs[1105]: info: propagating initramfs networking config to the real root
[   17.906937] coreos-teardown-initramfs[1105]: /usr/bin/coreos-relabel
[   18.085890] coreos-teardown-initramfs[1105]: Relabeled /sysroot//etc/NetworkManager/system-connections/default_connection.nmconnection from (null) to system_u:object_r:NetworkManager_etc_rw_t:s0
```
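
To pull just the copied keyfile destinations out of a boot log like the one above, a filter along these lines could be used. It is a sketch that assumes the `'SRC' -> 'DST'` line format shown in the excerpt; the function name `copied_keyfiles` is invented for this example.

```shell
# copied_keyfiles: read boot log text on stdin and print the destination path
# of every "'SRC' -> 'DST'" line logged by coreos-copy-firstboot-network.
copied_keyfiles() {
    grep 'coreos-copy-firstboot-network' |
        sed -n "s/.*' -> '\(.*\)'[[:space:]]*\$/\1/p"
}

# Example use on a running system:
#   journalctl -b | copied_keyfiles
```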

(I opened https://github.com/coreos/fedora-coreos-config/pull/732 to make it easier to tell what files coreos-teardown-initramfs actually copied.)

One test worth doing is booting with `rd.break` and inspecting `/sysroot//etc/NetworkManager/system-connections/default_connection.nmconnection`. If it has the correct contents, then it means that something in the real root is modifying the config (possibly NM itself?). If it doesn't, then it's something in the initrd.

--- Additional comment from Jonathan Lebon on 2020-11-10 19:26:42 UTC ---

As mentioned in https://github.com/coreos/fedora-coreos-config/pull/733#issuecomment-724914891, a workaround for this is to boot with `rd.neednet=1`. You can do this with `coreos-installer install --firstboot-args 'rd.neednet=1'`. Can you verify that this fixes the issue?
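
Combined with --copy-network, the workaround might look like the command assembled below. This is a sketch only: `DEVICE` and `IGNITION_URL` are placeholder values borrowed from the verification comment in this report, and the function merely prints the command so it can be reviewed before being run.

```shell
# Placeholder values (assumptions); substitute your own target disk and
# Ignition config URL.
DEVICE="/dev/vda"
IGNITION_URL="http://192.168.122.1/ignitionv3.json"

# Print an install command that persists the live network config and adds
# rd.neednet=1 to the first-boot kernel arguments.
install_cmd() {
    printf '%s\n' "sudo coreos-installer install --copy-network --firstboot-args 'rd.neednet=1' --insecure-ignition --ignition-url=$IGNITION_URL $DEVICE"
}

install_cmd
```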

--- Additional comment from Jonas Nordell on 2020-11-11 07:29:37 UTC ---

I can confirm that adding "--firstboot-args 'rd.neednet=1'" solved my issue, and the node booted with the IP address I had set up with nmcli before running coreos-installer.

Comment 1 Jonathan Lebon 2020-11-26 16:32:26 UTC
Fixed by https://github.com/openshift/installer/pull/4422.

Comment 2 Micah Abbott 2020-11-30 21:31:36 UTC
(In reply to Jonathan Lebon from comment #1)
> Fixed by https://github.com/openshift/installer/pull/4422.

This PR has merged, so moving to MODIFIED

Comment 4 Micah Abbott 2020-12-04 19:50:39 UTC
Verified with RHCOS 46.82.202012032341-0

Booted the ISO, used `nmcli` to configure interface

```
$ sudo nmcli con mod "Wired Connection" ipv4.addr 192.168.122.100/24
$ sudo nmcli con mod "Wired Connection" ipv4.gateway 192.168.122.1
$ sudo nmcli con mod "Wired Connection" ipv4.dns 192.168.122.1
```

Used `coreos-installer` to copy network config

`$ sudo coreos-installer install --copy-network --insecure-ignition --ignition-url=http://192.168.122.1/ignitionv3.json /dev/vda`

Rebooted into RHCOS, confirmed NM config was copied, message was logged, and dracut module was updated:

```
$ rpm-ostree status
State: idle
Deployments:
● ostree://713f7a88c06960f42d52e1fb50baf35fd7f14df9b474d94d46fd67a2a9c07494
                   Version: 46.82.202012032341-0 (2020-12-03T23:45:01Z)
[core@localhost ~]$ sudo cat /etc/NetworkManager/system-connections/default_connection.nmconnection
[connection]
id=Wired Connection
uuid=75c32fef-f2bb-49a2-b002-0e48f8565580
type=ethernet
multi-connect=3
permissions=
timestamp=1607110273

[ethernet]
mac-address-blacklist=

[ipv4]
address1=192.168.122.100/24,192.168.122.1
dns=192.168.122.1;
dns-search=
method=auto

[ipv6]
addr-gen-mode=eui64
dns-search=
method=auto

[proxy]
[core@localhost ~]$ journalctl -b | grep coreos-copy
Dec 04 19:37:50 localhost coreos-copy-firstboot-network[704]: info: copying files from /mnt/boot_partition/coreos-firstboot-network to /run/NetworkManager/system-connections/
Dec 04 19:37:50 localhost coreos-copy-firstboot-network[704]: '/mnt/boot_partition/coreos-firstboot-network/default_connection.nmconnection' -> '/run/NetworkManager/system-connections/default_connection.nmconnection'
[core@localhost ~]$ sudo cat /usr/lib/dracut/modules.d/15coreos-network/coreos-copy-firstboot-network.service | grep enable-network
# Need to run after coreos-enable-network since it may re-run the NM cmdline
After=coreos-enable-network.service
```
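
The last check above, that the unit is ordered after coreos-enable-network.service, can also be scripted. The helper below is a rough sketch; `unit_ordered_after` is a name made up for this example, and it only looks at `After=` lines in a single unit file (no drop-ins).

```shell
# unit_ordered_after UNIT_FILE OTHER_UNIT
# Succeeds when UNIT_FILE has an After= line that lists OTHER_UNIT.
unit_ordered_after() {
    awk -v want="$2" '
        /^After=/ {
            sub(/^After=/, "")
            # After= may list several units separated by whitespace.
            n = split($0, units, " ")
            for (i = 1; i <= n; i++) if (units[i] == want) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Example use:
#   unit_ordered_after \
#       /usr/lib/dracut/modules.d/15coreos-network/coreos-copy-firstboot-network.service \
#       coreos-enable-network.service && echo "ordering present"
```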

Comment 7 errata-xmlrpc 2020-12-14 13:50:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.6.8 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5259

