Bug 1899286

Summary: [4.6.z] Unable to get coreos-installer with --copy-network to work
Product: OpenShift Container Platform Reporter: Micah Abbott <miabbott>
Component: RHCOS Assignee: Jonathan Lebon <jlebon>
Status: CLOSED ERRATA QA Contact: Michael Nguyen <mnguyen>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.6 CC: bbreard, bgilbert, imcleod, jlebon, jligon, jnordell, miabbott, mnguyen, nstielau
Target Milestone: ---   
Target Release: 4.6.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1895979 Environment:
Last Closed: 2020-12-14 13:50:58 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1895979    
Bug Blocks: 1899176    

Description Micah Abbott 2020-11-18 20:03:50 UTC
+++ This bug was initially created as a clone of Bug #1895979 +++

Description of problem:

The documentation [1] states that changes made with nmcli and/or nmtui in the Live ISO environment can be persisted by passing --copy-network to coreos-installer.

But when I try this, nothing is persisted: after the first reboot, the network configuration contains none of my customizations.

[1] https://docs.openshift.com/container-platform/4.6/installing/installing_bare_metal/installing-bare-metal-network-customizations.html#installation-user-infra-machines-advanced_network_installing-bare-metal-network-customizations



Version-Release number of selected component (if applicable):

rhcos-4.6.1-x86_64-live.x86_64.iso


How reproducible:
Every time

Steps to Reproduce:
1. Load Live ISO image
2. Change Network settings with nmcli
   - sudo nmcli con mod "Wired Connection" ipv4.addresses 10.0.1.123/24
   - sudo nmcli con mod "Wired Connection" ipv4.gateway 10.0.0.1
   - sudo nmcli con mod "Wired Connection" ipv4.dns 10.0.0.1 
3. Verify NetworkManager configuration, see attached screenshot picture2.png
4. Run coreos-installer, see attached screenshot picture3.png
5. reboot
6. Verify NetworkManager configuration



Actual results:

[core@localhost ~]$ sudo cat /etc/NetworkManager/system-connections/default_connection.nmconnection 
[connection]
id=Wired Connection
uuid=3ab5973e-8dfa-41ca-963f-68c1089347f7
type=ethernet
multi-connect=3
permissions=

[ethernet]
mac-address-blacklist=

[ipv4]
dns-search=
method=auto

[ipv6]
addr-gen-mode=eui64
dns-search=
method=auto

[proxy]


Expected results:

/etc/NetworkManager/system-connections/default_connection.nmconnection should have the same contents as it did before the reboot.


Additional info:

--- Additional comment from Jonas Nordell on 2020-11-09 15:12:45 UTC ---



--- Additional comment from Micah Abbott on 2020-11-09 16:18:43 UTC ---

I was unable to reproduce this in a local VM test.  I used the same `nmcli` commands and observed that the NM file was correctly written.

Is it possible that your Ignition config is also writing the `/etc/NetworkManager/system-connections/default_connection.nmconnection`?

Could you provide the full journal from the host showing the boot after the install was done?  That would tell us whether Ignition is writing out the same file.

A copy of the Ignition configuration would be useful, too.

--- Additional comment from Jonas Nordell on 2020-11-10 08:30:07 UTC ---



--- Additional comment from Jonas Nordell on 2020-11-10 08:32:11 UTC ---

I have added a complete system boot log.

My Ignition config http://10.0.0.10/rhcos/worker-3.ign only contains a certificate?

--- Additional comment from Jonas Nordell on 2020-11-10 08:42:27 UTC ---



--- Additional comment from Jonathan Lebon on 2020-11-10 19:01:55 UTC ---

The boot logs show that coreos-copy-firstboot-network and coreos-teardown-initramfs are picking up and propagating the injected NM config:

```
[    6.861733] coreos-copy-firstboot-network[698]: info: copying files from /mnt/boot_partition/coreos-firstboot-network to /run/NetworkManager/system-connections/
[    6.870657] coreos-copy-firstboot-network[698]: '/mnt/boot_partition/coreos-firstboot-network/default_connection.nmconnection' -> '/run/NetworkManager/system-connections/default_connection.nmconnection'
...
[   17.888523] coreos-teardown-initramfs[1105]: info: no networking config is defined in the real root
[   17.891753] coreos-teardown-initramfs[1105]: info: propagating initramfs networking config to the real root
[   17.906937] coreos-teardown-initramfs[1105]: /usr/bin/coreos-relabel
[   18.085890] coreos-teardown-initramfs[1105]: Relabeled /sysroot//etc/NetworkManager/system-connections/default_connection.nmconnection from (null) to system_u:object_r:NetworkManager_etc_rw_t:s0
```

(I opened https://github.com/coreos/fedora-coreos-config/pull/732 to make it easier to tell what files coreos-teardown-initramfs actually copied.)

One test worth doing is booting with `rd.break` and inspecting `/sysroot//etc/NetworkManager/system-connections/default_connection.nmconnection`. If it has the correct contents, then it means that something in the real root is modifying the config (possibly NM itself?). If it doesn't, then it's something in the initrd.
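
The decision described above can be scripted. The following is a minimal sketch and not part of the original comment: the function names `has_static_ipv4` and `diagnose_root` are hypothetical, and it assumes that a static configuration shows up as an `addressN=` key in the `[ipv4]` section of the keyfile. It uses only POSIX shell built-ins, on the assumption that the `rd.break` shell may lack tools such as `awk`.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the [ipv4] section of an NM keyfile
# contains a static "addressN=" entry.
has_static_ipv4() {
    keyfile=$1
    in_ipv4=0
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            "[ipv4]"*) in_ipv4=1 ;;   # entering the [ipv4] section
            "["*)      in_ipv4=0 ;;   # any other section header
            address[0-9]*=*)
                if [ "$in_ipv4" -eq 1 ]; then
                    return 0
                fi
                ;;
        esac
    done < "$keyfile"
    return 1
}

# Hypothetical helper: inspect the keyfile under a root prefix and print
# which side (real root or initrd) looks suspect.
diagnose_root() {
    root=${1:-/sysroot}
    path="$root/etc/NetworkManager/system-connections/default_connection.nmconnection"
    if [ ! -f "$path" ]; then
        echo "missing: $path"
        return 2
    fi
    if has_static_ipv4 "$path"; then
        echo "static config present: suspect something in the real root"
    else
        echo "static config absent: suspect the initrd"
    fi
}
```

From the `rd.break` shell this could be invoked as `diagnose_root /sysroot`.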

--- Additional comment from Jonathan Lebon on 2020-11-10 19:26:42 UTC ---

As mentioned in https://github.com/coreos/fedora-coreos-config/pull/733#issuecomment-724914891, a workaround for this is to boot with `rd.neednet=1`. You can do this with `coreos-installer install --firstboot-args 'rd.neednet=1'`. Can you verify that this fixes the issue?
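
For reference, the workaround can be combined with `--copy-network` in a single invocation. This is a sketch under stated assumptions: `print_install_cmd` is a hypothetical helper, the target device and Ignition URL are placeholders supplied by the caller, and `--insecure-ignition` is included on the assumption that the Ignition config is served over plain HTTP.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) an install command that combines
# --copy-network with the rd.neednet=1 first-boot workaround.
# $1 = target device, $2 = Ignition config URL (both placeholders).
print_install_cmd() {
    device=$1
    ignition_url=$2
    printf '%s\n' "sudo coreos-installer install --copy-network --firstboot-args 'rd.neednet=1' --insecure-ignition --ignition-url=$ignition_url $device"
}
```

The function only prints the command, so it can be reviewed before being run on the live ISO, for example with `print_install_cmd /dev/vda http://192.168.122.1/ignitionv3.json`.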

--- Additional comment from Jonas Nordell on 2020-11-11 07:29:37 UTC ---

I can confirm that adding "--firstboot-args 'rd.neednet=1'" solved my issue, and the node booted with the IP address I had set up with nmcli before running coreos-installer.

Comment 1 Jonathan Lebon 2020-11-26 16:32:26 UTC
Fixed by https://github.com/openshift/installer/pull/4422.

Comment 2 Micah Abbott 2020-11-30 21:31:36 UTC
(In reply to Jonathan Lebon from comment #1)
> Fixed by https://github.com/openshift/installer/pull/4422.

This PR has merged, so moving to MODIFIED

Comment 4 Micah Abbott 2020-12-04 19:50:39 UTC
Verified with RHCOS 46.82.202012032341-0

Booted the ISO, used `nmcli` to configure interface

```
$ sudo nmcli con mod "Wired Connection" ipv4.addr 192.168.122.100/24
$ sudo nmcli con mod "Wired Connection" ipv4.gateway 192.168.122.1
$ sudo nmcli con mod "Wired Connection" ipv4.dns 192.168.122.1
```

Used `coreos-installer` to copy network config

`$ sudo coreos-installer install --copy-network --insecure-ignition --ignition-url=http://192.168.122.1/ignitionv3.json /dev/vda`

Rebooted into RHCOS, confirmed NM config was copied, message was logged, and dracut module was updated:

```
$ rpm-ostree status
State: idle
Deployments:
● ostree://713f7a88c06960f42d52e1fb50baf35fd7f14df9b474d94d46fd67a2a9c07494
                   Version: 46.82.202012032341-0 (2020-12-03T23:45:01Z)
[core@localhost ~]$ sudo cat /etc/NetworkManager/system-connections/default_connection.nmconnection
[connection]
id=Wired Connection
uuid=75c32fef-f2bb-49a2-b002-0e48f8565580
type=ethernet
multi-connect=3
permissions=
timestamp=1607110273

[ethernet]
mac-address-blacklist=

[ipv4]
address1=192.168.122.100/24,192.168.122.1
dns=192.168.122.1;
dns-search=
method=auto

[ipv6]
addr-gen-mode=eui64
dns-search=
method=auto

[proxy]
[core@localhost ~]$ journalctl -b | grep coreos-copy
Dec 04 19:37:50 localhost coreos-copy-firstboot-network[704]: info: copying files from /mnt/boot_partition/coreos-firstboot-network to /run/NetworkManager/system-connections/
Dec 04 19:37:50 localhost coreos-copy-firstboot-network[704]: '/mnt/boot_partition/coreos-firstboot-network/default_connection.nmconnection' -> '/run/NetworkManager/system-connections/default_connection.nmconnection'
[core@localhost ~]$ sudo cat /usr/lib/dracut/modules.d/15coreos-network/coreos-copy-firstboot-network.service | grep enable-network
# Need to run after coreos-enable-network since it may re-run the NM cmdline
After=coreos-enable-network.service
```

Comment 7 errata-xmlrpc 2020-12-14 13:50:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.6.8 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5259