Bug 2007563
Summary: | NM failed to bring up bonding network in initrd when using the same .nmconnection files in the real root system | |
---|---|---|---
Product: | Red Hat Enterprise Linux 9 | Reporter: | Coiby <coxu>
Component: | NetworkManager | Assignee: | NetworkManager Development Team <nm-team>
Status: | CLOSED NOTABUG | QA Contact: | Desktop QE <desktop-qa-list>
Severity: | unspecified | Docs Contact: |
Priority: | unspecified | |
Version: | 9.0 | CC: | bgalvani, ferferna, fge, lrintel, rkhan, sfaye, sukulkar, thaller, till
Target Milestone: | rc | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | No Doc Update
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2022-12-21 01:51:34 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: |
|
This bug also applies to bridging networks.

> 1. If I didn't copy the original .nmconnection e.g. /etc/NetworkManager/system-connections/eno1.nmconnection to initrd, the bonding network could be brought up successfully.

The problem is that there are two connection files for the same device eno1. One connection (the one copied from the real root) configures eno1 as a standalone interface; the other configures it as a port of 'mybond0'. These two configurations conflict and NM chooses one (standalone), which is not the one you expect.

> 3. nm-initrd-generator doesn't create a bond-slave-*.nmconnection like NM,

Right, with `ip=mybond0:dhcp bond=mybond0:enc8000:` the generator should only create 2 connections: one for the bond and one for enc8000 (configured as port of the bond).

(In reply to Beniamino Galvani from comment #2)
> > 1. If I didn't copy the original .nmconnection e.g. /etc/NetworkManager/system-connections/eno1.nmconnection to initrd, the bonding network could be brought up successfully.
>
> The problem is that there are two connection files for the same device eno1.
> One connection (the one copied from real root) configures eno1 as standalone
> interface, the other configures it as port of 'mybond0'. These two
> configurations are conflicting and NM chooses one (standalone), which is not
> the one you expect.

In case this wasn't clear, you should copy only the 'bond-mybond0' and the 'bond-slave-eno1' connections to the initrd. Otherwise, if you also copy the 'eno1' connection from the real root, the device eno1 will not be put under the bond, and the bond will not be able to get an address via DHCP.

(In reply to Beniamino Galvani from comment #2)
> > 1. If I didn't copy the original .nmconnection e.g. /etc/NetworkManager/system-connections/eno1.nmconnection to initrd, the bonding network could be brought up successfully.
>
> The problem is that there are two connection files for the same device eno1.
> One connection (the one copied from real root) configures eno1 as standalone
> interface, the other configures it as port of 'mybond0'. These two
> configurations are conflicting and NM chooses one (standalone), which is not
> the one you expect.

Although I still don't know what leads to the difference between the real root system and the initrd, I find that "nmcli connection modify --temporary ID connection.autoconnect false" makes NM not bring up a specific connection and thus bypasses this issue. Would you recommend it? Btw, adjusting connection.autoconnect-priority could also make NM bring up the connection I want, but it doesn't work for the znet network device.

> > 3. nm-initrd-generator doesn't create a bond-slave-*.nmconnection like NM,
>
> Right, with `ip=mybond0:dhcp bond=mybond0:enc8000:` the generator should
> only create 2 connections: one for the bond and one for enc8000 (configured
> as port of the bond).

I know /usr/lib/udev/ccw_init needs to extract the values of SUBCHANNELS, NETTYPE and LAYER2 from /etc/sysconfig/network-scripts/ifcfg-* or /etc/NetworkManager/system-connections/*.nmconnection to activate the znet network device. But the info needed by ccw_init isn't contained in the connection profile bond-slave-enc8000.nmconnection that NM creates via "nmcli con add type ethernet ifname enc8000 master mybond0". This is why I need to copy enc8000.nmconnection to the initrd as well, which led me to find this bug.

(In reply to Beniamino Galvani from comment #3)
> (In reply to Beniamino Galvani from comment #2)
> > > 1. If I didn't copy the original .nmconnection e.g. /etc/NetworkManager/system-connections/eno1.nmconnection to initrd, the bonding network could be brought up successfully.
> >
> > The problem is that there are two connection files for the same device eno1.
> > One connection (the one copied from real root) configures eno1 as standalone
> > interface, the other configures it as port of 'mybond0'.
> > These two configurations are conflicting and NM chooses one (standalone), which is not
> > the one you expect.
>
> In case this wasn't clear, you should copy only the 'bond-mybond0' and the
> 'bond-slave-eno1' connections to the initrd. Otherwise, if you copy also the

Thanks for the clarification. Is the znet device the only exception, as explained in Comment #4?

> 'eno1' connection from real root, the device eno1 will not be put under the
> bond, and the bond will not be able to get an address via DHCP.

> Although I still don't know what leads to the difference between real root system and initrd, I find "nmcli connection modify --temporary ID connection.autoconnect false" would make NM to not bring up specific connection thus bypass this issue. Will you recommend it? Btw, adjusting connection.autoconnect-priority could also make NM bring up the connection I want but it doesn't work for znet network device.

In a simple case, there is one device and one suitable profile to autoconnect, and what happens is clear.

If you have multiple profiles that are applicable at a time on the device (i.e. they are able to autoconnect because all the circumstances are right) then:

- if you configure different "connection.autoconnect-priority" values, then that determines which profile is chosen. Likewise, if you set `connection.autoconnect=false`, that of course disables autoconnect for the other profile, also resolving the tie.

- in case there are still multiple candidates, NM first chooses the one with the more recent timestamp in /var/lib/NetworkManager/timestamps (which gets updated whenever you activate a new profile). But that timestamp information is not directly accessible or under your control. A human user can somewhat control it by explicitly activating the profile they want. But for a non-interactive tool, there is really nothing you can do.
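
The tie-breaking order described above can be sketched as a sort over candidate profiles. This is only an illustration under stated assumptions, not NetworkManager's implementation: `pick_profile` is a hypothetical helper, the one-line-per-candidate input format is made up for the example, and the direction of the final UUID comparison (the UUID is named later in this thread as the last resort) is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: choose one autoconnect candidate from stdin.
# Each input line: "<autoconnect-priority> <last-activation-timestamp> <uuid>"
pick_profile() {
    # 1. the higher connection.autoconnect-priority wins
    # 2. on a tie, the more recent timestamp wins
    # 3. last resort: compare UUIDs (ascending order is an assumption)
    LC_ALL=C sort -k1,1nr -k2,2nr -k3,3 | head -n 1 | awk '{ print $3 }'
}
```

Both the 'eno1' and the 'bond-slave-eno1' profiles in this report leave connection.autoconnect-priority unset, so they share the default of 0 and the choice falls through to the later tie-breakers, which a non-interactive tool cannot steer.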
The solution is: don't configure conflicting/unsuitable things in NetworkManager if you want the right thing to happen automatically.

> I know /usr/lib/udev/ccw_init need to extract the values of SUBCHANNELS, NETTYPE and LAYER2 from /etc/sysconfig/network-scripts/ifcfg-* or /etc/NetworkManager/system-connections/*.nmconnection to activate znet network device. But the info needed by ccw_init isn't contained in the connection profile bond-slave-enc8000.nmconnection like NM, created via "nmcli con add type ethernet ifname enc8000 master mybond0". This is why I need to copy enc8000.nmconnection to initrd as well which led me to find this bug.

znet uses a udev rule which parses NetworkManager profiles to configure the interfaces. That way of doing it is, in my opinion, wrong and a hack, in particular because the general idea of NetworkManager is that you create profiles and activate them (for the configuration to take effect). The udev rules only run once per interface. It's doubly odd that the udev rule parses NetworkManager configuration. If you wrote a udev rule (and configured it in the rule, or via some script), that would be fine. But here the udev rule re-uses NetworkManager files for its own configuration.

I don't know how to solve that. I suggest a hack :) If you have a profile whose sole purpose is to configure the udev rule, then you probably don't want that one to autoconnect... well, it depends again on what purpose the user has for this profile, which a non-interactive tool can only guess.

Proper znet support in NetworkManager might be interesting, but it would be a large effort.

(In reply to Coiby from comment #7)
> Thanks for the explanation. In some cases, I find NM doesn't choose the one with the more recent timestamp,
> Could any other factors influence NM's behaviour on choosing which profile to activate?

No, the order is determined by the autoconnect-priority value; in case of a tie the timestamp is compared.
As a last resort, the UUID is used.

> 2. create a bonding network over this network interface,
>
> nmcli con add type bond ifname mybond0
> nmcli con add type ethernet ifname eth0 master mybond0
> nmcli c up bond-slave-eth0

> nm-initrd-generator doesn't create a bond-slave-*.nmconnection like NM,
> If I copy the two .nmconnection files to initrd, the bonding network could be brought up successfully.

That's because you are not specifying the s390 parameters for the ethernet connection. You can use this command:

    nmcli con add type ethernet \
        ifname eth0 \
        master mybond0 \
        ethernet.s390-nettype qeth \
        ethernet.s390-subchannels "0.0.8000 0.0.8001 0.0.8002;" \
        ethernet.s390-options "layer2=1 portno=0"

to add the ethernet as a bond port and also set the s390 parameters. With this you will need only two connection profiles in the initrd.

I think Beniamino answered already in comment 9.

Sorry for being so late to answer this request.

Is there anything else missing on this issue? It seems to me there is no bug here.

(In reply to Beniamino Galvani from comment #9)
> (In reply to Coiby from comment #7)
> > Thanks for the explanation. In some cases, I find NM doesn't choose the one with the more recent timestamp,
> >
> > Could any other factors influence NM's behaviour on choosing which profile to activate?
>
> No, the order is determined by the autoconnect-priority value; in case of a
> tie the timestamp is compared. As a last resort, the UUID is used.

So there are three factors NM uses to determine which profile to activate. Thanks for the clarification!

(In reply to Thomas Haller from comment #10)
> I think Beniamino answered already in comment 9.
>
> sorry for being so late to answer this request.
>
> Is there anything else missing on this issue? It seems to me, there is no
> bug here.

Yes, this is not a bug. Thanks! |
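
Since ccw_init, per this thread, reads the s390 settings out of the profile, a quick check before copying a port profile into the initrd might look like the sketch below. `has_s390_params` is a hypothetical helper: it only greps for the `s390-nettype` and `s390-subchannels` key names as they appear in the nm-initrd-generator output in the bug description, and it makes no attempt to reproduce how ccw_init actually parses the file.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the keyfile given as $1 contains both
# an s390-nettype= and an s390-subchannels= line.
has_s390_params() {
    grep -q '^s390-nettype=' "$1" && grep -q '^s390-subchannels=' "$1"
}
```

A port profile that passes this check is a candidate for the two-profile setup suggested in comment 9; one that fails it would presumably still need a separate profile for the udev rule, which reintroduces the conflict discussed above.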
Created attachment 1825874 [details]
nm trace log when failed to bring up bonding network

Description of problem:
In the real root file system, there is an active network interface. I created a bonding network using this network interface as the slave, copied the three .nmconnection files to the initramfs and triggered sysrq to boot into the kdump kernel. But in the initrd the bonding network failed to be brought up. Attached is the NM trace log. In the normal kernel, the bonding network could be brought up successfully each time after rebooting.

Version-Release number of selected component (if applicable):

How reproducible: always

Steps to Reproduce:
1. There is an active network interface specified in e.g. /etc/NetworkManager/system-connections/eno1.nmconnection,
```
[connection]
id=eno1
uuid=2a180551-3c17-4cee-b184-787a9069fc29
type=ethernet
interface-name=eno1
permissions=

[ethernet]
mac-address-blacklist=

[ipv4]
dns-search=
method=auto

[ipv6]
addr-gen-mode=eui64
dns-search=
method=auto

[proxy]
```
2. Create a bonding network over this network interface,
```
nmcli con add type bond ifname mybond0
nmcli con add type ethernet ifname eth0 master mybond0
nmcli c up bond-slave-eth0
```
3. Now NM has created two new files,
```
[root@hpe-dl320egen8-02 system-connections]# cat bond-mybond0.nmconnection
[connection]
id=bond-mybond0
uuid=1e489223-9a7c-4f30-859a-1586dc3142c8
type=bond
interface-name=mybond0
permissions=

[bond]
mode=balance-rr

[ipv4]
dns-search=
method=auto

[ipv6]
addr-gen-mode=stable-privacy
dns-search=
method=auto

[proxy]

[root@hpe-dl320egen8-02 system-connections]# cat bond-slave-eno1.nmconnection
[connection]
id=bond-slave-eno1
uuid=3d0472d6-ccfa-438a-990f-828eee9528fe
type=ethernet
interface-name=eno1
master=mybond0
permissions=
slave-type=bond

[ethernet]
mac-address-blacklist=
```
4. Make kexec-tools copy the three .nmconnection files to the initramfs and trigger sysrq

Actual results:
NM failed to bring up the bonding network.

Expected results:
NM should bring up the bonding network in the initrd successfully, as in the normal kernel.

Additional info:
1. If I didn't copy the original .nmconnection e.g. /etc/NetworkManager/system-connections/eno1.nmconnection to initrd, the bonding network could be brought up successfully.
2. I first met this problem when trying to set up a bonding network for a z/VM s390x machine. The z/VM s390x machine uses znet and depends on /usr/lib/udev/ccw_init to activate the network interface. ccw_init in turn depends on ifcfg-enc8000 or enc8000.nmconnection to extract s390-subchannels, s390-nettype and s390-options. So I have to copy enc8000.nmconnection to the initrd as well. But if I simply remove the lines not containing s390-subchannels, s390-nettype or s390-options from enc8000.nmconnection, the bonding network could be brought up successfully.
3. nm-initrd-generator doesn't create a bond-slave-*.nmconnection like NM,
```
$ /usr/libexec/nm-initrd-generator -s -- rd.znet=qeth,0.0.8000,0.0.8001,0.0.8002,layer2=1,portno=0 ip=mybond0:dhcp ifname=enc8000:02:de:ad:be:ef:64 bond=mybond0:enc8000: nameserver=127.0.0.53 rd.neednet

*** Connection 'enc8000' ***

[connection]
id=enc8000
uuid=aefc0e62-29e5-4cbd-ae9a-388cea2da78f
type=ethernet
autoconnect-retries=1
interface-name=enc8000
master=b4e9d4b0-5fa4-4ccf-9973-94b73d0d6e1a
multi-connect=1
permissions=
slave-type=bond
wait-device-timeout=60000

[ethernet]
mac-address-blacklist=
s390-nettype=qeth
s390-subchannels=0.0.8000;0.0.8001;0.0.8002;

[ethernet-s390-options]
layer2=1
portno=0

[user]
org.freedesktop.NetworkManager.origin=nm-initrd-generator

*** Connection 'mybond0' ***

[connection]
id=mybond0
uuid=b4e9d4b0-5fa4-4ccf-9973-94b73d0d6e1a
type=bond
autoconnect-retries=1
interface-name=mybond0
multi-connect=1
permissions=

[ethernet]
mac-address-blacklist=

[bond]
mode=balance-rr

[ipv4]
dhcp-timeout=90
dns=127.0.0.53;
dns-search=
may-fail=false
method=auto

[ipv6]
addr-gen-mode=eui64
dhcp-timeout=90
dns-search=
method=auto

[proxy]

[user]
org.freedesktop.NetworkManager.origin=nm-initrd-generator
```
If I copy the two .nmconnection files to initrd, the bonding network could be brought up successfully.
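
The conflict at the heart of this report (two autoconnectable profiles naming the same interface) can be detected before the files are copied into the initramfs. The following is a rough sketch, not part of kexec-tools or NetworkManager: `find_conflicting_ifnames` is a hypothetical helper, it only looks at `interface-name` and `autoconnect` in the `[connection]` section, and it treats only the literal value `false` as disabling autoconnect, which is a simplification.

```shell
#!/bin/sh
# Hypothetical helper: print every interface-name that is claimed by more
# than one profile in directory $1 that has not set autoconnect=false.
find_conflicting_ifnames() {
    for f in "$1"/*.nmconnection; do
        [ -e "$f" ] || continue   # the glob matched nothing
        awk -F= '
            /^\[/ { in_conn = ($0 == "[connection]"); next }
            in_conn && $1 == "interface-name" { ifname = $2 }
            in_conn && $1 == "autoconnect"    { auto = $2 }
            END { if (ifname != "" && auto != "false") print ifname }
        ' "$f"
    done | sort | uniq -d         # keep only names seen more than once
}
```

Run against a directory holding the three files from the steps above, this would print `eno1`, because both eno1.nmconnection and bond-slave-eno1.nmconnection name that interface; the bond profile itself names mybond0 and is not reported.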