Hide Forgot
Description of problem: TL;DR: NetworkManager should add the DEVICE=<ifname>.<vlanid> to ifcfg it creates so the ifcfg files work even if NetworkManager is then disabled. Long version: Given two NICS, eth0, and eth1, using nmcli/NetworkManager to create an active/passive bond0 interface, and then creating a VLAN on that interface results in an invalid ifcfg file that the network.service can use when NetworKManager is then disabled. # create bond0 nmcli con add type bond con-name bond0 ifname bond0 mode active-backup ipv4.method disabled ipv6.method ignore 802-3-ethernet.mtu 9000 # add bond slaves nmcli con add type bond-slave ifname eth0 master bond0 802-3-ethernet.mtu 9000 nmcli con add type bond-slave ifname eth1 master bond0 802-3-ethernet.mtu 9000 # create bond0.12 interface for VLAN 12 nmcli con add type vlan con-name bond0.12 dev bond0 id 12 ipv4.method disabled ipv6.method ignore 802-3-ethernet.mtu 9000 # cat /etc/sysconfig/network-scripts/ifcfg-bond0.12 VLAN=yes TYPE=Vlan DEVICE=bond0.12 PHYSDEV=bond0 VLAN_ID=12 REORDER_HDR=yes GVRP=no MVRP=no MTU=9000 IPV6INIT=no NAME=bond0.12 UUID=0f3235fe-8102-4f06-8288-b1fab5a9e69c ONBOOT=yes It's missing 'DEVICE=bond0.12', so if you disable NetworkManager with NM_CONTROLLED=no, then you get errors trying to start or stop the interface: # ifdown bond0.12 Could not load file '/etc/sysconfig/network-scripts/ifcfg-bond0.12' Could not load file '/etc/sysconfig/network-scripts/ifcfg-bond0.12' Could not load file '/etc/sysconfig/network-scripts/ifcfg-bond0.12' ERROR : [ipv6_test_device_status] Missing parameter 'device' (arg 1) Could not load file '/etc/sysconfig/network-scripts/ifcfg-bond0.12' # ifup bond0.12 Could not load file '/etc/sysconfig/network-scripts/ifcfg-bond0.12' Could not load file '/etc/sysconfig/network-scripts/ifcfg-bond0.12' ERROR : [/etc/sysconfig/network-scripts/ifup-eth] Device does not seem to be present, delaying initialization. NetworkManager is OK without the DEVICE line, no errors, but if you disable NetworkManger then the above errors appear. Version-Release number of selected component (if applicable): RHEL7.3 (and Fedora 25) How reproducible: every time Steps to Reproduce: It's simple to reproduce, you can do this on any machine: nmcli con add type bond con-name bond0 ifname bond0 mode active-backup ipv4.method disabled ipv6.method ignore 802-3-ethernet.mtu 9000 nmcli con add type vlan con-name bond0.12 dev bond0 id 12 ipv4.method disabled ipv6.method ignore 802-3-ethernet.mtu 9000 # notice no errors ifup bond0.12 ifdown bond0.12 # now disable NetworkManager echo 'NM_CONTROLLED=no' > /etc/sysconfig/network-scripts/ifcfg-bond0 echo 'NM_CONTROLLED=no' > /etc/sysconfig/network-scripts/ifcfg-bond0.12 nmcli c reload # try to bring it up ifup bond0.12 Could not load file '/etc/sysconfig/network-scripts/ifcfg-bond0.12' Could not load file '/etc/sysconfig/network-scripts/ifcfg-bond0.12' ERROR : [/etc/sysconfig/network-scripts/ifup-eth] Device does not seem to be present, delaying initialization. # add the DEVICE variable and try again echo 'DEVICE=bond0.12' >> ifcfg-bond0.12 ifup bond0.12 (works, no error now) Actual results: Expected results: The ifcfg files created by NetworkManager should work with or without NetworkManager. Additional info:
in general, it is not supported to let NetworkManager write ifcfg files, then disable NetworkManager and expect legacy initscripts to handle them. -- note that legacy initscripts handle NM ifcfg files by calling nmcli to let NetworkManager deal with those files. If you disable NetworkManager, that is not guaranteed to work. Nor is it possible to work for every case because both initscripts and NM support feature that the other doesn't. Of course, it should aim to create files that work as well as possible. See also https://bugzilla.redhat.com/show_bug.cgi?id=1369008#c4
initscripts need the DEVICE= variable set for vlans in order to determine the interface name. For NetworkManager this is a bit different, as it can build the interface name from the physical device and the vlan id. So, in NM you can create connections with: * nmcli con add type vlan dev bond0 id 12 This sets PHYSDEV=bond0, VLAN_ID=12 in the ifcfg file. The VLAN interface name is determined automatically by NM. The generated ifcfg file is not compatible with initscripts. * nmcli con add type vlan dev bond0 id 12 ifname bond0.12 This sets PHYSDEV=bond0, VLAN_ID=12, DEVICE=bond0.12 in the ifcfg file, which is compatible with initscripts. We could try to add a workaround that always sets DEVICE when the interface name is not explictly set, but that would break existing scripts; for example: * nmcli con add type vlan dev bond0 id 12 con-name my * nmcli con mod my vlan.id 13 would result in a connection with DEVICE=bond0.12 and VLAN_id=13, unless we add an other ugly workaround to "adjust" things (and possibly breaking something else). Note that there are other differences in how vlan connections are stored, as initscripts use the 'VID' variable for the vlan id while NM uses 'VLAN_ID'. Unfortunately when initscripts added support for an arbitrary vlan id in 2014, they picked a variable name different from NM's: https://github.com/fedora-sysv/initscripts/commit/368d19d31a057bf4d82ba428629694618edce133 So, I think this is not easily fixable on NM side and I suggest that if you really need a connection file compatible with initscripts (which is in general something we try to achieve, but without guarantees), you explicitly create the vlan with an interface name: * nmcli con add type vlan dev bond0 id 12 ifname bond0.12 Therefore I propose to close this as WONTFIX, any other opinions?
(In reply to Beniamino Galvani from comment #3) Nice investigation. > We could try to add a workaround that always sets DEVICE when the > interface name is not explictly set, but that would break existing > scripts; for example: If I understand you right, you propose to always write DEVICE. But without taking additional measures, that means that a VLAN connection always has connection.interface-name set. ifcfg-rh plugin is supposed to be able to store every valid connection. It would then no longer be able to store connections without connection.interface-name. A additional measure/workaround could be to also write another flag like DEVICE=bond0.12 DEVICE_NM_IGNORED=yes That would also omit any problems with subsequent modify, because every time we write the connection anew, we would re-write DEVICE+DEVICE_NM_IGNORED. The VID issue could be solved, by just writing both VID and VLAN_ID.NM could also fallback to VID if VLAN_ID is not present. The ugliness of all this is of course frightening, but I think we should do this.
(In reply to Thomas Haller from comment #4) > A additional measure/workaround could be to also write another flag like > > DEVICE=bond0.12 > DEVICE_NM_IGNORED=yes > > That would also omit any problems with subsequent modify, because every time > we write the connection anew, we would re-write DEVICE+DEVICE_NM_IGNORED. > The VID issue could be solved, by just writing both VID and VLAN_ID.NM could > also fallback to VID if VLAN_ID is not present. I've pushed branch bg/vlan-device-ifcfg-rh1400649 which addresses the problem using your suggestions. What concerns me is that it adds complexity for a corner case that can be easily worked around in a different way. The complexity is not in the code, but especially in the ifcfg file, that now has duplicated variables (VLAN_ID, VID) and variables whose meaning depend on another variable (DEVICE, NM_IGNORE_DEVICE). I'd like to hear what others think about it.
(In reply to Beniamino Galvani from comment #5) > What concerns me is that it adds complexity for a corner case that can > be easily worked around in a different way. That is true, I would still do it, but no strong opinion. (at least parts of the branch are worthy in any cas). > ifcfg-rh: rename write_bonding_setting() to write_bond_setting() "trivial"? > ifcfg-rh: always write DEVICE for VLAN connections + tmp = nm_utils_new_vlan_name (nm_setting_vlan_get_parent (s_vlan), + nm_setting_vlan_get_id (s_vlan)); nm_setting_vlan_get_parent() can return NULL, which would trigger an assertion. > ifcfg-rh: add VID together with VLAN_ID DEVICE=super-vlan -VLAN_ID=44 +VID=44 PHYSDEV=eth9 this looks wrong. Does the test pass? Otherwise, I am in favor of this patch.
(In reply to Thomas Haller from comment #6) > > ifcfg-rh: rename write_bonding_setting() to write_bond_setting() > > "trivial"? Ok. > > ifcfg-rh: always write DEVICE for VLAN connections > > + tmp = nm_utils_new_vlan_name (nm_setting_vlan_get_parent (s_vlan), > + nm_setting_vlan_get_id (s_vlan)); > > nm_setting_vlan_get_parent() can return NULL, which would trigger an > assertion. Right, fixed. If the parent is NULL, we can't determine the VLAN interface name here and thus DEVICE is not added to the file. > > ifcfg-rh: add VID together with VLAN_ID > > DEVICE=super-vlan > -VLAN_ID=44 > +VID=44 > PHYSDEV=eth9 > > this looks wrong. Does the test pass? Yes. This is to test that the reader can parse VID instead of VLAN_ID.
bg/vlan-device-ifcfg-rh1400649 lgtm
Since Lubomir expressed some concerns about the branch and I tend to agree that it adds hacks for an unsupported scenario, I'm closing this as WONTFIX. A simple way to achieve compatibility with initscripts in this case is to explicitly specify the interface name with nmcli: nmcli con add type vlan dev bond0 id 12 ifname bond0.12
Created attachment 1330948 [details] [patch] rh1400649-ifcfg-vlan-compat.patch Attaching the rejected patches to the bug. the apply on upstream commit 7070d17cedd09d07f12ce977dd1e16cecf8d4b45