Hide Forgot
Description of problem: Cannot create `vlan0` holding vlan ID 10. Both the kernel and nmcli shows the VLAN ID is 0. Version-Release number of selected component (if applicable): NetworkManager-1.29.6-27413.copr.ec7d8ddb29.el8.x86_64 How reproducible: 100% Steps to Reproduce: sudo nmcli connection add type dummy \ con-name dummy0 ifname dummy0 \ ipv4.method disabled ipv6.method disabled sudo nmcli connection add type vlan \ con-name vlan0 ifname vlan0 vlan.parent dummy0 \ vlan.id 10 ipv4.method disabled ipv6.method disabled sudo ip -D link show vlan0 sudo nmcli -f 'vlan.id' c show vlan0 Actual results: Both nmcli and ip shows the VLAN ID is 0. Expected results: Both nmcli and ip shows the VLAN ID is 10. Additional info:
Created attachment 1739364 [details] System logs with NM trace enabled
Please be aware kernel has no limitation on `vlan0` name: sudo ip link add link eth1 name vlan0 type vlan id 101
This happens because the ifcfg-rh connection reader tries to be backwards compatible with initscripts: https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/blob/1.30.2/src/core/settings/plugins/ifcfg-rh/nms-ifcfg-rh-reader.c#L6077 At least for connections written by NM, this compatibility adjustment should not be done.
(In reply to Beniamino Galvani from comment #3) > This happens because the ifcfg-rh connection reader tries to be backwards > compatible with initscripts: > > https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/blob/1.30.2/ > src/core/settings/plugins/ifcfg-rh/nms-ifcfg-rh-reader.c#L6077 > > At least for connections written by NM, this compatibility adjustment should > not be done. this needs careful considerations. Possibly we need to introduce a new key "NM_VLAN_ID" that does take precedence over the id from the interface name.
Can we just document this as a known limitation and recommend to use the keyfile plugin for such connections if necessary?
(In reply to Till Maas from comment #5) > Can we just document this as a known limitation and recommend to use the > keyfile plugin for such connections if necessary? I don't thinks so, because if you are in a situation where this hits you, it's not at all clear where to find that documentation. Also, what's nmstate (in this example) gonna do? Fail? Report an error to RHV/OSP? And then RHV/OSP logs an error? Log a message to use keyfile plugin? That's all high effort, hard to understand (and describe) and cumbersome to workaround. In general, when NetworkManager persists a profile to disk, then it must be able to do so without loss of information. There are a few exceptions, for example regarding certificate blobs. But in general, this is really important. Luckily, the fix is simple...
WIP: https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/794
Can we instead drop the compatibily code altogether? I mean upstream. So that we don't need to complicate it further with additional ifcfg-rh variables. For RHEL 8 we need to carry a simple downstream patch to re-introduce the compat workaround. We have done the same in the past to preserve compatibility when we changed behavior upstream (as for the mac=preserve/permanent, or for slaves-order=name/ifindex). RHEL 9 will be aligned to upstream and we need to document the change.
(In reply to Beniamino Galvani from comment #8) > Can we instead drop the compatibily code altogether? I mean upstream. So > that we don't need to complicate it further with additional ifcfg-rh > variables. > > For RHEL 8 we need to carry a simple downstream patch to re-introduce the > compat workaround. We have done the same in the past to preserve > compatibility when we changed behavior upstream (as for the > mac=preserve/permanent, or for slaves-order=name/ifindex). > > RHEL 9 will be aligned to upstream and we need to document the change. That's possible. It still leaves the bug on rhel-8. How would you address it there? Do you mean to only merge the patch to rhel-8? I disagree with the term "complicated". Sure, ifcfg is a mess, it's inconsistent etc. But you can `git grep '\<VLAN_ID'` and get a clear(?) picture of how the reader/writer interpret the format. Also the documentation of the property is not really(?) complicated. I mean, if you write an ifcfg file by hand and you read `man nm-settings-ifcfg-rh` and have hope to understand it (if not, let's improve the wording there). I'd reserve words like "complicated" for NMDevice and NMManager. Because thos is orders of magnitude more complicated. This patch, is something I did in 20 minutes and I can understand it. I am very sure you can review and understand it also in 20 minutes. Patches like 8df3ef5063cf0f8ee904f8fd39e9e64cc60836eb I do not understand despite trying. Is this fix ugly? Yes! So maybe something else should be done. On the upside, it's backward compatible, which leads to ugliness but has important benefits.
> Patches like 8df3ef5063cf0f8ee904f8fd39e9e64cc60836eb I do not understand despite trying. I meant https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/commit/190fd9aa9f3fbf5705c2b80b9fc64c89d22b7593.
(In reply to Thomas Haller from comment #9) > It still leaves the bug on rhel-8. How would you address it there? > Do you mean to only merge the patch to rhel-8? What I mean is that the direction that we should pursue upstream is to remove such compatibility hacks. If a ifcfg file contains: PHYSDEV=vlan0 VLAN_ID=80 it doesn't make any sense to ignore the VLAN id, since it's set explicitly for a reason. I think in the long term not having the workaround is a better solution; however I do realize that having your patch as RHEL8-only is painful.
I understood that part. You want to have a nice fix upstream (which changes behavior). My question was, what exactly do you suggest for rhel-8? The patch that I opened? Or WONTFIX? Also, there is a desire that upgrade for rhel-8 to rhel-9 works without problems. How do we address the change in behavior when switching from one version of NetworkManager to another? With a ReleaseNote entry? Having a downstream-only patch in rhel-8 by itself is not a problem. The problem is that there is a behavior associated with this patch, and whenever you upgrade to a different behavior (e.g. from rhel-8 to rhel-9), it's painful. Originally this behavior was added to preserve backward compatibility. Back then apparently backward compatibility with initscripts was important. That might have changed. We don't really care about backward compat with initscripts. But we care that our on-disk format for connection profiles is stable. And changing how we interpret the flags in an ifcfg file is a breaking change. Long term we want to get rid of ifcfg-rh altogether. That's why I don't mind some ugly solution mid-term, in particular because it's backward compatible. These are just some arguments. I don't have a strong opinion, except that I would tend to care less about the ugliness and more about changing behavior. Btw, the patch addes VLAN_ID_USE=yes, instead of VLAN_ID2=. The reason for that is not only backward compatiblity, but "forward compatibility". Not sure that is the right word but I mean that you can write a profile with a future version that writes VLAN_ID_USE=yes, and downgrade to an old version, and chances are that it's interpreted the same way. That's achieved by: VLAN_ID=10 VLAN_ID_USE=yes That could have also been achieved also by writing VLAN_ID=10 VLAN_ID2=10 but then we write redundant information, which seems more ugly. If we don't care about this kind of forward compatibility, we could simply write: VLAN_ID2=10 and treat the old profiles as before. That is less ugly than !794, more backward compatible than changing behavior, and not "forward compatible". So, there are also alternatives to !794 that break compatibility less.
> So, there are also alternatives to !794 that break compatibility less I meant: there are also alternatives to !794 that break compatibility more/differently, but less then what you might have in mind.
> PHYSDEV=vlan0 > VLAN_ID=80 > > it doesn't make any sense to ignore the VLAN id, since it's set explicitly for a reason. OK, good point. I'll update the MR (next week)
(In reply to Thomas Haller from comment #12) > My question was, what exactly do you suggest for rhel-8? > The patch that I opened? Or WONTFIX? I would use your patch in RHEL8. > Also, there is a desire that upgrade for rhel-8 to rhel-9 works > without problems. How do we address the change in behavior when switching > from one version of NetworkManager to another? With a ReleaseNote entry? Yes, I think a release note is enough. If necessary, we can add a Leapp module to convert files during the upgrade from 8 to 9. > These are just some arguments. I don't have a strong opinion, except that I would > tend to care less about the ugliness and more about changing behavior. This is a good point. Ugliness doesn't cause more burden for us, while changing behavior does (as we need to take care about downstream patches, release notes, migration). So I'm reconsidering my suggestion, and I'm fine with having your patch both upstream and in RHEL. BTW, you current branch is how I would have done it. My observation on VLAN_ID= was only about the current state (e.g. git master) vs the ideal one.
fixed upstream by https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/commit/5d6532f2d7c203c2d70c0e328806dcb0b0ea53bc This is now considered simply a bug(fix), and not something where "backward compatibilty" is affected (see commit message).
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: NetworkManager security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:4361