Bug 1741678

Summary: udevd attempts to rename VLAN devices to the name of the base interface
Product: [Fedora] Fedora
Reporter: Chris Siebenmann <cks-rhbugzilla>
Component: systemd
Assignee: systemd-maint
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 30
CC: lnykryn, msekleta, s, steved424, systemd-maint, zbyszek
Last Closed: 2020-05-26 18:39:44 UTC
Type: Bug
Attachments:
  output from 'systemd-analyze cat-config udev/rules.d' (flags: none)
  output from 'SYSTEMD_LOG_LEVEL=debug udevadm test /sys/class/net/em-dev2' (flags: none)

Description Chris Siebenmann 2019-08-15 19:39:29 UTC
Description of problem:
I have VLANs on top of a physical network, managed by networkd. In Fedora 30,
which I just upgraded to, something in the stack of systemd, networkd, and
udevd attempts to rename all of my VLAN devices (eg 'em-dev2') to the name
of the underlying physical device ('em0'). Naturally this fails, since the
underlying physical device is already using the name. One consequence of
this failure is that a 'sys-subsystem-net-devices-<name>.device' systemd
unit never appears for the VLAN devices, so any other units that depend
on those units fail (they may depend on them so that they can delay
starting until my VLANs are set up, for example).

This doesn't appear to depend on the specific form of the VLAN network
names; in the process of attempting to deal with this, I renamed them
from a 'em0.151' form to 'em-dev2'.

The devices are fully configured despite this problem, with IP addresses
and so on. I am not certain if there are other side effects beyond the
systemd unit failure.

This very much did not happen in Fedora 29.

Version-Release number of selected component (if applicable):

systemd-udev-241-10.git511646b.fc30.x86_64
systemd-241-10.git511646b.fc30.x86_64

How reproducible:

Completely. It happens every time the system boots.

Actual results:

Log messages of the form:
Aug 15 15:08:58 hawkwind.cs systemd-udevd[1082]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Aug 15 15:08:58 hawkwind.cs systemd-udevd[1082]: Using default interface naming scheme 'v240'.
Aug 15 15:08:58 hawkwind.cs systemd-udevd[1082]: em-dev2: Failed to rename network interface 4 from 'em-dev2' to 'em0': File exists

Despite this error, systemd-networkd will later report 'em-dev2: netdev ready'.

Additional information:

'udevadm info /sys/class/net/em-dev2' reports:
P: /devices/virtual/net/em-dev2
L: 0
E: DEVPATH=/devices/virtual/net/em-dev2
E: DEVTYPE=vlan
E: INTERFACE=em-dev2
E: IFINDEX=4
E: SUBSYSTEM=net
E: USEC_INITIALIZED=6992333
E: ID_MM_CANDIDATE=1
E: ID_NET_DRIVER=802.1Q VLAN Support
E: ID_NET_LINK_FILE=/etc/systemd/network/10-em0-amd.link
E: ID_NET_NAME=em0
E: SYSTEMD_ALIAS=/sys/subsystem/net/devices/em0
E: TAGS=:systemd:

It may be relevant that my base device has multiple VLANs and an assigned IP
address (for untagged traffic). I also set 'biosdevname=0' on the kernel
command line; networkd finds the interface it will name em0 by MAC address.

Comment 1 Zbigniew Jędrzejewski-Szmek 2019-12-21 17:10:07 UTC
I think this is most likely caused by some "rogue" udev rule provided externally.
Please attach the output of "systemd-analyze cat-config udev/rules.d"
and "sudo SYSTEMD_LOG_LEVEL=debug udevadm test /sys/class/net/em-dev2".

Comment 2 Chris Siebenmann 2019-12-24 21:28:44 UTC
I've upgraded to Fedora 31 since this report, but the issue appears to still be present based on the 'failed to rename' messages still being present in journalctl logs. I'll attach the two files of output to this issue.

Comment 3 Chris Siebenmann 2019-12-24 21:29:31 UTC
Created attachment 1647554
output from 'systemd-analyze cat-config udev/rules.d'

Comment 4 Chris Siebenmann 2019-12-24 21:30:03 UTC
Created attachment 1647555
output from 'SYSTEMD_LOG_LEVEL=debug udevadm test /sys/class/net/em-dev2'

Comment 5 Chris Siebenmann 2019-12-24 21:39:27 UTC
I think I see what is happening here from the udevadm output. The VLAN devices have the same MAC as the underlying physical device, and so they match the /etc/systemd/network/10-em0-amd.link file that recognizes the physical device and gives it a name:

# ASUS Prime X370-PRO AMD motherboard onboard port, for the new hardware.
[Match]
MACAddress=60:45:cb:a0:e8:dd

[Link]
Description=Onboard port
MACAddressPolicy=persistent
Name=em0

I think that networkd or udevd must have changed how matching was handled in some version, so that now this should have a 'Type=' of some value. But I don't know what 'Type=' to use here; I can't see what DEVTYPE udev assigns for physical ports in eg 'udevadm test' of one.

Comment 6 Zbigniew Jędrzejewski-Szmek 2019-12-25 12:39:31 UTC
Yeah, in systemd-241 we added a new NamePolicy=keep value. The default .link file
was changed to have "keep", while the general idea was that user-provided files would
not include "keep", so that they would rename interfaces even if they were previously
given a name by userspace. This was done because users were confused and annoyed when
their explicit configuration was ignored. But for your case, the effect is bad.

I think the easiest solution would be to add 'NamePolicy=keep' to your .link file.
This should be enough to keep the vlan interfaces from being renamed. Please let
me know if that works for you.

I'm not sure if we need to make changes upstream. We didn't consider this case when
making the change, but at this point I'm not sure if changing things yet again is
better than keeping them stable.

Comment 7 Chris Siebenmann 2020-01-06 17:57:30 UTC
Adding 'NamePolicy=keep' to my 10-em0-amd.link (in the '[Link]' section) appears to have resolved this issue for me; I no longer see logged complaints about 'failed to rename ...' and the systemd sys-subsystem-net-devices-<...>.device units are now created and shown as active, which they weren't before.
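
For reference, a sketch of what the amended file would look like, assuming the 10-em0-amd.link shown in comment 5 is otherwise unchanged (the explanatory comment lines are illustrative additions, not part of the reported file):

```ini
# /etc/systemd/network/10-em0-amd.link
[Match]
MACAddress=60:45:cb:a0:e8:dd

[Link]
Description=Onboard port
MACAddressPolicy=persistent
# Keep a name that userspace already assigned (for example the VLAN
# names given at creation time); otherwise fall back to Name= below.
NamePolicy=keep
Name=em0
```

With this in place the VLAN devices still match the file by MAC address, but they are expected to keep the names they were created with, while the physical port should still be renamed to em0.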

Comment 8 Ben Cotton 2020-04-30 20:13:44 UTC
This message is a reminder that Fedora 30 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 30 on 2020-05-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '30'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 30 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 9 Ben Cotton 2020-05-26 18:39:44 UTC
Fedora 30 changed to end-of-life (EOL) status on 2020-05-26. Fedora 30 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 10 Steve D 2021-11-28 15:10:28 UTC
Having just hit this, I can confirm setting Type= appropriately works as well, and may be less brittle. "networkctl status <interface>" shows the type.
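
A sketch of that alternative, assuming the physical port is reported with type 'ether' on the systemd version in use (the value is an assumption, not taken from this report; confirm it with "networkctl status em0" first). The rest of the file mirrors the one from comment 5:

```ini
# /etc/systemd/network/10-em0-amd.link
[Match]
MACAddress=60:45:cb:a0:e8:dd
# Assumed value: limits the match to plain Ethernet devices, so VLAN
# devices that share the MAC address no longer match this file.
Type=ether

[Link]
Description=Onboard port
MACAddressPolicy=persistent
Name=em0
```

The difference from the NamePolicy=keep approach is that the VLAN devices should no longer match this .link file at all, so none of its [Link] settings are applied to them.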

Comment 11 Steve D 2021-12-04 11:05:31 UTC
For completeness, relevant upstream bug reports:

https://github.com/systemd/systemd/issues/11992
https://github.com/systemd/systemd/issues/11921

For me, this issue seemed to be having a knock-on effect of preventing complete vlan initialization in a similar way to:

https://github.com/systemd/systemd/issues/12308

(though the underlying cause in that report is different)