1427482 – NetworkManager doesn't see vlan team-slaves after reboot


Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	NetworkManager
Sub Component:
Version:	7.3
Hardware:	All
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Thomas Haller
QA Contact:	Desktop QE
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2017-02-28 10:47 UTC by Stijn De Smet
Modified:	2017-08-01 09:24 UTC
CC List:	8 users
Fixed In Version:	NetworkManager-1.8.0-0.4.rc1.el7
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-08-01 09:24:38 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2017:2299	0	normal	SHIPPED_LIVE	Moderate: NetworkManager and libnl3 security, bug fix and enhancement update	2017-08-01 12:40:28 UTC

Description Stijn De Smet 2017-02-28 10:47:42 UTC

Description of problem:
Creating team slaves on top of a VLAN works as expected. After the server is rebooted, however, each slave is no longer recognized as connection type 'vlan' but as '802-3-ethernet', and it no longer works.

Version-Release number of selected component (if applicable):
1.4.0-14.el7_3

How reproducible:
always

Steps to Reproduce:
1. nmcli con add type team con-name team142 ifname team142 ip4 10.106.142.57/24 ipv6.method ignore
2. nmcli con add type vlan slave-type team con-name team-slave-enp31s0f1-142 ifname enp31s0f1-142 dev enp31s0f1 id 142 master team142
3. nmcli con add type vlan slave-type team con-name team-slave-enp36s0f1-142 ifname enp36s0f1-142 dev enp36s0f1 id 142 master team142
4. reboot
5. After the reboot, the slaves no longer work:
[root@localhost ~]# nmcli con up team-slave-enp31s0f1-142
Error: Connection activation failed: No suitable device found for this connection.
[root@localhost ~]# 

Actual results:
[root@localhost ~]# nmcli c
NAME                      UUID                                  TYPE            DEVICE      
team142                   60da3835-f6a5-47b5-b49f-e620018d546f  team            team142     
team-slave-enp31s0f1-142  c03fe54c-e3fb-4ea0-b342-4108dc455dba  802-3-ethernet  --          
team-slave-enp36s0f1-142  57844fd0-234a-4e22-97d7-454a07ecf284  802-3-ethernet  --          


Expected results:
[root@localhost ~]# nmcli c
NAME                      UUID                                  TYPE            DEVICE      
team-slave-enp31s0f1-142  c03fe54c-e3fb-4ea0-b342-4108dc455dba  vlan            enp31s0f1-142 
team-slave-enp36s0f1-142  57844fd0-234a-4e22-97d7-454a07ecf284  vlan            enp36s0f1-142 
team142                   60da3835-f6a5-47b5-b49f-e620018d546f  team            team142      

Additional info:
The same behaviour is observed when these interfaces are created with nmtui.
Since I needed this working, I looked at the NetworkManager source code, and the problem appears to be in the interpretation of the Red Hat-specific portion:
NetworkManager-1.4.0/src/settings/plugins/ifcfg-rh/reader.c:
At line 5011, a connection that is a team slave is simply assigned TYPE_ETHERNET, without any checks.

I copied the vlan checks there and recompiled the package, and this solved the problem for me.

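char *device;

device = svGetValueString (parsed, "DEVICE");
/* A team slave with a VLAN configuration must keep the vlan type. */
if (device && is_vlan_device (device, parsed))
	type = g_strdup (TYPE_VLAN);
else
	type = g_strdup (TYPE_ETHERNET);
/* The original snippet freed device only in the VLAN branch;
 * free it on both paths (g_free() accepts NULL). */
g_free (device);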
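
The effect of the check above can be illustrated outside of NetworkManager with a small shell sketch. This is not the plugin's code: the helper names `ifcfg_get` and `ifcfg_guess_type` are made up for this example, and the sketch assumes that a `VLAN=yes` line is what marks a VLAN profile in an ifcfg file.

```shell
# Hypothetical helpers, not part of NetworkManager.

# Print the value of KEY ($2) from the ifcfg-style file $1.
# The last assignment wins; surrounding double quotes are removed.
ifcfg_get() {
    sed -n "s/^$2=//p" "$1" | tail -n 1 | sed 's/^"\(.*\)"$/\1/'
}

# Mirror the patched branch for team slaves: report "vlan" when the
# profile names a device and sets VLAN=yes (assumption), otherwise
# fall back to "802-3-ethernet".
ifcfg_guess_type() {
    device=$(ifcfg_get "$1" DEVICE)
    if [ -n "$device" ] && [ "$(ifcfg_get "$1" VLAN)" = "yes" ]; then
        echo vlan
    else
        echo 802-3-ethernet
    fi
}
```

For a profile that contains `VLAN=yes` and `DEVICE=enp31s0f1-142`, `ifcfg_guess_type` prints `vlan`. Without the `VLAN=yes` line it prints `802-3-ethernet`, which is what the unpatched reader reported for every team slave.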

Comment 2 Thomas Haller 2017-02-28 11:30:41 UTC

The ifcfg-rh plugin cannot correctly write and then read back the team slave connection.
A reboot is not needed to reproduce this:



nmcli con add type team con-name team142 ifname team142 ip4 10.106.142.57/24 ipv6.method ignore autoconnect no
nmcli con add type vlan slave-type team con-name team-slave-enp31s0f1-142 ifname enp31s0f1-142 dev enp31s0f1 id 142 master team142 autoconnect no

nmcli connection show team-slave-enp31s0f1-142 > c-1
nmcli connection reload
nmcli connection show team-slave-enp31s0f1-142 > c-2

diff c-?
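
The same comparison can be narrowed down to the connection type alone. The following is a rough sketch, not a tested tool: `roundtrip_type_check` is a hypothetical helper name, and it assumes that `nmcli -t -f connection.type connection show ID` prints a single `connection.type:VALUE` line on the installed nmcli version.

```shell
# Hypothetical helper: compare connection.type before and after a reload.
# Returns 0 if the type is unchanged, 1 if it changed, 2 if nmcli failed.
roundtrip_type_check() {
    id=$1
    before=$(nmcli -t -f connection.type connection show "$id") || return 2
    nmcli connection reload || return 2
    after=$(nmcli -t -f connection.type connection show "$id") || return 2
    if [ "$before" = "$after" ]; then
        echo "ok: $id kept $before"
    else
        echo "mismatch: $id changed from $before to $after"
        return 1
    fi
}
```

Called as `roundtrip_type_check team-slave-enp31s0f1-142` right after creating the connection, a non-zero exit status would indicate that the profile did not survive being written and read back.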




# cat /etc/sysconfig/network-scripts/ifcfg-team-slave-enp31s0f1-142 
VLAN=yes
DEVICE=enp31s0f1-142
PHYSDEV=enp31s0f1
VLAN_ID=142
REORDER_HDR=yes
GVRP=no
MVRP=no
NAME=team-slave-enp31s0f1-142
UUID=bffcb0c3-a6dd-4ceb-8c84-1bc569bca556
ONBOOT=no
TEAM_MASTER=team142
DEVICETYPE=TeamPort
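
To find other profiles with the same combination of settings, a sketch along these lines might help. `list_vlan_team_ports` is a hypothetical helper name, and the matching is deliberately simple: it only looks for a `VLAN=yes` line together with a `TEAM_MASTER=` line, as in the file above.

```shell
# Hypothetical helper: list ifcfg files that describe a VLAN used as a
# team port, i.e. files with both VLAN=yes and a TEAM_MASTER= assignment.
# $1 is the directory to scan (defaults to the usual ifcfg location).
list_vlan_team_ports() {
    dir=${1:-/etc/sysconfig/network-scripts}
    for f in "$dir"/ifcfg-*; do
        [ -f "$f" ] || continue
        if grep -Eq '^VLAN="?yes"?$' "$f" && grep -q '^TEAM_MASTER=' "$f"; then
            echo "$f"
        fi
    done
}
```

Every file it prints is a candidate for being re-read as '802-3-ethernet' on an affected version.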

Comment 3 Thomas Haller 2017-02-28 13:42:20 UTC

https://cgit.freedesktop.org/NetworkManager/NetworkManager/log/?h=th/ifcfg-reread-rh1427482

Comment 4 Thomas Haller 2017-02-28 13:50:06 UTC

As a workaround, you could use the keyfile plugin.

NM doesn't allow you to move the connection from the ifcfg-rh settings plugin to the keyfile plugin, but you can just create a file like



cat <<EOF > /etc/NetworkManager/system-connections/team-slave-enp31s0f1-142
[vlan]
parent=foo
EOF
chmod 600 /etc/NetworkManager/system-connections/team-slave-enp31s0f1-142
nmcli connection reload



and from then on, modify the connection as you like:

  nmcli con modify id team-slave-enp31s0f1-142 slave-type team con-name \
        team-slave-enp31s0f1-142 ifname enp31s0f1-142 dev enp31s0f1 id 142 \
        master team142
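
Instead of starting from a stub and modifying it afterwards, the keyfile could also be written with all relevant settings in one step. The following is only a sketch under assumptions: `write_vlan_port_keyfile` is a hypothetical helper, the keys are the keyfile spellings of the `connection` and `vlan` properties used in the nmcli commands above, and no `uuid` is written, so it relies on NetworkManager assigning one itself, which should be verified on the installed version.

```shell
# Hypothetical helper: write a keyfile for a VLAN that is a team port.
# Arguments: directory, connection id, VLAN interface name, parent device,
# VLAN id, team master.
write_vlan_port_keyfile() {
    dir=$1 name=$2 ifname=$3 parent=$4 vid=$5 master=$6
    file="$dir/$name"
    (
        umask 077    # a newly created file gets no group/other access
        printf '%s\n' \
            '[connection]' \
            "id=$name" \
            'type=vlan' \
            "interface-name=$ifname" \
            "master=$master" \
            'slave-type=team' \
            '' \
            '[vlan]' \
            "parent=$parent" \
            "id=$vid" \
            > "$file"
    ) || return 1
    # Also tighten permissions if the file already existed.
    chmod 600 "$file"
}
```

A call such as `write_vlan_port_keyfile /etc/NetworkManager/system-connections team-slave-enp31s0f1-142 enp31s0f1-142 enp31s0f1 142 team142`, followed by `nmcli connection reload`, would correspond to the manual steps above.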

Comment 5 Beniamino Galvani 2017-02-28 22:24:35 UTC

> ifcfg-rh: change "goto error" pattern to return early and nm_auto*

 write_route_file_legacy (const char *filename, NMSettingIPConfig *s_ip4, GError **error)
 {
 [...]
        if (!g_file_set_contents (filename, route_contents, -1, NULL)) {
                g_set_error (error, NM_SETTINGS_ERROR, NM_SETTINGS_ERROR_FAILED,
                             "Writing route file '%s' failed", filename);
-               goto error;
+               return TRUE;

return FALSE?


> ifcfg-rh: re-read connection after write and compare the result

Perhaps the check should only be performed when NM is compiled with
more assertions? In other words, the question is: do we want users to
see such warnings in the logs and report them (even if they might be
harmless from user point of view, like the one about missing wired
setting on reread)?

The rest LGTM.

Comment 6 Lubomir Rintel 2017-03-02 09:53:32 UTC

LGTM (apart from what Beniamino already pointed out)

Comment 7 Thomas Haller 2017-03-02 11:22:48 UTC

(In reply to Beniamino Galvani from comment #5)

> > ifcfg-rh: re-read connection after write and compare the result
> 
> Perhaps the check should only be performed when NM is compiled with
> more assertions? In other words, the question is: do we want users to
> see such warnings in the logs and report them (even if they might be
> harmless from user point of view, like the one about missing wired
> setting on reread)?

There are two reasons why always logging a warning, as the patch did, is wrong:

 (1) during write, certificate blobs are converted to paths, and (as ugly
   as it is to do that), it is a valid situation where a re-read
   connection differs
 (2) contrary to the keyfile implementation, for ifcfg we re-read the
   connection from file. Thus, there is a race where somebody might
   modify the connection after we write it and before reading it back.

I consider the other cases bugs, so they are not *harmless* at all. However, logging a warning about a bug isn't helpful to the user either; the bug should be fixed instead.

TL;DR: yes, now no warnings are logged.


Btw, (2) is the reason why I don't do the equivalent of "keyfile: updated connection when writing keyfile" for ifcfg-rh. I really wish we did, but that would require the writer to first create an in-memory representation of the files and parse that back in without accessing the files on disk, as the keyfile writer does.
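
The idea of parsing back an in-memory representation before touching the disk can be sketched in shell. This is an illustration of the approach only, not how ifcfg-rh or the keyfile writer is implemented: all three helper names are hypothetical, and the "type" check is reduced to looking at `VLAN=yes`.

```shell
# Hypothetical helpers illustrating "serialize in memory, parse back,
# then write".

# Build ifcfg text from KEY=VALUE arguments, one assignment per line.
ifcfg_serialize() {
    printf '%s\n' "$@"
}

# Read KEY ($2) from ifcfg text held in a variable ($1), not from a file.
ifcfg_parse() {
    printf '%s\n' "$1" | sed -n "s/^$2=//p" | tail -n 1
}

# Write the profile to $1 only if the type that would be read back from
# the in-memory text matches the intended type $2. Because the check never
# reads the file, nobody can modify the data between write and re-read.
write_checked() {
    file=$1 want=$2
    shift 2
    text=$(ifcfg_serialize "$@")
    if [ "$(ifcfg_parse "$text" VLAN)" = "yes" ]; then
        got=vlan
    else
        got=802-3-ethernet
    fi
    if [ "$got" != "$want" ]; then
        echo "would re-read as $got, wanted $want" >&2
        return 1
    fi
    printf '%s\n' "$text" > "$file"
}
```

With this shape, a mismatch is detected before anything is written, so the race described in (2) does not arise.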


How about now?

Comment 8 Beniamino Galvani 2017-03-02 12:58:08 UTC

LGTM now.

Comment 9 Thomas Haller 2017-03-02 14:11:40 UTC

Thanks. Merged: https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=9d39569287f6770b951e30f68d88e14f9ec68ac7

Comment 11 errata-xmlrpc 2017-08-01 09:24:38 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2299
