Bug 1427482 - NetworkManager doesn't see vlan team-slaves after reboot
Summary: NetworkManager doesn't see vlan team-slaves after reboot
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: NetworkManager
Version: 7.3
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Thomas Haller
QA Contact: Desktop QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-02-28 10:47 UTC by Stijn De Smet
Modified: 2017-08-01 09:24 UTC (History)

Fixed In Version: NetworkManager-1.8.0-0.4.rc1.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-01 09:24:38 UTC




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2017:2299 normal SHIPPED_LIVE Moderate: NetworkManager and libnl3 security, bug fix and enhancement update 2017-08-01 12:40:28 UTC

Description Stijn De Smet 2017-02-28 10:47:42 UTC
Description of problem:
When team slaves are created on top of VLANs, everything works as expected at first. After the server is rebooted, however, the slave connection is no longer recognized as type 'vlan'; it is seen as type '802-3-ethernet' instead, and it no longer works.

Version-Release number of selected component (if applicable):
1.4.0-14.el7_3

How reproducible:
always

Steps to Reproduce:
1. nmcli con add type team con-name team142 ifname team142 ip4 10.106.142.57/24 ipv6.method ignore
2. nmcli con add type vlan slave-type team con-name team-slave-enp31s0f1-142 ifname enp31s0f1-142 dev enp31s0f1 id 142 master team142
3. nmcli con add type vlan slave-type team con-name team-slave-enp36s0f1-142 ifname enp36s0f1-142 dev enp36s0f1 id 142 master team142
4. reboot
5. After the reboot, the slaves no longer work:
[root@localhost ~]# nmcli con up team-slave-enp31s0f1-142
Error: Connection activation failed: No suitable device found for this connection.
[root@localhost ~]# 

Actual results:
[root@localhost ~]# nmcli c
NAME                      UUID                                  TYPE            DEVICE      
team142                   60da3835-f6a5-47b5-b49f-e620018d546f  team            team142     
team-slave-enp31s0f1-142  c03fe54c-e3fb-4ea0-b342-4108dc455dba  802-3-ethernet  --          
team-slave-enp36s0f1-142  57844fd0-234a-4e22-97d7-454a07ecf284  802-3-ethernet  --          


Expected results:
[root@localhost ~]# nmcli c
NAME                      UUID                                  TYPE            DEVICE      
team-slave-enp31s0f1-142  c03fe54c-e3fb-4ea0-b342-4108dc455dba  vlan            enp31s0f1-142 
team-slave-enp36s0f1-142  57844fd0-234a-4e22-97d7-454a07ecf284  vlan            enp36s0f1-142 
team142                   60da3835-f6a5-47b5-b49f-e620018d546f  team            team142      

Additional info:
The same behaviour is observed when these interfaces are created with nmtui.
Since I needed this working, I looked at the NetworkManager source code. The problem appears to be in how the Red Hat-specific ifcfg-rh plugin interprets the file:
NetworkManager-1.4.0/src/settings/plugins/ifcfg-rh/reader.c, line 5011: when the connection is a team slave, it is simply assigned TYPE_ETHERNET, without any further checks.

I copied the vlan checks there and recompiled the package, and this solved the problem for me.

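char *device;

device = svGetValueString (parsed, "DEVICE");
/* A team slave may itself be a VLAN, so check for that first. */
if (device && is_vlan_device (device, parsed))
	type = g_strdup (TYPE_VLAN);
else
	type = g_strdup (TYPE_ETHERNET);
/* g_free() accepts NULL; freeing here avoids a leak in the else branch. */
g_free (device);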

Comment 2 Thomas Haller 2017-02-28 11:30:41 UTC
The ifcfg-rh plugin cannot correctly write and read back the team slave connection.
A reboot is not needed to reproduce this; reloading the connections is enough:



nmcli con add type team con-name team142 ifname team142 ip4 10.106.142.57/24 ipv6.method ignore autoconnect no
nmcli con add type vlan slave-type team con-name team-slave-enp31s0f1-142 ifname enp31s0f1-142 dev enp31s0f1 id 142 master team142 autoconnect no

nmcli connection show team-slave-enp31s0f1-142 > c-1
nmcli connection reload
nmcli connection show team-slave-enp31s0f1-142 > c-2

diff c-?
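
To narrow the comparison to the one field this bug is about, the two dumps can be reduced to their connection.type line. The following is only a sketch: the function names `get_conn_type` and `type_changed` are made up for illustration, and it assumes the dumps use the default `property: value` layout that `nmcli connection show` prints.

```shell
#!/bin/sh
# Hypothetical helpers (not part of nmcli): compare the connection.type
# recorded in two saved "nmcli connection show NAME" dumps.

# Print the value of the connection.type line of a dump file.
get_conn_type() {
    awk '$1 == "connection.type:" { print $2 }' "$1"
}

# Return 0 and print a message if the type differs between the dumps,
# return 1 if it stayed the same.
type_changed() {
    before=$(get_conn_type "$1")
    after=$(get_conn_type "$2")
    if [ "$before" != "$after" ]; then
        echo "connection.type changed: $before -> $after"
        return 0
    fi
    return 1
}
```

With the files from the steps above this would be called as `type_changed c-1 c-2`.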




# cat /etc/sysconfig/network-scripts/ifcfg-team-slave-enp31s0f1-142 
VLAN=yes
DEVICE=enp31s0f1-142
PHYSDEV=enp31s0f1
VLAN_ID=142
REORDER_HDR=yes
GVRP=no
MVRP=no
NAME=team-slave-enp31s0f1-142
UUID=bffcb0c3-a6dd-4ceb-8c84-1bc569bca556
ONBOOT=no
TEAM_MASTER=team142
DEVICETYPE=TeamPort

Comment 4 Thomas Haller 2017-02-28 13:50:06 UTC
As a workaround, you could use the keyfile plugin.

NM doesn't allow you to move a connection from the ifcfg-rh settings plugin to the keyfile plugin, but you can simply create a file like this:



cat <<EOF > /etc/NetworkManager/system-connections/team-slave-enp31s0f1-142
[vlan]
parent=foo
EOF
chmod 600 /etc/NetworkManager/system-connections/team-slave-enp31s0f1-142
nmcli connection reload



and from then on, modify the connection as you like:

  nmcli con modify id team-slave-enp31s0f1-142 slave-type team con-name \
        team-slave-enp31s0f1-142 ifname enp31s0f1-142 dev enp31s0f1 id 142 \
        master team142

Comment 5 Beniamino Galvani 2017-02-28 22:24:35 UTC
> ifcfg-rh: change "goto error" pattern to return early and nm_auto*

 write_route_file_legacy (const char *filename, NMSettingIPConfig *s_ip4, GError **error)
 {
 [...]
        if (!g_file_set_contents (filename, route_contents, -1, NULL)) {
                g_set_error (error, NM_SETTINGS_ERROR, NM_SETTINGS_ERROR_FAILED,
                             "Writing route file '%s' failed", filename);
-               goto error;
+               return TRUE;

return FALSE?


> ifcfg-rh: re-read connection after write and compare the result

Perhaps the check should only be performed when NM is compiled with
more assertions? In other words, the question is: do we want users to
see such warnings in the logs and report them (even if they might be
harmless from user point of view, like the one about missing wired
setting on reread)?

The rest LGTM.

Comment 6 Lubomir Rintel 2017-03-02 09:53:32 UTC
LGTM (apart from what Beniamino already pointed out)

Comment 7 Thomas Haller 2017-03-02 11:22:48 UTC
(In reply to Beniamino Galvani from comment #5)

> > ifcfg-rh: re-read connection after write and compare the result
> 
> Perhaps the check should only be performed when NM is compiled with
> more assertions? In other words, the question is: do we want users to
> see such warnings in the logs and report them (even if they might be
> harmless from user point of view, like the one about missing wired
> setting on reread)?

There are two reasons why always logging a warning, as the patch did, is wrong:

 (1) during write, certificate blobs are converted to paths, and (as ugly
   as it is to do that) this is a valid situation in which a re-read
   connection differs
 (2) contrary to the keyfile implementation, for ifcfg we re-read the
   connection from file. Thus, there is a race where somebody might
   modify the file after we write it and before we read it back.

I consider the other cases bugs, so they are not *harmless* at all. However, logging a warning about a bug isn't helpful to the user either; the bug should be fixed instead.

TL;DR: yes, now no warnings are logged.


Btw, (2) is the reason why I don't do "keyfile: updated connection when writing keyfile" for ifcfg-rh. I really wish we would, but that would require that the writer first creates an in-memory representation of the files and parses that back in, without accessing the file, like the keyfile writer does.
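
The idea of verifying against an in-memory representation instead of the file on disk can be sketched as follows. This is a toy illustration in shell, not how the keyfile or ifcfg-rh writers are implemented; `render_ifcfg`, `parse_value` and `write_checked` are hypothetical names, and only two keys are handled.

```shell
#!/bin/sh
# Toy sketch: render the settings to text, parse that text back,
# and only then write it out. The file is never re-read, so a
# concurrent modification cannot influence the comparison.

# Render a (tiny) ifcfg body from a device name and a VLAN id.
render_ifcfg() {
    printf 'DEVICE=%s\nVLAN_ID=%s\n' "$1" "$2"
}

# Extract the value of key $1 from the text given in $2.
parse_value() {
    printf '%s\n' "$2" | sed -n "s/^$1=//p"
}

# Write the file only if the rendered text parses back to the input.
write_checked() {
    path=$1 dev=$2 id=$3
    text=$(render_ifcfg "$dev" "$id")
    [ "$(parse_value DEVICE "$text")" = "$dev" ] || return 1
    [ "$(parse_value VLAN_ID "$text")" = "$id" ] || return 1
    printf '%s\n' "$text" > "$path"
}
```

For example, `write_checked /tmp/ifcfg-test enp31s0f1-142 142` writes the two keys after checking that they survive the round trip in memory.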


How about now?

Comment 8 Beniamino Galvani 2017-03-02 12:58:08 UTC
LGTM now.

Comment 11 errata-xmlrpc 2017-08-01 09:24:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2299

