Bug 1390605

Summary: Can't receive ip for vlan devices that were configured with cockpit version 118
Product: Red Hat Enterprise Linux 7 Reporter: Michael Burman <mburman>
Component: cockpit Assignee: Dominik Perpeet <dperpeet>
Status: CLOSED ERRATA QA Contact: qe-baseos-daemons
Severity: high Docs Contact:
Priority: high    
Version: 7.3 CC: dperpeet, fdeutsch, jscotka, mburman, stefw
Target Milestone: rc Keywords: Extras, Regression
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-12-06 17:44:20 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1329957, 1389324    
Attachments:
Description Flags
messages log none

Description Michael Burman 2016-11-01 13:59:01 UTC
Created attachment 1216120
messages log

Description of problem:
An IP address cannot be obtained for VLAN devices that were configured with Cockpit version 118.

When activating a VLAN device that was created in Cockpit, activation fails and no IP address is received from the DHCP server.

This happens only with VLAN devices that are configured using Cockpit.
When I isolated the problem and created the same VLAN device using nmcli or ifcfg-* files, everything worked as expected and the VLAN device received an IP address.

Nov  1 15:44:51 orchid-vds2 NetworkManager[1039]: <info>  [1478007891.2161] device (enp4s0.162): link connected
Nov  1 15:44:51 orchid-vds2 NetworkManager[1039]: <info>  [1478007891.2168] device (enp4s0.162): state change: unavailable -> disconnected (reason 'user-requested') [20 30 39]
Nov  1 15:44:51 orchid-vds2 NetworkManager[1039]: <info>  [1478007891.2185] device (enp4s0.162): Activation: starting connection 'enp4s0.162' (8a5ce148-df5a-4e82-be32-00c561453466)
Nov  1 15:44:51 orchid-vds2 NetworkManager[1039]: <info>  [1478007891.2198] device (enp4s0.162): state change: disconnected -> prepare (reason 'none') [30 40 0]
Nov  1 15:44:51 orchid-vds2 NetworkManager[1039]: <info>  [1478007891.2205] device (enp4s0.162): state change: prepare -> config (reason 'none') [40 50 0]
Nov  1 15:44:51 orchid-vds2 NetworkManager[1039]: <info>  [1478007891.2352] device (enp4s0.162): state change: config -> ip-config (reason 'none') [50 70 0]
Nov  1 15:44:51 orchid-vds2 NetworkManager[1039]: <info>  [1478007891.2357] dhcp4 (enp4s0.162): activation: beginning transaction (timeout in 45 seconds)
Nov  1 15:44:51 orchid-vds2 NetworkManager[1039]: <info>  [1478007891.2375] dhcp4 (enp4s0.162): dhclient started with pid 21605
Nov  1 15:44:51 orchid-vds2 dhclient[21605]: DHCPDISCOVER on enp4s0.162 to 255.255.255.255 port 67 interval 6 (xid=0x23e7d5b3)
Nov  1 15:44:57 orchid-vds2 dhclient[21605]: DHCPDISCOVER on enp4s0.162 to 255.255.255.255 port 67 interval 8 (xid=0x23e7d5b3)
Nov  1 15:45:05 orchid-vds2 dhclient[21605]: DHCPDISCOVER on enp4s0.162 to 255.255.255.255 port 67 interval 15 (xid=0x23e7d5b3)
Nov  1 15:45:20 orchid-vds2 dhclient[21605]: DHCPDISCOVER on enp4s0.162 to 255.255.255.255 port 67 interval 21 (xid=0x23e7d5b3)
Nov  1 15:45:36 orchid-vds2 NetworkManager[1039]: <warn>  [1478007936.1853] dhcp4 (enp4s0.162): request timed out
Nov  1 15:45:36 orchid-vds2 NetworkManager[1039]: <info>  [1478007936.1853] dhcp4 (enp4s0.162): state changed unknown -> timeout
Nov  1 15:45:36 orchid-vds2 NetworkManager[1039]: <info>  [1478007936.2014] dhcp4 (enp4s0.162): canceled DHCP transaction, DHCP client pid 21605
Nov  1 15:45:36 orchid-vds2 NetworkManager[1039]: <info>  [1478007936.2015] dhcp4 (enp4s0.162): state changed timeout -> done
Nov  1 15:45:36 orchid-vds2 NetworkManager[1039]: <info>  [1478007936.2018] device (enp4s0.162): state change: ip-config -> failed (reason 'ip-config-unavailable') [70 120 5]
Nov  1 15:45:36 orchid-vds2 NetworkManager[1039]: <info>  [1478007936.2021] policy: disabling autoconnect for connection 'enp4s0.162'.
Nov  1 15:45:36 orchid-vds2 NetworkManager[1039]: <warn>  [1478007936.2023] device (enp4s0.162): Activation: failed for connection 'enp4s0.162'
Nov  1 15:45:36 orchid-vds2 NetworkManager[1039]: <info>  [1478007936.2028] device (enp4s0.162): state change: failed -> disconnected (reason 'none') [120 30 0]
Nov  1 15:45:36 orchid-vds2 NetworkManager[1039]: <info>  [1478007936.2164] device (enp4s0.162): state change: disconnected -> unmanaged (reason 'user-requested') [30 10 39]
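
The state transitions in the log above can be pulled out of a longer messages file with a small filter. This is only a sketch: the function name nm_state_changes is made up for illustration, and it assumes the NetworkManager log line format shown in this report.

```shell
# Hypothetical helper: print only the state transitions that
# NetworkManager logged for one device.
#   $1 = log file (for example the attached messages log)
#   $2 = device name (for example enp4s0.162)
nm_state_changes() {
    log=$1
    dev=$2
    # -F matches the pattern literally, so the dot in "enp4s0.162"
    # is not treated as a regex wildcard.
    grep -F "device ($dev): state change: " "$log" |
        sed 's/.*state change: //'
}
```

For the log above, the transition of interest is the one ending in "failed" with reason 'ip-config-unavailable', which follows the DHCP timeout.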


- The ifcfg-enp4s0.162 that is generated by cockpit -

[root@orchid-vds2 ~]# cat /etc/sysconfig/network-scripts/ifcfg-enp4s0.162 
VLAN=yes
TYPE=Vlan
DEVICE=enp4s0.162
PHYSDEV=enp4s0
VLAN_ID=162
REORDER_HDR=no
GVRP=no
VLAN_FLAGS=NO_REORDER_HDR
MVRP=no
BOOTPROTO=dhcp
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=no
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=enp4s0.162
UUID=8a5ce148-df5a-4e82-be32-00c561453466
ONBOOT=yes
PEERDNS=yes
PEERROUTES=yes

- The ifcfg-enp4s0.162-1 that is created using nmcli -

[root@orchid-vds2 ~]# cat /etc/sysconfig/network-scripts/ifcfg-enp4s0.162-1 
VLAN=yes
TYPE=Vlan
PHYSDEV=enp4s0
VLAN_ID=162
REORDER_HDR=yes
GVRP=no
MVRP=no
BOOTPROTO=dhcp
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=enp4s0.162-1
UUID=88f2f4dc-aac9-4fda-a889-4a2f79b710f3
ONBOOT=yes
PEERDNS=yes
PEERROUTES=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes


Version-Release number of selected component (if applicable):
cockpit-ws-118-2.el7.x86_64
cockpit-shell-118-2.el7.noarch
cockpit-bridge-118-2.el7.x86_64
cockpit-storaged-118-2.el7.noarch
cockpit-ovirt-dashboard-0.10.6-1.4.1.el7ev.noarch
NetworkManager-1.4.0-12.el7.x86_64
rhvh-4.0-0.20161027.0+1
rhel7.3
kernel 3.10.0-514.el7.x86_64


How reproducible:
100% with version 118

Steps to Reproduce:
1. Install an RHV-H server with Cockpit version 118
2. Create a VLAN device using Cockpit with IPv4 set to automatic (DHCP)
3. Try to activate the vlan device

Actual results:
Activation fails and the VLAN device does not get an IP address.

Expected results:
Activation should succeed and the VLAN device should receive an IP address from the DHCP server.

Comment 1 Fabian Deutsch 2016-11-01 18:47:26 UTC
VLANs are crucial for the RHV use-case.

Comment 3 Dominik Perpeet 2016-11-01 20:38:45 UTC
I can confirm that the vlan device doesn't receive an IP address, but mvollmer will have to take a look and see whether this is intended behavior and/or a Cockpit bug.

Did this work before? It looks like this case isn't covered by the Cockpit integration tests.

Comment 4 Marius Vollmer 2016-11-02 08:27:48 UTC
(In reply to Dominik Perpeet from comment #3)
> I can confirm that the vlan device doesn't receive an IP address, but
> mvollmer will have to take a look and see whether this is intended behavior
> and/or a Cockpit bug.
> 
> Did this work before? It looks like this case isn't covered by the Cockpit
> integration tests.

We never got an IP in the tests because the VLAN that we try to connect to doesn't actually exist, so there won't be any DHCP replies.

Can someone help me set up a working vlan with virt-manager?  I am a bit out of my depth here.

Comment 5 Marius Vollmer 2016-11-02 08:33:27 UTC
Can you show the nmcli invocation for setting up the vlan?

Also, can you show the output of "nmcli con show $CON" for the working vlan connection?

Comment 6 Marius Vollmer 2016-11-02 08:39:16 UTC
Here is the difference from the good to the bad ifcfg files.  Can you try them one by one and tell which one causes the breakage?

The bad one specifies the device:

+DEVICE=enp4s0.162

The bad one does IPv6 differently, no idea what this means exactly:

-IPV6INIT=yes
+IPV6INIT=no

The bad one switches off header reordering:

-REORDER_HDR=yes
+REORDER_HDR=no
+VLAN_FLAGS=NO_REORDER_HDR
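
A comparison like the one above can be reproduced mechanically. The following is a minimal sketch, not the method actually used in this comment: ifcfg_diff is a made-up helper name, and it assumes that key order is irrelevant and that NAME and UUID are expected to differ between two connection profiles.

```shell
# Hypothetical helper: list the settings that differ between two ifcfg
# files, ignoring key order and the per-connection NAME and UUID lines.
#   $1 = known-good file, $2 = known-bad file
# Lines prefixed with "-" exist only in the good file, "+" only in the bad one.
ifcfg_diff() {
    good=$1
    bad=$2
    tmp_good=$(mktemp) || return 1
    tmp_bad=$(mktemp) || return 1
    grep -Ev '^(NAME|UUID)=' "$good" | sort > "$tmp_good"
    grep -Ev '^(NAME|UUID)=' "$bad" | sort > "$tmp_bad"
    # diff exits 1 when the files differ and grep exits 1 when nothing
    # differs; neither case is an error here.
    # The grep drops the "---"/"+++" file headers and the "@@" hunk markers.
    diff -U 0 "$tmp_good" "$tmp_bad" | grep -E '^[+-][A-Z]' || true
    rm -f "$tmp_good" "$tmp_bad"
}

# Usage with the two files from the description (paths as reported there):
#   ifcfg_diff /etc/sysconfig/network-scripts/ifcfg-enp4s0.162-1 \
#              /etc/sysconfig/network-scripts/ifcfg-enp4s0.162
```

Sorting first keeps reordered but identical keys (such as IPV6_PEERDNS in the two files above) out of the result.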

Comment 7 Marius Vollmer 2016-11-02 08:56:53 UTC
> 3. Try to activate the vlan device

Did you try to activate it via Cockpit, or with nmcli?  Could you try nmcli as well?

Cockpit gets a JavaScript exception when clicking on the On button.  Maybe this is the whole problem?

Comment 8 Michael Burman 2016-11-02 09:08:23 UTC
(In reply to Marius Vollmer from comment #5)
> Can you show the nmcli invocation for setting up the vlan?
> 
> Also, can you show the output of "nmcli con show $CON" for the working vlan
> connection?

Hi

[root@orchid-vds2 ~]# nmcli connection show 
NAME           UUID                                  TYPE            DEVICE 
System enp4s0  7bea4bfb-6e6d-447f-8475-fe413f05f520  802-3-ethernet  enp4s0


[root@orchid-vds2 ~]# nmcli con add type vlan con-name enp4s0.162 dev enp4s0 id 162; \
> nmcli con mod uuid 7bea4bfb-6e6d-447f-8475-fe413f05f520 ipv4.method disabled ipv6.method ignore; \
> nmcli con mod id enp4s0.162 ipv4.method auto; \
> nmcli con down uuid 7bea4bfb-6e6d-447f-8475-fe413f05f520; \
> nmcli con up uuid 7bea4bfb-6e6d-447f-8475-fe413f05f520; \
> nmcli con up id enp4s0.162
Connection 'enp4s0.162' (88f2f4dc-aac9-4fda-a889-4a2f79b710f3) successfully added.


[root@orchid-vds2 ~]# nmcli con show --active 
NAME           UUID                                  TYPE            DEVICE     
System enp4s0  7bea4bfb-6e6d-447f-8475-fe413f05f520  802-3-ethernet  enp4s0     
enp4s0.162   88f2f4dc-aac9-4fda-a889-4a2f79b710f3  vlan            enp4s0.162 
virbr0         243e5622-6e3b-4784-b8a7-c8b3d119d42a  bridge          virbr0

Comment 9 Michael Burman 2016-11-02 09:10:59 UTC
(In reply to Marius Vollmer from comment #7)
> > 3. Try to activate the vlan device
> 
> Did you try to activate it via Cockpit, or with nmcli?  Could you try nmcli
> as well?
> 
> Cockpit gets a JavaScript exception when clicking on the On button.  Maybe
> this is the whole problem?

Yes, i'm trying to activate the vlan device via cockpit.
As I wrote in the description, with nmcli or via ifcfg-* files it works as expected. The issue is activating the VLAN device (and getting an IP address) only when it was created via Cockpit.

Comment 10 Marius Vollmer 2016-11-02 09:17:20 UTC
The JavaScript exception should be cured by this:

   https://github.com/cockpit-project/cockpit/pull/5293

Please reopen if there is more to do.  We unfortunately don't test whether the VLAN actually works in our tests, and I would need help with setting that up.

(Well, we don't have any VLAN tests at all right now, but we can fix that ourselves, I guess.)

Comment 11 Marius Vollmer 2016-11-02 09:22:01 UTC
> As I wrote in the description, with nmcli or via ifcfg-* files it works as expected.

Does the following work?

 - Create VLAN interface in Cockpit, say "ens3.62"
 - Activate that interface with "nmcli con up ens3.62"

Comment 12 Dominik Perpeet 2016-11-02 09:24:21 UTC
(In reply to Marius Vollmer from comment #10)
> The JavaScript exception should be cured by this:
> 
>    https://github.com/cockpit-project/cockpit/pull/5293
> 
> Please reopen if there is more to do.  We unfortunately don't test whether
> the VLAN actually works in our tests, and I would need help with setting
> that up.
> 
> (Well, we don't have any VLAN tests at all right now, but we can fix that
> ourselves, I guess.)

This exception isn't raised in the version reported (118), only on master. So it's unrelated, I would say.

Comment 13 Marius Vollmer 2016-11-02 09:45:26 UTC
(In reply to Marius Vollmer from comment #7)
 
> Cockpit gets a JavaScript exception when clicking on the On button.  Maybe
> this is the whole problem?

Argh, this only happens with versions later than 118, so it can't explain your trouble.  Sorry.

Comment 14 Marius Vollmer 2016-11-02 09:46:21 UTC
Could you look at comment 6?

Comment 15 Michael Burman 2016-11-02 10:03:29 UTC
(In reply to Marius Vollmer from comment #11)
> > As I wrote in the description, with nmcli or via ifcfg-* files it works as expected.
> 
> Does the following work?
> 
>  - Create VLAN interface in Cockpit, say "ens3.62"
>  - Activate that interface with "nmcli con up ens3.62"

That doesn't work. Activation fails. If the VLAN device is created via Cockpit, it can't be activated using nmcli either.

Comment 16 Michael Burman 2016-11-02 10:04:45 UTC
(In reply to Marius Vollmer from comment #14)
> Could you look at comment 6?

I looked at comment 6 and yes, I noticed the difference as well. I'm not sure I understand what you want me to try.

Comment 18 Marius Vollmer 2016-11-02 11:25:25 UTC
(In reply to Michael Burman from comment #16)
> (In reply to Marius Vollmer from comment #14)
> > Could you look at comment 6?
> 
> I looked at comment 6 and yes, I noticed the difference as well. I'm not
> sure I understand what you want me to try.

Take the original ifcfg-enp4s0.162-1 file as reported in comment 0 and do this:

 # nmcli con reload
 # nmcli con up enp4s0.162-1
 # nmcli dev
 
I assume that this will not show the bug: enp4s0.162 correctly receives an IP address.  Please paste the output of the commands here.

Then you stop the device and make the first change to ifcfg-enp4s0.162-1:

 # nmcli dev dis enp4s0.162
 # vi /etc/sysconfig/network-scripts/ifcfg-enp4s0.162-1
 (add the "DEVICE=enp4s0.162" line somewhere)

Then you repeat the test:

 # nmcli con reload
 # nmcli con up enp4s0.162-1
 # nmcli dev

Does it still receive an IP address?  Please paste the output as well.

Continue with the remaining two changes:

 - Change IPV6INIT to "no".
 - Change REORDER_HDR to "no" and add "VLAN_FLAGS=NO_REORDER_HDR"
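
The edit-and-retest steps above can be scripted so that each change is applied on its own. This is a rough sketch under assumptions: set_ifcfg_key is a made-up helper, not an existing tool, and it assumes plain KEY=value lines as in the files shown in this report.

```shell
# Hypothetical helper: set KEY=value in an ifcfg file, replacing any
# existing assignment of that key or appending it if it is missing.
#   $1 = ifcfg file, $2 = key, $3 = value
set_ifcfg_key() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    # Copy everything except an existing assignment of the key.
    # grep exits 1 if no other lines remain; that is not an error here.
    grep -v "^$key=" "$file" > "$tmp" || true
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write back in place so the original file keeps its owner and mode.
    cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# One round of the procedure, using the names from this report:
#   nmcli dev dis enp4s0.162
#   set_ifcfg_key /etc/sysconfig/network-scripts/ifcfg-enp4s0.162-1 DEVICE enp4s0.162
#   nmcli con reload
#   nmcli con up enp4s0.162-1
#   nmcli dev
# Repeat with "IPV6INIT no", then with "REORDER_HDR no" plus
# "VLAN_FLAGS NO_REORDER_HDR", retesting after each change.
```

Applying one change per round is what makes it possible to tell which of the three differences breaks DHCP.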

Comment 19 Marius Vollmer 2016-11-02 12:16:01 UTC
I was able to reproduce this with help of the reporter.  (Thanks!)

The crucial bit is the NM_VLAN_FLAG_REORDER_HEADERS flag in the vlan settings.
The interface can only successfully acquire an IP address when that flag is set.

NetworkManager says that this flag is set by default, except when using the D-Bus API.

Thus, "nmcli con add type vlan" will set the flag, but Cockpit will not.

Since the flag seems to be important and is on "by default" with certain ways of creating a VLAN interface, I think Cockpit should cargo cult this and just set it as well "by default".


Here is the relevant bit of the docs:

   The default value of this property is NM_VLAN_FLAG_REORDER_HEADERS, but it 
   used to be 0. To preserve backward compatibility, the default-value in the
   D-Bus API continues to be 0 and a missing property on D-Bus is still
   considered as 0.

I managed to miss this, or it was added later.
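
Until a fixed Cockpit is installed, a connection that was created without the flag can probably be repaired from the command line. The sketch below was not tried in this report. It assumes that the flag is exposed through the vlan.flags property of nmcli and that NM_VLAN_FLAG_REORDER_HEADERS has the value 0x1. The function name enable_vlan_reorder_hdr is made up for illustration.

```shell
# Hypothetical helper: turn on the reorder-headers flag for an existing
# VLAN connection profile and activate it again.
#   $1 = connection name or UUID (for example enp4s0.162)
# Assumption: vlan.flags value 1 is NM_VLAN_FLAG_REORDER_HEADERS (0x1).
# Note that this overwrites any other VLAN flags set on the profile.
enable_vlan_reorder_hdr() {
    con=$1
    nmcli con mod "$con" vlan.flags 1 &&
        nmcli con up "$con"
}

# Usage:
#   enable_vlan_reorder_hdr enp4s0.162
```

Writing the value 1 is enough for a profile that has no flags set, which is the D-Bus default of 0 described above. A profile that also needs GVRP or other flags would need those bits included.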

Comment 20 Marius Vollmer 2016-11-02 12:54:49 UTC
https://github.com/cockpit-project/cockpit/pull/5294

Comment 28 errata-xmlrpc 2016-12-06 17:44:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2888.html