Bug 1857532 - [sig-network] About vlan parameter of dracut doesn't work on RHCOS
Summary: [sig-network] About vlan parameter of dracut doesn't work on RHCOS
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: RHCOS
Version: 4.3.z
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: 4.6.0
Assignee: Dusty Mabe
QA Contact: Michael Nguyen
URL:
Whiteboard:
Duplicates: 1859897 1862146
Depends On:
Blocks: 1186913
Reported: 2020-07-16 05:55 UTC by checheng
Modified: 2020-12-08 13:13 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 16:15:06 UTC
Target Upstream Version:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2020:4196 | 0 | None | None | None | 2020-10-27 16:15:43 UTC

Description checheng 2020-07-16 05:55:22 UTC
Description of problem:
The dracut vlan parameter was added to the kernel arguments on RHCOS, but it does not take effect.

Version-Release number of selected component (if applicable):
OCP 4.3.z

How reproducible:
The bond fails whenever the dracut vlan parameter is added on RHCOS.
The bond works if the vlan setting is removed, or if the ethernet interfaces and bond1 are bonded manually.

Steps to Reproduce:
1. Add the bond1 definition to the dracut kernel arguments and attach the VLAN to bond1.
----
~~~~~~ OMIT CODE ~~~~~~~~~~~~~
bond=bond1:ens224,ens193:mode=active-backup,miimon=100:9000
vlan=eth0.3000:bond1
~~~~~~ OMIT CODE ~~~~~~~~~~~~~
----
2. 
----
~~~~~~ OMIT CODE ~~~~~~~~~~~~~
4: ens193: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:80:6d:37 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::7f1a:a21e:4ec5:31f1/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
5: ens224: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:80:e1:ba brd ff:ff:ff:ff:ff:ff
    inet6 fe80::8963:8da2:12c6:dcf9/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
~~~~~~ OMIT CODE ~~~~~~~~~~~~~
----
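For readers unfamiliar with the syntax used in step 1, the following is a minimal Python sketch (not part of the original report) that splits the two arguments into their fields. It assumes the formats documented in dracut.cmdline(7), bond=<bondname>[:<bondslaves>[:<options>[:<mtu>]]] and vlan=<vlanname>:<phydevice>. The names Bond, Vlan, parse_bond and parse_vlan are hypothetical helpers introduced only for illustration.

```python
import re
from typing import Dict, List, NamedTuple, Optional


class Bond(NamedTuple):
    name: str
    slaves: List[str]
    options: Dict[str, str]
    mtu: Optional[int]


class Vlan(NamedTuple):
    name: str
    vlan_id: int
    parent: str


def parse_bond(arg: str) -> Bond:
    """Split a 'bond=...' kernel argument into its colon-separated fields."""
    value = arg.split("=", 1)[1]
    # Pad so that the optional trailing fields can be omitted.
    name, slaves, opts, mtu = (value.split(":") + [""] * 4)[:4]
    options = {k: v for k, _, v in (o.partition("=") for o in opts.split(",") if o)}
    return Bond(
        name=name,
        slaves=[s for s in slaves.split(",") if s],
        options=options,
        mtu=int(mtu) if mtu else None,
    )


def parse_vlan(arg: str) -> Vlan:
    """Split a 'vlan=<vlanname>:<phydevice>' kernel argument."""
    value = arg.split("=", 1)[1]
    name, parent = value.split(":", 1)
    # The VLAN ID is taken from the trailing digits of the VLAN name.
    match = re.search(r"(\d+)$", name)
    if match is None:
        raise ValueError(f"no VLAN ID found in {name!r}")
    return Vlan(name=name, vlan_id=int(match.group(1)), parent=parent)


if __name__ == "__main__":
    print(parse_bond("bond=bond1:ens224,ens193:mode=active-backup,miimon=100:9000"))
    print(parse_vlan("vlan=eth0.3000:bond1"))
```

With the arguments from step 1, this yields a bond named bond1 over ens224 and ens193 with an MTU of 9000, and a VLAN with ID 3000 whose parent device is bond1. The sketch only checks syntax; it says nothing about whether the initramfs applies the configuration.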

Actual results:
The setting does not work. It also causes the bonding of ensxxx into bond1 to fail.

Expected results:
The vlan parameter works on RHCOS.

Additional info:
More information is available in ticket No.02695306.

Comment 1 Micah Abbott 2020-07-16 14:06:56 UTC
This may be a limitation in RHCOS 4.3 about which parameters from dracut are being passed to the real root after install.  RHCOS will have better support for complex networking configurations in the 4.6 release.

First time hearing of a problem like this; setting to medium priority and targeting for 4.6

Comment 2 checheng 2020-07-17 02:02:15 UTC
Thank you for responding.

I understand that RHCOS network configuration is still evolving.
I look forward to OCP 4.6 solving the VLAN setting problem.

Best regards.

Comment 3 Dusty Mabe 2020-07-30 19:37:41 UTC
This bug has not been selected for work in the current sprint.

Comment 4 Dusty Mabe 2020-08-12 20:31:52 UTC
Hi checheng,

I believe this behavior is ultimately caused by https://bugzilla.redhat.com/show_bug.cgi?id=1865738, which is an issue with a dracut module writing out an incomplete network configuration based on the kernel cmdline dracut networking arguments. In 4.6 we are moving to using NetworkManager in the initrd, which does not have the same issue. If you have access to the latest 4.6 builds it would be valuable if you could test out the same network configuration to confirm that the problem is solved.

Thanks!
Dusty

Comment 6 Dusty Mabe 2020-08-19 20:39:22 UTC
*** Bug 1859897 has been marked as a duplicate of this bug. ***

Comment 7 Dusty Mabe 2020-08-19 20:40:40 UTC
*** Bug 1862146 has been marked as a duplicate of this bug. ***

Comment 8 Dusty Mabe 2020-08-19 20:42:35 UTC
This bug should be fixed with the move to NetworkManager for networking in the initramfs in 4.6. At least it works in my local tests.

Moving to MODIFIED.

Comment 13 Renata Ravanelli 2020-09-25 15:07:15 UTC
I have checked it for 4.3, 4.5 and 4.6.
All tests were done in libvirt, as follows:


For 4.3 and 4.5 I saw the same behavior: the bond does not work when the vlan arg is added.

RHCOS: 43.82.202009181853.0 and 45.82.202009181447-0
Kernel args used: 
rd.neednet=1 vlan=bond0.0001:bond0 ip=192.168.122.111::192.168.122.1:255.255.255.0:initrdhost:bond0.0001:none:192.168.122.1 bond=bond0:ens2,ens3:mode=active-backup,miimon=100
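As a reading aid for the ip= argument above, here is a small Python sketch (not part of the original comment) that labels its colon-separated fields. It assumes the static form documented in dracut.cmdline(7), ip=<client-IP>:[<peer>]:<gateway-IP>:<netmask>:<client_hostname>:<interface>:<autoconf>[:[<dns1>][:<dns2>]], and treats the eighth field as a DNS server because it looks like an IP address here. The function name parse_static_ip is hypothetical, and the sketch handles only IPv4, since IPv6 addresses contain colons and are bracketed in this syntax.

```python
from typing import Dict

# Field order of the static ip= form (IPv4 only in this sketch).
IP_FIELDS = (
    "client_ip",
    "peer",
    "gateway",
    "netmask",
    "hostname",
    "interface",
    "autoconf",
    "dns1",
    "dns2",
)


def parse_static_ip(arg: str) -> Dict[str, str]:
    """Label the colon-separated fields of an IPv4 'ip=...' kernel argument."""
    value = arg.split("=", 1)[1]
    parts = value.split(":")
    # zip() stops at the shorter sequence, so omitted trailing fields are skipped.
    return dict(zip(IP_FIELDS, parts))


if __name__ == "__main__":
    arg = (
        "ip=192.168.122.111::192.168.122.1:255.255.255.0:"
        "initrdhost:bond0.0001:none:192.168.122.1"
    )
    for field, value in parse_static_ip(arg).items():
        print(f"{field:10} {value!r}")
```

Read this way, the static address 192.168.122.111 is meant for the interface bond0.0001, that is, the VLAN device on top of bond0 and not the bond itself.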


[core@initrdhost ~]$ rpm -q dracut (same for 4.3 and 4.5)
dracut-049-70.git20200228.el8.x86_64

[core@initrdhost ~]$ nmcli device status
DEVICE  TYPE      STATE      CONNECTION
ens2    ethernet  connected  Wired connection 1
ens3    ethernet  connected  Wired connection 2
lo      loopback  unmanaged  --


[core@initrdhost ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:f2:08:2e brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.175/24 brd 192.168.122.255 scope global dynamic noprefixroute ens2
       valid_lft 3170sec preferred_lft 3170sec
    inet6 fe80::95c0:793a:9c10:b414/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:14:e6:62 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.87/24 brd 192.168.122.255 scope global dynamic noprefixroute ens3
       valid_lft 3170sec preferred_lft 3170sec
    inet6 fe80::c62a:58d6:61b2:34bd/64 scope link noprefixroute
       valid_lft forever preferred_lft forever


Case #2: No vlan

Kernel args used: 
rd.neednet=1 ip=192.168.122.111::192.168.122.1:255.255.255.0:initrdhost:bond0:none:192.168.122.1 bond=bond0:ens2,ens3:mode=active-backup,miimon=100


[core@initrdhost ~]$ nmcli device status
DEVICE  TYPE      STATE      CONNECTION
bond0   bond      connected  bond0
ens2    ethernet  connected  ens2
ens3    ethernet  connected  ens3
lo      loopback  unmanaged  --

[core@initrdhost ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc fq_codel master bond0 state UP group default qlen 1000
    link/ether 52:54:00:ba:a9:35 brd ff:ff:ff:ff:ff:ff
3: ens3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc fq_codel master bond0 state UP group default qlen 1000
    link/ether 52:54:00:ba:a9:35 brd ff:ff:ff:ff:ff:ff
5: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:ba:a9:35 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.111/24 brd 192.168.122.255 scope global noprefixroute bond0
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feba:a935/64 scope link
       valid_lft forever preferred_lft forever


For 4.6 the vlan and the bond are working as expected.

RHCOS: 46.82.202009150240-0
Kernel args used: (same as 4.3 and 4.5)
rd.neednet=1 vlan=bond0.0001:bond0 ip=192.168.122.111::192.168.122.1:255.255.255.0:initrdhost:bond0.0001:none:192.168.122.1 bond=bond0:ens2,ens3:mode=active-backup,miimon=100

[root@initrdhost core]# rpm -q dracut
dracut-049-75.git20200422.el8.x86_64


[core@initrdhost ~]$ nmcli device status
DEVICE      TYPE      STATE      CONNECTION
bond0       bond      connected  bond0
bond0.0001  vlan      connected  bond0.0001
ens2        ethernet  connected  ens2
ens3        ethernet  connected  ens3
lo          loopback  unmanaged  --

[core@initrdhost ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc fq_codel master bond0 state UP group default qlen 1000
    link/ether 52:54:00:71:09:7b brd ff:ff:ff:ff:ff:ff
3: ens3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc fq_codel master bond0 state UP group default qlen 1000
    link/ether 52:54:00:71:09:7b brd ff:ff:ff:ff:ff:ff
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:71:09:7b brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.90/24 brd 192.168.122.255 scope global dynamic noprefixroute bond0
       valid_lft 3572sec preferred_lft 3572sec
    inet6 fe80::5054:ff:fe71:97b/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
7: bond0.0001@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:71:09:7b brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.111/24 brd 192.168.122.255 scope global noprefixroute bond0.0001
       valid_lft forever preferred_lft forever
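
The comparison between releases can also be scripted. The following Python sketch (not part of the original comment) parses the table printed by nmcli device status and reports which expected devices are not in the connected state. The helper names parse_nmcli_status and missing_devices are hypothetical, the sample tables are copied from the outputs above, and the parser assumes single-word STATE values as seen in this report.

```python
from typing import Dict, List

# Sample outputs copied from the 4.3/4.5 and 4.6 runs above.
STATUS_43 = """\
DEVICE  TYPE      STATE      CONNECTION
ens2    ethernet  connected  Wired connection 1
ens3    ethernet  connected  Wired connection 2
lo      loopback  unmanaged  --
"""

STATUS_46 = """\
DEVICE      TYPE      STATE      CONNECTION
bond0       bond      connected  bond0
bond0.0001  vlan      connected  bond0.0001
ens2        ethernet  connected  ens2
ens3        ethernet  connected  ens3
lo          loopback  unmanaged  --
"""


def parse_nmcli_status(text: str) -> Dict[str, Dict[str, str]]:
    """Map each device name to its type, state and connection name."""
    devices = {}
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)  # connection names may contain spaces
        if len(fields) < 3:
            continue
        name, dev_type, state = fields[:3]
        connection = fields[3].strip() if len(fields) > 3 else "--"
        devices[name] = {"type": dev_type, "state": state, "connection": connection}
    return devices


def missing_devices(text: str, expected: List[str]) -> List[str]:
    """Return the expected devices that are absent or not 'connected'."""
    devices = parse_nmcli_status(text)
    return [n for n in expected if devices.get(n, {}).get("state") != "connected"]


if __name__ == "__main__":
    expected = ["bond0", "bond0.0001"]
    print("4.3/4.5 missing:", missing_devices(STATUS_43, expected))
    print("4.6 missing:", missing_devices(STATUS_46, expected))
```

Applied to the outputs in this comment, the 4.3/4.5 table is missing both bond0 and bond0.0001, while the 4.6 table has both connected.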

Comment 14 Dusty Mabe 2020-09-25 18:22:50 UTC
Thanks Renata. Moving to verified.

Comment 16 errata-xmlrpc 2020-10-27 16:15:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

