1547372 – [Cockpit 160] Duplicate IP when creating bond from a slave that has the host connection(Regression)

Keywords:
Status:	CLOSED DUPLICATE of bug 1548265
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	cockpit
Sub Component:
Version:	7.5
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Marius Vollmer
QA Contact:	qe-baseos-daemons
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2018-02-21 07:43 UTC by Michael Burman
Modified:	2018-03-06 12:30 UTC
CC List:	5 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2018-03-05 07:52:49 UTC
Target Upstream Version:
Embargoed:

Attachments
Logs (108.65 KB, application/x-gzip) 2018-02-21 07:43 UTC, Michael Burman

Description Michael Burman 2018-02-21 07:43:26 UTC

Created attachment 1398560
Logs

Description of problem:
[Cockpit 160] Duplicate IP when creating bond from a slave that has the host connection(Regression)

When creating a bond from 2 slaves via Cockpit, if one of the slaves carries the host's active connection, the bond is created with the host's active IP address as expected, but the slave also keeps the same IP address. This is quite bad and looks like a regression on the Cockpit side (I think).

- Start point -
enp4s0 - active connection 10.35.x.y
enp6s0 - secondary slave 

enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    inet 10.35.x.y/24 brd 10.35.128.255 scope global noprefixroute dynamic enp4s0
       valid_lft 42541sec preferred_lft 42541sec

- Create a bond via Cockpit from these 2 NICs
- Result 

enp4s0 - 10.35.x.y remains on the slave
enp6s0 - secondary slave 
bond1 - 10.35.x.y 


2: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    inet 10.35.x.y/24 brd 10.35.128.255 scope global noprefixroute dynamic enp4s0
       valid_lft 42502sec preferred_lft 42502sec

27: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 10.35.x.y/24 brd 10.35.128.255 scope global noprefixroute dynamic bond1
       valid_lft 43139sec preferred_lft 43139sec

The same IP address is now reported on the primary slave and on the active bond.

[root@orchid-vds1 ~]# cat /etc/sysconfig/network-scripts/ifcfg-enp4s0 
# Generated by dracut initrd
NAME=enp4s0
DEVICE=enp4s0
ONBOOT=yes
NETBOOT="yes"
UUID=011867c1-4abe-49e1-ab49-59cd34b98a8b
TYPE=Ethernet
PROXY_METHOD="none"
BROWSER_ONLY="no"
DEFROUTE="yes"
IPV4_FAILURE_FATAL="no"
MASTER=bond1
SLAVE=yes
[root@orchid-vds1 ~]# cat /etc/sysconfig/network-scripts/ifcfg-enp6s0 
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
NAME=enp6s0
UUID=bd4b5b08-1730-4552-aa6f-1c76aa92c62a
DEVICE=enp6s0
ONBOOT=yes
MASTER=bond1
SLAVE=yes
[root@orchid-vds1 ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond1 
BONDING_OPTS="downdelay=0 miimon=100 mode=active-backup primary=enp4s0 updelay=0"
TYPE=Bond
BONDING_MASTER=yes
MACADDR=00:14:5E:00:00:00
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=dhcp
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=bond1
UUID=040a78c8-c26b-4a2e-ae75-d055fea43347
DEVICE=bond1
ONBOOT=yes
AUTOCONNECT_SLAVES=yes

Version-Release number of selected component (if applicable):
cockpit-160-2.el7.x86_64
NetworkManager-1.10.2-11.el7.x86_64
3.10.0-845.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a bond via Cockpit from these 2 NICs when one of the slaves has the active host connection

Actual results:
The same IP address is now reported on the primary slave and on the active bond.

Expected results:
When a bond is created in Cockpit and one of the slaves has the active connection, the IP address must be removed from the slave and remain only on the new active bond interface. This was requested and implemented in BZ 1395108 and worked properly in previous Cockpit versions.
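
One way to confirm the condition described in this report is to look for any IPv4 address that shows up on more than one interface. The sketch below is illustrative only: `find_duplicate_ipv4` is a hypothetical helper name, and it assumes the one-line format printed by `ip -o -4 addr show`.

```shell
# Hypothetical helper (not part of Cockpit or NetworkManager).
# Reads `ip -o -4 addr show` output on stdin and prints every address/prefix
# that is assigned to more than one interface, followed by those interfaces.
find_duplicate_ipv4() {
    awk '$3 == "inet" {
             # $2 is the interface name, $4 is the address with prefix length
             ifaces[$4] = (ifaces[$4] == "") ? $2 : ifaces[$4] "," $2
             count[$4]++
         }
         END {
             for (addr in count)
                 if (count[addr] > 1)
                     print addr, ifaces[addr]
         }' | sort
}

# Usage on a live host:
#   ip -o -4 addr show | find_duplicate_ipv4
```

With the state shown in this report, the output would be expected to contain one line naming both enp4s0 and bond1. An empty output would mean no address is duplicated.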

Comment 3 Michael Burman 2018-02-21 07:48:50 UTC

This bug may affect RHV. If a user tries to add the host to RHV by FQDN/IP, ovirtmgmt may end up configured on top of enp4s0 or on top of the bond, because both interfaces have the IP address of the host's active connection.

Comment 4 Michael Burman 2018-02-21 07:55:41 UTC

After adding such a host to RHV, the ovirtmgmt bridge is configured on top of the bond, but now both the bond and ovirtmgmt have the same IP address, which is bad as well and is possibly caused by this bug.

27: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 10.35.x.y/24 brd 10.35.128.255 scope global noprefixroute dynamic bond1
       valid_lft 41992sec preferred_lft 41992sec
32: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 10.35.x.y/24 brd 10.35.128.255 scope global dynamic ovirtmgmt
       valid_lft 42965sec preferred_lft 42965sec

Comment 5 Michael Burman 2018-02-21 08:28:51 UTC

If the bridge is set up on top of enp4s0, the result is worse:

[root@orchid-vds1 ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond1 
BONDING_OPTS="downdelay=0 miimon=100 mode=active-backup primary=enp4s0 updelay=0"
TYPE=Bond
BONDING_MASTER=yes
MACADDR=00:14:5E:00:00:00
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=dhcp
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=bond1
UUID=040a78c8-c26b-4a2e-ae75-d055fea43347
DEVICE=bond1
ONBOOT=yes
AUTOCONNECT_SLAVES=yes
[root@orchid-vds1 ~]# cat /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt 
# Generated by VDSM version 4.20.18-1.el7ev
DEVICE=ovirtmgmt
TYPE=Bridge
DELAY=0
STP=off
ONBOOT=yes
BOOTPROTO=dhcp
MTU=1500
DEFROUTE=yes
NM_CONTROLLED=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
[root@orchid-vds1 ~]# cat /etc/sysconfig/network-scripts/ifcfg-enp4s0 
# Generated by VDSM version 4.20.18-1.el7ev
DEVICE=enp4s0
BRIDGE=ovirtmgmt
ONBOOT=yes
MTU=1500
DEFROUTE=no
NM_CONTROLLED=no
IPV6INIT=no
[root@orchid-vds1 ~]# brctl show
bridge name     bridge id               STP enabled     interfaces
;vdsmdummy;             8000.000000000000       no
ovirtmgmt               8000.00145e17d5b0       no              enp4s0
virbr0          8000.525400ba660a       yes             virbr0-nic

VDSM didn't acquire the bond.

Comment 6 Michael Burman 2018-02-21 14:15:01 UTC

One last thing I noticed about this bond creation: when the bond is created via Cockpit, enp4s0 is actually not a member of the bond, although both the ifcfg file and the Cockpit UI say otherwise.
When running ip link after bond creation, only enp6s0 is a member of bond1.

[root@orchid-vds1 ~]# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:14:5e:17:xx:xx brd ff:ff:ff:ff:ff:ff
3: enp6s0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond1 state UP mode DEFAULT group default qlen 1000
    link/ether 00:14:5e:17:xx:x0 brd ff:ff:ff:ff:ff:ff
4: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 00:15:17:3d:cd:aa brd ff:ff:ff:ff:ff:ff
5: ens1f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 00:15:17:3d:cd:ab brd ff:ff:ff:ff:ff:ff
22: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:14:5e:xx:xx:xx brd ff:ff:ff:ff:ff:ff

[root@orchid-vds1 ~]# cat /etc/sysconfig/network-scripts/ifcfg-enp4s0 
# Generated by dracut initrd
NAME=enp4s0
DEVICE=enp4s0
ONBOOT=yes
NETBOOT="yes"
UUID=8654bdda-10c1-4c46-b3b2-88b8a9a475d4
TYPE=Ethernet
PROXY_METHOD="none"
BROWSER_ONLY="no"
DEFROUTE="yes"
IPV4_FAILURE_FATAL="no"
MASTER=bond1
SLAVE=yes

[root@orchid-vds1 ~]# nmcli c s enp4s0
connection.master:                      bond1
connection.slave-type:                  bond
connection.autoconnect-slaves:          -1 (default)
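
The mismatch described in this comment (configuration says enp4s0 is a slave, the kernel says it is not) can be checked by listing which interfaces the kernel reports as having a given master. This is a rough sketch: `bond_members` is a hypothetical helper name, and it assumes the `master <device>` token that `ip link` prints on an enslaved interface's header line, as seen for enp6s0 above.

```shell
# Hypothetical helper: reads `ip link` (or `ip -o link`) output on stdin and
# prints the interfaces whose kernel master is the given device.
bond_members() {
    awk -v bond="$1" '
        # Interface header lines start with an index such as "3:"
        $1 ~ /^[0-9]+:$/ {
            for (i = 3; i < NF; i++)
                if ($i == "master" && $(i + 1) == bond) {
                    name = $2
                    sub(/:$/, "", name)      # drop the trailing colon
                    sub(/@.*$/, "", name)    # drop an "@parent" suffix, if any
                    print name
                }
        }'
}

# Usage on a live host:
#   ip link | bond_members bond1
```

Comparing that list with the ifcfg files that contain `MASTER=bond1` shows any interface that is configured as a slave but was never actually enslaved.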

Comment 7 Marius Vollmer 2018-03-01 07:49:08 UTC

Superficially, this looks like a duplicate of bug 1548265.  Would you agree?

Comment 8 Martin Pitt 2018-03-02 07:13:27 UTC

An obvious workaround would be to disconnect the interfaces first before putting them into a bond - can you confirm that this works? Apparently NetworkManager changed behaviour to not tear down active devices that get put into a bond/bridge, and only do this at the next reboot. Intuitively this makes sense, as tearing down an active device without the user's explicit consent might sever the very connection the user is currently on, so this change might have been a safety improvement. However, the NetworkManager guys should comment here (or rather, on bug 1548265).

On the cockpit side we cannot change this behaviour - the only thing which Cockpit could do is to forcefully disconnect a device when you put it into a bond, but (1) this might make the behaviour actually worse, and (2) Cockpit should not second-guess NetworkManager's (or any other API it's talking to) logic.
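
The disconnect-first workaround suggested above could be tried from a shell roughly as follows. This is only a sketch under assumptions: `disconnect_before_bond` and the `DRY_RUN` switch are hypothetical, the interface names are the ones from this report, and nothing in this thread confirms that the workaround actually helps. Disconnecting the device that carries the host connection will also drop any session running over it, so a local console is the safer place to try this.

```shell
# Hypothetical helper: disconnect the given devices before they are bonded.
# With DRY_RUN=1 (the default here) the nmcli commands are only printed.
disconnect_before_bond() {
    for dev in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "nmcli device disconnect $dev"
        else
            nmcli device disconnect "$dev"
        fi
    done
}

# Usage (prints the commands; set DRY_RUN=0 to really run them):
#   disconnect_before_bond enp4s0 enp6s0
```

After the devices are disconnected, the bond would be created in Cockpit as in the original steps to reproduce.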

Comment 9 Marius Vollmer 2018-03-02 19:41:04 UTC

> An obvious workaround would be to disconnect the interfaces first before putting them into a bond 

We can explicitly activate the bond after creating it; that will also enslave all interfaces that NM forgot when it activated the bond automatically.
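
Done by hand, the explicit activation described here might look like the sketch below. It is not what Cockpit itself runs: `activate_bond` and `DRY_RUN` are hypothetical names, and the connection name bond1 is taken from this report. The second command lists the interfaces the kernel now reports as enslaved to the bond, so the result can be verified.

```shell
# Hypothetical helper: activate the bond connection, then list its members.
# With DRY_RUN=1 (the default here) the commands are only printed.
activate_bond() {
    bond=$1
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "nmcli connection up $bond"
        echo "ip -o link show master $bond"
    else
        nmcli connection up "$bond" && ip -o link show master "$bond"
    fi
}

# Usage (prints the commands; set DRY_RUN=0 to really run them):
#   activate_bond bond1
```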

Comment 10 Marius Vollmer 2018-03-05 07:52:49 UTC


*** This bug has been marked as a duplicate of bug 1548265 ***

Comment 11 Michael Burman 2018-03-05 15:19:57 UTC

(In reply to Marius Vollmer from comment #7)
> Superficially, this looks like a duplicate of bug 1548265.  Would you agree?

Hi
To be honest, I'm still not sure this is the same bug, but it sounds very similar.

Comment 12 Marius Vollmer 2018-03-06 11:57:29 UTC

(In reply to Michael Burman from comment #11)

> To be honest, I'm still not sure this is the same bug, but it sounds very
> similar.

I am pretty sure now.  I'll add an nmcli reproducer to bug 1548265 that matches your report here.

Comment 13 Michael Burman 2018-03-06 12:30:53 UTC

(In reply to Marius Vollmer from comment #12)
> (In reply to Michael Burman from comment #11)
> 
> > To be honest, I'm still not sure this is the same bug, but it sounds very
> > similar.
> 
> I am pretty sure now.  I'll add an nmcli reproducer to bug 1548265 that
> matches your report here.

OK, thanks, Marius.
