Bug 2186008

Summary: sysfs interface to remove a bonding slave does not work
Product: Red Hat Enterprise Linux 9 Reporter: Andrew Schorr <ajschorr>
Component: kernel Assignee: Hangbin Liu <haliu>
kernel sub component: Networking QA Contact: Network QE <network-qe>
Status: CLOSED INSUFFICIENT_DATA Docs Contact:
Severity: unspecified    
Priority: unspecified CC: bstinson, jiji, jwboyer, kzhang, liali, sukulkar
Version: CentOS Stream   
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-05-25 10:07:39 UTC Type: Bug

Description Andrew Schorr 2023-04-11 20:01:39 UTC
Description of problem:
Removing a bonding slave via the sysfs interface does not work

Version-Release number of selected component (if applicable):
kernel-5.14.0-295.el9.x86_64

How reproducible:
always

Steps to Reproduce:
1. create an active-backup bonding interface with slaves lan2 and lan0
2. add new interface lan3
3. try to remove lan2 with echo -lan2 > /sys/class/net/bond0/bonding/slaves

Actual results:
sh-5.1# cat /sys/class/net/bond0/bonding/slaves
lan0 lan3 lan2
sh-5.1# echo -lan2 > /sys/class/net/bond0/bonding/slaves
sh-5.1# cat /sys/class/net/bond0/bonding/slaves
lan0 lan3 lan2


Expected results:
The lan2 interface should be removed from the bonding interface.
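
The check above can also be scripted, so that a silent failure (the write succeeds but the interface stays listed) is told apart from a rejected write. This is only a minimal sketch, assuming the slaves file holds a single whitespace-separated list of interface names, as the `cat` output above suggests. `slave_listed` and `try_remove_slave` are hypothetical helper names, not part of any bonding tool.

```shell
# Hypothetical helpers for illustration; not part of the bonding tools.

# Return 0 if interface $2 appears in the slaves file $1.
slave_listed() {
    # The file holds whitespace-separated interface names.
    for name in $(cat "$1" 2>/dev/null); do
        if [ "$name" = "$2" ]; then
            return 0
        fi
    done
    return 1
}

# Write "-<ifname>" to the slaves file $1, then report what happened.
try_remove_slave() {
    # A rejected write (or an unwritable path) makes printf fail.
    if ! { printf '%s\n' "-$2" > "$1"; } 2>/dev/null; then
        echo "write failed"
        return 1
    fi
    # The write was accepted; check whether the interface is really gone.
    if slave_listed "$1" "$2"; then
        echo "still listed"
        return 2
    fi
    echo "removed"
}
```

Run as root, `try_remove_slave /sys/class/net/bond0/bonding/slaves lan2` would be expected to print "still listed" for the behaviour reported here. The `+name`/`-name` semantics exist only on the real sysfs attribute, so running it against an ordinary file exercises the control flow only.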

Additional info:

Comment 1 Hangbin Liu 2023-05-18 04:04:16 UTC
(In reply to Andrew Schorr from comment #0)
> Actual results:
> sh-5.1# cat /sys/class/net/bond0/bonding/slaves
> lan0 lan3 lan2
> sh-5.1# echo -lan2 > /sys/class/net/bond0/bonding/slaves
> sh-5.1# cat /sys/class/net/bond0/bonding/slaves
> lan0 lan3 lan2
> 
> 
> Expected results:
> The lan2 interface should be removed from the bonding interface.

Hi Andrew,

I can't reproduce this issue with the veth or virtio drivers. Which NIC driver are you using?

Thanks
Hangbin

Comment 2 Andrew Schorr 2023-05-18 12:24:43 UTC
Hi Hangbin,

According to my records, lan0 was igb, and lan2 and lan3 were i40e.

Regards,
Andy

Comment 3 Hangbin Liu 2023-05-24 13:57:03 UTC
Hi Andrew,

I am still unable to reproduce this. Are you able to reproduce the issue reliably?

# uname -r
5.14.0-295.el9.x86_64
# ethtool -i enp4s0f0 | grep driver
driver: i40e
# echo +eno1 > /sys/class/net/bond0/bonding/slaves
# cat /sys/class/net/bond0/bonding/slaves
eno1
# echo +enp4s0f0 > /sys/class/net/bond0/bonding/slaves
# cat /sys/class/net/bond0/bonding/slaves
eno1 enp4s0f0
# echo +enp4s0f1 > /sys/class/net/bond0/bonding/slaves
# cat /sys/class/net/bond0/bonding/slaves
eno1 enp4s0f0 enp4s0f1
# echo -enp4s0f0 > /sys/class/net/bond0/bonding/slaves
# cat /sys/class/net/bond0/bonding/slaves
eno1 enp4s0f1

Comment 4 Andrew Schorr 2023-05-24 20:12:54 UTC
Hi Hangbin,

I don't think I'm going to have a chance to duplicate this setup. At the time, I was
working with a defective multi-port i40e card (I did not realize it was defective
at that time). I was trying to switch the active port on the i40e card to see if
a different port worked better, when I encountered this problem. It was reliably
reproducible at that time. Could the fact that the i40e card had a hardware problem affect
the way the kernel was behaving? I doubt that I will get the chance to play around
with this again. I have too many other urgent issues to deal with right now.

Regards,
Andy

Comment 5 LiLiang 2023-05-25 09:39:41 UTC
I tried this on 5.14.0-309.el9.x86_64 and could not reproduce the bug.

[root@dell-per750-57 ~]# modprobe -v bonding
insmod /lib/modules/5.14.0-309.el9.x86_64/kernel/drivers/net/bonding/bonding.ko.xz max_bonds=0 

[root@dell-per750-57 ~]# echo +bond0 > /sys/class/net/bonding_masters 
[root@dell-per750-57 ~]# echo 1 > /sys/class/net/bond0/bonding/mode 
[root@dell-per750-57 ~]# echo 100 > /sys/class/net/bond0/bonding/miimon
[root@dell-per750-57 ~]# ip link set bond0 up

[root@dell-per750-57 ~]# ip link set ens7f0 down
[root@dell-per750-57 ~]# echo +ens7f0 > /sys/class/net/bond0/bonding/slaves 
[root@dell-per750-57 ~]# ip link set ens7f1 down
[root@dell-per750-57 ~]# echo +ens7f1 > /sys/class/net/bond0/bonding/slaves 
[root@dell-per750-57 ~]# ip link set enp177s0np0 down
[root@dell-per750-57 ~]# echo +enp177s0np0 > /sys/class/net/bond0/bonding/slaves 
[root@dell-per750-57 ~]# cat /sys/class/net/bond0/bonding/slaves 
ens7f0 ens7f1 enp177s0np0

[root@dell-per750-57 ~]# echo -ens7f0 > /sys/class/net/bond0/bonding/slaves 
[root@dell-per750-57 ~]# cat /sys/class/net/bond0/bonding/slaves 
ens7f1 enp177s0np0

[root@dell-per750-57 ~]# uname -r
5.14.0-309.el9.x86_64

Comment 6 LiLiang 2023-05-25 09:40:22 UTC
[root@dell-per750-57 ~]# ethtool -i ens7f0
driver: i40e
version: 5.14.0-309.el9.x86_64
firmware-version: 7.00 0x80004c97 1.2154.0
expansion-rom-version: 
bus-info: 0000:ca:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
[root@dell-per750-57 ~]# ethtool -i ens7f1
driver: i40e
version: 5.14.0-309.el9.x86_64
firmware-version: 7.00 0x80004c97 1.2154.0
expansion-rom-version: 
bus-info: 0000:ca:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
[root@dell-per750-57 ~]# ethtool -i enp177s0np0
driver: nfp
version: 5.14.0-309.el9.x86_64
firmware-version: 0.0.3.5 0.22 nic-2.1.16.1 nic
expansion-rom-version: 
bus-info: 0000:b1:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no
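
The driver bound to each interface can also be read from sysfs, which may be convenient when collecting this information in a script. This is a minimal sketch, assuming the usual `/sys/class/net/<ifname>/device/driver` symlink is present (purely virtual devices may have no `device` entry). `nic_driver` and the `SYSFS_ROOT` variable are hypothetical names used only for illustration.

```shell
# Hypothetical helper: print the kernel driver bound to a network interface.
# SYSFS_ROOT defaults to /sys; it is overridable only to make testing easy.
nic_driver() {
    link="${SYSFS_ROOT:-/sys}/class/net/$1/device/driver"
    # The driver entry is a symlink whose last path component is the driver name.
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        # No driver symlink (e.g. a virtual device or an unknown interface).
        return 1
    fi
}
```

For example, `nic_driver ens7f0` should print the same driver name that `ethtool -i ens7f0` reports on its `driver:` line.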

Comment 7 Hangbin Liu 2023-05-25 10:07:39 UTC
Thanks Liang for the confirmation. I will close this bug for now; it can be reopened if the issue is reproduced.