
Bug 2151982

Summary: [upstream] bonding: bug in "add per-port priority for failover re-selection"
Product: Red Hat Enterprise Linux 9 Reporter: Jonathan Toppins <jtoppins>
Component: kernel Assignee: Jonathan Toppins <jtoppins>
kernel sub component: Bonding QA Contact: LiLiang <liali>
Status: CLOSED CURRENTRELEASE Docs Contact:
Severity: unspecified    
Priority: unspecified CC: haliu, jtoppins, network-qe
Version: unspecified Keywords: Triaged
Target Milestone: rc Flags: pm-rhel: mirror+
Target Release: 9.3   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-12-14 18:13:10 UTC Type: Feature Request
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2092194, 2221092    

Description Jonathan Toppins 2022-12-08 18:51:28 UTC
This bug is an upstream bug, no RHEL product is affected at this time.

Bonding: add per-port priority for failover re-selection

The upstream patch has a problem: when a higher-priority slave comes up, it is not selected as the new active slave.


```
ip link add name veth0 type veth peer name veth0_p
ip link add name veth1 type veth peer name veth1_p
ip link add name veth2 type veth peer name veth2_p

ip link add name bond0 type bond mode 1 miimon 100 primary veth1 primary_reselect 0
ip link set bond0 up
ip link set veth0 master bond0
ip link set veth1 master bond0
ip link set veth2 master bond0
ip link set dev veth0 type bond_slave prio 0
ip link set dev veth1 type bond_slave prio 10
ip link set dev veth2 type bond_slave prio 11
ip -d link show veth0 | grep -q 'prio 0'
ip -d link show veth1 | grep -q 'prio 10'
ip -d link show veth2 | grep -q 'prio 11'

ip link set veth0 up
ip link set veth1 up
ip link set veth2 up
ip link set veth0_p up
ip link set veth1_p up
ip link set veth2_p up

ip link add name br0 type bridge
ip link set br0 up
ip link set veth0_p master br0
ip link set veth1_p master br0
ip link set veth2_p master br0
ip link add name veth3 type veth peer name veth3_p
ip netns add ns1
ip link set veth3_p master br0 up
ip link set veth3 netns ns1 up

# the active slave should be the primary slave (veth1)
active_slave=$(cat /sys/class/net/bond0/bonding/active_slave)
test "$active_slave" = "veth1" || echo "BUG: current active slave is not primary slave"

ip link set veth1 down
ip link set veth2 down
sleep 5

# the active slave should now be veth0, the only slave still up
active_slave=$(cat /sys/class/net/bond0/bonding/active_slave)
test "$active_slave" = "veth0" || echo "BUG: current active slave is not the only up slave veth0"

ip link set veth2 up
sleep 5

# higher priority slave veth2 should become active slave
active_slave=$(cat /sys/class/net/bond0/bonding/active_slave)
test "$active_slave" = "veth2" || echo "BUG: higher priority slave didn't become active slave"

```

The last check fails and prints "BUG: higher priority slave didn't become active slave".

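The repeated "sleep, read `active_slave`, compare" pattern in the reproducer could be factored into a small polling helper. The following is only a sketch: `expect_active_slave` and the `SYSFS_NET` override are hypothetical names introduced here, not part of the original script or of any bonding tooling.

```shell
# Hypothetical helper, not part of the original reproducer.
# SYSFS_NET is an assumed override so the function can be pointed at a
# fake directory tree; by default it uses the real sysfs location.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

# expect_active_slave BOND EXPECTED [TIMEOUT_SECONDS]
# Polls once per second until BOND reports EXPECTED as its active slave.
# Returns 0 on a match, 1 (with a BUG message) if the timeout expires.
expect_active_slave() {
    bond=$1
    expected=$2
    timeout=${3:-5}
    elapsed=0
    while :; do
        active=$(cat "$SYSFS_NET/$bond/bonding/active_slave" 2>/dev/null)
        [ "$active" = "$expected" ] && return 0
        [ "$elapsed" -ge "$timeout" ] && break
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "BUG: active slave of $bond is '$active', expected '$expected'"
    return 1
}
```

With such a helper, the final step of the reproducer could be written as `ip link set veth2 up` followed by `expect_active_slave bond0 veth2 5`, which returns as soon as the failover happens instead of always sleeping for the full five seconds.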

From Hangbin:
---
Oh, thanks. I didn't notice this. When I posted the patch, I only tested:
1. enslaving a high prio slave
2. setting the current active slave down, so that the bond picks the high prio slave from the remaining up slaves.

I missed the case where a high prio slave comes up: it should trigger a failover and replace the current active slave.

The following patch should fix the issue (I omit the arp monitor fix).

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index b9a882f182d2..d7351c416004 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2689,7 +2689,8 @@ static void bond_miimon_commit(struct bonding *bond)

                        bond_miimon_link_change(bond, slave, BOND_LINK_UP);

-                       if (!bond->curr_active_slave || slave == primary)
+                       if (!bond->curr_active_slave || slave == primary ||
+                           slave->prio > bond->curr_active_slave->prio)
                                goto do_failover;

                        continue;
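
The condition added by the diff above can be modelled outside the kernel to see why the reproducer fails without it. This is a simplified sketch in shell, not the driver code: `needs_failover` and its four positional arguments are hypothetical stand-ins for `bond->curr_active_slave`, `slave == primary`, `slave->prio`, and `bond->curr_active_slave->prio`.

```shell
# Hypothetical model of the patched check in bond_miimon_commit().
# needs_failover HAS_ACTIVE IS_PRIMARY NEW_PRIO ACTIVE_PRIO
#   HAS_ACTIVE   1 if the bond currently has an active slave, else 0
#   IS_PRIMARY   1 if the slave that just came up is the primary, else 0
#   NEW_PRIO     prio of the slave that just came up
#   ACTIVE_PRIO  prio of the current active slave (ignored if HAS_ACTIVE=0)
# Returns 0 when a failover should be triggered, 1 otherwise.
needs_failover() {
    has_active=$1
    is_primary=$2
    new_prio=$3
    active_prio=$4
    [ "$has_active" -eq 0 ] && return 0          # no active slave yet
    [ "$is_primary" -eq 1 ] && return 0          # primary slave came back
    [ "$new_prio" -gt "$active_prio" ] && return 0   # clause added by the fix
    return 1
}
```

In the reproducer, veth0 (prio 0) is active when veth2 (prio 11, not the primary) comes up, which corresponds to `needs_failover 1 0 11 0`. Only the third clause matches that case, so without it the loop falls through to `continue` and veth0 stays active.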

Comment 1 Jonathan Toppins 2022-12-08 19:19:33 UTC
Hangbin, do you intend to upstream the proposed fix or do you want me to?

Comment 2 Hangbin Liu 2022-12-09 00:51:17 UTC
(In reply to Jonathan Toppins from comment #1)
> Hangbin, do you intend to upstream the proposed fix or do you want me to?

I talked with Liang yesterday. He would like to try the upstream work.
So I'm waiting for his selftest patch and will post it together with mine.
I will talk with Liang and see if we can post the patch set today.

Thanks
Hangbin

Comment 4 Jonathan Toppins 2022-12-14 17:52:50 UTC
v2 was accepted upstream.

Comment 5 Jonathan Toppins 2022-12-14 18:13:10 UTC
Closing this bug; it was opened only to track the upstream fix.