Bug 2188102 - active-backup bond configured with an 802.3ad bond as a slave has incorrect speed/duplex information
Summary: active-backup bond configured with an 802.3ad bond as a slave has incorrect speed/duplex information
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: kernel
Version: CentOS Stream
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Hangbin Liu
QA Contact: LiLiang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-04-19 19:18 UTC by Andrew Schorr
Modified: 2024-04-30 10:10 UTC (History)
6 users (show)

Fixed In Version: kernel-5.14.0-364.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-04-30 10:09:49 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
script to demonstrate that the active-backup bond speed is not updating when the speed of the underlying link changes (1.20 KB, application/x-shellscript)
2023-04-25 15:23 UTC, Andrew Schorr


Links
System ID Private Priority Status Summary Last Updated
Gitlab redhat/centos-stream/src/kernel centos-stream-9 merge_requests 3012 0 None opened bonding: update port speed when getting bond speed 2023-09-05 02:56:35 UTC
Red Hat Issue Tracker RHELPLAN-155240 0 None None None 2023-04-19 19:19:12 UTC
Red Hat Product Errata RHSA-2024:2394 0 None None None 2024-04-30 10:09:51 UTC

Description Andrew Schorr 2023-04-19 19:18:14 UTC
Description of problem:
I have bond0 configured as an 802.3ad bond with 2 10-gig links, for 20-gig
of bandwidth. On top of that, I have bond1 configured as an active-backup bond
with bond0 as the primary slave and a single 1-gig ethernet connection as
the backup slave. Ethtool and /sys/class/net are not correctly reporting
the speed and duplex of bond1, which should match bond0 when it is active.
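For reference, the reported topology could be recreated with iproute2 roughly as follows (a sketch only, requiring root; the NIC names eth10g0, eth10g1, and eth1g0 are placeholders, not the reporter's actual interfaces):

```
# Sketch of the reported topology (NIC names are placeholders).
ip link add bond0 type bond mode 802.3ad miimon 100        # 2 x 10G -> 20G aggregate
ip link set eth10g0 down; ip link set eth10g0 master bond0
ip link set eth10g1 down; ip link set eth10g1 master bond0

ip link add bond1 type bond mode active-backup miimon 100 fail_over_mac follow
ip link set bond0 down; ip link set bond0 master bond1     # primary slave
ip link set eth1g0 down; ip link set eth1g0 master bond1   # 1G backup slave
echo bond0 > /sys/class/net/bond1/bonding/primary

ip link set bond1 up
ethtool bond1 | grep -E 'Speed|Duplex'
```

With bond0 active, the last command should report 20000Mb/s / Full, which is what the bug says does not happen.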

Version-Release number of selected component (if applicable):
kernel-5.14.0-295.el9.x86_64


How reproducible:
It always seems to be wrong. Usually, bond1's speed and duplex are unknown, although
at the moment, after an ifdown of bond0 followed by an ifup of bond0, I see that bond1
is now showing full duplex, but with the wrong speed (10-gig instead of 20-gig).

Steps to Reproduce:
1. configure bond0 802.3ad link with bond1 active-backup on top of that
2. ethtool bond1
3. head /sys/class/net/bond1/{speed,duplex}

Actual results:
        Speed: 10000Mb/s
        Duplex: Full
sh-5.1$ head /sys/class/net/bond1/{speed,duplex}
==> /sys/class/net/bond1/speed <==
10000

==> /sys/class/net/bond1/duplex <==
full


Expected results:
Should be 20000/Full, matching the active slave:
sh-5.1$ head -5 /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v5.14.0-295.el9.x86_64

Bonding Mode: fault-tolerance (active-backup) (fail_over_mac follow)
Primary Slave: bond0 (primary_reselect always)
Currently Active Slave: bond0
sh-5.1$ sudo ethtool bond0
Settings for bond0:
        Supported ports: [  ]
        Supported link modes:   Not reported
        Supported pause frame use: No
        Supports auto-negotiation: No
        Supported FEC modes: Not reported
        Advertised link modes:  Not reported
        Advertised pause frame use: No
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: 20000Mb/s
        Duplex: Full
        Auto-negotiation: off
        Port: Other
        PHYAD: 0
        Transceiver: internal
        Link detected: yes


Additional info:

Comment 1 LiLiang 2023-04-20 05:41:49 UTC
I did a simple test on RHEL 9 and didn't reproduce this issue.


# create bonds
```
echo +bond0 > /sys/class/net/bonding_masters 
echo 4 > /sys/class/net/bond0/bonding/mode 
echo 100 > /sys/class/net/bond0/bonding/miimon
ip link set bond0 up
ifenslave bond0 ens1f0 ens1f1 
#source /mnt/tests/kernel/networking/common/network.sh
#get_iface_sw_port "ens1f0 ens1f1" sw p
#swcfg setup_port_channel $sw "$p" active
echo +bond1 > /sys/class/net/bonding_masters 
echo 1 > /sys/class/net/bond1/bonding/mode 
echo 100 > /sys/class/net/bond1/bonding/miimon
ip link set bond1 up
ifenslave bond1 bond0 ens4f0np0
```

# ethtool bond1
Settings for bond1:
	Supported ports: [  ]
	Supported link modes:   Not reported
	Supported pause frame use: No
	Supports auto-negotiation: No
	Supported FEC modes: Not reported
	Advertised link modes:  Not reported
	Advertised pause frame use: No
	Advertised auto-negotiation: No
	Advertised FEC modes: Not reported
	Speed: 50000Mb/s
	Duplex: Full
	Auto-negotiation: off
	Port: Other
	PHYAD: 0
	Transceiver: internal
	Link detected: yes

[root@dell-per740-86 ~]# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v5.14.0-296.2191_828573416.el9.x86_64

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

802.3ad info
LACP active: on
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: b4:96:91:a5:9f:50
Active Aggregator Info:
	Aggregator ID: 2
	Number of ports: 2
	Actor Key: 21
	Partner Key: 47
	Partner Mac Address: b0:8b:d0:0a:73:3b

Slave Interface: ens1f0
MII Status: up
Speed: 25000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: b4:96:91:a5:9f:50
Slave queue ID: 0
Aggregator ID: 2
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 1
details actor lacp pdu:
    system priority: 65535
    system mac address: b4:96:91:a5:9f:50
    port key: 21
    port priority: 255
    port number: 1
    port state: 61
details partner lacp pdu:
    system priority: 32768
    system mac address: b0:8b:d0:0a:73:3b
    oper key: 47
    port priority: 32768
    port number: 353
    port state: 63

Slave Interface: ens1f1
MII Status: up
Speed: 25000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: b4:96:91:a5:9f:51
Slave queue ID: 0
Aggregator ID: 2
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 1
Partner Churned Count: 1
details actor lacp pdu:
    system priority: 65535
    system mac address: b4:96:91:a5:9f:50
    port key: 21
    port priority: 255
    port number: 2
    port state: 61
details partner lacp pdu:
    system priority: 32768
    system mac address: b0:8b:d0:0a:73:3b
    oper key: 47
    port priority: 32768
    port number: 357
    port state: 63
[root@dell-per740-86 ~]# cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v5.14.0-296.2191_828573416.el9.x86_64

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: bond0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

Slave Interface: bond0
MII Status: up
Speed: 50000 Mbps
Duplex: full
Link Failure Count: 2
Permanent HW addr: b4:96:91:a5:9f:50
Slave queue ID: 0

Slave Interface: ens4f0np0
MII Status: up
Speed: 25000 Mbps
Duplex: full
Link Failure Count: 2
Permanent HW addr: 00:0f:53:7f:88:a0
Slave queue ID: 0

# uname -r
5.14.0-296.2191_828573416.el9.x86_64

Comment 2 LiLiang 2023-04-20 06:36:41 UTC
When setting up the bonds with NetworkManager, the active-backup bond does not show correct speed info right after the bonding devices are created,
but after re-upping the LACP bond, the active-backup bond's speed is displayed correctly.

# setup
```
nmcli con add con-name mybond1 type bond ifname mybond1 bond.options "mode=1,miimon=100,updelay=5000,primary=mybond0"

nmcli con add con-name mybond0 type bond ifname mybond0 bond.options "mode=802.3ad,miimon=100,updelay=5000" master mybond1
nmcli con add con-name ens1f0 type ethernet ifname ens1f0 master mybond0
nmcli con add con-name ens1f1 type ethernet ifname ens1f1 master mybond0
#$nmcli con up mybond0

nmcli con add con-name ens4f0np0 type ethernet ifname ens4f0np0 master mybond1
nmcli con up mybond1
```
# After creation, the LACP bond displays speed info, but the active-backup bond does not
[root@dell-per740-86 ~]# ethtool mybond0
Settings for mybond0:
	Supported ports: [  ]
	Supported link modes:   Not reported
	Supported pause frame use: No
	Supports auto-negotiation: No
	Supported FEC modes: Not reported
	Advertised link modes:  Not reported
	Advertised pause frame use: No
	Advertised auto-negotiation: No
	Advertised FEC modes: Not reported
	Speed: 50000Mb/s
	Duplex: Full
	Auto-negotiation: off
	Port: Other
	PHYAD: 0
	Transceiver: internal
	Link detected: yes
[root@dell-per740-86 ~]# ethtool mybond1
Settings for mybond1:
	Supported ports: [  ]
	Supported link modes:   Not reported
	Supported pause frame use: No
	Supports auto-negotiation: No
	Supported FEC modes: Not reported
	Advertised link modes:  Not reported
	Advertised pause frame use: No
	Advertised auto-negotiation: No
	Advertised FEC modes: Not reported
	Speed: Unknown!
	Duplex: Unknown! (255)
	Auto-negotiation: off
	Port: Other
	PHYAD: 0
	Transceiver: internal
	Link detected: yes
[root@dell-per740-86 ~]# cat /proc/net/bonding/mybond0
Ethernet Channel Bonding Driver: v5.14.0-296.2191_828573416.el9.x86_64

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 5000
Down Delay (ms): 0
Peer Notification Delay (ms): 0

802.3ad info
LACP active: on
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 00:0f:53:7f:88:a0
Active Aggregator Info:
	Aggregator ID: 1
	Number of ports: 2
	Actor Key: 21
	Partner Key: 47
	Partner Mac Address: b0:8b:d0:0a:73:3b

Slave Interface: ens1f0
MII Status: up
Speed: 25000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: b4:96:91:a5:9f:50
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: monitoring
Partner Churn State: monitoring
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
    system priority: 65535
    system mac address: 00:0f:53:7f:88:a0
    port key: 21
    port priority: 255
    port number: 1
    port state: 61
details partner lacp pdu:
    system priority: 32768
    system mac address: b0:8b:d0:0a:73:3b
    oper key: 47
    port priority: 32768
    port number: 353
    port state: 63

Slave Interface: ens1f1
MII Status: up
Speed: 25000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: b4:96:91:a5:9f:51
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: monitoring
Partner Churn State: monitoring
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
    system priority: 65535
    system mac address: 00:0f:53:7f:88:a0
    port key: 21
    port priority: 255
    port number: 2
    port state: 61
details partner lacp pdu:
    system priority: 32768
    system mac address: b0:8b:d0:0a:73:3b
    oper key: 47
    port priority: 32768
    port number: 357
    port state: 63
[root@dell-per740-86 ~]# cat /proc/net/bonding/mybond1
Ethernet Channel Bonding Driver: v5.14.0-296.2191_828573416.el9.x86_64

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: mybond0 (primary_reselect always)
Currently Active Slave: mybond0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 5000
Down Delay (ms): 0
Peer Notification Delay (ms): 0

Slave Interface: ens4f0np0
MII Status: up
Speed: 25000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:0f:53:7f:88:a0
Slave queue ID: 0

Slave Interface: mybond0
MII Status: up
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr: ea:7e:04:ad:e4:e5
Slave queue ID: 0


But after re-upping the LACP bond, the active-backup bond shows correct link speed info.

# re-up lacp bond
[root@dell-per740-86 ~]# ip link set mybond0 down
[root@dell-per740-86 ~]# ip link set mybond0 up
# let lacp bond become active slave
[root@dell-per740-86 ~]# ip link set ens4f0np0 down
[root@dell-per740-86 ~]# ip link set ens4f0np0 up
# check ab bond speed info
[root@dell-per740-86 ~]# ethtool mybond1
Settings for mybond1:
	Supported ports: [  ]
	Supported link modes:   Not reported
	Supported pause frame use: No
	Supports auto-negotiation: No
	Supported FEC modes: Not reported
	Advertised link modes:  Not reported
	Advertised pause frame use: No
	Advertised auto-negotiation: No
	Advertised FEC modes: Not reported
	Speed: 50000Mb/s
	Duplex: Full
	Auto-negotiation: off
	Port: Other
	PHYAD: 0
	Transceiver: internal
	Link detected: yes

Comment 3 Andrew Schorr 2023-04-20 15:08:27 UTC
Yes, I duplicated your result. The initial bond1 speed/duplex after booting is Unknown!/Unknown!,
but setting bond0 down and then back up fixes the problem. But that shouldn't be necessary.
This seems like a bug.

Comment 4 Andrew Schorr 2023-04-24 20:15:37 UTC
Is there a reason that this has not been accepted as a bug? Surely this behavior is not OK.
After rebooting, bond1 comes up with speed and duplex unknown:

sh-5.1$ head /sys/class/net/bond1/{speed,duplex}
==> /sys/class/net/bond1/speed <==
-1

==> /sys/class/net/bond1/duplex <==
unknown

sh-5.1$ sudo ethtool bond1 | egrep -i 'speed|duplex'
        Speed: Unknown!
        Duplex: Unknown! (255)

 
Regards,
Andy

Comment 5 LiLiang 2023-04-25 00:40:31 UTC
Hangbin,

Is this a bug?

Regards,
Liang Li

Comment 6 Hangbin Liu 2023-04-25 08:49:40 UTC
(In reply to Andrew Schorr from comment #3)
> Yes, I duplicated your result. The initial bond1 speed/duplex after booting
> is Unknown!/Unknown!,
> but setting bond0 down and then back up fixes the problem. But that
> shouldn't be necessary.
> This seems like a bug.

This is because you added bond1 on top of bond0 before bond0 exchanged the 802.3ad/LACP link info with its peer,
so bond1 only got the 10G link info from bond0. After that, bond0 synced the link info and updated to 20G.

If you set bond0 up and let it exchange/sync the link info first, and then add bond1 on top of bond0, you will get the correct speed.

Here is a reproducer:

```
#!/bin/bash
s_ns="s"
c_ns="c"

ip -a netns del
sleep 1
ip netns add ${c_ns}
ip netns add ${s_ns}

ip -n ${c_ns} link add bond0 type bond mode 802.3ad miimon 100
ip -n ${s_ns} link add bond0 type bond mode 802.3ad miimon 100
ip -n ${s_ns} link add bond1 type bond mode active-backup miimon 100

for i in $(seq 0 2); do
        ip -n ${c_ns} link add eth${i} type veth peer name eth${i} netns ${s_ns}
        [ $i -eq 2 ] && break
        ip -n ${c_ns} link set eth${i} master bond0
        ip -n ${s_ns} link set eth${i} master bond0
done

ip -n ${c_ns} link set eth2 up
ip -n ${c_ns} link set bond0 up

# set bond0 up to sync the info
# ip -n ${s_ns} link set bond0 up
# sleep 5
# ip -n ${s_ns} link set bond0 down

ip -n ${s_ns} link set bond0 master bond1
ip -n ${s_ns} link set eth2 master bond1
ip -n ${s_ns} link set bond1 up

ip netns exec ${c_ns} ethtool bond0 | grep Speed
ip netns exec ${s_ns} ethtool bond0 | grep Speed
ip netns exec ${s_ns} ethtool bond1 | grep Speed
```

When you run the reproducer directly, you will see:
# ./bond_topo_lacp.sh
        Speed: 10000Mb/s
        Speed: 10000Mb/s
        Speed: 10000Mb/s

If you uncomment the following part
```
# ip -n ${s_ns} link set bond0 up
# sleep 5
# ip -n ${s_ns} link set bond0 down
```
and re-run the test, you will get:
# ./bond_topo_lacp.sh
        Speed: 20000Mb/s
        Speed: 20000Mb/s
        Speed: 20000Mb/s

So I think this is a configuration issue, not a bug. What do you think?

Thanks
Hangbin

Comment 7 Andrew Schorr 2023-04-25 13:01:58 UTC
Hi,

Thanks for working on this. I agree 100% that there are race conditions involved.
However, I still think that the current behavior is buggy. The fact is that
an 802.3ad bond has a dynamic speed setting. If a slave goes down or up,
then the speed of the 802.3ad bond will decrease or increase. The active-backup bond in bond1
that is the parent of bond0 must therefore also have a dynamic speed value.
When the speed of its active slave interface changes, it must also change its
speed. With the sleep uncommented, can you please use 'ip link set ... down' at the
end of the script to disable one of the slave links of the 802.3ad bond?
After you do that, does the speed of bond1 update properly? If not, then
there's a bug. I'm 99% certain that you will see a bug. The active-backup bond
needs to understand that the speed of its slaves may be dynamic. There need
to be hooks in the kernel to trigger the bond master to reevaluate its speed
when a slave goes up or down.
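Concretely, the check being requested could be appended to the end of the comment 6 reproducer along these lines (a sketch reusing that script's namespace variables; not run here, as it needs root and the namespaces from that script):

```
# With the "set bond0 up / sleep / down" block uncommented above, take one
# leg of the 802.3ad bond down on the peer side and recheck the speeds.
ip -n ${c_ns} link set eth0 down
sleep 5   # give LACP/miimon time to notice the lost port

# bond0 should drop to 10000Mb/s; the question is whether bond1 follows.
ip netns exec ${c_ns} ethtool bond0 | grep Speed
ip netns exec ${s_ns} ethtool bond0 | grep Speed
ip netns exec ${s_ns} ethtool bond1 | grep Speed
```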

Regards,
Andy

Comment 8 Andrew Schorr 2023-04-25 15:23:55 UTC
Created attachment 1959829 [details]
script to demonstrate that the active-backup bond speed is not updating when the speed of the underlying link changes

This script demonstrates that the top-level active-backup link does
not dynamically update when the speed/duplex of the underlying link changes.
This looks like a kernel bug to me. There need to be kernel hooks so that
the master can react when the speed/duplex of a slave changes.

Comment 9 Andrew Schorr 2023-04-25 15:25:16 UTC
To be clear, when I run that script, I see this:

sh-5.2# ./bondbug.sh
        Speed: 20000Mb/s
        Speed: 20000Mb/s
        Speed: 20000Mb/s
Now disabling eth0 in namespace c
        Speed: 10000Mb/s
        Speed: 10000Mb/s
        Speed: 20000Mb/s

That's on Fedora 37 with kernel 6.2.12-200.

Comment 10 Andrew Schorr 2023-04-26 16:44:45 UTC
I'm puzzled by what's going on here. It seems blindingly obvious to me that there's
a kernel bug here. How is that not obvious to you? Let's move forward. It could well
be an upstream issue. How do we make progress on this?

Regards,
Andy

Comment 11 Hangbin Liu 2023-04-27 04:08:12 UTC
(In reply to Andrew Schorr from comment #8)
> 
> This script demonstrates that the the top-level active-backup link does
> not dynamically update when the speed/duplex of the underlying link changes.
> This looks like a kernel bug to me. There need to be kernel hooks so that
> the master can react when the speed/duplex of a slave changes.

The bonding interface only dynamically updates the speed/duplex in 802.3ad mode.
In other modes, it only updates the speed/duplex when there is a failover or a slave change.

The downlink bond0's speed change doesn't trigger a failover, so the upper-level bond1's speed won't change.

If you still think this is a kernel bug or design issue, I'd suggest you ask this upstream.

Thanks
Hangbin

Comment 12 Andrew Schorr 2023-04-27 12:49:39 UTC
I understand the current behavior: it does not dynamically update the speed/duplex
of the active-backup bond when the speed/duplex of the active slave changes.

I don't THINK that it's a bug; I KNOW that it's a bug. Unless the kernel folks say that it's
invalid to have a bond on top of another bond and that they don't support this configuration.
But that would be silly.

I'm willing to report it upstream, but they don't want me to. When I go to
https://bugzilla.kernel.org/
it says:

"This bugzilla is for reporting bugs against upstream Linux kernels.

If you did not compile your own kernel from scratch, you are probably in the wrong place.
Please use the following links to report a bug to your distribution instead:
Ubuntu | Fedora | Arch | Mint | Debian | Red Hat | OpenSUSE | SUSE"

I am not compiling my own kernel. I'm using the CentOS Stream 9 kernel. So you guys
really need to file the bug against upstream.

Why are you so unwilling to acknowledge that this is a bug? It seems pretty obvious.

Regards,
Andy

Comment 13 Hangbin Liu 2023-04-28 07:34:24 UTC
(In reply to Andrew Schorr from comment #12)
> Why are you so unwilling to acknowledge that this is a bug? It seems pretty
> obvious.

OK, let me discuss this upstream and see how to fix it.

Hangbin

Comment 14 Hangbin Liu 2023-05-09 03:43:09 UTC
Hi Andrew,

Here is the discussion[1] I had with the bonding maintainer.
He advises against nesting an LACP bond inside an active-backup bond. Could you use an LACP bond directly?

[1] https://lore.kernel.org/netdev/ZFm7Hwz6cqEkVB1g@Laptop-X1/T/#mf8433b43239f5cb843fdc974565bb41b5b94ce5f

Thanks
Hangbin

Comment 15 Andrew Schorr 2023-05-09 15:33:26 UTC
Hi Hangbin,

Thank you for pursuing this upstream. I selected an active-backup bond on top of 802.3ad
for 2 reasons:

1. I am using arp_ip_target to ensure that the network interface is truly connected
to the other network devices that it needs to talk to. Perhaps I am mistaken, but I
believe that 802.3ad bonds use only MII monitoring to determine whether the link is up.

2. I have 2 10-gig links in a port-channel group on an Arista switch as the
primary link, and the backup is a single, "normal", 1-gig connection to a Cisco switch.
I would also like to be able to use PXE booting over that normal 1-gig link, so I
did not want to set it up in an LACP port-channel.

While I was aware that a Linux 802.3ad bond could contain multiple aggregators,
I was not aware of this concept of having a non-LACP normal NIC as an aggregator
inside the 802.3ad bond. If that truly works, then it would address my concern #2,
but it doesn't change issue #1 regarding ARP monitoring. 

Where is this idea of having a non-LACP NIC documented? I don't see any
mention in the Bonding Driver HOWTO:

https://www.kernel.org/doc/Documentation/networking/bonding.txt

If I want to use that config, how do I enable it? Looking at the source code
in drivers/net/bonding/bond_3ad.c, I see this code snippet:

                        if (port->actor_oper_port_key & AD_DUPLEX_KEY_MASKS)
                                /* if port is full duplex */
                                port->aggregator->is_individual = false;
                        else
                                port->aggregator->is_individual = true;

This seems to suggest that I need to disable duplex in the ad_user_port_key to
select this "individual" mode, but I'm confused about that, because
"ad_user_port_key" seems to be a property of the bonding master, not the slaves.

So how do I actually use this "individual" non-LACP normal NIC in an 802.3ad bonding
group? Does it somehow magically work automatically with no configuration? Normally,
I would expect the Linux system to attempt LACP with the switch and then refuse to
bring up the link if the switch does not respond with LACP packets.

In any case, I still prefer to have an active-backup bond on top of 802.3ad
because that allows me to use arp monitoring.

And look -- it's not the end of the world if the speed & duplex info for the
active-backup bond are incorrect. However, it would be preferable if they could
just be set to -1 or N/A instead of sometimes having bogus values. It seems like
it should be possible to detect that one or more of the slaves is a bond and then
just set speed and duplex to N/A.
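Until the kernel behaves that way, a monitoring script can at least normalize the sysfs sentinel itself. A minimal sketch (the read_speed helper is my own, not from the bug; the kernel writes -1 to /sys/class/net/<dev>/speed when the speed is unknown, as shown in comment 4):

```
#!/bin/sh
# read_speed FILE: print the value of a sysfs-style "speed" attribute,
# mapping the kernel's -1 "unknown" sentinel (or a missing/unreadable
# file) to "N/A" instead of a bogus number.
read_speed() {
    s=$(cat "$1" 2>/dev/null) || { echo "N/A"; return; }
    if [ "$s" = "-1" ]; then
        echo "N/A"
    else
        echo "${s}Mb/s"
    fi
}

# Demo against a temp file standing in for /sys/class/net/bond1/speed:
f=$(mktemp)
echo 20000 > "$f"
read_speed "$f"                 # prints 20000Mb/s
printf '%s\n' "-1" > "$f"
read_speed "$f"                 # prints N/A
rm -f "$f"
```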

Please do not disable bonds of bonds in the kernel. I find it to be a useful feature.
But also -- please improve the 802.3ad bonding documentation to explain more clearly
that one can use multiple aggregators (it took me a while to figure this out when
I was planning my configuration), and also to explain this concept of having an
individual "normal" non-LACP aggregator inside the bond and how that works.

Thanks,
Andy

Comment 25 errata-xmlrpc 2024-04-30 10:09:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: kernel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:2394

