Bug 2218406

Summary: [ovs bond] The hash ID disappears after changing bond-rebalance-interval to zero
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: mhou <mhou>
Component: openvswitch
Assignee: Mike Pattrick <mpattric>
openvswitch sub component: daemons and tools
QA Contact: mhou <mhou>
Status: CLOSED NOTABUG
Docs Contact:
Severity: medium
Priority: unspecified
CC: ctrautma, fleitner, jhsiao, mhou, mpattric
Version: RHEL 9.0
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-06-29 13:44:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description mhou 2023-06-29 02:52:30 UTC
Description of problem:
After changing bond-rebalance-interval from 60000 to 0, the hash IDs of the current flows no longer appear in the ovs-appctl bond/show output.

Version-Release number of selected component (if applicable):
kernel version: 5.14.0-325.el9.x86_64
ovs version: openvswitch3.1-3.1.0-24.el9fdp.x86_64

How reproducible: 100%


Steps to Reproduce:
1. Create the OVS bond topology and add a balance-slb port.
ovs-vsctl add-bond bondbridge balance-slb ens4f0 ens4f1 lacp=off bond_mode=balance-slb  -- set Interface ens4f0 mtu_request=9000  -- set Interface ens4f1 mtu_request=9000  -- set Port balance-slb  other_config:bond-rebalance-interval=60000
# ovs-vsctl show
9debfe07-1cd3-4715-8a38-17a9cad01d00
    Bridge bondbridge
        Port balance-slb
            Interface ens4f0
            Interface ens4f1
        Port bondbridge
            Interface bondbridge
                type: internal
        Port patchbond
            Interface patchbond
                type: patch
                options: {peer=patchguest}
    Bridge guestbridge
        Port guestbridge
            Interface guestbridge
                type: internal
        Port de796d2ede724_l
            Interface de796d2ede724_l
        Port "45aca8e1121e4_l"
            Interface "45aca8e1121e4_l"
        Port patchguest
            Interface patchguest
                type: patch
                options: {peer=patchbond}
        Port a0fc52e5b7d74_l
            Interface a0fc52e5b7d74_l
    ovs_version: "3.1.2"

2. Send traffic from ports a0fc52e5b7d74_l and 45aca8e1121e4_l to the peer side.
nohup podman exec -it g1 bash -c 'netperf -4 -t TCP_STREAM -H 172.31.150.1 -l 3600 -- -m 9000' &> /dev/null &
nohup podman exec -it g2 bash -c 'netperf -4 -t TCP_STREAM -H 172.31.150.1 -l 3600 -- -m 9000' &> /dev/null &
3. Check the current bond status with ovs-appctl bond/show.
# ovs-appctl bond/show
---- balance-slb ----
bond_mode: balance-slb
bond may use recirculation: no, Recirc-ID : -1
bond-hash-basis: 0
lb_output action: disabled, bond-id: -1
all members active: false
updelay: 0 ms
downdelay: 0 ms
next rebalance: 56783 ms
lacp_status: off
lacp_fallback_ab: false
active-backup primary: <none>
active member mac: 3c:fd:fe:bd:1c:a4(ens4f0)

member ens4f0: enabled
  active member
  may_enable: true
  hash 161: 76672627 kB load

member ens4f1: enabled
  may_enable: true
  hash 40: 75351593 kB load

4. Set bond-rebalance-interval to 0 and query the hash of one source MAC.
# ovs-vsctl set port balance-slb other_config:bond-rebalance-interval=0
# ovs-appctl bond/hash 00:de:ad:96:02:12
161

5. Check the current bond status again.
# ovs-appctl bond/show
---- balance-slb ----
bond_mode: balance-slb
bond may use recirculation: no, Recirc-ID : -1
bond-hash-basis: 0
lb_output action: disabled, bond-id: -1
all members active: false
updelay: 0 ms
downdelay: 0 ms
lacp_status: off
lacp_fallback_ab: false
active-backup primary: <none>
active member mac: 3c:fd:fe:bd:1c:a4(ens4f0)

member ens4f0: enabled
  active member
  may_enable: true

member ens4f1: enabled
  may_enable: true


Actual results:
After changing bond-rebalance-interval to zero, the traffic hash IDs disappear from the bond/show output.

Expected results:
Changing the value of bond-rebalance-interval should not affect the traffic hash entries shown by bond/show.
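
The difference between steps 3 and 5 can be checked mechanically. The following is a minimal sketch, assuming the bond/show text has already been captured into a string; the helper name hashes_by_member and the two sample strings (copied from the member sections above) are illustrative and not part of OVS.

```python
import re

# Per-member hash lines look like "  hash 161: 76672627 kB load".
HASH_LINE = re.compile(r"^\s*hash (\d+): (\d+) kB load\s*$")
# Member headers look like "member ens4f0: enabled".
MEMBER_LINE = re.compile(r"^member (\S+): (\w+)\s*$")


def hashes_by_member(bond_show_text):
    """Map each member name to {hash_id: load_kB} parsed from bond/show text."""
    result = {}
    current = None
    for line in bond_show_text.splitlines():
        member = MEMBER_LINE.match(line)
        if member:
            current = member.group(1)
            result[current] = {}
            continue
        entry = HASH_LINE.match(line)
        if entry and current is not None:
            result[current][int(entry.group(1))] = int(entry.group(2))
    return result


# Member sections from step 3 (bond-rebalance-interval=60000).
BEFORE = """\
member ens4f0: enabled
  active member
  may_enable: true
  hash 161: 76672627 kB load

member ens4f1: enabled
  may_enable: true
  hash 40: 75351593 kB load
"""

# Member sections from step 5 (bond-rebalance-interval=0).
AFTER = """\
member ens4f0: enabled
  active member
  may_enable: true

member ens4f1: enabled
  may_enable: true
"""

print(hashes_by_member(BEFORE))
print(hashes_by_member(AFTER))
```

With the outputs above, the first call returns one hash entry per member and the second returns an empty mapping for both members, which is the behavior reported here.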

Additional info:

This issue also affects balance-tcp ports.
# ovs-vsctl set port balance-tcp other_config:bond-rebalance-interval=0
[root@hp-dl388g10-03 ovs_bond_function]# ovs-vsctl list port balance-tcp
_uuid               : 8270104e-8d85-4b0b-9dcd-067a2b7f4969
bond_active_slave   : "3c:fd:fe:bd:1c:a4"
bond_downdelay      : 0
bond_fake_iface     : false
bond_mode           : balance-tcp
bond_updelay        : 0
cvlans              : []
external_ids        : {}
fake_bridge         : false
interfaces          : [2f8df384-e862-4dc7-854c-6d3e3abe1914, 81b7adfa-af28-4066-a9cb-5a54a6259835]
lacp                : active
mac                 : []
name                : balance-tcp
other_config        : {bond-rebalance-interval="0"}
protected           : false
qos                 : []
rstp_statistics     : {}
rstp_status         : {}
statistics          : {}
status              : {}
tag                 : []
trunks              : []
vlan_mode           : []

# ovs-appctl bond/show
---- balance-tcp ----
bond_mode: balance-tcp
bond may use recirculation: yes, Recirc-ID : 1
bond-hash-basis: 0
lb_output action: disabled, bond-id: -1
updelay: 0 ms
downdelay: 0 ms
lacp_status: negotiated
lacp_fallback_ab: false
active-backup primary: <none>
active member mac: 3c:fd:fe:bd:1c:a4(ens4f0)

member ens4f0: enabled
  active member
  may_enable: true

member ens4f1: enabled
  may_enable: true

# ovs-appctl bond/hash 00:de:ad:96:02:02
40

Comment 1 mhou 2023-06-29 07:47:24 UTC
This leads to a related issue: after changing bond-rebalance-interval from 0 to 1000, the hash IDs shown by ovs-appctl bond/show do not match the values returned by ovs-appctl bond/hash.

[root@hp-dl388g10-03 ovs_bond_function]# ovs-vsctl set port balance-tcp other_config:bond-rebalance-interval=0
[root@hp-dl388g10-03 ovs_bond_function]# ovs-vsctl list port balance-tcp
_uuid               : 8270104e-8d85-4b0b-9dcd-067a2b7f4969
bond_active_slave   : "3c:fd:fe:bd:1c:a4"
bond_downdelay      : 0
bond_fake_iface     : false
bond_mode           : balance-tcp
bond_updelay        : 0
cvlans              : []
external_ids        : {}
fake_bridge         : false
interfaces          : [2f8df384-e862-4dc7-854c-6d3e3abe1914, 81b7adfa-af28-4066-a9cb-5a54a6259835]
lacp                : active
mac                 : []
name                : balance-tcp
other_config        : {bond-rebalance-interval="0"}
protected           : false
qos                 : []
rstp_statistics     : {}
rstp_status         : {}
statistics          : {}
status              : {}
tag                 : []
trunks              : []
vlan_mode           : []
# ovs-vsctl set port balance-tcp other_config:bond-rebalance-interval=1000
# ovs-vsctl get port balance-tcp other_config
{bond-rebalance-interval="1000"}
# ovs-appctl bond/hash 00:de:ad:96:02:02
40
# ovs-appctl bond/hash 00:de:ad:96:02:12
161
# ovs-appctl bond/show
---- balance-tcp ----
bond_mode: balance-tcp
bond may use recirculation: yes, Recirc-ID : 1
bond-hash-basis: 0
lb_output action: disabled, bond-id: -1
updelay: 0 ms
downdelay: 0 ms
next rebalance: 725 ms
lacp_status: negotiated
lacp_fallback_ab: false
active-backup primary: <none>
active member mac: 3c:fd:fe:bd:1c:a4(ens4f0)

member ens4f0: enabled
  active member
  may_enable: true
  hash 162: 1514807 kB load

member ens4f1: enabled
  may_enable: true
  hash 58: 1511342 kB load


Checking ovs-vswitchd.log shows that the hash IDs in the log match those from ovs-appctl bond/show but not those returned by ovs-appctl bond/hash:
2023-06-29T07:46:28.173Z|00437|bond|DBG|bond balance-tcp: ens4f1 3028455kB (h58: 3028455kB + h224: 0kB), ens4f0 3020699kB (h162: 3020699kB)
2023-06-29T07:46:29.179Z|00438|bond|DBG|bond balance-tcp: ens4f1 3036464kB (h58: 3036464kB + h224: 0kB), ens4f0 3031085kB (h162: 3031085kB)
2023-06-29T07:46:30.179Z|00439|bond|DBG|bond balance-tcp: ens4f1 3033405kB (h58: 3033405kB + h224: 0kB), ens4f0 3025485kB (h162: 3025485kB)
2023-06-29T07:46:31.179Z|00440|bond|DBG|bond balance-tcp: ens4f1 3027718kB (h58: 3027718kB + h224: 0kB), ens4f0 3026777kB (h162: 3026777kB)
2023-06-29T07:46:32.179Z|00441|bond|DBG|bond balance-tcp: ens4f0 3027846kB (h162: 3027846kB), ens4f1 3024283kB (h58: 3024283kB + h224: 0kB)
2023-06-29T07:46:33.179Z|00442|bond|DBG|bond balance-tcp: ens4f1 3026467kB (h58: 3026467kB + h224: 0kB), ens4f0 3025002kB (h162: 3025002kB)
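
To compare the logged hash IDs with the bond/hash results without reading each line by eye, the debug lines can be parsed. This is a rough sketch that assumes the line layout seen in the log excerpt above; the function name parse_bond_debug_line is hypothetical, and the regular expressions are derived only from these sample lines, not from the OVS sources.

```python
import re

# One member block, e.g. "ens4f1 3028455kB (h58: 3028455kB + h224: 0kB)".
MEMBER = re.compile(r"(\S+) (\d+)kB \(([^)]*)\)")
# One hash bucket inside the parentheses, e.g. "h58: 3028455kB".
BUCKET = re.compile(r"h(\d+): (\d+)kB")


def parse_bond_debug_line(line):
    """Return {member: {hash_id: load_kB}} for one bond DBG log line."""
    # Fields are "timestamp|seq|module|level|message".
    message = line.split("|", 4)[4]
    # Drop the "bond <name>: " prefix and keep the member list.
    _, _, members = message.partition(": ")
    return {
        m.group(1): {int(h): int(kb) for h, kb in BUCKET.findall(m.group(3))}
        for m in MEMBER.finditer(members)
    }


LINE = (
    "2023-06-29T07:46:28.173Z|00437|bond|DBG|bond balance-tcp: "
    "ens4f1 3028455kB (h58: 3028455kB + h224: 0kB), "
    "ens4f0 3020699kB (h162: 3020699kB)"
)

parsed = parse_bond_debug_line(LINE)
logged_ids = {h for buckets in parsed.values() for h in buckets}
# Values returned by "ovs-appctl bond/hash" for the two MACs in this comment.
queried_ids = {40, 161}

print(parsed)
print(sorted(logged_ids), sorted(queried_ids & logged_ids))
```

For the sample line, the logged IDs are 58, 162, and 224, and none of them overlaps with the two bond/hash results, which is the mismatch described in this comment.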

Comment 2 Mike Pattrick 2023-06-29 13:44:40 UTC
We intentionally do not print the hash values when the rebalance interval is zero.

This is the logic we use to decide if we will print this information:

>     return bond->rebalance_interval                                             
>         && (bond->balance == BM_SLB || bond->balance == BM_TCP)                 
>         && !(bond->lacp_fallback_ab && bond->lacp_status == LACP_CONFIGURED);

If this is a desired feature, then please reopen this bug as an RFE.
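
For readers who want to see how the quoted condition applies to the configurations in this report, here is a small Python model of it. It is only a sketch: the function name prints_hash_details and the Bond dataclass are made up for illustration, and the enum members other than BM_SLB, BM_TCP, and LACP_CONFIGURED (which appear in the quoted code) are assumed placeholders.

```python
from dataclasses import dataclass, replace
from enum import Enum, auto


class BondMode(Enum):
    BM_TCP = auto()
    BM_SLB = auto()
    BM_AB = auto()  # assumed placeholder for a non-balancing mode


class LacpStatus(Enum):
    LACP_NEGOTIATED = auto()  # assumed placeholder
    LACP_CONFIGURED = auto()
    LACP_DISABLED = auto()    # assumed placeholder


@dataclass
class Bond:
    rebalance_interval: int  # milliseconds; 0 disables rebalancing
    balance: BondMode
    lacp_fallback_ab: bool
    lacp_status: LacpStatus


def prints_hash_details(bond):
    """Python model of the quoted C condition (hypothetical name)."""
    return (
        bond.rebalance_interval != 0
        and bond.balance in (BondMode.BM_SLB, BondMode.BM_TCP)
        and not (
            bond.lacp_fallback_ab
            and bond.lacp_status == LacpStatus.LACP_CONFIGURED
        )
    )


# balance-slb bond from the description: lacp off, no fallback.
slb = Bond(60000, BondMode.BM_SLB, False, LacpStatus.LACP_DISABLED)
# balance-tcp bond from comment 1: lacp negotiated, no fallback.
tcp = Bond(1000, BondMode.BM_TCP, False, LacpStatus.LACP_NEGOTIATED)

print(prints_hash_details(slb))                                 # interval 60000
print(prints_hash_details(replace(slb, rebalance_interval=0)))  # interval 0
print(prints_hash_details(tcp))                                 # interval 1000
print(prints_hash_details(replace(tcp, rebalance_interval=0)))  # interval 0
```

Under this model, both bonds satisfy the condition with a nonzero interval and fail it once the interval is 0, which is consistent with the bond/show outputs in the description and in comment 1.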