Description of problem:

Issue with LACP: it can't detect the link going down, and the path does not fail over as expected.

The LACP status is shown below:

ovs-appctl lacp/show
---- bond_dpdk ----
        status: active negotiated
        sys_id: a0:8c:f8:89:ca:90
        sys_priority: 65534
        aggregation key: 1
        lacp_time: slow

slave: dpdk0: current attached
        port_id: 2
        port_priority: 65535
        may_enable: true

        actor sys_id: a0:8c:f8:89:ca:90
        actor sys_priority: 65534
        actor port_id: 2
        actor port_priority: 65535
        actor key: 1
        actor state: activity aggregation synchronized collecting distributing

        partner sys_id: 80:b5:75:86:30:11
        partner sys_priority: 32768
        partner port_id: 2
        partner port_priority: 32768
        partner key: 5953
        partner state: activity timeout aggregation synchronized collecting distributing

slave: dpdk1: current attached
        port_id: 1
        port_priority: 65535
        may_enable: true

        actor sys_id: a0:8c:f8:89:ca:90
        actor sys_priority: 65534
        actor port_id: 1
        actor port_priority: 65535
        actor key: 1
        actor state: activity aggregation synchronized collecting distributing

        partner sys_id: 80:b5:75:86:30:11
        partner sys_priority: 32768
        partner port_id: 1
        partner port_priority: 32768
        partner key: 5953
        partner state: activity timeout aggregation synchronized collecting distributing

Version-Release number of selected component (if applicable):

cat installed-rpms | egrep 'kernel|openvswitch|dpdk'
dpdk-17.11-7.el7.x86_64 Thu Jun 21 16:29:26 2018
erlang-kernel-18.3.4.8-1.el7ost.x86_64 Thu Jun 21 16:21:12 2018
kernel-3.10.0-862.3.3.el7.x86_64 Thu Jun 21 15:42:09 2018
kernel-tools-3.10.0-862.3.3.el7.x86_64 Thu Jun 21 15:42:42 2018
kernel-tools-libs-3.10.0-862.3.3.el7.x86_64 Thu Jun 21 15:38:30 2018
openstack-neutron-openvswitch-12.0.2-0.20180421011362.0ec54fd.el7ost.noarch Thu Jun 21 16:24:25 2018
openvswitch-2.9.0-19.el7fdp.1.x86_64 Thu Jun 21 16:17:13 2018
openvswitch-ovn-central-2.9.0-19.el7fdp.1.x86_64 Thu Jun 21 16:29:22 2018
openvswitch-ovn-common-2.9.0-19.el7fdp.1.x86_64 Thu Jun 21 16:18:25 2018
openvswitch-ovn-host-2.9.0-19.el7fdp.1.x86_64 Thu Jun 21 16:29:22 2018
python-openvswitch-2.9.0-19.el7fdp.1.noarch Thu Jun 21 16:17:15 2018

How reproducible:

100% reproducible

Steps to Reproduce:

The networking is as follows:
(1) The two NICs on each of com6 and com7 form a DPDK bond. Detailed configuration:
    bond_mode=balance-tcp lacp=active other_config:lacp-time=slow other-config:lacp-fallback-ab=true other_config:bond-rebalance-interval=1000 other_config:bond-detect-mode=miimon other_config:bond-miimon-interval=100
(2) Configure an eth-trunk on the SW with LACP support.
(3) There are two links between each server (com6 and com7) and the SW.

    ----------------------
    |         SW         |
    ---|--|--------|--|---
       |  |        |  |
    ---|--|--   ---|--|---
    | com6  |   |  com7  |
    ---------   ----------

(1) Create a virtual machine on each of com6 and com7.
(2) Configure eth0 of the two VMs as follows:
    ifconfig eth0 7.7.7.1/24 up
    ifconfig eth0 7.7.7.2/24 up
(3) Start a long ping from one VM to the other:
    ping -c 10000 -s 60000 7.7.7.2
(4) Observe the ping traffic passing through one of the links between the server (com6 or com7) and the SW. You can see which link carries the traffic with the ovs-appctl bond/show command.
(5) Cut off the link carrying the traffic.
(6) Observe the ping: no matter how long you wait, it stays blocked.
(7) It recovers only if you stop the current ping and send nothing in the meantime; after about 10 s connectivity is restored.
Actual results:

The link-down event is not detected and the path does not fail over as expected.

Expected results:

The link-down event is detected and the path fails over as expected.

Additional info:

The OVS bonding configuration is below:

cat etc/sysconfig/network-scripts/ifcfg-bond_dpdk
# This file is autogenerated by os-net-config
DEVICE=bond_dpdk
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=no
PEERDNS=no
DEVICETYPE=ovs
TYPE=OVSDPDKBond
OVS_BRIDGE=br-link0
BOND_IFACES="dpdk0 dpdk1"
RX_QUEUE=1
OVS_OPTIONS="bond_mode=balance-tcp lacp=active other_config:lacp-time=slow other-config:lacp-fallback-ab=true other_config:bond-rebalance-interval=1000 other_config:bond-detect-mode=miimon other_config:bond-miimon-interval=100"
MTU=9000
OVS_EXTRA="set Interface dpdk0 options:dpdk-devargs=0000:3e:00.0 -- set Interface dpdk1 options:dpdk-devargs=0000:3e:00.1 -- set Interface dpdk0 mtu_request=$MTU -- set Interface dpdk1 mtu_request=$MTU -- set Interface dpdk0 options:n_rxq=$RX_QUEUE -- set Interface dpdk1 options:n_rxq=$RX_QUEUE"

The collected logs can be browsed here: http://collab-shell.usersys.redhat.com/02242686/
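For anyone comparing lacp/show captures taken before and after the link is cut, here is a minimal sketch that turns the text into per-member fields and lists members whose actor or partner state is missing "collecting" or "distributing". It assumes the text layout of the `ovs-appctl lacp/show` output in the description; the helper names `parse_lacp_show` and `members_not_forwarding` are hypothetical and not part of Open vSwitch, and the sample is a trimmed copy of the output above.

```python
import re

# Trimmed copy of the lacp/show output from the description.
SAMPLE = """
---- bond_dpdk ----
        status: active negotiated
        lacp_time: slow

slave: dpdk0: current attached
        port_id: 2
        may_enable: true
        actor state: activity aggregation synchronized collecting distributing
        partner key: 5953
        partner state: activity timeout aggregation synchronized collecting distributing

slave: dpdk1: current attached
        port_id: 1
        may_enable: true
        actor state: activity aggregation synchronized collecting distributing
        partner key: 5953
        partner state: activity timeout aggregation synchronized collecting distributing
"""


def parse_lacp_show(text):
    """Parse lacp/show text into {member_name: {field: value}}.

    Bond-level lines that appear before the first "slave:" header are ignored.
    """
    members = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = re.match(r"slave:\s+([^:\s]+):\s*(.*)", line)
        if header:
            # e.g. "slave: dpdk0: current attached"
            current = {"status": header.group(2).split()}
            members[header.group(1)] = current
        elif current is not None and ":" in line:
            # Split on the first colon only, so MAC addresses stay intact.
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return members


def members_not_forwarding(members):
    """Return members whose actor or partner state lacks collecting/distributing."""
    needed = {"collecting", "distributing"}
    flagged = []
    for name, fields in members.items():
        for side in ("actor state", "partner state"):
            if not needed <= set(fields.get(side, "").split()):
                flagged.append(name)
                break
    return flagged
```

Calling `members_not_forwarding(parse_lacp_show(text))` on a capture taken after the switch port is shut down would show whether LACP itself noticed the failure, which helps separate a negotiation problem from a bond failover problem.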
How did you "cut off the traffic"? Just want to know the exact method.
Dear Aaron Conole,

Thanks for your attention to this ticket. The following is from the customer:

During failover testing, he shut down the switch-side port that was carrying the network traffic.

I hope this answers your question.

Wei
Is this the same bug as https://bugzilla.redhat.com/show_bug.cgi?id=1644383 ?
Thanks. I'll close this as a duplicate so that we don't get confused efforts.

*** This bug has been marked as a duplicate of bug 1644383 ***