Bug 1650292 - Losing a few packets when the first LACP bond member is brought up.
Summary: Losing a few packets when the first LACP bond member is brought up.
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: openvswitch
Version: 7.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Matteo Croce
QA Contact: Hekai Wang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-11-15 18:26 UTC by Andreas Karis
Modified: 2023-09-15 00:14 UTC (History)
CC List: 9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-25 15:09:11 UTC
Target Upstream Version:
Embargoed:



Description Andreas Karis 2018-11-15 18:26:28 UTC
Description of problem:

Losing a few packets when the first LACP bond member is brought up.

Version-Release number of selected component (if applicable):

~~~
[root@overcloud-novacomputeiha-0 ~]# rpm -qa | egrep 'kernel|openvswitch'
openvswitch-ovn-host-2.9.0-56.el7fdp.x86_64
openvswitch-2.9.0-84.el7fdn.x86_64
kernel-tools-3.10.0-957.el7.x86_64
erlang-kernel-18.3.4.8-1.el7ost.x86_64
openvswitch-selinux-extra-policy-1.0-5.el7fdp.noarch
kernel-3.10.0-957.el7.x86_64
openvswitch-ovn-common-2.9.0-56.el7fdp.x86_64
openstack-neutron-openvswitch-12.0.3-5.el7ost.noarch
python-openvswitch-2.9.0-56.el7fdp.noarch
kernel-tools-libs-3.10.0-957.el7.x86_64
openvswitch-ovn-central-2.9.0-56.el7fdp.x86_64
[root@overcloud-novacomputeiha-0 ~]# 
~~~

How reproducible:

Shut down the port opposite the first LACP slave, e.g. on the switch side. Start a ping across the bond, bring the port back up, and observe some packet loss.
I can reproduce this in software, not using DPDK (kernel datapath). I am fairly sure the same could be reproduced with DPDK, although I did not configure my system for it.

Steps to Reproduce:

~~~
ovs-vsctl add-br br1 -- set bridge br1 datapath_type=netdev
ip link add ovs-bond-if0 type veth peer name lx-bond-if0
ip link add ovs-bond-if1 type veth peer name lx-bond-if1
ip link add lx-bond0 type bond miimon 100 mode 802.3ad
ip link set dev lx-bond-if0 master lx-bond0
ip link set dev lx-bond-if1 master lx-bond0
ip link set dev lx-bond-if0 up
ip link set dev lx-bond-if1 up
ip link set dev ovs-bond-if0 up
ip link set dev ovs-bond-if1 up
ip link set dev lx-bond0 up
ovs-vsctl add-bond br1 dpdkbond1 ovs-bond-if0 ovs-bond-if1 -- set port dpdkbond1 lacp=active -- set port dpdkbond1 bond_mode=balance-tcp --  set port dpdkbond1 other-config:lacp-time=fast
ip link add name lx-bond0.905 link lx-bond0 type vlan id 905
ip link set dev lx-bond0.905 up
ip a a dev lx-bond0.905 192.168.123.10/24
ip link add veth2 type veth peer name veth3
ip netns add test2
ip link set dev veth2 netns test2
ip link set dev veth3 up
ip netns exec test2 ip link set dev lo up
ip netns exec test2 ip link set dev veth2 up
ip netns exec test2 ip a a dev veth2 192.168.123.11/24
ovs-vsctl add-port br1 veth3 tag=905
~~~
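
Before triggering the failure, it can help to confirm that both members have negotiated LACP on both sides. These checks are not part of the original report; a minimal sketch, assuming the bridge/bond names used above:

~~~
# Not in the original report -- sanity checks before the test (names assumed from above):
ovs-appctl lacp/show dpdkbond1    # per-member LACP negotiation state on the OVS bond
ovs-appctl bond/show dpdkbond1    # shows whether each member is currently enabled
# State as seen by the Linux bonding driver on the other end of the veth pairs:
cat /proc/net/bonding/lx-bond0
~~~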

Run:
~~~
[root@overcloud-novacomputeiha-0 ~]# ip link set dev lx-bond-if0 down
~~~

Start a ping. While the ping is running, run:
~~~
[root@overcloud-novacomputeiha-0 ~]# ip link set dev lx-bond-if0 up
~~~
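
To correlate the lost pings with the moment OVS re-enables the returning member, one option (not used in the original report) is to watch the bond state in a second terminal while the interface is brought back up:

~~~
# Hypothetical aid, not from the original report: refresh the bond/LACP state
# every 0.1 s while lx-bond-if0 comes back up.
watch -n 0.1 'ovs-appctl bond/show dpdkbond1; ovs-appctl lacp/show dpdkbond1'
~~~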

Observe packet loss:
~~~
[root@overcloud-novacomputeiha-0 ~]# ping 192.168.123.11 -i 0.1
PING 192.168.123.11 (192.168.123.11) 56(84) bytes of data.
64 bytes from 192.168.123.11: icmp_seq=1 ttl=64 time=0.235 ms
64 bytes from 192.168.123.11: icmp_seq=2 ttl=64 time=0.214 ms
64 bytes from 192.168.123.11: icmp_seq=3 ttl=64 time=0.167 ms
64 bytes from 192.168.123.11: icmp_seq=4 ttl=64 time=0.170 ms
64 bytes from 192.168.123.11: icmp_seq=5 ttl=64 time=0.165 ms
64 bytes from 192.168.123.11: icmp_seq=6 ttl=64 time=0.163 ms
64 bytes from 192.168.123.11: icmp_seq=7 ttl=64 time=0.184 ms
64 bytes from 192.168.123.11: icmp_seq=8 ttl=64 time=0.165 ms
64 bytes from 192.168.123.11: icmp_seq=9 ttl=64 time=0.169 ms
64 bytes from 192.168.123.11: icmp_seq=10 ttl=64 time=0.162 ms
64 bytes from 192.168.123.11: icmp_seq=11 ttl=64 time=0.173 ms
64 bytes from 192.168.123.11: icmp_seq=12 ttl=64 time=0.200 ms
64 bytes from 192.168.123.11: icmp_seq=13 ttl=64 time=0.188 ms
64 bytes from 192.168.123.11: icmp_seq=14 ttl=64 time=0.192 ms
64 bytes from 192.168.123.11: icmp_seq=15 ttl=64 time=0.174 ms
64 bytes from 192.168.123.11: icmp_seq=16 ttl=64 time=0.209 ms
64 bytes from 192.168.123.11: icmp_seq=17 ttl=64 time=0.174 ms
64 bytes from 192.168.123.11: icmp_seq=18 ttl=64 time=0.178 ms
64 bytes from 192.168.123.11: icmp_seq=19 ttl=64 time=0.205 ms
64 bytes from 192.168.123.11: icmp_seq=20 ttl=64 time=0.188 ms
64 bytes from 192.168.123.11: icmp_seq=21 ttl=64 time=0.182 ms
64 bytes from 192.168.123.11: icmp_seq=22 ttl=64 time=0.174 ms
64 bytes from 192.168.123.11: icmp_seq=23 ttl=64 time=0.187 ms
64 bytes from 192.168.123.11: icmp_seq=24 ttl=64 time=0.251 ms
64 bytes from 192.168.123.11: icmp_seq=25 ttl=64 time=0.191 ms
64 bytes from 192.168.123.11: icmp_seq=26 ttl=64 time=0.193 ms
64 bytes from 192.168.123.11: icmp_seq=27 ttl=64 time=0.190 ms
64 bytes from 192.168.123.11: icmp_seq=28 ttl=64 time=0.194 ms
64 bytes from 192.168.123.11: icmp_seq=29 ttl=64 time=0.205 ms
64 bytes from 192.168.123.11: icmp_seq=30 ttl=64 time=0.186 ms
64 bytes from 192.168.123.11: icmp_seq=31 ttl=64 time=0.183 ms
64 bytes from 192.168.123.11: icmp_seq=32 ttl=64 time=0.211 ms
64 bytes from 192.168.123.11: icmp_seq=33 ttl=64 time=0.183 ms
64 bytes from 192.168.123.11: icmp_seq=34 ttl=64 time=0.182 ms
64 bytes from 192.168.123.11: icmp_seq=35 ttl=64 time=0.186 ms
64 bytes from 192.168.123.11: icmp_seq=36 ttl=64 time=0.191 ms
64 bytes from 192.168.123.11: icmp_seq=37 ttl=64 time=0.192 ms
64 bytes from 192.168.123.11: icmp_seq=38 ttl=64 time=0.180 ms
64 bytes from 192.168.123.11: icmp_seq=39 ttl=64 time=0.189 ms
64 bytes from 192.168.123.11: icmp_seq=40 ttl=64 time=0.174 ms
64 bytes from 192.168.123.11: icmp_seq=41 ttl=64 time=0.185 ms
64 bytes from 192.168.123.11: icmp_seq=42 ttl=64 time=0.188 ms
64 bytes from 192.168.123.11: icmp_seq=43 ttl=64 time=0.185 ms
64 bytes from 192.168.123.11: icmp_seq=44 ttl=64 time=0.183 ms
64 bytes from 192.168.123.11: icmp_seq=45 ttl=64 time=0.183 ms
64 bytes from 192.168.123.11: icmp_seq=46 ttl=64 time=0.178 ms
64 bytes from 192.168.123.11: icmp_seq=47 ttl=64 time=0.177 ms
64 bytes from 192.168.123.11: icmp_seq=48 ttl=64 time=0.186 ms
64 bytes from 192.168.123.11: icmp_seq=49 ttl=64 time=0.175 ms
64 bytes from 192.168.123.11: icmp_seq=50 ttl=64 time=0.194 ms
64 bytes from 192.168.123.11: icmp_seq=51 ttl=64 time=0.175 ms
64 bytes from 192.168.123.11: icmp_seq=52 ttl=64 time=0.185 ms
64 bytes from 192.168.123.11: icmp_seq=53 ttl=64 time=0.186 ms
64 bytes from 192.168.123.11: icmp_seq=54 ttl=64 time=0.201 ms
64 bytes from 192.168.123.11: icmp_seq=55 ttl=64 time=0.182 ms
64 bytes from 192.168.123.11: icmp_seq=56 ttl=64 time=0.175 ms
64 bytes from 192.168.123.11: icmp_seq=57 ttl=64 time=0.180 ms
64 bytes from 192.168.123.11: icmp_seq=58 ttl=64 time=0.174 ms
64 bytes from 192.168.123.11: icmp_seq=59 ttl=64 time=0.211 ms
64 bytes from 192.168.123.11: icmp_seq=60 ttl=64 time=0.177 ms
64 bytes from 192.168.123.11: icmp_seq=61 ttl=64 time=0.171 ms
64 bytes from 192.168.123.11: icmp_seq=62 ttl=64 time=0.264 ms
64 bytes from 192.168.123.11: icmp_seq=63 ttl=64 time=0.185 ms
64 bytes from 192.168.123.11: icmp_seq=64 ttl=64 time=0.188 ms
64 bytes from 192.168.123.11: icmp_seq=65 ttl=64 time=0.180 ms
64 bytes from 192.168.123.11: icmp_seq=66 ttl=64 time=0.180 ms
64 bytes from 192.168.123.11: icmp_seq=67 ttl=64 time=0.173 ms
64 bytes from 192.168.123.11: icmp_seq=68 ttl=64 time=0.174 ms
64 bytes from 192.168.123.11: icmp_seq=69 ttl=64 time=0.184 ms
64 bytes from 192.168.123.11: icmp_seq=70 ttl=64 time=0.180 ms
64 bytes from 192.168.123.11: icmp_seq=71 ttl=64 time=0.179 ms
64 bytes from 192.168.123.11: icmp_seq=72 ttl=64 time=0.170 ms
64 bytes from 192.168.123.11: icmp_seq=73 ttl=64 time=0.178 ms
64 bytes from 192.168.123.11: icmp_seq=74 ttl=64 time=0.171 ms
64 bytes from 192.168.123.11: icmp_seq=81 ttl=64 time=0.267 ms
64 bytes from 192.168.123.11: icmp_seq=82 ttl=64 time=0.187 ms
64 bytes from 192.168.123.11: icmp_seq=83 ttl=64 time=0.181 ms
64 bytes from 192.168.123.11: icmp_seq=84 ttl=64 time=0.190 ms
64 bytes from 192.168.123.11: icmp_seq=85 ttl=64 time=0.181 ms
64 bytes from 192.168.123.11: icmp_seq=86 ttl=64 time=0.172 ms
64 bytes from 192.168.123.11: icmp_seq=87 ttl=64 time=0.172 ms
64 bytes from 192.168.123.11: icmp_seq=88 ttl=64 time=0.173 ms
64 bytes from 192.168.123.11: icmp_seq=89 ttl=64 time=0.182 ms
64 bytes from 192.168.123.11: icmp_seq=90 ttl=64 time=0.180 ms
64 bytes from 192.168.123.11: icmp_seq=91 ttl=64 time=0.176 ms
^C
--- 192.168.123.11 ping statistics ---
91 packets transmitted, 85 received, 6% packet loss, time 9020ms
rtt min/avg/max/mdev = 0.162/0.186/0.267/0.019 ms
[root@overcloud-novacomputeiha-0 ~]# 
~~~
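
For reference, in the run above icmp_seq jumps from 74 to 81: six replies lost at a 0.1 s interval, i.e. roughly 0.6 s of disruption right after the first bond member is brought back up, which matches the 6% loss in the ping summary.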

Comment 2 Andreas Karis 2018-11-21 23:57:40 UTC
I just tried this again, and it isn't as easy to reproduce as I first thought. While I could originally reproduce it with the software reproducer, the same system now shows no packet loss.

Comment 9 Red Hat Bugzilla 2023-09-15 00:14:00 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days

