I was able to reproduce the problem, identify the fix and verify it.
This is fixed by the following upstream commit:
commit 88340160f3ad22401b00f4efcee44f7ec4769b19
Author: Martin KaFai Lau <kafai>
Date: Fri Jan 16 10:11:00 2015 -0800

    ip_tunnel: Create percpu gro_cell

    In the ipip tunnel, the skb->queue_mapping is lost in ipip_rcv().
    All skb will be queued to the same cell->napi_skbs. The
    gro_cell_poll is pinned to one core under load. In production traffic,
    we also see severe rx_dropped in the tunl iface and it is probably due to
    this limit: skb_queue_len(&cell->napi_skbs) > netdev_max_backlog.

    This patch is trying to alloc_percpu(struct gro_cell) and schedule
    gro_cell_poll to process the skb in the same core.

    Signed-off-by: Martin KaFai Lau <kafai>
    Acked-by: Eric Dumazet <edumazet>
    Signed-off-by: David S. Miller <davem>
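The "pinned to one core" symptom described in the commit message can be roughly observed from userspace by comparing two snapshots of /proc/softirqs. The following is a minimal sketch, not part of the original report: the helper names net_rx_counts and busiest_cpu_share and the snapshot file names are made up for illustration, and NET_RX counts cover all receive softirq work on the host, not only gro_cell_poll, so the result is only an indicator.

```shell
#!/bin/bash
# Hypothetical helpers (not from the bug report) for inspecting how NET_RX
# softirq work is spread across CPUs.

# Print the per-CPU NET_RX counters from a /proc/softirqs-style file,
# one "cpu<N> <count>" pair per line.
net_rx_counts() {
    local file=${1:-/proc/softirqs}
    awk '$1 == "NET_RX:" { for (i = 2; i <= NF; i++) print "cpu" (i - 2), $i }' "$file"
}

# Given two snapshots, print the CPU that handled the largest share of the
# NET_RX growth between them, and that share as a whole percentage.
busiest_cpu_share() {
    local before=$1 after=$2
    paste <(net_rx_counts "$before") <(net_rx_counts "$after") |
        awk '{ d = $4 - $2; total += d; if (d > max) { max = d; cpu = $1 } }
             END { if (total > 0) printf "%s %d%%\n", cpu, 100 * max / total }'
}

# Example use on the receiving host while the traffic is running:
#   cat /proc/softirqs > /tmp/softirqs.before
#   sleep 10
#   cat /proc/softirqs > /tmp/softirqs.after
#   busiest_cpu_share /tmp/softirqs.before /tmp/softirqs.after
```

A share close to 100% on a single CPU during the test would be consistent with the behaviour the commit describes; a more even spread would be expected once receive processing is per-CPU.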
Reproduction script:
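#!/bin/bash
# Run on both hosts: h=1 on the iperf3 client side, h=2 on the other side.
iface=em1
h=1
oh=$((3 - h))   # last octet of the peer's address

# Underlay: jumbo MTU, a fresh IPv4 address, and UDP RX hashing on
# source/destination IP and ports.
ip link set "$iface" mtu 9000 up
ip -4 addr flush dev "$iface"
ip addr add 192.168.99.$h/24 dev "$iface"
ethtool -U "$iface" rx-flow-hash udp4 sdfn

# Overlay: OVS bridge with a VXLAN port to the peer and an internal port i0.
ovs-vsctl --if-exists del-br ovs0
ovs-vsctl add-br ovs0
ovs-vsctl add-port ovs0 vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=192.168.99.$oh
ovs-vsctl add-port ovs0 i0 -- set interface i0 type=internal
ip link set i0 up
ip addr add 192.168.98.$h/24 dev i0

# Host 2 runs the iperf3 server; host 1 sends 100 parallel streams through the tunnel.
if [[ $h = 2 ]]; then
    iperf3 -s
else
    iperf3 -c 192.168.98.2 -P 100 -w 200K
fi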
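The upstream commit message also mentions rx_dropped growing on the tunnel interface. A minimal sketch for sampling that counter around a test run follows. It is not part of the original report: the helper names and the SYSFS_NET override are hypothetical (the override exists only so the helpers can be pointed at a fake tree), and which device actually accumulates the drops in the OVS/VXLAN setup above is an assumption to verify on the system under test.

```shell
#!/bin/bash
# Hypothetical helpers (not from the bug report) for watching rx_dropped.

# Root of the per-device statistics tree; overridable for testing.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

# Print the current rx_dropped counter of a network device.
rx_dropped() {
    local dev=$1
    cat "$SYSFS_NET/$dev/statistics/rx_dropped"
}

# Run a command and print how much rx_dropped grew on the device meanwhile.
# The command's own stdout is discarded so only the delta is printed.
drops_during() {
    local dev=$1
    shift
    local before after
    before=$(rx_dropped "$dev")
    "$@" > /dev/null
    after=$(rx_dropped "$dev")
    echo $((after - before))
}

# Example use on the receiving host (device name is an assumption):
#   sysctl -n net.core.netdev_max_backlog   # the limit named in the commit
#   drops_during i0 sleep 30                # drops seen during 30 s of traffic
```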
Reproduced on kernel 3.10.0-514.6.1.el7 with an ixgbe NIC: throughput was ~8 Gbit/s.
Verified on kernel 3.10.0-655.el7 with an ixgbe NIC: throughput recovered to ~9.1 Gbit/s.
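To check whether a host runs a kernel at or beyond the build on which the fix was verified, a comparison along the following lines may help. This is only a sketch under stated assumptions: the function name rhel7_build_at_least is made up, and the baseline of 655 is simply the build mentioned above. This report does not say whether 655 is the first build carrying the backport, nor whether later updates of older branches (such as 3.10.0-514.x) received it, so a "too old" result is not conclusive.

```shell
#!/bin/bash
# Hypothetical helper (not from the bug report).
# Succeeds if a RHEL 7 kernel release string, as printed by `uname -r`,
# has a build number at or above the baseline (default 655, an assumption
# taken from the build on which the fix was verified).
# Returns 2 for release strings that are not 3.10.0-<build>...
rhel7_build_at_least() {
    local release=$1 baseline=${2:-655}
    [[ $release =~ ^3\.10\.0-([0-9]+) ]] || return 2
    # Force base 10 so a build number with leading zeros is not read as octal.
    (( 10#${BASH_REMATCH[1]} >= baseline ))
}

# Example use:
#   if rhel7_build_at_least "$(uname -r)"; then
#       echo "kernel is at or beyond the verified build"
#   fi
```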
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2017:1842