Bug 772317

Summary: Disable LRO for all NICs that have LRO enabled
Product: Red Hat Enterprise Linux 6
Reporter: Mike Burns <mburns>
Component: kernel
Assignee: Neil Horman <nhorman>
Status: CLOSED ERRATA
QA Contact: Liang Zheng <lzheng>
Severity: urgent
Priority: urgent
Version: 6.3
CC: acathrow, agospoda, apevec, atzhang, bsarathy, cpelland, dhoward, djuran, dledford, dyasny, fyu, gouyang, greg.wickham, jboggs, jturner, kzhang, leiwang, llim, lzheng, mburns, moli, mwagner, nhorman, ovirt-maint, pcao, plundin, plyons, sforsber, sghosh, tvvcox, vbian, ycui, yeylon, zhchen
Target Milestone: rc
Keywords: ZStream
Target Release: ---
Hardware: All
OS: Linux
Fixed In Version: kernel-2.6.32-230.el6
Doc Type: Bug Fix
Doc Text:
Previously, network drivers that had Large Receive Offload (LRO) enabled by default caused the system to run slowly, lose frames, and eventually prevent communication when software bridging was in use. With this update, the kernel automatically disables LRO on systems with a bridged configuration, thus preventing this bug.
Clone Of: 692656
Clones: 772319, 772806, 772809
Last Closed: 2012-06-20 08:13:13 UTC
Bug Depends On: 692656, 772319
Bug Blocks: 692864, 772809, 818504
Attachments:
  patch to force LRO off on all bond slaves (flags: none)
  updated fix for bonding (flags: none)

Description Mike Burns 2012-01-06 19:40:18 UTC
+++ This bug was initially created as a clone of Bug #692656 +++

There are significant performance issues reported for NICs that use LRO. We need to disable LRO for all NICs that have it enabled.
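
In shell terms, the blanket workaround amounts to something like the following sketch (illustrative only; enumerating interfaces via sysfs is an assumption here, and the actual fix is the patch linked in comment 1):

# Sketch: turn LRO off on every NIC that currently has it enabled
for dev in /sys/class/net/*; do
    nic=$(basename "$dev")
    if ethtool -k "$nic" 2>/dev/null | grep -q 'large-receive-offload: on'; then
        ethtool -K "$nic" lro off
    fi
done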

Comment 1 Mike Burns 2012-01-06 19:57:27 UTC
Patch posted Upstream:

http://gerrit.ovirt.org/927

Comment 2 Mike Burns 2012-01-09 14:32:11 UTC
Moving out of POST. This should already be fixed in the kernel; we need to investigate more.

Comment 3 Mike Burns 2012-01-09 18:58:08 UTC
Original mail comments:

So, running RHEV 3 beta for a customer this week, we've been seeing horrible performance on the RHEV-H hosts running the bnx2x driver. It turns out this is a problem with LRO. So we have a fix and it works (ethtool -K eth$ lro off).

However, how do we make this change persistent across reboots? We want to verify that the "normal" method of putting the appropriate option (options bnx2x disable_tpa=1) in modprobe.conf is supported. (There is no /etc/modprobe.conf and / is ro ... ).

Comment 4 Mike Burns 2012-01-09 18:59:56 UTC
The workaround can be applied by placing the commands in /etc/rc.local and persisting that file.
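
For example, a sketch of the rc.local approach (eth0/eth1 stand in for the affected interfaces, and on RHEV-H the file must also be persisted so it survives reboot):

# /etc/rc.local -- sketch of the persistent workaround
ethtool -K eth0 lro off   # eth0/eth1 are placeholder interface names
ethtool -K eth1 lro off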

This issue originally came up in 5.6/5.7 but was supposed to be fixed in the kernel.  Can I get some help from the kernel team with debugging/triaging this problem?

Comment 5 Neil Horman 2012-01-09 20:56:50 UTC
Mike, can you tell me:

1) What does the environment look like? Specifically, what kind of network interfaces are in play here: which drivers are affected, are VLANs in use, are bridges in use vs. SR-IOV or other offload technologies?

2) The specific nature of the failure. Are frames getting dropped, and if so, where? Specific netstat, ethtool, and /proc/net/dev|snmp stats are useful here.

3) History. You said this came up in 5.6/5.7. Is the problem fixed there, or does it persist there the same way it does in RHEL 6?
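
A sketch of commands that would gather the stats asked for above (ethN is a placeholder for an affected interface):

ip link show                       # interfaces, bonds, and bridges in play
ethtool -i ethN                    # driver in use
ethtool -k ethN                    # offload settings, including LRO
ethtool -S ethN                    # per-driver statistics
netstat -s                         # protocol-level counters
cat /proc/net/dev /proc/net/snmp   # drop and error counters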

Comment 6 Mike Burns 2012-01-09 21:08:40 UTC
Paul, can you provide the information for 1 and 2 above?

Neil,

In 5.6/5.7, we explicitly disabled LRO on all NICs where it was enabled by default; the rhev-hypervisor bug for that was bug 696374. I don't know whether this particular environment has VLANs or not, though. In the 5.7/5.8 branches we still have that workaround in place, but it was never ported forward to the RHEL 6 stream.

Comment 7 Risar 2012-01-09 21:48:36 UTC
In response to the above:

1. A single RHEV-M instance managing a cluster of 6 HP nodes running RHEV-H, all using the bnx2x driver (as is normal with HP kit). No tagging, STP, or SR-IOV in use. The interfaces were, however, mode 1 (active-backup) bonded pairs.

2. It appeared to mimic a bug I found online when debugging the issue (duplicate responses/ACKs), but truthfully we were under the gun and did not save the tcpdump output. No errors or collisions were shown on the interfaces, and everything else was at defaults (i.e. nothing fancy here).

Upon making the above LRO change, network speeds increased significantly. The specific test use case was kickstarting VMs over the network. A base RHEL install took over 4 hours (as the only VM running on the hypervisor) before disabling LRO. Once LRO was disabled on the hypervisor, the install took less than 5 minutes. (Not scientific, but it pointed us where we needed to go.)

Comment 8 Neil Horman 2012-01-09 21:54:39 UTC
Thank you, Mike. Could you also provide some details as to what exactly needed to be fixed in RHEL 5 so we can compare to RHEL 6? IIRC, the only thing that had to be done in RHEL 5 was disabling LRO automatically when a device was added to a bridge, and that functionality should already be in RHEL 6. If you are using some offload technology like SR-IOV or some other PCI virtual function technology, manual LRO disabling (or some other per-device-driver automatic disabling) is still going to be required.

Comment 9 Mike Burns 2012-01-09 22:05:19 UTC
The fix in RHEL 5 was to simply disable LRO in all instances on all NICs that supported it. It was a hack and a workaround, but it was sufficient for our use.

There should be no SR-IOV or anything like that in this situation.

My recollection of the issue is the same: we needed to disable LRO when adding the NIC to a bridge. Based on what Risar is saying, this wasn't happening for them. The NIC was added to a bridge, but they were still seeing problems until they explicitly disabled LRO on that interface.

Comment 10 Neil Horman 2012-01-09 22:05:50 UTC
Paul, thank you. So it sounds like no VLANs are in use, which is good. That confirms this has no relation to the VLAN LRO bug I fixed in RHEL 5. That said, if you're using bonding, then I think that's where the problem lies: I don't see any way that the bonding driver can disable slave LRO at the moment, or, for that matter, tell its slaves to do so. Can we test this theory? Does the problem go away if you stop using the bond? If you attach a single interface to your bridge, does LRO get disabled, and does your performance increase?
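
A sketch of that test (br0, bond0, and ethN are placeholders for the actual bridge, bond, and slave names):

# Sketch: replace the bond with a single interface on the bridge
brctl delif br0 bond0      # take the bond out of the bridge
brctl addif br0 ethN       # attach one physical interface instead
ethtool -k ethN | grep large-receive-offload
# Expect "off" here if the automatic bridge-triggered LRO disable is working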

Mike, I can take this bug over if you like.

Comment 11 Andy Gospodarek 2012-01-09 22:15:58 UTC
I suspect Neil is correct on this one. The bonding driver does not have a set_flags ethtool op, and that would be required to pass down the need to disable LRO on all slave devices.

Comment 12 Risar 2012-01-09 22:19:23 UTC
Neil, I can ask the customer if they are willing to test this (the problem was encountered during a RH Consulting engagement which ended last week), but it may be a few days until they get a chance to do so.

Comment 13 Mike Burns 2012-01-09 22:24:12 UTC
(In reply to comment #10)

> 
> Mike, I can take this bug over if you like.

Neil, go ahead.  I'll clone if I end up needing to put a workaround into RHEV-H directly.

Comment 14 Mike Burns 2012-01-10 01:45:02 UTC
Moving to kernel

Comment 16 RHEL Program Management 2012-01-10 01:59:46 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 17 Neil Horman 2012-01-11 17:00:01 UTC
Created attachment 552171 [details]
patch to force LRO off on all bond slaves

gospo and I are still discussing the best approach for this, but since the RHEL 6 bonding driver doesn't itself support LRO currently, just disabling LRO on the slaves of a bond seems like a sane approach. Here's a patch to do that.
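
In userspace terms, the effect of that patch is roughly the following (a sketch only, not the patch itself; bond0 is a placeholder, and the slave list comes from the standard bonding sysfs file):

# Approximation of what the kernel patch does for each bond slave
for slave in $(cat /sys/class/net/bond0/bonding/slaves); do
    ethtool -K "$slave" lro off
done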

Comment 18 Neil Horman 2012-01-11 17:05:04 UTC
http://brewweb.devel.redhat.com/brew/taskinfo?taskID=3933731

Here's a test build. Please try it out and confirm that it fixes the problem.

Comment 19 Neil Horman 2012-01-12 11:31:57 UTC
Just as a heads up, there will be another build coming in a bit: as gospo and I discussed, we should leave LRO enabled until we have to disable it (to allow for improved performance).

Comment 20 Neil Horman 2012-01-12 19:23:36 UTC
Created attachment 552491 [details]
updated fix for bonding

Here's an updated fix for the bonding driver that allows LRO to stay enabled when the bond is not attached to a bridge.

Comment 21 Neil Horman 2012-01-12 19:37:47 UTC
http://brewweb.devel.redhat.com/brew/taskinfo?taskID=3937226

New test build.  Please confirm that this fixes the problem.

Comment 22 Weibing Zhang 2012-01-13 03:56:18 UTC
In the ethtool part of the NIC driver test, we check whether LRO is supported. We will add a check that LRO is disabled for all NICs that have it enabled, as well as for bonding.
Setting qa_ack+.
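
A sketch of such a check (eth4/eth5/bond0 are placeholder names; the expectation assumes the fixed kernel with the interfaces enslaved and bridged):

# Sketch of the planned QA check
for nic in eth4 eth5 bond0; do
    echo -n "$nic: "
    ethtool -k "$nic" | grep large-receive-offload
done
# Expected with the fix in place: "large-receive-offload: off" for each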

Comment 23 Neil Horman 2012-01-16 19:00:01 UTC
Ping, Mike: is this working for your environment?

Comment 24 Aristeu Rozanski 2012-02-10 23:00:58 UTC
Patch(es) available on kernel-2.6.32-230.el6

Comment 26 Liang Zheng 2012-02-17 09:09:41 UTC
Hi Neil,
I think this patch introduces a regression bug.
See https://bugzilla.redhat.com/show_bug.cgi?id=794647

[root@hp-dl580g7-01 ~]# modprobe bonding mode=0 miimon=100
Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)
bonding: MII link monitoring set to 100 ms
------------[ cut here ]------------
WARNING: at net/core/dev.c:1234 dev_disable_lro+0x7b/0x80() (Not tainted)
Hardware name: ProLiant DL580 G7
Modules linked in: bonding(+) ip6table_filter ip6_tables ebtable_nat ebtables
ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state
nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables
bridge stp llc sunrpc pcc_cpufreq ipv6 dm_mirror dm_region_hash dm_log
vhost_net macvtap macvlan tun kvm_intel kvm uinput power_meter be2net ixgbe dca
mdio netxen_nic sg microcode serio_raw iTCO_wdt iTCO_vendor_support hpilo hpwdt
i7core_edac edac_core shpchp ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif
lpfc scsi_transport_fc scsi_tgt pata_acpi ata_generic ata_piix hpsa radeon ttm
drm_kms_helper drm i2c_algo_bit i2c_core dm_mod [last unloaded: scsi_wait_scan]
Pid: 4699, comm: modprobe Not tainted 2.6.32-232.el6.x86_64 #1
Call Trace:
 [<ffffffff81069b67>] ? warn_slowpath_common+0x87/0xc0
 [<ffffffff81069bba>] ? warn_slowpath_null+0x1a/0x20
 [<ffffffff8142c83b>] ? dev_disable_lro+0x7b/0x80
 [<ffffffff81494428>] ? inetdev_init+0x168/0x200
 [<ffffffff81494958>] ? inetdev_event+0x408/0x4c0
 [<ffffffff8126e7eb>] ? kobject_uevent+0xb/0x10
 [<ffffffff81441e6e>] ? netdev_queue_update_kobjects+0xee/0x110
 [<ffffffff814f56d5>] ? notifier_call_chain+0x55/0x80
 [<ffffffff81096e06>] ? raw_notifier_call_chain+0x16/0x20
 [<ffffffff8142f3fb>] ? call_netdevice_notifiers+0x1b/0x20
 [<ffffffff81433f74>] ? register_netdevice+0x304/0x3d0
 [<ffffffffa05aaf38>] ? bond_create+0x68/0x130 [bonding]
 [<ffffffffa04d992f>] ? bonding_init+0x92f/0x999 [bonding]
 [<ffffffffa04d9000>] ? bonding_init+0x0/0x999 [bonding]
 [<ffffffff8100204c>] ? do_one_initcall+0x3c/0x1d0
 [<ffffffff810af9f1>] ? sys_init_module+0xe1/0x250
 [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
---[ end trace 6d497b6e4373c064 ]---

Comment 28 Liang Zheng 2012-03-26 08:27:46 UTC
1.create a bond with 2 slaves
modprobe bonding mode=1 miimon=100
ifconfig bond0 up
ifenslave bond0 eth4 eth5
2.create a vlan on the bond0
vconfig add bond0 3
ifconfig bond0.3 up

3.
[root@hp-dl580g7-01 ~]# ethtool -k eth4 | grep large
large-receive-offload: on
[root@hp-dl580g7-01 ~]# ethtool -k eth5 | grep large
large-receive-offload: on
[root@hp-dl580g7-01 ~]# ethtool -k bond0 | grep large
large-receive-offload: on
[root@hp-dl580g7-01 ~]# ethtool -k bond0.3 | grep large
large-receive-offload: on

4. Disable LRO on one of the slave interfaces:
[root@hp-dl580g7-01 ~]# ethtool -K eth4 lro off
[root@hp-dl580g7-01 ~]# bonding: bond0: link status definitely down for interface eth4, disabling it
bonding: bond0: making interface eth5 the new active one.
ixgbe 0000:11:00.0: eth4: detected SFP+: 5
ixgbe 0000:11:00.0: eth4: NIC Link is Up 10 Gbps, Flow Control: RX/TX
bond0: link status definitely up for interface eth4, 10000 Mbps full duplex.

5. Check LRO on the bond again:
[root@hp-dl580g7-01 ~]# ethtool -k bond0 | grep large
large-receive-offload: off

I have some questions about this bug:
First, I can't disable LRO on the bond0/bond0.3 interfaces:
[root@hp-dl580g7-01 ~]# ethtool -K bond0 lro off
Cannot set large receive offload settings: Operation not supported

Second, when I disable LRO on the slaves, LRO on bond0 goes off, but then I can't enable it anymore.

Comment 29 Neil Horman 2012-03-26 12:35:00 UTC
Comment 26 is being handled in bz 794647.

Your test is presumably working properly. Presumably at least one of the slaves in the bond clears the NETIF_F_LRO flag when disabling LRO; the bond takes that to mean that LRO is not supported, and so the bond acts as though it doesn't support LRO either. If you want to see the bond be able to disable LRO properly, leave LRO enabled on the slaves and then try to disable LRO on the bond.
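
A way to observe that flag propagation from userspace (a sketch; eth4/eth5 are the slaves as above):

# Sketch: compare slave and bond LRO state
ethtool -k eth4 | grep large-receive-offload
ethtool -k eth5 | grep large-receive-offload
ethtool -k bond0 | grep large-receive-offload
# Once any slave reports "off", the bond reports "off" as well and
# refuses further -K lro changes with "Operation not supported"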

Comment 30 Liang Zheng 2012-03-26 14:35:36 UTC
(In reply to comment #29)
> Your test is presumably working properly. Presumably at least one of the
> slaves in the bond clears the NETIF_F_LRO flag when disabling LRO; the bond
> takes that to mean that LRO is not supported, and so the bond acts as though
> it doesn't support LRO either. If you want to see the bond be able to disable
> LRO properly, leave LRO enabled on the slaves and then try to disable LRO on
> the bond.

Yes. My test steps are as follows:
1. Create a bond with 2 slaves
2. Enable LRO on the 2 slaves
3. Disable LRO on the bond

Then I can't disable LRO on bond0:
[root@hp-dl580g7-01 ~]# ethtool -K bond0 lro off
Cannot set large receive offload settings: Operation not supported

So is there anything wrong with my test?

Comment 31 Neil Horman 2012-03-26 14:59:32 UTC
That's not what you said in comment 28.

In step 4, you clearly disabled LRO on the slave interfaces prior to attempting to manipulate LRO on the bond:
4. Disable LRO on one of the slave interfaces:
[root@hp-dl580g7-01 ~]# ethtool -K eth4 lro off

If you point me to the system you're using, I'll take a look at it myself and tell you whether it's working or not.

Comment 32 Neil Horman 2012-03-26 15:07:06 UTC
Also, please confirm your test results on bz 794647 ASAP. It's getting late in the release cycle, and I'd like to get that fix in place, but I want to be sure it works for you first.

Comment 33 Neil Horman 2012-03-26 15:34:17 UTC
Update: please note I have a new build for you that improves a flag test that mschmidt noted. It will make minor changes to some of the behavior you are noting here and hopefully clarify it for you. Please test ASAP.

Comment 34 Liang Zheng 2012-03-26 15:41:46 UTC
(In reply to comment #31)
> That's not what you said in comment 28.
> 
> In step 4, you clearly disabled LRO on the slave interfaces prior to
> attempting to manipulate LRO on the bond:
> 4. Disable LRO on one of the slave interfaces:
> [root@hp-dl580g7-01 ~]# ethtool -K eth4 lro off
Yes, I ran two tests:

Test one: disabling LRO on the slaves does manipulate LRO on the bond, and it works. But I can't enable it anymore.

Test two: I re-configured the bonding driver as in comment 30 and can't disable LRO on the bond directly.

Sorry, I did not make that clear and ended up confusing you.

You can use this system to test with eth4 & eth5
hp-dl580g7-01.rhts.eng.nay.redhat.com root/redhat

For bz 794647, I can't get the kernel from https://bugzilla.redhat.com/show_bug.cgi?id=794647#c10. Also, is the patch in https://bugzilla.redhat.com/attachment.cgi?id=563978 the latest version?

Comment 35 Liang Zheng 2012-03-26 15:45:58 UTC
I got your new patch and it will be tested ASAP.

Comment 36 Neil Horman 2012-03-26 15:52:02 UTC
The brew build is still building, so you need to wait or build it yourself. If you can build it yourself, that would be great.

I'm looking at the above system now, and I see what you're saying. The patch from bz 772317 should fix that. If you can validate this bz, I'll fix your observation here in bz 772317 once you validate it's fixed. Please update that bug with results. Thanks!

Comment 37 Liang Zheng 2012-03-26 16:17:57 UTC
The build in https://brewweb.devel.redhat.com/taskinfo?taskID=4059746 failed.
I am going to build it myself.

Comment 38 Neil Horman 2012-03-26 16:27:22 UTC
Yeah, brew seems to be having a problem. Building it yourself would be good while I figure out what's going wrong.

Comment 39 Neil Horman 2012-03-26 17:47:06 UTC
http://brewweb.devel.redhat.com/brew/taskinfo?taskID=4193073

Found the problem. The git branch I have for this bug is old and had problems that arose from the git-dist conversion. I needed to point the build at a newer build collection to build properly. The above build should run to completion, although I think you'll be better off just building it yourself in the interest of time.

Comment 40 Liang Zheng 2012-04-16 11:19:03 UTC
Verified on kernel 2.6.32-265.el6.x86_64 with no regression issues.
Setting verified.

Comment 43 Tomas Capek 2012-06-14 10:40:27 UTC
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Previously, network drivers that had Large Receive Offload (LRO) enabled by default caused the system to run slowly, lose frames, and eventually prevent communication when software bridging was in use. With this update, the kernel automatically disables LRO on systems with a bridged configuration, thus preventing this bug.

Comment 45 errata-xmlrpc 2012-06-20 08:13:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0862.html