Bug 816011 - iwlwifi performance regression
Status: CLOSED WORKSFORME
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.3
Hardware: x86_64 Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Stanislaw Gruszka
QA Contact: Desktop QE
Depends On: 841578
Blocks: 840683
Reported: 2012-04-25 00:17 EDT by Alex Williamson
Modified: 2012-11-13 09:12 EST

Doc Type: Bug Fix
Last Closed: 2012-09-20 04:02:48 EDT
Type: Bug


Attachments
full ping test log on 2.6.32-262.el6.x86_64 (2.26 KB, text/plain)
2012-04-25 00:18 EDT, Alex Williamson
full ping test log on 2.6.32-220.13.1.el6.x86_64 (2.20 KB, text/plain)
2012-04-25 00:18 EDT, Alex Williamson
full ping test log on 2.6.32-268.el6.x86_64 (2.26 KB, text/plain)
2012-04-27 00:23 EDT, Alex Williamson
full ping test log on 2.6.32-268.el6.x86_64 5ghz_disable (739 bytes, text/plain)
2012-04-27 00:24 EDT, Alex Williamson
full ping test log on 2.6.32-268.el6.x86_64 5ghz_disable 11n_disable (704 bytes, text/plain)
2012-04-27 00:27 EDT, Alex Williamson

Description Alex Williamson 2012-04-25 00:17:00 EDT
Description of problem:

The device:

02:00.0 Network controller: Intel Corporation Centrino Ultimate-N 6300 (rev 35)
	Subsystem: Intel Corporation Centrino Ultimate-N 6300 3x3 AGN
	Flags: bus master, fast devsel, latency 0, IRQ 30
	Memory at f2400000 (64-bit, non-prefetchable) [size=8K]
	Capabilities: [c8] Power Management version 3
	Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [e0] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Device Serial Number 00-24-d7-ff-ff-04-11-f8
	Kernel driver in use: iwlwifi
	Kernel modules: iwlwifi

System: Lenovo X201

cmdline: ro root=/dev/mapper/vg_x201-lv_rhel6 rd_LVM_LV=vg_x201/lv_rhel6 rd_LVM_LV=vg_x201/lv_swap rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc KEYTABLE=us rhgb quiet crashkernel=129M@0M cgroup_disable=memory intel_iommu=on,igfx_off selinux=0

Test: for i in $(seq 1 10); do ping -q -i 0.1 -c 1000 192.168.1.1; done
(Worst individual results picked from 10 runs)
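A minimal sketch of how the worst run could be picked out of the ten ping summaries. It assumes the logs contain the standard `rtt min/avg/max/mdev` summary line shown in this report and that "worst" means the highest average rtt; the function names and the first sample line are hypothetical, only the second sample line is taken from the report.

```python
import re

# Matches the summary line printed by ping, e.g.
# "rtt min/avg/max/mdev = 1.074/6.863/186.472/15.276 ms"
RTT_RE = re.compile(
    r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_rtt_lines(log_text):
    """Return one dict per rtt summary line found in log_text."""
    runs = []
    for match in RTT_RE.finditer(log_text):
        lo, avg, hi, mdev = (float(x) for x in match.groups())
        runs.append({"min": lo, "avg": avg, "max": hi, "mdev": mdev})
    return runs


def worst_run(runs):
    """Pick the run with the highest average rtt (assumed meaning of 'worst')."""
    return max(runs, key=lambda run: run["avg"])


# First line is made up for illustration; second line is the -220 result above.
sample_log = """\
rtt min/avg/max/mdev = 1.100/4.500/90.000/9.000 ms
rtt min/avg/max/mdev = 1.074/6.863/186.472/15.276 ms
"""

print(worst_run(parse_rtt_lines(sample_log)))
```

Selecting by `max` or `mdev` instead of `avg` only requires changing the key in `worst_run`.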

2.6.32-220.13.1.el6.x86_64:

rtt min/avg/max/mdev = 1.074/6.863/186.472/15.276 ms

2.6.32-262.el6.x86_64:

rtt min/avg/max/mdev = 0.957/62.237/1988.860/201.012 ms
                             ^^^^^^^^^^^^^^^^^^^^^^^  ~10x worse
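As a quick check of the "~10x" estimate, the per-field ratios between the two summary lines can be computed directly. This is only a sketch using the two worst-case results quoted above; the variable names are illustrative.

```python
# Worst-case rtt summaries (ms) copied from the two kernels above.
good = {"min": 1.074, "avg": 6.863, "max": 186.472, "mdev": 15.276}   # -220.13.1
bad = {"min": 0.957, "avg": 62.237, "max": 1988.860, "mdev": 201.012}  # -262

# Ratio of the regressed kernel to the good kernel for each field.
ratios = {field: bad[field] / good[field] for field in good}

for field, ratio in ratios.items():
    print(f"{field}: {ratio:.1f}x")
```

The minimum rtt is essentially unchanged, while avg, max and mdev are each roughly 9x to 13x higher on -262.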

wireless has noticeable stalls

Version-Release number of selected component (if applicable):
2.6.32-262.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Use wifi
2.
3.
  
Actual results:
poor performance

Expected results:
no regression

Additional info:
Will upload full test results
Comment 1 Alex Williamson 2012-04-25 00:18:11 EDT
Created attachment 580058
full ping test log on 2.6.32-262.el6.x86_64
Comment 2 Alex Williamson 2012-04-25 00:18:59 EDT
Created attachment 580059
full ping test log on 2.6.32-220.13.1.el6.x86_64
Comment 4 John W. Linville 2012-04-26 10:33:41 EDT
I do not seem to be able to reproduce this locally.  What can you tell me about your environment?  What kind of encryption are you using?  What sort of AP?

Can we see the output of dmesg and /var/log/messages after a bad run?

-262.el6 is getting a bit old, and in particular -265.el6 has a collection of wireless fixes included in it.  Can you reproduce this with a current kernel?
Comment 5 Alex Williamson 2012-04-26 11:45:55 EDT
AP is an Asus RT-N16 running: Tomato Firmware v1.28.9054 MIPSR2-beta K26 USB vpn3.6.  This is a 2.4GHz only AP.

Wireless is in access point mode, auto (enabling b/g/n), channel 1 (20MHz channel width) with broadcast ssid; security is WPA/WPA2 Personal w/ AES encryption.  Note that for the test I ran the series of pings on -262, then immediately rebooted into -220 and ran the identical set of tests, so there was only a slight time shift, no change in configuration or other devices, and likely no change in overall congestion of the 2.4GHz band.  Rebooting back into -262 afterwards and re-testing shows the same poor performance.

I've noticed the same problem on F16 kernels and in fact have stopped using this system as my primary device because iwlwifi has gotten so bad there.  Actually switched to an ath9k in another laptop to avoid iwlwifi because it was having similar problems.

I'll retest with the latest kernel tonight.
Comment 6 John W. Linville 2012-04-26 12:02:10 EDT
FWIW, Stanislaw just posted a patch to allow disabling 5GHz on the iwlwifi driver.  It might be worthwhile to try one of his test kernels with that patch already applied and with the option to disable 5GHz utilized.
Comment 7 Alex Williamson 2012-04-27 00:23:29 EDT
Created attachment 580626
full ping test log on 2.6.32-268.el6.x86_64

no significant change from -262
Comment 8 Alex Williamson 2012-04-27 00:24:37 EDT
Created attachment 580627
full ping test log on 2.6.32-268.el6.x86_64 5ghz_disable

no significant change
Comment 9 Alex Williamson 2012-04-27 00:27:18 EDT
Created attachment 580629
full ping test log on 2.6.32-268.el6.x86_64 5ghz_disable 11n_disable

This might be a hint: latency is good with 802.11n disabled.  Note that I verified the network environment still performs well with the -220 kernel after running these tests.
Comment 10 John W. Linville 2012-04-27 10:42:59 EDT
Good catch!  The aggregation in the .11n case is almost certainly related to the extra latency.

Wey-yi, we are experiencing much higher latency with RHEL 6.3 (iwlwifi based on upstream 3.2) than we had with RHEL 6.2 (iwlwifi based on upstream 2.6.37).  Can you give us some help figuring out what is causing that?
Comment 11 John W. Linville 2012-04-30 13:25:30 EDT
Alex, are you able to recreate this on a current Fedora kernel?  That might be more helpful for the Intel team...?
Comment 12 Alex Williamson 2012-04-30 15:38:32 EDT
Running 3.3.4-1.fc17.x86_64 iwlwifi is just as bad, if not worse.  In addition to the excessive ping times, F17 hits me with random firmware restarts of the device.

I did discover a key test variable, though, that may help in reproducing the problem.  If the laptop is within ~5ft of the access point, I see good behavior (rtt: <=1ms min, <10ms avg, 75-150ms max, 0% packet loss, no firmware restarts... so far).  If I move back to my previous test location (~30ft from the AP), I go back to poor performance (rtt: 75+ms avg, 1500+ms max, ~10% packet loss, frequent firmware restarts).  The Android Wifi Analyzer app shows that I'm competing with about a dozen other access points.  At the near spot I generally have 40dBm on the neighbors and at the far spot I have about 20dBm.  So, if you're unable to reproduce, try moving a bit further away from the router and add some neighbors (i.e. a real world environment).
Comment 13 Alex Williamson 2012-04-30 16:12:02 EDT
I hadn't noticed this previously (perhaps you did): the 11n_disable log shows 0% packet loss, while even the -220 kernel with good performance showed roughly 10%.  I also note that with 11n_disable the bit rate reported by iwconfig is very stable, while with 802.11n enabled it bounces all over the place, from 1Mb/s to 144.4Mb/s.
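The bit rate instability described here could be quantified by sampling iwconfig output over time and recording the spread. The sketch below only parses text that has already been captured; it assumes iwconfig prints the rate as `Bit Rate=<n> Mb/s` or `Bit Rate:<n> Mb/s`, and the helper names and sample strings are hypothetical.

```python
import re

# Assumption: iwconfig reports the rate as "Bit Rate=144.4 Mb/s" or "Bit Rate:1 Mb/s".
BITRATE_RE = re.compile(r"Bit Rate[=:]\s*([\d.]+)\s*([kMG])b/s")
SCALE_TO_MBPS = {"k": 1e-3, "M": 1.0, "G": 1e3}


def bitrate_mbps(iwconfig_output):
    """Extract the bit rate in Mb/s from one iwconfig capture, or None."""
    match = BITRATE_RE.search(iwconfig_output)
    if match is None:
        return None
    return float(match.group(1)) * SCALE_TO_MBPS[match.group(2)]


def rate_spread(captures):
    """Return (lowest, highest) bit rate in Mb/s across captures, or None."""
    rates = [r for r in (bitrate_mbps(c) for c in captures) if r is not None]
    if not rates:
        return None
    return min(rates), max(rates)


# Illustrative captures using the two extremes mentioned in the comment above.
samples = [
    "wlan0  IEEE 802.11abgn  Bit Rate=1 Mb/s   Tx-Power=15 dBm",
    "wlan0  IEEE 802.11abgn  Bit Rate=144.4 Mb/s   Tx-Power=15 dBm",
]

print(rate_spread(samples))
```

Collecting one capture per second during a ping run, with and without 11n_disable, would give comparable spreads for the two configurations.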
Comment 18 RHEL Product and Program Management 2012-07-19 09:41:09 EDT
This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.
Comment 23 Stanislaw Gruszka 2012-09-20 04:02:48 EDT
I cannot reproduce this on RHEL6.3 or on the new iwlwifi RHEL6.4 update, so I'm closing this with the WORKSFORME resolution.

Alex, if you get a chance to test this on RHEL6.4 and the problem still persists, just reopen this bug report.
