Bug 666866
| Summary: | Heavy load on ath5k wireless device makes system unresponsive | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Simon Matter <simon.matter> |
| Component: | kernel | Assignee: | Stanislaw Gruszka <sgruszka> |
| Status: | CLOSED ERRATA | QA Contact: | Desktop QE <desktop-qa-list> |
| Severity: | medium | Docs Contact: | |
| Priority: | low | | |
| Version: | 5.6 | CC: | agospoda, benl, chyang, jfeeney, jiajyang, linville, lwang, mschmidt, peterm, qcai, sgruszka, tburke, thenzl, tpelka, vbenes, vsharapo, zcerza |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2011-07-21 10:29:12 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Description
Simon Matter
2011-01-03 14:35:25 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

Interesting. I thought in RHEL5 we did not enable ASPM at all. Simon, would you attach the output of "lspci -nnvvv" with the unpatched kernel? Thanks.

OK, I did it with both kernels. The unpatched is this one: http://people.redhat.com/jwilson/el5/238.el5/i686/kernel-2.6.18-238.el5.i686.rpm. The patched is my own with the mentioned patch. See here:

[simix@wurro ~]$ diff 2.6.18-238.el5 2.6.18-238.invoca1.el5 -Nau
--- 2.6.18-238.el5 2011-02-17 13:58:56.000000000 +0100
+++ 2.6.18-238.invoca1.el5 2011-02-17 13:58:56.000000000 +0100
@@ -142,7 +142,7 @@
  Device: MaxPayload 128 bytes, MaxReadReq 128 bytes
  Link: Supported Speed 2.5Gb/s, Width x1, ASPM L0s L1, Port 3
  Link: Latency L0s <256ns, L1 <4us
- Link: ASPM L0s L1 Enabled RCB 64 bytes CommClk+ ExtSynch-
+ Link: ASPM L1 Enabled RCB 64 bytes CommClk+ ExtSynch-
  Link: Speed 2.5Gb/s, Width x1
  Slot: AtnBtn- PwrCtrl- MRL- AtnInd- PwrInd- HotPlug+ Surpise+
  Slot: Number 2, PowerLimit 6.500000
@@ -332,7 +332,7 @@
  Device: MaxPayload 128 bytes, MaxReadReq 512 bytes
  Link: Supported Speed 2.5Gb/s, Width x1, ASPM L0s L1, Port 0
  Link: Latency L0s <512ns, L1 <64us
- Link: ASPM L0s L1 Enabled RCB 128 bytes CommClk+ ExtSynch-
+ Link: ASPM L1 Enabled RCB 128 bytes CommClk+ ExtSynch-
  Link: Speed 2.5Gb/s, Width x1
  Capabilities: [90] MSI-X: Enable- Mask- TabSize=1
  Vector table: BAR=0 offset=00000000

Regards,
Simon

Created attachment 479316 [details]
lspci unpatched
Created attachment 479317 [details]
lspci patched
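
The only difference between the two lspci dumps, as the diff quoted above shows, is in the "Link: ASPM ... Enabled" control line. A minimal C sketch of reading that line is given here. The function name `aspm_enabled_from_lspci_line` and the `ASPM_STATE_*` constants are hypothetical, and the parser assumes the old-style lspci wording seen in this report; other pciutils versions may print the link control line differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bit values used only by this sketch. */
#define ASPM_STATE_L0S 0x1
#define ASPM_STATE_L1  0x2

/*
 * Parse one lspci line of the form
 *   "Link: ASPM L0s L1 Enabled RCB 64 bytes CommClk+ ExtSynch-"
 * Returns a bitmask of the enabled ASPM states, 0 if ASPM is reported
 * as disabled, or -1 if the line is not a link control line.
 */
int aspm_enabled_from_lspci_line(const char *line)
{
    static const char prefix[] = "Link: ASPM ";
    const char *p = strstr(line, prefix);
    const char *end;
    const char *q;
    int mask = 0;

    if (p == NULL)
        return -1;              /* e.g. the "Link: Supported Speed" line */
    p += sizeof(prefix) - 1;    /* skip past the prefix */

    if (strncmp(p, "Disabled", 8) == 0)
        return 0;

    end = strstr(p, "Enabled");
    if (end == NULL)
        return -1;

    /* Only state names that appear before "Enabled" count. */
    q = strstr(p, "L0s");
    if (q != NULL && q < end)
        mask |= ASPM_STATE_L0S;
    q = strstr(p, "L1");
    if (q != NULL && q < end)
        mask |= ASPM_STATE_L1;

    return mask;
}
```

Applied to the two control lines in the diff, the unpatched line yields both states and the patched line yields L1 only, while the "Link: Supported ..." capability line is rejected because it describes what the link can do rather than what is enabled.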
Matthew Garrett tells me that the RHEL5 kernel never enables ASPM. In this case the BIOS must have done it. So a patch like the proposed one is indeed necessary. I'd only suggest a cleanup. Instead of copying the code from e1000e, it should be put into a common function. Stanislaw will do that.

If ASPM can be enabled by BIOS, should we disable it explicitly also in RHEL5 on the same drivers we did it in RHEL6 (i.e. ath5k, ath9k, iwlwifi, r8169, aacraid ...)? Matthew, your opinion?

In general the BIOS won't set up ASPM modes that break - this case seems to be an exception. I think we could get away with just doing ath5k.

Created attachment 496804 [details]
/0001-ath5k-disable-ASPM-L0s-for-all-cards.patch
This is a slightly different patch from the one that was proposed. It checks whether the device is a PCIe one, and adds the comments that were in the upstream commit.
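
The register logic that a patch of this kind relies on can be sketched in plain C. This is not the attached patch: it is a userspace model, assuming only that the ASPM Control field occupies bits 1:0 of the PCIe Link Control register (bit 0 for L0s, bit 1 for L1). The helper name `lnkctl_without_l0s` and the `is_pcie` flag are made up for illustration; a driver would read and write the register through the kernel's PCI config accessors instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASPM Control field of the PCIe Link Control register (bits 1:0). */
#define LNKCTL_ASPM_L0S 0x0001u /* L0s entry enabled */
#define LNKCTL_ASPM_L1  0x0002u /* L1 entry enabled */

/*
 * Hypothetical helper: given the current Link Control value, return the
 * value to write back so that L0s is off. A device that is not PCIe has
 * no Link Control register, so its value is returned untouched.
 */
uint16_t lnkctl_without_l0s(bool is_pcie, uint16_t lnkctl)
{
    if (!is_pcie)
        return lnkctl;

    /* Clear only the L0s bit; L1 and all other bits are preserved. */
    return (uint16_t)(lnkctl & ~LNKCTL_ASPM_L0S);
}
```

With an illustrative register value of 0x0043 (L0s and L1 both enabled), the helper returns 0x0042, leaving L1 and every other bit alone. That corresponds to the "ASPM L0s L1 Enabled" to "ASPM L1 Enabled" change seen in the lspci diff.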
(In reply to comment #15)
> So a patch like the proposed one is indeed necessary. I'd only suggest a
> cleanup. Instead of copying the code from e1000e, it should be put into a
> common function. Stanislaw will do that.

Hmm, I said that, but I have changed my mind; now I think it's not worth doing so ...

Created attachment 496807 [details]
test_ath5k_aspm.patch
As my BIOS does not enable ASPM, I was using this for testing.
Patch(es) available in kernel-2.6.18-261.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5
Detailed testing feedback is always welcomed.

Somehow kernel-2.6.18-261.el5 is not available in http://people.redhat.com/jwilson/el5, maybe a missing sync?

I can confirm that kernel-2.6.18-261.el5 works for me as expected.

Couldn't reproduce the bug. We have a machine with the same PCI device but a different subsystem. The bug does not appear in this configuration. Our device configuration doesn't have
Capabilities: [60] Express Legacy Endpoint IRQ 0
And high volume transfers (8GiB) over various protocols (scp, http (wget), and rsync) are all fine. I never got kernel activity in dmesg throughout this. Setting verified to SanityOnly and Customer.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1065.html