Bug 753729
Summary: | system cannot suspend with "stopping tasks timed out - bnx2i_thread/0 remaining"
---|---
Product: | Red Hat Enterprise Linux 5
Component: | kernel
Version: | 5.8
Status: | CLOSED ERRATA
Severity: | high
Priority: | urgent
Reporter: | Guangze Bai <gbai>
Assignee: | Mike Christie <mchristi>
QA Contact: | Storage QE <storage-qe>
CC: | bprakash, ccui, coughlan, czhang, eddie.wai, fge, mschmidt, nhorman, syeghiay, yshao
Target Milestone: | beta
Keywords: | Regression
Hardware: | Unspecified
OS: | Linux
Fixed In Version: | kernel-2.6.18-300.el5
Doc Type: | Bug Fix
: | 765724
Last Closed: | 2012-02-21 04:01:34 UTC
Bug Blocks: | 757620, 758797, 765724
Description
Guangze Bai
2011-11-14 10:11:52 UTC
Tested on -288.el5: the T400 suspends successfully there, and no bnx2i_thread exists on the system. So marking "Regression". I'll bisect and provide more info later.

The kernel thread's main loop is in bnx2i_percpu_io_thread(). The thread neither calls try_to_freeze() nor marks itself unfreezable (PF_NOFREEZE). It needs to do one of these, as described in Documentation/power/kernel_threads.txt.

Adding bnx2i maintainer Eddie from Broadcom.

It looks like this could be a problem in fcoe.ko and bnx2fc.ko in RHEL 6 too.

(In reply to comment #6)
> Adding bnx2i maintainer Eddie from broadcom.
>
> It looks like this could be a problem in fcoe.ko and bnx2fc.ko in rhel 6 too.

I guess this does not apply to RHEL 6? kernel_threads.txt is no longer there; I see it has been removed. But for RHEL 5, does qla2xxx have the problem?

(In reply to comment #7)
> I guess this does not apply to rhel6? The kernel_threads.txt is not there
> anymore and I see it is removed.

In RHEL 6 there is Documentation/power/freezing_of_tasks.txt instead. There is one significant difference: in RHEL 6, kernel threads are non-freezable by default. See commit 83144186 "Freezer: make kernel threads nonfreezable by default".

> But for rhel5 does qla2xxx have the problem?

Looking at the code... yes, it does.

Created attachment 534072
bnx2i patch to add explicit PF_NOFREEZE setting for I/O kthreads

It looks like the correct fix for the bnx2i I/O kthread is to set the PF_NOFREEZE flag explicitly. This will align the bnx2i I/O kthread behavior in RHEL 5.8 with RHEL 6/upstream.
The enclosed patch was created against the linux-2.6.18-295.el5 kernel source. Please review, thanks.
Eddie
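
For readers following the reasoning above, here is a rough user-space sketch of why the flag matters. It is a toy model only: it is not the attached patch and not kernel code. The freezer is modeled as a loop that asks every task to freeze, skips tasks carrying a no-freeze flag, and gives up after a bounded number of rounds. All `toy_*` names, the `TOY_PF_NOFREEZE` value, and the round-based timeout are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's PF_NOFREEZE; the value is arbitrary. */
#define TOY_PF_NOFREEZE 0x1u

/* Toy task: just enough state to model the freezer's decision. */
struct toy_task {
    const char *name;
    unsigned int flags;   /* TOY_PF_NOFREEZE or 0 */
    bool polls_freezer;   /* main loop calls a try_to_freeze()-like hook */
    bool frozen;
};

/* One pass of a task's main loop: it freezes only if it polls for the request. */
void toy_task_step(struct toy_task *t, bool freeze_requested)
{
    if (freeze_requested && t->polls_freezer)
        t->frozen = true;
}

/*
 * Ask all tasks to freeze, retrying for at most max_rounds (expected >= 1).
 * Returns how many tasks still block suspend; 0 means suspend may proceed.
 * A non-zero result models the "stopping tasks timed out" outcome.
 */
size_t toy_freeze_tasks(struct toy_task *tasks, size_t n, int max_rounds)
{
    size_t remaining = n;

    for (int round = 0; round < max_rounds; round++) {
        remaining = 0;
        for (size_t i = 0; i < n; i++) {
            struct toy_task *t = &tasks[i];

            if (t->flags & TOY_PF_NOFREEZE)
                continue;               /* freezer ignores this task */
            toy_task_step(t, true);
            if (!t->frozen)
                remaining++;            /* still running: blocks suspend */
        }
        if (remaining == 0)
            break;
    }
    return remaining;
}
```

In this model, a task that neither polls nor carries the flag makes `toy_freeze_tasks()` return 1, which corresponds to the "stopping tasks timed out - bnx2i_thread/0 remaining" case in the summary. Setting the flag, or polling in the main loop, brings the result back to 0.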
Patch(es) available in kernel-2.6.18-300.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5/
Detailed testing feedback is always welcomed. If you require guidance regarding testing, please ask the bug assignee.

Tried on a server platform with a bnx2i iSCSI session, but the server only supports suspend to disk. With kernel -300 the server _cannot_ boot up. The console of that server is down, so this is typed manually:
==== begin fw dump (mark 0x3c67a0) 0x80071b4 mcp intr[0.0]: 0x4:SPAD RPTY => 0x PC 0x800650c ====
Will provide the detailed output once eng-ops fix the console.

I see no customer need to suspend a server to disk, so if you guys don't want to fix it, we can close this bug.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHSA-2012-0150.html