Bug 479090 - Panic in do_cciss_intr removeQ
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.9
Hardware: All Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Tomas Henzl
QA Contact: Storage QE
Depends On:
Blocks:
Reported: 2009-01-06 20:45 EST by Wade Mealing
Modified: 2018-10-19 21:28 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-02-16 10:36:04 EST


Attachments
backport (5.73 KB, patch)
2009-09-22 10:50 EDT, Tomas Henzl


External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2011:0263
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 4.9 kernel security and bug fix update
Last Updated: 2011-02-16 10:14:55 EST

Description Wade Mealing 2009-01-06 20:45:05 EST
Description of problem:

The CCISS module panics in do_cciss_intr.

CPU:    0
EIP:    0060:[<f8855437>]    Tainted: P      VLI
EFLAGS: 00010087   (2.6.9-34.0.2.ELsmp)
EIP is at do_cciss_intr+0xdc/0x4b4 [cciss]
eax: 00000000   ebx: 00000004   ecx: 00000004   edx: 00000000
esi: f7400000   edi: 00000000   ebp: c3765800   esp: c03eafbc
ds: 007b   es: 007b   ss: 0068
Process kjournald (pid: 1629, threadinfo=c03ea000 task=c37fedb0)
Stack: 00000000 00000001 00000001 00000082 f7dd4800 00000001 00000000 c37a4ab8
      c0107472 c37a4a9c c03ea000 c0387900 c37a4000 c01079d2 00000032 c37a4ab8
      f7dd4800
Call Trace:
[<c0107472>] handle_IRQ_event+0x25/0x4f
[<c01079d2>] do_IRQ+0x11c/0x1ae
=======================
[<c02d304c>] common_interrupt+0x18/0x20
[<f885510e>] do_cciss_request+0x9e/0x2eb [cciss]
[<c0142742>] mempool_alloc+0x7b/0x135
[<c0120291>] autoremove_wake_function+0x0/0x2d
[<c0142742>] mempool_alloc+0x7b/0x135
[<c0120291>] autoremove_wake_function+0x0/0x2d
[<c022a6ce>] __cfq_get_queue+0x91/0xf6
[<c0120291>] autoremove_wake_function+0x0/0x2d
[<c022a763>] cfq_get_queue+0x30/0x37
[<c022aa13>] cfq_set_request+0x33/0x6b
[<c022a9e0>] cfq_set_request+0x0/0x6b
[<c0223557>] get_request+0x1de/0x1e8
[<c012026d>] finish_wait+0x2c/0x50
[<c0222b9a>] ll_back_merge_fn+0x175/0x1de
[<c022174b>] elv_merged_request+0x9/0xa
[<c0224174>] __make_request+0x452/0x46c
[<c014285c>] mempool_free+0x60/0x64
[<c022a55a>] cfq_dispatch_requests+0x55/0x80
[<c022a5a6>] cfq_next_request+0x21/0x35
[<c0222fa0>] __generic_unplug_device+0x2b/0x2d
[<c0222fb7>] generic_unplug_device+0x15/0x21
[<c0222fd2>] blk_backing_dev_unplug+0xf/0x10
[<c015b3d9>] sync_buffer+0x2c/0x2d
[<c015b4d7>] __wait_on_buffer+0x67/0x83
[<c015b384>] bh_wake_function+0x0/0x29
[<c015e199>] submit_bh+0x15a/0x166
[<c015b384>] bh_wake_function+0x0/0x29
[<f8863ac2>] journal_commit_transaction+0x8a7/0xfc1 [jbd]
[<c0120291>] autoremove_wake_function+0x0/0x2d
[<c0120291>] autoremove_wake_function+0x0/0x2d
[<c011dcf7>] find_busiest_group+0xdd/0x2ba
[<c011e115>] load_balance_newidle+0x56/0x82
[<c02d05c1>] schedule+0x83d/0x8d3
[<c02d05f1>] schedule+0x86d/0x8d3
[<c0129d4a>] del_timer_sync+0x7a/0x9c
[<f8865e8d>] kjournald+0xc7/0x219 [jbd]
[<c0120291>] autoremove_wake_function+0x0/0x2d
[<c0120291>] autoremove_wake_function+0x0/0x2d
[<c011d549>] schedule_tail+0x31/0xa7
[<f8865dc0>] commit_timeout+0x0/0x5 [jbd]
[<f8865dc6>] kjournald+0x0/0x219 [jbd]
[<c01041f5>] kernel_thread_helper+0x5/0xb
Code: 95 30 03 00 00 74 38 8b 86 3c 02 00 00 39 f0 74 2e 39 b5 30 03 00 00 75 06 89 85 30 03 00 00 8b 86 38 02 00 00 8b 96 3c 02 00 00 <89> 90 3c 02 00 00 8b 96 3c 02 00 00 89 82 38 02 00 00 eb 06 c7


After investigating, it looks like c->prev is NULL.

Version-Release number of selected component (if applicable):


2.6.9-34.0.2 

How reproducible:

unknown

The problem appears to be in removeQ in cciss.c. This code does not appear to have changed in more recent EL4 kernels; however, an upstream commit (http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=8a3173de;hp=7c0990c7ee988aa193abbb7da3faeb9279146dbf) mentions detecting the spurious case of a command being removed from a queue it doesn't belong to.

I think that spurious case is what is causing the panic I'm seeing.
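
To illustrate the failure mode described above, here is a minimal userspace sketch in C. It is not the cciss driver code and not the upstream patch; the names struct cmd, add_q and remove_q_checked are hypothetical and exist only for this example. It models a circular doubly linked queue and shows how a removal routine can refuse to unlink a command whose link pointers are NULL, rather than dereferencing c->prev as in the trace. It only detects a command that is on no queue at all, and assumes that a linked command is on the queue passed in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued command; not the driver's real struct. */
struct cmd {
    struct cmd *next;
    struct cmd *prev;
};

/* Append c to the tail of the circular queue whose head pointer is *q. */
static void add_q(struct cmd **q, struct cmd *c)
{
    assert(c != NULL);
    if (*q == NULL) {
        /* First element: it links to itself. */
        *q = c;
        c->next = c->prev = c;
    } else {
        c->prev = (*q)->prev;
        c->next = *q;
        (*q)->prev->next = c;
        (*q)->prev = c;
    }
}

/*
 * Remove c from the queue headed by *q.
 * Returns 1 if c was unlinked, 0 if c was not linked anywhere
 * (NULL next/prev), in which case nothing is touched.
 */
static int remove_q_checked(struct cmd **q, struct cmd *c)
{
    if (c == NULL || c->next == NULL || c->prev == NULL)
        return 0;               /* spurious removal: c is on no queue */

    if (c->next == c) {
        /* c was the only element. */
        if (*q == c)
            *q = NULL;
    } else {
        if (*q == c)
            *q = c->next;
        c->prev->next = c->next;
        c->next->prev = c->prev;
    }

    /* Mark c as unlinked so a second removal is detected too. */
    c->next = c->prev = NULL;
    return 1;
}
```

A caller could log a warning when remove_q_checked returns 0, which turns the NULL dereference into a diagnosable event instead of a panic.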
Comment 2 Mike Miller (OS Dev) 2009-02-11 10:18:27 EST
Does RH need HP to port that change into rhel4.9?
Comment 3 Tomas Henzl 2009-02-11 10:50:44 EST
(In reply to comment #2)
> Does RH need HP to port that change into rhel4.9?

I'm not sure whether it is still possible for this to go into rhel4.8, but yes, please port it into rhel4.8.
Comment 5 Tomas Henzl 2009-09-22 10:50:16 EDT
Created attachment 362097
backport

The patch is backported from upstream and is not complicated; I think we can take it for 4.9.
Comment 6 RHEL Product and Program Management 2009-09-22 11:02:57 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 7 Tomas Henzl 2009-10-15 10:48:33 EDT
Posted today.
Comment 12 Vivek Goyal 2010-09-23 09:01:16 EDT
Committed in 89.37.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/
Comment 17 Gris Ge 2011-01-25 01:46:55 EST
RHEL4 doesn't support kdump.

Netdump for ccissp was verified at https://beaker.engineering.redhat.com/recipes/74648

Code reviewed. Patch linux-2.6.9-cciss-switch-to-using-hlist-to-fix-panic.patch was applied in kernel-2.6.9-95.EL.

Sanity only.
Comment 18 errata-xmlrpc 2011-02-16 10:36:04 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0263.html
