230717 – kernel panic with ciss driver upon kdump/kexec execution w/DLx85 platforms

Summary: kernel panic with ciss driver upon kdump/kexec execution w/DLx85 platforms

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	5.2
Hardware:	i686
OS:	Linux
Priority:	urgent
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Tomas Henzl
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	222082 244993 246139 296411 319091 RHEL5u2_relnotes 420521 422431 422441

Reported:	2007-03-02 14:11 UTC by Joshua Giles
Modified:	2018-10-21 16:58 UTC
CC List:	26 users
Fixed In Version:	RHBA-2008-0314
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2008-05-21 14:41:35 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments
console output from the dl360 showing, among other things the panic (46.59 KB, text/plain) 2007-03-02 16:04 UTC, Joshua Giles	no flags	Details
csv Openoffice spreadsheet showing test matrix (1.20 KB, text/x-comma-separated-values) 2007-03-04 15:26 UTC, Joshua Giles	no flags	Details
kdump support for cciss (9.86 KB, patch) 2008-01-18 16:58 UTC, Mike Miller (OS Dev)	no flags	Details \| Diff
Screenshot (115.15 KB, image/png) 2008-01-30 11:58 UTC, Tomas Henzl	no flags	Details
boot log with p400 and p800 in ML370G5 (50.84 KB, text/plain) 2008-01-31 16:59 UTC, Mike Miller (OS Dev)	no flags	Details
kdump support patch (9.86 KB, patch) 2008-01-31 17:03 UTC, Mike Miller (OS Dev)	no flags	Details \| Diff
working + non working version (52.04 KB, application/octet-stream) 2008-02-07 16:33 UTC, Tomas Henzl	no flags	Details
polling mode patch for kexec/kdump (17.23 KB, patch) 2008-03-14 22:08 UTC, Mike Miller (OS Dev)	no flags	Details \| Diff
polling mode patch for kexec/kdump redone (17.04 KB, patch) 2008-03-18 20:36 UTC, Mike Miller (OS Dev)	no flags	Details \| Diff
updated kdump patch (15.88 KB, patch) 2008-03-19 16:13 UTC, Mike Miller (OS Dev)	no flags	Details \| Diff
cleanup debug in kdump patch (1.46 KB, patch) 2008-03-21 20:17 UTC, Mike Miller (OS Dev)	no flags	Details \| Diff
This patch obsoletes all others in this bug (15.21 KB, patch) 2008-03-26 16:24 UTC, Mike Miller (OS Dev)	no flags	Details \| Diff
My apologies, the last patch had compile warnings. (15.73 KB, patch) 2008-03-26 17:42 UTC, Mike Miller (OS Dev)	no flags	Details \| Diff
use PCI power management to reset the controller (6.74 KB, patch) 2008-04-17 15:15 UTC, Chip Coldwell	no flags	Details \| Diff
Use PCI power management to reset the controller (7.46 KB, patch) 2008-04-18 21:00 UTC, Chip Coldwell	no flags	Details \| Diff
New rev of previous patch for Smart Array 5i (7.82 KB, patch) 2008-04-20 21:17 UTC, Chip Coldwell	no flags	Details \| Diff

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2008:0314	0	normal	SHIPPED_LIVE	Updated kernel packages for Red Hat Enterprise Linux 5.2	2008-05-20 18:43:34 UTC

Description Joshua Giles 2007-03-02 14:11:23 UTC

Description of problem:
A panic is observed upon reboot into the kexec environment on DL platforms:
dl385-01.rhts.boston.redhat.com
dl585-01.rhts.boston.redhat.com

1.)Follow the HOWTO contained in the kexec-tools rpm. With the kexec tools
installed, rebuild the initrd image with `service kdump restart`
2.)Force a panic:
`echo c >/proc/sysrq-trigger`
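
The two steps above can be sketched as a small shell script. This is only a sketch under stated assumptions, not the reporter's exact procedure: the `run` and `reproduce` helpers and the `--really` switch are hypothetical names introduced here. By default the script only prints the commands, because step 2 deliberately panics the machine; the trigger is written with `echo`, since the SysRq file expects the letter `c` to be written to it.

```shell
#!/bin/sh
# Hypothetical helper: print a command in dry-run mode, execute it otherwise.
run() {
    if [ "$MODE" = "--really" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Hypothetical wrapper around the two steps in the description.
reproduce() {
    MODE=${1:-dry-run}
    # Step 1: restart kdump so the initrd is rebuilt and the capture kernel is loaded.
    run service kdump restart
    # Step 2: force a panic through the SysRq trigger (this crashes the machine).
    run sh -c 'echo c > /proc/sysrq-trigger'
}

# Dry run only; use "reproduce --really" as root on a disposable test system.
reproduce
```

Keeping the destructive path behind an explicit argument makes it harder to crash a machine by accident while iterating on the kdump configuration.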

Version-Release number of selected component (if applicable):
2.6.18-8.el5
1.101-164.el5

How reproducible:
100%

Steps to Reproduce:
1. Explained in Description
2.
3.
  
Actual results:
Panic

Expected results:


Additional info:

HP CISS Driver (v 3.6.14-RH1)
ACPI: PCI Interrupt 0000:02:04.0[A] -> GSI 18 (level, low) -> IRQ 169
cciss0: <0xb178> at PCI 0000:02:04.0 IRQ 169 using DAC
cciss cciss0: SendCmd Invalid command list address returned! (4)
------------[ cut here ]------------
kernel BUG at drivers/block/cciss.c:2232!
invalid opcode: 0000 [#1]
SMP
last sysfs file:
Modules linked in: cciss sd_mod scsi_mod
CPU:    0
EIP:    0060:[<c9867a7e>]    Not tainted VLI
EFLAGS: 00010292   (2.6.18-8.el5 #1)
EIP is at sendcmd+0x263/0x29e [cciss]
eax: 00000044   ebx: c8400000   ecx: c986b5e5   edx: c8dced8c
esi: 00000000   edi: 00004e20   ebp: c8e7b800   esp: c8dced94
ds: 007b   es: 007b   ss: 0068
Process exe (pid: 464, ti=c8dce000 task=c8dc6000 task.ti=c8dce000)
Stack: 0026bd80 c8e69ac0 00000012 00000000 00000040 00000000 c8e68cc0 c8e69ac0
       c9867d62 00000024 00000000 00000000 00000000 00000000 00000000 c10087a7
       00000000 c8df2bc0 00000003 00000000 00000040 00000020 c8e69b40 00000000
Call Trace:
 [<c9867d62>] cciss_getgeometry+0x9e/0x23f [cciss]
 [<c10087a7>] dma_alloc_coherent+0xaa/0xde
 [<c986a0a8>] cciss_init_one+0x6be/0xa9e [cciss]
 [<c11439eb>] __driver_attach+0x0/0x6b
 [<c10e6bf4>] pci_device_probe+0x36/0x57
 [<c1143945>] driver_probe_device+0x42/0x8b
 [<c1143a2f>] __driver_attach+0x44/0x6b
 [<c114344a>] bus_for_each_dev+0x37/0x59
 [<c11438af>] driver_attach+0x11/0x13
 [<c11439eb>] __driver_attach+0x0/0x6b
 [<c1143152>] bus_add_driver+0x64/0xfd
 [<c10e6d22>] __pci_register_driver+0x47/0x63
 [<c103d0d4>] sys_init_module+0x16e7/0x186a
 [<c12f4700>] pcibios_irq_init+0xfe/0x47e
 [<c1003eff>] syscall_call+0x7/0xb
 =======================
Code: 88 bc 03 00 00 8b 40 24 83 c0 02 39 c7 7c 0f 56 68 2d b6 86 c9 e8 f7 c6 7b f7 58 5a eb 0d 8b 41 04 89 14 b8 ff 01 e9 7
EIP: [<c9867a7e>] sendcmd+0x263/0x29e [cciss] SS:ESP 0068:c8dced94
 <0>Kernel panic - not syncing: Fatal exception

Comment 1 Joshua Giles 2007-03-02 15:30:49 UTC

Additional Description:

The panic happens when the kexec environment boots, before the vmcore is
copied. The flow looks like this:

configure kexec/kdump -> trigger a crash dump (i.e. `echo c > /proc/sysrq-trigger`)
-> kexec kernel loads from the configured memory region (kernel cmdline
"crashkernel=128M@16M") -> storage module (cciss driver) loads and panics the system.

Comment 2 Don Zickus 2007-03-02 15:44:41 UTC

This panic doesn't look unusual.  Last year when we started pushing hard to
integrate kdump as our crash analysis tool, we knew there were a lot of drivers
(mainly scsi) that were not kdump friendly.  

What that means is they could not handle transactions properly from the previous
running kernel, namely pci responses and dma interrupts.  Vivek helped create a
reset mechanism to work around these issues and for the most part any driver
that utilized that mechanism had their issues disappear.  

The cciss panic seems to stem from sendcmd() receiving an illegal response. 
Again being a scsi device this panic isn't uncommon at all.  I presume if we
were to implement the reset mechanisms as described above, this problem can be
solved.  

We tested lots of i/o drivers leading up to rhel-5.  Apparently the cciss device
wasn't at the top of the food chain.  

Vivek,
Does my conclusion from the previously attached panic log seem correct?  And do
you have any links to what reset mechanisms this driver may need?

-Don

Comment 3 Joshua Giles 2007-03-02 16:04:51 UTC

Created attachment 149128 [details]
console output from the dl360 showing, among other things the panic

Comment 4 Joshua Giles 2007-03-04 15:24:27 UTC

Comprehensive test results summary:

x86_64 :
7-8 test runs (trigger dump) on ibm, dl and nec machines 
-dl360-01 was the only machine to have a panic


i386:
7-8 test runs (trigger dump) on ibm, dl and nec machines 
-dl360-01 was the only machine to have a panic


*Based on the testing done, the likelihood (reproducibility) should be
something like 50% on dl360-01 and 20% on dl[3-5]85-01

Will attach a csv (openoffice format) spreadsheet.

Comment 5 Joshua Giles 2007-03-04 15:26:35 UTC

Created attachment 149208 [details]
csv Openoffice spreadsheet showing test matrix

Comment 6 Tom Coughlan 2007-03-09 17:37:10 UTC

Vivek, will you be able to help on this?

I'm also adding Mike Miller, the cciss maintainer at HP.

Comment 7 Mike Miller (OS Dev) 2007-03-09 19:13:56 UTC

This has been an issue for Smart Array for some time. I'm working with the
firmware team to ensure the reset and abort messages are actually honored by the
controller firmware. The latest firmware reportedly does honor the reset but I
have not yet tested that functionality.
I am out of the office recovering from a motorcycle accident. I do not expect to
return before March 19.

Comment 8 Tom Coughlan 2007-03-12 14:45:03 UTC

Vivek,

Are there any likely work-arounds, based on your experience with other HBAs that
had similar problems? 

Some customers will be slow to update HBA firmware. 

Tom

Comment 9 IBM Bug Proxy 2007-03-13 05:47:14 UTC

Hi Tom,

Can't think of a work-around for this issue. As Mike mentioned, the firmware
team needs to make sure the controller responds to RESET and ABORT messages;
only then can this problem be solved.

Mike, it looks like this problem is also related to pending messages, as with
megaraid. There we issue some kind of FLUSH message to the controller to flush
all the pending messages in the queue. Does ciss support something like that?

Vivek

Comment 10 Mike Miller (OS Dev) 2007-04-03 20:32:32 UTC

Looks like firmware on at least some of our SAS controllers supports the reset
message defined in the cciss specification. If someone knows of a flag for which
I can test during init that tells me this is a kexec'ed kernel the fix should be
fairly simple and straightforward.
I still need to do further testing because the P400 locked up when resetting.
The P800 and E500 controllers seem to work OK.

Comment 11 Mike Miller (OS Dev) 2007-04-03 20:33:59 UTC

Vivek, we do support cache flush on Smart Array.

Comment 13 Don Zickus 2007-04-04 13:59:07 UTC

If the flag you are looking for is only for testing purposes, then checking to
see if /proc/vmcore is non-zero is one way to do it.  

But if you want to use this flag as part of the final solution, I would advise
against it.

Cheers,
Don
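
The test-only check mentioned above can be sketched from user space as follows. This is an illustration under assumptions: the `in_kdump_kernel` function name and its optional path argument are hypothetical and were added so the check can be tried against an ordinary file. As the comment above says, this is suitable for testing only, not as part of a final fix.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the vmcore file exists and is non-empty.
# The optional path argument exists only to make the check easy to try out;
# the default is /proc/vmcore.
in_kdump_kernel() {
    vmcore=${1:-/proc/vmcore}
    [ -s "$vmcore" ]
}

if in_kdump_kernel; then
    echo "running in a kdump capture kernel"
else
    echo "normal boot (no usable /proc/vmcore)"
fi
```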

Comment 14 Tom Coughlan 2007-04-04 14:13:00 UTC

(In reply to comment #10)
> Looks like firmware on at least some of our SAS controllers supports the reset

Mike, as we move forward, keep in mind that we will need to write a release note
that clearly states which cciss-based systems work with kdump and which do not.
This will need to refer to customer-consumable model numbers (for systems with
cciss built-in, and add-on cards) and min. fw revs. Maybe your fw guys can help
with that.

Comment 15 Mike Miller (OS Dev) 2007-04-04 16:00:28 UTC

I can help with that. As I test the various configs I can make sure our release
notes are updated. I'll also make sure that cciss.txt is updated with accurate
information.

Comment 16 Tom Coughlan 2007-04-10 12:20:36 UTC

From Vivek Goyal <vgoyal.com>

Mike,

I had added a command line parameter "reset_devices" to give drivers an
indication that they need to first reset their device and then go ahead with
rest of the initialization. This is available in upstream kernels. I am not
sure if this is part of RHEL5 kernels or not. If it is not available in RHEL5
kernels, then for testing you can use upstream kernels, pass "reset_devices"
command line option while loading kdump kernel and make use of it for
resetting the cciss controller. Once you are successful in your testing,
we can think of taking this patch in RHEL5.
("reset_devices" is a very non-intrusive patch.)

Regarding flushing the caches, you can try that and see if solves the
problem. There are high chances that it will. It did for megaraid.

Thanks
Vivek
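
Whether a running kernel was actually booted with "reset_devices" can be checked from user space by looking at /proc/cmdline. The sketch below is an assumption-laden illustration: `cmdline_has_param` is a hypothetical helper name, and it only inspects the command line text. Inside a driver the equivalent test is the kernel's `reset_devices` variable, which user space cannot read directly.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the space-separated command line in $1
# contains the parameter named in $2, either bare or as "name=value".
cmdline_has_param() {
    case " $1 " in
        *" $2 "*|*" $2="*) return 0 ;;
    esac
    return 1
}

if [ -r /proc/cmdline ] && cmdline_has_param "$(cat /proc/cmdline)" reset_devices; then
    echo "reset_devices was passed to this kernel"
else
    echo "reset_devices not present"
fi
```

Padding the string with spaces before matching avoids false positives on longer parameter names that merely end in the same text.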

Comment 18 Mike Miller (OS Dev) 2007-06-08 19:42:28 UTC

Even with Vivek's patch we still have an issue with msi. When the kernel crashes
we have no way to free our msi-x vectors. So when the dump kernel initializes we
cannot allocate and register new vectors. 
I'm not sure how msi determines what resources get which vectors. It seems that
it may be based on the PCI bus address. So that a card in a particular slot will
always be allocated the same vector(s). There was some work going on upstream to
fix that. But a couple of weeks ago I was playing with kdump and still
encountered this issue.
So I think we're a ways out from making this work as designed.

Comment 20 Mike Miller (OS Dev) 2007-07-17 18:50:21 UTC

We still have some apparent firmware issues trying to support kdump. After
resetting the controller it will not accept any more commands. I'm working with
the firmware group to resolve this problem. No ETA.

Comment 21 Tom Coughlan 2007-11-01 15:52:00 UTC

Mike,

We really need to write a release note warning customers that kdump on cciss
does not work, at least on some hw/fw combinations. Please let us know if there
are any cciss models and fw versions that are known to work, and you are willing
to support. Otherwise the release note will just make a blanket statement. 

Don, this should go in the 5.1 release note update as soon as Mike replies. 

Tomas, please keep an eye on any potential fixes for this from HP during 5.2
development.  

Tom

Comment 22 Mike Miller (OS Dev) 2007-11-01 20:18:54 UTC

Proposed support statement:
At this time kexec/kdump does not work with any HP Smart Array controller.
Support for kexec/kdump is planned for a future release but no scheduling
information is available.

Is this appropriate for the release notes? 

Not for public release: I looked at this issue again with one of our firmware
engineers. It looks like the firmware is completing commands after the reset but
we get stuck in wait for completion in the ioctl path. Right now I'm working 2
critical issues for a new product. When I resolve those I will add debug to
figure out why we're stuck. I suspect a corrupt command tag.

Comment 23 Jarod Wilson 2007-11-01 21:10:54 UTC

Hrm... I don't think saying it doesn't work with *any* Smart Array controller is
accurate. I've definitely been able to capture a dump on at least one
cciss-equipped box (and one cpqarray-equipped box) while testing our mkdumprd
tweaks to support dumping to manually specified cciss & cpqarray devices (see
bug 228685).

Of course, I don't know how to identify those that work vs. those that don't, so...

Comment 24 Don Domingo 2007-11-01 23:07:00 UTC

for now, in the absence of a definitive list of Smart Array controllers where
kdump/kexec is not supported, lets just say "some Smart Array controllers".

also, this is only for the X86-64 architectures, right?

<quote>
(x86_64) Some Smart Array controllers do not support kexec and kdump.
</quote>

Comment 25 Mike Miller (OS Dev) 2007-11-02 14:22:44 UTC

Jarod, If you've been successful I'd like to know which controllers, firmware
version, and driver version have worked. I have had no luck with any of the
controllers I've tested. I've only looked at the newest controllers, though.

Comment 26 Jarod Wilson 2007-11-02 17:37:41 UTC

Ugh. So the cciss system I swear I was able to capture a dump on previously now
panics. It's an HP DL380 G5 here in our lab, with an HP Smart Array P400
Controller, firmware version 1.18. The cciss driver is 3.6.16-RH1, as found in
kernel-2.6.18-53.el5. From what I can tell, it was mid-August I last actually
tried this, so I can give some kernels from that time frame a go to see if one
of 'em actually works. Not sure if anyone has changed the controller firmware
lately or not though.

Comment 27 Tom Coughlan 2007-11-02 22:05:09 UTC

(In reply to comment #24)
> for now, in the absence of a definitive list of Smart Array controllers where
> kdump/kexec is not supported, lets just say "some Smart Array controllers".
> 
> also, this is only for the X86-64 architectures, right?
> 
> <quote>
> (x86_64) Some Smart Array controllers do not support kexec and kdump.
> </quote>

I would prefer:

Crash dump using kexec/kdump may not function reliably with HP Smart Array
controllers (these adapters use the cciss driver). 

I believe the architectures are i686, x86_64, and ia64.

Comment 28 Don Domingo 2007-11-04 22:20:41 UTC

thanks Tom, edited as follows:

<quote>
(x86_64;ia64) Crash dumping through kexec and kdump may not function reliably
with HP Smart Array controllers. This is because these controllers use the cciss
driver.
</quote>

Comment 29 Tom Coughlan 2007-11-05 14:43:46 UTC

(In reply to comment #28)

> <quote>
> (x86_64;ia64) 

Did you overlook x86? 

> Crash dumping through kexec and kdump may not function reliably
> with HP Smart Array controllers. This is because these controllers use the 
> cciss
> driver.
> </quote>

The problem is with the firmware, not the cciss driver. I wanted to include
"cciss" because that helps people identify the hw involved, and they may be
searching the docs for that term. So something like: 

"... (These controllers use the cciss driver.) A solution to this problem,
which is likely to involve a firmware update to the controller, is being
investigated." 

Tom

Comment 30 Don Domingo 2007-11-05 22:09:51 UTC

thanks for clearing that up Tom. release note revised.

Comment 31 Sam Knuth 2007-11-21 13:06:13 UTC

Hello - wall street customers are starting to ask about this. Any ETA on the
firmware update? This is definitely going to make RHEL 5 a non-starter for Wall
St until we get it resolved.

-Sam

Comment 32 Tom Coughlan 2007-12-06 21:53:06 UTC

There is nothing we can do to resolve this without HP fw/driver changes. I have
requested management attention.

Comment 35 Mike Miller (OS Dev) 2008-01-16 22:18:58 UTC

We have kdump working on ia32 based systems using a kernel.org 2.6.22.9 kernel.
I'm having problems with rhel5.1 on x86_64. The driver loads and successfully
discovers the attached disks but I get the message "unexpected IRQ trap at
vector 82" for each controller in the system. Then the system panics with
"Kernel panic - not syncing: Attempted to kill init!"
According to the kdump doc you must use the uncompressed kernel image on x86_64.
That image doesn't seem to be included on the distribution. Can someone please
explain RH's expectations for kdump support? In other words, how would a
customer set up a crash kernel?

Comment 36 Don Zickus 2008-01-16 22:50:48 UTC

The normal kernel is a relocatable kernel thus allowing kdump to use the same
kernel for its purposes.  Installing the kexec-tools package (the init script
takes care of everything) and setting up the correct memory region on the kernel
command line (rebooting to have it take effect) will get you there.  This is
also done by the First Boot install scripts.

Disregard the kdump documentation as it is a little out of date.

cc'ing Neil to help you with other little issues.

Comment 37 Neil Horman 2008-01-16 23:46:52 UTC

Mike, Don is right, the docs are old; you don't need to use the uncompressed
kernel any more.  Regarding comment #35 and your panic, I'll look at your
console logs shortly.

Comment 38 Neil Horman 2008-01-17 00:10:47 UTC

Mike, I've looked over your console logs.  do me a favor and try the following:
in  /etc/sysconfig/kdump, you'll see a line defining the variable
KDUMP_COMMANDLINE_APPEND.  Please add the parameter:
reset_devices
to that variable, along with the others already there.  Restart the kdump
service and try again.  This looks like an old cciss problem that's fixed in
kernels 2.6.18-26.el5 and later, but requires that reset_devices be passed on
the kernel command line for the kdump kernel.  The fix is actually a hack to
work around some firmware issues, IIRC, but should get you going until it's
properly repaired.
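
The edit described above can be sketched as a small, idempotent shell function. This is a sketch under assumptions: `add_reset_devices` is a hypothetical helper name, and it assumes the variable is defined on a single line with a double-quoted value. It is demonstrated on a scratch file instead of the real /etc/sysconfig/kdump.

```shell
#!/bin/sh
# Hypothetical helper: add reset_devices to KDUMP_COMMANDLINE_APPEND in the
# given config file (default /etc/sysconfig/kdump) unless already present.
# Assumes a single-line, double-quoted value.
add_reset_devices() {
    conf=${1:-/etc/sysconfig/kdump}
    # Nothing to do if the parameter is already on the line.
    if grep -q '^KDUMP_COMMANDLINE_APPEND=.*reset_devices' "$conf"; then
        return 0
    fi
    # Insert " reset_devices" just before the closing double quote.
    sed 's/^\(KDUMP_COMMANDLINE_APPEND="[^"]*\)"/\1 reset_devices"/' "$conf" > "$conf.new" &&
        mv "$conf.new" "$conf"
}

# Demonstration on a scratch copy, not on the real configuration file.
scratch=$(mktemp)
printf 'KDUMP_COMMANDLINE_APPEND="irqpoll maxcpus=1"\n' > "$scratch"
add_reset_devices "$scratch"
cat "$scratch"
rm -f "$scratch"
```

After changing the real file, the kdump service still has to be restarted, as the comment says, so that the capture kernel is reloaded with the new command line. Replacing the file through `mv` can change its ownership and permissions, so check those if you adapt this.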

Comment 39 Tomas Henzl 2008-01-17 10:45:53 UTC

Neil,
I think that Mike is trying to solve the same issue you solved in
2.6.18-26.el5, and he is also using the same variable, reset_devices, as you did.
Could I then, once Mike succeeds, remove your patch from the cciss driver?
By that I mean only the part that is in cciss.c, not the general handling of
the reset_devices variable. This part:
@@ -2074,6 +2074,13 @@
 		       ctlr, complete);
 		/* not much we can do. */
 #ifdef CONFIG_CISS_SCSI_TAPE
+		/* We might get notification of completion of commands
+		 * which we never issued in this kernel if this boot is
+		 * taking place after previous kernel's crash. Simply
+		 * ignore the commands in this case.
+		 */
+		if (reset_devices)
+			return 0;
 		return 1;
 	}

Comment 40 Neil Horman 2008-01-17 12:46:24 UTC

Tomas, Yes, you are correct.  Assuming that the test that I asked Mike to
perform in comment 38 is successful, then he is trying to solve the same problem
I did with the changeset you reference in comment 39.  When he manages to fix
it, then yes, you can remove the segment that you reference from the cciss driver.

Comment 41 Mike Miller (OS Dev) 2008-01-17 19:55:08 UTC

I'm getting soft lockups on one of the CPUs during driver initialization. For
whatever reason the stack is not printing on my serial console, but the
interesting part is:

<IRQ> [<ffffffff800b50fa>] softlockup_tick+0xd5/0xe7
[<ffffffff800930e2>] update_process_times+0x42/0x68
[<ffffffff800746e3>] smp_local_timer_interrupt+0x23/0x47
[<ffffffff80074da5>] smp_apic_timer_interrupt+0x41/0x47
[<ffffffff8007bc8e>] apic_timer_interrupt+0x66/0x6c
[<ffffffff8000c505>] __delay+0x8/0x10
[<ffffffff880b8ad4>] :cciss:cciss_init_one+0x295/0x11ed

Everything before this looks like normal init stuff like pci_probe_device, etc.
At this point I've jumped off into the function that resets the controller.
After reset we wait 60 seconds for the controller to become ready. I may be able
to reduce that delay but I know already 20 seconds is not enough. (20 seconds is
our default timeout when polling.)
On my initial development system everything would just stop and wait after the
cciss version printk. Then we did our device discovery and add the disks, etc.
On this 5.1 system (2.6.18-53.el5) after the version printk it halts for a
couple of seconds and then I notice other things like USB initialize. About 10
seconds after the version printk I get the soft lockups. Of course, now we're hosed.
Any thoughts or suggestions?

Comment 42 Don Domingo 2008-01-17 23:15:35 UTC

adding same "Known Issues" release note to RHEL5.2 release note.

Comment 43 Tomas Henzl 2008-01-17 23:34:24 UTC

Mike,
I don't know, but maybe you could post the current version of your patch so that
we can easily help you with debugging.

Comment 44 Mike Miller (OS Dev) 2008-01-18 15:00:11 UTC

Which kernel do you want the patch made against? I'm currently using 2.6.18-53.el5.

Comment 45 Mike Miller (OS Dev) 2008-01-18 16:58:53 UTC

Created attachment 292178 [details]
kdump support for cciss

Arggggh, now the behavior is different. In the earlier testing I was building
an rpm with the kdump support and then installing it on the system. That's when
I saw the soft lockups I reported.
With the attached patch I used a vanilla 2.6.18-53.el5 kernel, applied the
patch, and built a new kernel and modules. I no longer see the soft lockups but
it almost  seems I'm not getting my interrupts. I notice that MSI-X init fails
with a -22. That just means it's an unknown error. At that point I try to get
IOAPIC interrupts. I'm not sure that will work. If I remember correctly you can
go from IOAPIC ----> MSI-X but not the other way around. I tried forcing the
controllers to IOAPIC mode from the beginning and then crashing the system.
I still get failure trying to mount root and the subsequent panic. I don't see
this problem on my original development box.
Please look at the patch and see if anything stands out. Maybe I'm doing
something wrong.

Comment 46 Mike Miller (OS Dev) 2008-01-18 20:31:13 UTC

I managed to capture this dump:

HP CISS Driver (v 3.6.16-RH1)
usb 5-1: new full speed USB device using uhci_hcd and address 2
usb 5-1: configuration #1 chosen from 1 choice
irq 169: nobody cared (try booting with the "irqpoll" option)

Call Trace:
 <IRQ>  [<ffffffff800b5d60>] __report_bad_irq+0x30/0x7d
 [<ffffffff800b5f93>] note_interrupt+0x1e6/0x227
 [<ffffffff800b54a5>] __do_IRQ+0xc7/0x105
 [<ffffffff8006a3bd>] do_IRQ+0xe7/0xf5
 [<ffffffff8005b615>] ret_from_intr+0x0/0xa
 [<ffffffff80010792>] handle_IRQ_event+0x1b/0x58
 [<ffffffff800b5482>] __do_IRQ+0xa4/0x105
 [<ffffffff80011cb4>] __do_softirq+0x5e/0xd5
 [<ffffffff8006a3bd>] do_IRQ+0xe7/0xf5
 [<ffffffff8005b615>] ret_from_intr+0x0/0xa
 <EOI>  [<ffffffff801efe01>] input_print_modalias_bits+0x48/0x95
 [<ffffffff801efee4>] input_print_modalias+0x96/0x1e4
 [<ffffffff801f0bfa>] input_dev_uevent+0x3c4/0x3ff
 [<ffffffff801abf87>] class_uevent+0x1b9/0x1c8
 [<ffffffff80056069>] kobject_get_path+0x99/0xc1
 [<ffffffff80055db8>] kobject_uevent+0x1fb/0x413
 [<ffffffff80100854>] sysfs_create_link+0xfe/0x10c
 [<ffffffff801ac742>] class_device_add+0x2fb/0x44b
 [<ffffffff801f1465>] input_register_device+0xf5/0x271
 [<ffffffff801eabc5>] hidinput_connect+0x1bab/0x1bbc
 [<ffffffff801e76ff>] hid_probe+0xa7f/0xc38
 [<ffffffff801dd94c>] usb_probe_interface+0x6c/0x9e
 [<ffffffff801ab876>] driver_probe_device+0x52/0xaa
 [<ffffffff801ab8ce>] __device_attach+0x0/0x5
 [<ffffffff801ab198>] bus_for_each_drv+0x40/0x72
 [<ffffffff801ab925>] device_attach+0x52/0x5f
 [<ffffffff801aae64>] bus_attach_device+0x1a/0x35
 [<ffffffff801aa247>] device_add+0x24a/0x361
 [<ffffffff801dcaac>] usb_set_configuration+0x36b/0x3f1
 [<ffffffff801d8754>] usb_new_device+0x253/0x2c4
 [<ffffffff801d98b0>] hub_thread+0x74b/0xb10
 [<ffffffff8009b446>] autoremove_wake_function+0x0/0x2e
 [<ffffffff801d9165>] hub_thread+0x0/0xb10
 [<ffffffff8009b283>] keventd_create_kthread+0x0/0x61
 [<ffffffff800321d8>] kthread+0xfe/0x132
 [<ffffffff8005bfb1>] child_rip+0xa/0x11
 [<ffffffff8009b283>] keventd_create_kthread+0x0/0x61
 [<ffffffff800320da>] kthread+0x0/0x132
 [<ffffffff8005bfa7>] child_rip+0x0/0x11

handlers:
[<ffffffff801dadad>] (usb_hcd_irq+0x0/0x55)
[<ffffffff801dadad>] (usb_hcd_irq+0x0/0x55)
Disabling IRQ #169

Here's the command line I use to load the crash kernel:

kexec -p /boot/vmlinuz-rhel51-kdump \
         --initrd=/boot/initrd-rhel51-kdump \
         --append="root=/dev/cciss/c0d0p2 reset_devices 3 irqpoll \
         maxcpus=1 console=ttyS0,115200 console=tty1"

Does anybody see a problem in the command line? The kernel hints to use irqpoll.
I am using that parameter.

On a maybe more positive note: my kernel.org 2.6.22.9 blows up in much the
same way on this platform. 

I am installing rhel5.1 ia32 on the system on which I did the initial
development. I'll post the results when I'm done.

Comment 47 Neil Horman 2008-01-18 20:53:17 UTC

What kexec version are you using, Mike?  A command line length overrun problem
was recently reported to me. I don't have the fix checked in yet, but I have a
patch floating out there for kexec-tools-1.102pre-<rev> to hopefully fix it.

Comment 48 Mike Miller (OS Dev) 2008-01-18 22:08:23 UTC

I have kexec-tools-1.101-194.el5. Send me a link to your patch.

Comment 49 Neil Horman 2008-01-19 01:36:41 UTC

you can find it attached to bz 428310

Comment 50 Mike Miller (OS Dev) 2008-01-22 14:08:33 UTC

I don't have access to that BZ.

Comment 51 Mike Miller (OS Dev) 2008-01-22 15:00:43 UTC

I found the problem!!!! I was specifying the wrong root= in my command line to
load the crash kernel. I was specifying "root=/dev/cciss/c0d0p2" when it should
have been "root=/dev/VolGroup00/LogVol00."

The only thing that bothers me now is the MSI-X init failing with the unknown
error of -22. I did successfully register for an IOAPIC interrupt but I'm
concerned since this was not what I observed during development. Anyone have any
ideas about that behavior?
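
A mismatch like the root= one described above can be caught before crashing the system by comparing the root= value of the running kernel with the one passed to kexec. The following is a rough sketch: `root_param` is a hypothetical helper name, and the `crash_append` string is just the one quoted in Comment 46. A difference is not always an error, so the script only prints a warning.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the last root= parameter found in
# the command line given as $1 (prints nothing if there is none).
root_param() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^root=//p' | tail -n 1
}

# Append string as quoted in Comment 46 (illustrative input only).
crash_append="root=/dev/cciss/c0d0p2 reset_devices 3 irqpoll maxcpus=1"

running=$(root_param "$(cat /proc/cmdline 2>/dev/null)")
wanted=$(root_param "$crash_append")

if [ -n "$running" ] && [ "$running" != "$wanted" ]; then
    echo "warning: crash kernel root=$wanted differs from running root=$running"
fi
```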

Comment 52 Sandy Garza 2008-01-22 21:48:51 UTC

Mike Miller indicates this patch can be submitted for RHEL 5.2. The patch 
supports all architectures.

Comment 54 Tomas Henzl 2008-01-23 09:33:04 UTC

Mike,
thanks for the effort and for solving this problem. I'm sorry I couldn't help you
with debugging, because your patch was running on my hw without problems. 

Is the problem you mentioned in Comment #51 also successfully solved?

Comment 55 Sandy Garza 2008-01-23 16:02:36 UTC

(In reply to comment #54)
> Mike,
> thanks for the effort and for solving this problem.I'm sorry I couldn't help you
> with debugging,because your patch was running on my hw without problems. 
> Is the problem you mentioned in Comment #51 also successfully solved ?

Tomas,
Mike indicated this was an end-user mistype and did not affect the code. He was 
trying to pass an invalid argument.

Thank you.

Comment 56 Mike Miller (OS Dev) 2008-01-23 16:32:05 UTC

I'm still working on the MSI-X init failure.

Comment 57 Sandy Garza 2008-01-23 20:05:25 UTC

(In reply to comment #54)
> Mike,
> thanks for the effort and for solving this problem.I'm sorry I couldn't help you
> with debugging,because your patch was running on my hw without problems. 
> Is the problem you mentioned in Comment #51 also successfully solved ?

Tomas,
Mike indicated this was an end-user mistype and did not affect the code. He was 
trying to pass an invalid argument.

Thank you.

Comment 58 Tomas Henzl 2008-01-24 15:49:44 UTC

(In reply to comment #56)
Mike, 
it seems to me that the problem could be here in pci_enable_msix() (msi.c):
        pci_read_config_word(dev, msi_control_reg(pos), &control);
        if (control & PCI_MSIX_FLAGS_ENABLE)
here ->         return -EINVAL;         /* Already in MSI-X mode */
This could mean that the device is still in msi-x mode, so maybe a call to
pci_disable_msix or other clean up when reset_devices is set could help.
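
The check quoted above tests a single bit of the MSI-X Message Control word. As a rough user-space illustration, the same test can be written with shell arithmetic. PCI_MSIX_FLAGS_ENABLE is bit 15 (0x8000), and EINVAL is errno 22 on Linux, which matches the -22 reported earlier. The `msix_enabled` helper name and the sample control-word values below are assumptions made up for this sketch; they were not read from real hardware.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the MSI-X enable bit (bit 15, 0x8000,
# PCI_MSIX_FLAGS_ENABLE) is set in the Message Control word given as $1.
msix_enabled() {
    [ $(( $1 & 0x8000 )) -ne 0 ]
}

# Made-up sample control words: one with the enable bit clear, one with it set.
for ctrl in 0x0003 0x8003; do
    if msix_enabled "$ctrl"; then
        echo "$ctrl: enable bit set; the quoted check would return -EINVAL (-22)"
    else
        echo "$ctrl: enable bit clear; the quoted check would pass"
    fi
done
```

If the bit is left set by the crashed kernel, the capture kernel sees the device as already in MSI-X mode, which is the scenario the comment above suspects.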

Comment 59 Mike Miller (OS Dev) 2008-01-24 21:35:04 UTC

I tried adding the code to free_irq and pci_disable_msix but it did not resolve
the issue. When I made the call to free_irq the kernel complained that I was
trying to free an already freed IRQ0. 
There are a lot of diffs in the 2.6.22.9 kernel versus the 2.6.18-53.el5.
Glancing thru the code it may be this:

 msix_set_enable(dev, 0);/* Ensure msix is disabled as I set it up */

in msix_capability_init() in drivers/pci/msi.c that resolves the issue. I'm
testing again on a 2.6.16-xx kernel as a sanity check. 
Tomas, do you see the same MSI-X init failure?

Comment 60 Tomas Henzl 2008-01-25 10:25:10 UTC

Yes, on the test machine that is MSI-X capable I see the same failure, and
pci_disable_msix also didn't help. Is the 2.6.22.9 kernel working well with your
patch?

Comment 61 Tony Camuso 2008-01-25 12:26:54 UTC

Mike,

Please let me know how I can help. 

I've had a lot of experience with MSI/MSI-X.

Comment 62 Mike Miller (OS Dev) 2008-01-25 15:38:59 UTC

Tomas,
Yes, testing with the 2.6.22.9 kernel is going well. I see no failures or errors
while booting the crash kernel. In your testing are you able to successfully
boot to the crash kernel, even though the MSI-X init fails? I guess what I'm
asking is are you able to boot up and save off the vmcore file to another system?

Tony,
If you're in Houston please stop by my office or lab.

Comment 63 Tomas Henzl 2008-01-28 15:52:00 UTC

Mike,
I was busy with another task, sorry for the late answer. 
When the MSI-X init fails, the system hangs and the vmcore is not created.

Comment 64 Mike Miller (OS Dev) 2008-01-29 20:20:16 UTC

Tomas,
Can you tell me which kernel, driver, server, and controller you're using? I've
had success on g5 servers using the P400 controller. So I'm a bit stumped on why
your setup is failing. I'm still investigating the initial failure.

Comment 65 Tomas Henzl 2008-01-30 11:58:08 UTC

Created attachment 293393 [details]
Screenshot

Comment 66 Tomas Henzl 2008-01-30 12:01:54 UTC

Mike,
I have two boards in my system; maybe this could be the difference?
# cat /proc/driver/cciss/*
cciss0: HP Smart Array P400 Controller
Board ID: 0x3234103c
Firmware Version: 1.18
IRQ: 130
Logical drives: 1
Sector size: 2048
Current Q depth: 0
Current # commands on controller: 0
Max Q depth since init: 43
Max # commands on controller since init: 159
Max SG entries since init: 31
Sequential access devices: 0

cciss/c0d0:       72.77GB       RAID 1(1+0)
cciss1: HP Smart Array P800 Controller
Board ID: 0x3223103c
Firmware Version: 2.08
IRQ: 162
Logical drives: 0
Sector size: 2048
Current Q depth: 0
Current # commands on controller: 0
Max Q depth since init: 0
Max # commands on controller since init: 1
Max SG entries since init: 0
Sequential access devices: 0
------------------------------------------
The kernel is RHEL5.1; 2.6.18.
In /etc/sysconfig/kdump this is set:
KDUMP_COMMANDLINE_APPEND="irqpoll maxcpus=1 reset_devices"
Kdump is operational.
After echo c > /proc/sysrq-trigger, the system boots the new kernel and then
it locks up - see the screenshot in comment #65.
On this other system (it's not MSI-X capable) the same procedure works and the
vmcore is created.
# cat /proc/driver/cciss/cciss0 
cciss0: HP Smart Array 5i Controller
Board ID: 0x40800e11
Firmware Version: 2.58
IRQ: 169
Logical drives: 1
Sector size: 2048
Current Q depth: 0
Current # commands on controller: 0
Max Q depth since init: 17
Max # commands on controller since init: 122
Max SG entries since init: 31
Sequential access devices: 0

cciss/c0d0:       18.18GB       RAID 0

Comment 67 Mike Miller (OS Dev) 2008-01-30 16:03:58 UTC

I think the problem is the soft lockup and not the MSI-X failure. I also saw
that in my testing. But now I can't remember how I got past that. Getting old
sucks. I'll add a P800 to my configs and see if I can recreate the soft lockup.

Comment 68 Mike Miller (OS Dev) 2008-01-31 16:59:50 UTC

Created attachment 293602 [details]
boot log with p400 and p800 in ML370G5

This boot log shows all of the unexpected IRQ messages during init

Comment 69 Mike Miller (OS Dev) 2008-01-31 17:03:02 UTC

Created attachment 293604 [details]
kdump support patch

Tomas,
Are you using this patch? This is the one I'm using in my testing and although
I see some nasty looking messages about unexpected IRQ's I am still able to
boot up and save the vmcore file. I attached a log that shows those messages.

Comment 70 Tomas Henzl 2008-02-01 10:30:04 UTC

Mike,
Yes, the latest patch is the same as the one from 2008-01-18. To be sure, I'll
send you my cciss.c from the test system.

Comment 73 Tomas Henzl 2008-02-07 16:33:34 UTC

Created attachment 294226 [details]
working + non working version

Mike, after echo c > /proc/sysrq-trigger with the cciss.c from the attachment
I'm getting soft lockups; the cciss.c.orig works. Additionally, there is a
difference at line 2136: the lines
if (reset_devices)
	return 0;
are removed.

Comment 74 Mike Miller (OS Dev) 2008-02-07 19:09:59 UTC

Thank you, Tomas. I'll check out the diffs.

Comment 75 Mike Miller (OS Dev) 2008-02-21 15:57:55 UTC

Are the systems that fail Opteron based? I see at the top of the bug where the
DL385 is listed. I've been doing all my testing on Intel based systems. Do you
have any Intel based HP platforms?

Comment 76 Tomas Henzl 2008-02-25 14:38:21 UTC

The failing system with the MSI-X capability is Intel based (Intel(R) Xeon(R)
CPU 5160 @ 3.00GHz).

Comment 77 Mike Miller (OS Dev) 2008-02-25 23:05:27 UTC

I'm on the road right now returning 20080303. I also have an internal HP
customer seeing the same problems you are. When I get back I'll dig into this
and find the resolution.

Comment 78 Dilip Daya 2008-02-29 20:20:20 UTC

HP ProLiant DL365 with Dual-Core AMD Opteron™ Processor 2216 (2.4 GHz)
- Embedded Smart Array SAS Controller P400i

RHEL5.1 (kernel 2.6.18-53.1.13.el5)
# uname -a
Linux dl365g1 2.6.18-53.1.13.el5 #1 SMP Mon Feb 11 13:27:27 EST 2008 x86_64
x86_64 x86_64 GNU/Linux

# modinfo cciss
filename:       /lib/modules/2.6.18-53.1.13.el5/kernel/drivers/block/cciss.ko
license:        GPL
version:        3.6.16-RH1
description:    Driver for HP Controller SA5xxx SA6xxx version 3.6.16-RH1
...
...

# cat /sys/kernel/kexec_crash_loaded 
1

# echo c > /proc/sysrq-trigger
...
...
Starting RPC idmapd:            [ OK ]
EDAC k8 MC1: GART TLB error: transaction type(generic), cache level(generic)
EDAC k8 MC1: extended error code: GART error
Kernel panic - not syncing: MC1: processort context corrupt

Comment 80 Mike Miller (OS Dev) 2008-03-14 21:31:44 UTC

So after having various problems with interrupts that differed from kernel to
kernel I decided to try a polling mode crash driver. The flow of operation is
"identical" to the interrupt driven code except that I do not do a request_irq.
Instead we start up a thread that calls the interrupt handler to complete
commands. If I use an upstream kernel I can successfully boot in a crashkernel.
When I try the 2.6.18-84.elPAE kernel included in the 20080303 5.2 beta it
fails. It fails in fs/block_dev.c in the do_open function. Specifically:

          if (!part) {
                        struct backing_dev_info *bdi;
                        if (disk->fops->open) {
                                ret = disk->fops->open(bdev->bd_inode, file);
                                if (ret)
                                        goto out_first;
                        }

ret = -16 so we jump to out_first. 

We get to do_open via add_disk -> register_disk. In the upstream kernels these
functions are completely different so it's difficult to tell what's broken. I
also don't know where or how to find disk->fops->open(bdev->bd_inode, file). Can
someone at RH help out here?
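
As a rough illustration of the polling idea described in this comment (this is
not the code from the attached patch), here is a user-space sketch in which a
loop calls the same completion routine that the interrupt path would call. The
names poll_ctlr, complete_one, and poll_completions are hypothetical, and a
real driver would run the loop in a kernel thread with a sleep between passes.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a controller; not the real ctlr_info_t. */
struct poll_ctlr {
    int outstanding;   /* commands submitted but not yet completed */
    int completed;     /* commands the handler has finished */
};

/* Plays the role of the interrupt handler: completes one command if any
 * is pending and reports whether it did any work. */
static bool complete_one(struct poll_ctlr *h)
{
    if (h->outstanding == 0)
        return false;
    h->outstanding--;
    h->completed++;
    return true;
}

/* Polling stand-in for interrupt delivery: call the handler repeatedly,
 * stopping when there is no work left or after max_polls calls.
 * Returns the number of commands completed by this call. */
static int poll_completions(struct poll_ctlr *h, int max_polls)
{
    int done = 0;

    for (int i = 0; i < max_polls; i++) {
        if (!complete_one(h))
            break;   /* nothing left to complete */
        done++;
    }
    return done;
}
```

The bound on the number of polls is an assumption of this sketch; it keeps the
loop from spinning forever if commands never complete.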

Comment 81 Mike Miller (OS Dev) 2008-03-14 22:08:08 UTC

Created attachment 298090 [details]
polling mode patch for kexec/kdump

Here is my patch for polling mode while in a crashkernel.

Comment 82 Chip Coldwell 2008-03-18 15:25:36 UTC

In Documentation/pci.txt:

There are (at least) two really good reasons for using MSI:
	1) MSI is an exclusive interrupt vector by definition.
	   This means the interrupt handler doesn't have to verify
	   its device caused the interrupt.

But I note that 

static inline long interrupt_not_for_us(ctlr_info_t *h)
{
#ifdef CONFIG_CISS_SCSI_TAPE
        return (((h->access.intr_pending(h) == 0) ||
                 (h->interrupts_enabled == 0))
                && (h->scsi_rejects.ncompletions == 0));
#else
        return (((h->access.intr_pending(h) == 0) ||
                 (h->interrupts_enabled == 0)));
#endif
}

static irqreturn_t do_cciss_intr(int irq, void *dev_id, struct pt_regs *regs)
{
        ctlr_info_t *h = dev_id;
        CommandList_struct *c;
        unsigned long flags;
        __u32 a, a1, a2;

        if (interrupt_not_for_us(h))
                return IRQ_NONE;

so even if the cciss driver has MSI/-X enabled, it checks the interrupt source
as if it were a shared interrupt vector.

This isn't really the root of the problem, but a micro-optimization that I
thought I would put here so I don't forget ....

Chip
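
For reference, the micro-optimization mentioned in this comment could look
roughly like the user-space sketch below. The struct and its using_msi flag are
assumptions made for illustration; the thread does not say how the driver
records that MSI/MSI-X is in use, nor whether skipping the interrupts_enabled
test is safe.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the controller state. */
struct msi_ctlr {
    bool intr_pending;        /* device reports a pending interrupt */
    bool interrupts_enabled;  /* driver has interrupts turned on */
    bool using_msi;           /* assumed flag: vector is MSI or MSI-X */
};

/* Returns true when the interrupt should be ignored (IRQ_NONE case). */
static bool interrupt_not_for_us_sketch(const struct msi_ctlr *h)
{
    /* An MSI/MSI-X vector is exclusive, so the source check is skipped. */
    if (h->using_msi)
        return false;

    /* Shared legacy line: verify that this device raised the interrupt. */
    return !h->intr_pending || !h->interrupts_enabled;
}
```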

Comment 83 Chip Coldwell 2008-03-18 16:22:22 UTC

(In reply to comment #80)
> I
> also don't know where or how to find disk->fops->open(bdev->bd_inode, file). Can
> someone at RH help out here?

disk is of type struct gendisk, so fops is of type block_device_operations,
initialized in cciss_init_one to cciss_fops.  From

static struct block_device_operations cciss_fops = {
        .owner = THIS_MODULE,
        .open = cciss_open,
        .release = cciss_release,
        .ioctl = cciss_ioctl,
        .getgeo = cciss_getgeo,
#ifdef CONFIG_COMPAT
        .compat_ioctl = cciss_compat_ioctl,
#endif
        .revalidate_disk = cciss_revalidate,
};

I would infer that the open method is cciss_open.  The return value is -16,
which is -EBUSY, so I would assume you are hitting this

        if (host->busy_initializing || drv->busy_configuring)
                return -EBUSY;

in cciss_open.

Chip
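
To make the -16 / -EBUSY reasoning concrete, here is a small user-space model
of the check quoted above. The struct names and fake_cciss_open are
hypothetical; only the condition itself comes from the quoted driver code.
On Linux, EBUSY is defined as 16 in errno.h.

```c
#include <errno.h>

/* Hypothetical stand-ins for the controller and logical-drive state. */
struct fake_host  { int busy_initializing; };
struct fake_drive { int busy_configuring; };

/* Same shape as the quoted check: refuse to open while either the
 * controller or the logical drive is still being set up. */
static int fake_cciss_open(const struct fake_host *host,
                           const struct fake_drive *drv)
{
    if (host->busy_initializing || drv->busy_configuring)
        return -EBUSY;
    return 0;
}
```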

Comment 84 Mike Miller (OS Dev) 2008-03-18 16:27:04 UTC

I've printed out both host->busy_initializing and drv->busy_configuring. Both
are zero. So unless there's some asynch thread checking before they're cleared
that does not seem to be the issue.

Comment 85 Mike Miller (OS Dev) 2008-03-18 16:29:31 UTC

Can we skip the MSI issues and try to resolve this polling problem? Different
kernels exhibit different interrupt problems so polling may be the best option.

Comment 86 Chip Coldwell 2008-03-18 16:38:16 UTC

(In reply to comment #84)
> I've printed out both host->busy_initializing and drv->busy_configuring. Both
> are zero. So unless there's some asynch thread checking before they're cleared
> that does not seem to be the issue.

Were you able to verify that disk->fops->open == cciss_open?

Chip

Comment 87 Mike Miller (OS Dev) 2008-03-18 18:57:47 UTC

Not yet. But I think you're right about the busy_initializing. I added a flag to
skip that test in the polling driver and it wants to boot. I need to pull some
more debug out but it looks promising.  :)

After I test it a bit more I'll post the latest patch.

-- mikem

Comment 88 Mike Miller (OS Dev) 2008-03-18 20:36:47 UTC

Created attachment 298451 [details]
polling mode patch for kexec/kdump redone

This patch enables kdump support for the cciss driver. If we're booting to a
crashkernel we use polling mode in the driver to avoid any interrupt related
issues. Different kernels exhibit different failures when using interrupts
including failing to get an MSI-X vector or interrupt sharing issues when
there are multiple controllers in the system.
The downside of this approach is that we must wait approximately 1 minute for
each controller to complete initialization after the reset. This could be
mitigated by initializing only the first controller. That should be adequate
since /proc will only exist on the root filesystem. I'm looking for feedback
on this assumption.
Please review and test this patch in your labs.

NOTE: Since upstream kernels do not exhibit the MSI-X failures the upstream
patch may be significantly different than this patch.

Comment 90 Chip Coldwell 2008-03-19 15:33:42 UTC

(In reply to comment #88)
> Created an attachment (id=298451) [edit]
> polling mode patch for kexec/kdump redone

A bit of debugging code leaked through; I'm dropping this:

diff --git a/fs/block_dev.c b/fs/block_dev.c
index d7b9a66..a6a06d5 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -899,6 +899,7 @@ static int do_open(struct block_device *bdev, struct file *file, int for_part)
                        struct backing_dev_info *bdi;
                        if (disk->fops->open) {
                                ret = disk->fops->open(bdev->bd_inode, file);
+                               printk("cciss: do_open: ret = %d\n", ret);
                                if (ret)
                                        goto out_first;
                        }

Comment 91 Chip Coldwell 2008-03-19 15:36:14 UTC

(In reply to comment #88)
> Created an attachment (id=298451) [edit]
> polling mode patch for kexec/kdump redone

-ENOCOMPILE

  CC [M]  drivers/block/cciss.o
/usr/src/kernel/rhel5/src/kernel/drivers/block/cciss.c: In function
‘cciss_seq_show’:
/usr/src/kernel/rhel5/src/kernel/drivers/block/cciss.c:350: warning: format ‘%d’
expects type ‘int’, but argument 4 has type ‘loff_t’
/usr/src/kernel/rhel5/src/kernel/drivers/block/cciss.c: In function
‘cciss_init_one’:
/usr/src/kernel/rhel5/src/kernel/drivers/block/cciss.c:3444: error: ‘reset_c0’
undeclared (first use in this function)
/usr/src/kernel/rhel5/src/kernel/drivers/block/cciss.c:3444: error: (Each
undeclared identifier is reported only once
/usr/src/kernel/rhel5/src/kernel/drivers/block/cciss.c:3444: error: for each
function it appears in.)
/usr/src/kernel/rhel5/src/kernel/drivers/block/cciss.c: In function
‘cciss_completion_thread’:
/usr/src/kernel/rhel5/src/kernel/drivers/block/cciss.c:3792: warning: unused
variable ‘i’
make[2]: *** [drivers/block/cciss.o] Error 1
make[1]: *** [_module_drivers/block] Error 2
make: *** [modules] Error 2

Comment 92 Mike Miller (OS Dev) 2008-03-19 16:13:14 UTC

Created attachment 298533 [details]
updated kdump patch

Sorry, I was doing a little hacking by hand and left a variable that I should
have deleted. I also cleaned up the warnings. This patch has been compile
tested.

I'm wondering if a spinlock around
 
			if (host->busy_initializing || drv->busy_configuring)
				  return -EBUSY;

may be better than just skipping the test. Comments?
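
A minimal user-space sketch of that idea follows, assuming the flags are only
ever written and read under one lock. It uses a C11 atomic_flag as a stand-in
spinlock, and for brevity it keeps both flags in one hypothetical struct; in
the driver they live in separate host and drive structures and the lock would
be a kernel spinlock.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical combined state; not the driver's real layout. */
struct locked_host {
    atomic_flag lock;        /* minimal spinlock for this sketch */
    int busy_initializing;
    int busy_configuring;
};

static void host_lock(struct locked_host *h)
{
    while (atomic_flag_test_and_set_explicit(&h->lock, memory_order_acquire))
        ;  /* spin until the current holder releases the lock */
}

static void host_unlock(struct locked_host *h)
{
    atomic_flag_clear_explicit(&h->lock, memory_order_release);
}

/* Writer side: initialization code changes the flag under the lock. */
static void set_initializing(struct locked_host *h, int value)
{
    host_lock(h);
    h->busy_initializing = value;
    host_unlock(h);
}

/* Reader side: the open path samples both flags under the same lock. */
static int locked_open_check(struct locked_host *h)
{
    int busy;

    host_lock(h);
    busy = h->busy_initializing || h->busy_configuring;
    host_unlock(h);
    return busy ? -EBUSY : 0;
}
```

The lock only guarantees that the open path sees a consistent value; it still
returns -EBUSY if initialization has not finished when open is called.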

Comment 93 Chip Coldwell 2008-03-20 18:39:10 UTC

A test kernel which includes this patch is available from

http://people.redhat.com/coldwell/kernel/bugs/230717/

Chip

Comment 94 Chip Coldwell 2008-03-20 19:26:39 UTC

I tried kexec with this kernel on a DL-585 and it failed.  Here's what I did:

# kver=`uname -r`
# kexec -l /boot/vmlinuz-$kver --initrd=/boot/initrd-$kver.img \
    --command-line="`cat /proc/cmdline`"
# reboot

During the kexec boot, I get this message first

irq 177: nobody cared (try booting with the "irqpoll" option)           
                                                                                
Call Trace:                                                                     
 <IRQ>  [<ffffffff800b799e>] __report_bad_irq+0x30/0x7d                         
 [<ffffffff800b7bd1>] note_interrupt+0x1e6/0x227                                
 [<ffffffff800b70db>] __do_IRQ+0xbd/0x103                          
 [<ffffffff80011e47>] __do_softirq+0x5e/0xd6                                    
 [<ffffffff8006c3e1>] do_IRQ+0xe7/0xf5                                         
 [<ffffffff8006ad28>] default_idle+0x0/0x50                                     
 [<ffffffff8005d615>] ret_from_intr+0x0/0xa                                    
 <EOI>  [<ffffffff8006ad51>] default_idle+0x29/0x50                             
 [<ffffffff80048a90>] cpu_idle+0x95/0xb8                                        
 [<ffffffff803d9801>] start_kernel+0x220/0x225                                 
 [<ffffffff803d922f>] _sinittext+0x22f/0x236                                    
                                                                                
handlers:                                                          
[<ffffffff8811ab1b>] (do_cciss_intr+0x0/0x8b7 [cciss])             
Disabling IRQ #177                                                 

followed by an oops that ends with this:

RBP: ffff8103fe68c7c8 R08: 0000000000000001 R09: ffff8100010503d4
R10: 0000000000000010 R11: ffffffff8015bb8e R12: ffff8103fe0ebae0
R13: ffff8103fe0ebae0 R14: ffff8103fe0ebad8 R15: ffff8103fe68c8c8
FS:  00000000110d98f0(0063) GS:ffff8103ffe6cbc0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000110ef000 CR3: 00000003ffe1d000 CR4: 00000000000006e0
Process init (pid: 1, threadinfo ffff8103fff18000, task ffff8103fff037a0)
Stack:  ffff8103fe68c7c0 ffff8103fdc7e680 ffff8103fe0ebae0 ffffffff8005467b
 ffff8103fe6af050 ffffffff80025656 ffff8103fff19f38 ffff8103fdc7e680
 00000000fffffffe ffff8103fe6af050 ffff8103fe6af108 ffffffff80025656
Call Trace:
 [<ffffffff8005467b>] sysfs_readdir+0x14f/0x171
 [<ffffffff80025656>] filldir+0x0/0xb7
 [<ffffffff80025656>] filldir+0x0/0xb7
 [<ffffffff80034dce>] vfs_readdir+0x77/0xa9
 [<ffffffff80038677>] sys_getdents+0x75/0xbd
 [<ffffffff8002e55c>] sys_fcntl+0x2d0/0x2dc
 [<ffffffff8005d116>] system_call+0x7e/0x83


Code: 0f 0b 68 89 ba 29 80 c2 1a 00 48 8b 55 00 48 39 da 74 1b 48
RIP  [<ffffffff801466b6>] __list_add+0x24/0x68
 RSP <ffff8103fff19e88>
 <0>Kernel panic - not syncing: Fatal exception

Not sure if the oops had anything to do with cciss.

I will try adding "irqpoll" to the command line and see what happens.

Chip

Comment 95 Chip Coldwell 2008-03-20 19:42:37 UTC

With "irqpoll" I still get the "nobody cared" message, but the kexec kernel does
boot.

Chip

Comment 96 Neil Horman 2008-03-20 19:58:51 UTC

You should always specify irqpoll on the command line when doing a kexec, pretty
much as a rule.  kexec has a difficult time quiescing and re-assigning interrupts
during a reboot.
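
If it helps when scripting tests, a check like the following could confirm that
a command line (for example the contents of /proc/cmdline) actually carries the
option before kexec is run. This is only a sketch; cmdline_has_option is a
hypothetical helper, and it matches whole space-separated words only.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true if opt appears in cmdline as a whole space-separated word,
 * e.g. "irqpoll" in "ro root=/dev/sda1 irqpoll". A trailing newline, as
 * found when reading /proc/cmdline, also ends a word. */
static bool cmdline_has_option(const char *cmdline, const char *opt)
{
    size_t len = strlen(opt);
    const char *p = cmdline;

    if (len == 0)
        return false;

    while ((p = strstr(p, opt)) != NULL) {
        bool starts_word = (p == cmdline) || (p[-1] == ' ');
        bool ends_word = (p[len] == '\0') || (p[len] == ' ') ||
                         (p[len] == '\n');

        if (starts_word && ends_word)
            return true;
        p += len;   /* keep searching after this partial match */
    }
    return false;
}
```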

Comment 97 Mike Miller (OS Dev) 2008-03-20 21:15:39 UTC

Since we're polling in cciss, interrupts are explicitly turned off, so I doubt
the spurious interrupt is related to cciss.

Comment 98 Mike Miller (OS Dev) 2008-03-21 20:17:37 UTC

Created attachment 298806 [details]
cleanup debug in kdump patch

My testing looks good on both Intel and AMD Proliants. 
There was still some debug that bled thru. This patch cleans that up. It also
adds a spin_lock around the busy_initializing test. This approach is safer than
just bypassing the test in a crashkernel.

Comment 99 Doug Chapman 2008-03-24 22:04:28 UTC

I built an ia64 kernel with this patch and it resolves the issue we were seeing.
 Our problem presented itself as a hang rather than a panic however.  With the
patch the kexec'ed kernel runs properly (kdump still broken on ia64 but for
other unrelated reasons).

Comment 100 Tomas Henzl 2008-03-26 14:37:02 UTC

I built a kernel combining the patches from comments #98 and #92 and removed
the part mentioned in comment #39.
On an i686 everything works well, but on the other machine with MSI capability
(x86-64 system) I'm still getting the soft lockup when cciss.ko is loading
(kdump). The kexec also doesn't work.

Comment 101 Doug Chapman 2008-03-26 15:17:04 UTC

Would it be possible to upload _one_ patch that is the current version and
obsolete all the others?  There are so many patches attached here that I am not
sure if I tested with the right one or not.

Comment 102 Mike Miller (OS Dev) 2008-03-26 16:24:18 UTC

Created attachment 299186 [details]
This patch obsoletes all others in this bug

This patch combines the polling patch and the cleanup patch into one. This one
patch obsoletes all others in this bug. Chip, can you build and post a kernel
using this patch?

-- mikem

Comment 103 Mike Miller (OS Dev) 2008-03-26 17:42:30 UTC

Created attachment 299200 [details]
My apologies, the last patch had compile warnings.

The last patch generated compile time warnings. This cleans that up and is the
only patch that should be used for testing kdump.

This patch definitely obsoletes all others. Chip, please use this patch to
build and post a new kernel.

Comment 104 Chip Coldwell 2008-03-26 18:20:48 UTC

If I understand comment #96 and comment #97 correctly, we have implemented a
polling mode in the driver on top of a polling mode in the kernel.  Is that correct?

Comment 105 Chip Coldwell 2008-03-26 18:25:09 UTC

(In reply to comment #103)
> Created an attachment (id=299200) [edit]
> My apologies, the last patch had compile warnings.

I'm still getting them with the current patch:

/usr/src/kernel/rhel5/src/kernel/drivers/block/cciss.c:193: warning: ‘print_cmd’
declared ‘static’ but never defined

Looks like more debugging code that leaked through.

Chip

Comment 106 Mike Miller (OS Dev) 2008-03-26 19:32:40 UTC

I don't understand how. The last hunk of the patch:

@@ -293,6 +293,8 @@ print_bytes (unsigned char *c, int len,
        }
 }

+#endif
+
 static void
 print_cmd(CommandList_struct *cp)
 {
@@ -339,8 +341,6 @@ print_cmd(CommandList_struct *cp)

 }

-#endif
-
 static int
 find_bus_target_lun(int ctlr, int *bus, int *target, int *lun)
 {

moves the endif to before print_cmd. Can you ensure you actually have the right
patch? I obsoleted all others with my last post.

For comment #104 the kernel is not polling. All other devices get an interrupt.
We poll in cciss to work around the various interrupt related issues. For
instance in 2.6.25-rc6 interrupt driven mode works fine even in the crashkernel.
The MSI/MSI-X is so different between -18.xxel5 and .25-rc6 you'd think you were
looking at 2 different OS's.

Comment 107 Chip Coldwell 2008-03-27 17:49:47 UTC

Fresh kernels here:

http://people.redhat.com/coldwell/kernel/bugs/230717/

Chip

Comment 108 Doug Chapman 2008-03-27 19:29:37 UTC

(In reply to comment #107)
> Fresh kernels here:
> 
> http://people.redhat.com/coldwell/kernel/bugs/230717/
> 
> Chip
> 

Kdump working fine w/cciss using this kernel on ia64.

Comment 109 Don Domingo 2008-03-30 22:51:34 UTC

as per previous comment, release note revised, added to RHEL5.2 "Resolved Issues":

<quote>
(x86;x86_64;ia64) Crash dumping through kexec and kdump now functions reliably
with HP Smart Array controllers. Note that these controllers use the cciss driver.
</quote>

please advise if any further revisions are required. thanks!

Comment 110 Mike Miller (OS Dev) 2008-04-01 18:08:38 UTC

Chip, Can you post the kernel-*-devel-2.6.18-86* packages for me? We're trying
to build rpms for the test groups and need those packages for our build environment.

Thanks,
mikem

Comment 111 Chip Coldwell 2008-04-01 19:25:44 UTC

(In reply to comment #110)
> Chip, Can you post the kernel-*-devel-2.6.18-86* packages for me? We're trying
> to build rpms for the test groups and need those packages for our build
> environment.

OK, done.

Chip

Comment 112 Mike Miller (OS Dev) 2008-04-01 19:31:54 UTC

Thanks.

Comment 113 Chip Coldwell 2008-04-01 20:41:17 UTC

We have a system that is exhibiting soft lockups with this latest kernel.
The BIOS banner shows


                            4096 MB Installed

ProLiant System BIOS - P57 (11/08/2006)
Copyright 1982, 2006 Hewlett-Packard Development Company, L.P. 


Proc 1: Dual-Core Intel(R) Xeon(TM) Processor (3.00 GHz/1333 MHz, 4MB L2)
Proc 2: Dual-Core Intel(R) Xeon(TM) Processor (3.00 GHz/1333 MHz, 4MB L2)
Power Regulator Mode: Dynamic Power Savings

Advanced Memory Protection Mode: Advanced ECC Support
Redundant ROM Detected - This system contains a valid backup system ROM.

Integrated Lights-Out 2 Advanced                          
iLO 2 v1.26 Nov 17 2006 192.168.52.196

Slot 1  HP Smart Array P400 Controller       (512MB, v1.18)   1 Logical Drive
Slot 7  HP Smart Array P800 Controller       (512MB, v2.08)   0 Logical Drives

After bringing up the system, I run

# export kver=`uname -r`
# kexec -l /boot/vmlinuz-$kver --initrd=/boot/initrd-$kver.img \
    --command-line="`cat /proc/cmdline` irqpoll reset_devices"
# reboot

The kexec kernel does not get beyond loading the cciss driver, and these
messages are displayed on the console every 10 seconds:

BUG: soft lockup - CPU#2 stuck for 10s! [insmod:479]
CPU 2:
Modules linked in: cciss(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) ehci_hcd(U)
ohci_hcd(U) uhci_hcd(U)
Pid: 479, comm: insmod Tainted: G      2.6.18-86.el5.bz230717 #1
RIP: 0010:[<ffffffff8000c5f9>]  [<ffffffff8000c5f9>] __delay+0xa/0x10
RSP: 0018:ffff81012ea9dd70  EFLAGS: 00000212
RAX: 00000000002d9dda RBX: ffff810037f4c000 RCX: 0000000073c24819
RDX: 0000000000000098 RSI: ffff810037f4c000 RDI: 00000000002dc493
RBP: 0000000000000000 R08: 0000000000000000 R09: ffff81000565cffc
R10: 0000000000008000 R11: ffff81012f9a0000 R12: ffff81012fd35870
R13: ffff81012ff9c000 R14: ffffffff80149d80 R15: 0000000000000202
FS:  0000000016fe8850(0063) GS:ffff81012ff24e40(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000170085df CR3: 000000012e8e5000 CR4: 00000000000006e0

Call Trace:
 [<ffffffff880b7bff>] :cciss:cciss_init_one+0x272/0x11d3
 [<ffffffff80062efb>] thread_return+0x0/0xdf
 [<ffffffff8014f2d0>] pci_device_probe+0x100/0x180
 [<ffffffff801aef9d>] driver_probe_device+0x52/0xaa
 [<ffffffff801af0cc>] __driver_attach+0x65/0xb6
 [<ffffffff801af067>] __driver_attach+0x0/0xb6
 [<ffffffff801ae9de>] bus_for_each_dev+0x43/0x6e
 [<ffffffff801ae624>] bus_add_driver+0x7e/0x130
 [<ffffffff8014f4a8>] __pci_register_driver+0x4b/0x6c
 [<ffffffff800a3d4d>] sys_init_module+0xaf/0x1e8
 [<ffffffff8005d116>] system_call+0x7e/0x83

Comment 114 Chip Coldwell 2008-04-01 20:54:39 UTC

(In reply to comment #113)

> Call Trace:
>  [<ffffffff880b7bff>] :cciss:cciss_init_one+0x272/0x11d3

This corresponds to the function

static int cciss_reset_controller(struct pci_dev *pdev,
        int c, ctlr_info_t *hba)

in particular, it is getting stuck in this loop:


        /* Wait some time for the scratchpad to be reset. */
        do {
                mdelay(25);
                scratchpad = readl(hba->vaddr + SA5_SCRATCHPAD_OFFSET);
        } while (scratchpad == CCISS_FIRMWARE_READY);
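
One way to keep a loop like this from spinning forever is to bound the number
of reads. The sketch below is a user-space model under stated assumptions, not
a proposed driver change: FIRMWARE_READY is a placeholder value, the register
read goes through a hypothetical callback instead of readl(), and fake_read
exists only so the example can be exercised.

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder value for illustration; the driver defines its own constant. */
#define FIRMWARE_READY 0xffff0000u

/* Hypothetical register accessor standing in for readl(). */
typedef uint32_t (*read_scratchpad_fn)(void *ctx);

/* Poll until the scratchpad no longer reads FIRMWARE_READY, which is the
 * condition the quoted loop waits for, but give up after max_tries reads.
 * Returns 0 once the value changes, or -ETIMEDOUT if it never does. */
static int wait_for_scratchpad_reset(read_scratchpad_fn rd, void *ctx,
                                     int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        /* the driver would delay here, e.g. mdelay(25), between reads */
        if (rd(ctx) != FIRMWARE_READY)
            return 0;
    }
    return -ETIMEDOUT;
}

/* Fake register for demonstration: reads FIRMWARE_READY for the first
 * reads_until_reset reads, then reads 0. */
struct fake_reg { int reads_until_reset; };

static uint32_t fake_read(void *ctx)
{
    struct fake_reg *r = (struct fake_reg *)ctx;

    if (r->reads_until_reset > 0) {
        r->reads_until_reset--;
        return FIRMWARE_READY;
    }
    return 0;
}
```

With a bound in place, a controller whose scratchpad never changes would cause
an error return from the reset path rather than a soft lockup.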

Comment 115 Don Domingo 2008-04-02 02:14:28 UTC

Hi,
the RHEL5.2 release notes will be dropped to translation on April 15, 2008, at
which point no further additions or revisions will be entertained.

a mockup of the RHEL5.2 release notes can be viewed at the following link:
http://intranet.corp.redhat.com/ic/intranet/RHEL5u2relnotesmockup.html

please use the aforementioned link to verify if your bugzilla is already in the
release notes (if it needs to be). each item in the release notes contains a
link to its original bug; as such, you can search through the release notes by
bug number.

Cheers,
Don

Comment 118 Mike Miller (OS Dev) 2008-04-02 20:42:20 UTC

Do you see anything like:

ACPI: PCI Interrupt 0000:00:1d.0[A] -> GSI 16 (level, low) -> IRQ 169
uhci_hcd 0000:00:1d.0: UHCI Host Controller
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 1
uhci_hcd 0000:00:1d.0: irq 169, io base 0x00001000
irq 169: nobody cared (try booting with the "irqpoll" option)

Call Trace:
 <IRQ>  [<ffffffff800b799e>] __report_bad_irq+0x30/0x7d
 [<ffffffff800b7bd1>] note_interrupt+0x1e6/0x227
 [<ffffffff800b70db>] __do_IRQ+0xbd/0x103
 [<ffffffff8006c3e1>] do_IRQ+0xe7/0xf5
 [<ffffffff8005d615>] ret_from_intr+0x0/0xa  [<ffffffff80064aa8>]
_spin_unlock_irqrestore+0x8/0x9
 [<ffffffff801f12d2>] i8042_interrupt+0x42/0x1ec
 [<ffffffff800108f3>] handle_IRQ_event+0x29/0x58
 [<ffffffff800b70c2>] __do_IRQ+0xa4/0x103
 [<ffffffff8006c3e1>] do_IRQ+0xe7/0xf5
 [<ffffffff8005d615>] ret_from_intr+0x0/0xa
 [<ffffffff8015bb8e>] vgacon_cursor+0x0/0x1a5
 [<ffffffff80011e3c>] __do_softirq+0x53/0xd6
 [<ffffffff8005e2fc>] call_softirq+0x1c/0x28
 [<ffffffff8006c55e>] do_softirq+0x2c/0x85
 [<ffffffff8005dc8e>] apic_timer_interrupt+0x66/0x6c
 <EOI>  [<ffffffff8015bb8e>] vgacon_cursor+0x0/0x1a5
 [<ffffffff8008fd09>] vprintk+0x290/0x2dc
 [<ffffffff800fd524>] proc_mkdir_mode+0x4c/0x63
 [<ffffffff800b8a66>] register_handler_proc+0x9e/0xb0
 [<ffffffff8008fda7>] printk+0x52/0xbd
 [<ffffffff800b7681>] setup_irq+0x178/0x1c1
 [<ffffffff801de557>] usb_hcd_irq+0x0/0x55
 [<ffffffff800b777a>] request_irq+0xb0/0xd6
 [<ffffffff801de276>] usb_add_hcd+0x2fc/0x52b
 [<ffffffff801e643d>] usb_hcd_pci_probe+0x1e4/0x28b
 [<ffffffff8014f2d0>] pci_device_probe+0x100/0x180
 [<ffffffff801aef9d>] driver_probe_device+0x52/0xaa
 [<ffffffff801af0cc>] __driver_attach+0x65/0xb6
 [<ffffffff801af067>] __driver_attach+0x0/0xb6
 [<ffffffff801ae9de>] bus_for_each_dev+0x43/0x6e
 [<ffffffff801ae624>] bus_add_driver+0x7e/0x130
 [<ffffffff8014f4a8>] __pci_register_driver+0x4b/0x6c
 [<ffffffff88011060>] :uhci_hcd:uhci_hcd_init+0x60/0xa7
 [<ffffffff800a3d4d>] sys_init_module+0xaf/0x1e8
 [<ffffffff8005d116>] system_call+0x7e/0x83

handlers:
[<ffffffff801de557>] (usb_hcd_irq+0x0/0x55)
Disabling IRQ #169

before you see the cciss driver load?

Comment 119 Don Domingo 2008-04-03 01:49:08 UTC

hmmm... since still unresolved, reverting back to old release note:

<quote>
(x86;x86_64;ia64) Crash dumping through kexec and kdump may not function
reliably with HP Smart Array controllers. Note that these controllers use the
cciss driver.
</quote>

please advise (before April 15) if any further revisions are required. thanks!

Comment 120 Sandy Garza 2008-04-03 13:46:40 UTC

I thought this Bugzilla was a "blocker" for RHEL 5.2. If so, we need to allow 
for further modifications to the release notes until this issue is resolved.

Comment 122 Mike Miller (OS Dev) 2008-04-03 15:01:41 UTC

I can only reproduce this failure on an ML370. From the info in comment #113 it
appears the same is true in Red Hat's lab. I'm working now to determine what's
different on the ML370 from the other systems I've tested. Those systems include
the ML570 G4, DL580 G5, and DL385 G5.

Comment 123 Tom Coughlan 2008-04-04 18:39:49 UTC

(In reply to comment #120)
> I thought this Bugzilla was a "blocker" for RHEL 5.2. If so, we need to allow 
> for further modifications to the release notes until this issue is resolved.

It is a blocker for the release. I am not sure whether we will block the release
notes from going to translation on April 15, though. Instead, maybe we can put
something in the release notes that says "refer to the web-based release note
updates for the latest status on cciss with kdump".

Comment 124 Mike Miller (OS Dev) 2008-04-04 19:45:53 UTC

I have been able to determine the problem you're seeing on the ML370 is the old
firmware on the P400. For whatever reason after we issue the reset message the
scratchpad register never gets reset. That's why we loop forever. Please update
the firmware using the  iso image available at: 

http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareIndex.jsp?lang=en&cc=us&prodNameId=1157689&prodTypeId=329290&prodSeriesId=1157687&swLang=8&taskId=135&swEnvOID=4004#2913

Unzip the image, burn a CD, boot to the CD, click on the Firmware Update tab,
then click install.
Please update the firmware on both controllers and re-run your test.

Comment 126 Tomas Henzl 2008-04-08 13:33:29 UTC

Hi,
I had problems with the remote fw update; the good news is that kexec+kdump is
now working for me on the ML370.

Chip, are you also satisfied with the patch now?

The latest kernels can be found at
http://brewweb.devel.redhat.com/brew/taskinfo?taskID=1251415

Comment 127 Mike Miller (OS Dev) 2008-04-08 13:58:31 UTC

Chip,
We have some testers reporting the /proc/vmcore file is size zero. Any ideas
what may cause this? I have not seen that problem in my testing.

Comment 128 Doug Chapman 2008-04-08 14:15:35 UTC

(In reply to comment #127)
> Chip,
> We have some testers reporting the /proc/vmcore file is size zero. Any ideas
> what may cause this? I have not seen that problem in my testing.

Mike,

Was that on ia64?  If so that sounds like this issue:
https://bugzilla.redhat.com/show_bug.cgi?id=434927

- Doug

Comment 129 Chip Coldwell 2008-04-08 14:36:54 UTC

(In reply to comment #126)
> Hi,
> I had problems with the remote fw update, good news is that now is kexec+kdump
> working for me on the ML370.
> 
> Chip, are you also satisfied with patch now ?
> 
> Latest kernels could be found on
> http://brewweb.devel.redhat.com/brew/taskinfo?taskID=1251415

I can kexec with this kernel, although I still get this message on bootup:

irq 177: nobody cared (try booting with the "irqpoll" option)

Call Trace:
 <IRQ>  [<ffffffff800b7a5c>] __report_bad_irq+0x30/0x7d
 [<ffffffff800b7c8f>] note_interrupt+0x1e6/0x227
 [<ffffffff800b7199>] __do_IRQ+0xbd/0x103
 [<ffffffff80011ed2>] __do_softirq+0x5e/0xd6
 [<ffffffff8006c3e1>] do_IRQ+0xe7/0xf5
 [<ffffffff8006ad28>] default_idle+0x0/0x50
 [<ffffffff8005d615>] ret_from_intr+0x0/0xa
 <EOI>  [<ffffffff800d464b>] cache_reap+0x0/0x219
 [<ffffffff8006ad51>] default_idle+0x29/0x50
 [<ffffffff80048ae4>] cpu_idle+0x95/0xb8
 [<ffffffff803d9801>] start_kernel+0x220/0x225
 [<ffffffff803d922f>] _sinittext+0x22f/0x236

handlers:
[<ffffffff8812cb1b>] (do_cciss_intr+0x0/0x8b7 [cciss])
Disabling IRQ #177

(I am booting the kexec kernel with the "irqpoll" option.)

Comment 130 Mike Miller (OS Dev) 2008-04-08 15:52:30 UTC

For comment #128: this is not ia64.
For comment #129: I sometimes see similar messages such as what I posted in
comment #118. I don't think it has anything to do with cciss. Comments?

Comment 132 Chip Coldwell 2008-04-08 20:36:43 UTC

I've instrumented the code a little bit and found that the reason we get this error

cciss: MSI-X init failed -22

is because of this test in drivers/pci/msi.c:pci_enable_msix

        pci_read_config_word(dev, msi_control_reg(pos), &control);
        if (control & PCI_MSIX_FLAGS_ENABLE)
                return -EINVAL;                 /* Already in MSI-X mode */

In the kexec kernel, the value read from the control register is 0x00008003 and

#define PCI_MSIX_FLAGS_ENABLE           (1 << 15)

In other words, MSIX was enabled by the pre-kexec kernel, and this state remains
in the PCI configuration memory of the device after the kexec kernel starts.  So
this check succeeds, and thus the kexec kernel fails to allocate an MSIX vector
to the CCISS device.
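
The bit arithmetic can be checked in isolation. The sketch below decodes the
low 16 bits of the value reported above (0x8003) as an MSI-X Message Control
word. The macro and helper names are made up for this example; the layout
assumed is the one in the PCI specification, with bit 15 as the enable bit and
the low 11 bits holding the table size minus one.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSIX_FLAGS_ENABLE (1u << 15)  /* same bit as PCI_MSIX_FLAGS_ENABLE */
#define MSIX_FLAGS_QSIZE  0x07ffu     /* low 11 bits: table size minus one */

/* True when the MSI-X enable bit is set in the control word. */
static bool msix_enabled(uint16_t control)
{
    return (control & MSIX_FLAGS_ENABLE) != 0;
}

/* Number of MSI-X table entries encoded in the control word. */
static unsigned msix_table_size(uint16_t control)
{
    return (control & MSIX_FLAGS_QSIZE) + 1u;
}

/* Control word with the enable bit cleared and all other bits kept. */
static uint16_t msix_clear_enable(uint16_t control)
{
    return (uint16_t)(control & ~MSIX_FLAGS_ENABLE);
}
```

For 0x8003 this gives an enabled function with a four-entry table, and clearing
the enable bit leaves 0x0003, which would pass the quoted check.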

I took a peek upstream, and it appears that the sanity checks in pci_enable_msix
have been consolidated into one function, pci_msi_check_device,

commit 24334a12533e9ac70dcb467ccd629f190afc5361
Author: Brice Goglin <brice>
Date:   Thu Aug 31 01:55:07 2006 -0400

    MSI: Factorize common code in pci_msi_supported()
    
    pci_enable_msi() and pci_enable_msix() use the same code to detect
    whether MSI might be enabled on this device. Factorize this code in
    pci_msi_supported(). And improve the documentation about the fact
    that only the root chipset must support MSI, but it is hard to
    find the root bus so we check all parent busses MSI flags.
    
    Signed-off-by: Brice Goglin <brice>
    Signed-off-by: Greg Kroah-Hartman <gregkh>

As Mike Miller pointed out, there has been a fair amount of churn in the MSI
code upstream.  It appears that during this process, this particular test
(checking the PCI configuration space MSI control register to see if MSI is
already enabled) got dropped, although I haven't been able to find a commit log
that specifically says this was intentional.

I am continuing to dig around.

Chip

Comment 133 Mike Miller (OS Dev) 2008-04-08 20:45:24 UTC

I also came to that conclusion but I'm not sure it's absolutely correct. I
notice on my Opteron systems that MSI-X always fails even when booting a fresh
"production" kernel. I'm adding debug to see how the MSI-X table is being
initialized. I also got my hands on a Hardware Diagnostic Tool to try and see
what the hardware tells me. Now I'm getting dangerous. :)

Comment 134 Bryan Stillwell 2008-04-08 21:16:12 UTC

I've tested the 2.6.18-86.el5.bz230717 kernel RPM on rhel5.2s3 (installed using
--oldpackage) and I was able to do multiple kdumps on two different ia64-based
rx6600s (one with 4GiB memory and the other with 96GiB).  I used
'crashkernel=768M' on both machines.

I've also tried using 2.6.18-86.el5.bz230717 on rhel5.2s2, but it fails there. 
Even after upgrading kexec-tools to the rhel5.2s3 version (1.102pre-16.el5) it
wouldn't work.  You really need to start with an rhel5.2s3 install.

Chuck Morrison tested kdump using the 2.6.18-86.el5.bz230717 kernel on rhel5.2s3
on a BL860c and it worked on that machine.

Marilise Cover also tested kdump on an rx2660 using the same
snapshot3+2.6.18-86.el5.bz230717 combination with the same positive result.

Comment 135 Tomas Henzl 2008-04-09 11:13:04 UTC

Chip, for comment #132:
when reset_devices is set (and it should be with kdump), then
pci_enable_msix shouldn't even be called.
/* If the kernel supports MSI/MSI-X we will try to enable that functionality,
 * else we use the IO-APIC interrupt assigned to us by system ROM. If we're
 * booting into a crashkernel we use polling mode.
 */
	if (!reset_devices)
		cciss_interrupt_mode(c, pdev, board_id);
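
The decision described in that code comment can be sketched in isolation as follows. This is not the driver's code: the enum, the function name, and the assumption that MSI-X is tried before MSI are all hypothetical and only illustrate the order of fallbacks.

```c
#include <stdbool.h>

/* Hypothetical interrupt modes, for illustration only. */
enum irq_mode {
    IRQ_MODE_POLLING, /* crashkernel: no interrupts, poll for completions */
    IRQ_MODE_MSIX,
    IRQ_MODE_MSI,
    IRQ_MODE_IOAPIC   /* legacy interrupt assigned by system ROM */
};

/*
 * Pick an interrupt mode. When reset_devices is set, the MSI/MSI-X
 * setup path is skipped entirely, so its outcome is irrelevant.
 */
enum irq_mode choose_irq_mode(bool reset_devices, bool msix_ok, bool msi_ok)
{
    if (reset_devices)
        return IRQ_MODE_POLLING;
    if (msix_ok)
        return IRQ_MODE_MSIX;
    if (msi_ok)
        return IRQ_MODE_MSI;
    return IRQ_MODE_IOAPIC;
}
```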

Comment 138 Sam Knuth 2008-04-15 16:24:49 UTC

Tom, Chip, Mike - is it possible to get a quick summary of this issue? 

Specific questions:

1) Are there specific models where kdump won't work reliably, or is this across
the board for CCISS?

2) Is the scope of the issue that kdump "won't work reliably", or could this cause
other problems (like a panic that would not have happened if kdump wasn't
running)?

Thanks,
Sam

Comment 139 Mike Miller (OS Dev) 2008-04-15 19:43:43 UTC

My quick summary is kdump is working with the later cciss controllers. The
controller firmware must be updated. We know that early versions of controller
firmware may have issues and recommend users update the firmware using the
Firmware Maintenance CD available on hp.com.
We are still performing regression tests in our labs to determine exactly what
controller/firmware combinations are required.

Comment 143 Chip Coldwell 2008-04-17 15:15:19 UTC

Created attachment 302760 [details]
use PCI power management to reset the controller

Hi Mike,

We have developed an alternate approach to your polling mode, and we would
appreciate any feedback you can give us.  The idea is to use PCI power
management to reset the controller, and possibly reset the MSI/MSI-X
configuration if necessary on kexec.  This approach seems to be a much lighter
touch to the driver, and has been successfully tested on x86_64 and ia64.

The patch is attached; note that cciss_reset_controller is defined but never
used and cciss_hard_reset_controller/cciss_reset_msi are the functions that get
used.  Kernels built with this patch are available from

http://people.redhat.com/coldwell/kernel/bugs/230717/cmc/

(n.b. trailing /cmc/ on the URL).

Chip

Comment 145 Mike Miller (OS Dev) 2008-04-17 19:06:33 UTC

Chip,
Overall it looks pretty good, but
1. I pulled down the x86_64 kernel from the link in comment #143. Your patch was
not in there.
2. Why leave cciss_reset_controller if it's not used?
3. Your patch still puts us into polling mode. See below:

[root@rover ~]# cat /proc/cmdline
root=/dev/VolGroup00/LogVol00 3 irq_poll max_cpus=1 reset_devices
memmap=exactmap memmap=640K@0K memmap=5112K@16384K memmap=59768K@22136K
elfcorehdr=81904K memmap=32K#2095424K

[root@rover ~]# cat /proc/interrupts
           CPU0       CPU1
  0:     608922          0    IO-APIC-edge  timer
  1:         36        127    IO-APIC-edge  i8042
  4:         12          0    IO-APIC-edge  serial
  5:         38          0   IO-APIC-level  ohci_hcd:usb2, ohci_hcd:usb3, ehci_hcd:usb4
  8:          1          0    IO-APIC-edge  rtc
  9:          0          0   IO-APIC-level  acpi
 12:        104          0    IO-APIC-edge  i8042
 14:         26       4304    IO-APIC-edge  ide0
 58:         97          3   IO-APIC-level  uhci_hcd:usb1
201:     205364       2368   IO-APIC-level  eth0
209:         47          0   IO-APIC-level  ioc0
NMI:        216         86
LOC:     608836     608766
ERR:          1
MIS:          0

[root@rover ~]# dmesg
~~~~~~~~~~~~~~~SNIP
HP CISS Driver (v 3.6.20-RH1)
cciss: using PCI PM to reset controller
ACPI: PCI Interrupt 0000:46:00.0[A] -> GSI 32 (level, low) -> IRQ 217
cciss0: <0x3230> at PCI 0000:46:00.0 IRQ 0 using DAC
      blocks= 429925920 block_size= 512
      heads= 255, sectors= 32, cylinders= 52687

      blocks= 429925920 block_size= 512
      heads= 255, sectors= 32, cylinders= 52687

 cciss/c0d0: p1 p2
~~~~~~~~~~~~~~~SNIP

Note IRQ0 is used.

I like using the PCI power management, that's a good idea. I plan to steal it
for an internal request. :) It's much quicker than doing the soft reset. Not
sure why that is the case.

Did you mean to strip out my patch completely and replace it with yours? I'm
guessing that's why we're still polling.

-- mikem

Comment 146 Chip Coldwell 2008-04-17 21:04:52 UTC

(In reply to comment #145)

> 2. Why leave cciss_reset_controller if it's not used?

I left it in there for testing purposes only.  The "Reset Controller" message
CDB seems to put the firmware into a strange state; that said, even the PCI power
management reset doesn't seem to do the same thing as a warm boot (e.g. the
MSI-X bit remains set in PCI configuration space).

> 3. Your patch still puts us into polling mode. See below:
> 
> [root@rover ~]# cat /proc/cmdline
> root=/dev/VolGroup00/LogVol00 3 irq_poll max_cpus=1 reset_devices
> memmap=exactmap memmap=640K@0K memmap=5112K@16384K memmap=59768K@22136K
> elfcorehdr=81904K memmap=32K#2095424K
> 
> [root@rover ~]# cat /proc/interrupts
>            CPU0       CPU1
>   0:     608922          0    IO-APIC-edge  timer
>   1:         36        127    IO-APIC-edge  i8042
>   4:         12          0    IO-APIC-edge  serial
>   5:         38          0   IO-APIC-level  ohci_hcd:usb2, ohci_hcd:usb3, ehci_hcd:usb4
>   8:          1          0    IO-APIC-edge  rtc
>   9:          0          0   IO-APIC-level  acpi
>  12:        104          0    IO-APIC-edge  i8042
>  14:         26       4304    IO-APIC-edge  ide0
>  58:         97          3   IO-APIC-level  uhci_hcd:usb1
> 201:     205364       2368   IO-APIC-level  eth0
> 209:         47          0   IO-APIC-level  ioc0
> NMI:        216         86
> LOC:     608836     608766
> ERR:          1
> MIS:          0
> 
> [root@rover ~]# dmesg
> ~~~~~~~~~~~~~~~SNIP
> HP CISS Driver (v 3.6.20-RH1)
> cciss: using PCI PM to reset controller
> ACPI: PCI Interrupt 0000:46:00.0[A] -> GSI 32 (level, low) -> IRQ 217
> cciss0: <0x3230> at PCI 0000:46:00.0 IRQ 0 using DAC
>       blocks= 429925920 block_size= 512
>       heads= 255, sectors= 32, cylinders= 52687
> 
>       blocks= 429925920 block_size= 512
>       heads= 255, sectors= 32, cylinders= 52687
> 
>  cciss/c0d0: p1 p2
> ~~~~~~~~~~~~~~~SNIP
> 
> Note IRQ0 is used.

That's strange.  On my test machines, it does not do that.  The "irq_poll"
kernel command line should cause the kernel to poll IRQs, however.

> Did you mean to strip out my patch completely and replace it with yours? I'm
> guessing that's why we're still polling.

Yes; my patch applies to the original cciss.c source code.  Which is why I'm
surprised that you're seeing it grab IRQ 0.  We don't see that here.

There is a problem with this version, however.  I discovered today that on an
ia64 system with two cciss hba's, the kexec kernel is able to use the one that
does MSI-X, but not the one that uses the standard irq via the IO-SAPIC.  Ugh. 
I'm digging into that right now.

Chip

Comment 147 Mike Miller (OS Dev) 2008-04-18 14:59:49 UTC

Chip,
I reversed my patch and now your patch is working as you expected, except that
MSI-X initialization still fails. As I've said before, I do not believe the
MSI-X issue is related to kexec or the crashkernel. MSI-X init fails even when
booting normally.
I suggest we open a new BZ for the MSI-X failure.

-- mikem

Comment 148 Chip Coldwell 2008-04-18 20:07:41 UTC

(In reply to comment #147)
> 
> There is a problem with this version, however.  I discovered today that on an
> ia64 system with two cciss hba's, the kexec kernel is able to use the one that
> does MSI-X, but not the one that uses the standard irq via the IO-SAPIC.  Ugh. 
> I'm digging into that right now.

I got to the bottom of this just now.  The issue was that the P600 controller
was taking much longer to recover from the PCI power-management reset than the
other controller in that system.  What I did was to use the "No-op" CDB message
to determine when the controller had recovered from reset.  This takes a couple
of minutes, but it kexecs fine if one waits long enough.  I'm going to respin my
patch and post some updated kernels later today.

Chip

Comment 149 Chip Coldwell 2008-04-18 20:12:09 UTC

(In reply to comment #147)
> Chip,
> I reversed my patch and now your patch is working as you expected, except that
> MSI-X initialization still fails.

Just to clarify; MSI-X initialization is failing in the original kernel and the
kexec kernel, right?  And even if MSI-X initialization fails, you are able to
get a core dump, right?

> As I've said before, I do not believe the
> MSI-X issue is related to kexec or the crashkernel. MSI-X init fails even when
> booting normally.
> I suggest we open a new BZ for the MSI-X failure.

If you are able to get a core dump even when MSI-X initialization fails, then
indeed it is a separate issue and should have a new BZ.

Chip

Comment 150 Mike Miller (OS Dev) 2008-04-18 20:50:41 UTC

Correct, I was able to get a core dump even though MSI-X init failed. I'm trying
to figure out how to use this hardware diagnostic tool. I'm hoping we can pinpoint
the root cause.

Comment 151 Chip Coldwell 2008-04-18 21:00:13 UTC

Created attachment 302938 [details]
Use PCI power management to reset the controller

    Proposed fix for bz230717
    
    The proposed fix resets the CCISS hardware in three steps
    in the kexec kernel:
    
    1.	Use PCI power management states to reset the controller
	in the kexec kernel.
    2.	Clear the MSI/MSI-X bits in PCI configuration space so
	that MSI initialization in the kexec kernel doesn't fail.
    3.	Use the CCISS "No-op" message to determine when the
	controller firmware has recovered from the PCI PM reset.
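
Step 2 amounts to a read-modify-write of the MSI and MSI-X control words. The sketch below shows only the bit manipulation, not the patch itself: the function and macro names are hypothetical, and the enable-bit positions (bit 0 for MSI, bit 15 for MSI-X) follow the PCI specification. Reading and writing PCI configuration space is left out.

```c
#include <stdint.h>

/* Enable bits in the MSI and MSI-X Message Control registers
 * (illustrative names, not kernel macros). */
#define MSI_CTRL_ENABLE  0x0001u /* bit 0 */
#define MSIX_CTRL_ENABLE 0x8000u /* bit 15 */

/* Return the MSI control word with the enable bit cleared. */
uint16_t msi_control_disable(uint16_t control)
{
    return (uint16_t)(control & ~MSI_CTRL_ENABLE);
}

/* Return the MSI-X control word with the enable bit cleared. */
uint16_t msix_control_disable(uint16_t control)
{
    return (uint16_t)(control & ~MSIX_CTRL_ENABLE);
}
```

Applied to the value 0x8003 reported in comment #132, msix_control_disable() gives 0x0003, i.e. the table-size field is preserved and only the stale enable bit is dropped.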

Comment 152 Chip Coldwell 2008-04-20 21:17:41 UTC

Created attachment 303074 [details]
New rev of previous patch for Smart Array 5i

Further testing revealed that the SmartArray 5i controller needs a long pause
between the PCI reset and the first No-op probe.  This patch implements a 30s
pause for all device types, just in case there are others out there in the wild
with the same quirk.
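
The "pause, then probe until the firmware answers" sequence can be sketched roughly as below. This is a simplified illustration rather than the attached patch: the struct, the function names, and the retry parameters are hypothetical, and the fake controller merely stands in for sending a No-op message to real hardware.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hooks standing in for hardware access (hypothetical). */
struct reset_wait_ops {
    bool (*probe)(void *ctx);                 /* e.g. send a No-op, check the reply */
    void (*delay_ms)(void *ctx, unsigned ms); /* sleep hook; may be NULL */
    void *ctx;
};

/*
 * Pause once after the reset, then probe up to max_probes times.
 * Returns the number of probes needed, or -1 if the controller
 * never answered.
 */
int wait_for_controller(const struct reset_wait_ops *ops,
                        unsigned initial_pause_ms,
                        unsigned retry_delay_ms, int max_probes)
{
    int attempt;

    if (ops->delay_ms)
        ops->delay_ms(ops->ctx, initial_pause_ms);

    for (attempt = 1; attempt <= max_probes; attempt++) {
        if (ops->probe(ops->ctx))
            return attempt;
        if (ops->delay_ms && attempt < max_probes)
            ops->delay_ms(ops->ctx, retry_delay_ms);
    }
    return -1;
}

/* Fake controller that becomes ready after a fixed number of probes. */
struct fake_controller {
    int probes_until_ready;
    unsigned waited_ms;
};

bool fake_probe(void *ctx)
{
    struct fake_controller *f = ctx;

    if (f->probes_until_ready > 0)
        f->probes_until_ready--;
    return f->probes_until_ready == 0;
}

void fake_delay(void *ctx, unsigned ms)
{
    struct fake_controller *f = ctx;

    f->waited_ms += ms; /* record the time instead of sleeping */
}
```

With a fake controller that answers on the third probe, a 30000 ms initial pause, and a 1000 ms retry delay, wait_for_controller() returns 3 after a simulated 32000 ms of waiting.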

This patch has been tested on the following controllers:

HP Smart Array 5i Controller
Board ID: 0x40800e11
Firmware Version: 2.62 (x86_64)

HP Smart Array P400 Controller
Board ID: 0x3234103c
Firmware Version: 2.08 (ia64) & 4.12 (x86_64)

HP Smart Array P600 Controller
Board ID: 0x3225103c
Firmware Version: 1.88 (ia64)

HP Smart Array P800 Controller
Board ID: 0x3223103c
Firmware Version: 4.12 (x86_64)

Pre-built binary kernels with this patch are available from

http://people.redhat.com/coldwell/kernel/bugs/230717/

Any further testing/reports are much appreciated.

Chip

Comment 153 Tomas Henzl 2008-04-21 12:09:38 UTC

Kdump is successful here on i386
cciss0: HP Smart Array 5i Controller
Board ID: 0x40800e11
Firmware Version: 2.58

Mike, do you have any objections to the latest patch?

Comment 155 Mike Miller (OS Dev) 2008-04-21 19:42:34 UTC

Tomas/Chip:
I'm OK with this latest patch. When can we expect to see it in a snapshot?

-- mikem

Comment 156 Chip Coldwell 2008-04-21 21:06:09 UTC

(In reply to comment #155)
> Tomas/Chip:
> I'm OK with this latest patch. When can we expect to see it in a snapshot?

Possibly as soon as tomorrow.  In the meantime, there are test kernels available
from the URL in comment #152.

Chip

Comment 157 Mike Miller (OS Dev) 2008-04-21 21:11:13 UTC

We have pulled down those latest kernels and testing is getting underway locally.

Comment 158 Don Zickus 2008-04-23 19:10:46 UTC

in kernel-2.6.18-91.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 160 John Poelstra 2008-05-01 16:49:11 UTC

Greetings Red Hat Partner,

A fix for this issue should be included in the latest packages contained in
RHEL5.2-Snapshot7--available now on partners.redhat.com.  

We are nearing GA for 5.2--this is the last opportunity to test and confirm that
your issue is fixed.

After you (Red Hat Partner) have verified that this issue has been addressed,
please perform the following:
1) Change the *status* of this bug to VERIFIED.
2) Add *keyword* of PartnerVerified (leaving the existing keywords unmodified)

If this issue is not fixed, please add a comment describing the most recent
symptoms of the problem you are having and change the status of the bug to ASSIGNED.

If you are receiving this message in Issue Tracker, please reply with a message
to Issue Tracker about your results and I will update bugzilla for you.  If you
need assistance accessing ftp://partners.redhat.com, please contact your Partner
Manager.

Thank you

Comment 161 Sandy Garza 2008-05-01 20:45:34 UTC

HP retested with RHEL 5.2, Snapshot 7. We had to modify grub to reserve space 
for the crashkernel for Snapshot7 to work. Is this expected behavior? If so, 
does RH document this somewhere?

Comment 162 Ronald Pacheco 2008-05-01 20:57:22 UTC

Sandy,

Please see the following knowledge base article:

http://kbase.redhat.com/faq/FAQ_105_9036.shtm

<snip>
 How to configure kdump

   1. Verify the kexec-tools package is installed:

      # rpm -q kexec-tools

   2. Configure the /etc/kdump.conf file to specify the location where the
vmcore should be dumped. This can be another server via scp, a RAW device, or a
local filesystem.
   3. Modify some boot parameters to reserve a chunk of memory for the capture
kernel. For i386 and x86_64 architectures, edit /etc/grub.conf, and append
crashkernel=128M@16M to the end of the kernel line.
<snip>
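
For readers unfamiliar with the crashkernel=size@offset syntax used in step 3, the following sketch shows how such a parameter breaks down into a reservation size and a start address. It is a userspace illustration only, not code from the kernel or kexec-tools; both function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Parse a number with an optional K/M/G suffix; *endp is set to the
 * first character that was not consumed. */
unsigned long long parse_size(const char *s, const char **endp)
{
    char *end;
    unsigned long long v = strtoull(s, &end, 0);

    switch (*end) {
    case 'G': case 'g':
        v <<= 10; /* fall through */
    case 'M': case 'm':
        v <<= 10; /* fall through */
    case 'K': case 'k':
        v <<= 10;
        end++;
        break;
    default:
        break;
    }
    *endp = end;
    return v;
}

/*
 * Find "crashkernel=size[@offset]" in a kernel command line.
 * Returns 0 on success, -1 if the option is absent or malformed.
 * A missing "@offset" leaves *base at 0.
 */
int parse_crashkernel(const char *cmdline,
                      unsigned long long *size, unsigned long long *base)
{
    static const char key[] = "crashkernel=";
    const char *p = strstr(cmdline, key);
    const char *end;

    if (!p)
        return -1;
    p += sizeof(key) - 1;

    *size = parse_size(p, &end);
    if (end == p || *size == 0)
        return -1;

    *base = 0;
    if (*end == '@') {
        p = end + 1;
        *base = parse_size(p, &end);
        if (end == p)
            return -1;
    }
    return 0;
}
```

For "crashkernel=128M@16M" this yields a size of 134217728 bytes starting at address 16777216, i.e. 128 MiB reserved 16 MiB into physical memory.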

Comment 163 Dilip Daya 2008-05-01 23:09:51 UTC

Responding to Comment #78 from me...success using:
http://people.redhat.com/dzickus/el5/92.el5/x86_64/kernel-2.6.18-92.el5.x86_64.rpm

System: HP ProLiant DL365 with Dual-Core AMD Opteron™ Processor 2216 (2.4 GHz)
- Embedded Smart Array SAS Controller P400i

# uname -a
Linux dl365g1 2.6.18-92.el5 #1 SMP Tue Apr 29 13:16:15 EDT 2008 x86_64 x86_64 xx

# cat /proc/cmdline                                             
ro root=LABEL=/ rhgb quiet console=ttyS0 crashkernel=64M@16M 

# grep KDUMP_COMMANDLINE_APPEND /etc/sysconfig/kdump            
KDUMP_COMMANDLINE_APPEND="irqpoll maxcpus=1 reset_devices"  

# ll -h /var/crash/2008-05-01-17\:52/vmcore                     
-r-------- 1 root root 5.9G May  1 17:53 /var/crash/2008-05-01-17:52/vmcore

# file vmcore                                    
vmcore: ELF 64-bit LSB core file AMD x86-64, version 1 (SYSV), SVR4-style

Comment 167 Mike Miller (OS Dev) 2008-05-06 13:24:28 UTC

Passed verification. Chip and Tomas, thanks for your help in resolving this
issue. I plan to steal your PCI power management reset in an internal request. :)

Comment 169 errata-xmlrpc 2008-05-21 14:41:35 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0314.html
