Description of problem:
A panic is observed upon reboot into the kexec environment on DL platforms:
dl385-01.rhts.boston.redhat.com
dl585-01.rhts.boston.redhat.com
1.) Follow the HOWTO contained in the kexec-tools rpm. With the kexec tools installed, rebuild the initrd image with `service kdump restart`
2.) Force a panic: `echo c > /proc/sysrq-trigger`
Version-Release number of selected component (if applicable):
2.6.18-8.el5
1.101-164.el5
How reproducible: 100%
Steps to Reproduce:
1. Explained in Description
2.
3.
Actual results: Panic
Expected results:
Additional info:
HP CISS Driver (v 3.6.14-RH1)
ACPI: PCI Interrupt 0000:02:04.0[A] -> GSI 18 (level, low) -> IRQ 169
cciss0: <0xb178> at PCI 0000:02:04.0 IRQ 169 using DAC
cciss cciss0: SendCmd Invalid command list address returned! (4)
------------[ cut here ]------------
kernel BUG at drivers/block/cciss.c:2232!
invalid opcode: 0000 [#1]
SMP
last sysfs file:
Modules linked in: cciss sd_mod scsi_mod
CPU: 0
EIP: 0060:[<c9867a7e>] Not tainted VLI
EFLAGS: 00010292 (2.6.18-8.el5 #1)
EIP is at sendcmd+0x263/0x29e [cciss]
eax: 00000044 ebx: c8400000 ecx: c986b5e5 edx: c8dced8c
esi: 00000000 edi: 00004e20 ebp: c8e7b800 esp: c8dced94
ds: 007b es: 007b ss: 0068
Process exe (pid: 464, ti=c8dce000 task=c8dc6000 task.ti=c8dce000)
Stack: 0026bd80 c8e69ac0 00000012 00000000 00000040 00000000 c8e68cc0 c8e69ac0
       c9867d62 00000024 00000000 00000000 00000000 00000000 00000000 c10087a7
       00000000 c8df2bc0 00000003 00000000 00000040 00000020 c8e69b40 00000000
Call Trace:
 [<c9867d62>] cciss_getgeometry+0x9e/0x23f [cciss]
 [<c10087a7>] dma_alloc_coherent+0xaa/0xde
 [<c986a0a8>] cciss_init_one+0x6be/0xa9e [cciss]
 [<c11439eb>] __driver_attach+0x0/0x6b
 [<c10e6bf4>] pci_device_probe+0x36/0x57
 [<c1143945>] driver_probe_device+0x42/0x8b
 [<c1143a2f>] __driver_attach+0x44/0x6b
 [<c114344a>] bus_for_each_dev+0x37/0x59
 [<c11438af>] driver_attach+0x11/0x13
 [<c11439eb>] __driver_attach+0x0/0x6b
 [<c1143152>] bus_add_driver+0x64/0xfd
 [<c10e6d22>] __pci_register_driver+0x47/0x63
 [<c103d0d4>] sys_init_module+0x16e7/0x186a
 [<c12f4700>] pcibios_irq_init+0xfe/0x47e
 [<c1003eff>] syscall_call+0x7/0xb
=======================
Code: 88 bc 03 00 00 8b 40 24 83 c0 02 39 c7 7c 0f 56 68 2d b6 86 c9 e8 f7 c6 7b f7 58 5a eb 0d 8b 41 04 89 14 b8 ff 01 e9 7
EIP: [<c9867a7e>] sendcmd+0x263/0x29e [cciss] SS:ESP 0068:c8dced94
<0>Kernel panic - not syncing: Fatal exception
Additional Description: The panic happens upon boot of the kexec environment, before the vmcore is copied. The flow looks like this: configure kexec/kdump -> trigger a crash dump (i.e. `echo c > /proc/sysrq-trigger`) -> kexec kernel loads from the configured memory spot (kernel cmdline "crashkernel=128M@16M") -> storage module (cciss driver) loads and panics the system.
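For readers unfamiliar with the "crashkernel=128M@16M" syntax, it reserves a region of the given size at the given physical offset for the capture kernel. The following is only a minimal userspace sketch of how such a value splits into size and offset; the function names `parse_size` and `parse_crashkernel` are made up for illustration and this is not the kernel's own parser.

```c
#include <stdlib.h>
#include <string.h>

/* Convert a number with an optional K/M/G suffix (e.g. "128M") to bytes.
 * On return *end points just past the parsed number and suffix. */
static unsigned long long parse_size(const char *s, char **end)
{
    unsigned long long v = strtoull(s, end, 0);

    switch (**end) {
    case 'G': case 'g': v <<= 10; /* fall through */
    case 'M': case 'm': v <<= 10; /* fall through */
    case 'K': case 'k': v <<= 10; (*end)++; break;
    default: break;
    }
    return v;
}

/* Illustrative helper (hypothetical name): extract SIZE and OFFSET from a
 * "crashkernel=SIZE@OFFSET" option inside a command line string.
 * Returns 0 on success, -1 if the option is missing or malformed. */
int parse_crashkernel(const char *cmdline,
                      unsigned long long *size,
                      unsigned long long *offset)
{
    const char *key = "crashkernel=";
    const char *p = strstr(cmdline, key);
    char *end;

    if (!p)
        return -1;
    p += strlen(key);
    *size = parse_size(p, &end);
    if (end == p || *end != '@')
        return -1;
    p = end + 1;
    *offset = parse_size(p, &end);
    if (end == p)
        return -1;
    return 0;
}
```

With the command line from this report, the sketch yields a 128 MiB region starting at 16 MiB.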
This panic doesn't look unusual. Last year when we started pushing hard to integrate kdump as our crash analysis tool, we knew there were a lot of drivers (mainly scsi) that were not kdump friendly. What that means is they could not handle transactions properly from the previous running kernel, namely pci responses and dma interrupts. Vivek helped create a reset mechanism to work around these issues and for the most part any driver that utilized that mechanism had their issues disappear. The cciss panic seems to stem from sendcmd() receiving an illegal response. Again being a scsi device this panic isn't uncommon at all. I presume if we were to implement the reset mechanisms as described above, this problem can be solved. We tested lots of i/o drivers leading up to rhel-5. Apparently the cciss device wasn't at the top of the food chain. Vivek, Does my conclusion from the previously attached panic log seem correct? And do you have any links to what reset mechanisms this driver may need? -Don
Created attachment 149128 [details] console output from the dl360 showing, among other things the panic
Comprehensive test results summary:
x86_64: 7-8 test runs (trigger dump) on ibm, dl and nec machines
- dl360-01 was the only machine to have a panic
i386: 7-8 test runs (trigger dump) on ibm, dl and nec machines
- dl360-01 was the only machine to have a panic
* Based on the testing done, the likelihood (reproducibility) should be something like 50% on dl360-01 and 20% on dl[3-5]85-01.
Will attach a csv (openoffice format) spreadsheet.
Created attachment 149208 [details] csv Openoffice spreadsheet showing test matrix
Vivek, will you be able to help on this? I'm also adding Mike Miller, the cciss maintainer at HP.
This has been an issue for Smart Array for some time. I'm working with the firmware team to ensure the reset and abort messages are actually honored by the controller firmware. The latest firmware reportedly does honor the reset but I have not yet tested that functionality. I am out of the office recovering from a motorcycle accident. I do not expect to return before March 19.
Vivek, Are there any likely work-arounds, based on your experience with other HBAs that had similar problems? Some customers will be slow to update HBA firmware. Tom
Hi Tom, Can't think of a work-around for this issue. As Mike mentioned, the firmware team needs to make sure the controller responds to RESET and ABORT messages; only then can this problem be solved. Mike, it looks like this problem is also related to pending messages, as with megaraid. There we issue some kind of FLUSH message to the controller to flush all the pending messages in the queue. Does cciss support something like that? Vivek
Looks like firmware on at least some of our SAS controllers supports the reset message defined in the cciss specification. If someone knows of a flag for which I can test during init that tells me this is a kexec'ed kernel the fix should be fairly simple and straightforward. I still need to do further testing because the P400 locked up when resetting. The P800 and E500 controllers seem to work OK.
Vivek, we do support cache flush on Smart Array.
If the flag you are looking for is only for testing purposes, then checking to see if /proc/vmcore is non-zero is one way to do it. But if you want to use this flag as part of the final solution, I would advise against it. Cheers, Don
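As a testing-only illustration of the check Don describes, a userspace helper could test whether /proc/vmcore exists with a non-zero size. This is a sketch under that assumption; the function name `vmcore_present` is invented here, and as noted above this approach is not recommended as part of a final solution.

```c
#include <sys/stat.h>

/* Returns 1 if `path` exists and reports a non-zero size, 0 otherwise.
 * Intended to be called with "/proc/vmcore" on a running system. */
int vmcore_present(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return 0;           /* missing file: not a capture kernel */
    return st.st_size > 0;
}
```

Taking the path as a parameter keeps the helper easy to exercise on a machine that is not running a capture kernel.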
(In reply to comment #10) > Looks like firmware on at least some of our SAS controllers supports the reset Mike, as we move forward, keep in mind that we will need to write a release note that clearly states which cciss-based systems work with kdump and which do not. This will need to refer to customer-consumable model numbers (for systems with cciss built-in, and add-on cards) and min. fw revs. Maybe your fw guys can help with that.
I can help with that. As I test the various configs I can make sure our release notes are updated. I'll also make sure that cciss.txt is updated with accurate information.
From Vivek Goyal <vgoyal.com> Mike, I had added a command line parameter "reset_devices" to give drivers an indication that they need to first reset their device and then go ahead with the rest of the initialization. This is available in upstream kernels. I am not sure if this is part of RHEL5 kernels or not. If it is not available in RHEL5 kernels, then for testing you can use upstream kernels, pass the "reset_devices" command line option while loading the kdump kernel and make use of it for resetting the cciss controller. Once you are successful in your testing, we can think of taking this patch in RHEL5. ("reset_devices" is a very non-intrusive patch.) Regarding flushing the caches, you can try that and see if it solves the problem. There are high chances that it will. It did for megaraid. Thanks Vivek
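Inside the kernel, drivers simply test the `reset_devices` variable. For anyone scripting tests around it, a rough userspace equivalent is to look for the token on the command line the capture kernel was booted with (for example the contents of /proc/cmdline). The helper below is a sketch only; `cmdline_has_flag` is a hypothetical name, not an existing API.

```c
#include <string.h>

/* Returns 1 if `flag` appears in `cmdline` as a whole space-delimited
 * token (so "no_reset_devices" does not match "reset_devices"). */
int cmdline_has_flag(const char *cmdline, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = cmdline;

    if (n == 0)
        return 0;
    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == cmdline) || p[-1] == ' ';
        int ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\n';

        if (starts && ends)
            return 1;
        p += n;             /* keep scanning after this partial match */
    }
    return 0;
}
```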
Even with Vivek's patch we still have an issue with msi. When the kernel crashes we have no way to free our msi-x vectors. So when the dump kernel initializes we cannot allocate and register new vectors. I'm not sure how msi determines what resources get which vectors. It seems that it may be based on the PCI bus address. So that a card in a particular slot will always be allocated the same vector(s). There was some work going on upstream to fix that. But a couple of weeks ago I was playing with kdump and still encountered this issue. So I think we're a ways out from making this work as designed.
We still have some apparent firmware issues trying to support kdump. After resetting the controller it will not accept any more commands. I'm working with the firmware group to resolve this problem. No ETA.
Mike, We really need to write a release note warning customers that kdump on cciss does not work, at least on some hw/fw combinations. Please let us know if there are any cciss models and fw versions that are known to work, and you are willing to support. Otherwise the release note will just make a blanket statement. Don, this should go in the 5.1 release note update as soon as Mike replies. Tomas, please keep an eye on any potential fixes for this from HP during 5.2 development. Tom
Proposed support statement: At this time kexec/kdump does not work with any HP Smart Array controller. Support for kexec/kdump is planned for a future release but no scheduling information is available. Is this appropriate for the release notes? Not for public release: I looked at this issue again with one of our firmware engineers. It looks like the firmware is completing commands after the reset but we get stuck in wait for completion in the ioctl path. Right now I'm working 2 critical issues for a new product. When I resolve those I will add debug to figure out why we're stuck. I suspect a corrupt command tag.
Hrm... I don't think saying it doesn't work with *any* Smart Array controller is accurate. I've definitely been able to capture a dump on at least one cciss-equipped box (and one cpqarray-equipped box) while testing our mkdumprd tweaks to support dumping to manually specified cciss & cpqarray devices (see bug 228685). Of course, I don't know how to identify those that work vs. those that don't, so...
for now, in the absence of a definitive list of Smart Array controllers where kdump/kexec is not supported, let's just say "some Smart Array controllers". also, this is only for the x86_64 architecture, right? <quote> (x86_64) Some Smart Array controllers do not support kexec and kdump. </quote>
Jarod, If you've been successful I'd like to know which controllers, firmware version, and driver version have worked. I have had no luck with any of the controllers I've tested. I've only looked at the newest controllers, though.
Ugh. So the cciss system I swear I was able to capture a dump on previously now panics. It's an HP DL380 G5 here in our lab, with an HP Smart Array P400 Controller, firmware version 1.18. The cciss driver is 3.6.16-RH1, as found in kernel-2.6.18-53.el5. From what I can tell, it was mid-August I last actually tried this, so I can give some kernels from that time frame a go to see if one of 'em actually works. Not sure if anyone has changed the controller firmware lately or not though.
(In reply to comment #24)
> for now, in the absence of a definitive list of Smart Array controllers where
> kdump/kexec is not supported, lets just say "some Smart Array controllers".
>
> also, this is only for the X86-64 architectures, right?
>
> <quote>
> (x86_64) Some Smart Array controllers do not support kexec and kdump.
> </quote>
I would prefer: Crash dump using kexec/kdump may not function reliably with HP Smart Array controllers (these adapters use the cciss driver). I believe the architectures are i686, x86_64, and ia64.
thanks Tom, edited as follows: <quote> (x86_64;ia64) Crash dumping through kexec and kdump may not function reliably with HP Smart Array controllers. This is because these controllers use the cciss driver. </quote>
(In reply to comment #28)
> <quote>
> (x86_64;ia64)
Did you overlook x86?
> Crash dumping through kexec and kdump may not function reliably
> with HP Smart Array controllers. This is because these controllers use the
> cciss driver.
> </quote>
The problem is with the firmware, not the cciss driver. I wanted to include "cciss" because that helps people identify the hw involved, and they may be searching the docs on that term. So something like: "... (These controllers use the cciss driver.) A solution to this problem, which is likely to involve a firmware update to the controller, is being investigated." Tom
thanks for clearing that up Tom. release note revised.
Hello - wall street customers are starting to ask about this. Any ETA on the firmware update? This is definitely going to make RHEL 5 a non-starter for Wall St until we get it resolved. -Sam
There is nothing we can do to resolve this without HP fw/driver changes. I have requested management attention.
We have kdump working on ia32 based systems using a kernel.org 2.6.22.9 kernel. I'm having problems with rhel5.1 on x86_64. The driver loads and successfully discovers the attached disks but I get the message "unexpected IRQ trap at vector 82" for each controller in the system. Then the system panics with "Kernel panic - not syncing: Attempted to kill init!" According to the kdump doc you must use the uncompressed kernel image on x86_64. That image doesn't seem to be included on the distribution. Can someone please explain RH's expectations for kdump support? In other words, how would a customer set up a crash kernel?
The normal kernel is a relocatable kernel thus allowing kdump to use the same kernel for its purposes. Installing the kexec-tools package (the init script takes care of everything) and setting up the correct memory region on the kernel command line (rebooting to have it take effect) will get you there. This is also done by the First Boot install scripts. Disregard the kdump documentation as it is a little out of date. cc'ing Neil to help you with other little issues.
Mike, Don is right, the docs are old; you don't need to use the uncompressed kernel any more. Regarding comment #35 and your panic, I'll look at your console logs shortly.
Mike, I've looked over your console logs. Do me a favor and try the following: in /etc/sysconfig/kdump, you'll see a line defining the variable KDUMP_COMMANDLINE_APPEND. Please add the parameter reset_devices to that variable, along with the others already there. Restart the kdump service and try again. This looks like an old cciss problem that's fixed in kernels 2.6.18-26.el5 and later, but requires that reset_devices be passed on the kernel command line for the kdump kernel. The fix is actually a hack to work around some firmware issues, IIRC, but should get you going until it's properly repaired.
Neil, I think that Mike is trying to solve the same issue you've solved in 2.6.18-26.el5, and he is also using the same variable reset_devices as you did. Could I then, once Mike succeeds, remove your patch from the cciss driver? By that I mean only the part which is in cciss.c, not the general handling of the reset_devices variable. This:
@@ -2074,6 +2074,13 @@
 			ctlr, complete);
 		/* not much we can do. */
 #ifdef CONFIG_CISS_SCSI_TAPE
+		/* We might get notification of completion of commands
+		 * which we never issued in this kernel if this boot is
+		 * taking place after previous kernel's crash. Simply
+		 * ignore the commands in this case.
+		 */
+		if (reset_devices)
+			return 0;
 		return 1;
 	}
Tomas, Yes, you are correct. Assuming that the test that I asked Mike to perform in comment 38 is successful, then he is trying to solve the same problem I did with the changeset you reference in comment 39. When he manages to fix it, then yes, you can remove the segment that you reference from the cciss driver.
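The idea behind the changeset under discussion can be illustrated in isolation: a completion whose tag this kernel never issued is an error in normal operation, but is quietly dropped when booting after a crash. The following is a simplified userspace model, not the driver code; `struct cmd_table` and `handle_completion` are hypothetical names, and `reset_devices` is passed as a plain argument rather than read from the kernel global.

```c
#include <stddef.h>

#define MAX_TAGS 8

/* Hypothetical table of command tags issued by the running kernel. */
struct cmd_table {
    unsigned int tags[MAX_TAGS];
    size_t count;
};

/* Returns 0 if the completion was matched (and retired) or deliberately
 * ignored, 1 if it refers to a command we never issued and that matters. */
int handle_completion(struct cmd_table *t, unsigned int tag, int reset_devices)
{
    size_t i;

    for (i = 0; i < t->count; i++) {
        if (t->tags[i] == tag) {
            /* Retire the command by moving the last entry into its slot. */
            t->tags[i] = t->tags[--t->count];
            return 0;
        }
    }
    /* Unknown tag: after a crash it is most likely left over from the
     * previous kernel, so ignore it when reset_devices is set. */
    if (reset_devices)
        return 0;
    return 1;
}
```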
I'm getting soft lockups on one of the CPUs during driver initialization. For whatever reason the stack is not printing on my serial console, but the interesting part is:
<IRQ> [<ffffffff800b50fa>] softlockup_tick+0xd5/0xe7
[<ffffffff800930e2>] update_process_times+0x42/0x68
[<ffffffff800746e3>] smp_local_timer_interrupt+0x23/0x47
[<ffffffff80074da5>] smp_apic_timer_interrupt+0x41/0x47
[<ffffffff8007bc8e>] apic_timer_interrupt+0x66/0x6c
[<ffffffff8000c505>] __delay+0x8/0x10
[<ffffffff880b8ad4>] :cciss:cciss_init_one+0x295/0x11ed
Everything before this looks like normal init stuff like pci_probe_device, etc. At this point I've jumped off into the function that resets the controller. After reset we wait 60 seconds for the controller to become ready. I may be able to reduce that delay but I know already 20 seconds is not enough. (20 seconds is our default timeout when polling.) On my initial development system everything would just stop and wait after the cciss version printk. Then we did our device discovery and added the disks, etc. On this 5.1 system (2.6.18-53.el5) after the version printk it halts for a couple of seconds and then I notice other things like USB initialize. About 10 seconds after the version printk I get the soft lockups. Of course, now we're hosed. Any thoughts or suggestions?
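The trace above shows the timer interrupt landing inside __delay called from cciss_init_one, which is consistent with a long busy-wait tripping the soft lockup detector, although the thread does not confirm that as the cause. One common pattern for a long post-reset wait is to poll a readiness check and sleep between polls, giving up after a timeout. The sketch below models only that control flow; the callback types and the `fake_*` helpers are hypothetical, and in a driver the sleep hook would be something like msleep.

```c
/* Hypothetical readiness probe: returns non-zero once the device is ready. */
typedef int (*ready_fn)(void *ctx);
/* Hypothetical sleep hook, so the caller decides how to wait between polls. */
typedef void (*sleep_fn)(unsigned int ms);

/* Poll until ready or until timeout_ms has elapsed.
 * Returns the number of milliseconds waited on success, -1 on timeout. */
int wait_until_ready(ready_fn ready, void *ctx, sleep_fn sleep_ms,
                     unsigned int interval_ms, unsigned int timeout_ms)
{
    unsigned int waited = 0;

    for (;;) {
        if (ready(ctx))
            return (int)waited;
        if (waited >= timeout_ms)
            return -1;
        sleep_ms(interval_ms);      /* yield instead of spinning */
        waited += interval_ms;
    }
}

/* Test doubles: a controller that becomes ready after N polls, and a
 * sleep hook that does nothing so the example runs instantly. */
struct fake_ctlr { int polls_until_ready; };

int fake_ready(void *ctx)
{
    struct fake_ctlr *c = ctx;

    if (c->polls_until_ready == 0)
        return 1;
    c->polls_until_ready--;
    return 0;
}

void fake_sleep(unsigned int ms) { (void)ms; }
```

With a 1000 ms interval, a controller that needs more polls than a 20000 ms budget allows times out, while a 60000 ms budget leaves room for a slower controller, which matches the 20 versus 60 second observation above.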
adding same "Known Issues" release note to RHEL5.2 release note.
Mike, I don't know, but maybe you could post the current version of your patch so that we can easily help you with debugging.
Which kernel do you want the patch made against? I'm currently using 2.6.18-53.el5.
Created attachment 292178 [details] kdump support for cciss Arggggh, now the behavior is different. In the earlier testing I was building an rpm with the kdump support and then installing it on the system. That's when I saw the soft lockups I reported. With the attached patch I used a vanilla 2.6.18-53.el5 kernel, applied the patch, and built a new kernel and modules. I no longer see the soft lockups but it almost seems I'm not getting my interrupts. I notice that MSI-X init fails with a -22. That just means it's an unknown error. At that point I try to get IOAPIC interrupts. I'm not sure that will work. If I remember correctly you can go from IOAPIC ----> MSI-X but not the other way around. I tried forcing the controllers to IOAPIC mode from the beginning and then crashing the system. I still get failure trying to mount root and the subsequent panic. I don't see this problem on my original development box. Please look at the patch and see if anything stands out. Maybe I'm doing something wrong.
I managed to capture this dump:
HP CISS Driver (v 3.6.16-RH1)
usb 5-1: new full speed USB device using uhci_hcd and address 2
usb 5-1: configuration #1 chosen from 1 choice
irq 169: nobody cared (try booting with the "irqpoll" option)
Call Trace:
 <IRQ> [<ffffffff800b5d60>] __report_bad_irq+0x30/0x7d
 [<ffffffff800b5f93>] note_interrupt+0x1e6/0x227
 [<ffffffff800b54a5>] __do_IRQ+0xc7/0x105
 [<ffffffff8006a3bd>] do_IRQ+0xe7/0xf5
 [<ffffffff8005b615>] ret_from_intr+0x0/0xa
 [<ffffffff80010792>] handle_IRQ_event+0x1b/0x58
 [<ffffffff800b5482>] __do_IRQ+0xa4/0x105
 [<ffffffff80011cb4>] __do_softirq+0x5e/0xd5
 [<ffffffff8006a3bd>] do_IRQ+0xe7/0xf5
 [<ffffffff8005b615>] ret_from_intr+0x0/0xa
 <EOI> [<ffffffff801efe01>] input_print_modalias_bits+0x48/0x95
 [<ffffffff801efee4>] input_print_modalias+0x96/0x1e4
 [<ffffffff801f0bfa>] input_dev_uevent+0x3c4/0x3ff
 [<ffffffff801abf87>] class_uevent+0x1b9/0x1c8
 [<ffffffff80056069>] kobject_get_path+0x99/0xc1
 [<ffffffff80055db8>] kobject_uevent+0x1fb/0x413
 [<ffffffff80100854>] sysfs_create_link+0xfe/0x10c
 [<ffffffff801ac742>] class_device_add+0x2fb/0x44b
 [<ffffffff801f1465>] input_register_device+0xf5/0x271
 [<ffffffff801eabc5>] hidinput_connect+0x1bab/0x1bbc
 [<ffffffff801e76ff>] hid_probe+0xa7f/0xc38
 [<ffffffff801dd94c>] usb_probe_interface+0x6c/0x9e
 [<ffffffff801ab876>] driver_probe_device+0x52/0xaa
 [<ffffffff801ab8ce>] __device_attach+0x0/0x5
 [<ffffffff801ab198>] bus_for_each_drv+0x40/0x72
 [<ffffffff801ab925>] device_attach+0x52/0x5f
 [<ffffffff801aae64>] bus_attach_device+0x1a/0x35
 [<ffffffff801aa247>] device_add+0x24a/0x361
 [<ffffffff801dcaac>] usb_set_configuration+0x36b/0x3f1
 [<ffffffff801d8754>] usb_new_device+0x253/0x2c4
 [<ffffffff801d98b0>] hub_thread+0x74b/0xb10
 [<ffffffff8009b446>] autoremove_wake_function+0x0/0x2e
 [<ffffffff801d9165>] hub_thread+0x0/0xb10
 [<ffffffff8009b283>] keventd_create_kthread+0x0/0x61
 [<ffffffff800321d8>] kthread+0xfe/0x132
 [<ffffffff8005bfb1>] child_rip+0xa/0x11
 [<ffffffff8009b283>] keventd_create_kthread+0x0/0x61
 [<ffffffff800320da>] kthread+0x0/0x132
 [<ffffffff8005bfa7>] child_rip+0x0/0x11
handlers:
[<ffffffff801dadad>] (usb_hcd_irq+0x0/0x55)
[<ffffffff801dadad>] (usb_hcd_irq+0x0/0x55)
Disabling IRQ #169
Here's the command line I use to load the crash kernel:
kexec -p /boot/vmlinuz-rhel51-kdump \
  --initrd=/boot/initrd-rhel51-kdump \
  --append="root=/dev/cciss/c0d0p2 reset_devices 3 irqpoll \
  maxcpus=1 console=ttyS0,115200 console=tty1"
Does anybody see a problem in the command line? The kernel hints to use irqpoll. I am using that parameter. On a maybe more positive note: my kernel.org 2.6.22.9 blows up much the same way on this platform. I am installing rhel5.1 ia32 on the system on which I did the initial development. I'll post the results when I'm done.
What kexec version are you using, Mike? A command line length overrun problem was recently reported to me. I don't have the fix checked in yet, but I have a patch floating out there for kexec-tools-1.102pre-<rev> to hopefully fix it.
I have kexec-tools-1.101-194.el5. Send me a link to your patch.
you can find it attached to bz 428310
I don't have access to that BZ.
I found the problem!!!! I was specifying the wrong root= in my command line to load the crash kernel. I was specifying "root=/dev/cciss/c0d0p2" when it should have been "root=/dev/VolGroup00/LogVol00." The only thing that bothers me now is the MSI-X init failing with the unknown error of -22. I did successfully register for an IOAPIC interrupt but I'm concerned since this was not what I observed during development. Anyone have any ideas about that behavior?
Mike Miller indicates this patch can be submitted for RHEL 5.2. The patch supports all architectures.
Mike, thanks for the effort and for solving this problem. I'm sorry I couldn't help you with debugging, because your patch was running on my hw without problems. Is the problem you mentioned in Comment #51 also successfully solved?
(In reply to comment #54) > Mike, > thanks for the effort and for solving this problem.I'm sorry I couldn't help you > with debugging,because your patch was running on my hw without problems. > Is the problem you mentioned in Comment #51 also successfully solved ? Tomas, Mike indicated this was an end-user mistype and did not affect the code. He was trying to pass an invalid argument. Thank you.
I'm still working on the MSI-X init failure.
(In reply to comment #56)
Mike, it seems to me that the problem could be here in pci_enable_msix (msi.c):
        pci_read_config_word(dev, msi_control_reg(pos), &control);
        if (control & PCI_MSIX_FLAGS_ENABLE)
here ->         return -EINVAL; /* Already in MSI-X mode */
This could mean that the device is still in msi-x mode, so maybe a call to pci_disable_msix or other clean up when reset_devices is set could help.
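For context, the -22 reported earlier in this thread is the numeric value of -EINVAL on Linux, which is what the check quoted above returns. The snippet below is a userspace sketch of that check on a raw Message Control word, assuming the enable flag is bit 15 (0x8000) as laid out in the MSI-X capability; `msix_enable_check` and `msix_clear_enable` are illustrative names, not kernel functions.

```c
#include <errno.h>
#include <stdint.h>

/* MSI-X Enable is bit 15 of the Message Control register. */
#define MSIX_FLAGS_ENABLE 0x8000u

/* Mirrors the quoted check: refuse to enable if already in MSI-X mode. */
int msix_enable_check(uint16_t control)
{
    if (control & MSIX_FLAGS_ENABLE)
        return -EINVAL;
    return 0;
}

/* Returns the control word with the enable bit cleared, i.e. the state a
 * capture kernel would want before trying to enable MSI-X again. */
uint16_t msix_clear_enable(uint16_t control)
{
    return (uint16_t)(control & ~MSIX_FLAGS_ENABLE);
}
```

If the crashed kernel left the enable bit set, the first function reports the same error code seen in the logs; clearing the bit first lets the check pass.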
I tried adding the code to free_irq and pci_disable_msix but it did not resolve the issue. When I made the call to free_irq the kernel complained that I was trying to free an already freed IRQ0. There are a lot of diffs in the 2.6.22.9 kernel versus the 2.6.18-53.el5. Glancing thru the code it may be this: msix_set_enable(dev, 0); /* Ensure msix is disabled as I set it up */ in msix_capability_init() in drivers/pci/msi.c that resolves the issue. I'm testing again on a 2.6.16-xx kernel as a sanity check. Tomas, do you see the same MSI-X init failure?
Yes, on the test machine which is MSI-X capable I see the same failure, and pci_disable_msix also didn't help. Is the 2.6.22.9 kernel working well with your patch?
Mike, Please let me know how I can help. I've had a lot of experience with MSI/MSI-X.
Tomas, Yes, testing with the 2.6.22.9 kernel is going well. I see no failures or errors while booting the crash kernel. In your testing are you able to successfully boot to the crash kernel, even though the MSI-X init fails? I guess what I'm asking is are you able to boot up and save off the vmcore file to another system? Tony, If you're in Houston please stop by my office or lab.
Mike, I was busy with another task, sorry for the late answer. When the MSI-X init fails, the system hangs and the vmcore is not created.
Tomas, Can you tell me which kernel, driver, server, and controller you're using? I've had success on g5 servers using the P400 controller. So I'm a bit stumped on why your setup is failing. I'm still investigating the initial failure.
Created attachment 293393 [details] Screenshot
Mike, I have two boards in my system; maybe this could be the difference?
# cat /proc/driver/cciss/*
cciss0: HP Smart Array P400 Controller
Board ID: 0x3234103c
Firmware Version: 1.18
IRQ: 130
Logical drives: 1
Sector size: 2048
Current Q depth: 0
Current # commands on controller: 0
Max Q depth since init: 43
Max # commands on controller since init: 159
Max SG entries since init: 31
Sequential access devices: 0
cciss/c0d0: 72.77GB RAID 1(1+0)
cciss1: HP Smart Array P800 Controller
Board ID: 0x3223103c
Firmware Version: 2.08
IRQ: 162
Logical drives: 0
Sector size: 2048
Current Q depth: 0
Current # commands on controller: 0
Max Q depth since init: 0
Max # commands on controller since init: 1
Max SG entries since init: 0
Sequential access devices: 0
------------------------------------------
The kernel is RHEL5.1; 2.6.18. In /etc/sysconfig/kdump this is set:
KDUMP_COMMANDLINE_APPEND="irqpoll maxcpus=1 reset_devices"
Kdump is operational. After echo c > /proc/sysrq-trigger the system boots the new kernel and then locks up - see the screenshot in Comment #65.
On this system (it's not MSI-X capable) the same procedure works and the vmcore is created.
# cat /proc/driver/cciss/cciss0
cciss0: HP Smart Array 5i Controller
Board ID: 0x40800e11
Firmware Version: 2.58
IRQ: 169
Logical drives: 1
Sector size: 2048
Current Q depth: 0
Current # commands on controller: 0
Max Q depth since init: 17
Max # commands on controller since init: 122
Max SG entries since init: 31
Sequential access devices: 0
cciss/c0d0: 18.18GB RAID 0
I think the problem is the soft lockup and not the MSI-X failure. I also saw that in my testing. But now I can't remember how I got past that. Getting old sucks. I'll add a P800 to my configs and see if I can recreate the soft lockup.
Created attachment 293602 [details] boot log with p400 and p800 in ML370G5 This boot log shows all of the unexpected IRQ messages during init
Created attachment 293604 [details] kdump support patch Tomas, Are you using this patch? This is the one I'm using in my testing and although I see some nasty looking messages about unexpected IRQ's I am still able to boot up and save the vmcore file. I attached a log that shows those messages.
Mike, yes, the latest patch is the same as the one from 2008-01-18. To be sure, I'll send you my cciss.c from the test system.
Created attachment 294226 [details] working + non working version Mike, after echo c > /proc/sysrq-trigger with the cciss.c from the attachment I'm getting soft lockups; the cciss.c.orig works. Additionally there is also a difference on line 2136: the lines if (reset_devices) return 0; are removed.
Thank you, Tomas. I'll check out the diffs.
Are the systems that fail Opteron based? I see at the top of the bug where the DL385 is listed. I've been doing all my testing on Intel based systems. Do you have any Intel based HP platforms?
The failing system with the MSI-X capability is Intel based (Intel(R) Xeon(R) CPU 5160 @ 3.00GHz).
I'm on the road right now returning 20080303. I also have an internal HP customer seeing the same problems you are. When I get back I'll dig into this and find the resolution.
HP ProLiant DL365 with Dual-Core AMD Opteron™ Processor 2216 (2.4 GHz) - Embedded Smart Array SAS Controller P400i
RHEL5.1 (kernel 2.6.18-53.1.13.el5)
# uname -a
Linux dl365g1 2.6.18-53.1.13.el5 #1 SMP Mon Feb 11 13:27:27 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
# modinfo cciss
filename: /lib/modules/2.6.18-53.1.13.el5/kernel/drivers/block/cciss.ko
license: GPL
version: 3.6.16-RH1
description: Driver for HP Controller SA5xxx SA6xxx version 3.6.16-RH1
... ...
# cat /sys/kernel/kexec_crash_loaded
1
# echo c > /proc/sysrq-trigger
... ...
Starting RPC idmapd: [ OK ]
EDAC k8 MC1: GART TLB error: transaction type(generic), cache level(generic)
EDAC k8 MC1: extended error code: GART error
Kernel panic - not syncing: MC1: processort context corrupt
So after having various problems with interrupts that differed from kernel to kernel I decided to try a polling mode crash driver. The flow of operation is "identical" to the interrupt driven code except that I do not do a request_irq. Instead we start up a thread that calls the interrupt handler to complete commands. If I use an upstream kernel I can successfully boot in a crashkernel. When I try the 2.6.18-84.elPAE kernel included in the 20080303 5.2 beta it fails. It fails in fs/block_dev.c in the do_open function. Specifically:
        if (!part) {
                struct backing_dev_info *bdi;
                if (disk->fops->open) {
                        ret = disk->fops->open(bdev->bd_inode, file);
                        if (ret)
                                goto out_first;
                }
ret = -16 so we jump to out_first. We get to do_open via add_disk -> register_disk. In the upstream kernels these functions are completely different so it's difficult to tell what's broken. I also don't know where or how to find disk->fops->open(bdev->bd_inode, file). Can someone at RH help out here?
Created attachment 298090 [details] polling mode patch for kexec/kdump Here is my patch for polling mode while in a crashkernel.
In Documentation/pci.txt:
There are (at least) two really good reasons for using MSI:
1) MSI is an exclusive interrupt vector by definition. This means the interrupt handler doesn't have to verify its device caused the interrupt.
But I note that
static inline long interrupt_not_for_us(ctlr_info_t *h)
{
#ifdef CONFIG_CISS_SCSI_TAPE
        return (((h->access.intr_pending(h) == 0) ||
                 (h->interrupts_enabled == 0))
                && (h->scsi_rejects.ncompletions == 0));
#else
        return (((h->access.intr_pending(h) == 0) ||
                 (h->interrupts_enabled == 0)));
#endif
}
static irqreturn_t do_cciss_intr(int irq, void *dev_id, struct pt_regs *regs)
{
        ctlr_info_t *h = dev_id;
        CommandList_struct *c;
        unsigned long flags;
        __u32 a, a1, a2;
        if (interrupt_not_for_us(h))
                return IRQ_NONE;
so even if the cciss driver has MSI/-X enabled, it checks the interrupt source as if it were a shared interrupt vector. This isn't really the root of the problem, but a micro-optimization that I thought I would put here so I don't forget .... Chip
(In reply to comment #80)
> I also don't know where or how to find disk->fops->open(bdev->bd_inode, file). Can
> someone at RH help out here?
disk is of type struct gendisk, so fops is of type block_device_operations, initialized in cciss_init_one to cciss_fops. From
static struct block_device_operations cciss_fops = {
        .owner = THIS_MODULE,
        .open = cciss_open,
        .release = cciss_release,
        .ioctl = cciss_ioctl,
        .getgeo = cciss_getgeo,
#ifdef CONFIG_COMPAT
        .compat_ioctl = cciss_compat_ioctl,
#endif
        .revalidate_disk = cciss_revalidate,
};
I would infer that the open method is cciss_open. The return value is -16, which is -EBUSY, so I would assume you are hitting this
        if (host->busy_initializing || drv->busy_configuring)
                return -EBUSY;
in cciss_open. Chip
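To make the two inferences above concrete (a function-pointer table routes the open call, and -16 is -EBUSY on Linux), here is a small self-contained model. It is a sketch only: `struct fake_ops`, `struct fake_disk`, `fake_open` and `do_open_model` are hypothetical stand-ins, not the kernel's block_device_operations or the cciss driver.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-disk state holding the two flags tested on open. */
struct fake_disk {
    int busy_initializing;  /* host-level flag */
    int busy_configuring;   /* per-drive flag */
};

/* Hypothetical operations table, analogous in shape to a fops struct. */
struct fake_ops {
    int (*open)(struct fake_disk *d);
};

/* Model of the open method: refuse while either busy flag is set. */
int fake_open(struct fake_disk *d)
{
    if (d->busy_initializing || d->busy_configuring)
        return -EBUSY;
    return 0;
}

/* Model of the caller: dispatch through the table if a method exists,
 * and hand any error straight back. */
int do_open_model(const struct fake_ops *ops, struct fake_disk *d)
{
    if (ops->open)
        return ops->open(d);
    return 0;
}
```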
I've printed out both host->busy_initializing and drv->busy_initializing. Both are zero. So unless there's some asynch thread checking before they're cleared that does not seem to be the issue.
Can we skip the MSI issues and try to resolve this polling problem? Different kernels exhibit different interrupt problems so polling may be the best option.
(In reply to comment #84) > I've printed out both host->busy_initializing and drv->busy_initializing. Both > are zero. So unless there's some asynch thread checking before they're cleared > that does not seem to be the issue. Were you able to verify that disk->fops->open == cciss_open? Chip
Not yet. But I think you're right about the busy_initializing. I added a flag to skip that test in the polling driver and it wants to boot. I need to pull some more debug out but it looks promising. :) After I test it a bit more I'll post the latest patch. -- mikem
Created attachment 298451 [details] polling mode patch for kexec/kdump redone

This patch enables kdump support for the cciss driver. If we're booting to a crashkernel, we use polling mode in the driver to avoid any interrupt-related issues. Different kernels exhibit different failures when using interrupts, including failing to get an MSI-X vector or interrupt sharing issues when there are multiple controllers in the system. The downside of this approach is that we must wait approximately 1 minute for each controller to complete initialization after the reset. This could be mitigated by initializing only the first controller. That should be adequate since /proc will only exist on the root filesystem. I'm looking for feedback on this assumption. Please review and test this patch in your labs.

NOTE: Since upstream kernels do not exhibit the MSI-X failures, the upstream patch may be significantly different from this patch.
(In reply to comment #88)
> Created an attachment (id=298451) [edit]
> polling mode patch for kexec/kdump redone

A bit of debugging code leaked through; I'm dropping this:

diff --git a/fs/block_dev.c b/fs/block_dev.c
index d7b9a66..a6a06d5 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -899,6 +899,7 @@ static int do_open(struct block_device *bdev, struct file *file, int for_part)
 		struct backing_dev_info *bdi;
 		if (disk->fops->open) {
 			ret = disk->fops->open(bdev->bd_inode, file);
+			printk("cciss: do_open: ret = %d\n", ret);
 			if (ret)
 				goto out_first;
 		}
(In reply to comment #88)
> Created an attachment (id=298451) [edit]
> polling mode patch for kexec/kdump redone

-ENOCOMPILE

  CC [M]  drivers/block/cciss.o
/usr/src/kernel/rhel5/src/kernel/drivers/block/cciss.c: In function ‘cciss_seq_show’:
/usr/src/kernel/rhel5/src/kernel/drivers/block/cciss.c:350: warning: format ‘%d’ expects type ‘int’, but argument 4 has type ‘loff_t’
/usr/src/kernel/rhel5/src/kernel/drivers/block/cciss.c: In function ‘cciss_init_one’:
/usr/src/kernel/rhel5/src/kernel/drivers/block/cciss.c:3444: error: ‘reset_c0’ undeclared (first use in this function)
/usr/src/kernel/rhel5/src/kernel/drivers/block/cciss.c:3444: error: (Each undeclared identifier is reported only once
/usr/src/kernel/rhel5/src/kernel/drivers/block/cciss.c:3444: error: for each function it appears in.)
/usr/src/kernel/rhel5/src/kernel/drivers/block/cciss.c: In function ‘cciss_completion_thread’:
/usr/src/kernel/rhel5/src/kernel/drivers/block/cciss.c:3792: warning: unused variable ‘i’
make[2]: *** [drivers/block/cciss.o] Error 1
make[1]: *** [_module_drivers/block] Error 2
make: *** [modules] Error 2
Created attachment 298533 [details] updated kdump patch

Sorry, I was doing a little hacking by hand and left in a variable that I should have deleted. I also cleaned up the warnings. This patch has been compile tested.

I'm wondering if a spinlock around

    if (host->busy_initializing || drv->busy_configuring)
        return -EBUSY;

may be better than just skipping the test. Comments?
A test kernel which includes this patch is available from http://people.redhat.com/coldwell/kernel/bugs/230717/ Chip
I tried kexec with this kernel on a DL-585 and it failed. Here's what I did:

# kver=`uname -r`
# kexec -l /boot/vmlinuz-$kver --initrd=/boot/initrd-$kver.img --command-line="`cat /proc/cmdline`"
# reboot

During the kexec boot, I get this message first

irq 177: nobody cared (try booting with the "irqpoll" option)

Call Trace:
 <IRQ> [<ffffffff800b799e>] __report_bad_irq+0x30/0x7d
 [<ffffffff800b7bd1>] note_interrupt+0x1e6/0x227
 [<ffffffff800b70db>] __do_IRQ+0xbd/0x103
 [<ffffffff80011e47>] __do_softirq+0x5e/0xd6
 [<ffffffff8006c3e1>] do_IRQ+0xe7/0xf5
 [<ffffffff8006ad28>] default_idle+0x0/0x50
 [<ffffffff8005d615>] ret_from_intr+0x0/0xa
 <EOI> [<ffffffff8006ad51>] default_idle+0x29/0x50
 [<ffffffff80048a90>] cpu_idle+0x95/0xb8
 [<ffffffff803d9801>] start_kernel+0x220/0x225
 [<ffffffff803d922f>] _sinittext+0x22f/0x236

handlers:
[<ffffffff8811ab1b>] (do_cciss_intr+0x0/0x8b7 [cciss])
Disabling IRQ #177

followed by an oops that ends with this:

RBP: ffff8103fe68c7c8 R08: 0000000000000001 R09: ffff8100010503d4
R10: 0000000000000010 R11: ffffffff8015bb8e R12: ffff8103fe0ebae0
R13: ffff8103fe0ebae0 R14: ffff8103fe0ebad8 R15: ffff8103fe68c8c8
FS:  00000000110d98f0(0063) GS:ffff8103ffe6cbc0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000110ef000 CR3: 00000003ffe1d000 CR4: 00000000000006e0
Process init (pid: 1, threadinfo ffff8103fff18000, task ffff8103fff037a0)
Stack: ffff8103fe68c7c0 ffff8103fdc7e680 ffff8103fe0ebae0 ffffffff8005467b
 ffff8103fe6af050 ffffffff80025656 ffff8103fff19f38 ffff8103fdc7e680
 00000000fffffffe ffff8103fe6af050 ffff8103fe6af108 ffffffff80025656
Call Trace:
 [<ffffffff8005467b>] sysfs_readdir+0x14f/0x171
 [<ffffffff80025656>] filldir+0x0/0xb7
 [<ffffffff80025656>] filldir+0x0/0xb7
 [<ffffffff80034dce>] vfs_readdir+0x77/0xa9
 [<ffffffff80038677>] sys_getdents+0x75/0xbd
 [<ffffffff8002e55c>] sys_fcntl+0x2d0/0x2dc
 [<ffffffff8005d116>] system_call+0x7e/0x83

Code: 0f 0b 68 89 ba 29 80 c2 1a 00 48 8b 55 00 48 39 da 74 1b 48
RIP [<ffffffff801466b6>] __list_add+0x24/0x68
 RSP <ffff8103fff19e88>
 <0>Kernel panic - not syncing: Fatal exception

Not sure if the oops had anything to do with cciss. I will try adding "irqpoll" to the command line and see what happens.

Chip
With "irqpoll" I still get the "nobody cared" message, but the kexec kernel does boot. Chip
You should always specify irqpoll on the command line when doing a kexec, pretty much as a rule. kexec has a difficult time quiescing and re-assigning interrupts during a reboot.
Since we're polling in cciss, interrupts are explicitly turned off, so I doubt the spurious interrupt is related to cciss.
Created attachment 298806 [details] cleanup debug in kdump patch

My testing looks good on both Intel and AMD ProLiants. There was still some debug code that bled through. This patch cleans that up. It also adds a spin_lock around the busy_initializing test. This approach is safer than just bypassing the test in a crashkernel.
I built an ia64 kernel with this patch and it resolves the issue we were seeing. Our problem presented itself as a hang rather than a panic however. With the patch the kexec'ed kernel runs properly (kdump still broken on ia64 but for other unrelated reasons).
I built a kernel combining patches #98 and #92, and removed the part mentioned in comment #39. On an i686 everything works well, but on the other machine with MSI capability (x86-64 system) I'm still getting the soft lockup when cciss.ko is loading (kdump). The kexec also doesn't work.
Would it be possible to upload _one_ patch that is the current version and obsolete all the others? There are so many patches attached here that I am not sure if I tested with the right one or not.
Created attachment 299186 [details] This patch obsoletes all others in this bug This patch combines the polling patch and the cleanup patch into one. This one patch obsoletes all others in this bug. Chip, can you build and post a kernel using this patch? -- mikem
Created attachment 299200 [details] My apologies, the last patch had compile warnings. The last patch generated compile time warnings. This cleans that up and is the only patch that should be used for testing kdump. This patch definitely obsoletes all others. Chip, please use this patch to build and post a new kernel.
If I understand comment #96 and comment #97 correctly, we have implemented a polling mode in the driver on top of a polling mode in the kernel. Is that correct?
(In reply to comment #103) > Created an attachment (id=299200) [edit] > My apologies, the last patch had compile warnings. I'm still getting them with the current patch: /usr/src/kernel/rhel5/src/kernel/drivers/block/cciss.c:193: warning: ‘print_cmd’ declared ‘static’ but never defined Looks like more debugging code that leaked through. Chip
I don't understand how. The last hunk of the patch:

@@ -293,6 +293,8 @@ print_bytes (unsigned char *c, int len,
 	}
 }
 
+#endif
+
 static void
 print_cmd(CommandList_struct *cp)
 {
@@ -339,8 +341,6 @@ print_cmd(CommandList_struct *cp)
 
 }
 
-#endif
-
 static int
 find_bus_target_lun(int ctlr, int *bus, int *target, int *lun)
 {

moves the endif to before print_cmd. Can you ensure you actually have the right patch? I obsoleted all others with my last post.

For comment #104: the kernel is not polling. All other devices get an interrupt. We poll in cciss to work around the various interrupt-related issues. For instance, in 2.6.25-rc6 interrupt-driven mode works fine even in the crashkernel. The MSI/MSI-X code is so different between -18.xxel5 and .25-rc6 you'd think you were looking at 2 different OSes.
Fresh kernels here: http://people.redhat.com/coldwell/kernel/bugs/230717/ Chip
(In reply to comment #107) > Fresh kernels here: > > http://people.redhat.com/coldwell/kernel/bugs/230717/ > > Chip > Kdump working fine w/cciss using this kernel on ia64.
as per previous comment, release note revised, added to RHEL5.2 "Resolved Issues": <quote> (x86;x86_64;ia64) Crash dumping through kexec and kdump now functions reliably with HP Smart Array controllers. Note that these controllers use the cciss driver. </quote> please advise if any further revisions are required. thanks!
Chip, Can you post the kernel-*-devel-2.6.18-86* packages for me? We're trying to build rpms for the test groups and need those packages for our build environment. Thanks, mikem
(In reply to comment #110) > Chip, Can you post the kernel-*-devel-2.6.18-86* packages for me? We're trying > to build rpms for the test groups and need those packages for our build environment. OK, done. Chip
Thanks.
We have a system that is exhibiting soft lockups with this latest kernel. The BIOS banner shows

4096 MB Installed
ProLiant System BIOS - P57 (11/08/2006)
Copyright 1982, 2006 Hewlett-Packard Development Company, L.P.
Proc 1: Dual-Core Intel(R) Xeon(TM) Processor (3.00 GHz/1333 MHz, 4MB L2)
Proc 2: Dual-Core Intel(R) Xeon(TM) Processor (3.00 GHz/1333 MHz, 4MB L2)
Power Regulator Mode: Dynamic Power Savings
Advanced Memory Protection Mode: Advanced ECC Support
Redundant ROM Detected - This system contains a valid backup system ROM.
Integrated Lights-Out 2 Advanced
iLO 2 v1.26 Nov 17 2006 192.168.52.196
Slot 1 HP Smart Array P400 Controller (512MB, v1.18) 1 Logical Drive
Slot 7 HP Smart Array P800 Controller (512MB, v2.08) 0 Logical Drives

After bringing up the system, I run

# export kver=`uname -r`
# kexec -l /boot/vmlinuz-$kver --initrd=/boot/initrd-$kver.img --command-line="`cat /proc/cmdline` irqpoll reset_devices"
# reboot

The kexec kernel does not get beyond loading the cciss driver, and these messages are displayed on the console every 10 seconds:

BUG: soft lockup - CPU#2 stuck for 10s! [insmod:479]
CPU 2:
Modules linked in: cciss(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) ehci_hcd(U) ohci_hcd(U) uhci_hcd(U)
Pid: 479, comm: insmod Tainted: G 2.6.18-86.el5.bz230717 #1
RIP: 0010:[<ffffffff8000c5f9>] [<ffffffff8000c5f9>] __delay+0xa/0x10
RSP: 0018:ffff81012ea9dd70 EFLAGS: 00000212
RAX: 00000000002d9dda RBX: ffff810037f4c000 RCX: 0000000073c24819
RDX: 0000000000000098 RSI: ffff810037f4c000 RDI: 00000000002dc493
RBP: 0000000000000000 R08: 0000000000000000 R09: ffff81000565cffc
R10: 0000000000008000 R11: ffff81012f9a0000 R12: ffff81012fd35870
R13: ffff81012ff9c000 R14: ffffffff80149d80 R15: 0000000000000202
FS:  0000000016fe8850(0063) GS:ffff81012ff24e40(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000170085df CR3: 000000012e8e5000 CR4: 00000000000006e0

Call Trace:
 [<ffffffff880b7bff>] :cciss:cciss_init_one+0x272/0x11d3
 [<ffffffff80062efb>] thread_return+0x0/0xdf
 [<ffffffff8014f2d0>] pci_device_probe+0x100/0x180
 [<ffffffff801aef9d>] driver_probe_device+0x52/0xaa
 [<ffffffff801af0cc>] __driver_attach+0x65/0xb6
 [<ffffffff801af067>] __driver_attach+0x0/0xb6
 [<ffffffff801ae9de>] bus_for_each_dev+0x43/0x6e
 [<ffffffff801ae624>] bus_add_driver+0x7e/0x130
 [<ffffffff8014f4a8>] __pci_register_driver+0x4b/0x6c
 [<ffffffff800a3d4d>] sys_init_module+0xaf/0x1e8
 [<ffffffff8005d116>] system_call+0x7e/0x83
(In reply to comment #113)
> Call Trace:
> [<ffffffff880b7bff>] :cciss:cciss_init_one+0x272/0x11d3

This corresponds to the function

    static int cciss_reset_controller(struct pci_dev *pdev, int c, ctlr_info_t *hba)

in particular, it is getting stuck in this loop:

    /* Wait some time for the scratchpad to be reset. */
    do {
        mdelay(25);
        scratchpad = readl(hba->vaddr + SA5_SCRATCHPAD_OFFSET);
    } while (scratchpad == CCISS_FIRMWARE_READY);
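The loop quoted above has no exit condition other than the register changing, which is why a controller that never clears its scratchpad produces an endless soft lockup. One possible mitigation, sketched here in plain userspace C and not taken from any posted patch, is to bound the number of polls. Everything in this block is an assumption for illustration: the callback types, `wait_for_reg_change`, the `fake_ctlr` simulation, and `FAKE_READY` (an arbitrary value, not the real CCISS_FIRMWARE_READY).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callbacks: a register read and a millisecond delay. */
typedef uint32_t (*read_reg_fn)(void *ctx);
typedef void (*delay_ms_fn)(void *ctx, unsigned int ms);

/*
 * Delay, then read, up to max_polls times. Returns 0 as soon as the
 * register no longer holds ready_value, or -1 if it never changes.
 */
int wait_for_reg_change(read_reg_fn read_reg, delay_ms_fn delay_ms, void *ctx,
                        uint32_t ready_value, unsigned int poll_ms,
                        unsigned int max_polls)
{
    unsigned int i;

    assert(read_reg != NULL && delay_ms != NULL);
    for (i = 0; i < max_polls; i++) {
        delay_ms(ctx, poll_ms);
        if (read_reg(ctx) != ready_value)
            return 0;
    }
    return -1;  /* timed out: the caller can fail the probe instead of spinning */
}

/* Simulated controller used only to exercise the helper. */
#define FAKE_READY 0xA5A5A5A5u  /* arbitrary placeholder value */

struct fake_ctlr {
    int never_resets;         /* models firmware that never clears the register */
    unsigned int reads_left;  /* reads that still return FAKE_READY */
    unsigned int elapsed_ms;  /* total simulated delay */
};

uint32_t fake_read(void *ctx)
{
    struct fake_ctlr *c = ctx;

    if (c->never_resets)
        return FAKE_READY;
    if (c->reads_left > 0) {
        c->reads_left--;
        return FAKE_READY;
    }
    return 0;
}

void fake_delay(void *ctx, unsigned int ms)
{
    struct fake_ctlr *c = ctx;

    c->elapsed_ms += ms;  /* no real sleeping in the simulation */
}
```

With a bound like this, the stuck case turns into an error return that the caller can report, rather than a CPU spinning in a delay loop.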
Hi, the RHEL5.2 release notes will be dropped to translation on April 15, 2008, at which point no further additions or revisions will be entertained. a mockup of the RHEL5.2 release notes can be viewed at the following link: http://intranet.corp.redhat.com/ic/intranet/RHEL5u2relnotesmockup.html please use the aforementioned link to verify if your bugzilla is already in the release notes (if it needs to be). each item in the release notes contains a link to its original bug; as such, you can search through the release notes by bug number. Cheers, Don
Do you see anything like:

ACPI: PCI Interrupt 0000:00:1d.0[A] -> GSI 16 (level, low) -> IRQ 169
uhci_hcd 0000:00:1d.0: UHCI Host Controller
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 1
uhci_hcd 0000:00:1d.0: irq 169, io base 0x00001000
irq 169: nobody cared (try booting with the "irqpoll" option)

Call Trace:
 <IRQ> [<ffffffff800b799e>] __report_bad_irq+0x30/0x7d
 [<ffffffff800b7bd1>] note_interrupt+0x1e6/0x227
 [<ffffffff800b70db>] __do_IRQ+0xbd/0x103
 [<ffffffff8006c3e1>] do_IRQ+0xe7/0xf5
 [<ffffffff8005d615>] ret_from_intr+0x0/0xa
 [<ffffffff80064aa8>] _spin_unlock_irqrestore+0x8/0x9
 [<ffffffff801f12d2>] i8042_interrupt+0x42/0x1ec
 [<ffffffff800108f3>] handle_IRQ_event+0x29/0x58
 [<ffffffff800b70c2>] __do_IRQ+0xa4/0x103
 [<ffffffff8006c3e1>] do_IRQ+0xe7/0xf5
 [<ffffffff8005d615>] ret_from_intr+0x0/0xa
 [<ffffffff8015bb8e>] vgacon_cursor+0x0/0x1a5
 [<ffffffff80011e3c>] __do_softirq+0x53/0xd6
 [<ffffffff8005e2fc>] call_softirq+0x1c/0x28
 [<ffffffff8006c55e>] do_softirq+0x2c/0x85
 [<ffffffff8005dc8e>] apic_timer_interrupt+0x66/0x6c
 <EOI> [<ffffffff8015bb8e>] vgacon_cursor+0x0/0x1a5
 [<ffffffff8008fd09>] vprintk+0x290/0x2dc
 [<ffffffff800fd524>] proc_mkdir_mode+0x4c/0x63
 [<ffffffff800b8a66>] register_handler_proc+0x9e/0xb0
 [<ffffffff8008fda7>] printk+0x52/0xbd
 [<ffffffff800b7681>] setup_irq+0x178/0x1c1
 [<ffffffff801de557>] usb_hcd_irq+0x0/0x55
 [<ffffffff800b777a>] request_irq+0xb0/0xd6
 [<ffffffff801de276>] usb_add_hcd+0x2fc/0x52b
 [<ffffffff801e643d>] usb_hcd_pci_probe+0x1e4/0x28b
 [<ffffffff8014f2d0>] pci_device_probe+0x100/0x180
 [<ffffffff801aef9d>] driver_probe_device+0x52/0xaa
 [<ffffffff801af0cc>] __driver_attach+0x65/0xb6
 [<ffffffff801af067>] __driver_attach+0x0/0xb6
 [<ffffffff801ae9de>] bus_for_each_dev+0x43/0x6e
 [<ffffffff801ae624>] bus_add_driver+0x7e/0x130
 [<ffffffff8014f4a8>] __pci_register_driver+0x4b/0x6c
 [<ffffffff88011060>] :uhci_hcd:uhci_hcd_init+0x60/0xa7
 [<ffffffff800a3d4d>] sys_init_module+0xaf/0x1e8
 [<ffffffff8005d116>] system_call+0x7e/0x83

handlers:
[<ffffffff801de557>] (usb_hcd_irq+0x0/0x55)
Disabling IRQ #169

before you see the cciss driver load?
hmmm... since still unresolved, reverting back to old release note: <quote> (x86;x86_64;ia64) Crash dumping through kexec and kdump may not function reliably with HP Smart Array controllers. Note that these controllers use the cciss driver. </quote> please advise (before April 15) if any further revisions are required. thanks!
I thought this Bugzilla was a "blocker" for RHEL 5.2. If so, we need to allow for further modifications to the release notes until this issue is resolved.
I can only reproduce this failure on an ML370. From the info in comment #113 it appears the same is true in Red Hat's lab. I'm working now to determine what's different on the ML370 from the other systems I've tested. Those systems include the ML570 G4, DL580 G5, and DL385 G5.
(In reply to comment #120) > I thought this Bugzilla was a "blocker" for RHEL 5.2. If so, we need to allow > for further modifications to the release notes until this issue is resolved. It is a blocker for the release. I am not sure whether we will block the release notes from going to translation on April 15, though. Instead, maybe we can put something in the release notes that says "refer to the web-based release note updates for the latest status on cciss with kdump".
I have been able to determine the problem you're seeing on the ML370 is the old firmware on the P400. For whatever reason after we issue the reset message the scratchpad register never gets reset. That's why we loop forever. Please update the firmware using the iso image available at: http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareIndex.jsp?lang=en&cc=us&prodNameId=1157689&prodTypeId=329290&prodSeriesId=1157687&swLang=8&taskId=135&swEnvOID=4004#2913 Unzip the image, burn a CD, boot to the CD, click on the Firmware Update tab, then click install. Please update the firmware on both controllers and re-run your test.
Hi, I had problems with the remote firmware update, but the good news is that kexec+kdump is now working for me on the ML370.

Chip, are you also satisfied with the patch now?

The latest kernels can be found at http://brewweb.devel.redhat.com/brew/taskinfo?taskID=1251415
Chip, We have some testers reporting the /proc/vmcore file is size zero. Any ideas what may cause this? I have not seen that problem in my testing.
(In reply to comment #127) > Chip, > We have some testers reporting the /proc/vmcore file is size zero. Any ideas > what may cause this? I have not seen that problem in my testing. Mike, Was that on ia64? If so that sounds like this issue: https://bugzilla.redhat.com/show_bug.cgi?id=434927 - Doug
(In reply to comment #126)
> Hi,
> I had problems with the remote fw update, good news is that now is kexec+kdump
> working for me on the ML370.
>
> Chip, are you also satisfied with patch now ?
>
> Latest kernels could be found on
> http://brewweb.devel.redhat.com/brew/taskinfo?taskID=1251415

I can kexec with this kernel, although I still get this message on bootup:

irq 177: nobody cared (try booting with the "irqpoll" option)

Call Trace:
 <IRQ> [<ffffffff800b7a5c>] __report_bad_irq+0x30/0x7d
 [<ffffffff800b7c8f>] note_interrupt+0x1e6/0x227
 [<ffffffff800b7199>] __do_IRQ+0xbd/0x103
 [<ffffffff80011ed2>] __do_softirq+0x5e/0xd6
 [<ffffffff8006c3e1>] do_IRQ+0xe7/0xf5
 [<ffffffff8006ad28>] default_idle+0x0/0x50
 [<ffffffff8005d615>] ret_from_intr+0x0/0xa
 <EOI> [<ffffffff800d464b>] cache_reap+0x0/0x219
 [<ffffffff8006ad51>] default_idle+0x29/0x50
 [<ffffffff80048ae4>] cpu_idle+0x95/0xb8
 [<ffffffff803d9801>] start_kernel+0x220/0x225
 [<ffffffff803d922f>] _sinittext+0x22f/0x236

handlers:
[<ffffffff8812cb1b>] (do_cciss_intr+0x0/0x8b7 [cciss])
Disabling IRQ #177

(I am booting the kexec kernel with the "irqpoll" option.)
For comment #128: this is not ia64. For comment #129: I sometimes see similar messages such as what I posted in comment #118. I don't think it has anything to do with cciss. Comments?
I've instrumented the code a little bit and found that the reason we get this error

    cciss: MSI-X init failed -22

is because of this test in drivers/pci/msi.c:pci_enable_msix

    pci_read_config_word(dev, msi_control_reg(pos), &control);
    if (control & PCI_MSIX_FLAGS_ENABLE)
        return -EINVAL; /* Already in MSI-X mode */

In the kexec kernel, the value read from the control register is 0x00008003 and

    #define PCI_MSIX_FLAGS_ENABLE (1 << 15)

In other words, MSIX was enabled by the pre-kexec kernel, and this state remains in the PCI configuration memory of the device after the kexec kernel starts. So this check succeeds, and thus the kexec kernel fails to allocate an MSIX vector to the CCISS device.

I took a peek upstream, and it appears that the sanity checks in pci_enable_msix have been consolidated into one function, pci_msi_check_device,

    commit 24334a12533e9ac70dcb467ccd629f190afc5361
    Author: Brice Goglin <brice>
    Date: Thu Aug 31 01:55:07 2006 -0400

        MSI: Factorize common code in pci_msi_supported()

        pci_enable_msi() and pci_enable_msix() use the same code to detect
        whether MSI might be enabled on this device. Factorize this code in
        pci_msi_supported(). And improve the documentation about the fact
        that only the root chipset must support MSI, but it is hard to find
        the root bus so we check all parent busses MSI flags.

        Signed-off-by: Brice Goglin <brice>
        Signed-off-by: Greg Kroah-Hartman <gregkh>

As Mike Miller pointed out, there has been a fair amount of churn in the MSI code upstream. It appears that during this process, this particular test (checking the PCI configuration space MSI control register to see if MSI is already enabled) got dropped, although I haven't been able to find a commit log that specifically says this was intentional. I am continuing to dig around.

Chip
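As a worked check of the arithmetic above, this small userspace sketch decodes the 0x8003 control word. The enable bit matches the PCI_MSIX_FLAGS_ENABLE definition quoted in the comment; the table-size mask (low 11 bits, encoded as N-1) follows the standard MSI-X Message Control layout. The `SKETCH_` macros and the function names are local to this example, not kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MSIX_ENABLE     (1u << 15)  /* same bit as PCI_MSIX_FLAGS_ENABLE */
#define SKETCH_MSIX_TABLE_MASK 0x07FFu     /* table size field, encoded as N-1 */

/* Nonzero if the device still reports MSI-X as enabled. */
int msix_already_enabled(uint16_t control)
{
    return (control & SKETCH_MSIX_ENABLE) != 0;
}

/* Number of MSI-X table entries the device advertises. */
unsigned int msix_table_entries(uint16_t control)
{
    unsigned int n = (control & SKETCH_MSIX_TABLE_MASK) + 1u;

    assert(n >= 1u && n <= 2048u);  /* range allowed by the 11-bit field */
    return n;
}

/* The control word with only the enable bit cleared. */
uint16_t msix_clear_enable(uint16_t control)
{
    return (uint16_t)(control & ~SKETCH_MSIX_ENABLE);
}
```

For 0x8003 the enable bit is set and the size field is 3, i.e. a 4-entry table; clearing the enable bit leaves 0x0003, at which point a check like the one quoted above would no longer return -EINVAL.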
I also came to that conclusion but I'm not sure it's absolutely correct. I notice on my Opteron systems that MSI-X always fails even when booting a fresh "production" kernel. I'm adding debug to see how the MSI-X table is being initialized. I also got my hands on a Hardware Diagnostic Tool to try and see what the hardware tells me. Now I'm getting dangerous. :)
I've tested the 2.6.18-86.el5.bz230717 kernel RPM on rhel5.2s3 (installed using --oldpackage) and I was able to do multiple kdumps on two different ia64-based rx6600s (one with 4GiB memory and the other with 96GiB). I used 'crashkernel=768M' on both machines.

I've also tried using 2.6.18-86.el5.bz230717 on rhel5.2s2, but it fails there. Even after upgrading kexec-tools to the rhel5.2s3 version (1.102pre-16.el5) it wouldn't work. You really need to start with an rhel5.2s3 install.

Chuck Morrison tested kdump using the 2.6.18-86.el5.bz230717 kernel on rhel5.2s3 on a BL860c and it worked on that machine. Marilise Cover also tested kdump on an rx2660 using the same snapshot3+2.6.18-86.el5.bz230717 combination with the same positive result.
Chip, for comment #132: when reset_devices is set (and it should be with kdump), pci_enable_msix shouldn't even be called.

    /* If the kernel supports MSI/MSI-X we will try to enable that functionality,
     * else we use the IO-APIC interrupt assigned to us by system ROM. If we're
     * booting into a crashkernel we use polling mode.
     */
    if (!reset_devices)
        cciss_interrupt_mode(c, pdev, board_id);
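The decision described in that code comment can be written out as a small pure function. This is only a sketch of the policy as stated there (polling in a crashkernel, otherwise MSI-X or MSI if available, otherwise the legacy IO-APIC interrupt); the enum, the function name, and the preference of MSI-X over MSI are assumptions made for illustration, not the driver's actual interface.

```c
#include <assert.h>

enum irq_mode {
    IRQ_MODE_POLL,  /* crashkernel: no interrupts at all */
    IRQ_MODE_MSIX,
    IRQ_MODE_MSI,
    IRQ_MODE_INTX   /* legacy interrupt assigned by system ROM */
};

/*
 * reset_devices models the kernel flag of the same name; msix_ok and
 * msi_ok model whether enabling each mechanism would succeed.
 */
enum irq_mode choose_irq_mode(int reset_devices, int msix_ok, int msi_ok)
{
    /* Flags are treated as booleans in this sketch. */
    assert((reset_devices == 0 || reset_devices == 1) &&
           (msix_ok == 0 || msix_ok == 1) && (msi_ok == 0 || msi_ok == 1));

    if (reset_devices)
        return IRQ_MODE_POLL;  /* MSI/MSI-X setup is never attempted */
    if (msix_ok)
        return IRQ_MODE_MSIX;
    if (msi_ok)
        return IRQ_MODE_MSI;
    return IRQ_MODE_INTX;
}
```

The first branch is the point being made above: with reset_devices set, the MSI-X enable path is skipped entirely, so its -EINVAL cannot be what breaks the polling-mode patch.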
Tom, Chip, Mike - is it possible to get a quick summary of this issue? Specific questions:

1) Are there specific models where kdump won't work reliably, or is this across the board for CCISS?
2) Is the scope of the issue that kdump "won't work reliably", or could this cause other problems (like causing a panic that would not have happened if kdump wasn't running)?

Thanks, Sam
My quick summary is kdump is working with the later cciss controllers. The controller firmware must be updated. We know that early versions of controller firmware may have issues and recommend users update the firmware using the Firmware Maintenance CD available on hp.com. We are still performing regression tests in our labs to determine exactly what controller/firmware combinations are required.
Created attachment 302760 [details] use PCI power management to reset the controller Hi Mike, We have developed an alternate approach to your polling mode, and we would appreciate any feedback you can give us. The idea is to use PCI power management to reset the controller, and possibly reset the MSI/MSI-X configuration if necessary on kexec. This approach seems to be a much lighter touch to the driver, and has been successfully tested on x86_64 and ia64. The patch is attached; note that cciss_reset_controller is defined but never used and cciss_hard_reset_controller/cciss_reset_msi are the functions that get used. Kernels built with this patch are available from http://people.redhat.com/coldwell/kernel/bugs/230717/cmc/ (n.b. trailing /cmc/ on the URL). Chip
Chip,
Overall it looks pretty good, but

1. I pulled down the x86_64 kernel from the link in comment #143. Your patch was not in there.
2. Why leave cciss_reset_controller if it's not used?
3. Your patch still puts us into polling mode. See below:

[root@rover ~]# cat /proc/cmdline
root=/dev/VolGroup00/LogVol00 3 irq_poll max_cpus=1 reset_devices memmap=exactmap memmap=640K@0K memmap=5112K@16384K memmap=59768K@22136K elfcorehdr=81904K memmap=32K#2095424K

[root@rover ~]# cat /proc/interrupts
           CPU0       CPU1
  0:     608922          0    IO-APIC-edge  timer
  1:         36        127    IO-APIC-edge  i8042
  4:         12          0    IO-APIC-edge  serial
  5:         38          0   IO-APIC-level  ohci_hcd:usb2, ohci_hcd:usb3, ehci_hcd:usb4
  8:          1          0    IO-APIC-edge  rtc
  9:          0          0   IO-APIC-level  acpi
 12:        104          0    IO-APIC-edge  i8042
 14:         26       4304    IO-APIC-edge  ide0
 58:         97          3   IO-APIC-level  uhci_hcd:usb1
201:     205364       2368   IO-APIC-level  eth0
209:         47          0   IO-APIC-level  ioc0
NMI:        216         86
LOC:     608836     608766
ERR:          1
MIS:          0

[root@rover ~]# dmesg
~~~~~~~~~~~~~~~SNIP
HP CISS Driver (v 3.6.20-RH1)
cciss: using PCI PM to reset controller
ACPI: PCI Interrupt 0000:46:00.0[A] -> GSI 32 (level, low) -> IRQ 217
cciss0: <0x3230> at PCI 0000:46:00.0 IRQ 0 using DAC
 blocks= 429925920 block_size= 512
 heads= 255, sectors= 32, cylinders= 52687

 blocks= 429925920 block_size= 512
 heads= 255, sectors= 32, cylinders= 52687

 cciss/c0d0: p1 p2
~~~~~~~~~~~~~~~SNIP

Note IRQ0 is used.

I like using the PCI power management, that's a good idea. I plan to steal it for an internal request. :) It's much quicker than doing the soft reset. Not sure why that is the case.

Did you mean to strip out my patch completely and replace it with yours? I'm guessing that's why we're still polling.

-- mikem
(In reply to comment #145)
> 2. Why leave cciss_reset_controller if it's not used?

I left it in there for testing purposes only. The "Reset Controller" message CDB seems to put the firmware into a strange state; although, even the PCI power management reset doesn't seem to do the same thing as a warm boot (e.g. the MSI-X bit remains set in PCI configuration space).

> 3. Your patch still puts us into polling mode. See below:
>
> [root@rover ~]# cat /proc/cmdline
> root=/dev/VolGroup00/LogVol00 3 irq_poll max_cpus=1 reset_devices
> memmap=exactmap memmap=640K@0K memmap=5112K@16384K memmap=59768K@22136K
> elfcorehdr=81904K memmap=32K#2095424K
>
> [root@rover ~]# cat /proc/interrupts
>            CPU0       CPU1
>   0:     608922          0    IO-APIC-edge  timer
>   1:         36        127    IO-APIC-edge  i8042
>   4:         12          0    IO-APIC-edge  serial
>   5:         38          0   IO-APIC-level  ohci_hcd:usb2, ohci_hcd:usb3,
> ehci_hcd:usb4
>   8:          1          0    IO-APIC-edge  rtc
>   9:          0          0   IO-APIC-level  acpi
>  12:        104          0    IO-APIC-edge  i8042
>  14:         26       4304    IO-APIC-edge  ide0
>  58:         97          3   IO-APIC-level  uhci_hcd:usb1
> 201:     205364       2368   IO-APIC-level  eth0
> 209:         47          0   IO-APIC-level  ioc0
> NMI:        216         86
> LOC:     608836     608766
> ERR:          1
> MIS:          0
>
> [root@rover ~]# dmesg
> ~~~~~~~~~~~~~~~SNIP
> HP CISS Driver (v 3.6.20-RH1)
> cciss: using PCI PM to reset controller
> ACPI: PCI Interrupt 0000:46:00.0[A] -> GSI 32 (level, low) -> IRQ 217
> cciss0: <0x3230> at PCI 0000:46:00.0 IRQ 0 using DAC
> blocks= 429925920 block_size= 512
> heads= 255, sectors= 32, cylinders= 52687
>
> blocks= 429925920 block_size= 512
> heads= 255, sectors= 32, cylinders= 52687
>
> cciss/c0d0: p1 p2
> ~~~~~~~~~~~~~~~SNIP
>
> Note IRQ0 is used.

That's strange. On my test machines, it does not do that. The "irq_poll" kernel command line should cause the kernel to poll IRQs, however.

> Did you mean to strip out my patch completely and replace it with yours? I'm
> guessing that's why we're still polling.

Yes; my patch applies to the original cciss.c source code. Which is why I'm surprised that you're seeing it grab IRQ 0. We don't see that here.

There is a problem with this version, however. I discovered today that on an ia64 system with two cciss HBAs, the kexec kernel is able to use the one that does MSI-X, but not the one that uses the standard irq via the IO-SAPIC. Ugh. I'm digging into that right now.

Chip
Chip, I reversed my patch and now your patch is working as you expected, except that MSI-X initialization still fails. As I've said before, I do not believe the MSI-X issue is related to kexec or the crashkernel. MSI-X init fails even when booting normally. I suggest we open a new BZ for the MSI-X failure. -- mikem
(In reply to comment #147) > > There is a problem with this version, however. I discovered today that on an > ia64 system with two cciss hba's, the kexec kernel is able to use the one that > does MSI-X, but not the one that uses the standard irq via the IO-SAPIC. Ugh. > I'm digging into that right now. I got to the bottom of this just now. The issue was that the P600 controller was taking much longer to recover from the PCI power-management reset than the other controller in that system. What I did was to use the "No-op" CDB message to determine when the controller had recovered from reset. This takes a couple of minutes, but it kexecs fine if one waits long enough. I'm going to respin my patch and post some updated kernels later today. Chip
(In reply to comment #147) > Chip, > I reversed my patch and now your patch is working as you expected, except that > MSI-X initialization still fails. Just to clarify; MSI-X initialization is failing in the original kernel and the kexec kernel, right? And even if MSI-X initialization fails, you are able to get a core dump, right? > As I've said before, I do not believe the > MSI-X issue is related to kexec or the crashkernel. MSI-X init fails even when > booting normally. > I suggest we open a new BZ for the MSI-X failure. If you are able to get a core dump even when MSI-X initialization fails, then indeed it is a separate issue and should have a new BZ. Chip
Correct, I was able to get a core dump even though MSI-X init failed. I'm trying to figure out how to use this hardware diagnostic tool. I'm hoping we can pinpoint the root cause.
Created attachment 302938 [details] Use PCI power management to reset the controller

Proposed fix for bz230717

The proposed fix resets the CCISS hardware in three steps in the kexec kernel:

1. Use PCI power management states to reset the controller in the kexec kernel.
2. Clear the MSI/MSI-X bits in PCI configuration space so that MSI initialization in the kexec kernel doesn't fail.
3. Use the CCISS "No-op" message to determine when the controller firmware has recovered from the PCI PM reset.
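The three steps above can be sketched against a simulated device. This is not the attached patch: `struct fake_dev`, the helper names, and the probe-count return value are all hypothetical, and real code would go through the PCI config-space accessors with delays between steps. The register bit positions are the standard ones (PM control/status low two bits for the power state, MSI enable at bit 0, MSI-X enable at bit 15).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PM_STATE_MASK  0x0003u  /* power state field in PM control/status */
#define PM_STATE_D0    0x0000u
#define PM_STATE_D3HOT 0x0003u
#define MSI_ENABLE     0x0001u  /* MSI message control, bit 0 */
#define MSIX_ENABLE    0x8000u  /* MSI-X message control, bit 15 */

/* Simulated slice of a controller's state (hypothetical). */
struct fake_dev {
    uint16_t pm_ctrl;
    uint16_t msi_ctrl;
    uint16_t msix_ctrl;
    unsigned int noops_until_ready;  /* No-op probes that fail before recovery */
};

/* Step 1: cycle D3hot -> D0. As noted in the thread, the MSI-X bit survives this. */
static void pm_reset(struct fake_dev *d)
{
    d->pm_ctrl = (uint16_t)((d->pm_ctrl & ~PM_STATE_MASK) | PM_STATE_D3HOT);
    d->pm_ctrl = (uint16_t)((d->pm_ctrl & ~PM_STATE_MASK) | PM_STATE_D0);
}

/* Step 2: clear the stale MSI and MSI-X enable bits. */
static void clear_msi_bits(struct fake_dev *d)
{
    d->msi_ctrl = (uint16_t)(d->msi_ctrl & ~MSI_ENABLE);
    d->msix_ctrl = (uint16_t)(d->msix_ctrl & ~MSIX_ENABLE);
}

/* Step 3 helper: one simulated No-op probe; 1 means the firmware answered. */
static int noop_probe(struct fake_dev *d)
{
    if (d->noops_until_ready > 0) {
        d->noops_until_ready--;
        return 0;
    }
    return 1;
}

/* Runs all three steps; returns the number of probes used, or -1 on timeout. */
int reset_sequence(struct fake_dev *d, unsigned int max_probes)
{
    unsigned int probes;

    assert(d != NULL);
    pm_reset(d);
    clear_msi_bits(d);
    for (probes = 1; probes <= max_probes; probes++) {
        if (noop_probe(d))
            return (int)probes;
    }
    return -1;
}
```

Bounding the No-op probes matters because, as reported elsewhere in this bug, different controllers take very different amounts of time to come back after the reset.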
Created attachment 303074 [details] New rev of previous patch for Smart Array 5i

Further testing revealed that the SmartArray 5i controller needs a long pause between the PCI reset and the first No-op probe. This patch implements a 30s pause for all device types, just in case there are others out there in the wild with the same quirk.

This patch has been tested on the following controllers:

HP Smart Array 5i Controller
Board ID: 0x40800e11
Firmware Version: 2.62 (x86_64)

HP Smart Array P400 Controller
Board ID: 0x3234103c
Firmware Version: 2.08 (ia64) & 4.12 (x86_64)

HP Smart Array P600 Controller
Board ID: 0x3225103c
Firmware Version: 1.88 (ia64)

HP Smart Array P800 Controller
Board ID: 0x3223103c
Firmware Version: 4.12 (x86_64)

Pre-built binary kernels with this patch are available from http://people.redhat.com/coldwell/kernel/bugs/230717/

Any further testing/reports are much appreciated.

Chip
Kdump is successful here on i386:

cciss0: HP Smart Array 5i Controller
Board ID: 0x40800e11
Firmware Version: 2.58

Mike, do you have any objections related to the latest patch?
Tomas/Chip: I'm OK with this latest patch. When can we expect to see it in a snapshot? -- mikem
(In reply to comment #155) > Tomas/Chip: > I'm OK with this latest patch. When can we expect to see it in a snapshot? Possibly as soon as tomorrow. In the meantime, there are test kernels available from the URL in comment #152. Chip
We have pulled down those latest kernels and testing is getting underway locally.
in kernel-2.6.18-91.el5 You can download this test kernel from http://people.redhat.com/dzickus/el5
Greetings Red Hat Partner, A fix for this issue should be included in the latest packages contained in RHEL5.2-Snapshot7--available now on partners.redhat.com. We are nearing GA for 5.2--this is the last opportunity to test and confirm that your issue is fixed. After you (Red Hat Partner) have verified that this issue has been addressed, please perform the following: 1) Change the *status* of this bug to VERIFIED. 2) Add *keyword* of PartnerVerified (leaving the existing keywords unmodified) If this issue is not fixed, please add a comment describing the most recent symptoms of the problem you are having and change the status of the bug to ASSIGNED. If you are receiving this message in Issue Tracker, please reply with a message to Issue Tracker about your results and I will update bugzilla for you. If you need assistance accessing ftp://partners.redhat.com, please contact your Partner Manager. Thank you
HP retested with RHEL 5.2, Snapshot 7. We had to modify grub to reserve space for the crashkernel for Snapshot7 to work. Is this expected behavior? If so, does RH document this somewhere?
Sandy,

Please see the following knowledge base article:
http://kbase.redhat.com/faq/FAQ_105_9036.shtm

<snip>
How to configure kdump

1. Verify the kexec-tools package is installed:
   # rpm -q kexec-tools
2. Configure the /etc/kdump.conf file to specify the location where the vmcore should be dumped. This can be another server via scp, a RAW device, or a local filesystem.
3. Modify some boot parameters to reserve a chunk of memory for the capture kernel. For i386 and x86_64 architectures, edit /etc/grub.conf, and append crashkernel=128M@16M to the end of the kernel line.
<snip>
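To confirm that step 3 took effect after a reboot, the running kernel's command line can be inspected for the crashkernel= parameter. The following is a minimal sketch, not part of kexec-tools: the helper name crashkernel_value is hypothetical, and it assumes a POSIX shell and a command line without glob characters.

```shell
#!/bin/sh
# crashkernel_value CMDLINE
# Print the value of the first crashkernel= parameter in CMDLINE and
# return 0, or return 1 if the parameter is absent.
# (Hypothetical helper for illustration only.)
crashkernel_value() {
    # Unquoted on purpose: split the command line into words.
    for arg in $1; do
        case "$arg" in
            crashkernel=*)
                printf '%s\n' "${arg#crashkernel=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Example: inspect the running kernel when /proc/cmdline is readable.
if [ -r /proc/cmdline ]; then
    if value=$(crashkernel_value "$(cat /proc/cmdline)"); then
        echo "crashkernel reservation requested: $value"
    else
        echo "no crashkernel= parameter on the kernel command line"
    fi
fi
```

If the parameter is missing, the capture kernel has no reserved memory to load into, which matches the symptom reported for Snapshot 7 before grub was modified.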
Responding to Comment #78 from me...success using:
http://people.redhat.com/dzickus/el5/92.el5/x86_64/kernel-2.6.18-92.el5.x86_64.rpm

System: HP ProLiant DL365 with Dual-Core AMD Opteron™ Processor 2216 (2.4 GHz) - Embedded Smart Array SAS Controller P400i

# uname -a
Linux dl365g1 2.6.18-92.el5 #1 SMP Tue Apr 29 13:16:15 EDT 2008 x86_64 x86_64 xx

# cat /proc/cmdline
ro root=LABEL=/ rhgb quiet console=ttyS0 crashkernel=64M@16M

# grep KDUMP_COMMANDLINE_APPEND /etc/sysconfig/kdump
KDUMP_COMMANDLINE_APPEND="irqpoll maxcpus=1 reset_devices"

# ll -h /var/crash/2008-05-01-17\:52/vmcore
-r-------- 1 root root 5.9G May 1 17:53 /var/crash/2008-05-01-17:52/vmcore

# file vmcore
vmcore: ELF 64-bit LSB core file AMD x86-64, version 1 (SYSV), SVR4-style
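The grep check above can be repeated on other test systems with a small script. This is a sketch under stated assumptions: the helper name kdump_append_has is hypothetical, and it assumes /etc/sysconfig/kdump is a shell-style variable file that is safe to source (it is read in a subshell so nothing leaks into the caller).

```shell
#!/bin/sh
# kdump_append_has OPTION [FILE]
# Return 0 if OPTION appears as a whole word in KDUMP_COMMANDLINE_APPEND
# as defined in FILE (default: /etc/sysconfig/kdump), otherwise return 1.
# (Hypothetical helper for illustration only.)
kdump_append_has() {
    opt=$1
    file=${2:-/etc/sysconfig/kdump}
    (
        KDUMP_COMMANDLINE_APPEND=
        # Source the file in this subshell only.
        . "$file" || exit 1
        for word in $KDUMP_COMMANDLINE_APPEND; do
            [ "$word" = "$opt" ] && exit 0
        done
        exit 1
    )
}

# Example: report on the three options shown in the comment above,
# skipped when the file is not present on this machine.
if [ -r /etc/sysconfig/kdump ]; then
    for opt in irqpoll maxcpus=1 reset_devices; do
        if kdump_append_has "$opt"; then
            echo "present: $opt"
        else
            echo "missing: $opt"
        fi
    done
fi
```

Matching whole words rather than substrings avoids a false positive when one option name is a prefix of another.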
Passed verification. Chip and Tomas, thanks for your help in resolving this issue. I plan to steal your PCI power management reset in an internal request. :)
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2008-0314.html