Description of problem:
Booting installation image of RHEL5.8 on storageqe-02.rhts.eng.bos.redhat.com ends up with kernel panic in cciss module.

Version-Release number of selected component (if applicable):
kernel 2.6.18-304.el5
Linux version 2.6.18-304.el5 (mockbuild.bos.redhat.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-50)) #1 SMP Mon Jan 9 18:12:44 EST 2012

How reproducible:
???

Steps to Reproduce:
1. submit attached job

Actual results:
kernel-panic during boot

Expected results:
normal boot

Additional info:
Oops: 0002 [#1] SMP
last sysfs file: /firmware/edd/int13_dev80/mbr_signature
Modules linked in: xts xcbc wp512 twofish tgr192 tea sha512 sha256 serpent seqiv michael_mic md5 md4 khazad hmac gf128mul eseqiv ecb des crypto_null deflate zlib_deflate ctr cryptomgr crypto_hash chainiv ccm cbc cast6 cast5 blowfish authenc crypto_blkcipher anubis krng ansi_cprng rng aes_generic aead dm_crypt crypto_algapi dm_emc dm_round_robin dm_multipath scsi_dh dm_snapshot dm_mirror dm_zero lock_nolock gfs2 ext3 jbd ext4 crc16 jbd2 msdos dm_raid45 dm_message dm_mem_cache dm_region_hash dm_log dm_mod raid456 xor raid10 raid1 raid0 ata_piix libata cciss be2net 8021q bnx2 be2iscsi ehci_hcd uhci_hcd ib_ipoib ib_cm ib_sa ib_mad ib_core ipoib_helper iscsi_ibft iscsi_tcp libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi sr_mod sd_mod scsi_mod ide_cd cdrom ipv6 xfrm_nalgo crypto_api squashfs pcspkr edd loop nfs nfs_acl lockd sunrpc vfat fat cramfs
CPU: 0
EIP: 0060:[<c0624b83>] Not tainted VLI
EFLAGS: 00010006 (2.6.18-304.el5 #1)
EIP is at _spin_lock_irqsave+0x3/0x27
eax: 000008b4 ebx: 000008b4 ecx: 00000286 edx: 00000206
esi: 00000000 edi: 00000000 ebp: dfc35078 esp: c0753f9c
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, ti=c0753000 task=c06933c0 task.ti=c070e000)
Stack: f8b6815d 00000001 000008b4 f7800000 00000000 c070ef80 00000000 c0704b08
       0000000f c31bce80 dfc35080 00000001 c0704b20 0000000a c04e59ba c0753fd8
       c0753fd8 c070ef68 c042a91d 00000000 c070ef68 c070e000 00000046 00000063
Call Trace:
 [<f8b6815d>] cciss_softirq_done+0x245/0x35f [cciss]
 [<c04e59ba>] blk_done_softirq+0x4d/0x58
 [<c042a91d>] __do_softirq+0x87/0x114
 [<c04073f9>] do_softirq+0x4e/0x92
 [<c04507b8>] __do_IRQ+0x0/0x118
 [<c04074f4>] do_IRQ+0xb7/0xc3
 [<c040597a>] common_interrupt+0x1a/0x20
 [<c04031f7>] mwait_idle_with_hints+0x4b/0x4f
 [<c0403207>] mwait_idle+0xc/0x1b
 [<c0403d14>] cpu_idle+0x9f/0xb9
 [<c07139fc>] start_kernel+0x37b/0x383
=======================
Code: d0 c3 f0 81 00 00 00 00 01 8b 04 24 e9 99 5b e0 ff f0 ff 00 8b 04 24 e9 8e 5b e0 ff b2 01 86 10 8b 04 24 e9 82 5b e0 ff 9c 5a fa <f0> fe 08 79 1c f7 c2 00 02 00 00 74 0b fb f3 90 80 38 00 7e f9
EIP: [<c0624b83>] _spin_lock_irqsave+0x3/0x27 SS:ESP 0068:c0753f9c
<0>Kernel panic - not syncing: Fatal exception in interrupt
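As a reading aid for the "Oops: 0002" value in the trace above: on x86 this number is the page-fault error code, and its low three bits encode the cause, the access type, and the CPU mode. The sketch below decodes only those three bits; the function name decode_oops_error_code is hypothetical and not part of any kernel tooling.

```python
def decode_oops_error_code(code_hex: str) -> dict:
    """Decode the low three bits of an x86 page-fault error code.

    Hypothetical helper for illustration only; it is not part of any
    kernel tool. Higher bits of the error code are ignored here.
    """
    code = int(code_hex, 16)
    return {
        # bit 0: 0 = page not present, 1 = protection violation
        "cause": "protection violation" if code & 0x1 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if code & 0x2 else "read",
        # bit 2: 0 = kernel mode, 1 = user mode
        "mode": "user" if code & 0x4 else "kernel",
    }


if __name__ == "__main__":
    # "0002" is the value printed in the oops above.
    print(decode_oops_error_code("0002"))
```

For 0002 this reads as a kernel-mode write to a page that is not present, which would be consistent with taking the spinlock through a small invalid pointer (eax is 000008b4 in the register dump). That reading is an interpretation of the trace, not a confirmed diagnosis.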
Created attachment 559073 [details] serial console log
Marian, how reproducible is this? Does it happen on every boot?
(In reply to comment #2)
> Marian,
> how reproducible is this? Does it happen on every boot?

Could I use that machine?
I have a few jobs queued on that machine; I will update with a better reproducibility estimate later. Consider the machine in use for now. I will ping you once it is free for your experiments.
So far reproducibility 0 out of 4 jobs.
(In reply to comment #5)
> So far reproducibility 0 out of 4 jobs.

The problem is isolated to this single machine, so there is not much we can do about it now; the likelihood that this is a hw problem is high.

Mike, the problem was reported twice on this particular machine, but it can still be a hw problem. Does the bug description look familiar to you?
Created attachment 559314 [details]
System ROM update

We've seen a few failures during kdump testing on the DL380 G5, but I've never seen this during install. The system ROM is very outdated. I'm attaching the latest image available on hp.com.

NOTE: The update is listed as critical. Copy this file to the system and execute it. Do not interrupt the flashing process or it may trash the system.
Hi Marian, have you had a chance to test with the new firmware?
Hi Tomas, I have not retested: I was unable to reproduce the problem with the old firmware, so the confidence in such testing would be low. Closing as CANTFIX - Not Reproducible.