Bug 417661
Summary: | hpacucli cause kernel panic | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Con Tassios <ct> |
Component: | kernel | Assignee: | Tony Camuso <tcamuso> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 5.1 | CC: | bdonahue, bernhard.furtmueller, david.elliott, dzickus, kremzeek, mike.miller, peterm, pmitcheson, scott.benesh, syeghiay, tcamuso, tjp, vendor-redhat |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2012-10-01 19:44:14 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 448732, 483701, 533192 |
Description
Con Tassios
2007-12-10 04:34:58 UTC
I am seeing this too. I suggest that the priority be moved to high. As it is at the moment, EL5 cannot be used on HP hardware, which is a serious problem for me and I'm sure for many other people. Red Hat, do we expect a fix from you or from HP?

Here may be something useful from /var/log/messages:

Dec 12 13:22:54 capfs1 kernel: blocks= 2293469488 block_size= 512
Dec 12 13:22:55 capfs1 snmpd[3393]: Connection from UDP: [127.0.0.1]:32779
Dec 12 13:22:56 capfs1 kernel: blocks= 2293469488 block_size= 512
Dec 12 13:22:56 capfs1 kernel: kobject_add failed for cciss!c2d1 with -EEXIST, don't try to register things with the same name in the same directory.
Dec 12 13:22:56 capfs1 kernel:
Dec 12 13:22:56 capfs1 kernel: Call Trace:
Dec 12 13:22:56 capfs1 kernel: [<ffffffff8014115f>] kobject_add+0x16e/0x199
Dec 12 13:22:56 capfs1 kernel: [<ffffffff80058ee5>] exact_lock+0x0/0x14
Dec 12 13:22:56 capfs1 kernel: [<ffffffff800fcf4c>] register_disk+0x43/0x199
Dec 12 13:22:56 capfs1 kernel: [<ffffffff80138d9a>] add_disk+0x34/0x3d
Dec 12 13:22:56 capfs1 kernel: [<ffffffff880b49a7>] :cciss:rebuild_lun_table+0x48f/0x50f
Dec 12 13:22:56 capfs1 kernel: [<ffffffff800c35ff>] zone_statistics+0x3e/0x6d
Dec 12 13:22:56 capfs1 kernel: [<ffffffff880b4e21>] :cciss:cciss_ioctl+0x3fa/0xc65
Dec 12 13:22:56 capfs1 kernel: [<ffffffff80142a05>] snprintf+0x44/0x4c
Dec 12 13:22:56 capfs1 kernel: [<ffffffff8002ae65>] flush_tlb_page+0xac/0xda
Dec 12 13:22:56 capfs1 kernel: [<ffffffff80010b1d>] do_wp_page+0x246/0x67d
Dec 12 13:22:56 capfs1 kernel: [<ffffffff8003a613>] d_lookup+0x1e/0x42
Dec 12 13:22:56 capfs1 kernel: [<ffffffff880b56b6>] :cciss:do_ioctl+0x2a/0x39
Dec 12 13:22:56 capfs1 kernel: [<ffffffff880b5774>] :cciss:cciss_compat_ioctl+0xaf/0x25f
Dec 12 13:22:56 capfs1 kernel: [<ffffffff80022c4a>] flush_tlb_others+0x84/0xbc
Dec 12 13:22:56 capfs1 kernel: [<ffffffff80022c5f>] flush_tlb_others+0x99/0xbc
Dec 12 13:22:56 capfs1 kernel: [<ffffffff80021d28>] __up_read+0x19/0x7f
Dec 12 13:22:56 capfs1 kernel: [<ffffffff80064a9d>] do_page_fault+0x4eb/0x81d
Dec 12 13:22:56 capfs1 kernel: [<ffffffff8000df59>] free_pages_and_swap_cache+0x73/0x8f
Dec 12 13:22:56 capfs1 kernel: [<ffffffff80137ffc>] compat_blkdev_ioctl+0x4c/0x5f
Dec 12 13:22:56 capfs1 kernel: [<ffffffff800edf00>] compat_sys_ioctl+0xc5/0x2b1
Dec 12 13:22:56 capfs1 kernel: [<ffffffff8005f49b>] sysenter_do_call+0x1b/0x67

Thanks,
Paul

Hi,

We're seeing the same problem here with a DL580G4 server:

Jan 8 17:05:12 localhost kernel: kobject_add failed for cciss!c2d1 with -EEXIST, don't try to register things with the same name in the same directory.
Jan 8 17:05:12 localhost kernel:
Jan 8 17:05:12 localhost kernel: Call Trace:
Jan 8 17:05:12 localhost kernel: [<ffffffff80141148>] kobject_add+0x16e/0x199
Jan 8 17:05:12 localhost kernel: [<ffffffff80058eec>] exact_lock+0x0/0x14
Jan 8 17:05:12 localhost kernel: [<ffffffff800fcf35>] register_disk+0x43/0x199
Jan 8 17:05:12 localhost kernel: [<ffffffff80138d83>] add_disk+0x34/0x3d
Jan 8 17:05:12 localhost kernel: [<ffffffff880b49a7>] :cciss:rebuild_lun_table+0x48f/0x50f
Jan 8 17:05:12 localhost kernel: [<ffffffff880b4e21>] :cciss:cciss_ioctl+0x3fa/0xc65
Jan 8 17:05:12 localhost kernel: [<ffffffff801429ee>] snprintf+0x44/0x4c
Jan 8 17:05:12 localhost kernel: [<ffffffff8002ae6c>] flush_tlb_page+0xac/0xda
Jan 8 17:05:12 localhost kernel: [<ffffffff80010b0c>] do_wp_page+0x246/0x67d
Jan 8 17:05:12 localhost kernel: [<ffffffff880b56b6>] :cciss:do_ioctl+0x2a/0x39
Jan 8 17:05:12 localhost kernel: [<ffffffff880b5774>] :cciss:cciss_compat_ioctl+0xaf/0x25f
Jan 8 17:05:12 localhost kernel: [<ffffffff80021d2f>] __up_read+0x19/0x7f
Jan 8 17:05:12 localhost kernel: [<ffffffff80064a9d>] do_page_fault+0x4eb/0x81d
Jan 8 17:05:12 localhost kernel: [<ffffffff8000df48>] free_pages_and_swap_cache+0x73/0x8f
Jan 8 17:05:12 localhost kernel: [<ffffffff80137fe5>] compat_blkdev_ioctl+0x4c/0x5f
Jan 8 17:05:12 localhost kernel: [<ffffffff800edee9>] compat_sys_ioctl+0xc5/0x2b1
Jan 8 17:05:12 localhost kernel: [<ffffffff8005f49b>] sysenter_do_call+0x1b/0x67

redhat 5.1 kernel 2.6.18-53.el5

I am also seeing the same as above running on a DL360G5, redhat 5.1 kernel 2.6.18-53.el5PAE. This only seems to be an issue when using multiple logical volumes on the same cciss controller. No issues when only one array is configured. The HP cpq_cciss driver isn't supported yet on the 5.1 base kernel. Is HP aware of this issue? Need this fixed soon!!!!

Based on my interaction with HP support, this may be a dup of bug #429515, which is a copy of bug #426873, which I do not have access to.

I've also seen this on an HP-DL-360-G5 with 6 drives and three logical volumes on the sole SmartArray controller running RHEL 5.1 x86_64. Serial console shows:

list_add corruption. prev->next should be ffffffff802fcf90, but was ffff81042f434490
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at lib/list_debug.c:31
invalid opcode: 0000 [1] SMP
last sysfs file: /devices/pci0000:00/0000:00:02.0/0000:09:00.0/0000:0a:00.0/0000:0b:00.1/irq
CPU 0
Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler hidp rfcomm l2cap bluetooth sunrpc ipv6 cpufreq_ondemand acpi_cpufreq dm_mirror dm_multipath dm_mod video sbs backlight i2c_ec i2c_core button battery asus_acpi acpi_memhotplug ac parport_pc lp parport shpchp e1000 serio_raw bnx2 pcspkr ata_piix libata cciss sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 8874, comm: .hpacucli Not tainted 2.6.18-53.el5 #1
RIP: 0010:[<ffffffff80143629>] [<ffffffff80143629>] __list_add+0x48/0x68
RSP: 0018:ffff810413cefb78 EFLAGS: 00013282
RAX: 0000000000000058 RBX: ffffffff802fcf90 RCX: ffffffff802e5728
RDX: ffffffff802e5728 RSI: 0000000000000000 RDI: ffffffff802e5720
RBP: ffff81042f434490 R08: ffffffff802e5728 R09: 0000000000003046
R10: 0000000000000000 R11: 0000000000000180 R12: ffff81042f434c90
R13: ffff81042f434c70 R14: 00000000fffffffe R15: ffffffff802fcfa8
FS: 0000000000000000(0000) GS:ffffffff80396000(0063) knlGS:00000000f7fab6c0
CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000085f8004 CR3: 00000004190fd000 CR4: 00000000000006e0
Process .hpacucli (pid: 8874, threadinfo ffff810413cee000, task ffff81040b2dc7e0)
Stack: ffff81042f434c70 ffff81042f434c00 ffff81042f9a0000 ffffffff8014108f
ffffffff80058eec ffff81042f434c78 ffff81042f434c00 ffff81042f9a0000
ffff81042f434c70 0000000000000001 ffff81042f9a0000 ffffffff800fcf35
Call Trace:
[<ffffffff8014108f>] kobject_add+0xb5/0x199
[<ffffffff80058eec>] exact_lock+0x0/0x14
[<ffffffff800fcf35>] register_disk+0x43/0x199
[<ffffffff80138d83>] add_disk+0x34/0x3d
[<ffffffff880b49a7>] :cciss:rebuild_lun_table+0x48f/0x50f
[<ffffffff800c35e8>] zone_statistics+0x3e/0x6d
[<ffffffff880b4e21>] :cciss:cciss_ioctl+0x3fa/0xc65
[<ffffffff801429ee>] snprintf+0x44/0x4c
[<ffffffff8002ae6c>] flush_tlb_page+0xac/0xda
[<ffffffff880b56b6>] :cciss:do_ioctl+0x2a/0x39
[<ffffffff880b5774>] :cciss:cciss_compat_ioctl+0xaf/0x25f
[<ffffffff8011d242>] inode_has_perm+0x56/0x63
[<ffffffff80021d2f>] __up_read+0x19/0x7f
[<ffffffff8011d2e3>] file_has_perm+0x94/0xa3
[<ffffffff80137fe5>] compat_blkdev_ioctl+0x4c/0x5f
[<ffffffff800edee9>] compat_sys_ioctl+0xc5/0x2b1
[<ffffffff8005f49b>] sysenter_do_call+0x1b/0x67
Code: 0f 0b 68 87 5c 29 80 c2 1f 00 4c 89 63 08 49 89 1c 24 4c 89
RIP [<ffffffff80143629>] __list_add+0x48/0x68
RSP <ffff810413cefb78>
<0>Kernel panic - not syncing: Fatal exception

I tested with kernel 2.6.18-53.1.14.el5 x86_64 and I cannot trigger the panic any more. However, something is still being tickled in the kernel because several blank lines are printed to the serial console every time I run hpacucli.

Updating PM score.

Mike, Scott, any comments?

I suspect the blank lines are noise where the driver is printing the geometry of the logical volumes. This should have been addressed by commit: 983333cb0c445c56808502461bbb34876c63eb2b.
According to the git log, this commit was backported into RHEL5 last April and should be

commit a1fcf3f8fa7ef40ba3a829f781d639632391bc21
Author: Tomas Henzl <thenzl>
Date: Mon Apr 26 12:19:34 2010 -0400

[cciss] remove extraneous printk

This patch did not make it into RHEL5.5, but is in the RHEL5.6 code. Has anybody seen this problem in RHEL5.6?

Verified with hpacucli-8.70-8.0 on a ProLiant DL380 G6.

This was verified and should have been closed.
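
For anyone checking whether a machine has hit the duplicate-registration symptom reported in this bug, here is a minimal sketch that scans syslog-style lines for the `kobject_add failed for cciss!cXdY with -EEXIST` message quoted in the comments. The function name and the sample lines are illustrative assumptions, not part of any HP or Red Hat tool, and the pattern only covers the message wording seen in this report.

```python
import re

# Pattern based on the kernel message quoted in this bug report; other
# kernels may word the message differently (assumption).
KOBJECT_EEXIST = re.compile(
    r"kobject_add failed for (cciss!c\d+d\d+) with -EEXIST"
)


def find_duplicate_cciss_registrations(lines):
    """Return the cciss device names the kernel refused to register twice.

    `lines` is any iterable of log lines, for example an open
    /var/log/messages file object. (Hypothetical helper for illustration.)
    """
    devices = []
    for line in lines:
        match = KOBJECT_EEXIST.search(line)
        if match:
            devices.append(match.group(1))
    return devices


# Sample input copied from the log excerpts in this report.
sample = [
    "Dec 12 13:22:56 capfs1 kernel: blocks= 2293469488 block_size= 512",
    "Dec 12 13:22:56 capfs1 kernel: kobject_add failed for cciss!c2d1 "
    "with -EEXIST, don't try to register things with the same name in "
    "the same directory.",
    "Dec 12 13:22:56 capfs1 kernel: Call Trace:",
]

for device in find_duplicate_cciss_registrations(sample):
    print(device)
```

To run it against a live system, pass an open log file instead of the sample list, e.g. `find_duplicate_cciss_registrations(open("/var/log/messages"))`. A match only shows that the message was logged; it does not by itself confirm the panic described above.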