Bug 208179

Summary: rawhide fails to boot from cciss driver
Product: [Fedora] Fedora
Reporter: Bill Peck <bpeck>
Component: kernel
Assignee: Chip Coldwell <coldwell>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Docs Contact:
Priority: medium
Version: rawhide
CC: coughlan, davej, mike.miller, triage, wtogami
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard: bzcl34nup
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2008-05-07 00:53:31 UTC Type: ---
Bug Blocks: 202141    

Description Bill Peck 2006-09-26 19:54:14 UTC
Description of problem:
After installing rawhide on a dl360 the system fails to boot from the onboard
cciss controller.

Version-Release number of selected component (if applicable):
rawhide-20060926
kernel-2.6.18-1.2693.fc6.i686

How reproducible:
Every time on dl360

Actual results:
Here is the serial output.  I can attach the full console if needed.

<Sep/26 03:41 pm>Loading jbd.ko module
<Sep/26 03:41 pm>Loading ext3.ko module
<Sep/26 03:41 pm>Loading scsi_mod.ko module
<Sep/26 03:41 pm>SCSI subsystem initialized
<Sep/26 03:41 pm>Loading sd_mod.ko module
<Sep/26 03:41 pm>Loading cciss.ko module
<Sep/26 03:41 pm>HP CISS Driver (v 3.6.10)
<Sep/26 03:41 pm>ACPI: PCI Interrupt 0000:0a:01.0[A] -> GSI 72 (level, low) -> IRQ 201
<Sep/26 03:41 pm>cciss0: <0x3220> at PCI 0000:0a:01.0 IRQ 201 using DAC
<Sep/26 03:41 pm>      blocks= 71065440 block_size= 512
<Sep/26 03:41 pm>      heads= 255, sectors= 32, cylinders= 8709

<Sep/26 03:41 pm>INFO: trying to register non-static key.
<Sep/26 03:41 pm>the code is fine but needs lockdep annotation.
<Sep/26 03:41 pm>turning off the locking correctness validator.
<Sep/26 03:41 pm> [<c04051ed>] show_trace_log_lvl+0x58/0x16a
<Sep/26 03:41 pm> [<c04057fa>] show_trace+0xd/0x10
<Sep/26 03:41 pm> [<c0405913>] dump_stack+0x19/0x1b
<Sep/26 03:41 pm> [<c043afc6>] __lock_acquire+0xfd/0x99c
<Sep/26 03:41 pm> [<c043bdd6>] lock_acquire+0x4b/0x6d
<Sep/26 03:41 pm> [<c0614a2c>] _spin_lock_irq+0x1f/0x2e
<Sep/26 03:41 pm> [<c06128ac>] wait_for_completion+0x29/0x9e
<Sep/26 03:41 pm> [<f88b8c36>] sendcmd_withirq+0x173/0x289 [cciss]
<Sep/26 03:41 pm> [<f88b9c9a>] cciss_read_capacity+0x2f/0xa6 [cciss]
<Sep/26 03:41 pm> [<f88bbc32>] cciss_revalidate+0xaa/0xfa [cciss]
<Sep/26 03:41 pm> [<c04a7f4c>] rescan_partitions+0x6b/0x1e7
<Sep/26 03:41 pm> [<c047a1d0>] do_open+0x2e9/0x3e9
<Sep/26 03:41 pm> [<c047a341>] blkdev_get+0x71/0x7c
<Sep/26 03:41 pm> [<c04a7e8f>] register_disk+0x115/0x167
<Sep/26 03:41 pm> [<c04e0553>] add_disk+0x2e/0x3d
<Sep/26 03:41 pm> [<f88bc56a>] cciss_init_one+0x8e8/0xa5a [cciss]
<Sep/26 03:41 pm> [<c04f3e01>] pci_device_probe+0x39/0x5b
<Sep/26 03:41 pm> [<c0554ea2>] driver_probe_device+0x45/0x92
<Sep/26 03:41 pm> [<c0554fcf>] __driver_attach+0x68/0x93
<Sep/26 03:41 pm> [<c055491b>] bus_for_each_dev+0x3a/0x5f
<Sep/26 03:41 pm> [<c0554dfd>] driver_attach+0x14/0x17
<Sep/26 03:41 pm> [<c05545f2>] bus_add_driver+0x68/0x106
<Sep/26 03:41 pm> [<c0555278>] driver_register+0x78/0x7d
<Sep/26 03:41 pm> [<c04f3f52>] __pci_register_driver+0x4f/0x69
<Sep/26 03:41 pm> [<f883001c>] cciss_init+0x1c/0x1e [cciss]
<Sep/26 03:41 pm> [<c044232e>] sys_init_module+0x16ad/0x1856
<Sep/26 03:41 pm> [<c0403fb7>] syscall_call+0x7/0xb
<Sep/26 03:41 pm>DWARF2 unwinder stuck at syscall_call+0x7/0xb
<Sep/26 03:41 pm>Leftover inexact backtrace:
<Sep/26 03:41 pm> [<c04057fa>] show_trace+0xd/0x10
<Sep/26 03:41 pm> [<c0405913>] dump_stack+0x19/0x1b
<Sep/26 03:41 pm> [<c043afc6>] __lock_acquire+0xfd/0x99c
<Sep/26 03:41 pm> [<c043bdd6>] lock_acquire+0x4b/0x6d
<Sep/26 03:41 pm> [<c0614a2c>] _spin_lock_irq+0x1f/0x2e
<Sep/26 03:41 pm> [<c06128ac>] wait_for_completion+0x29/0x9e
<Sep/26 03:41 pm> [<f88b8c36>] sendcmd_withirq+0x173/0x289 [cciss]
<Sep/26 03:41 pm> [<f88b9c9a>] cciss_read_capacity+0x2f/0xa6 [cciss]
<Sep/26 03:41 pm> [<f88bbc32>] cciss_revalidate+0xaa/0xfa [cciss]
<Sep/26 03:41 pm> [<c04a7f4c>] rescan_partitions+0x6b/0x1e7
<Sep/26 03:41 pm> [<c047a1d0>] do_open+0x2e9/0x3e9
<Sep/26 03:41 pm> [<c047a341>] blkdev_get+0x71/0x7c
<Sep/26 03:41 pm> [<c04a7e8f>] register_disk+0x115/0x167
<Sep/26 03:41 pm> [<c04e0553>] add_disk+0x2e/0x3d
<Sep/26 03:41 pm> [<f88bc56a>] cciss_init_one+0x8e8/0xa5a [cciss]
<Sep/26 03:41 pm> [<c04f3e01>] pci_device_probe+0x39/0x5b
<Sep/26 03:41 pm> [<c0554ea2>] driver_probe_device+0x45/0x92
<Sep/26 03:41 pm> [<c0554fcf>] __driver_attach+0x68/0x93
<Sep/26 03:41 pm> [<c055491b>] bus_for_each_dev+0x3a/0x5f
<Sep/26 03:41 pm> [<c0554dfd>] driver_attach+0x14/0x17
<Sep/26 03:41 pm> [<c05545f2>] bus_add_driver+0x68/0x106
<Sep/26 03:41 pm> [<c0555278>] driver_register+0x78/0x7d
<Sep/26 03:41 pm> [<c04f3f52>] __pci_register_driver+0x4f/0x69
<Sep/26 03:41 pm> [<f883001c>] cciss_init+0x1c/0x1e [cciss]
<Sep/26 03:41 pm> [<c044232e>] sys_init_module+0x16ad/0x1856
<Sep/26 03:41 pm> [<c0403fb7>] syscall_call+0x7/0xb
<Sep/26 03:41 pm>      blocks= 71065440 block_size= 512
<Sep/26 03:41 pm>      heads= 255, sectors= 32, cylinders= 8709

<Sep/26 03:41 pm> cciss/c0d0: p1 p2 p3
<Sep/26 03:41 pm>Trying to resume from LABEL=SW-cciss/c0d0p3
<Sep/26 03:41 pm>Unable to access resume device (LABEL=SW-cciss/c0d0p3)
<Sep/26 03:41 pm>Creating root device.
<Sep/26 03:41 pm>Mounting root filesystem.
<Sep/26 03:41 pm>mount: could not find filesystem '/dev/root'
<Sep/26 03:41 pm>SeKernel panic - not syncing: Attempted to kill init!
<Sep/26 03:41 pm>tting up other f ilesystems.

The targets are seen, but it almost seems as if udev is not finishing its part.
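
As a sanity check on the geometry the driver reported, the block count can be compared with the heads x sectors x cylinders product. This is a minimal Python sketch that uses only the numbers printed in the console output above; the variable names are illustrative.

```python
# Values copied from the cciss console output above.
blocks = 71065440
block_size = 512
heads, sectors, cylinders = 255, 32, 8709

# A CHS geometry addresses heads * sectors * cylinders blocks.
chs_blocks = heads * sectors * cylinders
capacity_bytes = blocks * block_size

print(chs_blocks == blocks)            # does the geometry cover the whole volume?
print(round(capacity_bytes / 1e9, 1))  # capacity in decimal gigabytes
```

The product equals the reported block count (8160 blocks per cylinder times 8709 cylinders), which suggests the capacity read returned consistent values even though lockdep complained along the way.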

Comment 1 Dave Jones 2006-09-28 22:15:43 UTC
That's a lockdep trace, which shouldn't be the reason that it doesn't work, so
you're seeing two problems here.  I'll add this to the lockdep tracker, but the
'doesn't work' bug will likely need someone with access to the hardware to debug.
Tom?
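
To see which frames of the trace belong to the driver, the console lines can be filtered by the `[cciss]` module tag. This is a minimal Python sketch; the regular expression and the function name `parse_frame` are assumptions based on the line format shown in the description, not part of any kernel tooling.

```python
import re

# Matches lines such as " [<f88b8c36>] sendcmd_withirq+0x173/0x289 [cciss]"
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<sym>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

def parse_frame(line):
    """Return (symbol, module) for a stack-frame line, or None otherwise."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return m.group("sym"), m.group("module")  # module is None for core kernel

# Three lines copied from the serial output in the description.
sample = [
    "<Sep/26 03:41 pm> [<c06128ac>] wait_for_completion+0x29/0x9e",
    "<Sep/26 03:41 pm> [<f88b8c36>] sendcmd_withirq+0x173/0x289 [cciss]",
    "<Sep/26 03:41 pm>Mounting root filesystem.",
]
frames = [f for f in map(parse_frame, sample) if f]
driver_frames = [sym for sym, mod in frames if mod == "cciss"]
print(driver_frames)
```

Run over the full console log, the same filter would list the cciss functions on the path that triggered the lockdep warning, separately from the later root-mount failure.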

Comment 2 Tom Coughlan 2006-10-03 02:08:16 UTC
Bill, Can you check to see if this problem is happening on other cciss-based
RHTS systems? Maybe someone can try xeon3. Just wondering how specific this is.
We can also test with cciss as an add-on board, not the boot device. 

Comment 3 Chip Coldwell 2006-10-10 18:29:37 UTC
(In reply to comment #2)
> Bill, Can you check to see if this problem is happening on other cciss-based
> RHTS systems? Maybe someone can try xeon3. Just wondering how specific this is.
> We can also test with cciss as an add-on board, not the boot device. 

I just installed 2.6.18-1.2693.fc6 on xeon3; it came up with no problems.

I note that lspci indicates this device came up as a bus master (see bug #205653
for a possibly related issue).

# lspci -s 0000:00:0e.0 -xxx
00:0e.0 RAID bus controller: Compaq Computer Corporation Smart Array 5i/532 (rev 01)
00: 11 0e 78 b1 57 01 b0 02 01 00 04 01 10 47 00 00
10: 04 00 f8 f5 00 00 00 00 01 24 00 00 0c 00 ef f5
20: 00 00 00 00 00 00 00 00 00 00 00 00 11 0e 80 40
30: 00 00 00 00 c0 00 00 00 00 00 00 00 07 01 00 00
40: 00 00 00 00 00 00 00 00 ff 01 07 00 00 00 00 00
50: 20 01 00 00 68 60 00 02 00 0a 00 00 60 00 00 00
60: c3 21 00 00 00 00 00 a0 01 00 00 e0 00 00 00 00
70: 00 00 00 00 00 00 10 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 01 00 00 00 03 00 fc ff 00 00 7c 40
90: 01 ff ff ff 00 00 00 98 01 c0 ff ff 00 00 f7 ff
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 01 cc 02 02 00 00 00 00 18 00 00 00 05 dc 82 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 07 00 30 00
e0: 70 ff 81 05 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
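
The bus-master state can be read directly from the dump: the PCI command register is the little-endian 16-bit word at offset 0x04, and bit 2 of it is the Bus Master enable bit. This is a minimal Python sketch that decodes the first row of the dump above; the helper name `le16` is illustrative and not part of any tool.

```python
# First 16 bytes of configuration space, copied from the lspci -xxx dump above.
row0 = "11 0e 78 b1 57 01 b0 02 01 00 04 01 10 47 00 00"
cfg = bytes.fromhex(row0.replace(" ", ""))

def le16(data, offset):
    """Read a little-endian 16-bit register at the given byte offset."""
    return int.from_bytes(data[offset:offset + 2], "little")

vendor_id = le16(cfg, 0x00)   # 0x0e11, Compaq
device_id = le16(cfg, 0x02)
command = le16(cfg, 0x04)

PCI_COMMAND_MASTER = 0x0004   # bit 2 of the command register
bus_master = bool(command & PCI_COMMAND_MASTER)

print(hex(vendor_id), hex(device_id), hex(command), bus_master)
```

For this dump the command word is 0x0157, so bit 2 is set, which agrees with the bus-master observation above.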



Comment 4 Tom Coughlan 2006-10-13 19:37:00 UTC
Mike, we are seeing failures booting recent kernels on DL360 with on-board
cciss. (This is Fedora. Same on RHEL 5 - Bug 200729). The problem did not occur
on Compaq ProLiant DL760. 

It looks like the driver loaded okay, so this may not be your area, but I'm
wondering if you have seen this?


Comment 5 Bug Zapper 2008-04-03 18:20:18 UTC
Based on the date this bug was created, it appears to have been reported
against rawhide during the development of a Fedora release that is no
longer maintained. In order to refocus our efforts as a project we are
flagging all of the open bugs for releases which are no longer
maintained. If this bug remains in NEEDINFO thirty (30) days from now,
we will automatically close it.

If you can reproduce this bug in a maintained Fedora version (7, 8, or
rawhide), please change this bug to the respective version and change
the status to ASSIGNED. (If you're unable to change the bug's version
or status, add a comment to the bug and someone will change it for you.)

Thanks for your help, and we apologize again that we haven't handled
these issues to this point.

The process we're following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

Comment 6 Bug Zapper 2008-05-07 00:53:29 UTC
This bug has been in NEEDINFO for more than 30 days since feedback was
first requested. As a result we are closing it.

If you can reproduce this bug in the future against a maintained Fedora
version please feel free to reopen it against that version.

The process we're following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp