Bug 196437 - IBM ServeRAID (IPS) driver fails to load on IBM RS/6000 7025-F50
Status: CLOSED INSUFFICIENT_DATA
Product: Fedora
Classification: Fedora
Component: kernel
Version: 7
Hardware: powerpc Linux
Priority: medium
Severity: medium
Assigned To: Brad Peters
QA Contact: Brian Brock
Keywords: Reopened
Duplicates: 196438 218548
Reported: 2006-06-23 06:09 EDT by Alexey Bozrikov
Modified: 2008-08-02 19:40 EDT
Last Closed: 2008-04-24 23:46:23 EDT

Attachments
lspci -v output (3.58 KB, text/plain)
2006-11-13 02:51 EST, Alexey Bozrikov

Description Alexey Bozrikov 2006-06-23 06:09:26 EDT
Description of problem:
The ips.ko driver fails to load on an IBM 7025-F50 equipped with an IBM ServeRAID 4H
4-channel Ultra3 SCSI RAID controller. The following is recorded in dmesg:
PCI: Enabling device 0001:40:0c.0 (0140 -> 0143)
 0:0:4:0: Attached scsi generic sg0 type 5
sd 1:0:8:0: Attached scsi generic sg1 type 0
sd 1:0:9:0: Attached scsi generic sg2 type 0
....skipped....
ips 0001:40:0c.0: unable to read config from controller.
ips 0001:40:0c.0: Unable to initialize controller
ips: probe of 0001:40:0c.0 failed with error -1

When I run 'modprobe ips' manually after system boot, a different error is
logged in dmesg:
[1st try]
ips 0001:40:0c.0: Couldn't allocate IO space feffe800 len 256.
ips: probe of 0001:40:0c.0 failed with error -1
[2nd try]
ips 0001:40:0c.0: Couldn't allocate IO Memory space d7f00000 len 1048576.
ips: probe of 0001:40:0c.0 failed with error -1


Controller particulars (lspci -v):
0001:40:0c.0 RAID bus controller: IBM SCSI RAID Adapter [ServeRAID] (rev 10)
        Subsystem: IBM ServeRAID-4H
        Flags: medium devsel, IRQ 23
        I/O ports at e4000800 [size=256]
        Memory at d7f00000 (32-bit, non-prefetchable) [size=1M]
        Expansion ROM at d7eb8000 [disabled] [size=32K]
        Capabilities: [40] Vital Product Data
        Capabilities: [48] Power Management version 2

Version-Release number of selected component (if applicable):
Linux version 2.6.16-1.2133_FC5smp 
Kernel command line: root=/dev/md1 ro rhgb splash ips=debug:11

How reproducible:
always
Steps to Reproduce:
1. modprobe ips
Additional info: This does not seem to be a pure hardware problem. When I boot
the IBM AIX operating system on the machine, everything works fine with no errors.
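
Editorial note: the resource ranges named in the two 'modprobe ips' errors can be set beside the ranges in the lspci -v listing. The following is a minimal sketch of that comparison, not part of the original report. The helper names (driver_requests, lspci_regions) are hypothetical, and the input strings are copied from the description above.

```python
import re

# Lines copied from the description; only the two manual-modprobe errors
# and the two matching lspci regions are included.
DMESG = """\
ips 0001:40:0c.0: Couldn't allocate IO space feffe800 len 256.
ips 0001:40:0c.0: Couldn't allocate IO Memory space d7f00000 len 1048576.
"""

LSPCI = """\
        I/O ports at e4000800 [size=256]
        Memory at d7f00000 (32-bit, non-prefetchable) [size=1M]
"""

UNITS = {"": 1, "K": 1024, "M": 1024 ** 2}


def driver_requests(text):
    """Map resource kind -> (start, length) from the ips error lines."""
    pat = re.compile(
        r"Couldn't allocate (IO Memory|IO) space ([0-9a-f]+) len (\d+)"
    )
    return {
        kind: (int(addr, 16), int(length))
        for kind, addr, length in pat.findall(text)
    }


def lspci_regions(text):
    """Map resource kind -> (start, length) from lspci -v region lines."""
    pat = re.compile(
        r"(I/O ports|Memory) at ([0-9a-f]+).*?\[size=(\d+)([KM]?)\]"
    )
    regions = {}
    for kind, addr, num, unit in pat.findall(text):
        key = "IO" if kind == "I/O ports" else "IO Memory"
        regions[key] = (int(addr, 16), int(num) * UNITS[unit])
    return regions


requests = driver_requests(DMESG)
regions = lspci_regions(LSPCI)
for kind, (start, length) in requests.items():
    match = regions.get(kind) == (start, length)
    print(f"{kind}: driver {start:#x} len {length}, match with lspci: {match}")
```

For the values quoted above, the memory window is the same in both places (d7f00000, 1 MB), while the I/O address in the driver message (feffe800) differs from the one lspci prints (e4000800). The sketch only surfaces that difference; it does not establish whether it is relevant to the failure.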
Comment 1 Alexey Bozrikov 2006-06-23 06:17:35 EDT
*** Bug 196438 has been marked as a duplicate of this bug. ***
Comment 2 Dave Jones 2006-10-16 17:53:39 EDT
A new kernel update has been released (Version: 2.6.18-1.2200.fc5)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks' time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

In the last few updates, some users upgrading from FC4->FC5
have reported that installing a kernel update has left their
systems unbootable. If you have been affected by this problem
please check you only have one version of device-mapper & lvm2
installed.  See bug 207474 for further details.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

If this bug has been fixed, but you are now experiencing a different
problem, please file a separate bug for the new problem.

Thank you.
Comment 3 Alexey Bozrikov 2006-10-17 02:11:08 EDT
Kernel version 2.6.18-1.2200.fc5 SMP PPC does not fix the issue with the IPS
driver. When booting the machine, the following is recorded in dmesg:
[quote dmesg]
BUG: soft lockup detected on CPU#2!
Call Trace:
[CF61FB40] [C0008D8C] show_stack+0x50/0x184 (unreliable)
[CF61FB60] [C0066CB8] softlockup_tick+0xe4/0x100
[CF61FB80] [C00440F8] run_local_timers+0x18/0x28
[CF61FB90] [C00443C8] update_process_times+0x48/0x84
[CF61FBB0] [C000F020] timer_interrupt+0x104/0x5e8
[CF61FC20] [C0012AF0] ret_from_except+0x0/0x14
--- Exception: 901 at __delay+0x40/0x5c
    LR = ips_send_wait+0xa8/0xe4 [ips]
[CF61FCE0] [F20C81B0] ips_send_wait+0xa0/0xe4 [ips] (unreliable)
[CF61FD00] [F20C94F0] ips_init_phase2+0x118/0xcd4 [ips]
[CF61FD40] [F20CBC88] ips_insert_device+0x9a0/0xa30 [ips]
[CF61FD90] [C0141F14] pci_device_probe+0x6c/0xa0
[CF61FDB0] [C01C63F0] driver_probe_device+0x60/0xf4
[CF61FDD0] [C01C6610] __driver_attach+0xbc/0x130
[CF61FDF0] [C01C5CC0] bus_for_each_dev+0x50/0x94
[CF61FE20] [C01C6304] driver_attach+0x24/0x34
[CF61FE30] [C01C58A8] bus_add_driver+0x78/0x128
[CF61FE50] [C01C699C] driver_register+0xa0/0xb4
[CF61FE60] [C0141D20] __pci_register_driver+0x64/0xa4
[CF61FE70] [F100C028] ips_module_init+0x28/0x300 [ips]
[CF61FE90] [C0059F88] sys_init_module+0x15d0/0x1768
[CF61FF40] [C0012444] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xff2c8b4
    LR = 0x10003890
ips 0001:40:0c.0: unable to read config from controller.
ips 0001:40:0c.0: Unable to initialize controller
ips: probe of 0001:40:0c.0 failed with error -1
[unquote]
Comment 4 Dave Jones 2006-10-17 17:42:45 EDT
I've asked the upstream maintainers (ipslinux@adaptec.com) to take a look at this.
Comment 5 Dave Jones 2006-10-20 18:40:40 EDT
I just merged a patch which may fix this. It'll be in the next update out soon.
Comment 6 Alexey Bozrikov 2006-11-03 02:46:23 EST
The newer kernel in FC6 does not solve the issue.

Kernel version:
2.6.18-1.2798.fc6smp #1 SMP Mon Oct 16 15:44:11 EDT 2006 ppc ppc ppc GNU/Linux

Exactly the same behavior is observed as on 2.6.18-1.2200.fc5 (posted earlier),
although some hex values/addresses in the stack trace changed.
Comment 7 Janice Girouard - IBM on-site partner 2006-11-03 13:37:23 EST
I'm requesting that this bug be mirrored to IBM so that Brian King can take a look
at it.
Comment 8 Dave Jones 2006-11-12 00:50:24 EST
Should be fixed in 2.6.18-1.2239.fc5, now in updates.
Comment 9 Alexey Bozrikov 2006-11-13 02:49:02 EST
Downloaded kernel as directed, version:
[quote]
2.6.18-1.2239.fc5smp #1 SMP Fri Nov 10 13:30:33 EST 2006 ppc
[unquote]
The same behavior is encountered:
[quote]
BUG: soft lockup detected on CPU#0!
Call Trace:
[CF4ADB40] [C0008D8C] show_stack+0x50/0x184 (unreliable)
[CF4ADB60] [C0066D2C] softlockup_tick+0xe4/0x100
[CF4ADB80] [C0044098] run_local_timers+0x18/0x28
[CF4ADB90] [C004435C] update_process_times+0x48/0x84
[CF4ADBB0] [C000F020] timer_interrupt+0x104/0x5e8
[CF4ADC20] [C0012AF0] ret_from_except+0x0/0x14
--- Exception: 901 at __delay+0x44/0x5c
    LR = ips_send_wait+0xa8/0xe4 [ips]
[CF4ADCE0] [F211E1C8] ips_send_wait+0xa0/0xe4 [ips] (unreliable)
[CF4ADD00] [F211F550] ips_init_phase2+0x118/0xcd4 [ips]
[CF4ADD40] [F2121CE8] ips_insert_device+0x9a0/0xa30 [ips]
[CF4ADD90] [C0141A2C] pci_device_probe+0x6c/0xa0
[CF4ADDB0] [C01C5E7C] driver_probe_device+0x60/0xf4
[CF4ADDD0] [C01C609C] __driver_attach+0xbc/0x130
[CF4ADDF0] [C01C574C] bus_for_each_dev+0x50/0x94
[CF4ADE20] [C01C5D90] driver_attach+0x24/0x34
[CF4ADE30] [C01C5334] bus_add_driver+0x78/0x128
[CF4ADE50] [C01C6428] driver_register+0xa0/0xb4
[CF4ADE60] [C0141838] __pci_register_driver+0x64/0xa4
[CF4ADE70] [F100C028] ips_module_init+0x28/0x300 [ips]
[CF4ADE90] [C0059F94] sys_init_module+0x15d0/0x1768
[CF4ADF40] [C0012444] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xff2c954
    LR = 0x100036c0
ips 0001:40:0c.0: unable to read config from controller.
ips 0001:40:0c.0: Unable to initialize controller
ips: probe of 0001:40:0c.0 failed with error -1
[unquote]

Upgraded to FC6 and tried the following kernels (with the same result):
kernel-smp-2.6.18-1.2798.fc6 PPC
kernel-smp-2.6.18-1.2849.fc6 PPC

Attaching 'lspci -v' output; possibly this can help?

Alex
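
Editorial note: one way to check that two soft-lockup traces show the same call path even though the addresses moved is to strip the stack and instruction addresses and compare only the symbol names. The following is a minimal sketch, not part of the original comment. The names FRAME and symbols are hypothetical, and only four frames from each trace (2.6.18-1.2200.fc5 in Comment 3, 2.6.18-1.2239.fc5 here) are excerpted.

```python
import re

# A PowerPC frame line looks like:
#   [stack ptr] [return addr] symbol+0xoff/0xsize [module]
FRAME = re.compile(
    r"\[[0-9A-F]{8}\] \[[0-9A-F]{8}\] (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def symbols(trace):
    """Return the symbol names of all frame lines, in call-trace order."""
    names = []
    for line in trace.splitlines():
        match = FRAME.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


# Excerpt from the 2.6.18-1.2200.fc5 trace.
TRACE_2200 = """\
[CF61FCE0] [F20C81B0] ips_send_wait+0xa0/0xe4 [ips] (unreliable)
[CF61FD00] [F20C94F0] ips_init_phase2+0x118/0xcd4 [ips]
[CF61FD40] [F20CBC88] ips_insert_device+0x9a0/0xa30 [ips]
[CF61FD90] [C0141F14] pci_device_probe+0x6c/0xa0
"""

# Excerpt from the 2.6.18-1.2239.fc5 trace.
TRACE_2239 = """\
[CF4ADCE0] [F211E1C8] ips_send_wait+0xa0/0xe4 [ips] (unreliable)
[CF4ADD00] [F211F550] ips_init_phase2+0x118/0xcd4 [ips]
[CF4ADD40] [F2121CE8] ips_insert_device+0x9a0/0xa30 [ips]
[CF4ADD90] [C0141A2C] pci_device_probe+0x6c/0xa0
"""

same_path = symbols(TRACE_2200) == symbols(TRACE_2239)
print("same call path:", same_path)
```

For the excerpted frames, both kernels reach ips_send_wait through ips_init_phase2 and ips_insert_device; only the addresses differ.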
Comment 10 Alexey Bozrikov 2006-11-13 02:51:16 EST
Created attachment 141023
lspci -v output
Comment 11 J. Adam Hough 2006-12-08 19:44:46 EST
*** Bug 218548 has been marked as a duplicate of this bug. ***
Comment 12 J. Adam Hough 2006-12-08 19:47:40 EST
Fedora Core (2.6.18-1.2239.fc5smp) (i686)

08:02.0 RAID bus controller: Adaptec ServeRAID Controller (rev 02)
        Subsystem: IBM ServeRAID-xx
        Flags: bus master, stepping, 66MHz, medium devsel, latency 64, IRQ 169
        Memory at effff000 (32-bit, non-prefetchable) [size=4K]
        Memory at f0000000 (32-bit, prefetchable) [size=64M]
        [virtual] Expansion ROM at 50060000 [disabled] [size=32K]
        Capabilities: [c0] Power Management version 2
        Capabilities: [d0] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable-
        Capabilities: [e0] PCI-X non-bridge device


BUG: soft lockup detected on CPU#1!
 [<c04050ef>] dump_trace+0x69/0x1af
 [<c040524d>] show_trace_log_lvl+0x18/0x2c
 [<c0405800>] show_trace+0xf/0x11
 [<c04058fa>] dump_stack+0x15/0x17
 [<c044b779>] softlockup_tick+0xad/0xc4
 [<c042ea75>] update_process_times+0x39/0x5c
 [<c0418912>] smp_apic_timer_interrupt+0x5b/0x61
 [<c04049f3>] apic_timer_interrupt+0x1f/0x24
DWARF2 unwinder stuck at apic_timer_interrupt+0x1f/0x24
Leftover inexact backtrace:
 [<c04e6c7d>] delay_tsc+0x9/0x13
 [<c04e6cb0>] __delay+0x6/0x7
 [<f886e6f7>] ips_init_morpheus+0x7c/0x2d1 [ips]
 [<c04eac7e>] pci_bus_read_config_byte+0x57/0x61
 [<f886b667>] ips_reset_morpheus+0x89/0xc5 [ips]
 [<c0408da6>] dma_alloc_coherent+0xb1/0xec
 [<f886daa1>] ips_insert_device+0x79c/0x848 [ips]
 [<c0550f57>] __driver_attach+0x0/0x8f
 [<c04eeacf>] pci_device_probe+0x36/0x57
 [<c0550e91>] driver_probe_device+0x45/0x9a
 [<c0550fbc>] __driver_attach+0x65/0x8f
 [<c0550916>] bus_for_each_dev+0x37/0x59
 [<c0550df2>] driver_attach+0x16/0x18
 [<c0550f57>] __driver_attach+0x0/0x8f
 [<c055060e>] bus_add_driver+0x6f/0x10d
 [<c04eec01>] __pci_register_driver+0x49/0x63
 [<f8802016>] ips_module_init+0x16/0x158 [ips]
 [<c043f5f9>] sys_init_module+0x17cc/0x1965
 [<c046f2fa>] __fput+0x15e/0x188
 [<c0403f33>] syscall_call+0x7/0xb
 =======================

Comment 13 Alexey Bozrikov 2007-01-22 07:39:20 EST
Just to update: the same behavior continues with FC6 and the latest kernel. The
version string is:
2.6.19-1.2895.fc6smp #1 SMP Wed Jan 10 19:05:29 EST 2007 ppc ppc ppc GNU/Linux
[quote]
PCI: Enabling device 0001:40:0c.0 (0140 -> 0143)
BUG: soft lockup detected on CPU#1!
Call Trace:
[CF567B40] [C0008F4C] show_stack+0x50/0x184 (unreliable)
[CF567B60] [C0068588] softlockup_tick+0xe4/0x100
[CF567B80] [C0043F24] run_local_timers+0x18/0x28
[CF567B90] [C00441E8] update_process_times+0x48/0x84
[CF567BB0] [C000F194] timer_interrupt+0x124/0x62c
[CF567C20] [C0012C40] ret_from_except+0x0/0x14
--- Exception: 901 at __delay+0x44/0x5c
    LR = ips_send_wait+0xa8/0xe4 [ips]
[unquote]

Could it be that some of the earlier dmesg messages, quoted below, are related to
this bug (although I do not think so, since other PCI devices are OK)?
[quote]
PCI: Probing PCI hardware
PCI: Cannot allocate resource region 0 of device 0000:00:0b.0
PCI: Cannot allocate resource region 0 of device 0000:00:0c.0
PCI: Cannot allocate resource region 0 of device 0000:00:10.0
PCI: Cannot allocate resource region 0 of device 0001:40:0b.0
PCI: Cannot allocate resource region 0 of device 0001:40:0b.1
PCI: Cannot allocate resource region 0 of device 0001:40:0c.0
PCI: Cannot allocate resource region 0 of device 0002:80:0b.0
PCI: Cannot allocate resource region 0 of device 0002:80:0c.0
[unquote]
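
Editorial note: the resource-allocation messages above can be tabulated to see which devices they name and whether the ServeRAID controller (0001:40:0c.0) is among them. The following is a minimal sketch, not part of the original comment. The names PAT and failed_regions are hypothetical, and the log text is copied from the quote above.

```python
import re

PCI_LOG = """\
PCI: Cannot allocate resource region 0 of device 0000:00:0b.0
PCI: Cannot allocate resource region 0 of device 0000:00:0c.0
PCI: Cannot allocate resource region 0 of device 0000:00:10.0
PCI: Cannot allocate resource region 0 of device 0001:40:0b.0
PCI: Cannot allocate resource region 0 of device 0001:40:0b.1
PCI: Cannot allocate resource region 0 of device 0001:40:0c.0
PCI: Cannot allocate resource region 0 of device 0002:80:0b.0
PCI: Cannot allocate resource region 0 of device 0002:80:0c.0
"""

PAT = re.compile(r"PCI: Cannot allocate resource region (\d+) of device (\S+)")


def failed_regions(log):
    """Return (device, region index) pairs for each allocation failure."""
    return [(dev, int(region)) for region, dev in PAT.findall(log)]


failed = failed_regions(PCI_LOG)
devices = {dev for dev, _ in failed}

SERVERAID = "0001:40:0c.0"  # address of the controller in this report
print("devices with failures:", len(devices))
print("ServeRAID listed:", SERVERAID in devices)
```

The controller is one of eight devices reporting the message for region 0. Since the same message appears for devices that reportedly work, the listing alone does not show that it causes the ips probe failure.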
Comment 14 Alexey Bozrikov 2007-03-06 03:54:15 EST
After the latest kernel update the situation is still the same:
# uname -a
f50.company.com 2.6.19-1.2911.6.4.fc6smp #1 SMP Sat Feb 24 14:19:13 EST 2007 ppc ppc ppc GNU/Linux

dmesg shows:

BUG: soft lockup detected on CPU#1!
Call Trace:
[CF405B40] [C0008F4C] show_stack+0x50/0x184 (unreliable)
[CF405B60] [C00685B4] softlockup_tick+0xe4/0x100
[CF405B80] [C0043F50] run_local_timers+0x18/0x28
[CF405B90] [C0044214] update_process_times+0x48/0x84
[CF405BB0] [C000F194] timer_interrupt+0x124/0x62c
[CF405C20] [C0012C40] ret_from_except+0x0/0x14
--- Exception: 901 at __delay+0x40/0x5c
    LR = ips_send_wait+0xa8/0xe4 [ips]
[CF405CE0] [F218D1D0] ips_send_wait+0xa0/0xe4 [ips] (unreliable)
[CF405D00] [F218E55C] ips_init_phase2+0x118/0xcd4 [ips]
[CF405D40] [F2190D28] ips_insert_device+0x9d4/0xa64 [ips]
[CF405D90] [C0148318] pci_device_probe+0x6c/0xa0
[CF405DB0] [C01CD748] really_probe+0x54/0x140
[CF405DD0] [C01CDAB0] __driver_attach+0xbc/0x130
[CF405DF0] [C01CCB84] bus_for_each_dev+0x50/0x94
[CF405E20] [C01CD624] driver_attach+0x24/0x34
[CF405E30] [C01CCF54] bus_add_driver+0x68/0x190
[CF405E50] [C01CDE34] driver_register+0x98/0xac
[CF405E60] [C0148124] __pci_register_driver+0x94/0xd4
[CF405E70] [F2106028] ips_module_init+0x28/0x300 [ips]
[CF405E90] [C005B974] sys_init_module+0x1610/0x17bc
[CF405F40] [C0012594] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xff2c9a4
    LR = 0x100036c0
ips 0001:40:0c.0: unable to read config from controller.
ips 0001:40:0c.0: Unable to initialize controller
ips: probe of 0001:40:0c.0 failed with error -1
Comment 15 Jon Stanley 2007-12-30 21:19:25 EST
Hello,

I'm reviewing this bug as part of the kernel bug triage project, an attempt to
isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself to this bug, however this version of Fedora is no longer
maintained.

Please attempt to reproduce this bug with a current version of Fedora (presently
Fedora 8). If the bug no longer exists, please close the bug or I'll do so in a
few days if there is no further information lodged.

Thanks for using Fedora!
Comment 16 Alexey Bozrikov 2008-01-02 17:22:02 EST
Tested the bug. There are no more soft lockups, just the message:

ips 0001:40:0c.0: unable to read config from controller.
ips 0001:40:0c.0: Unable to initialize controller
ips: probe of 0001:40:0c.0 failed with error -1

Tried to load the IPS driver with various arguments, to no avail.

Alex
bozy@fgm.com.cy
Comment 17 Jon Stanley 2008-01-02 17:57:17 EST
Is this still in FC6, or in F8?
Comment 18 Alexey Bozrikov 2008-01-03 07:02:38 EST
This is FC7 (Moonshine) with the latest kernel, 2.6.23.8-34.fc7
Comment 19 Jon Stanley 2008-01-03 10:59:40 EST
Changing version to F7 in that case.
Comment 20 Jon Stanley 2008-03-09 01:10:05 EST
Is it possible to try on F8, or on the F9 beta which will be coming out soon
(currently scheduled for 3/20)?

Dave - any further words of wisdom on this one?
Comment 21 Brian Powell 2008-04-24 23:46:23 EDT
Note that maintenance for Fedora 7 will end 30 days after the GA of Fedora 9.
Comment 22 Brian Powell 2008-04-25 00:02:13 EDT
The information we've requested above is required in order
to review this problem report further and diagnose/fix the
issue if it is still present.  Since there have not been any
updates to the report in the thirty (30) or more days since we
requested additional information, we're assuming the problem
is either no longer present in the current Fedora release, or
that there is no longer any interest in tracking the problem.

Setting status to "CLOSED INSUFFICIENT_DATA".  If you still
experience this problem after updating to our latest Fedora
release and can provide the information previously requested, 
please feel free to reopen the bug report.

Thank you in advance.

