Bug 140002 - [PATCH] i2o_block timeout Adaptec 2400A raid card
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: i386 Linux
Priority: medium  Severity: high
Assigned To: Adam "mantis" Manthei
Blocks: 154907 156322
Reported: 2004-11-19 03:23 EST by Sergey Shanygin
Modified: 2007-11-30 17:07 EST
Fixed In Version: RHSA-2005-514
Doc Type: Bug Fix
Last Closed: 2005-10-05 08:33:24 EDT


Attachments
output of lspci -vv (5.63 KB, text/plain)
2004-11-30 07:26 EST, Sergey Shanygin
patch to increase timeout for LCT_NOTIFY reply (423 bytes, patch)
2004-12-16 13:20 EST, Markus Lidel

Description Sergey Shanygin 2004-11-19 03:23:31 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.3)
Gecko/20041026 Firefox/1.0RC1

Description of problem:
This is identical to the previously reported bug 137866, observed on
Fedora Core 2 and 3.

Fedora Core 1 and Red Hat 9 installed on the same machine without a hitch.


Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Start the installation; when it reports no disk, manually select the i2o
driver.
2. Go through the next few screens until you reach partitioning.
3. Regardless of automatic or manual partitioning, the install crashes and reboots.


Actual Results:  Failure to recognise the above-mentioned RAID card.

Expected Results:  proper install

Additional info:

Worked with FC1 and RH9 on the same HW.
Comment 1 Jeremy Katz 2004-11-22 17:03:14 EST
What driver is this card supposed to use?  i2o?

Can you switch to tty2 once there's a shell there and get the output of ls -lR
/sys/bus/i2o?
Comment 2 Sergey Shanygin 2004-11-23 11:36:49 EST
The card's BIOS says "Adaptec i2o bios v001.62 (2002/11/06)", so it must be
I2O. In RH9 and FC1 the driver was dpt_i2o, which is no longer used.
With that driver the card was recognized without any problem.
Here is the output:
/sys/bus/i2o:
total 0
drwxr-xr-x  2 root 0 0 Nov 23 17:44 devices
drwxr-xr-x  4 root 0 0 Nov 23 17:45 drivers

/sys/bus/i2o/devices:
total 0

/sys/bus/i2o/drivers:
total 0
drwxr-xr-x  2 root 0 0 Nov 23 17:45 block-osm
drwxr-xr-x  2 root 0 0 Nov 23 17:44 exec-osm

/sys/bus/i2o/drivers/block-osm:
total 0

/sys/bus/i2o/drivers/exec-osm:
total 0

I also thought I would add the relevant part of the dmesg output, just to prove
that the i2o driver is actually loaded and (!) seems to function...

-------------------
divert: allocating divert_blk for eth0
I2O Core - (C) Copyright 1999 Red Hat Software
i2o: max_drivers=4
i2o: Checking for PCI I2O controllers...
ACPI: PCI interrupt 0000:00:06.0[A] -> GSI 5 (level, low) -> IRQ 5
i2o: I2O controller found on bus 0 at 48.
i2o: PCI I2O controller at FA000000 size=1048576
i2o: using write combining MTRR
i2o: MTRR workaround for Intel i960 processor
iop0: Installed at IRQ 5
iop0: Activating I2O controller...
iop0: This may take a few minutes if there are many devices
iop0: HRT has 1 entries of 16 bytes each.
Adapter 00000012: TID 0000:[HPC*]:PCI 1: Bus 1 Device 22 Function 0
I2O controller: probe of 0000:00:06.0 failed with error -110
I2O Block Storage OSM v0.9
   (c) Copyright 1999-2001 Red Hat Software.
block-osm: registered device at major 80
---------------------------------------
---------------------------------------
md: raid0 personality registered as nr 2
md: raid1 personality registered as nr 3
raid5: automatically using best checksumming function: pIII_sse
   pIII_sse  :  3044.000 MB/sec
raid5: using function: pIII_sse (3044.000 MB/sec)
md: raid5 personality registered as nr 4
raid6: int32x1    257 MB/s
raid6: int32x2    593 MB/s
raid6: int32x4    570 MB/s
raid6: int32x8    589 MB/s
raid6: mmxx1     1640 MB/s
raid6: mmxx2     2187 MB/s
raid6: sse1x1    1121 MB/s
raid6: sse1x2    2058 MB/s
raid6: sse2x1    2632 MB/s
raid6: sse2x2    2429 MB/s
raid6: using algorithm sse2x1 (2632 MB/s)
md: raid6 personality registered as nr 8
device-mapper: 4.1.0-ioctl (2003-12-10) initialised: dm@uk.sistina.com
Comment 3 Jeremy Katz 2004-11-24 23:47:28 EST
Is i2o_block loaded?
Comment 4 Sergey Shanygin 2004-11-25 11:03:31 EST
Yes, I see it loading during the first part of the install; then the installer
says there is no HD and asks me to select a driver manually. I select I2O
from the list.
Excerpt from lsmod:
..........
Module                  Size  Used by    Not tainted
raid6 102481 0 - Live 0xe0a12000
raid5 25793 0 - Live 0xe08fe000
xor 13641 2 raid6,raid5, Live 0xe08f9000
raid1 20929 0 - Live 0xe0977000
raid0 7617 0 - Live 0xe0871000
i2o_block 13773 0 - Live 0xe08f4000
i2o_core 39385 1 i2o_block, Live 0xe08a7000
...........
Comment 5 Jeremy Katz 2004-11-25 16:13:16 EST
Then the kernel isn't exporting the appropriate sysfs bits I need for
detecting i2o devices.
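
For anyone wanting to repeat this check from the installer shell, here is a minimal sketch in Python. It assumes the sysfs layout shown in comment 2 (a `devices` and a `drivers` directory under the bus root); the helper name `list_bus_entries` is hypothetical, and this is not the installer's actual detection code.

```python
import os


def list_bus_entries(bus_root):
    """Return (devices, drivers) found under a sysfs bus directory.

    A missing subdirectory is reported as an empty list.
    """
    def entries(subdir):
        path = os.path.join(bus_root, subdir)
        try:
            return sorted(os.listdir(path))
        except FileNotFoundError:
            return []

    return entries("devices"), entries("drivers")


if __name__ == "__main__":
    devices, drivers = list_bus_entries("/sys/bus/i2o")
    print("drivers:", drivers)
    print("devices:", devices or "none exported")
```

With the directory listing from comment 2, this would report the two registered drivers and no devices, which matches the symptom described here.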

Comment 6 Markus Lidel 2004-11-26 08:06:14 EST
I think the driver has problems enabling the controller.

> I2O controller: probe of 0000:00:06.0 failed with error -110

means that something timed out. Probably some interrupt problem...
Could you try to boot with kernel parameter "noacpi" and report whether the
error disappears?
Comment 7 Sergey Shanygin 2004-11-29 05:04:04 EST
I tried "noacpi" parameter, the same problem persists.
Comment 8 Markus Lidel 2004-11-29 05:54:30 EST
Hmmm, strange...

OK, another try. Could you please try with:

pci=noacpi acpi=noirq

or probably

pci=bios acpi=noirq?
Comment 9 Sergey Shanygin 2004-11-29 10:45:08 EST
Tried both, same result. From dmesg:

"I2O controller: probe of 0000:00:06.0 failed with error -110"

What is error -110?
Comment 10 Markus Lidel 2004-11-29 13:04:59 EST
-110 is ETIMEDOUT. It's defined in include/asm-generic/errno.h

To activate the controller, the driver sends messages to the controller
and waits for a response. It seems that the response never arrives,
or the driver is not notified that a response has arrived...

I'm pretty sure, it has something to do with interrupts, because IIRC
i2o_hrt_get is the first function which uses the interrupt...

Could you send me the lspci -vv output, please?
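
As an illustration of the lookup above, a small Python sketch that pulls the error code out of a "probe of ... failed" line and names it. The table lists only a handful of codes with their values from the Linux generic errno headers; the function name `decode_probe_failure` and the regular expression are assumptions made for this example.

```python
import re

# Values as defined in the Linux generic errno headers; only a few common codes.
ERRNO_NAMES = {
    5: "EIO",
    12: "ENOMEM",
    16: "EBUSY",
    19: "ENODEV",
    110: "ETIMEDOUT",
}

PROBE_FAILURE = re.compile(r"probe of (\S+) failed with error (-?\d+)", re.IGNORECASE)


def decode_probe_failure(line):
    """Return (device, errno_name) for a 'probe of ... failed' line, else None."""
    match = PROBE_FAILURE.search(line)
    if match is None:
        return None
    device, code = match.group(1), abs(int(match.group(2)))
    return device, ERRNO_NAMES.get(code, "errno %d" % code)


if __name__ == "__main__":
    print(decode_probe_failure(
        "I2O controller: probe of 0000:00:06.0 failed with error -110"))
```

Codes missing from the table are returned in numeric form rather than guessed.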
Comment 11 Sergey Shanygin 2004-11-30 07:26:49 EST
Created attachment 107618
output of lspci -vv
Comment 12 Markus Lidel 2004-12-01 07:41:08 EST
The only thing which looks strange to me is the line:

BIST is running

I think it should be:

BIST result: 00

If you load the i2o_core module, does the value of interrupt 5 in
/proc/interrupts change?
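
One way to answer this question is to save the text of /proc/interrupts before and after loading the module and compare the counters. The Python sketch below assumes the usual layout (an IRQ label, a colon, then one counter per CPU); the function names `irq_counts` and `irq_delta` are hypothetical.

```python
def irq_counts(interrupts_text):
    """Map each IRQ label in /proc/interrupts text to its total count across CPUs."""
    counts = {}
    for line in interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # header line ("CPU0 ...") has no colon
        total = 0
        for field in rest.split():
            if not field.isdigit():
                break  # per-CPU counters end where the controller name begins
            total += int(field)
        counts[label.strip()] = total
    return counts


def irq_delta(before_text, after_text, irq):
    """Return how much an IRQ's count grew, or None if it is absent afterwards."""
    after = irq_counts(after_text)
    if irq not in after:
        return None
    return after[irq] - irq_counts(before_text).get(irq, 0)
```

A result of `None` for IRQ "5" would mean no handler is registered on that line at all, which is a different finding from a counter that stays at zero.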
Comment 13 Tom Coughlan 2004-12-02 17:28:08 EST
I tested the installer on an Adaptec 2400A in a Dell Precision 450.
It configured the /dev/i2o/hda device with no problem.

This further suggests that the problem is related to an interrupt
routing issue that is specific to certain platforms, as opposed to a
pervasive problem in the driver or the installer. 

Although I am running a very recent RHEL 4 internal build
(2.6.9-1.849_EL), I tend to doubt that it has a magic fix. It is more
likely related to interrupts.   
Comment 15 Sergey Shanygin 2004-12-09 11:13:34 EST
At start the BIOS reports:
Adaptec I2o BIOS v0001.62 (2002/11/06)
Controller:0xFA000000 IRQ5 2400A FW3A0L
.........
press F1 to continue (pressing F1)
Startup screen for the type of install - graphical, text
use text.
loading vmlinuz...

Pressing ALT-F3 showed that it:
*loaded i2o_core from /modules/modules.cgz
*------ i2o_block --------------------
<skip>
*inserted /tmp/i2o_core.ko
*--------------i2o_block.ko
* load module set done

Pressing ALT-F4 showed that:
<6>I2O Core - (c) Copyright 1999 Red Hat Soft
<6>i2o: max_drivers=4
<6>i2o:Checking for PCI  I2O controllers
<6>i2o:PCI interrupt 0000:00:06.0[A] ->GSI 5 (level, low)->IRQ 5
<6>i2o:I2O controller found on bus 0 at 48
<6>i2o: PCI I2O controller at FA000000 size=1048576
<6>i2o: using write combining MTRR
<6>i2o: MTRR workaround for Intel i960 processor
<6>iop0: Installed at IRQ 5
<6>iop0: Activating I2O controller
<6>iop0: This may take a few minutes if there are many devices
<6>iop0: HRT has 1 entries of 16 bytes each
<6>Adapter 00000012:TID 0000:[HPC*]:PCI 1: Bus 1 Device22 Function 0
<4>I2O controller: Probe of 0000:00:06 failed with error -110
<6>I2O block Storage OSM v0.9
<6>   (c) Copyright 1999-2001 Red Hat Software
<6>block-osm: registered device at major 80
 End of relevant part....

At that point, on the first console, I am at the screen "Begin testing
the CD..."
Skip the test... Next screen - No hard drives have been found... Would
you like to select drivers now? - yes

Choosing I2O block driver (i2o_block)
I am now at the Welcome screen; ALT-F2 brings me to the console
lsmod:
i2o_block 13773 0 - Live 0x08f4000
i2o_core 39385 1 i2o_block; Live 0xe08a7000

/proc/interrupts is 0

that's all I could extract...
Comment 16 Sergey Shanygin 2004-12-13 04:41:09 EST
Sorry, contrary to what I wrote I missed the contents of
/proc/interrupts. Here it is:
 
          CPU0       
  0:     236510          XT-PIC  timer
  1:        268          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  3:          0          XT-PIC  ohci_hcd
  6:         31          XT-PIC  floppy
  8:          0          XT-PIC  rtc
  9:          0          XT-PIC  acpi
 11:       1815          XT-PIC  ide2
 12:       1226          XT-PIC  i8042
NMI:          0 
ERR:          0
Comment 17 Sergey Shanygin 2004-12-13 05:36:22 EST
And the kernel version I am running is 2.6.9-1.648_EL #1 Tue Oct 26
12:39:58 EDT 2004
Comment 18 Markus Lidel 2004-12-13 05:47:15 EST
Is the output from /proc/interrupts from before or after you loaded i2o_core?
Comment 19 Sergey Shanygin 2004-12-13 09:24:34 EST
At the very end, just before the Disk Druid part of install.
Comment 20 Markus Lidel 2004-12-15 05:58:34 EST
OK, I have the same error report as yours from a different system, but with
the same controller... I'll let you know if I find something out...
Comment 21 Markus Lidel 2004-12-16 13:19:11 EST
OK, on the other system the timeout occurred because the response from
the I2O controller to LCT_NOTIFY took too long. I'll submit a patch
which increases the timeout.

If on your system loading the dpt_i2o driver also takes very
long (> 20 sec.), it's very likely to be the same problem.
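
To illustrate the failure mode in general terms, here is a small Python simulation of waiting for a reply with a timeout. It is only a sketch of the pattern, not the I2O driver code: the simulated controller, the function name `wait_for_reply`, and the delays are all assumptions chosen for the example.

```python
import threading

ETIMEDOUT = 110  # value from the Linux errno headers


def wait_for_reply(reply_delay, timeout):
    """Simulate posting a request and waiting for its reply.

    reply_delay: seconds the simulated controller takes to answer.
    timeout: seconds the caller is willing to wait.
    Returns 0 on success or -ETIMEDOUT if the reply came too late.
    """
    replied = threading.Event()
    controller = threading.Timer(reply_delay, replied.set)
    controller.start()
    try:
        return 0 if replied.wait(timeout) else -ETIMEDOUT
    finally:
        controller.cancel()  # no-op if the reply already fired
        controller.join()


if __name__ == "__main__":
    # A slow controller with a short timeout fails; a longer timeout succeeds.
    print(wait_for_reply(reply_delay=0.2, timeout=0.05))
    print(wait_for_reply(reply_delay=0.2, timeout=1.0))
```

The controller behaves identically in both calls; only the caller's patience differs, which is why raising the timeout is enough to make a slow but working controller usable.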
Comment 22 Markus Lidel 2004-12-16 13:20:28 EST
Created attachment 108732
patch to increase timeout for LCT_NOTIFY reply
Comment 23 Sergey Shanygin 2004-12-17 04:51:04 EST
The dpt_i2o driver (used in RHEL3, FC1 and before) took a while to load,
but it loaded and eventually worked well. It is specifically the newer i2o
driver (I presume associated with kernel 2.6.xx) which is the problem.
It definitely *is* a slower load, although I did not clock it.
Comment 24 Markus Lidel 2004-12-17 17:20:55 EST
You are right: in 2.6 the I2O subsystem was partly rewritten, and the
timeout was set too short. The patch to increase the timeout has
already been sent for inclusion in the mainline kernel, and will be in 2.6.10.
Comment 26 Warren Togami 2005-03-04 19:11:07 EST
Markus, did this patch end up merged upstream yet?
Comment 27 Markus Lidel 2005-03-04 20:15:36 EST
Yep, it's already merged in 2.6.10.
Comment 30 Mike Kinney 2005-06-10 10:13:21 EDT
(In reply to comment #27)
> Yep, it's already merged in 2.6.10.

I have a RHEL4 system, which is on the latest kernel 2.6.9-11.ELsmp, and I
cannot see the data on a 2400a array. What's the timeline for getting the 2.6.10
kernel out, or how can I get the patch before then?
Comment 31 Tom Coughlan 2005-06-10 13:53:25 EDT
This patch is planned for RHEL 4 Update 2. (BTW, this, like all RHEL 4 kernels,
will be 2.6.9 based.) U1 just shipped so U2 is a ways away. If you need an RHEL
4 kernel with this patch before U2, please contact the Red Hat support
organization. 
Comment 32 Warren Togami 2005-06-10 15:56:22 EDT
Hi Mike, have you confirmed that newer kernels, or 2.6.9 with this patch
included, work with your 2400a array?  AFAIK this patch is not necessarily meant
to make this 2400a card work properly, but it helps I2O cards in general that need
more time to be found by the driver on certain motherboards/chipsets.  So
your particular problem may or may not be helped by this patch.
Comment 38 Warren Togami 2005-09-16 17:07:12 EDT
Reportedly this issue is reproducible only on some rarer motherboards or
chipsets and/or with certain I2O cards.  Markus, how well has this fix been
working for the upstream kernel?
Comment 39 Markus Lidel 2005-09-19 07:17:44 EDT
Hi...

I don't know of any issues with Adaptec controllers and kernel >= 2.6.12... And
Promise controllers should work with kernel version >= 2.6.13, too...

Bye...
Comment 41 Red Hat Bugzilla 2005-10-05 08:33:24 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-514.html
