Bug 190340 - I2O oops in all FC5 and FC4 >2.6.14 kernels
Status: CLOSED DUPLICATE of bug 189570
Product: Fedora
Classification: Fedora
Component: kernel
Version: 5
Hardware: i386 Linux
Priority: medium
Severity: medium
Assigned To: Kernel Maintainer List
QA Contact: Brian Brock
Reported: 2006-05-01 09:23 EDT by Need Real Name
Modified: 2007-11-30 17:11 EST
Last Closed: 2006-09-03 13:40:46 EDT

Description Need Real Name 2006-05-01 09:23:56 EDT
Description of problem: There appears to be a bug in the I2O drivers affecting
all released FC5 kernels and the FC4 update kernels newer than 2.6.14 (at least).


Version-Release number of selected component (if applicable): For FC5, the kernel
on the install CD is affected. For FC4, at least 2.6.16-1.2096 is affected.
2.6.14-1.1656 appears stable: I have a heavily used box running it with I2O that
has currently been up for 96 days.


How reproducible: Just cause lots of disk I/O. A quick way to crash the box is
this:

  for i in `seq 1 90`; do echo $i; dd if=/dev/zero of=/data/junk.$i count=250000 bs=4096 1>/dev/null 2>&1; done
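The one-liner above can also be wrapped in a small function so the target directory and sizes are easy to vary between test boxes. This is only a sketch: the name io_stress and its arguments are made up for illustration, the defaults simply mirror the values in the report, and unlike the original loop it stops at the first failed write.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original report):
#   io_stress DIR [ITERATIONS] [BLOCKS] [BLOCK_SIZE]
# Defaults mirror the report: 90 files of 250000 blocks, 4096 bytes each.
io_stress() {
    dir=$1
    iterations=${2:-90}
    blocks=${3:-250000}
    bs=${4:-4096}

    i=1
    while [ "$i" -le "$iterations" ]; do
        # Print the iteration number first, so the last number seen
        # shows how far the run got before a lockup.
        echo "$i"
        # Write zeros to DIR/junk.N; stop at the first failed write
        # (for example, when the disk fills up).
        dd if=/dev/zero of="$dir/junk.$i" count="$blocks" bs="$bs" \
            >/dev/null 2>&1 || return 1
        i=$((i + 1))
    done
}
```

Running `io_stress /data` should behave like the original loop. Passing smaller values, such as `io_stress /tmp/scratch 3 4 512`, only checks that the function itself works and is not expected to trigger the oops.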


Steps to Reproduce:
1. Lots of I/O, see above
  
Actual results: From an ssh connection I see this:
[root@localhost ~]# !for
for i in `seq 1 90`; do echo $i; dd if=/dev/zero of=/data/junk.$i count=250000 bs=4096 1>/dev/null 2>&1; done
1
2
3
4
5

Message from syslogd@localhost at Mon May  1 08:45:22 2006 ...
localhost kernel: ------------[ cut here ]------------

On the console I get a few screens' worth of messages, followed by a
"continuing in 120" countdown that reaches 0, after which nothing further happens.


Expected results:


Additional info: Looking through the dmesg output, I see messages about PCI and a
suggestion to use "pci=routeirq" as a boot-time kernel parameter. I'm testing
that now, and so far the looping dd has made it further than usual on the
2.6.16-1.2096SMP kernel. 2.6.11 and 2.6.14 appear to be able to complete it
indefinitely, or at least for as long as disk space holds out.

I'm not sure whether this is a "bug" or whether needing "routeirq" is common. I
only report it because a few boxes I had with FC4 and I2O drivers started crashing
like crazy after a kernel upgrade (via yum). I also tried an in-place upgrade to
FC5, which was disastrous since its stock kernel has the problem. I've not yet
tried that with pci=routeirq on the CD boot line. Also, the line in dmesg
suggests trying pci=routeirq and asks for the results to be reported if it helps.
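
When comparing runs with and without the parameter, it may help to confirm that the running kernel really was booted with it. A minimal sketch follows: the function name has_boot_param is made up for illustration, and it assumes the boot command line is available in /proc/cmdline, as on a standard Linux system.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original report):
#   has_boot_param PARAM [CMDLINE_FILE]
# Succeeds if PARAM appears as a whole token on the kernel command line.
has_boot_param() {
    param=$1
    file=${2:-/proc/cmdline}
    # Put each space-separated token on its own line, then require an
    # exact, fixed-string match of the whole line.
    tr ' ' '\n' < "$file" | grep -qxF -- "$param"
}
```

For example, `has_boot_param pci=routeirq && echo "routeirq active"` prints the message only when the parameter was given at boot.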

Hardware: SuperMicro 7043M-6 and other similar boxes with Adaptec 2000S and
similar hardware RAID cards.

If any log files or additional info might be helpful, let me know...
Comment 1 Need Real Name 2006-05-01 12:52:48 EDT
Without "pci=routeirq" the "dd" test above would complete only about 5
iterations before locking up. With that boot string it completed 88 before
locking up.
Comment 2 Dave 2006-05-08 17:03:40 EDT
I think this is probably a duplicate of my bug...

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=189570

Please let me know if you can mirror my testing...
Comment 3 Need Real Name 2006-05-08 17:10:38 EDT
Yes, appears to be a dupe. I searched and searched and of course found nothing
even close to this before filing the report, ah well...
Comment 4 Dave Russell 2006-09-03 13:40:46 EDT

*** This bug has been marked as a duplicate of 189570 ***
