Bug 110302 - 3ware kernel panic on disk write
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: athlon Linux
Priority: high, Severity: high
Assigned To: Tom Coughlan
Reported: 2003-11-17 21:53 EST by Need Real Name
Modified: 2007-11-30 17:06 EST (History)
Doc Type: Bug Fix
Last Closed: 2006-03-20 10:27:28 EST
Attachments
kernel panic message (53.93 KB, text/plain)
2003-11-17 22:08 EST, Need Real Name
netdump log (17.65 KB, text/plain)
2004-08-17 12:08 EDT, daryl herzmann
netdump log for 2.4.20-21.0.1.ELsmp (27.82 KB, text/plain)
2004-12-27 10:46 EST, daryl herzmann
Description Need Real Name 2003-11-17 21:53:39 EST
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)

Description of problem:
ASUS A7N8X Deluxe
1024MB RAM
Boot Drive: Maxtor 54098H8

3ware 3W-6410
4 x Maxtor 6Y160P0
RAID 5
ext3

Writing any file larger than 100 MB to the RAID volume causes a kernel panic.


Version-Release number of selected component (if applicable):
kernel-2.4.21-4.0.1.EL

How reproducible:
Always

Steps to Reproduce:
1. Boot system
2. Write file to RAID volume

Actual Results:  Kernel panic

Expected Results:  successful file write, no kernel panic

Additional info:
Comment 1 Need Real Name 2003-11-17 22:08:35 EST
Created attachment 96038
kernel panic message
Comment 2 Need Real Name 2003-11-17 22:09:27 EST
The panic sometimes occurs with files as small as 1 MB; it always occurs 
with files larger than 100 MB.
Comment 3 Arjan van de Ven 2003-11-18 03:18:34 EST
nvnet

Can you reproduce this issue without binary-only kernel modules loaded?
Comment 4 Need Real Name 2003-11-18 11:41:32 EST
I noticed in the 2.4.22 changelog 
(http://www.kernel.org/pub/linux/kernel/v2.4/ChangeLog-2.4.22) that the 
3ware driver was updated.

Adam Radford:
  o 3ware driver update

Curious to know if the update was bug or feature related.
Comment 5 Need Real Name 2003-11-18 11:53:09 EST
These are the updates in the 2.4.22 3w-xxxx.c:
1.02.00.034 - Fix tw_decode_bits() to handle multiple errors.
              Add support for user configurable cmd_per_lun. 
              Add support for sht->select_queue_depths.      
1.02.00.035 - Improve tw_allocate_memory() memory allocation.
              Fix tw_chrdev_ioctl() to sleep correctly.      
1.02.00.036 - Increase character ioctl timeout to 60 seconds.
Comment 6 Need Real Name 2003-11-25 11:37:39 EST
I replaced the 6410-4 with a 7500-4, and the system does not crash now.  
Apparently this issue is related to the 6410 card only.
Comment 7 Erik Espinoza 2003-12-29 23:17:18 EST
I had the same problem.  Using the kernel-BOOT rpm works fine, but it
is missing modules.  I am currently looking into the difference in
the configs.  All I noticed regarding SCSI was the following:

kernel-BOOT:
# CONFIG_SCSI_LOGGING is not set
# CONFIG_SCSI_CONSTANTS is not set
# CONFIG_SCSI_DEBUG is not set

kernel-regular:
CONFIG_SCSI_LOGGING=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_DEBUG=m

I am going to change the kernel-regular to match the kernel-BOOT
settings for SCSI and see if that works.
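
The comparison above can be automated when two full config files are at hand. The following is a minimal sketch, not a tool used by anyone in this report: the helper names `parse_kconfig` and `diff_options` are hypothetical, and the two sample strings contain only the six lines quoted in this comment, not the complete configs.

```python
def parse_kconfig(text):
    """Map option name -> value; '# CONFIG_X is not set' is recorded as 'n'."""
    suffix = " is not set"
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # Drop the leading "# " and the trailing " is not set".
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_options(a, b, prefix="CONFIG_SCSI"):
    """Return {name: (a_value, b_value)} for options under prefix that differ."""
    names = sorted(n for n in set(a) | set(b) if n.startswith(prefix))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# Sample input: only the lines quoted in the comment above.
BOOT = """\
# CONFIG_SCSI_LOGGING is not set
# CONFIG_SCSI_CONSTANTS is not set
# CONFIG_SCSI_DEBUG is not set
"""
REGULAR = """\
CONFIG_SCSI_LOGGING=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_DEBUG=m
"""

changes = diff_options(parse_kconfig(BOOT), parse_kconfig(REGULAR))
for name, (boot, regular) in changes.items():
    print(name, boot, "->", regular)
```

Run against the real config files, the same functions would list every SCSI option that differs between the two kernels, which narrows down what to change before rebuilding.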
Comment 8 Ernie Petrides 2004-04-08 13:50:31 EDT
Could the bug reporter please answer the question in comment #3?

Thanks.  -ernie
Comment 9 Need Real Name 2004-04-08 14:09:20 EDT
Per comment #6, I replaced the 6410-4 with a 7500-4 and the system 
does not crash now.  Apparently this issue is related to the 6410 
card only (which is no longer installed in the system because it was 
crashing).  No vendor modules were used on the system; 3ware has been 
including their source within the kernel tree for quite some time.
Comment 10 daryl herzmann 2004-07-02 09:40:10 EDT
Hi,

For what it is worth, I am getting the same kernel panics while
writing to 6410 RAID arrays.  I am currently running 
2.4.21-15.0.3.ELsmp on a dual Athlon system.  Unfortunately, I am
less competent than a newbie, so I am having trouble grabbing the
kernel trace for you :(  I will do some more RTFM this weekend and
see if I can grab one.  If you have any other suggestions, please 
let me know.  This problem did not occur while I was running 
2.4.18-4 back on RH7.3.  This bug is easy to trigger: a simple
dd if=/dev/zero of=test on a 3ware RAID array will panic the system in a
few seconds :(
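
For anyone who wants a bounded version of that dd test (one that stops at a chosen size instead of filling the volume), here is a minimal sketch. It is an illustration only, not a script anyone in this report ran: the function name `write_zeros` and the example path are hypothetical, and the file has to be placed on the array under test.

```python
import os


def write_zeros(path, total_bytes, chunk_bytes=1024 * 1024):
    """Write total_bytes of zeros to path in chunks, then force them to disk."""
    chunk = bytes(chunk_bytes)  # a buffer of zero bytes, like /dev/zero
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            f.write(chunk[:n])
            written += n
        f.flush()
        os.fsync(f.fileno())  # make sure the data actually reaches the device
    return written


# Example call (hypothetical mount point; 200 MB is above the 100 MB
# threshold mentioned in the original description):
# write_zeros("/mnt/raid/test", 200 * 1024 * 1024)
```

On an affected system the expectation from this report is a panic partway through the write, so run it only where a crash is acceptable.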

For my 6410 cards (I have 3 of them in the system):
Monitor version:   ME6X 1.01.00.028
Firmware version:  FE6X 1.02.28.053
BIOS version:      BE6X 1.07.02.005

Sorry for not being too helpful.  I will try to get something for you.

PS.  I did try kernel-smp-2.4.22-1.2194.nptl from FC1 and got a panic
     too.

daryl
Comment 11 daryl herzmann 2004-08-17 12:08:11 EDT
Created attachment 102799
netdump log

I got netdump working!  Here is the log file.  I am not certain if I am
supposed to process this file before uploading?  The machine is running
2.4.21-15.0.4.ELsmp

thanks! 
  daryl
Comment 12 daryl herzmann 2004-10-15 14:46:26 EDT
Hi,

Sorry to be a 'bug', but I was wondering about the status of this bug.
 Do I need to process my netdump log differently?  I still have it on
disk....

I can try to produce a crash on 2.4.21-20 if you like, but would
rather not if the log is not helpful...

The machine has 1 TB of data that I wouldn't mind taking out of ro
state one of these days....

thanks,
  daryl
Comment 13 daryl herzmann 2004-11-22 09:45:49 EST
Greetings,

This morning I installed the beta 2.4.21-25.ELsmp kernel on this box
and got a similar kernel panic.  Unfortunately, netdump didn't seem
to want to play nice this morning.

Any insights?  thanks.

daryl
Comment 14 daryl herzmann 2004-12-27 10:46:15 EST
Created attachment 109138
netdump log for 2.4.20-21.0.1.ELsmp

Here is a netdump log for the most recent kernel.  The procedure to generate
this crash was the same as noted before...

Happy holidays....
Comment 15 daryl herzmann 2004-12-27 10:47:06 EST
Mea culpa, that previous netdump was for 2.4.21-27.0.1.ELsmp
Comment 16 daryl herzmann 2005-02-01 11:53:45 EST
Greetings,

I hate to be a bug, but does anybody have thoughts on this bug?  If I
am doing something stupid, that is fine, but the bug might as well be
closed then.  I have scheduled downtime for this machine coming up
this weekend, so I can do any testing needed.  3ware has suggested I
try some newer drivers.  So if I do capture any kernel dumps, are the
ones I sent before any good or do I need to process them further somehow?

Sorry that I can't be more help than to whine :(

daryl
Comment 17 Jim Snow 2005-04-18 19:48:39 EDT
I am having a similar problem with Fedora Core 3 (32 bit) on a dual Opteron box
with a 3ware 8506-8 SATA RAID controller.  Rather than panicking, the machine
locks up hard (no response from keyboard, mouse) when attempting to run
mkfs.ext3 /dev/sda1 (sda being a RAID 5 array with 5 SATA disks).  The same problem
occurs running a uniprocessor kernel.  I am not running any proprietary kernel
modules.

I can provide more detailed information if anyone needs it to help track down
this problem.

I also posted a message to the Fedora Forum:
http://forums.fedoraforum.org/showthread.php?t=51875&highlight=3ware

-jim
Comment 18 daryl herzmann 2005-07-05 13:01:14 EDT
Hi,

Just to follow up.  An upgrade to RHEL 4AS resolved our problem.  We are happily
writing data again.  

daryl
Comment 19 Tom Coughlan 2006-03-20 10:27:28 EST
This old problem report appears to have been resolved by a variety of
work-arounds and upgrades. If there is still a problem, please test with the
latest RHEL Update and open a new BZ.  
