Bug 110302 - 3ware kernel panic on disk write
Summary: 3ware kernel panic on disk write
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel   
Version: 3.0
Hardware: athlon
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Tom Coughlan
QA Contact:
URL:
Whiteboard:
Keywords:
Depends On:
Blocks:
Reported: 2003-11-18 02:53 UTC by Need Real Name
Modified: 2007-11-30 22:06 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2006-03-20 15:27:28 UTC

Attachments
kernel panic message (53.93 KB, text/plain)
2003-11-18 03:08 UTC, Need Real Name
netdump log (17.65 KB, text/plain)
2004-08-17 16:08 UTC, daryl herzmann
netdump log for 2.4.20-21.0.1.ELsmp (27.82 KB, text/plain)
2004-12-27 15:46 UTC, daryl herzmann

Description Need Real Name 2003-11-18 02:53:39 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)

Description of problem:
ASUS A7N8X Deluxe
1024MB RAM
Boot Drive: Maxtor 54098H8

3ware 3W-6410
4 x Maxtor 6Y160P0
RAID 5
ext3

Writing any file over 100MB to the RAID volume causes a kernel panic.


Version-Release number of selected component (if applicable):
kernel-2.4.21-4.0.1.EL

How reproducible:
Always

Steps to Reproduce:
1. Boot system
2. Write file to RAID volume

Actual Results:  Kernel panic

Expected Results:  successful file write, no kernel panic

Additional info:

Comment 1 Need Real Name 2003-11-18 03:08:35 UTC
Created attachment 96038 [details]
kernel panic message

Comment 2 Need Real Name 2003-11-18 03:09:27 UTC
The panic sometimes occurs at a 1MB file size; it always occurs with a
>100MB file size.

Comment 3 Arjan van de Ven 2003-11-18 08:18:34 UTC
nvnet

Can you reproduce this issue without binary-only kernel modules loaded?

Comment 4 Need Real Name 2003-11-18 16:41:32 UTC
I noticed in the 2.4.22 changelog 
(http://www.kernel.org/pub/linux/kernel/v2.4/ChangeLog-2.4.22) that the 
3ware driver was updated.

Adam Radford:
  o 3ware driver update

I am curious whether the update was bug related or feature related.

Comment 5 Need Real Name 2003-11-18 16:53:09 UTC
These are the updates in the 2.4.22 3w-xxxx.c:
1.02.00.034 - Fix tw_decode_bits() to handle multiple errors.
              Add support for user configurable cmd_per_lun. 
              Add support for sht->select_queue_depths.      
1.02.00.035 - Improve tw_allocate_memory() memory allocation.
              Fix tw_chrdev_ioctl() to sleep correctly.      
1.02.00.036 - Increase character ioctl timeout to 60 seconds.


Comment 6 Need Real Name 2003-11-25 16:37:39 UTC
I replaced the 6410-4 with a 7500-4, and the system does not crash now.  
Apparently this issue is related to the 6410 card only.

Comment 7 Erik Espinoza 2003-12-30 04:17:18 UTC
I had the same problem. Using the kernel-BOOT rpm works fine, but it
is missing modules. I am currently looking into the differences between
the configs. All I noticed regarding SCSI was the following:

kernel-BOOT:
# CONFIG_SCSI_LOGGING is not set
# CONFIG_SCSI_CONSTANTS is not set
# CONFIG_SCSI_DEBUG is not set

kernel-regular:
CONFIG_SCSI_LOGGING=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_DEBUG=m

I am going to change the kernel-regular config to match the kernel-BOOT
settings for SCSI and see if that works.
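
A comparison like the one above can be scripted rather than done by eye. The following is only a minimal sketch: the two config fragments are copied from this comment, and the helper names parse_kconfig and diff_kconfig are made up for illustration, not part of any kernel tooling. It treats "# CONFIG_X is not set" as the value "n" and reports every option whose value differs.

```python
import re

# Config fragments quoted in the comment above (not complete kernel configs).
BOOT = """\
# CONFIG_SCSI_LOGGING is not set
# CONFIG_SCSI_CONSTANTS is not set
# CONFIG_SCSI_DEBUG is not set
"""

REGULAR = """\
CONFIG_SCSI_LOGGING=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_DEBUG=m
"""


def parse_kconfig(text):
    """Return {option: value}; '# CONFIG_X is not set' is recorded as 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.match(r"^# (CONFIG_\w+) is not set$", line)
        if unset:
            options[unset.group(1)] = "n"
            continue
        assigned = re.match(r"^(CONFIG_\w+)=(.*)$", line)
        if assigned:
            options[assigned.group(1)] = assigned.group(2)
    return options


def diff_kconfig(a, b, prefix="CONFIG_"):
    """List (option, value_in_a, value_in_b) for options that differ."""
    names = sorted(k for k in set(a) | set(b) if k.startswith(prefix))
    return [
        (k, a.get(k, "absent"), b.get(k, "absent"))
        for k in names
        if a.get(k) != b.get(k)
    ]


for name, boot, regular in diff_kconfig(
    parse_kconfig(BOOT), parse_kconfig(REGULAR), prefix="CONFIG_SCSI_"
):
    print(f"{name}: BOOT={boot} regular={regular}")
```

To compare full configs, the two strings could be replaced with the contents of the actual config files; their location depends on the installation and is not stated in this report.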

Comment 8 Ernie Petrides 2004-04-08 17:50:31 UTC
Could the bug reporter please answer the question in comment #3?

Thanks.  -ernie


Comment 9 Need Real Name 2004-04-08 18:09:20 UTC
Per comment #6, I replaced the 6410-4 with a 7500-4 and the system 
does not crash now.  Apparently this issue is related to the 6410 
card only (which is no longer installed in the system because it was 
crashing).  No vendor modules were used on the system; 3ware has been 
including their source within the kernel tree for quite some time.

Comment 10 daryl herzmann 2004-07-02 13:40:10 UTC
Hi,

For what it is worth, I am getting the same kernel panics while
writing to 6410 RAID arrays.  I am currently running 
2.4.21-15.0.3.ELsmp on a dual Athlon system.  Unfortunately, I am
less competent than a newbie, so I am having trouble grabbing the
kernel trace for you :(  I will do some more RTFM this weekend and
see if I can grab one.  If you have any other suggestions, please 
let me know.  This problem did not occur while I was running 
2.4.18-4 back on RH7.3.  This bug is easy to trigger: a simple
dd if=/dev/zero of=test on a 3ware RAID array will panic the system in a
few seconds :(

For my 6410 cards (I have 3 of them in the system):
Monitor version:   ME6X 1.01.00.028
Firmware version:  FE6X 1.02.28.053
BIOS version:      BE6X 1.07.02.005

Sorry for not being too helpful.  I will try to get something for you.

PS.  I did try kernel-smp-2.4.22-1.2194.nptl from FC1 and got a panic
     too.

daryl
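
For anyone who wants a bounded version of the dd reproducer mentioned in this comment, here is a rough sketch. It is an assumption-laden stand-in, not the command that was actually run: the helper name write_zeros and the chunk size are made up, and unlike dd if=/dev/zero of=test with no count it stops after a fixed number of bytes instead of filling the volume. It writes zeros sequentially and then calls fsync so the data is pushed to the device.

```python
import os


def write_zeros(path, total_bytes, chunk_size=1024 * 1024):
    """Write total_bytes of zeros to path in chunk_size pieces, then fsync.

    Returns the number of bytes written. The path and sizes are chosen by
    the caller; nothing here is specific to 3ware hardware.
    """
    chunk = bytes(chunk_size)  # a buffer of zero bytes, like /dev/zero
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(chunk_size, total_bytes - written)
            f.write(chunk[:n])
            written += n
        f.flush()
        os.fsync(f.fileno())  # force the data out of the page cache
    return written
```

Since the reports in this bug say the panic always occurs above 100MB, a call such as write_zeros("/mnt/raid/test", 200 * 1024 * 1024) would be a plausible trial, where /mnt/raid is a hypothetical mount point for the array. Only run it on a machine you are prepared to have crash.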

Comment 11 daryl herzmann 2004-08-17 16:08:11 UTC
Created attachment 102799 [details]
netdump log

I got netdump working!  Here is the log file.  I am not certain if I am
supposed to process this file before uploading?  The machine is running
2.4.21-15.0.4.ELsmp.

thanks! 
  daryl

Comment 12 daryl herzmann 2004-10-15 18:46:26 UTC
Hi,

Sorry to be a 'bug', but I was wondering about the status of this bug.
Do I need to process my netdump log differently?  I still have it on
disk....

I can try and produce a crash on 2.4.21-20 if you like, but would
rather not if the log is not helpful...

The machine has 1 TB of data that I wouldn't mind taking out of ro
state one of these days....

thanks,
  daryl

Comment 13 daryl herzmann 2004-11-22 14:45:49 UTC
Greetings,

This morning I installed the beta 2.4.21-25.ELsmp kernel on this box
and got a similar kernel panic.  Unfortunately, netdump didn't seem
to want to play nice this morning.

Any insights?  thanks.

daryl

Comment 14 daryl herzmann 2004-12-27 15:46:15 UTC
Created attachment 109138 [details]
netdump log for 2.4.20-21.0.1.ELsmp

Here is a netdump log for the most recent kernel.  The procedure to generate
this crash was the same as noted before...

Happy holidays....

Comment 15 daryl herzmann 2004-12-27 15:47:06 UTC
Mea culpa, that previous netdump was for 2.4.21-27.0.1.ELsmp

Comment 16 daryl herzmann 2005-02-01 16:53:45 UTC
Greetings,

I hate to be a bug, but does anybody have thoughts on this bug?  If I
am doing something stupid, that is fine, but the bug might as well be
closed then.  I have scheduled downtime for this machine coming up
this weekend, so I can do any testing needed.  3ware has suggested I
try some newer drivers.  If I do capture any kernel dumps, are the
ones I sent before any good, or do I need to process them further somehow?

Sorry that I can't be more help than to whine :(

daryl

Comment 17 Jim Snow 2005-04-18 23:48:39 UTC
I am having a similar problem with Fedora Core 3 (32 bit) on a dual Opteron box
with a 3ware 8506-8 SATA RAID controller.  Rather than panicking, the machine
locks up hard (no response from keyboard, mouse) when attempting to run
mkfs.ext3 /dev/sda1 (sda being a RAID 5 array with 5 SATA disks).  The same
problem occurs running a uniprocessor kernel.  I am not running any proprietary
kernel modules.

I can provide more detailed information if anyone needs it to help track down
this problem.

I also posted a message to the Fedora Forum:
http://forums.fedoraforum.org/showthread.php?t=51875&highlight=3ware

-jim

Comment 18 daryl herzmann 2005-07-05 17:01:14 UTC
Hi,

Just to follow up.  An upgrade to RHEL 4AS resolved our problem.  We are happily
writing data again.  

daryl

Comment 19 Tom Coughlan 2006-03-20 15:27:28 UTC
This old problem report appears to have been resolved by a variety of
work-arounds and upgrades. If there is still a problem, please test with the
latest RHEL Update and open a new BZ.  

