Bug 170633

Summary: System Stops responding with "queue 6 full" messages
Product: Red Hat Enterprise Linux 3
Reporter: Seth <seth.reinoso>
Component: kernel
Assignee: Tom Coughlan <coughlan>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: high
Docs Contact:
Priority: medium
Version: 3.0
CC: mike.l.romero, petrides
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: RHSA-2006-0144
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2006-03-15 16:47:44 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 168424    

Description Seth 2005-10-13 13:59:16 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7

Description of problem:
The machine stops responding, whether it is idle or under load.
The console displays a continuous stream of "queue 6 full" messages.
This error leaves no clues in /var/log/messages or in dmesg.
IBM eServer 366
4 Xeon processors
10 GB of memory


Version-Release number of selected component (if applicable):
kernel-2.4.21-37.ELSMP

How reproducible:
Always

Steps to Reproduce:
1. Leave the machine on.
  

Actual Results:  I returned later to find the console displaying line after line of "queue 6 full" error messages, without end.

Expected Results:  The machine should have remained stable and responsive

Additional info:

I suspect it's a RAID driver issue, but I'm not sure.

Comment 1 Ernie Petrides 2005-10-13 21:12:02 UTC
Hello, Seth.  I couldn't find that exact message format in the RHEL3
source pool.  Is the correct string possibly the following (where %d
and %ld would be replaced by numeric values)?

  Queue %d full, %ld outstanding.

That's from the aacraid driver.  Could you please include the list of
modules you've got loaded?

Thanks in advance.
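
For anyone trying to match console output against that format string, here is a minimal sketch in Python. It assumes the console line follows the aacraid format quoted above, and it also accepts the shortened, lower-case form from the original report. The names QUEUE_FULL and parse_queue_full are hypothetical helpers and are not part of the driver.

```python
import re

# Pattern derived from the format string "Queue %d full, %ld outstanding."
# The ", N outstanding" part is optional so that the abbreviated form
# quoted in the report ("queue 6 full") also matches.
QUEUE_FULL = re.compile(
    r"queue\s+(\d+)\s+full(?:,\s*(-?\d+)\s+outstanding)?",
    re.IGNORECASE,
)


def parse_queue_full(line):
    """Return (queue_number, outstanding_or_None), or None if no match."""
    match = QUEUE_FULL.search(line)
    if match is None:
        return None
    queue = int(match.group(1))
    outstanding = int(match.group(2)) if match.group(2) is not None else None
    return queue, outstanding
```

Calling parse_queue_full on a line that is not in this format returns None, so the function can be used as a filter over a captured console log.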


Comment 2 Seth 2005-10-14 12:14:40 UTC
When all three servers abend, the message is always:
Queue 6 full

lsmod 
-----

Module                  Size  Used by    Tainted: GF
autofs4                16888   0 (autoclean) (unused)
audit                  90872   2
tg3                    72680   1
microcode               6912   0 (autoclean)
keybdev                 2944   0 (unused)
mousedev                5656   0 (unused)
hid                    22532   0 (unused)
input                   6176   0 [keybdev mousedev hid]
ehci-hcd               20776   0 (unused)
usb-ohci               23208   0 (unused)
usbcore                81152   1 [hid ehci-hcd usb-ohci]
ext3                   90088   7
jbd                    55380   7 [ext3]
aacraid                63672   8
qla2300               599324   0 (unused)
sd_mod                 14160  16
scsi_mod              115756   3 [aacraid qla2300 sd_mod]
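
The listing above can be checked mechanically for the two things that matter here: the taint flags and whether aacraid is loaded. The following is a rough sketch, assuming the 2.4-style lsmod layout shown above (module name, size, use count, with the taint flags on the header line). parse_lsmod and SAMPLE are hypothetical names, and SAMPLE is an excerpt of the listing, not the full output.

```python
import re


def parse_lsmod(text):
    """Parse 2.4-style lsmod output.

    Returns (taint_flags_or_None, {module_name: (size, use_count)}).
    """
    taint = None
    modules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("Module"):
            # The header line carries the taint flags, e.g. "Tainted: GF".
            match = re.search(r"Tainted:\s*(\S+)", line)
            taint = match.group(1) if match else None
            continue
        fields = line.split()
        # Module rows start with: name, size, use count.
        if len(fields) >= 3 and fields[1].isdigit() and fields[2].isdigit():
            modules[fields[0]] = (int(fields[1]), int(fields[2]))
    return taint, modules


# Excerpt of the listing in this comment.
SAMPLE = """\
Module                  Size  Used by    Tainted: GF
aacraid                63672   8
qla2300               599324   0 (unused)
scsi_mod              115756   3 [aacraid qla2300 sd_mod]
"""

taint, modules = parse_lsmod(SAMPLE)
```

With this excerpt, taint holds the flag string from the header and modules contains an entry for aacraid, which is the driver under suspicion.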


Comment 3 Tom Coughlan 2005-10-14 12:43:01 UTC
> Tainted: GF

Please try to reproduce with a non-tainted kernel. 

Did you see this problem on earlier kernels? The aacraid driver has not changed
substantially since 2.4.21-20.EL (Update 3). 



Comment 4 Seth 2005-10-14 13:17:07 UTC
Yes I did see the problem with an earlier kernel. 

I found the resolution. 

The aacraid driver needs an update from IBM
http://www-1.ibm.com/support/docview.wss?rs=1201&uid=psg1MIGR-59553&loc=en_US
Updating the aacraid driver seems to work. 


Comment 5 Tom Coughlan 2005-10-14 13:50:08 UTC
Okay, we are planning to update the aacraid driver in Update 7. You will get a
notice when this has been checked in so you can test it. 

Comment 6 Tom Coughlan 2005-11-02 00:02:20 UTC
Please test the kernel located at:

http://people.redhat.com/coughlan/.2.4.21-37.7.ELdrvrtest2/

to verify that it solves the problem. 

This contains version 1.1.5-2412 of the aacraid driver. This is the latest from
Adaptec, and is a candidate for U7. 



Comment 7 Ernie Petrides 2005-11-23 00:37:19 UTC
A fix for this problem has just been committed to the RHEL3 U7
patch pool this evening (in kernel version 2.4.21-37.10.EL).
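
To check whether a given kernel is at or past that version, a simple numeric comparison is enough for the version strings that appear in this report. This is only a sketch: it compares the leading numeric parts of the version and release and is not a reimplementation of RPM's version comparison. kernel_key, has_fix and FIXED_IN are hypothetical names.

```python
import re

# Kernel version named in this comment as containing the fix.
FIXED_IN = "2.4.21-37.10.EL"


def kernel_key(version):
    """Turn e.g. 'kernel-2.4.21-37.10.EL' into ((2, 4, 21), (37, 10)).

    Only the leading numeric, dot-separated parts of the version and
    release are used; suffixes such as 'EL' or 'ELsmp' are ignored.
    """
    match = re.search(r"(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)", version)
    if match is None:
        raise ValueError("unrecognized kernel version: %r" % version)
    upstream = tuple(int(part) for part in match.group(1).split("."))
    release = tuple(int(part) for part in match.group(2).split("."))
    return upstream, release


def has_fix(version):
    """True if the version sorts at or after FIXED_IN."""
    return kernel_key(version) >= kernel_key(FIXED_IN)
```

For example, has_fix("kernel-2.4.21-37.ELSMP") is false for the kernel in the original report, while has_fix("2.4.21-38.EL") is true.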


Comment 9 Michael Romero 2005-12-14 18:02:01 UTC
I am also having this exact same problem with patched RHEL 3.5.  Can you put 
out a hugemem version of this kernel so that I can grab it and see whether it 
works for me?  I have 16 GB of RAM on this server, and it's also an xSeries 366. 

Comment 10 Michael Romero 2006-01-11 16:14:46 UTC
Can someone email me a link to the source RPM for this?  I'm trying to install 
a new QLogic driver and it's asking for it. Thanks! -Mike

Comment 11 Ernie Petrides 2006-01-11 19:58:15 UTC
Michael, all RPMs are in the RHEL3 beta channels on RHN.  The
latest U7 beta kernel is version 2.4.21-38.EL.

Comment 13 Red Hat Bugzilla 2006-03-15 16:47:44 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0144.html