Bug 132788

Summary: "Invalid packet 21 count! 15 messages" on nfs server console
Product: Red Hat Enterprise Linux 3 Reporter: jason andrade <jason>
Component: kernel Assignee: Tom Coughlan <coughlan>
Status: CLOSED ERRATA QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: 3.0 CC: deepak.kotian, petrides, pjb, tao
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version: RHEL 3 U6 Doc Type: Bug Fix
Last Closed: 2005-10-07 13:32:05 UTC Type: ---

Description jason andrade 2004-09-17 03:13:04 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US;
rv:1.7) Gecko/20040517 Camino/0.8b

Description of problem:
Hi,

Since we've upgraded our fileserver from RHELQU2 to RHELQU3 i've been
seeing a lot of the following messages on the console.  I haven't been
able to find out what they mean.

Invalid packet 21 count! 15



Version-Release number of selected component (if applicable):
kernel-2.4.21-20EL

How reproducible:
Always

Steps to Reproduce:
1. boot 2.4.21-20EL kernel
2. wait for a short period of time for nfs load

Actual Results:  error messages appear in console/messages logfile

Expected Results:  no messages

Additional info:

system is a dual processor xeon with hyperthreading disabled, 4G of ram
and about 6Tbytes of scsi/fibrechannel disk.

Comment 1 Neil Horman 2004-10-14 15:29:13 UTC
A quick search indicates these messages are originating in the qla
driver.  I suppose that means this is an arjan issue?

Comment 2 Elliot Kendall 2004-10-22 19:09:15 UTC
I'm running kernel 2.4.21-20.EL i686, using qla2200 to talk to a fibre
channel disk shelf. I see the same message only when writing to the FC
disk. I'm not using NFS at all on the machine in question.

Comment 3 Elliot Kendall 2004-10-26 17:01:26 UTC
I can confirm that this issue was introduced in 2.4.21-20. It does not
affect 2.4.21-15.0.4EL or 2.4.21-15EL. It seems to occur when writing
to, but not reading from, a qla2200-attached disk. For example, dd
if=/dev/zero of=/qladisk/bigfile will generate the messages, but
md5sum /qladisk/bigfile will not.

Comment 4 Deepak Kotian 2004-10-27 06:55:22 UTC
Did anyone get an answer as to why this message ("Invalid packet 21
count! 15 messages") comes up? I have a similar problem.
Please let us know:
1. Under what conditions is this message displayed?
2. Which layer logs this message?



Comment 6 jason andrade 2004-10-28 21:33:35 UTC
i don't think there has been an answer yet.  keeping in mind that as i
understand it, bugzilla is a way to report bugs to redhat, not to
actually get them resolved - that is what a support contract is for. 
unfortunately the system we're running this on is RHEL3/ES/Basic so i
don't have the ability to lodge a support call on it.  does anyone
else here do ?

so far it appears to be a cosmetic error that is just filling the
console with messages.  i haven't observed any actual corruption or
other issues yet. i am hoping it's only cosmetic..

regards,

-jason

Comment 7 Alex Georgopoulos 2004-10-29 22:55:55 UTC
I opened a case on this issue on Oct 11th with Web Support.  We
were noticing the problem on AS 3.0 x86_64 with the 2.4.21-20.ELsmp
kernel and the qla2200 driver.  We had to drop down to
2.4.21-15.0.4.ELsmp since our DBAs said it was causing a performance
issue during write operations to our FAStT 500.

Comment 8 Paul Batchelor 2004-11-03 02:07:04 UTC
I've noticed this happens on both the 32-bit and 64-bit versions - we
are using kernel 2.4.21-20.ELsmp with the qla2200 driver on AS 3.

As far as I have seen, it does not seem to be causing any corruption. 

Comment 9 Alex Georgopoulos 2004-11-11 20:55:01 UTC
Here is what Red Hat support is telling me about the issue:

Engineering is currently verifying fixes for several of the problems
which cause this behavior. It is likely that these fixes will be
released as an Errata, prior to Update 4. Currently this kswapd issue
has an extremely high priority as it is affecting a wide range of
customers. Please keep an eye on the errata announcements for a kernel
update. If you have any questions in the mean time please feel free to
respond. Thanks for working with Red Hat support.

Comment 10 Thorsten Jungblut 2005-06-03 10:00:33 UTC
We are encountering the same problem. It seems to be still unsolved in
2.4.21-32.0.1.

It applies only to the qla2200 HBA; a 2340 works without messages.

I've taken a look into the kernel sources to get a clue what the message means.
It has something to do with a response queue. The driver is expecting a queue
with a maximum of 14 entries (using an array with 14 fields to store it) but
gets an entry count of 15. The code will then access the (nonexistent) 15th
field of that array (see drivers/addon/qla2200/qla2x00.c).
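
For anyone following along, here is a minimal sketch in C of the pattern
described above: a count field checked against a limit, and a loop that must
not index past the end of the handle array. The struct name, field layout,
helper names and the capacity of 14 are assumptions taken from the description
in this comment, not copied from drivers/addon/qla2200/qla2x00.c. The clamping
is a defensive variation for illustration, not what the driver itself does.

```c
#include <stdio.h>

/* Hypothetical capacity, taken from the description in this comment;
 * not verified against the driver source. */
#define SKETCH_HANDLE_CAPACITY 14

/* Hypothetical stand-in for the driver's RIO type 1 entry. */
struct rio_entry_sketch {
    unsigned char handle_count;
    unsigned int  handle[SKETCH_HANDLE_CAPACITY];
};

/* Returns how many handles can be read safely: the reported count,
 * clamped to the array capacity. Logs when the count is too large. */
static int safe_handle_count(const struct rio_entry_sketch *rio)
{
    if (rio->handle_count > SKETCH_HANDLE_CAPACITY) {
        printf("Invalid packet 21 count! %i\n", rio->handle_count);
        return SKETCH_HANDLE_CAPACITY;
    }
    return rio->handle_count;
}

/* Sums the handles without ever indexing past the end of the array. */
static unsigned long sum_handles(const struct rio_entry_sketch *rio)
{
    unsigned long total = 0;
    int n = safe_handle_count(rio);

    for (int i = 0; i < n; i++)
        total += rio->handle[i];
    return total;
}
```

With a reported count of 15 and a capacity of 14, safe_handle_count() logs the
message and returns 14, so the loop in sum_handles() stays inside the array.
A check that only prints, without clamping, would let the loop run one entry
too far.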


Comment 11 jason andrade 2005-06-03 11:13:11 UTC
i'd confirm that. it's continued with me through QU2 right into QU5 running the
latest errata kernel.  it's pretty annoying in terms of filling up syslog :-/

i now have two systems with qla2200s so i'm resigned to both of them spewing
this message now.

Alex, did RH support ever get back to you? it's been about six months since
your post..


Comment 12 Jens Oberender 2005-10-05 11:51:51 UTC
We have the very same problem here, also with 2.4.21-32.0.1.ELsmp.

To Thorsten Jungblut and other experts:
do you consider this a threat to data security?

We will be looking further into it.
Please report if someone has solved the issue.

Thanks!

Comment 13 Ernie Petrides 2005-10-06 20:45:48 UTC
Fixing "component" field and reassigning to Tom.

Comment 14 Tom Coughlan 2005-10-07 13:32:05 UTC
The diff from Qlogic driver 7.01.01 to 7.05.00 has:

@@ -11116,8 +11276,9 @@ static void qla2x00_handle_RIO_type1_ioc

        rio = (struct rio_iocb_type1_entry *) pkt;

-       if (rio->handle_count > 14) {
-               printk("Invalid packet 21 count! %i\n", rio->handle_count);
+       if (rio->handle_count > 15) {
+               printk(KERN_INFO "Invalid packet 21 count! %i\n",
+                       rio->handle_count);
        }

        for (i=0; i < rio->handle_count; i++)


So the problem is fixed in 7.05.00. This driver just shipped in RHEL 3 U6. 
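
To see why the change silences the message for a count of 15, a small
standalone sketch (not driver code) can compare the two thresholds from the
diff. The helper and macro names below are hypothetical; only the values 14
and 15 come from the diff.

```c
#include <stdio.h>

/* Thresholds taken from the diff; the macro names are hypothetical. */
#define OLD_MAX_HANDLES 14  /* driver 7.01.01 */
#define NEW_MAX_HANDLES 15  /* driver 7.05.00 */

/* Hypothetical helper: returns 1 if the count would trigger the
 * "Invalid packet 21 count!" message for the given threshold. */
static int count_triggers_warning(int handle_count, int max_handles)
{
    return handle_count > max_handles;
}

/* Prints how each driver version's check treats one handle count. */
static void report(int handle_count)
{
    printf("count %d: 7.01.01 %s, 7.05.00 %s\n",
           handle_count,
           count_triggers_warning(handle_count, OLD_MAX_HANDLES) ? "warns" : "quiet",
           count_triggers_warning(handle_count, NEW_MAX_HANDLES) ? "warns" : "quiet");
}
```

Calling report(15) shows the older threshold warning while the newer one stays
quiet, which matches the "Invalid packet 21 count! 15" messages reported in
this bug.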

Comment 15 Ernie Petrides 2005-10-07 20:57:00 UTC
A fix for this problem was committed to the RHEL3 U6 patch pool
on 15-Jul-2005 (in kernel version 2.4.21-32.12.EL).

The U6 kernel was released as version 2.4.21-37.EL, and here's the
associated message from the Errata System:


An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-663.html