Bug 77861 - Kernel lockup in qlogicfc0 driver
Product: Red Hat Linux
Classification: Retired
Component: kernel
Hardware: i686 Linux
Priority: medium, Severity: high
Assigned To: Dave Jones
Brian Brock
Depends On:
Blocks: 77803
Reported: 2002-11-14 10:10 EST by Hrunting Johnson
Modified: 2015-01-04 17:02 EST (History)
Doc Type: Bug Fix
Last Closed: 2003-12-17 08:26:22 EST

Attachments
output from readprofile -v (53.26 KB, text/plain)
2002-11-22 12:25 EST, Hrunting Johnson

Description Hrunting Johnson 2002-11-14 10:10:28 EST
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 

Description of problem:
Compaq 8500, 8 P3 CPU, 4GB RAM, QLA2200 FC
RH 7.2, all errata, kernel 2.4.18-17.7.x

Under heavy load across the fibre-channel card (backups, a real-time 
monitoring system, and lookupd data), the system locks up.  After a 
reboot, /var/log/messages contains 36 lines like:

kernel: qlogicfc0 : no handle slots, this should not happen
kernel: hostdata->queued is 4d, in_ptr: 38

The '4d' and 'in_ptr' values will vary.
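
A minimal sketch for tallying these messages after a reboot. The function names are made up for illustration, and the default log path is the one named above; the values are printed exactly as they appear in the log, with no assumption about what they mean.

```shell
# Count qlogicfc "no handle slots" messages in a syslog file.
# Defaults to /var/log/messages; pass another file as the first argument.
count_handle_slot_errors() {
    grep -c 'no handle slots, this should not happen' "${1:-/var/log/messages}"
}

# Print the "queued" and "in_ptr" values from each follow-up line,
# one pair per line, exactly as logged.
list_queue_values() {
    sed -n 's/.*hostdata->queued is \([0-9a-fA-F]*\), in_ptr: \([0-9a-fA-F]*\).*/\1 \2/p' \
        "${1:-/var/log/messages}"
}
```

Running both against the same file shows how many times the driver hit the condition and whether the logged values cluster around any particular number.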

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Boot the system
2. Run under heavy load

Actual Results:  system locks up

Expected Results:  system does not lock up

Additional info:


This URL contains information about changes made in 2.5 that supposedly fix 
this problem.  Looks like a change was made in the way drivers need to handle 
locks (per device vs. global).

I consider 2.4.18-17.7.x to be an extremely buggy kernel.  This is the third 
bug related to this kernel I've filed since I upgraded an RH7.2 box to this 
kernel yesterday.  Was any QA done on this kernel at all?
Comment 1 Arjan van de Ven 2002-11-14 10:12:37 EST
please use the qla2200 driver instead; that one is actually supported
Comment 2 Hrunting Johnson 2002-11-14 10:26:26 EST
Will do.  Under 2.4.9-31, I was using the qla2x00 without incident, but that 
disappeared in the new release.  The qla2200 driver under that kernel never 
worked for us, so I didn't bother to try it again.
Comment 3 Hrunting Johnson 2002-11-14 18:01:38 EST
Okay, switching to the supported qla2200 driver appears to fix the machine 
lockup (and another bug, 77803, though I have no idea why or how).  However, 
the kjournald thread for the ext3 partition on the RAID accessed through that 
card is now taking around 11% of the total CPU on the box, whereas before it 
took around 2%.  Why the increase?  Is the qla2200 driver that poor?

Under 2.4.9-31 and the qla2x00 driver, we didn't have that much journal 
activity, but we were also running under a different VM.  Under the qlogicfc0 
driver and the new VM, we had basically the same system usage as the qla2x00 
Comment 4 Stephen Tweedie 2002-11-15 05:15:48 EST
The 77803 bug is likely due to dropped interrupts if a driver change fixes it.

As for the kjournald overhead, that could be a number of things, including
bounce buffer overhead.  We'd need to see a kernel profile to have any hope of
diagnosing it.  (Boot with the kernel parameter "profile=2"; see "man readprofile"
for how to extract the info.)
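
One way to turn the readprofile output into a short ranking is sketched below. The helper name is hypothetical, and the System.map path in the usage comment assumes the map file is named after the running kernel version; adjust it to match the machine.

```shell
# Show the N busiest kernel symbols from readprofile output read on stdin.
# Each readprofile line holds: tick count, symbol name, normalized load.
# Lines are sorted by tick count, highest first (default N is 20).
top_profile_symbols() {
    sort -k1,1nr | head -n "${1:-20}"
}

# Possible use, as root, on a kernel booted with profile=2:
#   readprofile -m /boot/System.map-$(uname -r) | top_profile_symbols 20
```

The "total" line normally sorts to the top because it carries the largest tick count; the lines below it are the symbols where the kernel spent the most sampled time.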
Comment 5 Hrunting Johnson 2002-11-22 09:14:49 EST
At the risk of being taken for an idiot, when I enable profiling (with 
profile=2), no matter what, I always get:

# readprofile -m /boot/System.map-2.4.18-18.7.xbigmem 
     4 _stext                                     0.0500
     4 total                                      0.0000

No matter what.  Do I need to do something else to enable accurate profiling on 
this machine?  The system is under heavy load.  The /proc/profile file is 
constantly being updated (according to its timestamp), but it's always the same 
size, and it always contains that same data (in -v, everything is set to 0).

This is with 2.4.18-18.7.xbigmem.
Comment 6 Arjan van de Ven 2002-11-22 09:20:19 EST
you need to ALSO specify nmi_watchdog=1 in addition to profile=
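
After rebooting, it may help to confirm that both parameters actually reached the kernel before collecting a profile. This is only a sketch: the function name is invented for illustration, and it simply looks for an exact word on the command line that /proc/cmdline reports.

```shell
# Succeed if the given parameter appears as a whole word on the kernel
# command line. Reads /proc/cmdline unless a file is given as argument 2.
has_boot_param() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF "$1"
}

# Possible use:
#   has_boot_param profile=2 && has_boot_param nmi_watchdog=1 \
#       && echo "profiling parameters are set"
```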
Comment 7 Hrunting Johnson 2002-11-22 12:25:38 EST
Created attachment 86072 [details]
output from readprofile -v
Comment 8 Dave Jones 2003-12-16 21:34:44 EST
Fixed in the 2.4.20-20 errata?
Comment 9 Hrunting Johnson 2003-12-17 08:11:11 EST
