Bug 138796

Summary: __get_request_wait() uses add_wait_queue() instead of add_wait_queue_exclusive()
Product: Red Hat Enterprise Linux 3
Component: kernel
Version: 3.0
Hardware: All
OS: Linux
Status: CLOSED WONTFIX
Severity: medium
Priority: medium
Reporter: Yasuma Takeda <yasuma>
Assignee: Larry Woodman <lwoodman>
CC: coughlan, dowdle, petrides, riel
Doc Type: Bug Fix
Last Closed: 2007-10-19 19:14:27 UTC

Description Yasuma Takeda 2004-11-11 07:45:18 UTC
Description of problem:
__get_request_wait() in drivers/block/ll_rw_block.c uses add_wait_queue().
This code comes from the original kernel source (2.4.21),
but other kernel versions use add_wait_queue_exclusive().

I think this may cause a bad situation:
if many processes are waiting in the wait queue,
all of them are woken up at once,
and a process that has already been sleeping for a long time may have to sleep again.


Version-Release number of selected component (if applicable):
kernel-2.4.21-20.EL

Comment 2 Scott Dowdle 2004-12-13 22:26:48 UTC
I've been having a big performance problem with RHEL AS 3 and the
2.4.21-20 kernel.  The situation is frustrating.  I'm running
OpenWebMail using SpeedyCGI.  SpeedyCGI speeds things up and is
supposed to be a good thing.  After the kernel upgrade, I see the
following in /var/log/messages:

Dec 12 04:08:38 mail kernel: application bug: speedy_backend(20437)
has SIGCHLD set to SIG_IGN but calls wait().
Dec 12 04:08:38 mail kernel: (see the NOTES section of 'man 2 wait').
Workaround activated.

The iowait from top is always very high.  The machine gets bogged down
and has to be restarted every few days when it gets into situations
where the load is in the 20s or higher and just won't come down.  Even
when it's doing almost nothing it has a load average of 0.76.  I am NOT
running on underpowered hardware.

I've been told that I need to do some VM tweaking, but my attempts
have helped a bit here and there without solving the problem.

Are my performance issues related to use of perl / openwebmail /
speedycgi and this issue interacting?

Please help.

Comment 3 Warren S 2004-12-13 22:30:14 UTC
wsanders

Comment 4 Doug Ledford 2004-12-15 13:53:55 UTC
This should probably go to Larry and Tom, not to me.  Reassigning to
Larry.

Comment 5 Larry Woodman 2005-01-06 19:35:11 UTC
I don't think add_wait_queue_exclusive is the right thing to do in
__get_request_wait() for RHEL3.  It would only wake up one process
when blkdev_release_request is called, and since RHEL3 does batch
processing of requests, we would leave several processes sleeping even
though an entire batch of requests is free!  I think this could leave
one or more processes hung permanently in __get_request_wait().

Has anyone tried simply replacing the add_wait_queue with
add_wait_queue_exclusive in __get_request_wait() and letting the system
run under load for a long time?

Larry Woodman

Comment 6 Ernie Petrides 2005-01-06 23:09:40 UTC
In response to off-topic comment #2: Scott, your posting here has nothing
to do with this bug report.  The warning message you reported will be
addressed when changed to a debug message in response to bug 140552.

Comment 7 RHEL Program Management 2007-10-19 19:14:27 UTC
This bug is filed against RHEL 3, which is in its maintenance phase.
During the maintenance phase, only security errata and select
mission-critical bug fixes will be released for enterprise products.
Since this bug does not meet those criteria, it is now being closed.
 
For more information on the RHEL errata support policy, please visit:
http://www.redhat.com/security/updates/errata/
 
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.