Bug 466127

Summary: dasd: fix loop in request expiration handling
Product: Red Hat Enterprise Linux 4 Reporter: IBM Bug Proxy <bugproxy>
Component: kernel Assignee: Hans-Joachim Picht <hpicht>
Status: CLOSED ERRATA QA Contact: Martin Jenner <mjenner>
Severity: high Docs Contact:
Priority: high    
Version: 4.7 CC: cward, hpicht, jkachuck, peterm, tao
Target Milestone: rc   
Target Release: ---   
Hardware: s390x   
OS: All   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-05-18 19:17:46 UTC Type: ---
Attachments:
Description Flags
linux-2.6.9-s390-dasd_fix_loop_in_request_expiration.patch none

Description IBM Bug Proxy 2008-10-08 15:47:21 UTC
=Comment: #0=================================================
Hans Joachim Picht <hans.ibm.com> - 

linux-2.6.9-s390-dasd_fix_loop_in_request_expiration.patch

Description: dasd: fix loop in request expiration handling
Symptom:     I/O on a DASD is blocked and message log shows a lot
             of messages that say 'termination failed, retrying in 5s'
             but the message repeats several times a second and not
             just every 5 seconds.
Problem:     The first thing we do in the dasd device tasklet is to
             check for expired requests. When an expired cqr is found,
             we try to terminate that request so that we can give it
             a fresh start. If this termination fails, we want to
             wait for 5 seconds and do the same check/termination
             again.
             We set up a timer, which will schedule the tasklet in 5
             seconds. Unfortunately the termination function itself
             schedules the tasklet as well, so the tasklet will be
             executed again right after it finished and will find the
             expired cqr. If the termination failed due to a hardware
             problem it will probably fail again, and we are stuck
             in a loop until the hardware allows termination again.
Solution:    The schedule in the termination function may be needed
             in other contexts, so if we want to give a request some
             more time, we need to add this time to its 'expires'
             value.
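
The loop and the fix described above can be illustrated with a small toy
model. This is only a sketch under stated assumptions, not the DASD driver
code and not the attached patch: every name in it (toy_cqr, toy_device,
toy_term, toy_tasklet, toy_run) is hypothetical, time is counted in ticks
at an assumed 100 ticks per second, and each tasklet run is assumed to take
one tick. It counts how often termination is attempted while the hardware
refuses it, with and without extending 'expires'.

```c
#include <stdbool.h>

/* Toy model only: all names are hypothetical, none come from the driver. */
#define TICKS_PER_SEC 100                 /* assumed tick rate */
#define RETRY_DELAY   (5 * TICKS_PER_SEC) /* the "retry in 5s" interval */

struct toy_cqr {
    long start;    /* tick at which the request was started */
    long expires;  /* allowed run time, relative to start */
    bool done;     /* set once termination succeeded */
};

struct toy_device {
    struct toy_cqr cqr;
    bool tasklet_pending;  /* tasklet is scheduled to run on the next tick */
    long timer_at;         /* tick at which the timer fires, -1 if unset */
    long hw_ok_from;       /* first tick at which termination succeeds */
    long term_attempts;    /* number of termination attempts so far */
};

/* Models the termination function: it always schedules the tasklet,
 * and it fails for as long as the hardware refuses termination. */
static bool toy_term(struct toy_device *d, long now)
{
    d->term_attempts++;
    d->tasklet_pending = true;
    return now >= d->hw_ok_from;
}

/* Models the tasklet: check for an expired request first. */
static void toy_tasklet(struct toy_device *d, long now, bool fixed)
{
    d->tasklet_pending = false;
    if (d->cqr.done || now < d->cqr.start + d->cqr.expires)
        return;                        /* nothing has expired */
    if (toy_term(d, now)) {
        d->cqr.done = true;            /* request can get a fresh start */
        return;
    }
    /* Termination failed: retry in 5 seconds. */
    if (fixed)
        d->cqr.expires += RETRY_DELAY; /* give the request more time */
    d->timer_at = now + RETRY_DELAY;   /* timer will schedule the tasklet */
}

/* Runs the model for 'ticks' ticks and returns the number of
 * termination attempts that were made. */
long toy_run(bool fixed, long hw_ok_from, long ticks)
{
    struct toy_device d = {
        .cqr = { .start = 0, .expires = 1, .done = false },
        .tasklet_pending = true,
        .timer_at = -1,
        .hw_ok_from = hw_ok_from,
        .term_attempts = 0,
    };

    for (long now = 1; now <= ticks; now++) {
        if (d.timer_at == now) {
            d.timer_at = -1;
            d.tasklet_pending = true;
        }
        if (d.tasklet_pending)
            toy_tasklet(&d, now, fixed);
    }
    return d.term_attempts;
}
```

In this model, with the hardware refusing termination until tick 1200,
toy_run(false, 1200, 2000) attempts termination on every tick (1200
attempts), while toy_run(true, 1200, 2000) attempts it only once every
5 seconds (4 attempts). The tasklet still runs right after each failed
termination in the second case, but it no longer finds an expired request.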

Contact Information = stefan.haberland.com
=Comment: #2=================================================
Hans Joachim Picht <hans.ibm.com> - 
The patch has been tested, fixed the problem and is upstream:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;=7dc1da9ffae5a344f7115d019e2be069d3e1bb8d

Comment 1 IBM Bug Proxy 2008-10-08 15:47:26 UTC
Created attachment 319760 [details]
linux-2.6.9-s390-dasd_fix_loop_in_request_expiration.patch

Comment 2 Hans-Joachim Picht 2008-10-22 10:16:42 UTC
The patch has been posted to rhkernel on Oct 22 by Hans-Joachim Picht
<hpicht>

Comment 3 IBM Bug Proxy 2008-10-22 13:11:04 UTC
Hello Red Hat:
I am raising the severity from normal to "high"
to reflect the changed severity assessment at IBM that occurred
during fix development and test.

This fix should be included in RHEL 4.8.

Comment 4 IBM Bug Proxy 2008-10-23 12:10:58 UTC
Hello Red Hat:
As discussed yesterday, here is the business justification for including
this fix in RHEL 4.8:

If the system runs into this error,
it is caught in a loop that floods the message log with errors.
The system will not respond for an indefinite time.

Comment 5 IBM Bug Proxy 2008-10-29 12:40:43 UTC
Hello Red Hat,
RIT 232885 has been created for this LTC BZ.
Please ensure the RIT is linked to RH BZ 466127.
Thx

Comment 6 RHEL Program Management 2008-10-30 18:28:06 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 7 Vivek Goyal 2008-11-05 13:58:10 UTC
Committed in 78.17.EL . RPMS are available at http://people.redhat.com/vgoyal/rhel4/

Comment 8 IBM Bug Proxy 2008-12-15 11:40:50 UTC
*** Bug 50127 has been marked as a duplicate of this bug. ***

Comment 10 IBM Bug Proxy 2009-03-19 09:53:42 UTC
------- Comment From mgrf.com 2009-03-19 05:42 EDT-------
This is verified OK on RHEL 4.8 beta1 on System z.
Closing on IBM site, Thx

Comment 12 Chris Ward 2009-03-27 14:20:15 UTC
~~ Attention Partners! Snap 1 Released ~~
RHEL 4.8 Snapshot 1 has been released on partners.redhat.com. There should
be a fix present, which addresses this bug. NOTE: there is only a short time
left to test, please test and report back results on this bug
at your earliest convenience.

If you encounter any issues, please set the bug back to the ASSIGNED state and
describe the issues you encountered. If you have found a NEW bug, clone this
bug and describe the issues you encountered. Further questions can be
directed to your Red Hat Partner Manager.

If you have VERIFIED the bug fix, please select your PartnerID from the
Verified field above. Please leave a comment with your test results details.
Include which arches tested, package version and any applicable logs.

 - Red Hat QE Partner Management

Comment 14 errata-xmlrpc 2009-05-18 19:17:46 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1024.html