Bug 624713
Summary: [RHEL4] Problems with aacraid - File system going into read-only.
Product: Red Hat Enterprise Linux 4
Reporter: Bryn M. Reeves <bmr>
Component: kernel
Assignee: Rob Evers <revers>
Status: CLOSED ERRATA
QA Contact: Storage QE <storage-qe>
Severity: high
Priority: high
Version: 4.8
CC: andriusb, coughlan, cward, djeffery, fbijlsma, jwest, revers, ServeRAIDDriver, sschaefer, syeghiay, tao
Target Milestone: rc
Keywords: OtherQA
Target Release: 4.9
Hardware: All
OS: Linux
Doc Type: Bug Fix
Clone Of: 523920
Last Closed: 2011-02-16 15:31:03 UTC
Bug Blocks: 626414
Description
Bryn M. Reeves
2010-08-17 14:42:27 UTC
This is the RHEL4 version of bug 523920. Only the memory leak as discussed in the RHEL5 bug is relevant here:

Issue:3
--------
The driver tends to not free the memory (FIB) when the management request exits prematurely. The accumulation of such un-freed memory causes the driver to fail to allocate anymore memory (FIB) and hence return 0x70000 value to the upper layer, which puts the file system into read only mode.

Fix details:
-------------
The fix makes sure to free the memory(FIB) even if the request exits prematurely hence ensuring the driver wouldn’t run out of memory(FIBs)

This was accepted upstream in 2.6.33:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=cacb6dc3d7fea751879a225c15e48228415e6359

Patch doesn't apply directly to current EL4 aacraid:

$ diffstat /tmp/aacraid-fix-leak.patch
 aachba.c | 52 +++++++++++++++++++++++++++++++++-----------
 aacraid.h | 5 +++-
 commctrl.c | 28 +++++++++++------------
 comminit.c | 6 ++++-
 commsup.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++++----------
 dpcsup.c | 36 +++++++++++++++++++++++++-----
 6 files changed, 154 insertions(+), 45 deletions(-)

And does not apply cleanly to the current RHEL4 aacraid:

$ patch -p1 < /tmp/aacraid-fix-leak.patch
patching file drivers/scsi/aacraid/aachba.c
Hunk #1 succeeded at 266 (offset -27 lines).
Hunk #2 succeeded at 286 (offset -27 lines).
Hunk #3 succeeded at 336 (offset -27 lines).
Hunk #6 succeeded at 1480 (offset -10 lines).
Hunk #7 succeeded at 1639 (offset -16 lines).
Hunk #8 succeeded at 1719 (offset -16 lines).
patching file drivers/scsi/aacraid/aacraid.h
Hunk #1 FAILED at 12.
Hunk #2 FAILED at 1036.
2 out of 2 hunks FAILED -- saving rejects to file drivers/scsi/aacraid/aacraid.h.rej
patching file drivers/scsi/aacraid/commctrl.c
Hunk #1 succeeded at 142 (offset -11 lines).
Hunk #2 succeeded at 309 (offset -13 lines).
Hunk #3 FAILED at 593.
Hunk #4 FAILED at 645.
Hunk #5 FAILED at 695.
Hunk #6 FAILED at 734.
Hunk #7 succeeded at 727 (offset -45 lines).
Hunk #8 succeeded at 765 (offset -45 lines).
Hunk #9 succeeded at 803 (offset -45 lines).
4 out of 9 hunks FAILED -- saving rejects to file drivers/scsi/aacraid/commctrl.c.rej
patching file drivers/scsi/aacraid/comminit.c
Hunk #1 succeeded at 202 (offset 8 lines).
Hunk #2 succeeded at 314 (offset 8 lines).
patching file drivers/scsi/aacraid/commsup.c
Hunk #1 succeeded at 192 (offset 3 lines).
Hunk #2 succeeded at 400 (offset 3 lines).
Hunk #3 succeeded at 483 (offset 3 lines).
Hunk #4 FAILED at 547.
Hunk #5 succeeded at 721 (offset 1 line).
Hunk #6 succeeded at 742 (offset 1 line).
Hunk #7 succeeded at 1393 (offset -1 lines).
Hunk #8 succeeded at 1793 (offset -8 lines).
Hunk #9 succeeded at 1804 (offset -8 lines).
1 out of 9 hunks FAILED -- saving rejects to file drivers/scsi/aacraid/commsup.c.rej
patching file drivers/scsi/aacraid/dpcsup.c

[breeves@breeves rhel4]$ patch -R -p1 < /tmp/aacraid-fix-leak.patch
patching file drivers/scsi/aacraid/aachba.c
Hunk #1 succeeded at 266 (offset -27 lines).
Hunk #2 succeeded at 283 (offset -27 lines).
Hunk #3 succeeded at 328 (offset -27 lines).
Hunk #6 succeeded at 1460 (offset -10 lines).
Hunk #7 succeeded at 1617 (offset -16 lines).
Hunk #8 succeeded at 1696 (offset -16 lines).
patching file drivers/scsi/aacraid/aacraid.h
Hunk #1 FAILED at 12.
Hunk #2 FAILED at 1036.
2 out of 2 hunks FAILED -- saving rejects to file drivers/scsi/aacraid/aacraid.h.rej
patching file drivers/scsi/aacraid/commctrl.c
Hunk #1 succeeded at 142 (offset -11 lines).
Hunk #2 succeeded at 309 (offset -13 lines).
Hunk #3 FAILED at 593.
Hunk #4 FAILED at 645.
Hunk #5 FAILED at 695.
Hunk #6 FAILED at 734.
Hunk #7 succeeded at 727 (offset -45 lines).
Hunk #8 succeeded at 765 (offset -45 lines).
Hunk #9 succeeded at 803 (offset -45 lines).
4 out of 9 hunks FAILED -- saving rejects to file drivers/scsi/aacraid/commctrl.c.rej
patching file drivers/scsi/aacraid/comminit.c
Hunk #1 succeeded at 202 (offset 8 lines).
Hunk #2 succeeded at 312 (offset 8 lines).
patching file drivers/scsi/aacraid/commsup.c
Hunk #1 succeeded at 192 (offset 3 lines).
Hunk #2 succeeded at 393 (offset 3 lines).
Hunk #3 succeeded at 474 (offset 3 lines).
Hunk #4 FAILED at 516.
Hunk #7 succeeded at 1354 (offset -2 lines).
Hunk #8 succeeded at 1751 (offset -9 lines).
Hunk #9 succeeded at 1761 (offset -9 lines).
1 out of 9 hunks FAILED -- saving rejects to file drivers/scsi/aacraid/commsup.c.rej
patching file drivers/scsi/aacraid/dpcsup.c

The patch submitted earlier was for RHEL-5 base kernels
>>> Regarding Patch for RHEL-4 base kernels
According to the RHEL4U8 aacraid driver source, the aacraid driver version is 2455.
We previously submitted patch-2461 and, on top of that, patch-24702 for RHEL-5 base kernels, but we have not submitted either patch for RHEL-4 base kernels.
We plan to submit a fresh patch for RHEL-4 base kernels that includes both patch-2461 and patch-24702.
Could you please let us know whether we should merge patch-2461 and patch-24702 into one patch or submit them as two separate patches?
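For readers unfamiliar with the failure mode, the FIB leak described in the report (Issue 3) can be modeled in a few lines. This is only an illustrative Python sketch, not aacraid code: FibPool, management_request_leaky, management_request_fixed and run are hypothetical names, the pool size is arbitrary, and the only value taken from the report is the 0x70000 result returned once no FIB can be allocated.

```python
class FibPool:
    """Fixed-size pool standing in for the driver's FIBs (hypothetical model)."""

    def __init__(self, size):
        self.free = size

    def alloc(self):
        # Returns False when the pool is exhausted.
        if self.free == 0:
            return False
        self.free -= 1
        return True

    def release(self):
        self.free += 1


RESULT_OK = 0
RESULT_ERROR = 0x70000  # value the report says reaches the upper layer


def management_request_leaky(pool, exits_early):
    """Models the bug: an early exit skips the release."""
    if not pool.alloc():
        return RESULT_ERROR
    if exits_early:
        return RESULT_OK  # FIB is never returned to the pool
    pool.release()
    return RESULT_OK


def management_request_fixed(pool, exits_early):
    """Models the fix: the FIB is released on every exit path."""
    if not pool.alloc():
        return RESULT_ERROR
    try:
        if exits_early:
            return RESULT_OK
        return RESULT_OK
    finally:
        pool.release()


def run(request, pool_size, attempts):
    """Issue `attempts` early-exiting requests against a fresh pool."""
    pool = FibPool(pool_size)
    return [request(pool, exits_early=True) for _ in range(attempts)]
```

With a pool of 4 FIBs and 6 requests that all exit early, the leaky variant starts returning 0x70000 on the fifth request, while the fixed variant keeps succeeding.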
(In reply to comment #3)
> The patch submitted earlier was for RHEL-5 base kernels
>
> >>> Regarding Patch for RHEL-4 base kernels
>
> As per the RHEL4U8 aacraid driver source, the version of the aacraid driver
> is- 2455.
>
> Earlier we have submitted patch-2461 and on top of that we have submitted
> patch-24702 to RHEL-5 base kernels, but we haven’t submitted patch-2461 and
> patch-24702 to RHEL-4 base kernels.
>
> We have planned to submit a fresh patch for RHEL-4 base kernels which includes
> both patch-2461 and patch-24702.
>
> Could you please let us know whether we need to merge patch-2461 and
> patch-24702 or should be submitted as two different patches?

Ideally we want one patch that only addresses the read-only filesystem issue. Is this possible?
>Ideally we want one patch that only addresses the read-only filesystem issue.
>Is this possible?
Based on your suggestion, we will submit a new patch for RHEL 4 U8 that addresses the read-only file system issue alone.
We are not sure which driver version to use for this patch, since it does not contain the 2461 changes. We have submitted the version 24702 patch for RHEL-5 base kernels.
Please advise which version number we should use for upcoming RHEL-4 base kernels.
(In reply to comment #6)
> >Ideally we want one patch that only addresses the read-only filesystem issue.
> >Is this possible?
>
> Based on your suggestion we will be submitting a new patch for RHEL 4 U8
> which addresses read-only file system issue alone.
> We are not sure on the driver version for this patch which we are going to
> submit since it doesn’t contain 2461 changes. We have submitted the version
> 24702 patch for RHEL-5 base kernels.
>
> Please guide us for which version number we need to maintain for upcoming
> RHEL-4 base kernels.

This is really up to you. Can you append something like -rh4-1 to the end of the version to indicate that it branched?

The new patch for RHEL 4.8 will address both the file system read-only and False RAID Alert issues, which are customer-critical. The patch submitted to the RHEL 5 base kernel contains the above-mentioned fix.
For RHEL 4.8 we are planning to change the version number from 2455 to 24551 to indicate that it is branched. We will release the patch for RHEL 4.8 once HCL QA has qualified it.

(In reply to comment #8)
> For RHEL 4.8 we are planning to change the version number from 2455 to 24551 to
> indicate that it is branched. We will release the patch for RHEL 4.8 once, HCL
> QA has qualified it.

Please attach details of what HCL did to qualify this patch when the quality effort is complete.
Thanks, Rob

Hi Rob,
We have answered the above query in the link below: https://bugzilla.redhat.com/show_bug.cgi?id=523920 comment no:31

Created attachment 443412
aacraid 24551 patch for RHEL4U8
I am attaching the aacraid_24551 patch.
It was generated against RHEL-4U8 and addresses the file system read-only and False RAID alert issues.
See potential hang/data corruption issue with equivalent patch in rhel5.6: https://bugzilla.redhat.com/show_bug.cgi?id=523920#c34

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

Committed in 89.43.EL. RPMS are available at http://people.redhat.com/vgoyal/rhel4/

Test Results?

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHSA-2011-0263.html