Bug 497411
Summary: | kernel BUG at drivers/scsi/libiscsi.c:301! | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | rob_thomas |
Component: | kernel | Assignee: | Mike Christie <mchristi> |
Status: | CLOSED ERRATA | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
Severity: | medium | Docs Contact: | |
Priority: | urgent | | |
Version: | 5.3 | CC: | ds, emcnabb, jfeeney, joseph_salisbury, jtrungale, martinez, matt_domsch, mchristi, tao, wwlinuxengineering |
Target Milestone: | rc | Keywords: | ZStream |
Target Release: | --- | | |
Hardware: | i686 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2009-09-02 08:09:55 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 502916 | | |
Attachments: | | | |
Description
rob_thomas
2009-04-23 19:21:13 UTC
Created attachment 340994 [details]
multipath config file.
Adding multipath config for reference.
Created attachment 340996 [details]
console log of panic
console log of panic.
Created attachment 341466 [details]
do not bug when nop is sent while cleaning up session

The session recovery code would set the stop bits, then suspend the recv path, so if a nop was being sent in response to a nop from the recv path, we might hit the BUG_ON in the conn send pdu path.

Please try the kernel here
http://people.redhat.com/dzickus/el5/141.el5/
with this patch (this is the proposed update to the iscsi layer for 5.4):
http://people.redhat.com/mchristi/iscsi/rhel5.4/v3/0001-RHEL-5.4-update-iscsi-layer-and-drivers.patch
And then apply the patch that I attached to this bugzilla:
handle-nops-that-race-with-recovery.patch

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

Thanks for the quick turn. I probably won't be able to get to it until the end of the week. The arrays are tied up with something else right now.
Rob

(In reply to comment #3)
I have the same problem, but on a different architecture. I am using x86_64 and the system panics with exactly the same message. The hardware I'm using is EqualLogic arrays and Dell R610 servers. I tried to build a kernel RPM with the patches supplied here, but it failed while testing the ABI whitelists. So I installed the compiled kernel by hand and rebooted into it. Since then the login process to the arrays does not panic anymore.

Thanks for testing. Last night I made some rpms here:
http://people.redhat.com/mchristi/iscsi/rhel5.4/test/kernels/
It has this fix and some other fixes for issues we found upstream that I plan on submitting for RHEL 5.4.

(In reply to comment #7)
> Last night I made some rpms here:
> http://people.redhat.com/mchristi/iscsi/rhel5.4/test/kernels/
> It has this fix and some other fixes for issues we found upstream that I plan
> on submitting for RHEL 5.4.
Well, the kernel boots, and I let it run on the machines over the last weekend; everything is still working fine so far. But there was a panic during the first reboot of the machines that I could not catch (looping panic messages of the OOM killer), and I do not know where it originated. After a second reboot all was fine and I was not able to reproduce the panics.

Looks good here, Mike. I'm able to multipath 64 paths using 8 nics and iface.

Is 2.6.18-157.el5PAE the recommended fix for this? I'm currently experiencing this same problem on 2.6.18-128.1.10.el5PAE with PE1950s and an EL-PS70E.

Joey, RHEL5.4 will have this fix (2.6.18-157.el is a test kernel for 5.4). As for released errata, this was included in kernel-2.6.18-128.1.14:
http://rhn.redhat.com/errata/RHSA-2009-1106.html
Can you try updating to this and let us know if the problem persists?

I can confirm that 2.6.18-128.1.14.el5PAE fixes this problem (else it wouldn't have been in the errata, right? ;) ). 2.6.18-128.1.16.el5PAE also tests good.

I've tested an EqualLogic array against several U4 kernels. I have also tested the U4 iSCSI patches and iscsi-utils from U4 on a RHEL5U3 kernel. The issue that I was experiencing has been resolved.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHSA-2009-1243.html
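
For readers following the root-cause note on attachment 341466, the ordering problem can be pictured with a small user-space model. This is only a sketch of the race as described in that comment, not the attached patch and not libiscsi code: every name below (`model_conn`, `model_send_pdu`, the `tolerate_stopped` switch, and so on) is hypothetical, and treating the fix as "refuse the send instead of hitting BUG" is an assumption drawn from the patch title alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names come from libiscsi. */
enum send_result {
    SEND_OK,        /* PDU would be handed to the transport */
    SEND_REJECTED,  /* connection is being cleaned up; the nop is dropped */
    SEND_WOULD_BUG  /* stands in for the kernel BUG the reporter hit */
};

struct model_conn {
    bool tx_stopped;    /* the "stop bits" set by session recovery */
    bool rx_suspended;  /* recv path suspended by session recovery */
};

/* Recovery step 1: set the stop bits. The recv path is still running. */
void model_recovery_set_stop_bits(struct model_conn *conn)
{
    conn->tx_stopped = true;
}

/* Recovery step 2: suspend the recv path. */
void model_recovery_suspend_rx(struct model_conn *conn)
{
    conn->rx_suspended = true;
}

/*
 * Send path. tolerate_stopped == false models the old behaviour, where a
 * send on a stopped connection is treated as impossible. tolerate_stopped
 * == true models the assumed fix: the send is refused instead.
 */
enum send_result model_send_pdu(const struct model_conn *conn,
                                bool tolerate_stopped)
{
    if (conn->tx_stopped)
        return tolerate_stopped ? SEND_REJECTED : SEND_WOULD_BUG;
    return SEND_OK;
}

/* Recv path: answer a nop from the target with a nop of our own. */
enum send_result model_rx_nop_in(const struct model_conn *conn,
                                 bool tolerate_stopped)
{
    if (conn->rx_suspended)
        return SEND_REJECTED;  /* nothing is received, so nothing is sent */
    return model_send_pdu(conn, tolerate_stopped);
}
```

The window is between the two recovery steps: once `model_recovery_set_stop_bits()` has run but before `model_recovery_suspend_rx()`, a nop arriving on the recv path still reaches the send path. In this model the old behaviour yields `SEND_WOULD_BUG` in that window, while the tolerant variant yields `SEND_REJECTED`.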