Bug 643236
Summary: | iscsi: get nopout and conn errors. | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Mike Christie <mchristi> |
Component: | kernel | Assignee: | Mike Christie <mchristi> |
Status: | CLOSED ERRATA | QA Contact: | Gris Ge <fge> |
Severity: | medium | Docs Contact: | |
Priority: | medium | | |
Version: | 6.0 | CC: | coughlan, fge, qcai |
Target Milestone: | rc | | |
Target Release: | --- | | |
Hardware: | All | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | kernel-2.6.32-112.el6 | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2011-05-19 12:31:55 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Mike Christie
2010-10-15 03:18:30 UTC
Thank you for your bug report. This issue was evaluated for inclusion in the current release of Red Hat Enterprise Linux. Unfortunately, we are unable to address this request in the current release. Because we are in the final stage of Red Hat Enterprise Linux 6 development, only significant, release-blocking issues involving serious regressions and data corruption can be considered. If you believe this issue meets the release blocking criteria as defined and communicated to you by your Red Hat Support representative, please ask your representative to file this issue as a blocker for the current release. Otherwise, ask that it be evaluated for inclusion in the next minor release of Red Hat Enterprise Linux.

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

Patch(es) available on kernel-2.6.32-112.el6

Mike, I failed to reproduce this issue by reducing node.session.cmds_max to 16 and running heavy I/O against that iSCSI disk for about 2 hours. The network latency between target and initiator is about 200 ms. Ten 'perl -e while(1){}' processes were running to consume CPU. I never saw any 'Could not send nopout' error. Can you advise me on how to reproduce this problem?

It is a little difficult. What target are you using? It is easiest to hit with an EqualLogic target. You need to use bnx2i or cxgb3i (iscsi_tcp does not show the problem), make sure your IO test is set to send more than cmds_max IOs, and your target also has to support that many IOs. So I think most targets support at least 32 cmds.
So set

    node.session.cmds_max = 16
    node.session.queue_depth = 32

(either set these in iscsid.conf and then rerun the iscsiadm discovery command so the new iscsid.conf values are used, or run

    iscsiadm -m node -o update -n $NAME_OF_SETTING_ABOVE -v $VALUE_ABOVE

on an existing target portal record). Then we want to send more than cmds_max IOs. With this command we would send about 64:

    disktest -PT -T130 -h1 -K64 -B256k -ID /dev/sdXYZ

If you cannot find an EQL target, then use the settings above, run the command above, and check

    cat /sys/class/scsi_host/hostX/host_busy

That value should always be less than cmds_max if the problem is fixed. If the value is larger than cmds_max, then you hit the problem.

Mike, we have an Emulex be2iscsi at hand (not configured). Does that hit this problem?

Yeah, you would hit the problem with that driver. I think you can just do a sanity check. I tested it here and I believe Chelsio tested it too (they are the ones that pinged me about merging the patch and tested it upstream). Getting the timing right is really hard. I do not think it is worth your time to try and replicate the problem. As long as there are no regressions it should be ok.

Code reviewed. Patch applied. Basic iSCSI function was tested with the errata and it passed the test. No Hardware and Sanity Only.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html
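
Note on the host_busy check from the reproduction steps above: reading the sysfs file once by hand can easily miss a short spike, so one option is to sample it repeatedly while disktest is running. The following is only a minimal sketch under stated assumptions. The function name check_host_busy, its arguments, and its return codes are hypothetical and are not part of the patch or of any iSCSI tool. It flags a sample only when it is strictly larger than cmds_max, following the "larger than cmds_max" wording in the thread. Polling can still miss very brief excursions.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not from the bug report).
# Usage: check_host_busy FILE CMDS_MAX [SAMPLES] [INTERVAL_SECONDS]
# Reads the counter in FILE repeatedly. Returns 1 as soon as one sample is
# larger than CMDS_MAX, 2 if FILE cannot be read or is not a number,
# and 0 if every sample stayed within CMDS_MAX.
check_host_busy() {
    file=$1
    cmds_max=$2
    samples=${3:-60}
    interval=${4:-1}
    i=0
    while [ "$i" -lt "$samples" ]; do
        busy=$(cat "$file") || return 2
        # Reject empty or non-numeric content before comparing.
        case $busy in
            ''|*[!0-9]*) return 2 ;;
        esac
        if [ "$busy" -gt "$cmds_max" ]; then
            echo "host_busy=$busy exceeds cmds_max=$cmds_max"
            return 1
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "host_busy stayed within cmds_max=$cmds_max over $samples samples"
    return 0
}
```

With the settings from the thread, this could be run in a second terminal during the disktest run as check_host_busy /sys/class/scsi_host/hostX/host_busy 16, with hostX replaced by the SCSI host of the iSCSI session.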