| Summary: | be2iscsi: panic in multipath configuration on Clariion CX4-480 | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Martin Wilck <martin.wilck> |
| Component: | kernel | Assignee: | Rob Evers <revers> |
| Status: | CLOSED NOTABUG | QA Contact: | Storage QE <storage-qe> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 6.1 | CC: | czhang, fge, gasmith, lmcilroy, ltroan, mchristi, michael.hagmann |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2012-08-07 09:22:27 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Depends On: | |||
| Bug Blocks: | 696653, 720397, 782183, 840683 | ||
Description
Martin Wilck
2011-07-01 14:39:49 UTC
I think this is the following code:
    static void
    be_complete_io(struct beiscsi_conn *beiscsi_conn,
                   struct iscsi_task *task, struct sol_cqe *psol)
    {
    ...
            status = ((psol->dw[offsetof(struct amap_sol_cqe, i_sts) / 32]
                            & SOL_STS_MASK) >> 8);
            flags = ((psol->dw[offsetof(struct amap_sol_cqe, i_flags) / 32]
                            & SOL_FLAGS_MASK) >> 24) | 0x80;

            task->sc->result = (DID_OK << 16) | status;    <==== Crash
It looks as if be_complete_io() is called with a task whose sc is NULL. Apparently task->sc can be NULL; at least there are some places in the code where this condition is checked.
Maybe the dh_emc driver creates a special command that would cause this situation to occur?
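To illustrate the suspected failure mode, here is a minimal user-space sketch of a completion handler that checks for a missing SCSI command before writing the result. Everything prefixed with `mock_` or `MOCK_` is a hypothetical stand-in, not the kernel's definition, and the mask value is only an assumption chosen to match the `>> 8` shift in the quoted code. This is not the change that went into the 6.2 be2iscsi update, which this report does not describe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures; NOT the real layouts. */
struct mock_scsi_cmnd {
    int result;
};

struct mock_iscsi_task {
    struct mock_scsi_cmnd *sc;   /* may be NULL if no SCSI command is attached */
};

/* Assumed values for illustration only. */
#define MOCK_DID_OK        0x00
#define MOCK_SOL_STS_MASK  0x0000FF00u   /* status assumed in bits 8..15 */

/*
 * Returns 0 when the result was stored, -1 when the task carries no
 * SCSI command and the completion is skipped instead of dereferencing NULL.
 */
static int mock_complete_io(struct mock_iscsi_task *task, uint32_t cqe_dword)
{
    assert(task != NULL);

    uint32_t status = (cqe_dword & MOCK_SOL_STS_MASK) >> 8;

    if (task->sc == NULL)
        return -1;               /* the unguarded code would crash here */

    task->sc->result = (MOCK_DID_OK << 16) | (int)status;
    return 0;
}
```

A guard like this only avoids the NULL dereference; it does not explain why a completion arrives for a task without a command, which is the question raised above.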
Setting FJ6.2bugs tracker (was FJ6.1Bugs tracker) and requested an exception until we understand the importance to Fujitsu to fix in 6.2 or 6.2.z. Bug is public so Emulex and EMC can view it. May be DUP of Bug 738934 per comment #3 above.

This is of course important (it's a PANIC), and we haven't been able to release be2iscsi for RHEL6.1 for this reason (and bug 726353). I can't see comment #3, nor bug 738924.

I've already requested FJ access to Bug 738934 before possibly closing it as DUP - awaiting response.

The oops in this bz described in comment #2 should be fixed in 6.2. It got fixed in a general be2iscsi update in 6.2. https://bugzilla.redhat.com/show_bug.cgi?id=738934 is for a different issue. We saw the oops too in that bz, but we are trying to figure out why a scsi command timed out. In our test, when we enabled CHAP it would lead to scsi commands timing out and the scsi eh running. We would then hit the same oops you guys hit in here. If we disabled CHAP though, it all worked ok. No scsi command timeouts. So we only have 738934 open to investigate why CHAP is causing problems for some of our setups.

@martin Have you checked this issue with 6.3? Mike's comment#9 suggests that this particular oops was resolved in 6.2 and the other similar problem from bug 738934 was resolved in kernel-2.6.32-279.el6 / RHEL6.3 via http://rhn.redhat.com/errata/RHSA-2012-0862.html

"Enabling CHAP (Challenge-Handshake Authentication Protocol) on an iSCSI target for the be2iscsi driver results in kernel panic. To work around this issue, disable CHAP on the iSCSI target."

Also, bug 726353 was closed awaiting a retest from your QA guys, so a retest on 6.3 would help clarify the current status.

We haven't seen this any more. The bug can be closed.

Thanks for the clarification, closing as requested.