
Bug 521163

Summary: fence_scsi, device mapper multipath and persistent reservations - path failures reservation conflict SCSI error: return code = 0x00000018
Product: Red Hat Enterprise Linux 5
Reporter: Michael Kearey <mkearey>
Component: cman
Assignee: Ryan O'Hara <rohara>
Status: CLOSED DUPLICATE
QA Contact: Cluster QE <mspqa-list>
Severity: high
Priority: high
Version: 5.3
CC: agk, bmarzins, bmr, christophe.varoqui, cluster-maint, dwysocha, edamato, egoggin, heinzm, junichi.nomura, kueda, lmb, mbroz, prockai, rfujita, rohara, tao, tranlan
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2009-09-22 15:57:44 UTC

Description Michael Kearey 2009-09-04 00:13:07 UTC
Description of problem:

It appears that device mapper multipath, when used with fence_scsi (SCSI persistent reservations) and the tur path checker, has problems with paths failing and reservation conflicts.

From what I could find, we should be supporting such a configuration. Specific failure messages:


kernel: sd 1:0:7:0: reservation conflict
kernel: sd 1:0:7:0: SCSI error: return code = 0x00000018
kernel: end_request: I/O error, dev sds, sector 118904128
kernel: device-mapper: multipath: Failing path 65:32.
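The two numeric values in the messages above can be decoded by hand. The following is a minimal sketch, not kernel code: it assumes the usual Linux layout of the 32-bit SCSI result word (driver, host, message and status bytes, high to low) and the standard sd major numbers (8, 65-71, 128-135, with 16 minors per disk). The function names are illustrative only.

```python
# Sketch only: decodes values seen in the log excerpt above.
# Assumption: result word = driver_byte<<24 | host_byte<<16 | msg_byte<<8 | status_byte.

# A few SAM status codes as they appear in the low byte of the result word.
SCSI_STATUS = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x08: "BUSY",
    0x18: "RESERVATION CONFLICT",
    0x28: "TASK SET FULL",
}

# Block major numbers assigned to the sd driver, 16 disks per major.
SD_MAJORS = [8] + list(range(65, 72)) + list(range(128, 136))


def decode_scsi_result(result):
    """Split a 32-bit SCSI result word into its four byte fields."""
    return {
        "driver_byte": (result >> 24) & 0xFF,
        "host_byte": (result >> 16) & 0xFF,
        "msg_byte": (result >> 8) & 0xFF,
        "status_byte": result & 0xFF,
    }


def status_name(result):
    """Name the SCSI status carried in the low byte of the result word."""
    status = result & 0xFF
    return SCSI_STATUS.get(status, "unknown (0x%02x)" % status)


def sd_name(major, minor):
    """Map an sd major:minor pair to its conventional device name."""
    disk = SD_MAJORS.index(major) * 16 + minor // 16
    partition = minor % 16
    letters = ""
    n = disk
    # Disk letters count like spreadsheet columns: a..z, aa..az, ...
    while True:
        letters = chr(ord("a") + n % 26) + letters
        n = n // 26 - 1
        if n < 0:
            break
    return "sd" + letters + (str(partition) if partition else "")


if __name__ == "__main__":
    print(decode_scsi_result(0x00000018))
    print(status_name(0x00000018))
    print(sd_name(65, 32))
```

Under these assumptions, 0x00000018 carries only a status byte of 0x18 (RESERVATION CONFLICT) with no host or driver error, and the failed path 65:32 is the same disk as the sds named in the end_request line.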


From the fence agent itself:

fenced[6186]: agent "fence_scsi" reports: Execuing [sg_persist -n -d /dev/dm-9 -o -A -K 63b40001 -S 63b40004 -T 5] Unable to execute sg_persist (/dev/dm-9). 

fenced[6186]: fence "node-b" failed
fenced[6186]: fencing node "node-b"
kernel: sd 3:0:1:0: reservation conflict

etc
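For readers unfamiliar with the short options, the sg_persist invocation in the fenced message above can be expanded into its long-option form. This is a rough sketch: the option table reflects the sg_persist man page from sg3_utils as I understand it, and the helper name `explain` is made up for illustration. It does not run sg_persist or touch any device.

```python
import shlex

# Short-to-long option names, per the sg_persist (sg3_utils) man page.
SG_PERSIST_LONG = {
    "-n": "--no-inquiry",
    "-d": "--device",
    "-o": "--out",
    "-A": "--preempt-abort",
    "-K": "--param-rk",
    "-S": "--param-sark",
    "-T": "--prout-type",
}

# Options in the table above that consume the following argument.
TAKES_VALUE = {"-d", "-K", "-S", "-T"}

# Persistent reservation type codes defined by SPC-3.
PROUT_TYPES = {
    1: "Write Exclusive",
    3: "Exclusive Access",
    5: "Write Exclusive, registrants only",
    6: "Exclusive Access, registrants only",
    7: "Write Exclusive, all registrants",
    8: "Exclusive Access, all registrants",
}


def explain(command):
    """Return (long_option, value) pairs for an sg_persist command line."""
    argv = shlex.split(command)
    pairs = []
    i = 1  # skip the program name
    while i < len(argv):
        opt = argv[i]
        long_name = SG_PERSIST_LONG.get(opt, opt)
        if opt in TAKES_VALUE and i + 1 < len(argv):
            pairs.append((long_name, argv[i + 1]))
            i += 2
        else:
            pairs.append((long_name, None))
            i += 1
    return pairs


if __name__ == "__main__":
    # Command copied from the fenced log message above.
    cmd = "sg_persist -n -d /dev/dm-9 -o -A -K 63b40001 -S 63b40004 -T 5"
    for name, value in explain(cmd):
        print(name if value is None else "%s=%s" % (name, value))
    print("type 5:", PROUT_TYPES[5])
```

Read this way, the agent issues a PERSISTENT RESERVE OUT with the PREEMPT AND ABORT service action against the multipath device /dev/dm-9. In that service action, the reservation key (63b40001) is the key the issuing node is registered with, and the service action reservation key (63b40004) identifies the registration to be removed, presumably that of node-b.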


Version-Release number of selected component (if applicable):

device-mapper-multipath-0.4.7-23.el5_3.4-x86_64

How reproducible:

100%

Steps to Reproduce:
1. Install RHEL 5.3
2. Add an HBA and connect it to a Fibre Channel switch and SAN storage, with multiple paths
3. Use the tur (device selected) path checker
4. Configure the cluster
5. Enable fence_scsi
  
Actual results:

Reservation conflicts and path failures.

Expected results:

Paths should remain up. SCSI reservations should be passed transparently to the real devices via multipath and handled properly by the tur path checker.

Additional info:

We took a divide-and-conquer approach and tested the following scenarios:
1. Remove device-mapper-multipath from the equation.  Only present 1 device from the SAN instead of 16.  Test to see if you can reproduce the problem.

Result: no problems; SCSI reservations appear to work.

2. Keep zoning with only 1 disk path, and re-add device-mapper-multipath.  Try to reproduce with multipath and only 1 path. Test to see if you can reproduce the problem.

Result: no problems; SCSI reservations work.


3. Re-add some more paths (2-4) and attempt to reproduce the problem.

Result: inconsistent. Sometimes only ONE host of the cluster sees reservation conflicts, with the errors logged at a regular interval. On other occasions, all four hosts see the errors.

In short, we see problems when there are multiple paths to the active/passive storage processors on the SAN. More details will be provided in an update.