Bug 521163 - fence_scsi, device mapper multipath and persistent reservations - path failures reservation conflict SCSI error: return code = 0x00000018
Summary: fence_scsi, device mapper multipath and persistent reservations - path failur...
Keywords:
Status: CLOSED DUPLICATE of bug 516625
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: cman
Version: 5.3
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Ryan O'Hara
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2009-09-04 00:13 UTC by Michael Kearey
Modified: 2016-04-26 14:57 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-09-22 15:57:44 UTC
Target Upstream Version:
Embargoed:



Description Michael Kearey 2009-09-04 00:13:07 UTC
Description of problem:

It appears that device-mapper multipath, when used with fence_scsi (SCSI persistent reservations) and the tur path checker, suffers from path failures and reservation conflicts.

From what I could find, we should be supporting such a configuration. Specific failure messages:


kernel: sd 1:0:7:0: reservation conflict
kernel: sd 1:0:7:0: SCSI error: return code = 0x00000018
kernel: end_request: I/O error, dev sds, sector 118904128
kernel: device-mapper: multipath: Failing path 65:32.
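
For readers decoding the messages above, the following is a minimal Python sketch, not part of the original report. It assumes the classic Linux layout of the SCSI result word (status byte in the low 8 bits, then message, host, and driver bytes) and the classic sd numbering of 16 minors per disk on majors 8, 65-71, and 128-135. The function names `decode_scsi_result` and `sd_name` are hypothetical helpers introduced only for illustration.

```python
# Assumption: the 32-bit result word is packed one byte per layer,
# lowest byte first: SCSI status, message, host, driver.
SCSI_STATUS = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x08: "BUSY",
    0x18: "RESERVATION CONFLICT",
    0x28: "TASK SET FULL",
}


def decode_scsi_result(result):
    """Split a kernel 'return code' into its four component bytes."""
    return {
        "status": result & 0xFF,
        "msg": (result >> 8) & 0xFF,
        "host": (result >> 16) & 0xFF,
        "driver": (result >> 24) & 0xFF,
    }


def sd_name(major, minor):
    """Map an sd major:minor pair to the whole-disk name (sda, sdb, ...).

    Assumption: classic sd numbering with 16 minors per disk.
    """
    sd_majors = [8] + list(range(65, 72)) + list(range(128, 136))
    index = sd_majors.index(major) * 16 + minor // 16
    # Convert the zero-based disk index to letters: 0 -> a, 25 -> z, 26 -> aa.
    name = ""
    index += 1
    while index:
        index, rem = divmod(index - 1, 26)
        name = chr(ord("a") + rem) + name
    return "sd" + name


if __name__ == "__main__":
    fields = decode_scsi_result(0x00000018)
    print(fields, SCSI_STATUS.get(fields["status"], "unknown"))
    print(sd_name(65, 32))
```

Under these assumptions, the return code carries only a SCSI status byte of 0x18 (RESERVATION CONFLICT) with the host and driver bytes clear, and the failed path 65:32 is the same sds device named in the end_request line.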


From the fence agent itself:

fenced[6186]: agent "fence_scsi" reports: Execuing [sg_persist -n -d /dev/dm-9 -o -A -K 63b40001 -S 63b40004 -T 5] Unable to execute sg_persist (/dev/dm-9). 

fenced[6186]: fence "node-b" failed
fenced[6186]: fencing node "node-b"
kernel: sd 3:0:1:0: reservation conflict

etc
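
To make the logged sg_persist invocation easier to read, here is a small Python sketch, not part of the original report, that expands its short options into the long option names documented for sg_persist (sg3_utils). The `explain` helper and both lookup tables are hypothetical and cover only the options that appear in the log line above. Reading the two keys as "the fencing node's own key" and "the key being preempted" follows the general meaning of `--param-rk` and `--param-sark`; how this cluster assigns keys to nodes is an assumption.

```python
import shlex

# Short option -> (long option, takes a value). Only the options seen in
# the logged command are listed; this is not a full sg_persist parser.
SHORT_TO_LONG = {
    "-n": ("--no-inquiry", False),
    "-d": ("--device", True),
    "-o": ("--out", False),             # PERSISTENT RESERVE OUT
    "-A": ("--preempt-abort", False),   # PREEMPT AND ABORT service action
    "-K": ("--param-rk", True),         # reservation key of the caller
    "-S": ("--param-sark", True),       # service action key (key to preempt)
    "-T": ("--prout-type", True),
}

# Persistent reservation type codes as defined by SPC-3.
PROUT_TYPES = {
    1: "Write Exclusive",
    3: "Exclusive Access",
    5: "Write Exclusive, registrants only",
    6: "Exclusive Access, registrants only",
    7: "Write Exclusive, all registrants",
    8: "Exclusive Access, all registrants",
}


def explain(cmdline):
    """Return a dict of long option names to values for an sg_persist call."""
    tokens = shlex.split(cmdline)
    parsed = {}
    i = 1  # skip the program name
    while i < len(tokens):
        long_name, takes_value = SHORT_TO_LONG[tokens[i]]
        if takes_value:
            parsed[long_name] = tokens[i + 1]
            i += 2
        else:
            parsed[long_name] = True
            i += 1
    return parsed


if __name__ == "__main__":
    cmd = "sg_persist -n -d /dev/dm-9 -o -A -K 63b40001 -S 63b40004 -T 5"
    for option, value in explain(cmd).items():
        print(option, value)
```

Read this way, the agent issues a PREEMPT AND ABORT against the multipath node /dev/dm-9 with reservation type 5 (Write Exclusive, registrants only), using key 63b40001 to remove key 63b40004.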


Version-Release number of selected component (if applicable):

device-mapper-multipath-0.4.7-23.el5_3.4-x86_64

How reproducible:

100%

Steps to Reproduce:
1. Install RHEL 5.3
2. Add an HBA and connect it to a Fibre Channel switch and SAN storage, with multiple paths
3. Use the tur (device-selected) path checker
4. Configure the cluster
5. Enable fence_scsi
  
Actual results:

Reservation conflicts and path failures.

Expected results:

Paths should remain up. SCSI reservations should be passed transparently through multipath to the underlying devices and handled properly by the tur path checker.

Additional info:

We took a divide-and-conquer approach and tested the following scenarios:
1. Remove device-mapper-multipath from the equation. Present only 1 device from the SAN instead of 16. Test to see if you can reproduce the problem.

Result: no problems; SCSI reservations appear to work.

2. Keep the zoning at only 1 disk path and re-add device-mapper-multipath. Test to see if you can reproduce the problem with multipath and only 1 path.

Result: no problems; SCSI reservations work.


3. Re-add some more paths (2-4) and attempt to reproduce the problem.

Result: inconsistent. Sometimes only ONE host of the cluster sees reservation conflicts, with the errors logged at regular intervals. On other occasions, all four hosts see the errors.

With multiple paths to the active/passive storage processors on the SAN, we see problems. More details will be provided in an update.

