Bug 529712
| Summary: | RHEL5.3: fence_scsi, multipath and persistent reservations | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Benjamin Kahn <bkahn> | ||||||
| Component: | cman | Assignee: | Ryan O'Hara <rohara> | ||||||
| Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list> | ||||||
| Severity: | high | Docs Contact: | |||||||
| Priority: | urgent | ||||||||
| Version: | 5.3 | CC: | agk, alfredo.moralejo, bmarzins, bmr, bzeranski, ccaulfie, cluster-maint, djansa, dwysocha, edamato, heinzm, jkortus, lmcilroy, mbroz, mkearey, pm-eus, prockai, rfujita, rohara, tao, tdunnon | ||||||
| Target Milestone: | rc | Keywords: | Reopened, ZStream | ||||||
| Target Release: | --- | ||||||||
| Hardware: | All | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | cman-2.0.115-1.el5_4.7 | Doc Type: | Bug Fix | ||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2009-11-23 08:31:00 UTC | Type: | --- | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Bug Depends On: | 516625 | ||||||||
| Bug Blocks: | |||||||||
| Attachments: |
Description
Benjamin Kahn
2009-10-19 15:49:39 UTC
Fixed in RHEL54 branch.
commit e790e4b988f7f12a327a020244c40b5bf1969f6d

I'm currently having trouble giving it a GO because of the following:
1. Init script layout
There should be only one section looking like this:
[...]
case $1 in
start)
echo -n "Starting scsi_reserve:"
[...]
Having multiple of these seems quite confusing to me.
2. [FAILED] output is not written correctly
This seems to be caused by a missing echo call wherever
the failure call is used.
3. Wrong regex for multipath devices
if [[ $dm_uuid =~ "^mpath-*" ]]; then
This matches "mpath", "mpath-", "mpath--", ... Is this really what was intended?
In my setup not all devices match that, and the script fails saying that " is not a multipath device". The full list of devices is attached at the bottom.
4. Undefined variable $dev
"[error] $dev is not a multipath device"
This variable is not defined (should it be $pv_dev?).
With minor changes the script was working and passed our SFENCE test.
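Item 3 can be checked directly in bash. The following is a minimal sketch, not code from the scsi_reserve script: the helper name `matches` and the string "mpathX" are made up for illustration, while the two UUIDs come from the `dmsetup info` output in this report. Current bash versions treat a quoted right-hand side of `=~` as a literal string, so the sketch keeps each pattern in a variable and expands it unquoted.

```shell
#!/bin/bash
# matches PATTERN STRING: print "yes" if STRING matches the extended
# regular expression PATTERN, "no" otherwise. (Hypothetical helper.)
matches() {
    local re=$1 str=$2
    if [[ $str =~ $re ]]; then echo yes; else echo no; fi
}

re_loose='^mpath-*'   # pattern from the report: "mpath" plus zero or more dashes
re_strict='^mpath-'   # requires the dash after "mpath"

for uuid in \
    "mpath-3600d0230006c1c440bdc2d15ef2cbc01" \
    "part1-mpath-3600d0230006c1c440bdc2d15ef2cbc01" \
    "mpathX"; do
    printf '%-48s loose=%s strict=%s\n' "$uuid" \
        "$(matches "$re_loose" "$uuid")" "$(matches "$re_strict" "$uuid")"
done
```

Both patterns are anchored at the start of the string, so neither matches the partition UUID that begins with "part1-"; that would explain why partition devices were reported as not being multipath devices.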
[root@dash-01 init.d]# dmsetup ls
mpath2 (253, 3)
mpath1 (253, 2)
VolGroup00-LogVol01 (253, 1)
VolGroup00-LogVol00 (253, 0)
scsifence-scsifence0 (253, 6)
mpath1p1 (253, 4)
mpath2p1 (253, 5)
[root@dash-01 init.d]# vi scsi_reserve_orig
[root@dash-01 init.d]# dmsetup info
Name: mpath2
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 1
Event number: 1
Major, minor: 253, 3
Number of targets: 1
UUID: mpath-3600d0230006c1c440bdc2d15ef2cbc01
Name: mpath1
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 1
Event number: 1
Major, minor: 253, 2
Number of targets: 1
UUID: mpath-3600d0230006c1c440bdc2d15ef2cbc00
Name: VolGroup00-LogVol01
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 1
Event number: 0
Major, minor: 253, 1
Number of targets: 1
UUID: LVM-hoS752fEtaVLsQQ83okl3G9n4hsHxNrS2mLzdOSG8e1M21syhwKbqPe5SVRtUxv6
Name: VolGroup00-LogVol00
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 1
Event number: 0
Major, minor: 253, 0
Number of targets: 1
UUID: LVM-hoS752fEtaVLsQQ83okl3G9n4hsHxNrSXyUSl1uXaf0a76c4LJm3yoFsd3BpYo2g
Name: scsifence-scsifence0
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 0
Event number: 0
Major, minor: 253, 6
Number of targets: 1
UUID: LVM-kZ56J6lyRpI7dnGBq8iyoEkQKaNiVgadu8E4gCEuBHKcMJnoVKOlYV7cS0E1vvAs
Name: mpath1p1
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 1
Event number: 0
Major, minor: 253, 4
Number of targets: 1
UUID: part1-mpath-3600d0230006c1c440bdc2d15ef2cbc00
Name: mpath2p1
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 0
Event number: 0
Major, minor: 253, 5
Number of targets: 1
UUID: part1-mpath-3600d0230006c1c440bdc2d15ef2cbc01
and one more:
logger -t scsi_reserve \
"[info] leaving the fence domain"
# fence_tool leave
We should not be logging what we have not done :).
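One way to keep the log and the action in step is to run the command first and log the result afterwards. This is only a sketch under stated assumptions: `run_and_log` and the `LOG_CMD` override are hypothetical names, not part of the scsi_reserve script; `logger -t scsi_reserve` and `fence_tool leave` are the commands quoted above.

```shell
#!/bin/bash
# run_and_log MESSAGE COMMAND [ARGS...]
# Runs COMMAND and logs MESSAGE only if it succeeded; logs an error otherwise.
# LOG_CMD is a hypothetical override so the function can be tried without syslog.
run_and_log() {
    local msg=$1
    shift
    if "$@"; then
        ${LOG_CMD:-logger -t scsi_reserve} "[info] $msg"
    else
        ${LOG_CMD:-logger -t scsi_reserve} "[error] command failed: $*"
        return 1
    fi
}

# Intended use (not executed here):
#   run_and_log "left the fence domain" fence_tool leave
```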
(In reply to comment #9)
> 1. Init script layout
> There should be only one section looking like this:
> [...]
> case $1 in
>
> start)
> echo -n "Starting scsi_reserve:"
> [...]
> Having multiple of these seems quite confusing to me.

This is by design. Admittedly it is poor design, but it was needed for BZ 530400 since we need to print out (echo) what operation we are doing before we call 'failure'. A better design would be to write subroutines for the common tests, but that seemed to be too risky for a z-stream release. The current code is slightly confusing, but works as expected.

> 2. [FAILED] output is not written correctly
> This seems to be problem of missing echo call whenever
> failure call is used.

Yes. There are a few echo statements missing after calls to 'failure'. Will fix under BZ 530400.

> 3. Wrong regex for multipath devices
> if [[ $dm_uuid =~ "^mpath-*" ]]; then
> These are "mpath", "mpath-", "mpath--",... Is this really what it was meant to
> be?
> In my setup not all devices match that and the script fails saying that " is
> not a multipath device". Full list of devices is attached at the bottom.

Actually there are two problems. The current regex works, but should be "^mpath-" or "^mpath". So you are correct about that. The other issue is due to your dm_uuid, which was "^part[1-9]-mpath-". This is due to the fact that you are using partitions, which have never been supported with fence_scsi. We can work around this such that partitions are allowed. It just takes an extra elif statement in the code to get the correct dm-multipath slaves. I have a fix for this.

> 4. Undefined variable $dev
> "[error] $dev is not a multipath device"
> This variable is not defined (should it be $pv_dev?).

Yes. Good catch.

(In reply to comment #10)
> and one more:
>
> logger -t scsi_reserve \
> "[info] leaving the fence domain"
> # fence_tool leave
>
> We should not be logging what we have not done :).

Yes, I need to uncomment the 'fence_tool leave'. I commented this out during development and forgot to fix it. Thanks.

Fixed the above issues and pushed to RHEL54 branch.
- The fence_tool leave command is now correctly uncommented.
- The $dev variable is fixed to be $pv_dev.
- The script can now handle partitions with dm-multipath. This takes place when we get the dm device's uuid. Added case to check if uuid for dm device matches "^part[0-9]-mpath". If so, the slave devices can be found next in the slave dir for the current pv_dev.
commit d8b5762fbe018bba8282492d41f6231d2ce8930e

Created attachment 369340 [details]
Handle partitions with multipath in scsi_reserve script.
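The idea of following the slave directory from a partition down to its underlying devices can be sketched as below. This is an illustration under assumptions, not the code from the attachment: `leaf_slaves` and the `SYSFS_BLOCK` override are hypothetical names, and the sketch walks the sysfs `slaves` directories generically instead of checking the dm UUID as the script does.

```shell
#!/bin/bash
# leaf_slaves DEVICE
# Prints the non-dm devices underneath DEVICE by walking its "slaves"
# directory in sysfs, descending through any dm-* devices it finds.
# SYSFS_BLOCK is a hypothetical override so the walk can be tried on a
# fake directory tree.
leaf_slaves() {
    local dev=$1 root=${SYSFS_BLOCK:-/sys/block} slave name
    for slave in "$root/$dev"/slaves/*; do
        [ -e "$slave" ] || continue       # glob did not match: no slaves
        name=${slave##*/}
        case $name in
            dm-*) leaf_slaves "$name" ;;  # e.g. partition map -> multipath map
            *)    echo "$name" ;;         # e.g. an sd device
        esac
    done
}

# Intended use (not executed here): leaf_slaves dm-4
```

For a partition such as mpath1p1, the first level of slaves would be the multipath map itself, and the second level would be the path devices that the reservations apply to.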
Please change the regex to "^part[0-9]*-mpath" as we discussed. This is to allow more than 10 partitions on the device.

Created attachment 369721 [details]
Fix regular expression.

Here is the fix for the incorrect regular expression.
commit eb0da3897be8b752b7abbabcdf1a871ad3aec9e4
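The resulting UUID checks can be sketched as follows. This is a sketch of the if/elif structure the comments describe, not the committed code: `classify_dm_uuid` is a hypothetical name, and the "part12-" UUID is invented to show a multi-digit partition number. Note that `[0-9]*` also accepts zero digits, so a string beginning with "part-mpath" would match as well; `[0-9]+` would require at least one digit.

```shell
#!/bin/bash
# classify_dm_uuid UUID
# Prints "mpath" for a multipath map, "mpath-partition" for a partition on a
# multipath map, and "other" for anything else. (Hypothetical helper.)
classify_dm_uuid() {
    local uuid=$1
    local re_mpath='^mpath-'
    local re_part='^part[0-9]*-mpath'   # pattern requested in this report
    if [[ $uuid =~ $re_mpath ]]; then
        echo mpath
    elif [[ $uuid =~ $re_part ]]; then
        echo mpath-partition
    else
        echo other
    fi
}

classify_dm_uuid "mpath-3600d0230006c1c440bdc2d15ef2cbc00"
classify_dm_uuid "part1-mpath-3600d0230006c1c440bdc2d15ef2cbc00"
classify_dm_uuid "part12-mpath-3600d0230006c1c440bdc2d15ef2cbc00"   # invented UUID
classify_dm_uuid "LVM-kZ56J6lyRpI7dnGBq8iyoEkQKaNiVgadu8E4gCEuBHKcMJnoVKOlYV7cS0E1vvAs"
```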
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHBA-2009-1598.html