Bug 709400 - fs.sh resource agent monitor should not return an error if device does not exist.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: resource-agents
Version: 6.1
Hardware: x86_64
OS: Linux
Priority: low
Severity: low
Target Milestone: rc
Target Release: ---
Assignee: Chris Feist
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-05-31 15:52 UTC by bcodding
Modified: 2011-12-06 12:05 UTC (History)

Fixed In Version: resource-agents-3.9.2-2.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-12-06 12:05:28 UTC
Target Upstream Version:


Attachments
A patch that returns $OCF_NOT_RUNNING instead of $OCF_ERR_ARGS inside do_monitor() (424 bytes, application/octet-stream)
2011-05-31 15:52 UTC, bcodding


Links
System: Red Hat Product Errata
ID: RHSA-2011:1580
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Low: resource-agents security, bug fix, and enhancement update
Last Updated: 2011-12-06 00:38:57 UTC

Description bcodding 2011-05-31 15:52:20 UTC
Created attachment 502032
A patch that returns $OCF_NOT_RUNNING instead of $OCF_ERR_ARGS inside do_monitor()

The fs.sh resource agent (ocf::redhat:fs.sh) reports an error condition (invalid parameter) when called with "monitor" if the referenced device doesn't exist.  The device may not be present yet because a dependency has not been started (such as ocf::heartbeat:lvm-cluster.sh).

This error condition prevents the resource from being started -- even if the dependency is met by grouping the resources.

Name        : resource-agents
Version     : 3.0.12
Release     : 22.el6

Reproduced every time. Reproduction requires a cluster with shared storage and properly configured CLVM, with device resources configured to be available to only one node at a time.

Explicit reproduction can be performed by simply specifying a non-existent device to the resource agent.  For example:

[root@r1 ~]# OCF_RESKEY_mountpoint="/dev_rack1a" OCF_RESKEY_device="/dev/cluster_vg_01/r_lv_01" OCF_RESKEY_fstype="ext4" /usr/lib/ocf/resource.d/redhat/fs.sh monitor
<err>    start_filesystem: Could not match /dev/cluster_vg_01/r_lv_01 with a real device
start_filesystem: Could not match /dev/cluster_vg_01/r_lv_01 with a real device

It would be preferable for the monitor call to return $OCF_NOT_RUNNING, which is an accurate result when the device does not exist.

I've attached a patch that returns $OCF_NOT_RUNNING instead of $OCF_ERR_ARGS inside do_monitor().
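
For readers without access to the attachment, the requested behavior can be sketched roughly as follows. This is not the fs.sh source and not the attached patch: the function name monitor_sketch and its two positional arguments are hypothetical, and the mount check is deliberately simplified. Only the numeric OCF return codes are taken from the OCF resource agent convention.

```shell
#!/bin/sh
# Hypothetical sketch only; NOT the real fs.sh code or the attached patch.
# Numeric values follow the standard OCF return code convention.
OCF_SUCCESS=0
OCF_ERR_ARGS=2
OCF_NOT_RUNNING=7

# monitor_sketch DEVICE MOUNTPOINT   (hypothetical helper name and arguments)
monitor_sketch() {
    device="$1"
    mountpoint="$2"

    # A missing parameter is a genuine argument error.
    if [ -z "$device" ] || [ -z "$mountpoint" ]; then
        return $OCF_ERR_ARGS
    fi

    # The device node is absent, for example because the volume has not
    # been activated on this node yet. Nothing can be mounted from it,
    # so report "not running" rather than an argument error.
    if [ ! -e "$device" ]; then
        return $OCF_NOT_RUNNING
    fi

    # Simplified check (assumption): look only for the mount point in
    # /proc/mounts; a real agent would also verify the mounted device.
    if grep -qF -- " $mountpoint " /proc/mounts 2>/dev/null; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}
```

With this shape, a monitor call made before the volume is activated reports a stopped resource, which leaves the cluster manager free to start the dependency first and the filesystem afterwards.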

Comment 3 Chris Feist 2011-07-12 15:01:48 UTC
Are you using this agent with pacemaker or rgmanager?  And can you also send us your cluster.conf or pacemaker configuration file?

Thanks,
Chris

Comment 4 bcodding 2011-07-12 15:19:44 UTC
Using pacemaker.  Here's an example config:

node kudu1
node kudu2
node kudu3
primitive fs01-export ocf:uvm:exportfs \
        params export="/fs01    *(sec=krb5,rw,fsid=1) c_a(sec=krb5,rw,insecure) c_b(rw,no_root_squash,sec=sys)" \
        op start interval="0" timeout="40"
primitive fs01-fs ocf:redhat:fs.sh \
        params mountpoint="/fs01" device="/dev/cluster_vg_01/kudu_lv_01" fstype="ext4" nfslock="1" \
        op start interval="0" timeout="900" \
        op stop interval="0" timeout="30"
primitive fs01-lvm ocf:heartbeat:lvm-cluster.sh \
        params vg_name="cluster_vg_01" lv_name="kudu_lv_01" exclusive="1"
primitive nfs1-delay ocf:heartbeat:Delay \
        params stopdelay="32" startdelay="0" \
        op start interval="0" timeout="30" \
        op stop interval="0" timeout="60"
primitive nfs1-grace lsb:nfs-graceful
primitive nfs1-ip ocf:heartbeat:IPaddr2 \
        params ip="132.198.100.211" cidr_netmask="23"
group nfs1 fs01-lvm fs01-fs nfs1-delay fs01-export nfs1-grace nfs1-ip
location prefer-kudu1_nfs1 nfs1 50: kudu1
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-5.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
        cluster-infrastructure="cman" \
        stonith-enabled="false" \
        last-lrm-refresh="1309272190"

Comment 5 Chris Feist 2011-07-12 16:43:17 UTC
Thanks for the quick response. We should be able to get this fixed soon.

Comment 7 Chris Feist 2011-07-22 19:50:01 UTC
After resource-agents update:

[root@ask-04 ~]# OCF_RESKEY_device="/dev/noexist" OCF_RESKEY_fstype="ext3" /usr/lib/ocf/resource.d/redhat/fs.sh monitor
<err>    start_filesystem: Could not match /dev/noexist with a real device
[fs.sh] start_filesystem: Could not match /dev/noexist with a real device
[root@ask-04 ~]# echo $?
7
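
The exit status 7 above corresponds to OCF_NOT_RUNNING, where the original report saw an invalid-parameter error (OCF_ERR_ARGS, which is 2). A small helper for decoding the status when testing an agent by hand might look like the sketch below. The function name ocf_rc_name is hypothetical and not part of resource-agents; only codes 0 through 7 are mapped.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with resource-agents): translate a
# numeric OCF exit status into its conventional symbolic name.
ocf_rc_name() {
    case "$1" in
        0) echo "OCF_SUCCESS" ;;
        1) echo "OCF_ERR_GENERIC" ;;
        2) echo "OCF_ERR_ARGS" ;;
        3) echo "OCF_ERR_UNIMPLEMENTED" ;;
        4) echo "OCF_ERR_PERM" ;;
        5) echo "OCF_ERR_INSTALLED" ;;
        6) echo "OCF_ERR_CONFIGURED" ;;
        7) echo "OCF_NOT_RUNNING" ;;
        *) echo "other ($1)" ;;   # codes not mapped in this sketch
    esac
}
```

Running `ocf_rc_name $?` immediately after the monitor command prints the symbolic name for the status that command returned.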

Comment 11 errata-xmlrpc 2011-12-06 12:05:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2011-1580.html

