Bug 709400

Summary: fs.sh resource agent monitor should not return an error if device does not exist.
Product: Red Hat Enterprise Linux 6
Reporter: bcodding
Component: resource-agents
Assignee: Chris Feist <cfeist>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: low
Priority: low
Version: 6.1
CC: agk, cfeist, cluster-maint, djansa, lhh, mjuricek
Target Milestone: rc
Hardware: x86_64
OS: Linux
Fixed In Version: resource-agents-3.9.2-2.el6
Doc Type: Bug Fix
Last Closed: 2011-12-06 12:05:28 UTC
Attachments:
  Description: A patch that returns $OCF_NOT_RUNNING instead of $OCF_ERR_ARGS inside do_monitor(
  Flags: none

Description bcodding 2011-05-31 15:52:20 UTC
Created attachment 502032
A patch that returns $OCF_NOT_RUNNING instead of $OCF_ERR_ARGS inside do_monitor(

The fs.sh resource agent (ocf::redhat:fs.sh) reports an error condition (invalid parameter) when called with "monitor" if the referenced device doesn't exist.  The device may not be present yet because a dependency has not been started (such as ocf::heartbeat:lvm-cluster.sh).

This error condition prevents the resource from being started -- even if the dependency is met by grouping the resources.

Name        : resource-agents
Version     : 3.0.12
Release     : 22.el6

Reproduced every time. Reproduction requires a cluster with shared storage and properly configured CLVM, with device resources configured to be available to only one node at a time.

Explicit reproduction can be performed by simply specifying a non-existent device to the resource agent.  For example:

[root@r1 ~]# OCF_RESKEY_mountpoint="/dev_rack1a" OCF_RESKEY_device="/dev/cluster_vg_01/r_lv_01" OCF_RESKEY_fstype="ext4" /usr/lib/ocf/resource.d/redhat/fs.sh monitor
<err>    start_filesystem: Could not match /dev/cluster_vg_01/r_lv_01 with a real device
start_filesystem: Could not match /dev/cluster_vg_01/r_lv_01 with a real device

It would be preferable for the monitor call to return $OCF_NOT_RUNNING, which is accurate when the device does not exist: a filesystem on a missing device cannot be mounted, so the resource is simply not running.

I've attached a patch that returns $OCF_NOT_RUNNING instead of $OCF_ERR_ARGS inside do_monitor().
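
For readers without access to the attachment, the idea behind the change can be sketched roughly as follows. This is not the attached patch and not the actual fs.sh code: monitor_sketch is a hypothetical stand-in, and the plain existence test is an assumption standing in for the agent's own device-resolution helpers. Only the numeric values of the OCF return codes are taken from the OCF resource agent API.

```shell
#!/bin/bash
# OCF return codes as defined by the OCF resource agent API.
OCF_SUCCESS=0
OCF_ERR_ARGS=2
OCF_NOT_RUNNING=7

# Hypothetical stand-in for the device check on the monitor path.
# The real agent resolves the device with its own helpers; this
# sketch only tests whether the given path exists.
monitor_sketch() {
    local device="$1"

    if [ -z "$device" ]; then
        # No device given at all: a genuine parameter error.
        return $OCF_ERR_ARGS
    fi

    if [ ! -e "$device" ]; then
        # Device not present (yet): nothing can be mounted from it,
        # so report "not running" instead of an argument error.
        return $OCF_NOT_RUNNING
    fi

    # A real agent would continue with its mount checks here.
    return $OCF_SUCCESS
}

# Example call with a device path that is assumed not to exist.
rc=0
monitor_sketch "/dev/noexist" || rc=$?
echo "monitor exit code: $rc"
```

With this behaviour, a probe that runs before the volume has been activated sees a stopped resource rather than a misconfigured one, so the cluster manager can go on to start it once its dependencies are up.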

Comment 3 Chris Feist 2011-07-12 15:01:48 UTC
Are you using this agent with pacemaker or rgmanager?  And can you also send us your cluster.conf or pacemaker configuration file?

Thanks,
Chris

Comment 4 bcodding 2011-07-12 15:19:44 UTC
Using pacemaker.  Here's an example config:

node kudu1
node kudu2
node kudu3
primitive fs01-export ocf:uvm:exportfs \
        params export="/fs01    *(sec=krb5,rw,fsid=1) c_a(sec=krb5,rw,insecure) c_b(rw,no_root_squash,sec=sys)" \
        op start interval="0" timeout="40"
primitive fs01-fs ocf:redhat:fs.sh \
        params mountpoint="/fs01" device="/dev/cluster_vg_01/kudu_lv_01" fstype="ext4" nfslock="1" \
        op start interval="0" timeout="900" \
        op stop interval="0" timeout="30"
primitive fs01-lvm ocf:heartbeat:lvm-cluster.sh \
        params vg_name="cluster_vg_01" lv_name="kudu_lv_01" exclusive="1"
primitive nfs1-delay ocf:heartbeat:Delay \
        params stopdelay="32" startdelay="0" \
        op start interval="0" timeout="30" \
        op stop interval="0" timeout="60"
primitive nfs1-grace lsb:nfs-graceful
primitive nfs1-ip ocf:heartbeat:IPaddr2 \
        params ip="132.198.100.211" cidr_netmask="23"
group nfs1 fs01-lvm fs01-fs nfs1-delay fs01-export nfs1-grace nfs1-ip
location prefer-kudu1_nfs1 nfs1 50: kudu1
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-5.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
        cluster-infrastructure="cman" \
        stonith-enabled="false" \
        last-lrm-refresh="1309272190"

Comment 5 Chris Feist 2011-07-12 16:43:17 UTC
Thanks for the quick response. We should be able to get this fixed soon.

Comment 7 Chris Feist 2011-07-22 19:50:01 UTC
After resource-agents update:

[root@ask-04 ~]# OCF_RESKEY_device="/dev/noexist" OCF_RESKEY_fstype="ext3" /usr/lib/ocf/resource.d/redhat/fs.sh monitor
<err>    start_filesystem: Could not match /dev/noexist with a real device
[fs.sh] start_filesystem: Could not match /dev/noexist with a real device
[root@ask-04 ~]# echo $?
7
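
For reference, exit status 7 is OCF_NOT_RUNNING in the OCF resource agent API. A small helper for translating a numeric status into its symbolic name might look like the sketch below; ocf_rc_name is a hypothetical function written for illustration and is not part of resource-agents.

```shell
#!/bin/bash
# Map a numeric OCF exit status to its symbolic name.
# ocf_rc_name is a hypothetical helper, not shipped with resource-agents.
ocf_rc_name() {
    case "$1" in
        0) echo "OCF_SUCCESS" ;;
        1) echo "OCF_ERR_GENERIC" ;;
        2) echo "OCF_ERR_ARGS" ;;
        3) echo "OCF_ERR_UNIMPLEMENTED" ;;
        4) echo "OCF_ERR_PERM" ;;
        5) echo "OCF_ERR_INSTALLED" ;;
        6) echo "OCF_ERR_CONFIGURED" ;;
        7) echo "OCF_NOT_RUNNING" ;;
        *) echo "unknown ($1)" ;;
    esac
}

# Decode the status shown in the transcript above.
ocf_rc_name 7
```

Before the update, the same monitor call reported an invalid parameter, which corresponds to OCF_ERR_ARGS (2).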

Comment 11 errata-xmlrpc 2011-12-06 12:05:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2011-1580.html