Red Hat Bugzilla – Bug 709400
fs.sh resource agent monitor should not return an error if device does not exist.
Last modified: 2011-12-06 07:05:28 EST
Created attachment 502032 [details]
A patch that returns $OCF_NOT_RUNNING instead of $OCF_ERR_ARGS inside do_monitor()

The fs.sh resource agent (ocf::redhat:fs.sh) reports an error condition (invalid parameter) when called with "monitor" if the referenced device does not exist. The device may not be present yet because a dependency has not been started (such as ocf::heartbeat:lvm-cluster.sh). This error condition prevents the resource from being started, even if the dependency is met by grouping the resources.

Name    : resource-agents
Version : 3.0.12
Release : 22.el6

Reproduced every time. Requires a cluster with shared storage and properly configured CLVM, with device resources configured to be available to only one node at a time. Explicit reproduction can be performed by simply specifying a non-existent device to the resource agent. For example:

[root@r1 ~]# OCF_RESKEY_mountpoint="/dev_rack1a" OCF_RESKEY_device="/dev/cluster_vg_01/r_lv_01" OCF_RESKEY_fstype="ext4" /usr/lib/ocf/resource.d/redhat/fs.sh monitor
<err> start_filesystem: Could not match /dev/cluster_vg_01/r_lv_01 with a real device
start_filesystem: Could not match /dev/cluster_vg_01/r_lv_01 with a real device

It would be desirable for the monitor call to return $OCF_NOT_RUNNING, which is entirely accurate if the device does not exist. I've attached a patch that returns $OCF_NOT_RUNNING instead of $OCF_ERR_ARGS inside do_monitor().
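Since the attachment body is not inlined here, the following is a minimal sketch of the intended change; aside from do_monitor() and the OCF return codes, the helper and variable names are illustrative placeholders rather than fs.sh's actual internals:

do_monitor()
{
        # Try to resolve the configured device to a real block device.
        # resolve_real_device is a placeholder for fs.sh's actual lookup.
        real_dev=$(resolve_real_device "$OCF_RESKEY_device")
        if [ -z "$real_dev" ]; then
                # The device may legitimately be absent because a dependency
                # (e.g. the clustered LV) has not been started yet, so report
                # "not running" rather than a parameter error.
                return $OCF_NOT_RUNNING    # previously: return $OCF_ERR_ARGS
        fi
        # ... remainder of the monitor logic (mount check, etc.) ...
}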
Are you using this agent with pacemaker or rgmanager? And can you also send us your cluster.conf or pacemaker configuration file?

Thanks,
Chris
Using pacemaker. Here's an example config:

node kudu1
node kudu2
node kudu3
primitive fs01-export ocf:uvm:exportfs \
        params export="/fs01 *(sec=krb5,rw,fsid=1) c_a(sec=krb5,rw,insecure) c_b(rw,no_root_squash,sec=sys)" \
        op start interval="0" timeout="40"
primitive fs01-fs ocf:redhat:fs.sh \
        params mountpoint="/fs01" device="/dev/cluster_vg_01/kudu_lv_01" fstype="ext4" nfslock="1" \
        op start interval="0" timeout="900" \
        op stop interval="0" timeout="30"
primitive fs01-lvm ocf:heartbeat:lvm-cluster.sh \
        params vg_name="cluster_vg_01" lv_name="kudu_lv_01" exclusive="1"
primitive nfs1-delay ocf:heartbeat:Delay \
        params stopdelay="32" startdelay="0" \
        op start interval="0" timeout="30" \
        op stop interval="0" timeout="60"
primitive nfs1-grace lsb:nfs-graceful
primitive nfs1-ip ocf:heartbeat:IPaddr2 \
        params ip="132.198.100.211" cidr_netmask="23"
group nfs1 fs01-lvm fs01-fs nfs1-delay fs01-export nfs1-grace nfs1-ip
location prefer-kudu1_nfs1 nfs1 50: kudu1
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-5.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
        cluster-infrastructure="cman" \
        stonith-enabled="false" \
        last-lrm-refresh="1309272190"
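For context on why grouping alone doesn't avoid the error: the group implies both colocation and start ordering, so fs01-lvm is started before fs01-fs. Written as explicit constraints, the ordering between those two resources would be roughly the following (a sketch, not part of the actual config):

order fs01-fs-after-fs01-lvm inf: fs01-lvm fs01-fs
colocation fs01-fs-with-fs01-lvm inf: fs01-fs fs01-lvm

But pacemaker's initial probe (a one-shot monitor) runs on resources before their dependencies are started, so fs01-fs's probe sees the missing LV and reports the hard error.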
Thanks for the quick response, we should be able to get this fixed soon.
After resource-agents update:

[root@ask-04 ~]# OCF_RESKEY_device="/dev/noexist" OCF_RESKEY_fstype="ext3" /usr/lib/ocf/resource.d/redhat/fs.sh monitor
<err> start_filesystem: Could not match /dev/noexist with a real device
[fs.sh] start_filesystem: Could not match /dev/noexist with a real device
[root@ask-04 ~]# echo $?
7
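Exit status 7 corresponds to $OCF_NOT_RUNNING under the OCF return-code convention ($OCF_ERR_ARGS would be 2), so the agent now behaves as requested. A quick self-contained check of the same thing (same assumed host and paths as above):

#!/bin/sh
# Run the patched agent's monitor action against a nonexistent device
# and verify it reports "not running" rather than an argument error.
OCF_RESKEY_device="/dev/noexist" OCF_RESKEY_fstype="ext3" \
    /usr/lib/ocf/resource.d/redhat/fs.sh monitor
rc=$?
if [ "$rc" -eq 7 ]; then
    echo "OK: monitor returned OCF_NOT_RUNNING (7) for a missing device"
else
    echo "FAIL: monitor returned $rc"
fi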
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2011-1580.html