Description of problem:
Monitoring of the NFS exports for the nfsclient resources goes into an endless loop; see the attached syslog. When inheritance is used in cluster.conf, /usr/share/cluster/nfsclient.sh can't monitor the export properly.

Problem #1:
/var/lib/nfs/etab doesn't have a trailing slash '/' on the export path, yet $OCF_RESKEY_path contains the trailing slash. The check for whether the export is present is:

    cat $tmpfn | tr -d "\n" | sed -e 's/([^)]*)/\n/g' | grep -q \
        "^${OCF_RESKEY_path}[\t ]*.*${OCF_RESKEY_target_tmp}"
    rv=$?

It doesn't match, because it tries to match

    /nfsexport/lv_app bamse.niceriver.net

with

    ^/nfsexport/lv_app/[\t ]*.*bamse.niceriver.net

So the monitor exits with 1 and the whole thing errors out.

Problem #2:
An IP address in the target doesn't work. The script tries to make it work here:

    declare OCF_RESKEY_target_tmp=$(clufindhostname -i "$OCF_RESKEY_target")

but /var/lib/nfs/etab actually contains the IP address, so it fails again due to a mismatch.

Anyway, here is a version that works for both IPs and hostnames, with or without the trailing '/'. The regexp could be smarter, but I kept it simple:

    declare OCF_RESKEY_target_tmp=$(clufindhostname -i "$OCF_RESKEY_target")
    if [ $? -ne 0 ]; then
        [ "$OCF_RESKEY_use_cache" = "1" ] || rm -f $tmpfn
        ocf_log err "nfsclient:$OCF_RESKEY_name is missing!"
        exit 1
    fi

    # 1. Path as configured, target as resolved by clufindhostname
    cat $tmpfn | tr -d "\n" | sed -e 's/([^)]*)/\n/g' | grep -q \
        "^${OCF_RESKEY_path}[\t ]*.*${OCF_RESKEY_target_tmp}"
    rv=$?
    if [ $rv -eq 0 ]; then
        [ "$OCF_RESKEY_use_cache" = "1" ] || rm -f $tmpfn
        exit 0
    fi

    # 2. Path as configured, target as configured
    cat $tmpfn | tr -d "\n" | sed -e 's/([^)]*)/\n/g' | grep -q \
        "^${OCF_RESKEY_path}[\t ]*.*${OCF_RESKEY_target}"
    rv=$?
    if [ $rv -eq 0 ]; then
        [ "$OCF_RESKEY_use_cache" = "1" ] || rm -f $tmpfn
        exit 0
    fi

    # 3. Path without the trailing slash, target as resolved
    OCF_RESKEY_path_tmp=$(echo $OCF_RESKEY_path | sed -e 's@/$@@')
    cat $tmpfn | tr -d "\n" | sed -e 's/([^)]*)/\n/g' | grep -q \
        "^${OCF_RESKEY_path_tmp}[\t ]*.*${OCF_RESKEY_target_tmp}"
    rv=$?
    if [ $rv -eq 0 ]; then
        [ "$OCF_RESKEY_use_cache" = "1" ] || rm -f $tmpfn
        exit 0
    fi

    # 4. Path without the trailing slash, target as configured
    cat $tmpfn | tr -d "\n" | sed -e 's/([^)]*)/\n/g' | grep -q \
        "^${OCF_RESKEY_path_tmp}[\t ]*.*${OCF_RESKEY_target}"
    rv=$?
    [ "$OCF_RESKEY_use_cache" = "1" ] || rm -f $tmpfn
    if [ $rv -eq 0 ]; then
        exit 0
    fi

    ocf_log err "nfsclient:$OCF_RESKEY_name is missing! at end"
    exit 1
    ;;
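Both mismatches can be reproduced outside the cluster with a few lines of shell. This is only a sketch: etab_matches and report are hypothetical helpers, not part of nfsclient.sh, the input line stands in for one etab entry after the tr/sed steps have stripped the options, and 192.0.2.10 is a made-up example address.

```shell
#!/bin/bash
# Hypothetical helper (not part of nfsclient.sh): succeeds when an
# etab-style line matches the pattern the agent builds from path and target.
etab_matches() {
    local line=$1 path=$2 target=$3
    printf '%s\n' "$line" | grep -q "^${path}[\t ]*.*${target}"
}

# Print whether the given line/path/target combination matches.
report() {
    if etab_matches "$@"; then echo "match"; else echo "no match"; fi
}

# Problem #1: etab has no trailing slash, the configured path does.
report "/nfsexport/lv_app bamse.niceriver.net" "/nfsexport/lv_app/" "bamse.niceriver.net"
report "/nfsexport/lv_app bamse.niceriver.net" "/nfsexport/lv_app" "bamse.niceriver.net"

# Problem #2: etab holds the IP (192.0.2.10 is a made-up example address),
# but the pattern is built from the hostname.
report "/nfsexport/lv_app 192.0.2.10" "/nfsexport/lv_app" "bamse.niceriver.net"
```

Only the second call should report a match; the first fails on the extra '/', the third because the hostname never appears in the line.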
Created attachment 488253
Fixes problem #1.
Problem #2 is fixed by the fix for bug 592613
POSTed upstream: https://github.com/ClusterLabs/resource-agents/pull/2
The patch in this bugzilla trims the trailing slash before dealing with exports. This prevents the first half of the problem (problem #1) from occurring. Here is a successful test that would fail without the patch:

[root@rhel5-1 ~]# OCF_RESKEY_name=a OCF_RESKEY_path=/tmp/ OCF_RESKEY_target="ayanami" /usr/share/cluster/nfsclient.sh start
<info> Adding export: ayanami:/tmp
[root@rhel5-1 ~]# echo $?
0
[root@rhel5-1 ~]# OCF_RESKEY_name=a OCF_RESKEY_path=/tmp/ OCF_RESKEY_target="ayanami" /usr/share/cluster/nfsclient.sh status
[root@rhel5-1 ~]# echo $?
0
[root@rhel5-1 ~]# OCF_RESKEY_name=a OCF_RESKEY_path=/tmp/ OCF_RESKEY_target="ayanami" /usr/share/cluster/nfsclient.sh stop
<info> Removing export: ayanami:/tmp
[root@rhel5-1 ~]# echo $?
0
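For illustration, trimming the trailing slash up front might look roughly like the following. This is a sketch under assumptions, not the attached patch: normalize_export_path is a hypothetical name, and the guard for a bare "/" is an addition made here so that the root path is not reduced to an empty string.

```shell
#!/bin/bash
# Hypothetical helper (not taken from the attached patch): drop one
# trailing slash from an export path, but leave a bare "/" untouched.
normalize_export_path() {
    local path=$1
    if [ "$path" != "/" ]; then
        path=${path%/}   # remove a single trailing slash, if present
    fi
    printf '%s\n' "$path"
}

normalize_export_path "/tmp/"
normalize_export_path "/tmp"
normalize_export_path "/"
```

Normalizing once, before start, status, and stop use the path, keeps all three operations working on the same string that ends up in /var/lib/nfs/etab.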
Verified in version rgmanager-2.0.52-19.el5, kernel 2.6.18-266.el5

[root@a3 ~]# rpm -q rgmanager
rgmanager-2.0.52-19.el5
[root@a3 ~]# export OCF_RESKEY_path=/tmp/
[root@a3 ~]# export OCF_RESKEY_target=a2
[root@a3 ~]# export OCF_RESKEY_name=a
[root@a3 ~]# /usr/share/cluster/nfsclient.sh start ; echo $?
<info> Adding export: a2:/tmp
0
[root@a3 ~]# /usr/share/cluster/nfsclient.sh status ; echo $?
0
[root@a3 ~]# /usr/share/cluster/nfsclient.sh stop ; echo $?
<info> Removing export: a2:/tmp
0

Reproduced error in older version:

[root@a1 opt]# rpm -q rgmanager
rgmanager-2.0.52-13.el5
[root@a1 opt]# export OCF_RESKEY_path=/tmp/
[root@a1 opt]# export OCF_RESKEY_target=a2
[root@a1 opt]# export OCF_RESKEY_name=a
[root@a1 opt]# /usr/share/cluster/nfsclient.sh start ; echo $?
<info> Adding export: a2:/tmp/
0
[root@a1 opt]# /usr/share/cluster/nfsclient.sh status ; echo $?
<err> nfsclient:a is missing!
1
[root@a1 opt]# /usr/share/cluster/nfsclient.sh stop ; echo $?
<info> Removing export: a2:/tmp/
0

Problem #2 with clufindhostname wasn't reproduced.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1000.html