Description of problem: Too many nfs-mon_monitor_10000 error messages seen in /var/log/messages n 20 12:48:04 dhcp46-111 lrmd[22195]: notice: nfs-mon_monitor_10000:31431:stderr [ /usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 142: [: too many arguments ] Jan 20 12:48:20 dhcp46-111 lrmd[22195]: notice: nfs-mon_monitor_10000:31462:stderr [ /usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 142: [: too many arguments ] Also, the machine got failovered to other node, even though pacemaker and ganesha services are up. Not sure, this issue is seen because of these error messages. Version-Release number of selected component (if applicable): nfs-ganesha-gluster-2.4.1-6.el7rhgs.x86_64 nfs-ganesha-2.4.1-6.el7rhgs.x86_64 glusterfs-ganesha-3.8.4-12.el7rhgs.x86_64 How reproducible: Always Steps to Reproduce: 1. Create a ganesha cluster and volume. 2. Enable nfs-ganesha on the volume and mount it. 3. Run IOs from mount point. Actual results: Too many nfs-mon_monitor_10000 error messages seen in /var/log/messages Expected results: No such error messages should seen. Additional info: [root@dhcp46-111 ~]# service pacemaker status -l Redirecting to /bin/systemctl status -l pacemaker.service ● pacemaker.service - Pacemaker High Availability Cluster Manager Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; disabled; vendor preset: disabled) Active: active (running) since Wed 2017-01-18 15:24:44 IST; 1 day 21h ago Docs: man:pacemakerd http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Explained/index.html Main PID: 22191 (pacemakerd) CGroup: /system.slice/pacemaker.service ├─ 6654 /bin/bash /usr/lib/ocf/resource.d/heartbeat/ganesha_mon monitor ├─ 6666 sleep 5 ├─22191 /usr/sbin/pacemakerd -f ├─22193 /usr/libexec/pacemaker/cib ├─22194 /usr/libexec/pacemaker/stonithd ├─22195 /usr/libexec/pacemaker/lrmd ├─22196 /usr/libexec/pacemaker/attrd ├─22197 /usr/libexec/pacemaker/pengine └─22198 /usr/libexec/pacemaker/crmd Jan 20 12:56:10 dhcp46-111.lab.eng.blr.redhat.com lrmd[22195]: notice: nfs-mon_monitor_10000:6418:stderr [ /usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 142: [: too many arguments ] Jan 20 12:56:25 dhcp46-111.lab.eng.blr.redhat.com lrmd[22195]: notice: nfs-mon_monitor_10000:6432:stderr [ /usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 142: [: too many arguments ] Jan 20 12:56:40 dhcp46-111.lab.eng.blr.redhat.com lrmd[22195]: notice: nfs-mon_monitor_10000:6474:stderr [ /usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 142: [: too many arguments ] Jan 20 12:56:55 dhcp46-111.lab.eng.blr.redhat.com lrmd[22195]: notice: nfs-mon_monitor_10000:6488:stderr [ /usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 142: [: too many arguments ] Jan 20 12:57:11 dhcp46-111.lab.eng.blr.redhat.com lrmd[22195]: notice: nfs-mon_monitor_10000:6502:stderr [ /usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 142: [: too many arguments ] Jan 20 12:57:26 dhcp46-111.lab.eng.blr.redhat.com lrmd[22195]: notice: nfs-mon_monitor_10000:6516:stderr [ /usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 142: [: too many arguments ] Jan 20 12:57:41 dhcp46-111.lab.eng.blr.redhat.com lrmd[22195]: notice: nfs-mon_monitor_10000:6558:stderr [ /usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 142: [: too many arguments ] Jan 20 12:57:56 dhcp46-111.lab.eng.blr.redhat.com lrmd[22195]: notice: nfs-mon_monitor_10000:6572:stderr [ /usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 142: [: too many arguments ] Jan 20 12:58:11 dhcp46-111.lab.eng.blr.redhat.com lrmd[22195]: notice: nfs-mon_monitor_10000:6586:stderr [ /usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 142: [: too many arguments ] Jan 20 12:58:26 dhcp46-111.lab.eng.blr.redhat.com lrmd[22195]: notice: nfs-mon_monitor_10000:6600:stderr [ /usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 142: [: too many arguments ] [root@dhcp46-111 ~]# service nfs-ganesha status Redirecting to /bin/systemctl status nfs-ganesha.service ● nfs-ganesha.service - NFS-Ganesha file server Loaded: loaded (/usr/lib/systemd/system/nfs-ganesha.service; disabled; vendor preset: disabled) Active: active (running) since Thu 2017-01-19 18:25:16 IST; 16h ago Docs: http://github.com/nfs-ganesha/nfs-ganesha/wiki Process: 32563 ExecStop=/bin/dbus-send --system --dest=org.ganesha.nfsd --type=method_call /org/ganesha/nfsd/admin org.ganesha.nfsd.admin.shutdown (code=exited, status=0/SUCCESS) Process: 415 ExecStartPost=/bin/bash -c prlimit --pid $MAINPID --nofile=$NOFILE:$NOFILE (code=exited, status=0/SUCCESS) Process: 407 ExecStart=/bin/bash -c ${NUMACTL} ${NUMAOPTS} /usr/bin/ganesha.nfsd ${OPTIONS} ${EPOCH} (code=exited, status=0/SUCCESS) Main PID: 411 (ganesha.nfsd) CGroup: /system.slice/nfs-ganesha.service └─411 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT -E 637729... Jan 19 18:25:16 dhcp46-111.lab.eng.blr.redhat.com systemd[1]: Starting NFS-Ganesha file server... Jan 19 18:25:16 dhcp46-111.lab.eng.blr.redhat.com systemd[1]: Started NFS-Ganesha file server. [root@dhcp46-111 ~]# pcs status Cluster name: ganesha-ha-360 Stack: corosync Current DC: dhcp46-111.lab.eng.blr.redhat.com (version 1.1.15-11.el7_3.2-e174ec8) - partition with quorum Last updated: Fri Jan 20 12:47:17 2017 Last change: Fri Jan 20 11:20:53 2017 by root via crm_attribute on dhcp46-124.lab.eng.blr.redhat.com 4 nodes and 24 resources configured Online: [ dhcp46-111.lab.eng.blr.redhat.com dhcp46-115.lab.eng.blr.redhat.com dhcp46-124.lab.eng.blr.redhat.com dhcp46-139.lab.eng.blr.redhat.com ] Full list of resources: Clone Set: nfs_setup-clone [nfs_setup] Started: [ dhcp46-111.lab.eng.blr.redhat.com dhcp46-115.lab.eng.blr.redhat.com dhcp46-124.lab.eng.blr.redhat.com dhcp46-139.lab.eng.blr.redhat.com ] Clone Set: nfs-mon-clone [nfs-mon] Started: [ dhcp46-111.lab.eng.blr.redhat.com dhcp46-115.lab.eng.blr.redhat.com dhcp46-124.lab.eng.blr.redhat.com dhcp46-139.lab.eng.blr.redhat.com ] Clone Set: nfs-grace-clone [nfs-grace] Started: [ dhcp46-115.lab.eng.blr.redhat.com dhcp46-139.lab.eng.blr.redhat.com ] Stopped: [ dhcp46-111.lab.eng.blr.redhat.com dhcp46-124.lab.eng.blr.redhat.com ] Resource Group: dhcp46-111.lab.eng.blr.redhat.com-group dhcp46-111.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp46-139.lab.eng.blr.redhat.com dhcp46-111.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp46-139.lab.eng.blr.redhat.com dhcp46-111.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp46-139.lab.eng.blr.redhat.com Resource Group: dhcp46-115.lab.eng.blr.redhat.com-group dhcp46-115.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp46-115.lab.eng.blr.redhat.com dhcp46-115.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp46-115.lab.eng.blr.redhat.com dhcp46-115.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp46-115.lab.eng.blr.redhat.com Resource Group: dhcp46-139.lab.eng.blr.redhat.com-group dhcp46-139.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp46-139.lab.eng.blr.redhat.com dhcp46-139.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp46-139.lab.eng.blr.redhat.com dhcp46-139.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp46-139.lab.eng.blr.redhat.com Resource Group: dhcp46-124.lab.eng.blr.redhat.com-group dhcp46-124.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp46-139.lab.eng.blr.redhat.com dhcp46-124.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp46-139.lab.eng.blr.redhat.com dhcp46-124.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp46-139.lab.eng.blr.redhat.com Failed Actions: * nfs-grace_monitor_5000 on dhcp46-139.lab.eng.blr.redhat.com 'not running' (7): call=145, status=complete, exitreason='none', last-rc-change='Thu Jan 19 23:19:55 2017', queued=0ms, exec=0ms * nfs-grace_monitor_5000 on dhcp46-111.lab.eng.blr.redhat.com 'not running' (7): call=208, status=complete, exitreason='none', last-rc-change='Thu Jan 19 18:24:49 2017', queued=0ms, exec=0ms * nfs-mon_monitor_10000 on dhcp46-111.lab.eng.blr.redhat.com 'unknown error' (1): call=13, status=Timed Out, exitreason='none', last-rc-change='Thu Jan 19 12:51:12 2017', queued=0ms, exec=0ms Daemon Status: corosync: active/disabled pacemaker: active/disabled pcsd: active/enabled s
sosreports and ganesha logs are at, http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1415072/