Created attachment 920487
status of services on node which is down

Description of problem
======================
Nagios reports that "Process glusterd is running" even when the node is down.

Version-Release number of selected component (if applicable)
============================================================
On nodes:

# rpm -qa | grep -i nagios
nagios-plugins-1.4.16-10.el6rhs.x86_64
nagios-plugins-procs-1.4.16-10.el6rhs.x86_64
nagios-common-3.5.1-6.el6.x86_64
gluster-nagios-common-0.1.3-2.el6rhs.noarch
gluster-nagios-addons-0.1.9-1.el6rhs.x86_64
nagios-plugins-ide_smart-1.4.16-10.el6rhs.x86_64

# rpm -qa | grep gluster
vdsm-gluster-4.14.7.2-1.el6rhs.noarch
glusterfs-api-3.6.0.24-1.el6rhs.x86_64
glusterfs-geo-replication-3.6.0.24-1.el6rhs.x86_64
gluster-nagios-common-0.1.3-2.el6rhs.noarch
samba-glusterfs-3.6.9-168.4.el6rhs.x86_64
glusterfs-3.6.0.24-1.el6rhs.x86_64
glusterfs-fuse-3.6.0.24-1.el6rhs.x86_64
glusterfs-server-3.6.0.24-1.el6rhs.x86_64
glusterfs-rdma-3.6.0.24-1.el6rhs.x86_64
gluster-nagios-addons-0.1.9-1.el6rhs.x86_64
glusterfs-libs-3.6.0.24-1.el6rhs.x86_64
glusterfs-cli-3.6.0.24-1.el6rhs.x86_64

On the Nagios/RHSC server:

# rpm -qa | grep nagios
nagios-plugins-1.4.16-10.el6rhs.x86_64
nagios-server-addons-0.1.4-2.el6rhs.noarch
nagios-plugins-dummy-1.4.16-10.el6rhs.x86_64
gluster-nagios-common-0.1.3-2.el6rhs.noarch
nagios-3.5.1-6.el6.x86_64
nagios-plugins-nrpe-2.14-1.3.el6rhs.x86_64
nagios-common-3.5.1-6.el6.x86_64
nagios-plugins-ping-1.4.16-10.el6rhs.x86_64
pnp4nagios-0.6.20-1.1.el6rhs.x86_64

Steps to Reproduce
==================
1. Install RHS on 4 nodes and set up a volume on them with RHSC
2. Set up Nagios monitoring (as described in the RHS 3.0 documentation)
3. Kill all node servers (do a hard shutdown, virsh undefine or similar)

Actual results
==============
In the Nagios web interface, go to the "Services" page, which reports that the
following services are OK even though the node itself is down:

* Gluster Management (Process glusterd is running)
* NFS (OK: No gluster volume uses nfs)
* Quota (OK: Quota not enabled)
* SMB (OK: No gluster volume uses smb)
* Self-Heal (Gluster Self Heal Daemon is running)

Expected results
================
All mentioned services are reported as CRITICAL.

Additional info
===============
See the attached screenshot from the Nagios web interface.
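
The gap between the actual and expected results can be written down as a small check. This is only a minimal sketch and not the gluster-nagios implementation: the helper name `expected_service_state` and the rule "a down host makes the last result stale" are assumptions made for illustration. Only the numeric status codes follow the standard Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def expected_service_state(host_is_up, last_reported_state):
    """Return the state this report expects Nagios to show for a service.

    Hypothetical helper (not part of gluster-nagios): when the host is
    down, the last reported result is stale, so the service is expected
    to be shown as CRITICAL instead of the old value.
    """
    if not host_is_up:
        return CRITICAL
    return last_reported_state


# Services from "Actual results"; each was last reported as OK
# before the node was killed.
services = ["Gluster Management", "NFS", "Quota", "SMB", "Self-Heal"]
expected = {name: expected_service_state(False, OK) for name in services}
```

With the node down, every entry in `expected` is CRITICAL, whereas the Nagios web interface in this report keeps showing the stale OK results.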
Hi Martin, can you confirm whether your Nagios server is set up on RHSC or on one of the RHS nodes?
I used the following configuration:

* 4 storage servers with gluster
* one management server with RHSC and Nagios server

So the Nagios server is outside of the trusted storage pool, on the management server.
Thank you for your report. However, this bug is being closed as it's logged against gluster-nagios monitoring for which no further new development is being undertaken.