Description of problem:
Auto configuration removes all configurations (hosts, volumes, bricks) from Nagios if the host used for discovery is no longer part of the cluster.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Create a cluster with 3 nodes (HostA, HostB, HostC).
2. Create some volumes and start them.
3. Run the discovery script, providing the name of the cluster and the IP of HostA.
4. Make sure all the volumes/hosts show up in the Nagios UI.
5. Detach HostA from the cluster using the "gluster peer detach" command.
6. Re-schedule the auto-config in the Nagios UI.

Actual results:
All hosts/volumes except HostA are removed from the Nagios configuration.

Expected results:
Hosts/volumes should not be removed from the Nagios configuration. The user should run the discovery again, providing the IP of HostB.

Additional info:
Patch sent upstream: http://review.gluster.org/#/c/8024/
Verified as fixed in nagios-server-addons-0.1.4-1.el6rhs.x86_64

Performed the following steps -

1. Created a cluster of four nodes, host1, host2, host3 and host4.
2. Created a couple of volumes, and started them.
3. Configured this cluster to be monitored via nagios server, which is also one of the RHS nodes.
4. Removed host4 from the cluster using gluster peer detach command.
5. Attempted to run cluster auto-config service using the Nagios UI. Saw the status of the service change to critical with the following status information -

    Can't remove all hosts except sync host in 'auto' mode. Run auto discovery manually

6. Ran auto-discovery at the nagios server using command line, manually -

    # /usr/lib64/nagios/plugins/gluster/discovery.py -c cluster_auto -H host1
    Cluster configurations changed
    Changes :
    Hostgroup cluster_auto - UPDATE
    Host cluster_auto - UPDATE
    Service - Cluster Auto Config -UPDATE
    Host host4 - REMOVE
    Are you sure, you want to commit the changes? (Yes, No) [Yes]:
    Cluster configurations synced successfully from host host1
    Do you want to restart Nagios to start monitoring newly discovered entities? (Yes, No) [Yes]:
    Nagios re-started successfully

In the Nagios UI, the host host4 was removed, and the status of the cluster auto-config service changed to OK.
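
For readers following along, the kind of guard that yields the "Can't remove all hosts except sync host" status might look roughly like the sketch below. This is not the code from the upstream patch: the function name `plan_host_removals`, its parameters, and the exact condition are assumptions made for illustration. The only thing taken from this report is the error message and the behaviour that 'auto' mode refuses the change while a manual run lets the operator confirm it.

```python
# Hypothetical sketch of the safety check; names and logic are illustrative
# assumptions, not the actual discovery.py implementation.

AUTO_MODE_ERROR = (
    "Can't remove all hosts except sync host in 'auto' mode. "
    "Run auto discovery manually"
)


def plan_host_removals(configured_hosts, discovered_hosts, sync_host, mode):
    """Return the sorted list of hosts that a sync would remove.

    Raises RuntimeError in 'auto' mode when the sync would leave nothing
    configured except (at most) the sync host, which is what happens when
    the sync host has been detached and only reports itself.
    """
    configured = set(configured_hosts)
    discovered = set(discovered_hosts)

    to_remove = configured - discovered
    remaining = configured - to_remove

    # Refuse only when something would be removed and every surviving
    # host is the sync host itself.
    if mode == "auto" and to_remove and remaining <= {sync_host}:
        raise RuntimeError(AUTO_MODE_ERROR)

    return sorted(to_remove)


if __name__ == "__main__":
    hosts = ["HostA", "HostB", "HostC"]
    # HostA was detached, so discovery through it sees only itself.
    try:
        plan_host_removals(hosts, ["HostA"], "HostA", "auto")
    except RuntimeError as err:
        print("CRITICAL:", err)
    # A manual run returns the plan so the operator can confirm it.
    print(plan_host_removals(hosts, ["HostA"], "HostA", "manual"))
```

In this sketch a manual run still returns the full removal plan, on the assumption that the interactive confirmation prompt is the safeguard in that mode.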
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-1277.html