Bug 1177129

Summary: [New] - glusterd service in nagios is not marked critical when glusterd is hung on the node
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: RamaKasturi <knarra>
Component: nagios-server-addons Assignee: Sahina Bose <sabose>
Status: CLOSED ERRATA QA Contact: RamaKasturi <knarra>
Severity: high Docs Contact:
Priority: medium    
Version: rhgs-3.1 CC: divya, dpati, sabose, tjeyasin
Target Milestone: ---   
Target Release: RHGS 3.1.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: nagios-server-addons-0.2.1-3.el7rhgs, nagios-server-addons-0.2.1-2.el6rhs Doc Type: Bug Fix
Doc Text:
Previously, the Nagios plugin only checked whether the glusterd process was present. As a consequence, the plugin returned OK status even when the glusterd process was dead but the pid file existed. With this fix, the plugin monitors the glusterd service state, and the glusterd service status is now reflected correctly.
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-07-29 05:27:18 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1202842    
Attachments:
Description Flags
Screenshot when glusterd is stopped none
Screenshot when glusterd dead but pid file exists none

Description RamaKasturi 2014-12-24 10:22:34 UTC
Description of problem:
The glusterd service in Nagios is not marked critical when glusterd is hung, i.e. when "service glusterd status" gives the output "glusterd dead but pid file exists".

Version-Release number of selected component (if applicable):
nagios-server-addons-0.1.11-1.el6rhs.noarch

How reproducible:
Always

Steps to Reproduce:
1. Perform any operation that leaves glusterd hung, so that "service glusterd status" reports "glusterd dead but pid file exists."

Actual results:
glusterd is not marked as critical. Its status shows as "ok" with the status information "process glusterd is running".

Expected results:
glusterd service should be marked critical with status information as "glusterd not running".
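
To illustrate the "dead but pid file exists" case, here is a minimal Python sketch contrasting a check that only looks for the pid file with one that also probes the recorded pid. The helper names pid_file_exists and process_alive are hypothetical and are not taken from nagios-server-addons; this is not the plugin's actual code.

```python
import errno
import os


def pid_file_exists(pid_file):
    """Weak check: only confirms that a pid file is present on disk."""
    return os.path.isfile(pid_file)


def process_alive(pid_file):
    """Stronger check: read the pid from the file and probe it with signal 0."""
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (IOError, OSError, ValueError):
        # Missing, unreadable, or malformed pid file: treat as not running.
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without delivering a signal
    except OSError as err:
        # EPERM means the process exists but is owned by another user.
        return err.errno == errno.EPERM
    return True
```

For a stale pid file, pid_file_exists returns True while process_alive returns False, which is the situation reported here. The pid file location (for example a path such as /var/run/glusterd.pid) is an assumption and should be checked on the node.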

Additional info:

Comment 2 RamaKasturi 2014-12-25 05:48:56 UTC
Attaching the sos reports link

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/rhsc/1177129/

Comment 4 Timothy Asir 2015-04-15 07:14:42 UTC
Patch sent to master: http://review.gluster.org/10246

Comment 6 RamaKasturi 2015-05-28 12:52:28 UTC
Can you please fill in the Fixed In Version (FIV) for this bug?

Comment 7 RamaKasturi 2015-06-01 10:12:40 UTC
Moving this bug back because when glusterd is stopped on the node, or when glusterd is dead but the pid file exists, the status of glusterd moves to UNKNOWN, but glusterd should be marked critical.

Comment 8 RamaKasturi 2015-06-01 10:28:01 UTC
Created attachment 1033250 [details]
Screenshot when glusterd is stopped

Comment 9 RamaKasturi 2015-06-01 10:29:17 UTC
Created attachment 1033251 [details]
Screenshot when glusterd dead but pid file exists

Comment 10 RamaKasturi 2015-06-01 10:31:58 UTC
Ignore comment 7.

Moving this bug back because when glusterd is stopped on the node, the service status moves to UNKNOWN with status information "glusterd is stopped".

When glusterd is killed on the node using kill -9 <glusterpid>, the service status is shown as WARNING with status information "gluster dead but pid file exists".

In both of the above cases, glusterd should be marked critical.
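
The requested behaviour can be sketched as a mapping from init-script status codes to Nagios states. The LSB status codes (0 running, 1 dead but pid file exists, 2 dead but lock file exists, 3 not running) and the Nagios exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) are standard; the helper nagios_state is hypothetical and is not the patch under review. The states observed above (UNKNOWN for stopped, WARNING for dead with pid file) happen to match the LSB codes 3 and 1 read directly as Nagios codes, though this report does not confirm that as the cause.

```python
# Nagios plugin exit codes
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# LSB init-script "status" exit codes
LSB_RUNNING = 0
LSB_DEAD_PID_FILE = 1
LSB_DEAD_LOCK_FILE = 2
LSB_NOT_RUNNING = 3


def nagios_state(lsb_status):
    """Map an init-script status code to a Nagios state (hypothetical helper)."""
    if lsb_status == LSB_RUNNING:
        return OK
    # Every "service is not running" variant is reported as CRITICAL.
    if lsb_status in (LSB_DEAD_PID_FILE, LSB_DEAD_LOCK_FILE, LSB_NOT_RUNNING):
        return CRITICAL
    # Anything else (for example LSB code 4, status unknown) stays UNKNOWN.
    return UNKNOWN
```

With an explicit mapping like this, both the stopped case and the kill -9 case resolve to CRITICAL instead of inheriting whatever number the status command returned.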

Comment 11 Sahina Bose 2015-06-10 13:38:47 UTC
Posted patch http://review.gluster.org/11161 to correct this

Comment 12 RamaKasturi 2015-06-18 06:54:11 UTC
Verified and works fine with build nagios-server-addons-0.2.1-2.el6rhs.noarch.

When glusterd is stopped on the node, the status of gluster Management is marked as critical with status information "Process glusterd is not running".

When glusterd is killed on the node using kill -9 <glusterpid>, the service status is shown as WARNING with status information "gluster dead but pid file exists".

Comment 13 RamaKasturi 2015-06-18 07:21:11 UTC
Ignore comment 12.

Verified and works fine with build nagios-server-addons-0.2.1-2.el6rhs.noarch.

When glusterd is stopped on the node, the status of gluster Management is marked as CRITICAL with status information "Process glusterd is not running".

When glusterd is killed on the node using kill -9 <glusterpid>, the service status is shown as CRITICAL with status information "gluster dead but pid file exists".

Comment 14 Divya 2015-07-26 05:40:12 UTC
Sahina,

Please review and sign off on the edited doc text.

Comment 15 Sahina Bose 2015-07-27 05:02:51 UTC
acked

Comment 17 errata-xmlrpc 2015-07-29 05:27:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2015-1494.html