Bug 1109752

Summary: [New] - Geo-Replication status says "warning" and status information says "georeplication status could not be determined - Another transaction is in progress for vol_repmaster.Please try again after some time"
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: RamaKasturi <knarra>
Component: gluster-nagios-addons
Assignee: Sahina Bose <sabose>
Status: CLOSED ERRATA
QA Contact: RamaKasturi <knarra>
Severity: medium
Docs Contact:
Priority: high
Version: rhgs-3.0
CC: asrivast, dpati, esammons, nsathyan, psriniva, rhs-bugs, rhsc-qe-bugs, rnachimu, sabose, sgraf
Target Milestone: ---
Keywords: ZStream
Target Release: RHGS 3.0.3
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: gluster-nagios-addons-0.1.11-1.el6rhs
Doc Type: Bug Fix
Doc Text:
Previously the Geo-replication status plugin displayed a Warning state when the Red Hat Storage volume was locked due to another volume operation. With this fix, when a volume is locked, the command is executed again after a wait time. If the error message persists, the status plugin displays the state as unknown.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-01-15 13:48:21 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description RamaKasturi 2014-06-16 10:12:51 UTC
Description of problem:
Geo-Replication status says "warning" and status information says "georeplication status could not be determined - Another transaction is in progress for vol_repmaster.Please try again after some time"

Version-Release number of selected component (if applicable):
nagios-server-addons-0.1.3-3.el6rhs.x86_64
gluster-nagios-common-0.1.3-1.el6rhs.x86_64

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 RamaKasturi 2014-06-16 10:13:34 UTC
Inconsistently reproducible.

Comment 6 RamaKasturi 2014-10-17 05:41:12 UTC
I understand from Kanagaraj that these bugs have been moved to ON_QA by errata.

Since QE has not yet received the build, I am moving this bug back to the ASSIGNED state. Please move it to ON_QA once builds are attached to the errata.

Comment 9 RamaKasturi 2014-11-17 09:13:22 UTC
Sahina,

I see that the status and status information are displayed as below:

status - UNKNOWN

status Information - temporary error. Another transaction is in progress for <vol_name>. Please try again after some time.

Is this the expected status information? 

AFAIK, status information should say "temporary error". Could you please confirm this?

Comment 10 Sahina Bose 2014-11-18 09:42:41 UTC
With the fix, this error should not be returned frequently, because the fix retries the geo-replication status command on other nodes in the Gluster cluster until all nodes are exhausted.

Could you check with a cluster of more than 4 nodes on the master side and see how frequently you run into this?
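
A rough illustration of the "try the other nodes" idea described in this comment, not the actual gluster-nagios-addons code. The function name, the `run_status` callable, and its `(returncode, output)` return shape are all hypothetical; only the lock error text comes from this report.

```python
# Substring of the error text quoted in this bug report.
LOCK_ERROR = "Another transaction is in progress"


def status_from_any_node(nodes, run_status):
    """Ask each node in turn for the geo-replication status.

    run_status(node) is a hypothetical callable returning
    (returncode, output). The first success is returned. A node is
    skipped only when it reports the volume-lock error; any other
    failure is returned as-is (an assumption made for this sketch).
    Returns None if no nodes were given.
    """
    last = None
    for node in nodes:
        rc, out = run_status(node)
        last = (node, rc, out)
        if rc == 0 or LOCK_ERROR not in out:
            return last
    # Every node reported the lock error: all nodes are exhausted.
    return last
```

With more master nodes there are more chances that one of them answers while the lock is released, which is why a larger cluster is expected to hit the error less often.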

Comment 11 RamaKasturi 2014-11-24 09:13:14 UTC
I can still see the geo-replication service status go to 'UNKNOWN' with status information 'UNKNOWN- temporary error. Another transaction is in progress for <vol_name>. Please try again after some time.'

Moving this bug back to assigned.

Comment 12 Sahina Bose 2014-11-25 07:23:26 UTC
http://review.gluster.org/#/c/9192/ - patch submitted to introduce a sleep of 2 seconds when the volume is locked and "transaction in progress" errors are returned.
This is not foolproof, but it reduces the chance of such errors in the Nagios plugins.
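
A minimal sketch of the wait-and-retry behaviour, under stated assumptions: this is not the submitted patch, and `check_with_wait`, `run_status`, and the `retries` parameter are hypothetical names. The 2-second wait and the "temporary error" wording come from this report; the numeric codes are the standard Nagios plugin return codes. Reporting a successful command as OK and any other failure as WARNING is a simplification for the example.

```python
import time

# Standard Nagios plugin return codes.
NAGIOS_OK, NAGIOS_WARNING, NAGIOS_CRITICAL, NAGIOS_UNKNOWN = 0, 1, 2, 3

# Substring of the error text quoted in this bug report.
LOCK_ERROR = "Another transaction is in progress"


def check_with_wait(run_status, retries=1, wait_seconds=2, sleep=time.sleep):
    """Run the status command, waiting and retrying while the volume is locked.

    run_status() is a hypothetical callable returning (returncode, output).
    `sleep` is injectable so the wait can be stubbed out in tests.
    """
    rc, out = run_status()
    attempts = 0
    while rc != 0 and LOCK_ERROR in out and attempts < retries:
        sleep(wait_seconds)          # give the other volume operation time to finish
        rc, out = run_status()
        attempts += 1
    if rc == 0:
        return NAGIOS_OK, out        # simplification: no parsing of session state
    if LOCK_ERROR in out:
        # The lock persisted: report UNKNOWN rather than WARNING.
        return NAGIOS_UNKNOWN, "UNKNOWN - temporary error. " + out
    return NAGIOS_WARNING, out       # assumption: other failures stay WARNING
```

The wait only narrows the window in which the lock error can surface. If another volume operation holds the lock for longer than the total wait, the plugin still ends in the UNKNOWN state.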

Comment 14 RamaKasturi 2014-12-02 05:42:01 UTC
Moving this bug back because the geo-rep status is reported as warning with the status information "null".

Comment 15 Sahina Bose 2014-12-02 08:40:06 UTC
http://review.gluster.org/#/c/9226/ - fixing the issue with string comparison

Comment 17 RamaKasturi 2014-12-05 08:29:08 UTC
Verified and works fine with build nagios-server-addons-0.1.11-1.el6rhs.noarch.

I have not seen the issue with status as warning and status information as null. I will reopen this if I see it again.

Comment 18 Pavithra 2014-12-17 06:33:20 UTC
Hi Sahina,

Can you please review the edited doc text for technical accuracy and sign off?

Comment 19 Sahina Bose 2014-12-24 09:16:33 UTC
Minor edit done - signing off

Comment 21 errata-xmlrpc 2015-01-15 13:48:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0039.html