Bug 1028878 - "volume status" for single brick fails if brick is not on the server where peer command was issued.
"volume status" for single brick fails if brick is not on the server where pe...
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterd
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: RHGS 3.0.0
Assigned To: Kaushal
QA Contact: SATHEESARAN
Depends On: 888752
Blocks:
Reported: 2013-11-11 01:14 EST by Kaushal
Modified: 2015-05-13 13:01 EDT
CC List: 7 users

See Also:
Fixed In Version: glusterfs-3.6.0.5
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 888752
Environment:
Last Closed: 2014-09-22 15:29:32 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Kaushal 2013-11-11 01:14:25 EST
+++ This bug was initially created as a clone of Bug #888752 +++

For "volume status" commands, the source glusterd depends on some keys to be set in the context dictionary to modify and merge the replies sent by other peers. These keys are set in the commit-op on the source.  
But in "volume status" for a single brick, with the brick on another peer, the commit-op finishes without setting the required keys, which prevents the replies from other peers from being merged properly and causes the command to fail.

--- Additional comment from Vijay Bellur on 2012-12-27 13:10:56 IST ---

CHANGE: http://review.gluster.org/4347 (glusterd: "volume status" for remote brick fails on cli.) merged in master by Vijay Bellur (vbellur@redhat.com)

--- Additional comment from Jules Wang on 2013-01-29 12:50:27 IST ---

please set bug status to MODIFIED

--- Additional comment from Shireesh on 2013-03-08 15:56:33 IST ---

Moving back to ASSIGNED based on the following discussion.

-------- Original Message --------
Subject: 	Re: Bug 888752 - "volume status" for single brick fails if brick is not on the server where peer command was issued.
Date: 	Thu, 7 Mar 2013 08:38:27 -0500 (EST)
From: 	Kaushal M <kaushal@redhat.com>
To: 	Shireesh Anjal <sanjal@redhat.com>
CC: 	Prasanth <pprakash@redhat.com>, Satheesaran Sundaramoorthi <sasundar@redhat.com>, Raghavendra Talur <rtalur@redhat.com>, Sahina Bose <sabose@redhat.com>, Shruti Sampat <ssampat@redhat.com>, Dustin Tsang <dtsang@redhat.com>, Matthew Mahoney <mmahoney@redhat.com>, Vijay Bellur <vbellur@redhat.com>


Found the problem. 

TL;DR: I didn't do the correct check for when to expect tasks.

The normal output and XML output code always expect tasks to be present when the normal status is requested, be it for the whole volume, just a brick, or the nfs/shd processes. The XML generation code fails when it doesn't find tasks. The tasks are added only by the glusterd on the system where the command is issued, which doesn't happen when we request the status of a brick on another machine or the status of a specific process. Ideally, the code should expect tasks only when the normal full status command is issued. In the normal CLI output we don't consider the failure of the task output code, which allows the command to complete successfully.

As a workaround you could:
i.  Use the full status command instead.
ii. Run the command on the system with the brick.
I know these are not ideal. I'll try to have a fix for this soon.

- Kaushal

----- Original Message -----
> From: "Shireesh Anjal" <sanjal@redhat.com>
> To: "Vijay Bellur" <vbellur@redhat.com>
> Cc: "Prasanth" <pprakash@redhat.com>, "Satheesaran Sundaramoorthi" <sasundar@redhat.com>, "Kaushal M"
> <kaushal@redhat.com>, "Raghavendra Talur" <rtalur@redhat.com>, "Sahina Bose" <sabose@redhat.com>, "Shruti Sampat"
> <ssampat@redhat.com>, "Dustin Tsang" <dtsang@redhat.com>, "Matthew Mahoney" <mmahoney@redhat.com>
> Sent: Thursday, March 7, 2013 6:01:15 PM
> Subject: Re: Bug 888752 - "volume status" for single brick fails if brick is not on the server where peer command was
> issued.
> 
... 
> Trying to prepare for tomorrow's demo, I also hit this issue with
> upstream (glusterfs-3.4.0alpha2). The problem is that it exits with
> return code 2 when we pass "--xml". It seems to work fine when
> "--xml" is not passed.
> 
> [root@localhost vdsm]# gluster volume status test1 10.70.37.127:/tmp/test1
> Status of volume: test1
> Gluster process                                         Port   Online  Pid
> ------------------------------------------------------------------------------
> Brick 10.70.37.127:/tmp/test1                           49152  Y       20878
> 
> [root@localhost vdsm]# gluster volume status test1 10.70.37.127:/tmp/test1 --xml
> [root@localhost vdsm]# echo $?
> 2
> [root@localhost vdsm]#
> 
> You can check this on 10.70.37.124.
>
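
To illustrate the two workarounds suggested in the reply above, using the names from this report (the ssh invocation is only one hypothetical way of running the command on the brick's node):

# i.  Request the full volume status instead of a single remote brick:
gluster volume status test1 --xml

# ii. Run the single-brick status on the node that actually hosts the brick:
ssh root@10.70.37.127 'gluster volume status test1 10.70.37.127:/tmp/test1 --xml'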

--- Additional comment from Anand Avati on 2013-07-10 18:20:09 IST ---

REVIEW: http://review.gluster.org/5308 (cli,glusterd: Fix when tasks are shown in 'volume status') posted (#2) for review on master by Kaushal M (kaushal@redhat.com)

--- Additional comment from Anand Avati on 2013-07-15 11:05:47 IST ---

REVIEW: http://review.gluster.org/5308 (cli,glusterd: Fix when tasks are shown in 'volume status') posted (#3) for review on master by Kaushal M (kaushal@redhat.com)

--- Additional comment from Anand Avati on 2013-08-03 18:21:54 IST ---

COMMIT: http://review.gluster.org/5308 committed in master by Anand Avati (avati@redhat.com) 
------
commit 572f5f0a85c97a4f90a33be87b96368a0d7e7a8e
Author: Kaushal M <kaushal@redhat.com>
Date:   Wed Jul 10 18:10:49 2013 +0530

    cli,glusterd: Fix when tasks are shown in 'volume status'
    
    Asynchronous tasks are shown in 'volume status' only for a normal volume
    status request for either all volumes or a single volume.
    
    Change-Id: I9d47101511776a179d213598782ca0bbdf32b8c2
    BUG: 888752
    Signed-off-by: Kaushal M <kaushal@redhat.com>
    Reviewed-on: http://review.gluster.org/5308
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
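
Behaviour implied by this commit, as a hedged sketch (volume and brick names are hypothetical; exact output wording may vary between releases):

# Plain status of a whole volume (or of all volumes) still lists asynchronous
# tasks, e.g. in the "Task Status" section of the output.
gluster volume status testvol

# Status of a single remote brick or of a specific process no longer expects
# tasks, so it succeeds and the --xml form exits 0 instead of 2.
gluster volume status testvol server1:/bricks/testvol-b1 --xml; echo $?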
Comment 1 Kaushal 2013-11-11 01:15:47 EST
The upstream fix was backported to downstream as a dependency for bug-1020331. Downstream review happened at
https://code.engineering.redhat.com/gerrit/15274
Comment 2 Kaushal 2013-11-11 01:16:18 EST
Moved to ON_QA by mistake.
Comment 6 SATHEESARAN 2014-07-15 00:58:22 EDT
Tested with glusterfs-3.6.0.24-1.el6rhs using the following steps:

1. Created a trusted storage pool with 2 RHSS nodes
2. Created bricks on them (both meeting the snapshot requirement and without it)
3. From the node that does not host the brick, created a Distributed volume with a single brick on the other node
4. Executed the 'gluster volume status <vol-name>' command
5. Executed the 'gluster volume status <vol-name> --xml' command
6. Started the volume
7. Repeated the status commands from steps 4 and 5

No problems found.
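
The verification above, expressed as an approximate command sequence (node names, brick path and volume name are illustrative; the brick/LVM setup of step 2 is omitted):

# On rhss1, which will not host the brick; rhss2 hosts the brick.
gluster peer probe rhss2                                   # step 1: form the trusted storage pool
gluster volume create distvol rhss2:/rhs/brick1/distvol    # step 3: single brick on the other node
gluster volume status distvol                              # step 4
gluster volume status distvol --xml; echo $?               # step 5: check the exit status
gluster volume start distvol                               # step 6
gluster volume status distvol                              # step 7: repeat the status checks
gluster volume status distvol --xml; echo $?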
Comment 8 errata-xmlrpc 2014-09-22 15:29:32 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1278.html
