Bug 1020331 - The output of command "gluster volume status all tasks --xml" and "gluster volume remove-brick <vol> <brick> status --xml" not in agreement
Summary: The output of command "gluster volume status all tasks --xml" and "gluster volume remove-brick <vol> <brick> status --xml" not in agreement
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: RHGS 2.1.2
Assignee: Kaushal
QA Contact: SATHEESARAN
URL:
Whiteboard:
Depends On: 1027094
Blocks: 1015659 1020189 1020325 1021816 1022511
 
Reported: 2013-10-17 13:05 UTC by Shubhendu Tripathi
Modified: 2015-05-13 16:34 UTC (History)
CC List: 6 users

Fixed In Version: glusterfs-3.4.0.42.1u2rhs-1
Doc Type: Bug Fix
Doc Text:
Previously, the remove-brick status displayed by the volume status command was inconsistent on different peers, whereas the remove-brick status command displayed a consistent output. With this fix, the status displayed by the volume status command and the remove-brick status command is consistent across the cluster.
Clone Of:
: 1027094 (view as bug list)
Environment:
Last Closed: 2014-02-25 07:54:34 UTC
Embargoed:


Attachments: None


Links
System: Red Hat Product Errata
ID: RHEA-2014:0208
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat Storage 2.1 enhancement and bug fix update #2
Last Updated: 2014-02-25 12:20:30 UTC

Description Shubhendu Tripathi 2013-10-17 13:05:02 UTC
Description of problem:
VDSM has a verb to get the list of tasks on a volume. Internally it:

1. Uses the command "gluster volume remove-brick <vol> <brick> status --xml" to get the status of the remove-brick action on a brick.
2. Uses the command "gluster volume status all tasks --xml" to get the overall status of the tasks.

If there are two hosts in a cluster and a remove-brick action is in progress while the overall task status is being fetched, the output of the command "gluster volume status all tasks --xml" differs between the two hosts.
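
For illustration only, the two queries look roughly like this (the volume name and brick path below are placeholders, not taken from this report):

# run on each host in the cluster:
gluster volume remove-brick testvol server-1:/bricks/b2 status --xml
gluster volume status all tasks --xml
# Before the fix, the task status reported by the second command could differ
# between the two hosts, even though the first command agreed everywhere.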

Version-Release number of selected component (if applicable):


How reproducible:
Almost always

Steps to Reproduce:
1. Make sure two hosts are present in the peer group
2. Create a distributed volume with 2 bricks (brick dirs from server-1 only)
3. Populate the volume with data
4. Start a remove-brick operation for one of the bricks of the volume
5. Run the command "gluster volume status all tasks --xml" on each host individually (see the example commands below)
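
For reference, a command-level sketch of the steps above (host names, volume name, and brick directories are placeholders, not taken from this report):

# on server-1:
gluster peer probe server-2
gluster volume create distvol server-1:/bricks/b1 server-1:/bricks/b2
gluster volume start distvol
# populate the volume through a fuse mount, then:
gluster volume remove-brick distvol server-1:/bricks/b2 start
# run separately on server-1 and on server-2 and compare the task status:
gluster volume status all tasks --xml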

Actual results:
The status values returned on the hosts differ

Expected results:
Both hosts should return the same status value

Additional info:

Comment 2 Dusmant 2013-10-18 05:02:48 UTC
This bug is blocking the RHSC remove-brick feature, which is showing inconsistent information because of this issue. We need a fix for this ASAP.

Thanks,
-Dusmant

Comment 3 Ramesh N 2013-10-21 11:39:15 UTC
The same scenario applies to the volume rebalance task.

Comment 5 SATHEESARAN 2013-12-23 07:11:00 UTC
Verified with glusterfs-3.4.0.51rhs.el6rhs

Now the "remove-brick" and "rebalance status" information obtained using "gluster volume status all --xml", is uniform across all RHSS Nodes in the "Trusted Storage Pool"

Performed the following steps to verify this bug:
1. Created a trusted storage pool of 4 RHSS Nodes
(i.e) gluster peer probe <RHSS-NODE-IP>

2. Created a distribute-replicate volume of 6 bricks (3x2)
(i.e) gluster volume create <vol-name> replica 2 <brick1>..<brick6>

3. Start the volume
(i.e) gluster volume start <vol-name>

4. Fuse mount the volume 
(i.e) mount.glusterfs <RHSS-NODE>:<vol-name> <mount-point>

5. Created some files on the mount point
(i.e) for i in {1..200}; do dd if=/dev/urandom of=<mount-point>/file$i bs=4k count=1000;done

6. Add a pair of bricks to the volume
(i.e) gluster volume add-brick <vol-name> <brick1> <brick2>

7. Start rebalance on the volume
(i.e) gluster volume rebalance <vol-name> start

8. Get the status of all volumes using --xml
(i.e) gluster volume status all --xml

9. Get the status on all RHSS Nodes, (i.e) repeat step 8 on all RHSS Nodes

Observation: Rebalance status was consistent across all the nodes

10. Now, remove a pair of bricks from the volume
(i.e) gluster volume remove-brick <vol-name> <brick1> <brick2> start

11. Repeat steps 8 and 9

Observation: remove-brick status was consistent across all the RHSS Nodes
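
A quick way to compare the XML task status across nodes (node names are placeholders; this assumes the task details sit under a <tasks> element in the --xml output and that xmllint is installed):

for node in rhss-1 rhss-2 rhss-3 rhss-4; do
    ssh $node 'gluster volume status all --xml' | xmllint --xpath '//tasks' - > /tmp/tasks-$node.xml
done
# With the fix, the extracted <tasks> sections should be identical on every node:
diff /tmp/tasks-rhss-1.xml /tmp/tasks-rhss-2.xml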

Comment 6 Pavithra 2014-01-03 11:04:10 UTC
Kaushal, I've made minor changes. Please verify.

Comment 7 Kaushal 2014-01-16 11:02:29 UTC
Doc text looks fine.

Comment 9 errata-xmlrpc 2014-02-25 07:54:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-0208.html

