961608 – glusterd : 'gluster volume status <vol_name>' sometimes does not show active tasks and sometimes shows tasks which are not active.

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	glusterd
Sub Component:
Version:	2.1
Hardware:	x86_64
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Kaushal
QA Contact:	amainkar
Docs Contact:
URL:
Whiteboard:
Depends On:	963541
Blocks:

Reported:	2013-05-10 05:10 UTC by Rachana Patel
Modified:	2015-04-20 11:57 UTC
CC List:	7 users
Fixed In Version:	glusterfs-3.4.0.12rhs.beta6-1
Doc Type:	Bug Fix
Doc Text:	Cause: Unknown Consequence: Fix: Unknown Result: Sometimes 'gluster volume status <vol_name>' does not show active tasks and sometimes it shows tasks which are no longer active.
Clone Of:
Clones:	963541
Environment:
Last Closed:	2013-09-23 22:39:43 UTC
Embargoed:
Dependent Products:
Flags:	sasundar: needinfo-

Description Rachana Patel 2013-05-10 05:10:58 UTC

Description of problem:
glusterd : 'gluster volume status <vol_name>' sometimes does not show active tasks and sometimes shows tasks which are not active.

It is unclear which tasks should be listed and when they should be removed.

Version-Release number of selected component (if applicable):
3.4.0.4rhs-1.el6rhs.x86_64

How reproducible:
always

Steps to Reproduce:
Observation:
1. The rebalance task is still listed after the rebalance has completed (it is no longer an active task). It is removed only when another task becomes active.

2.
a) If you run 'remove-brick start', the task is listed under active tasks until you commit it.

b) But if you run rebalance before the commit, the remove-brick task is removed from the active task list even though the user has not committed it.

3. Run 'remove-brick start' for one brick and, before committing, run 'remove-brick start' for another brick: the first task is removed from the active task list.

e.g.

1.
[root@cutlass ~]# gluster volume rebalance task start force
volume rebalance: task: success: Starting rebalance on volume task has been successful.
ID: decd28ea-d714-4be4-b3cc-0d3ef6faaa86
[root@cutlass ~]# gluster v status task
Status of volume: task
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick cutlass.lab.eng.blr.redhat.com:/rhs/brick1/ta	49155	Y	25148
Brick mia.lab.eng.blr.redhat.com:/rhs/brick1/ta		49166	Y	5156
Brick fan.lab.eng.blr.redhat.com:/rhs/brick1/ta		49171	Y	26280
Brick fred.lab.eng.blr.redhat.com:/rhs/brick1/ta	49176	Y	26342
NFS Server on localhost					2049	Y	25199
NFS Server on 81a2750a-79f6-47f1-ae9b-961aed998238	2049	Y	26352
NFS Server on 94dda48c-1c98-4d56-b8b0-59c88e299af5	2049	Y	26322
NFS Server on 4b6d57e1-7de6-40e0-b53e-ea5331aa39cc	2049	Y	5199
 
           Task                                      ID         Status
           ----                                      --         ------
      Rebalance    decd28ea-d714-4be4-b3cc-0d3ef6faaa86              3
[root@cutlass ~]# gluster volume rebalance task status
                                    Node Rebalanced-files          size       scanned      failures         status run time in secs
                               ---------      -----------   -----------   -----------   -----------   ------------   --------------
                               localhost              104       400.0KB           559             0      completed             7.00
             fred.lab.eng.blr.redhat.com                0        0Bytes           449             0      completed             6.00
              fan.lab.eng.blr.redhat.com              110      1000.0KB           550             0      completed             6.00
              mia.lab.eng.blr.redhat.com              106       100.0KB           553             0      completed             6.00
volume rebalance: task: success: 
[root@cutlass ~]# gluster v status task
Status of volume: task
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick cutlass.lab.eng.blr.redhat.com:/rhs/brick1/ta	49155	Y	25148
Brick mia.lab.eng.blr.redhat.com:/rhs/brick1/ta		49166	Y	5156
Brick fan.lab.eng.blr.redhat.com:/rhs/brick1/ta		49171	Y	26280
Brick fred.lab.eng.blr.redhat.com:/rhs/brick1/ta	49176	Y	26342
NFS Server on localhost					2049	Y	25199
NFS Server on 81a2750a-79f6-47f1-ae9b-961aed998238	2049	Y	26352
NFS Server on 4b6d57e1-7de6-40e0-b53e-ea5331aa39cc	2049	Y	5199
NFS Server on 94dda48c-1c98-4d56-b8b0-59c88e299af5	2049	Y	26322
 
           Task                                      ID         Status
           ----                                      --         ------
      Rebalance    decd28ea-d714-4be4-b3cc-0d3ef6faaa86              3


2.
 a)
[root@cutlass ~]# gluster volume remove-brick task cutlass.lab.eng.blr.redhat.com:/rhs/brick1/ta start
volume remove-brick start: success
ID: 4aa2ac47-5070-40dc-b2e8-87fddf38e7cf
[root@cutlass ~]# gluster v status task
Status of volume: task
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick cutlass.lab.eng.blr.redhat.com:/rhs/brick1/ta	49155	Y	25148
Brick mia.lab.eng.blr.redhat.com:/rhs/brick1/ta		49166	Y	5156
Brick fan.lab.eng.blr.redhat.com:/rhs/brick1/ta		49171	Y	26280
Brick fred.lab.eng.blr.redhat.com:/rhs/brick1/ta	49176	Y	26342
NFS Server on localhost					2049	Y	25199
NFS Server on 81a2750a-79f6-47f1-ae9b-961aed998238	2049	Y	26386
NFS Server on 4b6d57e1-7de6-40e0-b53e-ea5331aa39cc	2049	Y	5232
NFS Server on 94dda48c-1c98-4d56-b8b0-59c88e299af5	2049	Y	26355
 
           Task                                      ID         Status
           ----                                      --         ------
   Remove brick    4aa2ac47-5070-40dc-b2e8-87fddf38e7cf              3
[root@cutlass ~]# gluster volume remove-brick task cutlass.lab.eng.blr.redhat.com:/rhs/brick1/ta status
                                    Node Rebalanced-files          size       scanned      failures         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   ------------   --------------
                               localhost               97       700.0KB           537             0      completed             4.00
             fred.lab.eng.blr.redhat.com                0        0Bytes             0             0    not started             0.00
              fan.lab.eng.blr.redhat.com                0        0Bytes             0             0    not started             0.00
              mia.lab.eng.blr.redhat.com                0        0Bytes             0             0    not started             0.00
[root@cutlass ~]# gluster v status task
Status of volume: task
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick cutlass.lab.eng.blr.redhat.com:/rhs/brick1/ta	49155	Y	25148
Brick mia.lab.eng.blr.redhat.com:/rhs/brick1/ta		49166	Y	5156
Brick fan.lab.eng.blr.redhat.com:/rhs/brick1/ta		49171	Y	26280
Brick fred.lab.eng.blr.redhat.com:/rhs/brick1/ta	49176	Y	26342
NFS Server on localhost					2049	Y	25199
NFS Server on 81a2750a-79f6-47f1-ae9b-961aed998238	2049	Y	26386
NFS Server on 4b6d57e1-7de6-40e0-b53e-ea5331aa39cc	2049	Y	5232
NFS Server on 94dda48c-1c98-4d56-b8b0-59c88e299af5	2049	Y	26355
 
           Task                                      ID         Status
           ----                                      --         ------
   Remove brick    4aa2ac47-5070-40dc-b2e8-87fddf38e7cf              3
[root@cutlass ~]# gluster volume remove-brick task cutlass.lab.eng.blr.redhat.com:/rhs/brick1/ta commit
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit: success
[root@cutlass ~]# gluster v status task
Status of volume: task
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick mia.lab.eng.blr.redhat.com:/rhs/brick1/ta		49166	Y	5156
Brick fan.lab.eng.blr.redhat.com:/rhs/brick1/ta		49171	Y	26280
Brick fred.lab.eng.blr.redhat.com:/rhs/brick1/ta	49176	Y	26342
NFS Server on localhost					2049	Y	25293
NFS Server on 81a2750a-79f6-47f1-ae9b-961aed998238	2049	Y	26410
NFS Server on 4b6d57e1-7de6-40e0-b53e-ea5331aa39cc	N/A	N	N/A
NFS Server on 94dda48c-1c98-4d56-b8b0-59c88e299af5	2049	Y	26372
 
There are no active volume tasks

b)
[root@cutlass ~]# gluster volume remove-brick task mia.lab.eng.blr.redhat.com:/rhs/brick1/ta start
volume remove-brick start: success
ID: 38eeda20-4238-4d11-8166-bd910d0c800a
[root@cutlass ~]# gluster v status task
Status of volume: task
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick mia.lab.eng.blr.redhat.com:/rhs/brick1/ta		49166	Y	5156
Brick fan.lab.eng.blr.redhat.com:/rhs/brick1/ta		49171	Y	26280
Brick fred.lab.eng.blr.redhat.com:/rhs/brick1/ta	49176	Y	26342
NFS Server on localhost					2049	Y	25318
NFS Server on 81a2750a-79f6-47f1-ae9b-961aed998238	2049	Y	26427
NFS Server on 4b6d57e1-7de6-40e0-b53e-ea5331aa39cc	2049	Y	5249
NFS Server on 94dda48c-1c98-4d56-b8b0-59c88e299af5	2049	Y	26396
 
           Task                                      ID         Status
           ----                                      --         ------
   Remove brick    38eeda20-4238-4d11-8166-bd910d0c800a              0
[root@cutlass ~]# gluster volume rebalance task start force
volume rebalance: task: success: Starting rebalance on volume task has been successful.
ID: d722c113-dc8e-41e5-8a4c-d6897cc4d0e1
[root@cutlass ~]# gluster v status task
Status of volume: task
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick mia.lab.eng.blr.redhat.com:/rhs/brick1/ta		49166	Y	5156
Brick fan.lab.eng.blr.redhat.com:/rhs/brick1/ta		49171	Y	26280
Brick fred.lab.eng.blr.redhat.com:/rhs/brick1/ta	49176	Y	26342
NFS Server on localhost					2049	Y	25318
NFS Server on 81a2750a-79f6-47f1-ae9b-961aed998238	2049	Y	26427
NFS Server on 4b6d57e1-7de6-40e0-b53e-ea5331aa39cc	2049	Y	5249
NFS Server on 94dda48c-1c98-4d56-b8b0-59c88e299af5	2049	Y	26396
 
           Task                                      ID         Status
           ----                                      --         ------
      Rebalance    d722c113-dc8e-41e5-8a4c-d6897cc4d0e1              3



3.
[root@cutlass ~]# gluster volume remove-brick task mia.lab.eng.blr.redhat.com:/rhs/brick1/ta start
volume remove-brick start: success
ID: 07053960-a720-4220-85db-bbd130296f56
[root@cutlass ~]# gluster v status task
Status of volume: task
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick mia.lab.eng.blr.redhat.com:/rhs/brick1/ta		49166	Y	5156
Brick fan.lab.eng.blr.redhat.com:/rhs/brick1/ta		49171	Y	26280
Brick fred.lab.eng.blr.redhat.com:/rhs/brick1/ta	49176	Y	26342
NFS Server on localhost					2049	Y	25367
NFS Server on 4b6d57e1-7de6-40e0-b53e-ea5331aa39cc	2049	Y	5249
NFS Server on 81a2750a-79f6-47f1-ae9b-961aed998238	2049	Y	26462
NFS Server on 94dda48c-1c98-4d56-b8b0-59c88e299af5	2049	Y	26430
 
           Task                                      ID         Status
           ----                                      --         ------
      Rebalance    07053960-a720-4220-85db-bbd130296f56              0
[root@cutlass ~]# gluster v status task
Status of volume: task
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick mia.lab.eng.blr.redhat.com:/rhs/brick1/ta		49166	Y	5156
Brick fan.lab.eng.blr.redhat.com:/rhs/brick1/ta		49171	Y	26280
Brick fred.lab.eng.blr.redhat.com:/rhs/brick1/ta	49176	Y	26342
NFS Server on localhost					2049	Y	25367
NFS Server on 81a2750a-79f6-47f1-ae9b-961aed998238	2049	Y	26462
NFS Server on 4b6d57e1-7de6-40e0-b53e-ea5331aa39cc	2049	Y	5249
NFS Server on 94dda48c-1c98-4d56-b8b0-59c88e299af5	2049	Y	26430
 
           Task                                      ID         Status
           ----                                      --         ------
      Rebalance    07053960-a720-4220-85db-bbd130296f56              0
[root@cutlass ~]# gluster volume remove-brick task mia.lab.eng.blr.redhat.com:/rhs/brick1/ta status
                                    Node Rebalanced-files          size       scanned      failures         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   ------------   --------------
                               localhost                0        0Bytes             0             0    not started             0.00
             fred.lab.eng.blr.redhat.com                0        0Bytes             0             0    not started             0.00
              fan.lab.eng.blr.redhat.com                0        0Bytes             0             0    not started             0.00
              mia.lab.eng.blr.redhat.com                0        0Bytes           440             0      completed             1.00
[root@cutlass ~]# gluster volume remove-brick task fan.lab.eng.blr.redhat.com:/rhs/brick1/ta start
volume remove-brick start: success
ID: 17d1a925-c38e-4b16-84a2-ecf127325ded
[root@cutlass ~]# gluster v status task
Status of volume: task
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick mia.lab.eng.blr.redhat.com:/rhs/brick1/ta		49166	Y	5156
Brick fan.lab.eng.blr.redhat.com:/rhs/brick1/ta		49171	Y	26280
Brick fred.lab.eng.blr.redhat.com:/rhs/brick1/ta	49176	Y	26342
NFS Server on localhost					2049	Y	25432
NFS Server on 81a2750a-79f6-47f1-ae9b-961aed998238	2049	Y	26511
NFS Server on 94dda48c-1c98-4d56-b8b0-59c88e299af5	2049	Y	26430
NFS Server on 4b6d57e1-7de6-40e0-b53e-ea5331aa39cc	2049	Y	5348
 
           Task                                      ID         Status
           ----                                      --         ------
      Rebalance    17d1a925-c38e-4b16-84a2-ecf127325ded              0

Actual results:
'gluster volume status <vol_name>' sometimes does not show active tasks and sometimes shows tasks which are not active.

Expected results:
Behaviour should be consistent for every task that is listed under active tasks (rebalance, remove-brick, etc.).

Additional info:

Comment 3 Kaushal 2013-05-16 06:15:33 UTC

Issues 2 and 3 happen because glusterd keeps track of only one rebalance/remove-brick operation on a volume at a time. Starting another rebalance/remove-brick before committing an earlier remove-brick causes glusterd to stop tracking the earlier task and start tracking the new one. Starting a new remove-brick/rebalance task on the volume shouldn't be allowed before a pending remove-brick task is committed. I will make the necessary changes and post a patch for review.
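
The single-slot tracking described above can be illustrated with a minimal Python sketch. This is a toy model, not glusterd code: the `Volume` class, its `task` attribute and the `start_task` helper are hypothetical names chosen for illustration. The task IDs are copied from the transcript for case 3 in the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    kind: str      # "rebalance" or "remove-brick"
    task_id: str


class Volume:
    """Toy model: a volume with a single slot for a rebalance/remove-brick task."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.task: Optional[Task] = None  # only one task is tracked at a time

    def start_task(self, kind: str, task_id: str) -> None:
        # Behaviour before the fix, as described in the comment: a new task
        # overwrites the slot, so the earlier task is no longer tracked.
        self.task = Task(kind, task_id)

    def active_tasks(self) -> list:
        return [] if self.task is None else [self.task]


vol = Volume("task")
vol.start_task("remove-brick", "07053960-a720-4220-85db-bbd130296f56")
vol.start_task("remove-brick", "17d1a925-c38e-4b16-84a2-ecf127325ded")

# Only the second, still uncommitted, task remains visible.
listed = [t.task_id for t in vol.active_tasks()]
print(listed)
```

In this model the first remove-brick task disappears from the listing without ever being committed, which is the symptom reported in cases 2 b) and 3.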

Comment 4 Kaushal 2013-05-16 11:33:29 UTC

Patch under review @ https://code.engineering.redhat.com/gerrit/7666

Comment 5 Kaushal 2013-07-17 09:50:37 UTC

Patch available in builds since glusterfs-v3.4.0.12rhs.beta2

The patch makes sure that a remove-brick operation is committed before another remove-brick or rebalance operation can be started. This ensures that we don't have two uncommitted tasks active at the same time.
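
A rough Python sketch of the kind of check this patch describes, assuming one pending remove-brick slot per volume. The class, method and exception names are hypothetical and are not taken from the glusterd source. The error text is adapted from the message quoted later in this report; the task IDs are placeholders.

```python
from typing import Optional


class TaskNotCommittedError(Exception):
    """Raised when a new task is requested while a remove-brick is uncommitted."""


class Volume:
    def __init__(self, name: str) -> None:
        self.name = name
        # ID of the remove-brick task that has been started but not committed.
        self.pending_remove_brick: Optional[str] = None

    def _check_no_pending(self) -> None:
        if self.pending_remove_brick is not None:
            raise TaskNotCommittedError(
                f"A remove-brick task on volume {self.name} is not yet "
                "committed. Either commit or stop the remove-brick task."
            )

    def start_remove_brick(self, task_id: str) -> None:
        self._check_no_pending()
        self.pending_remove_brick = task_id

    def start_rebalance(self, task_id: str) -> None:
        self._check_no_pending()
        # Tracking of the rebalance task itself is omitted from this sketch.

    def commit_remove_brick(self) -> None:
        self.pending_remove_brick = None


vol = Volume("task")
vol.start_remove_brick("first-task")      # placeholder ID
try:
    vol.start_rebalance("second-task")    # rejected: earlier task uncommitted
    rejected = False
except TaskNotCommittedError:
    rejected = True

vol.commit_remove_brick()
vol.start_remove_brick("third-task")      # allowed after the commit
```

Rejecting the second operation outright, rather than tracking several tasks at once, keeps the single-slot model intact while avoiding silently dropped tasks.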

Comment 6 SATHEESARAN 2013-07-24 14:52:10 UTC

Kaushal,

I tried creating a volume (a distributed volume with 3 bricks), then tried to remove a brick from the volume, and got the following error:

[Wed Jul 24 14:45:58 UTC 2013 root.37.200:~ ] # gluster volume remove-brick dvol 10.70.37.204:/rhs/brick4/tdir3 start
volume remove-brick start: failed: Commit failed on 10.70.37.204. Error: A remove-brick task on volume dvol is not yet committed. Either commit or stop the remove-brick task.

Additional Information
======================
[Wed Jul 24 14:43:57 UTC 2013 root.37.200:~ ] # gluster volume create dvol 10.70.37.200:/rhs/brick4/tdir3 10.70.37.82:/rhs/brick4/tdir3 10.70.37.204:/rhs/brick4/tdir3
volume create: dvol: success: please start the volume to access data

[Wed Jul 24 14:45:25 UTC 2013 root.37.200:~ ] # gluster volume start dvol
volume start: dvol: success

[Wed Jul 24 14:45:40 UTC 2013 root.37.200:~ ] # gluster volume info dvol
 
Volume Name: dvol
Type: Distribute
Volume ID: e87e36d7-7261-4473-a205-f11051dd0407
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: 10.70.37.200:/rhs/brick4/tdir3
Brick2: 10.70.37.82:/rhs/brick4/tdir3
Brick3: 10.70.37.204:/rhs/brick4/tdir3

[Wed Jul 24 14:45:49 UTC 2013 root.37.200:~ ] # gluster volume status dvol
Status of volume: dvol
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.200:/rhs/brick4/tdir3                    49158   Y       23161
Brick 10.70.37.82:/rhs/brick4/tdir3                     49157   Y       20287
Brick 10.70.37.204:/rhs/brick4/tdir3                    49157   Y       19375
NFS Server on localhost                                 2049    Y       23174
NFS Server on 10.70.37.204                              2049    Y       19388
NFS Server on 10.70.37.82                               2049    Y       20300
NFS Server on 10.70.37.188                              2049    Y       18471
 
There are no active volume tasks

[Wed Jul 24 14:45:58 UTC 2013 root.37.200:~ ] # gluster volume remove-brick dvol 10.70.37.204:/rhs/brick4/tdir3 start
volume remove-brick start: failed: Commit failed on 10.70.37.204. Error: A remove-brick task on volume dvol is not yet committed. Either commit or stop the remove-brick task.

SETUP INFORMATION
=================
1. Cluster has 4 nodes, 
10.70.37.{200,82,204,188}

2. All commands are executed from 10.70.37.200

3. Attaching the sosreports to this bug; their file names are suffixed with the IP address

4. There are 4 volumes existing in this cluster
[Wed Jul 24 14:46:30 UTC 2013 root.37.200:~ ] # gluster volume list
drvol
testvol
distvol
dvol

5. dvol is the freshly created volume, as mentioned in additional information

Comment 7 SATHEESARAN 2013-07-24 14:54:55 UTC

comment6 is seen with glusterfs-3.4.0.12rhs.beta5-2

[Wed Jul 24 14:58:08 UTC 2013 root.37.200:~ ] # glusterfs -V
glusterfs 3.4.0.12rhs.beta5 built on Jul 18 2013 07:00:38
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2013 Red Hat, Inc. <http://www.redhat.com/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.

Comment 8 SATHEESARAN 2013-07-24 14:57:22 UTC

I was not able to verify this bug, as I could not perform the remove-brick operation.

Let me know whether I should open a separate bug for this issue.

Comment 9 SATHEESARAN 2013-07-24 15:07:09 UTC

(In reply to SATHEESARAN from comment #8)
> I could not able to verify this bug, as I could not able to perform
> remove-brick operation. 
> 
> Let me know, should I open a separate bug for this issue.

Apologies, the remove-brick issue has already been raised as a separate bug: https://bugzilla.redhat.com/show_bug.cgi?id=982184

That bug must be fixed before this one can be verified.

Comment 10 SATHEESARAN 2013-07-24 15:22:38 UTC

Please update the Fixed In Version field so that testing on this bug can continue.

Comment 11 SATHEESARAN 2013-08-12 11:09:22 UTC

Tested the issue with glusterfs-3.4.0.18rhs-1 and found the following:

1. A rebalance or another 'remove-brick start' operation is now not allowed before
the previous 'remove-brick start' operation is committed.

2. However, consider the case mentioned by the bug reporter (Rachana) in the description of the bug, under the heading 'Observation':

<snip>
Observation:
1. rebalance status is there even rebalance is completed (no more active task). It will be removed only if any other task becomes active!!!
</snip>

While the rebalance was in progress, I could see that the status was 3, and after the rebalance completed the status still showed 3.

[Mon Aug 12 10:18:21 UTC 2013 root.37.54:~ ] # gluster volume rebalance distvol status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost                0        0Bytes            12             0             0      completed             0.00
                            10.70.37.205                0        0Bytes            12             0             0      completed             0.00
                             10.70.37.61                1         5.0GB             9             0             0    in progress           365.00
                             10.70.37.86                0        0Bytes            12             0             0      completed             0.00
volume rebalance: distvol: success: 

[Mon Aug 12 10:24:10 UTC 2013 root.37.54:~ ] # gluster volume status distvol
Status of volume: distvol
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick 10.70.37.54:/rhs/brick4/dir1			49155	Y	21010
Brick 10.70.37.205:/rhs/brick4/dir1			49157	Y	19485
Brick 10.70.37.61:/rhs/brick4/dir1			49155	Y	16383
Brick 10.70.37.86:/rhs/brick4/newbrick			49155	Y	16676
NFS Server on localhost					2049	Y	21599
NFS Server on 10.70.37.86				2049	Y	16688
NFS Server on 10.70.37.205				2049	Y	20021
NFS Server on 10.70.37.61				2049	Y	16885
 
           Task                                      ID         Status
           ----                                      --         ------
      Rebalance    0f2df00c-764b-46a0-a112-97e93c100498              3

[Mon Aug 12 10:24:12 UTC 2013 root.37.54:~ ] # gluster volume rebalance distvol status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost                0        0Bytes            12             0             0      completed             0.00
                            10.70.37.205                0        0Bytes            12             0             0      completed             0.00
                             10.70.37.61                2        10.0GB            14             0             0      completed           386.00
                             10.70.37.86                0        0Bytes            12             0             0      completed             0.00
volume rebalance: distvol: success: 
[Mon Aug 12 10:41:44 UTC 2013 root.37.54:~ ] # gluster volume status distvol
Status of volume: distvol
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick 10.70.37.54:/rhs/brick4/dir1			49155	Y	21010
Brick 10.70.37.205:/rhs/brick4/dir1			49157	Y	19485
Brick 10.70.37.61:/rhs/brick4/dir1			49155	Y	16383
Brick 10.70.37.86:/rhs/brick4/newbrick			49155	Y	16676
NFS Server on localhost					2049	Y	21599
NFS Server on 10.70.37.61				2049	Y	16885
NFS Server on 10.70.37.205				2049	Y	20021
NFS Server on 10.70.37.86				2049	Y	16688
 
           Task                                      ID         Status
           ----                                      --         ------
      Rebalance    0f2df00c-764b-46a0-a112-97e93c100498              3


Similarly, after 'remove-brick start', volume status shows the status as 0, and it still shows 0 after the rebalance has completed.

[Mon Aug 12 10:41:46 UTC 2013 root.37.54:~ ] # gluster volume remove-brick distvol 10.70.37.61:/rhs/brick4/dir1 start
volume remove-brick start: success
ID: ab7fb1d9-7abe-447d-9240-a43fc1b4ecf5

[Mon Aug 12 10:52:02 UTC 2013 root.37.54:~ ] # gluster volume status distvol
Status of volume: distvol
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick 10.70.37.54:/rhs/brick4/dir1			49155	Y	21010
Brick 10.70.37.205:/rhs/brick4/dir1			49157	Y	19485
Brick 10.70.37.61:/rhs/brick4/dir1			49155	Y	16383
Brick 10.70.37.86:/rhs/brick4/newbrick			49155	Y	16676
NFS Server on localhost					2049	Y	21989
NFS Server on 10.70.37.205				2049	Y	20321
NFS Server on 10.70.37.61				2049	Y	16885
NFS Server on 10.70.37.86				2049	Y	16991
 
           Task                                      ID         Status
           ----                                      --         ------
   Remove brick    ab7fb1d9-7abe-447d-9240-a43fc1b4ecf5              0
------> status is shown as ZERO here <-----

[Mon Aug 12 10:52:05 UTC 2013 root.37.54:~ ] # gluster volume remove-brick distvol 10.70.37.61:/rhs/brick4/dir1 status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost                0        0Bytes             0             0             0    not started             0.00
                            10.70.37.205                0        0Bytes             0             0             0    not started             0.00
                             10.70.37.61                0        0Bytes             8             0             0    in progress            18.00
                             10.70.37.86                0        0Bytes             0             0             0    not started             0.00

[Mon Aug 12 10:52:20 UTC 2013 root.37.54:~ ] # gluster volume status distvol
Status of volume: distvol
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick 10.70.37.54:/rhs/brick4/dir1			49155	Y	21010
Brick 10.70.37.205:/rhs/brick4/dir1			49157	Y	19485
Brick 10.70.37.61:/rhs/brick4/dir1			49155	Y	16383
Brick 10.70.37.86:/rhs/brick4/newbrick			49155	Y	16676
NFS Server on localhost					2049	Y	21989
NFS Server on 10.70.37.61				2049	Y	16885
NFS Server on 10.70.37.205				2049	Y	20321
NFS Server on 10.70.37.86				2049	Y	16991
 
           Task                                      ID         Status
           ----                                      --         ------
   Remove brick    ab7fb1d9-7abe-447d-9240-a43fc1b4ecf5              0

Considering all this, the questions to be answered are:
---> What should the status field in 'gluster volume status <vol-name>' show while a rebalance is in progress and after it has completed, in these 2 cases?
1. Rebalance done after 'add-brick'
2. Rebalance caused by 'remove-brick start'

---> Will the status remain the same during the rebalance operation and after it?

Knowing the answers to the above questions would help in resolving this.

Comment 12 SATHEESARAN 2013-08-13 07:55:20 UTC

Based on the input from Kaushal in IRC chat, reproduced below, moving this bug to VERIFIED.

Verified with glusterfs-3.4.0.18rhs-1

----------------------------------------------

<kshlm> that is again because we don't have a way of synchronizing rebalance stats across all peers. you probably removed a brick on another peer and issued the command on your source peer.
<kshlm> since the rebalance process only starts on the peer from which the brick was removed for remove-brick, the output of volume status on the source peer will say 0 for remove-brick status

----------------------------------------------
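
The explanation above can be pictured with a small Python sketch in which each peer reports only its own rebalance state. This is an illustrative model, not glusterd code. The mapping of 0 to "not started" and 3 to "completed" is an assumption inferred from the transcripts in this report, where the numeric status printed by 'volume status' matches the localhost row of the corresponding status command. The peer names "A" and "B" and the function name are hypothetical.

```python
# Assumed meaning of the numeric codes, inferred from the transcripts above.
STATUS_CODES = {"not started": 0, "completed": 3}


def volume_status_code(local_peer: str, per_peer_state: dict) -> int:
    """Return the code a peer would print: it reflects only that peer's own state."""
    return STATUS_CODES[per_peer_state[local_peer]]


# remove-brick of a brick hosted on peer "B", with the command issued from
# peer "A": the rebalance process runs only on "B", so "A" stays "not started".
state = {"A": "not started", "B": "completed"}

code_on_source = volume_status_code("A", state)
code_on_brick_peer = volume_status_code("B", state)
print(code_on_source, code_on_brick_peer)
```

Under these assumptions the peer that issued the command keeps printing 0 even after the peer hosting the brick has finished, because the per-peer states are never merged.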

Comment 13 Scott Haines 2013-09-23 22:39:43 UTC

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html
