Bug 1154316 - DHT-Rebalance:-Rebalance running on a node will never receive stop command if glusterd has been restarted on the same node while rebalance is in progress
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: distribute
Version: 2.1
Hardware: x86_64
OS: Linux
unspecified
medium
Target Milestone: ---
: ---
Assignee: Bug Updates Notification Mailing List
QA Contact: shylesh
URL:
Whiteboard:
Depends On:
Blocks: 1286100
Reported: 2014-10-18 17:38 UTC by shylesh
Modified: 2015-11-27 10:48 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1286100 (view as bug list)
Environment:
Last Closed: 2015-11-27 10:48:23 UTC
Embargoed:



Description shylesh 2014-10-18 17:38:23 UTC
Description of problem:
If glusterd is restarted on a node while rebalance is in progress, a subsequent rebalance stop command does not stop the rebalance process on that node.

Version-Release number of selected component (if applicable):
3.4.0.69rhs-1.el6rhs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a distributed-replicate (dist-rep) volume.
2. Create some data, then add bricks to the volume (add-brick).
3. Start rebalance.
4. While rebalance is in progress, restart glusterd on one of the nodes.
5. Once glusterd has restarted, stop the rebalance.
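
Steps 3 to 5 can be sketched as a small script. This is only an illustration, not the reporter's exact procedure: it assumes the volume (named distrep, as in the output below) already exists and has had bricks added, and the helper names `run` and `repro_stop_after_restart` are hypothetical. With DRY_RUN=1 (the default here) the commands are only printed, not executed.

```shell
#!/bin/sh
# Reproduction sketch for steps 3-5. Assumes the volume already exists and
# add-brick has been done. VOL and DRY_RUN are illustrative settings.
VOL="${VOL:-distrep}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repro_stop_after_restart() {
    run gluster volume rebalance "$VOL" start    # step 3
    run gluster volume rebalance "$VOL" status
    run service glusterd restart                 # step 4, on this node
    run gluster volume rebalance "$VOL" stop     # step 5
    run gluster volume rebalance "$VOL" status   # check per-node status
}
```

Calling `repro_stop_after_restart` with DRY_RUN=0 on a node that is running rebalance would execute the sequence for real; the final status call is where the restarted node would be expected to still show "in progress".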


Actual results:
On the node where glusterd was restarted, the rebalance process does not stop: it never receives the stop command and eventually runs to completion. In the output below, localhost still reports "in progress" after the stop command.
 
[root@rhs-client4 mnt]# gluster v rebalance distrep start
volume rebalance: distrep: success: Starting rebalance on volume distrep has been successful.
ID: 5e0229e9-ae13-4da5-8701-21486de873cd
[root@rhs-client4 mnt]# gluster v rebalance distrep status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes          2342             0             0          in progress              10.00
     rhs-client39.lab.eng.blr.redhat.com                0        0Bytes          2468             0             0          in progress              10.00
      rhs-gp-srv2.lab.eng.blr.redhat.com              249         1.8MB          2092             0             0          in progress              10.00
volume rebalance: distrep: success:
[root@rhs-client4 mnt]# service glusterd restart
Stopping glusterd:                                         [  OK  ]
Starting glusterd:                                         [  OK  ]
[root@rhs-client4 mnt]# gluster v rebalance distrep status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0          in progress               0.00
     rhs-client39.lab.eng.blr.redhat.com              163       954.1KB          5376             0             0          in progress              30.00
      rhs-gp-srv2.lab.eng.blr.redhat.com              733         4.5MB          4746             0             0          in progress              31.00
volume rebalance: distrep: success:
[root@rhs-client4 mnt]# gluster v rebalance distrep stop
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0          in progress               0.00
     rhs-client39.lab.eng.blr.redhat.com              200       986.8KB          5608             0             0              stopped              34.00
      rhs-gp-srv2.lab.eng.blr.redhat.com              854        12.4MB          5203             0             0              stopped              34.00
volume rebalance: distrep: success: rebalance process may be in the middle of a file migration.
The process will be fully stopped once the migration of the file is complete.
Please check rebalance process for completion before doing any further brick related tasks on the volume.
[root@rhs-client4 mnt]# gluster v rebalance distrep status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0          in progress               0.00
     rhs-client39.lab.eng.blr.redhat.com              201       986.9KB          5608             3             0              stopped              34.00
      rhs-gp-srv2.lab.eng.blr.redhat.com              854        12.4MB          5203             2             0              stopped              34.00
volume rebalance: distrep: success:
[root@rhs-client4 mnt]# gluster v rebalance distrep stop
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0          in progress               0.00
     rhs-client39.lab.eng.blr.redhat.com              201       986.9KB          5608             3             0              stopped              34.00
      rhs-gp-srv2.lab.eng.blr.redhat.com              854        12.4MB          5203             2             0              stopped              34.00
volume rebalance: distrep: success: rebalance process may be in the middle of a file migration.
The process will be fully stopped once the migration of the file is complete.
Please check rebalance process for completion before doing any further brick related tasks on the volume.
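
To check for the symptom without reading the table by eye, the status output can be filtered for nodes that still report "in progress" after a stop. The following is a minimal sketch that assumes the column layout shown in the transcript above (the status column is the last field group before the run time); other releases may format the table differently. The function name `nodes_in_progress` is hypothetical.

```shell
# Reads `gluster volume rebalance <vol> status` output on stdin and prints
# the node name of every row whose status column reads "in progress".
# A row with that two-word status has exactly 9 whitespace-separated fields.
nodes_in_progress() {
    awk 'NF == 9 && $(NF-2) == "in" && $(NF-1) == "progress" { print $1 }'
}
```

For example, `gluster volume rebalance distrep status | nodes_in_progress` run against the final table above would print only `localhost`, the node on which glusterd was restarted.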

Comment 2 Susant Kumar Palai 2015-11-27 10:48:23 UTC
Cloning to 3.1. To be fixed in a future release.

