Description of problem:
=================
When a rebalance is in progress, the detailed per-node status is displayed:

[root@rhs-gp-srv11 glusterfs]# gluster v rebal ctime-distrep-rebal status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
     rhs-gp-srv13.lab.eng.blr.redhat.com                0        0Bytes             0             0             0          in progress        0:00:00
     rhs-gp-srv16.lab.eng.blr.redhat.com             6744        73.7MB         48580             0             0          in progress        0:04:41
                               localhost             6209        97.5MB         45174             0             0          in progress        0:04:41
The estimated time for rebalance to complete will be unavailable for the first 10 minutes.
volume rebalance: ctime-distrep-rebal: success

However, after one of the nodes is rebooted, none of this information is shown; the command prints only:

[root@rhs-gp-srv11 glusterfs]# gluster v rebal ctime-distrep-rebal status
volume rebalance: ctime-distrep-rebal: success

This is a problem if a user wants to know exactly which files have been rebalanced and how many have failed.

Version-Release number of selected component (if applicable):
=============
6.0.17

How reproducible:
=============
Consistent

Steps to Reproduce (a command sketch follows at the end of this report):
1. Create a 3x3 distributed-replicate volume.
2. Run some I/O from a client.
3. Issue a remove-brick to shrink the volume to 2x3.
4. While the rebalance is in progress, reboot one of the nodes.

Actual results:
================
Rebalance status does not show the detailed per-node information.

Expected results:
=============
The detailed information is needed even if a node is rebooted.

Additional info:
================
Not seeing the rebalance details is frustrating, especially when the other node has gone down during maintenance. Also, if the rebalance completes but some files failed to migrate during this window (i.e., while a node was rebooted), the user cannot tell whether the rebalance succeeded completely without looking into the logs.
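A minimal reproduction sketch for the steps above, assuming a three-node cluster; the hostnames (server1..server3), brick paths, and mount point are placeholders, not the actual layout from this report:

  # 1. Create and start a 3x3 distributed-replicate volume
  #    (three replica sets of three bricks each)
  gluster volume create ctime-distrep-rebal replica 3 \
      server{1..3}:/bricks/brick1 server{1..3}:/bricks/brick2 server{1..3}:/bricks/brick3
  gluster volume start ctime-distrep-rebal

  # 2. Mount the volume on a client and generate some I/O
  mount -t glusterfs server1:/ctime-distrep-rebal /mnt/gluster
  for i in $(seq 1 10000); do echo data > /mnt/gluster/file.$i; done

  # 3. Shrink to 2x3 by removing one replica set; this starts migrating
  #    files off the removed bricks
  gluster volume remove-brick ctime-distrep-rebal server{1..3}:/bricks/brick3 start

  # 4. While the migration is in progress, reboot one of the nodes, then
  #    query the status from a surviving node
  gluster v rebal ctime-distrep-rebal status

For a remove-brick operation, the per-node migration table can also be queried with 'gluster volume remove-brick <vol> <bricks> status'; the sketch uses the rebalance form of the status command, matching the output shown in the report.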
On the basis of the output mentioned in comment #9, I can see that when one of the nodes is rebooted, the information for the remaining two nodes is still available, which satisfies the expected result ("need detailed info even if a node is rebooted"). Hence, moving the bug to the VERIFIED state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0288