Bug 1454602 - Rebalance estimate time sometimes shows negative values
Summary: Rebalance estimate time sometimes shows negative values
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: distribute
Version: rhgs-3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: RHGS 3.3.0
Assignee: Nithya Balachandran
QA Contact: Ambarish
URL:
Whiteboard:
Depends On:
Blocks: 1417151 1457985 1460894 1460914 1475399
Reported: 2017-05-23 07:17 UTC by Prasad Desala
Modified: 2017-09-21 04:58 UTC

Fixed In Version: glusterfs-3.8.4-36
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1457985
Environment:
Last Closed: 2017-09-21 04:45:37 UTC
Embargoed:


Links
System ID: Red Hat Product Errata RHBA-2017:2774
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: glusterfs bug fix and enhancement update
Last Updated: 2017-09-21 08:16:29 UTC

Description Prasad Desala 2017-05-23 07:17:05 UTC
Description of problem:
=======================
On a CIFS mount with a dataset of empty directories and directories containing files, I started removing a few bricks. When I issued the remove-brick status command, the rebalance estimated time showed negative values.

I issued the status command about 21 times during the remove-brick rebalance, and every time it showed negative values. On the 22nd attempt, the rebalance estimated time showed positive values (at that point, the rebalance had been running for almost 24 minutes).

[root@dhcp47-127 samba]# gluster v remove-brick distrep 10.70.47.127:/bricks/brick6/b6 10.70.46.181:/bricks/brick6/b6 10.70.46.47:/bricks/brick6/b6 10.70.47.140:/bricks/brick6/b6 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                2         9.5KB             6             0             0            completed        0:15:16
       dhcp46-181.lab.eng.blr.redhat.com                0        0Bytes             0             0             0          in progress        0:21:32
        dhcp46-47.lab.eng.blr.redhat.com                0        0Bytes             0             0             0          in progress        0:00:00
       dhcp47-140.lab.eng.blr.redhat.com                0        0Bytes             0             0             0          in progress        0:21:21
Estimated time left for rebalance to complete : 2023406814:-21:-32
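Editorial note: in the output above, the minute and second fields (-21:-32) are the negation of the longest node run time (0:21:32). The sketch below shows one way such fields can arise. It assumes, without confirmation from this report, that a negative number of seconds is split into h:m:s using C-style truncating division. The function name is hypothetical, this is not the glusterfs code, and it does not explain the very large hours field.

```python
from typing import Tuple


def split_hms_truncating(total_seconds: int) -> Tuple[int, int, int]:
    """Split seconds into (h, m, s) the way C integer division would.

    C truncates toward zero, so a negative input yields negative
    minute and second fields. Python's // and % floor instead, so the
    truncation is emulated here by working on the magnitude.
    """
    sign = -1 if total_seconds < 0 else 1
    magnitude = abs(total_seconds)
    hours = sign * (magnitude // 3600)
    minutes = sign * ((magnitude % 3600) // 60)
    seconds = sign * (magnitude % 60)
    return hours, minutes, seconds


# Assumption for illustration: the value being split is minus the
# elapsed run time of 0:21:32, which is 21 * 60 + 32 = 1292 seconds.
print(split_hms_truncating(-1292))
```

Under that assumption, the minute and second fields come out as -21 and -32, matching the last two fields of the reported line.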


Version-Release number of selected component (if applicable):
3.8.4-25.el7rhgs.x86_64

How reproducible:
=================
1/1

Steps to Reproduce:
===================
1) Create a distributed-replicate volume and start it.
2) Mount the volume on a client over CIFS.
3) Create a dataset of empty directories and directories containing files.
4) Remove a few bricks.
5) Run the remove-brick status command repeatedly and check the "Estimated time left for rebalance to complete" output.
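
Editorial note: the check in step 5 can be automated. The following is a minimal sketch that scans captured status output for the "Estimated time left for rebalance to complete" line and flags negative fields. The function names are hypothetical, and the line format is taken only from the output quoted in this report; capturing the output from the gluster CLI is left to the caller.

```python
import re
from typing import Optional, Tuple

# Line format as it appears in the status output quoted in this report.
ETA_RE = re.compile(
    r"Estimated time left for rebalance to complete\s*:\s*"
    r"(-?\d+):(-?\d+):(-?\d+)"
)


def parse_estimate(line: str) -> Optional[Tuple[int, int, int]]:
    """Return (hours, minutes, seconds) from an estimate line, else None."""
    match = ETA_RE.search(line)
    if match is None:
        return None
    hours, minutes, seconds = (int(group) for group in match.groups())
    return hours, minutes, seconds


def has_negative_field(status_output: str) -> bool:
    """True if any estimate line in the output has a negative field."""
    for line in status_output.splitlines():
        fields = parse_estimate(line)
        if fields is not None and any(field < 0 for field in fields):
            return True
    return False


sample = "Estimated time left for rebalance to complete : 2023406814:-21:-32"
print(parse_estimate(sample), has_negative_field(sample))
```

Output without an estimate line is treated as having no negative fields.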

Actual results:
===============
Rebalance estimate time sometimes shows negative values.

Expected results:
=================
Rebalance estimate time should not show negative values.

Comment 7 Atin Mukherjee 2017-06-01 17:04:12 UTC
upstream patch : https://review.gluster.org/17448

Comment 10 Ambarish 2017-06-16 12:04:15 UTC
As noted in https://bugzilla.redhat.com/show_bug.cgi?id=1462181, I still see negative values for the rebalance ETA on 3.8.4-28, just before the rebalance fails:

[root@gqas013 glusterfs]# gluster v rebalance testvol  status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0          in progress        0:00:00
      gqas005.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress        0:00:00
      gqas006.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress        0:00:00
      gqas008.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress        0:00:00
volume rebalance: testvol: success
[root@gqas013 glusterfs]# gluster v rebalance testvol  status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0          in progress        0:00:03
      gqas005.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress        0:00:00
      gqas006.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress        0:00:00
      gqas008.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress        0:00:00
Estimated time left for rebalance to complete : 2023406815:00:-3
volume rebalance: testvol: success
[root@gqas013 glusterfs]# 
[root@gqas013 glusterfs]# 
[root@gqas013 glusterfs]# gluster v rebalance testvol  status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0          in progress        0:00:06
      gqas005.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress        0:00:01
      gqas006.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress        0:00:01
      gqas008.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress        0:00:01
Estimated time left for rebalance to complete : 2023406815:00:-1
volume rebalance: testvol: success
[root@gqas013 glusterfs]# gluster v rebalance testvol  status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0          in progress        0:00:08
      gqas005.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress        0:00:03
      gqas006.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress        0:00:03
      gqas008.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress        0:00:03
Estimated time left for rebalance to complete : 2023406815:00:-3
volume rebalance: testvol: success
[root@gqas013 glusterfs]# gluster v rebalance testvol  status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0          in progress        0:00:09
      gqas005.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress        0:00:04
      gqas006.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress        0:00:04
      gqas008.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress        0:00:04
Estimated time left for rebalance to complete : 2023406815:00:-4
volume rebalance: testvol: success
[root@gqas013 glusterfs]# gluster v rebalance testvol  status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0          in progress        0:00:10
      gqas005.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress        0:00:05
      gqas006.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress        0:00:05
      gqas008.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress        0:00:05
Estimated time left for rebalance to complete : 2023406815:00:-5
volume rebalance: testvol: success

rpm -qa|grep glus
glusterfs-3.8.4-28.el7rhgs.x86_64


I am moving this back to Dev for a relook.

Comment 15 Atin Mukherjee 2017-06-19 06:38:57 UTC
upstream patch : https://review.gluster.org/#/c/17564/

Comment 17 Nithya Balachandran 2017-06-22 06:03:28 UTC
Rebalance status will no longer show the estimate if the rebalance process cannot calculate it. This can happen when the rebalance process is unable to determine the rate at which files are being processed (for example, before a failure, as in the test in comment 10).
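
Editorial note: the behaviour described in this comment can be sketched as a guard around the estimate. This is an illustration only, not the glusterfs implementation. The function names, parameters, and the simple files-per-second rate model are all assumptions.

```python
from typing import Optional


def estimate_seconds_left(files_processed: int, total_files: int,
                          elapsed_secs: int) -> Optional[int]:
    """Return the estimated seconds remaining, or None if unknown.

    Assumed model: files are processed at a constant rate equal to
    files_processed / elapsed_secs.
    """
    # No rate can be derived until some files have been processed
    # over a non-zero interval, so report "unknown" instead of guessing.
    if files_processed <= 0 or elapsed_secs <= 0 or total_files <= 0:
        return None
    remaining = max(total_files - files_processed, 0)
    # remaining / rate, kept in integer arithmetic.
    return remaining * elapsed_secs // files_processed


def format_estimate(seconds_left: Optional[int]) -> str:
    """Format the estimate line, or return '' when it must be omitted."""
    if seconds_left is None or seconds_left < 0:
        return ""
    hours, rest = divmod(seconds_left, 3600)
    minutes, seconds = divmod(rest, 60)
    return ("Estimated time left for rebalance to complete : "
            f"{hours}:{minutes:02d}:{seconds:02d}")


# Nothing processed yet: the estimate line is suppressed.
print(repr(format_estimate(estimate_seconds_left(0, 100, 3))))
# 10 of 100 files in 5 seconds: 90 files remain at 2 files per second.
print(format_estimate(estimate_seconds_left(10, 100, 5)))
```

Returning None for an unknown rate keeps the "cannot calculate" case separate from a genuine estimate of zero, so the caller can omit the line entirely.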

Comment 22 Atin Mukherjee 2017-07-24 14:31:27 UTC
upstream patch : https://review.gluster.org/17863

Comment 23 Atin Mukherjee 2017-07-26 15:04:12 UTC
upstream 3.12 patch : https://review.gluster.org/17882
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/113576

Comment 25 Ambarish 2017-08-09 11:19:01 UTC
Neither Prasad nor I could hit it in our testing on the latest downstream bits.

I am moving this BZ to Verified.

I will reopen it if I hit it again later.

Comment 27 errata-xmlrpc 2017-09-21 04:45:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774
