Bug 1303028

Summary: Tiering status and rebalance status stop getting updated
Product: [Community] GlusterFS
Reporter: Mohammed Rafi KC <rkavunga>
Component: glusterd
Assignee: Mohammed Rafi KC <rkavunga>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: medium
Docs Contact:
Priority: medium
Version: mainline
CC: amukherj, bugs, hgowtham, jbyers, nchilaka, storage-qa-internal
Target Milestone: ---
Keywords: Triaged, ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1302968
: 1311041 (view as bug list)
Environment:
Last Closed: 2016-06-16 13:56:06 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1302968
Bug Blocks: 1303269, 1310972, 1311041

Description Mohammed Rafi KC 2016-01-29 10:57:37 UTC
+++ This bug was initially created as a clone of Bug #1302968 +++

On my 16-node setup, after about a day, 3 nodes in the rebalance status showed the elapsed time reset to zero. After another 4-5 hours, the timers on all nodes stopped ticking except one node, which kept ticking continuously.
Hence the promote/demote and scanned-files stats have stopped getting updated.


[root@dhcp37-202 ~]# gluster v rebal nagvol status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                2        0Bytes         35287             0             0          in progress           29986.00
                            10.70.37.195                0        0Bytes         35281             0             0          in progress           29986.00
                            10.70.35.155                0        0Bytes         35003             0             0          in progress           29986.00
                            10.70.35.222                0        0Bytes         35002             0             0          in progress           29986.00
                            10.70.35.108                0        0Bytes             0             0             0          in progress           29985.00
                             10.70.35.44                0        0Bytes             0             0             0          in progress           29986.00
                             10.70.35.89                0        0Bytes             0             0             0          in progress          146477.00
                            10.70.35.231                0        0Bytes             0             0             0          in progress           29986.00
                            10.70.35.176                0        0Bytes         35487             0             0          in progress           29986.00
                            10.70.35.232                0        0Bytes             0             0             0          in progress               0.00
                            10.70.35.173                0        0Bytes             0             0             0          in progress               0.00
                            10.70.35.163                0        0Bytes         35314             0             0          in progress           29986.00
                            10.70.37.101                0        0Bytes             0             0             0          in progress               0.00
                             10.70.37.69                0        0Bytes         35385             0             0          in progress           29986.00
                             10.70.37.60                0        0Bytes         35255             0             0          in progress           29986.00
                            10.70.37.120                0        0Bytes         35250             0             0          in progress           29986.00
volume rebalance: nagvol: success
[root@dhcp37-202 ~]# 
[root@dhcp37-202 ~]# 
[root@dhcp37-202 ~]# gluster v rebal nagvol status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                2        0Bytes         35287             0             0          in progress           29986.00
                            10.70.37.195                0        0Bytes         35281             0             0          in progress           29986.00
                            10.70.35.155                0        0Bytes         35003             0             0          in progress           29986.00
                            10.70.35.222                0        0Bytes         35002             0             0          in progress           29986.00
                            10.70.35.108                0        0Bytes             0             0             0          in progress           29985.00
                             10.70.35.44                0        0Bytes             0             0             0          in progress           29986.00
                             10.70.35.89                0        0Bytes             0             0             0          in progress          146488.00
                            10.70.35.231                0        0Bytes             0             0             0          in progress           29986.00
                            10.70.35.176                0        0Bytes         35487             0             0          in progress           29986.00
                            10.70.35.232                0        0Bytes             0             0             0          in progress               0.00
                            10.70.35.173                0        0Bytes             0             0             0          in progress               0.00
                            10.70.35.163                0        0Bytes         35314             0             0          in progress           29986.00
                            10.70.37.101                0        0Bytes             0             0             0          in progress               0.00
                             10.70.37.69                0        0Bytes         35385             0             0          in progress           29986.00
                             10.70.37.60                0        0Bytes         35255             0             0          in progress           29986.00
                            10.70.37.120                0        0Bytes         35250             0             0          in progress           29986.00





Also, the tier status shows as below:
[root@dhcp37-202 ~]# gluster v  tier nagvol status
Node                 Promoted files       Demoted files        Status              
---------            ---------            ---------            ---------           
localhost            0                    0                    in progress         
10.70.37.195         0                    0                    in progress         
10.70.35.155         0                    0                    in progress         
10.70.35.222         0                    0                    in progress         
10.70.35.108         0                    0                    in progress         
10.70.35.44          0                    0                    in progress         
10.70.35.89          0                    0                    in progress         
10.70.35.231         0                    0                    in progress         
10.70.35.176         0                    0                    in progress         
10.70.35.232         0                    0                    in progress         
10.70.35.173         0                    0                    in progress         
10.70.35.163         0                    0                    in progress         
10.70.37.101         0                    0                    in progress         
10.70.37.69          0                    0                    in progress         
10.70.37.60          0                    0                    in progress         
10.70.37.120         0                    0                    in progress         
Tiering Migration Functionality: nagvol: success






-> I was running some I/O, but not very heavy
-> Also, there was an NFS problem reported w.r.t. music files: they stopped playing with permission denied
-> I saw file promotes happening
-> Also, glusterd was restarted on only one of the nodes in the last 2 days




glusterfs-client-xlators-3.7.5-17.el7rhgs.x86_64
glusterfs-server-3.7.5-17.el7rhgs.x86_64
gluster-nagios-addons-0.2.5-1.el7rhgs.x86_64
vdsm-gluster-4.16.30-1.3.el7rhgs.noarch
glusterfs-3.7.5-17.el7rhgs.x86_64
glusterfs-api-3.7.5-17.el7rhgs.x86_64
glusterfs-cli-3.7.5-17.el7rhgs.x86_64
glusterfs-geo-replication-3.7.5-17.el7rhgs.x86_64
glusterfs-debuginfo-3.7.5-17.el7rhgs.x86_64
gluster-nagios-common-0.2.3-1.el7rhgs.noarch
python-gluster-3.7.5-16.el7rhgs.noarch
glusterfs-libs-3.7.5-17.el7rhgs.x86_64
glusterfs-fuse-3.7.5-17.el7rhgs.x86_64
glusterfs-rdma-3.7.5-17.el7rhgs.x86_64




sosreports will be attached

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-01-29 02:45:42 EST ---

This bug is automatically being proposed for the current z-stream release of Red Hat Gluster Storage 3 by setting the release flag 'rhgs-3.1.z' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

Comment 1 Vijay Bellur 2016-01-29 10:57:58 UTC
REVIEW: http://review.gluster.org/13319 (glusterd/rebalance: initialize defrag variable after glusterd restart) posted (#1) for review on master by mohammed rafi  kc (rkavunga)

Comment 2 Vijay Bellur 2016-01-29 17:35:09 UTC
REVIEW: http://review.gluster.org/13319 (glusterd/rebalance: initialize defrag variable after glusterd restart) posted (#2) for review on master by mohammed rafi  kc (rkavunga)

Comment 3 Vijay Bellur 2016-01-30 08:35:01 UTC
REVIEW: http://review.gluster.org/13319 (glusterd/rebalance: initialize defrag variable after glusterd restart) posted (#3) for review on master by mohammed rafi  kc (rkavunga)

Comment 4 Vijay Bellur 2016-01-31 17:51:07 UTC
REVIEW: http://review.gluster.org/13319 (glusterd/rebalance: initialize defrag variable after glusterd restart) posted (#4) for review on master by mohammed rafi  kc (rkavunga)

Comment 5 Vijay Bellur 2016-02-22 11:26:39 UTC
REVIEW: http://review.gluster.org/13319 (glusterd/rebalance: initialize defrag variable after glusterd restart) posted (#5) for review on master by mohammed rafi  kc (rkavunga)

Comment 6 Vijay Bellur 2016-02-23 05:42:08 UTC
COMMIT: http://review.gluster.org/13319 committed in master by Atin Mukherjee (amukherj) 
------
commit a67331f3f79e827ffa4f7a547f6898e12407bbf9
Author: Mohammed Rafi KC <rkavunga>
Date:   Fri Jan 29 16:24:02 2016 +0530

    glusterd/rebalance: initialize defrag variable after glusterd restart
    
    During rebalance restart after glusterd has restarted, glusterd
    does not connect to the rebalance process because the defrag
    variable in volinfo is NULL.
    
    Initializing the variable allows the RPC connection to be established.
    
    Change-Id: Id820cad6a3634a9fc976427fbe1c45844d3d4b9b
    BUG: 1303028
    Signed-off-by: Mohammed Rafi KC <rkavunga>
    Reviewed-on: http://review.gluster.org/13319
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Dan Lambright <dlambrig>
    CentOS-regression: Gluster Build System <jenkins.com>
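As a rough illustration of the pattern the commit message describes: after a glusterd restart the in-memory defrag handle is NULL even though a rebalance task is still recorded as in progress, so the reconnect path is silently skipped and per-node stats stop updating. The sketch below uses simplified stand-in types (`volinfo_t`, `defrag_info_t`, and both function names are hypothetical, not the actual glusterd structures or symbols):

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for glusterd's volinfo/defrag state. */
typedef struct { int connected; } defrag_info_t;

typedef struct {
    int rebalance_in_progress;   /* persisted on disk, survives a restart */
    defrag_info_t *defrag;       /* in-memory only; NULL after a restart  */
} volinfo_t;

/* Before the fix: when defrag is NULL, the reconnect is silently skipped,
 * so rebalance/tier status counters for that node stop ticking. */
static int reconnect_rebalance_buggy(volinfo_t *vol) {
    if (!vol->defrag)
        return 0;
    vol->defrag->connected = 1;
    return 1;
}

/* After the fix: initialize the defrag state for a task that is still
 * in progress, then establish the RPC connection. */
static int reconnect_rebalance_fixed(volinfo_t *vol) {
    if (!vol->defrag && vol->rebalance_in_progress)
        vol->defrag = calloc(1, sizeof(*vol->defrag));
    if (!vol->defrag)
        return 0;
    vol->defrag->connected = 1;
    return 1;
}
```

This mirrors the symptom in the report: nodes whose glusterd had restarted showed "run time in secs" stuck at 0.00 because the reconnect never happened, while untouched nodes kept ticking.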

Comment 7 Niels de Vos 2016-06-16 13:56:06 UTC
This bug is being closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user