Bug 1221476 - Data Tiering: rebalance fails on a tiered volume
Summary: Data Tiering: rebalance fails on a tiered volume
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: tiering
Version: 3.7.0
Hardware: Unspecified
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Assignee: Mohammed Rafi KC
QA Contact: bugs@gluster.org
URL:
Whiteboard:
Depends On: 1205624 1229236
Blocks: qe_tracker_everglades glusterfs-tiering-supportability 1227188 1229269 1260923 1273726 1274411
 
Reported: 2015-05-14 06:48 UTC by Mohammed Rafi KC
Modified: 2015-10-30 17:32 UTC

Fixed In Version: glusterfs-3.7.1
Doc Type: Bug Fix
Doc Text:
Clone Of: 1205624
Clones: 1227188
Environment:
Last Closed: 2015-06-02 08:02:43 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Mohammed Rafi KC 2015-05-14 06:48:32 UTC
+++ This bug was initially created as a clone of Bug #1205624 +++

Description of problem:
=======================
The rebalance operation fails on a tiered volume.
The same operation succeeds on a regular (non-tiered) volume.


Version-Release number of selected component (if applicable):
============================================================
3.7 upstream nightlies build http://download.gluster.org/pub/gluster/glusterfs/nightly/glusterfs/epel-6-x86_64/glusterfs-3.7dev-0.777.git2308c07.autobuild/


How reproducible:
=================
Reproduced twice on a tiered volume.


Steps to Reproduce:
==================
1. Create a gluster volume (I created a distribute volume) and start it.
2. Create some files on the volume.
3. Attach a tier to the volume using attach-tier.
4. Run a rebalance on the tiered volume using "gluster v rebalance <vol> start".
5. Check the status of the rebalance.
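
The steps above can be sketched as a small Python helper that only builds the command lines and does not run them. The volume name and brick strings are placeholders, and the argument order for attach-tier (volume name followed by the hot-tier bricks) is an assumption about the 3.7 CLI. Step 2 (creating files) needs a client mount and is left out.

```python
def reproduction_commands(volume, cold_bricks, hot_bricks):
    """Return the gluster CLI invocations for steps 1 and 3-5 as argv lists.

    Nothing is executed here. Brick strings are "host:/path" placeholders,
    and the attach-tier argument order is an assumption, not taken from
    this report.
    """
    return [
        # Step 1: create and start a plain distribute volume.
        ["gluster", "volume", "create", volume, *cold_bricks],
        ["gluster", "volume", "start", volume],
        # Step 3: attach the hot tier.
        ["gluster", "volume", "attach-tier", volume, *hot_bricks],
        # Steps 4 and 5: start the rebalance, then query its status.
        ["gluster", "volume", "rebalance", volume, "start"],
        ["gluster", "volume", "rebalance", volume, "status"],
    ]


if __name__ == "__main__":
    for argv in reproduction_commands(
        "testvol", ["host1:/bricks/b1", "host2:/bricks/b1"], ["host1:/bricks/hb1"]
    ):
        print(" ".join(argv))
```

On a disposable test cluster, each list could be passed to subprocess.run() in order.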


Actual results:
===============
The rebalance fails on every node, as shown below, even though the final line of the CLI output reports success:
[root@rhs-client44 glusterfs]# gluster v rebalance nag_vol2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0               failed               0.00
                            rhs-client38                0        0Bytes             0             0             0               failed               0.00
                            rhs-client37                0        0Bytes             0             0             0               failed               0.00
volume rebalance: nag_vol2: success: 
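
Because the last line of the output says "success" while every node row says "failed", a script that checks only the final line would miss the failure. The following is a minimal Python sketch for reading the per-node status column instead. The column layout is assumed from the table above, and the function names are illustrative, not part of any gluster tooling.

```python
SAMPLE_STATUS = """\
        Node Rebalanced-files    size  scanned  failures  skipped  status  run time in secs
   ---------      -----------  ------  -------  --------  -------  ------  ----------------
   localhost                0  0Bytes        0         0        0  failed  0.00
rhs-client38                0  0Bytes        0         0        0  failed  0.00
rhs-client37                0  0Bytes        0         0        0  failed  0.00
volume rebalance: nag_vol2: success:
"""


def parse_rebalance_status(output):
    """Extract node, status and run time from 'rebalance status' output."""
    rows = []
    for line in output.splitlines():
        tokens = line.split()
        # Data rows have at least 8 columns and a numeric file count in the
        # second column; the header, separator and summary lines do not.
        if len(tokens) < 8 or not tokens[1].isdigit():
            continue
        rows.append({
            "node": tokens[0],
            # The status may span several words, so take everything between
            # the six leading columns and the trailing run time.
            "status": " ".join(tokens[6:-1]),
            "run_time_secs": float(tokens[-1]),
        })
    return rows


def failed_nodes(output):
    """Return the names of nodes whose status column reads 'failed'."""
    return [r["node"] for r in parse_rebalance_status(output) if r["status"] == "failed"]


if __name__ == "__main__":
    print(failed_nodes(SAMPLE_STATUS))
```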


Expected results:
================
Rebalance should succeed on a tiered volume too.


Additional info (CLI logs):
===============

[root@rhs-client44 glusterfs]# tail -f nag_vol2-rebalance.log 
-------------------------------------------------------------
[2015-03-25 10:45:33.631882] I [MSGID: 100030] [glusterfsd.c:2288:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7dev (args: /usr/sbin/glusterfs -s localhost --volfile-id rebalance/nag_vol2 --xlator-option *dht.use-readdirp=yes --xlator-option *dht.lookup-unhashed=yes --xlator-option *dht.assert-no-child-down=yes --xlator-option *replicate*.data-self-heal=off --xlator-option *replicate*.metadata-self-heal=off --xlator-option *replicate*.entry-self-heal=off --xlator-option *replicate*.readdir-failover=off --xlator-option *dht.readdir-optimize=on --xlator-option *tier-dht.xattr-name=trusted.tier-gfid --xlator-option *dht.rebalance-cmd=1 --xlator-option *dht.node-uuid=1327654c-0521-46f8-8be3-b0f9c183d137 --socket-file /var/run/gluster/gluster-rebalance-4f00d705-0ab4-4a6e-8605-15493153db76.sock --pid-file /var/lib/glusterd/vols/nag_vol2/rebalance/1327654c-0521-46f8-8be3-b0f9c183d137.pid -l /var/log/glusterfs/nag_vol2-rebalance.log)
[2015-03-25 10:45:33.642596] I [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-03-25 10:45:38.631172] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht: adding option 'node-uuid' for volume 'tier-dht' with value '1327654c-0521-46f8-8be3-b0f9c183d137'
[2015-03-25 10:45:38.631207] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht: adding option 'rebalance-cmd' for volume 'tier-dht' with value '1'
[2015-03-25 10:45:38.631222] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht: adding option 'xattr-name' for volume 'tier-dht' with value 'trusted.tier-gfid'
[2015-03-25 10:45:38.631244] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht: adding option 'readdir-optimize' for volume 'tier-dht' with value 'on'
[2015-03-25 10:45:38.631259] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht: adding option 'assert-no-child-down' for volume 'tier-dht' with value 'yes'
[2015-03-25 10:45:38.631272] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht: adding option 'lookup-unhashed' for volume 'tier-dht' with value 'yes'
[2015-03-25 10:45:38.631288] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht: adding option 'use-readdirp' for volume 'tier-dht' with value 'yes'
[2015-03-25 10:45:38.631300] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-hot-dht: adding option 'node-uuid' for volume 'nag_vol2-hot-dht' with value '1327654c-0521-46f8-8be3-b0f9c183d137'
[2015-03-25 10:45:38.631323] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-hot-dht: adding option 'rebalance-cmd' for volume 'nag_vol2-hot-dht' with value '1'
[2015-03-25 10:45:38.631337] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-hot-dht: adding option 'readdir-optimize' for volume 'nag_vol2-hot-dht' with value 'on'
[2015-03-25 10:45:38.631354] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-hot-dht: adding option 'assert-no-child-down' for volume 'nag_vol2-hot-dht' with value 'yes'
[2015-03-25 10:45:38.631367] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-hot-dht: adding option 'lookup-unhashed' for volume 'nag_vol2-hot-dht' with value 'yes'
[2015-03-25 10:45:38.631380] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-hot-dht: adding option 'use-readdirp' for volume 'nag_vol2-hot-dht' with value 'yes'
[2015-03-25 10:45:38.631397] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-cold-dht: adding option 'node-uuid' for volume 'nag_vol2-cold-dht' with value '1327654c-0521-46f8-8be3-b0f9c183d137'
[2015-03-25 10:45:38.631415] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-cold-dht: adding option 'rebalance-cmd' for volume 'nag_vol2-cold-dht' with value '1'
[2015-03-25 10:45:38.631427] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-cold-dht: adding option 'readdir-optimize' for volume 'nag_vol2-cold-dht' with value 'on'
[2015-03-25 10:45:38.631443] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-cold-dht: adding option 'assert-no-child-down' for volume 'nag_vol2-cold-dht' with value 'yes'
[2015-03-25 10:45:38.631455] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-cold-dht: adding option 'lookup-unhashed' for volume 'nag_vol2-cold-dht' with value 'yes'
[2015-03-25 10:45:38.631471] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-cold-dht: adding option 'use-readdirp' for volume 'nag_vol2-cold-dht' with value 'yes'
[2015-03-25 10:45:38.632109] I [dht-shared.c:340:dht_init_regex] 0-tier-dht: using regex rsync-hash-regex = ^\.(.+)\.[^.]+$
[2015-03-25 10:45:38.633278] W [options.c:1193:xlator_option_init_int32] 0-tier-dht: unknown option: write-freq-threshold
[2015-03-25 10:45:38.633313] E [xlator.c:426:xlator_init] 0-tier-dht: Initialization of volume 'tier-dht' failed, review your volfile again
[2015-03-25 10:45:38.633326] E [graph.c:322:glusterfs_graph_init] 0-tier-dht: initializing translator failed
[2015-03-25 10:45:38.633336] E [graph.c:661:glusterfs_graph_activate] 0-graph: init failed
[2015-03-25 10:45:38.633716] W [glusterfsd.c:1212:cleanup_and_exit] (--> 0-: received signum (0), shutting down
##########################################################################################################################################################
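
The rebalance log above is mostly informational lines. Filtering it down to warnings and errors shows the sequence that ends the process: an unknown option warning on tier-dht followed by the translator init failure. A minimal Python sketch of such a filter follows. The line format in the regular expression is inferred from this excerpt only, and the names are illustrative.

```python
import re

# Assumed line shape, inferred from the excerpt above:
# [timestamp] LEVEL [optional MSGID] [file:line:function] message
LOG_LINE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} [\d:.]+)\] "
    r"(?P<level>[A-Z]) "
    r"(?:\[MSGID: \d+\] )?"
    r"\[(?P<source>[^\]]+)\] "
    r"(?P<message>.*)$"
)

SAMPLE_LOG = "\n".join([
    "[2015-03-25 10:45:38.631471] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-cold-dht: adding option 'use-readdirp' for volume 'nag_vol2-cold-dht' with value 'yes'",
    "[2015-03-25 10:45:38.633278] W [options.c:1193:xlator_option_init_int32] 0-tier-dht: unknown option: write-freq-threshold",
    "[2015-03-25 10:45:38.633313] E [xlator.c:426:xlator_init] 0-tier-dht: Initialization of volume 'tier-dht' failed, review your volfile again",
    "[2015-03-25 10:45:38.633326] E [graph.c:322:glusterfs_graph_init] 0-tier-dht: initializing translator failed",
])


def problem_lines(log_text, levels=("W", "E")):
    """Return (timestamp, level, source, message) for lines at the given levels."""
    found = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        # Lines that do not fit the assumed shape are skipped, not guessed at.
        if match and match.group("level") in levels:
            found.append((
                match.group("ts"),
                match.group("level"),
                match.group("source"),
                match.group("message"),
            ))
    return found


if __name__ == "__main__":
    for ts, level, source, message in problem_lines(SAMPLE_LOG):
        print(level, source, message)
```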


[root@rhs-client44 glusterfs]#tail -f etc-glusterfs-glusterd.vol.log 
---------------------------------------------------------------------

[2015-03-25 10:45:33.548238] I [glusterd-utils.c:8923:glusterd_generate_and_set_task_id] 0-management: Generated task-id e8891b0d-3861-4104-bc96-1510aceed88d for key rebalance-id
[2015-03-25 10:45:38.626851] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2015-03-25 10:45:38.634730] W [socket.c:642:__socket_rwv] 0-management: readv on /var/run/gluster/gluster-rebalance-4f00d705-0ab4-4a6e-8605-15493153db76.sock failed (No data available)
[2015-03-25 10:45:38.730494] I [MSGID: 106007] [glusterd-rebalance.c:173:__glusterd_defrag_notify] 0-management: Rebalance process for volume nag_vol2 has disconnected.
[2015-03-25 10:45:38.730534] I [mem-pool.c:557:mem_pool_destroy] 0-management: size=588 max=0 total=0
[2015-03-25 10:45:38.730550] I [mem-pool.c:557:mem_pool_destroy] 0-management: size=124 max=0 total=0
[2015-03-25 10:45:43.733289] E [glusterd-utils.c:8078:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index
[2015-03-25 10:45:43.746431] E [glusterd-utils.c:8078:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index
[2015-03-25 10:45:48.840974] I [glusterd-handler.c:3970:__glusterd_handle_status_volume] 0-management: Received status volume req for volume nag_vol2
##########################################################################################################################################################

[root@rhs-client44 glusterfs]# tail -f cli.log 
----------------------------------------------
[2015-03-25 10:45:33.543436] I [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-03-25 10:45:33.543559] I [socket.c:2409:socket_event_handler] 0-transport: disconnecting now
[2015-03-25 10:45:36.412879] I [socket.c:2409:socket_event_handler] 0-transport: disconnecting now
[2015-03-25 10:45:39.413296] I [socket.c:2409:socket_event_handler] 0-transport: disconnecting now
[2015-03-25 10:45:42.413729] I [socket.c:2409:socket_event_handler] 0-transport: disconnecting now
[2015-03-25 10:45:43.879079] I [input.c:36:cli_batch] 0-: Exiting with: 0
[2015-03-25 10:45:48.838839] I [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-03-25 10:45:48.839012] I [socket.c:2409:socket_event_handler] 0-transport: disconnecting now
[2015-03-25 10:45:48.851709] I [input.c:36:cli_batch] 0-: Exiting with: 0




[root@rhs-client44 glusterfs]# gluster v info vol1
 
Volume Name: vol1
Type: Tier
Volume ID: 3382e788-ee37-4d6c-b214-8469ca68e376
Status: Started
Number of Bricks: 5 x 1 = 5
Transport-type: tcp
Bricks:
Brick1: rhs-client37:/pavanbrick2/vol1_hot/hb2
Brick2: rhs-client44:/pavanbrick2/vol1_hot/hb2
Brick3: rhs-client44:/pavanbrick1/vol1/b1
Brick4: rhs-client38:/pavanbrick1/vol1/b1
Brick5: rhs-client37:/pavanbrick1/vol1/b1
[root@rhs-client44 glusterfs]# gluster v rebalance start vol1
Usage: volume rebalance <VOLNAME> {{fix-layout start} | {start [force]|stop|status}}
[root@rhs-client44 glusterfs]# gluster v rebalance vol1 start
volume rebalance: vol1: success: Rebalance on vol1 has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: 368050e1-75ab-4332-83c1-f5fb7c4fea41

[root@rhs-client44 glusterfs]# gluster v rebalance vol1 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0               failed               0.00
                            rhs-client38                0        0Bytes             0             0             0               failed               0.00
                            rhs-client37                0        0Bytes             0             0             0               failed               0.00
volume rebalance: vol1: success: 
[root@rhs-client44 glusterfs]# gluster v info nag_vol2
 
Volume Name: nag_vol2
Type: Tier
Volume ID: 4f00d705-0ab4-4a6e-8605-15493153db76
Status: Started
Number of Bricks: 5 x 1 = 5
Transport-type: tcp
Bricks:
Brick1: rhs-client37:/pavanbrick2/nag_vol2
Brick2: rhs-client44:/pavanbrick2/nag_vol2/hb1
Brick3: rhs-client44:/pavanbrick1/nag_vol2/b1
Brick4: rhs-client37:/pavanbrick1/nag_vol2/b1
Brick5: rhs-client38:/pavanbrick1/nag_vol2/b1
[root@rhs-client44 glusterfs]# gluster v status nag_vol2
Status of volume: nag_vol2
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick rhs-client37:/pavanbrick2/nag_vol2    49157     0          Y       32000
Brick rhs-client44:/pavanbrick2/nag_vol2/hb
1                                           49157     0          Y       32707
Brick rhs-client44:/pavanbrick1/nag_vol2/b1 49156     0          Y       32535
Brick rhs-client37:/pavanbrick1/nag_vol2/b1 49156     0          Y       31885
Brick rhs-client38:/pavanbrick1/nag_vol2/b1 49155     0          Y       625  
NFS Server on localhost                     N/A       N/A        N       N/A  
NFS Server on rhs-client38                  N/A       N/A        N       N/A  
NFS Server on rhs-client37                  N/A       N/A        N       N/A  
 
Task Status of Volume nag_vol2
------------------------------------------------------------------------------
Task                 : Rebalance           
ID                   : e8891b0d-3861-4104-bc96-1510aceed88d
Status               : failed              
 
[root@rhs-client44 glusterfs]# gluster v rebalance nag_vol2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0               failed               0.00
                            rhs-client38                0        0Bytes             0             0             0               failed               0.00
                            rhs-client37                0        0Bytes             0             0             0               failed               0.00
volume rebalance: nag_vol2: success: 
[root@rhs-client44 glusterfs]# 
[root@rhs-client44 glusterfs]# 
[root@rhs-client44 glusterfs]# 
[root@rhs-client44 glusterfs]# gluster --version
glusterfs 3.7dev built on Mar 24 2015 01:04:20
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.
[root@rhs-client44 glusterfs]# gluster pool lsit
unrecognized word: lsit (position 1)
[root@rhs-client44 glusterfs]# gluster pool list
UUID					Hostname    	State
0f5fa6d4-8545-41ec-8f5e-9612fa72262a	rhs-client38	Connected 
456e0cc9-e2fc-44fb-a4ff-aec8fe60cba2	rhs-client37	Connected 
1327654c-0521-46f8-8be3-b0f9c183d137	localhost   	Connected

--- Additional comment from Dan Lambright on 2015-04-21 14:44:55 EDT ---

We do not support rebalance with tiered volumes. You need to detach the tier, then rebalance the volume. The CLI should probably say this, and the rebalance command should fail gracefully.

--- Additional comment from Anand Avati on 2015-04-23 07:00:02 EDT ---

REVIEW: http://review.gluster.org/10349 (tiering: Do not allow some operations on tiered volume) posted (#1) for review on master by mohammed rafi  kc (rkavunga)

--- Additional comment from Dan Lambright on 2015-04-24 07:52:36 EDT ---



--- Additional comment from Anand Avati on 2015-04-30 02:15:32 EDT ---

REVIEW: http://review.gluster.org/10349 (tiering: Do not allow some operations on tiered volume) posted (#2) for review on master by mohammed rafi  kc (rkavunga)

--- Additional comment from Anand Avati on 2015-05-05 02:57:13 EDT ---

REVIEW: http://review.gluster.org/10349 (tiering: Do not allow some operations on tiered volume) posted (#3) for review on master by mohammed rafi  kc (rkavunga)

--- Additional comment from Anand Avati on 2015-05-05 09:28:58 EDT ---

REVIEW: http://review.gluster.org/10349 (tiering: Do not allow some operations on tiered volume) posted (#4) for review on master by mohammed rafi  kc (rkavunga)

--- Additional comment from Anand Avati on 2015-05-06 03:25:04 EDT ---

REVIEW: http://review.gluster.org/10349 (tiering: Do not allow some operations on tiered volume) posted (#5) for review on master by mohammed rafi  kc (rkavunga)

Comment 1 Anand Avati 2015-05-14 06:50:47 UTC
REVIEW: http://review.gluster.org/10774 (tiering: Do not allow some operations on tiered volume) posted (#1) for review on release-3.7 by mohammed rafi  kc (rkavunga)

Comment 2 Anand Avati 2015-05-28 07:30:40 UTC
COMMIT: http://review.gluster.org/10774 committed in release-3.7 by Krishnan Parthasarathi (kparthas) 
------
commit d6effb1fb232266863eaee5d66c903b0eb623a1a
Author: Mohammed Rafi KC <rkavunga>
Date:   Thu Apr 23 16:24:43 2015 +0530

    tiering: Do not allow some operations on tiered volume
    
            Back port of http://review.gluster.org/10349
    
    Some operations like add-brick,remove-brick,rebalance,
    replace-brick are not supported on tiered volume.
    
    But there is no code level check for this. This patch
    will allow to do the same
    
    >Change-Id: I12689f4e902cf0cceaf6f7f29c71057305024977
    >BUG: 1205624
    >Signed-off-by: Mohammed Rafi KC <rkavunga>
    >Reviewed-on: http://review.gluster.org/10349
    >Tested-by: Gluster Build System <jenkins.com>
    >Tested-by: NetBSD Build System
    >Reviewed-by: Krishnan Parthasarathi <kparthas>
    >Tested-by: Krishnan Parthasarathi <kparthas>
    
    Change-Id: Idaf5469d24f03e79ffb4e4edcbe39e84585aca39
    BUG: 1221476
    Signed-off-by: Mohammed Rafi KC <rkavunga>
    Reviewed-on: http://review.gluster.org/10774
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Atin Mukherjee <amukherj>
    Reviewed-by: Dan Lambright <dlambrig>
    Reviewed-by: Krishnan Parthasarathi <kparthas>
    Tested-by: Krishnan Parthasarathi <kparthas>
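
The kind of check the commit message describes can be illustrated with a short Python sketch. This is not the code from the patch, which lives in the GlusterFS C sources. It only shows the idea of refusing the listed operations when "gluster v info" reports "Type: Tier". The function names and the wording of the error message are hypothetical.

```python
# Operations named in the commit message as unsupported on a tiered volume.
UNSUPPORTED_ON_TIER = {"add-brick", "remove-brick", "rebalance", "replace-brick"}


def volume_type(info_output):
    """Return the value of the 'Type:' field from 'gluster v info' output."""
    for line in info_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Type":
            return value.strip()
    return None


def check_operation(operation, info_output):
    """Return (allowed, message); the message text is illustrative only."""
    if volume_type(info_output) == "Tier" and operation in UNSUPPORTED_ON_TIER:
        return False, f"{operation} is not supported on a tiered volume"
    return True, ""


if __name__ == "__main__":
    info = "Volume Name: nag_vol2\nType: Tier\nStatus: Started\n"
    print(check_operation("rebalance", info))
```

With a check like this in front of the command, the request is rejected with a clear message before a rebalance process is ever spawned.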

Comment 3 Niels de Vos 2015-06-02 08:02:43 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.1, please reopen this bug report.

glusterfs-3.7.1 has been announced on the Gluster Packaging mailing list [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.packaging/1
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

