Bug 1205624 - Data Tiering: rebalance fails on a tiered volume
Summary: Data Tiering: rebalance fails on a tiered volume
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: tiering
Version: mainline
Hardware: Unspecified
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Assignee: Mohammed Rafi KC
QA Contact:
URL:
Whiteboard:
Duplicates: 1213380
Depends On:
Blocks: qe_tracker_everglades 1221476 1227188 1229236 1260923
 
Reported: 2015-03-25 10:51 UTC by Nag Pavan Chilakam
Modified: 2018-12-04 20:22 UTC (History)

Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1221476 1229236
Environment:
Last Closed: 2016-06-16 12:45:03 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description Nag Pavan Chilakam 2015-03-25 10:51:48 UTC
Description of problem:
=======================
The rebalance operation fails on a tiered volume.
The same operation succeeds on a regular (non-tiered) volume.


Version-Release number of selected component (if applicable):
============================================================
3.7 upstream nightly build: http://download.gluster.org/pub/gluster/glusterfs/nightly/glusterfs/epel-6-x86_64/glusterfs-3.7dev-0.777.git2308c07.autobuild/


How reproducible:
=================
Reproduced twice on a tiered volume.


Steps to Reproduce:
==================
1. Create a gluster volume (I created a distribute volume) and start it.
2. Create some files on the volume.
3. Attach a tier to the volume using attach-tier.
4. Run a rebalance on the tiered volume using "gluster v rebalance <vol> start".
5. Check the status of the rebalance.
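
The steps above can be written down as a command list. The following is a minimal sketch in Python, not the reporter's actual script: the function name `repro_commands` is hypothetical, the split of bricks into hot and cold is inferred from the brick paths in the `gluster v info nag_vol2` output, and the `attach-tier` argument order (volume name followed by the new bricks) is assumed from the 3.7-era CLI. Step 2 (creating files) needs a client mount and is left as a comment.

```python
import shlex


def repro_commands(volume, cold_bricks, hot_bricks):
    """Build the gluster commands for steps 1, 3, 4 and 5 (hypothetical helper)."""
    return [
        # Step 1: create a plain distribute volume and start it.
        ["gluster", "volume", "create", volume, *cold_bricks],
        ["gluster", "volume", "start", volume],
        # Step 2: mount the volume on a client and create some files (not shown).
        # Step 3: attach the hot tier (argument order is an assumption).
        ["gluster", "volume", "attach-tier", volume, *hot_bricks],
        # Step 4: start the rebalance on the now-tiered volume.
        ["gluster", "volume", "rebalance", volume, "start"],
        # Step 5: check the rebalance status.
        ["gluster", "volume", "rebalance", volume, "status"],
    ]


# Brick lists taken from the report; which bricks are "hot" is an assumption.
COLD = [
    "rhs-client44:/pavanbrick1/nag_vol2/b1",
    "rhs-client37:/pavanbrick1/nag_vol2/b1",
    "rhs-client38:/pavanbrick1/nag_vol2/b1",
]
HOT = [
    "rhs-client37:/pavanbrick2/nag_vol2",
    "rhs-client44:/pavanbrick2/nag_vol2/hb1",
]

if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch is safe to try.
    for argv in repro_commands("nag_vol2", COLD, HOT):
        print(shlex.join(argv))
```

The sketch only prints the commands; running them requires a trusted storage pool with the bricks above already prepared.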


Actual results:
===============
The rebalance fails on every node, although the CLI still prints "success" on the last line:
[root@rhs-client44 glusterfs]# gluster v rebalance nag_vol2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0               failed               0.00
                            rhs-client38                0        0Bytes             0             0             0               failed               0.00
                            rhs-client37                0        0Bytes             0             0             0               failed               0.00
volume rebalance: nag_vol2: success: 
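
Because the trailing "success" line only reflects that the status query itself worked, a script checking the result has to look at the per-node status column. The following is a rough Python sketch of such a check, assuming the eight-column table layout shown above; `parse_rebalance_status` and `failed_nodes` are hypothetical helper names, and the sample text is the output above with the column padding reduced.

```python
SAMPLE = """\
     Node Rebalanced-files     size   scanned  failures   skipped    status   run time in secs
---------      -----------   ------   -------  --------   -------   -------     --------------
localhost                0   0Bytes         0         0         0    failed               0.00
rhs-client38             0   0Bytes         0         0         0    failed               0.00
rhs-client37             0   0Bytes         0         0         0    failed               0.00
volume rebalance: nag_vol2: success:
"""


def parse_rebalance_status(output):
    """Return (rows, summary) from `gluster v rebalance <vol> status` text."""
    rows, summary = [], None
    for line in output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if line.strip().startswith("volume rebalance:"):
            summary = line.strip()
        elif len(fields) >= 8 and fields[1].isdigit():
            # Data row: node, files, size, scanned, failures, skipped,
            # status (may be more than one word), run time in seconds.
            rows.append({
                "node": fields[0],
                "failures": int(fields[4]),
                "status": " ".join(fields[6:-1]),
                "run_time": float(fields[-1]),
            })
    return rows, summary


def failed_nodes(rows):
    """Names of the nodes whose status column reads 'failed'."""
    return [row["node"] for row in rows if row["status"] == "failed"]


if __name__ == "__main__":
    rows, summary = parse_rebalance_status(SAMPLE)
    print("CLI summary :", summary)
    print("failed nodes:", failed_nodes(rows))
```

On the sample, all three nodes are reported as failed even though the summary line says "success".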


Expected results:
================
Rebalance should succeed on a tiered volume too.


Additional info (CLI logs):
===========================

[root@rhs-client44 glusterfs]# tail -f nag_vol2-rebalance.log 
-------------------------------------------------------------
[2015-03-25 10:45:33.631882] I [MSGID: 100030] [glusterfsd.c:2288:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7dev (args: /usr/sbin/glusterfs -s localhost --volfile-id rebalance/nag_vol2 --xlator-option *dht.use-readdirp=yes --xlator-option *dht.lookup-unhashed=yes --xlator-option *dht.assert-no-child-down=yes --xlator-option *replicate*.data-self-heal=off --xlator-option *replicate*.metadata-self-heal=off --xlator-option *replicate*.entry-self-heal=off --xlator-option *replicate*.readdir-failover=off --xlator-option *dht.readdir-optimize=on --xlator-option *tier-dht.xattr-name=trusted.tier-gfid --xlator-option *dht.rebalance-cmd=1 --xlator-option *dht.node-uuid=1327654c-0521-46f8-8be3-b0f9c183d137 --socket-file /var/run/gluster/gluster-rebalance-4f00d705-0ab4-4a6e-8605-15493153db76.sock --pid-file /var/lib/glusterd/vols/nag_vol2/rebalance/1327654c-0521-46f8-8be3-b0f9c183d137.pid -l /var/log/glusterfs/nag_vol2-rebalance.log)
[2015-03-25 10:45:33.642596] I [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-03-25 10:45:38.631172] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht: adding option 'node-uuid' for volume 'tier-dht' with value '1327654c-0521-46f8-8be3-b0f9c183d137'
[2015-03-25 10:45:38.631207] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht: adding option 'rebalance-cmd' for volume 'tier-dht' with value '1'
[2015-03-25 10:45:38.631222] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht: adding option 'xattr-name' for volume 'tier-dht' with value 'trusted.tier-gfid'
[2015-03-25 10:45:38.631244] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht: adding option 'readdir-optimize' for volume 'tier-dht' with value 'on'
[2015-03-25 10:45:38.631259] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht: adding option 'assert-no-child-down' for volume 'tier-dht' with value 'yes'
[2015-03-25 10:45:38.631272] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht: adding option 'lookup-unhashed' for volume 'tier-dht' with value 'yes'
[2015-03-25 10:45:38.631288] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht: adding option 'use-readdirp' for volume 'tier-dht' with value 'yes'
[2015-03-25 10:45:38.631300] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-hot-dht: adding option 'node-uuid' for volume 'nag_vol2-hot-dht' with value '1327654c-0521-46f8-8be3-b0f9c183d137'
[2015-03-25 10:45:38.631323] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-hot-dht: adding option 'rebalance-cmd' for volume 'nag_vol2-hot-dht' with value '1'
[2015-03-25 10:45:38.631337] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-hot-dht: adding option 'readdir-optimize' for volume 'nag_vol2-hot-dht' with value 'on'
[2015-03-25 10:45:38.631354] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-hot-dht: adding option 'assert-no-child-down' for volume 'nag_vol2-hot-dht' with value 'yes'
[2015-03-25 10:45:38.631367] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-hot-dht: adding option 'lookup-unhashed' for volume 'nag_vol2-hot-dht' with value 'yes'
[2015-03-25 10:45:38.631380] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-hot-dht: adding option 'use-readdirp' for volume 'nag_vol2-hot-dht' with value 'yes'
[2015-03-25 10:45:38.631397] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-cold-dht: adding option 'node-uuid' for volume 'nag_vol2-cold-dht' with value '1327654c-0521-46f8-8be3-b0f9c183d137'
[2015-03-25 10:45:38.631415] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-cold-dht: adding option 'rebalance-cmd' for volume 'nag_vol2-cold-dht' with value '1'
[2015-03-25 10:45:38.631427] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-cold-dht: adding option 'readdir-optimize' for volume 'nag_vol2-cold-dht' with value 'on'
[2015-03-25 10:45:38.631443] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-cold-dht: adding option 'assert-no-child-down' for volume 'nag_vol2-cold-dht' with value 'yes'
[2015-03-25 10:45:38.631455] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-cold-dht: adding option 'lookup-unhashed' for volume 'nag_vol2-cold-dht' with value 'yes'
[2015-03-25 10:45:38.631471] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-cold-dht: adding option 'use-readdirp' for volume 'nag_vol2-cold-dht' with value 'yes'
[2015-03-25 10:45:38.632109] I [dht-shared.c:340:dht_init_regex] 0-tier-dht: using regex rsync-hash-regex = ^\.(.+)\.[^.]+$
[2015-03-25 10:45:38.633278] W [options.c:1193:xlator_option_init_int32] 0-tier-dht: unknown option: write-freq-threshold
[2015-03-25 10:45:38.633313] E [xlator.c:426:xlator_init] 0-tier-dht: Initialization of volume 'tier-dht' failed, review your volfile again
[2015-03-25 10:45:38.633326] E [graph.c:322:glusterfs_graph_init] 0-tier-dht: initializing translator failed
[2015-03-25 10:45:38.633336] E [graph.c:661:glusterfs_graph_activate] 0-graph: init failed
[2015-03-25 10:45:38.633716] W [glusterfsd.c:1212:cleanup_and_exit] (--> 0-: received signum (0), shutting down
##########################################################################################################################################################
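
Most of the rebalance log above is informational; the failure is visible only in the last few warning and error lines (the unknown option `write-freq-threshold`, followed by the failed initialization of `tier-dht`). The following is a minimal Python sketch for filtering such a log, assuming the line layout seen above (`[timestamp] LEVEL [file:line:function] domain: message`, with an optional `[MSGID: n]` field) and that `W` and `E` mark warnings and errors. The name `problems` is hypothetical, and the sample is a short excerpt of the log above.

```python
import re

# Layout assumed from the log lines in this report.
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) (?:\[MSGID: \d+\] )?"
    r"\[(?P<src>[^\]]+)\] (?P<domain>[^:]*): (?P<msg>.*)$"
)

SAMPLE = """\
[2015-03-25 10:45:38.631471] I [graph.c:269:gf_add_cmdline_options] 0-nag_vol2-cold-dht: adding option 'use-readdirp' for volume 'nag_vol2-cold-dht' with value 'yes'
[2015-03-25 10:45:38.633278] W [options.c:1193:xlator_option_init_int32] 0-tier-dht: unknown option: write-freq-threshold
[2015-03-25 10:45:38.633313] E [xlator.c:426:xlator_init] 0-tier-dht: Initialization of volume 'tier-dht' failed, review your volfile again
[2015-03-25 10:45:38.633326] E [graph.c:322:glusterfs_graph_init] 0-tier-dht: initializing translator failed
[2015-03-25 10:45:38.633336] E [graph.c:661:glusterfs_graph_activate] 0-graph: init failed
[2015-03-25 10:45:38.633716] W [glusterfsd.c:1212:cleanup_and_exit] (--> 0-: received signum (0), shutting down
"""


def problems(lines, levels=("W", "E")):
    """Return the parsed entries whose level is in `levels`, in log order."""
    found = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("level") in levels:
            found.append(match.groupdict())
    return found


if __name__ == "__main__":
    for entry in problems(SAMPLE.splitlines()):
        print(entry["level"], entry["src"], "-", entry["msg"])
```

Reading the filtered lines in order shows the sequence: the warning about the unknown option comes first, and the translator and graph initialization errors follow immediately after it.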


[root@rhs-client44 glusterfs]# tail -f etc-glusterfs-glusterd.vol.log
---------------------------------------------------------------------

[2015-03-25 10:45:33.548238] I [glusterd-utils.c:8923:glusterd_generate_and_set_task_id] 0-management: Generated task-id e8891b0d-3861-4104-bc96-1510aceed88d for key rebalance-id
[2015-03-25 10:45:38.626851] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2015-03-25 10:45:38.634730] W [socket.c:642:__socket_rwv] 0-management: readv on /var/run/gluster/gluster-rebalance-4f00d705-0ab4-4a6e-8605-15493153db76.sock failed (No data available)
[2015-03-25 10:45:38.730494] I [MSGID: 106007] [glusterd-rebalance.c:173:__glusterd_defrag_notify] 0-management: Rebalance process for volume nag_vol2 has disconnected.
[2015-03-25 10:45:38.730534] I [mem-pool.c:557:mem_pool_destroy] 0-management: size=588 max=0 total=0
[2015-03-25 10:45:38.730550] I [mem-pool.c:557:mem_pool_destroy] 0-management: size=124 max=0 total=0
[2015-03-25 10:45:43.733289] E [glusterd-utils.c:8078:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index
[2015-03-25 10:45:43.746431] E [glusterd-utils.c:8078:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index
[2015-03-25 10:45:48.840974] I [glusterd-handler.c:3970:__glusterd_handle_status_volume] 0-management: Received status volume req for volume nag_vol2
##########################################################################################################################################################

[root@rhs-client44 glusterfs]# tail -f cli.log 
----------------------------------------------
[2015-03-25 10:45:33.543436] I [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-03-25 10:45:33.543559] I [socket.c:2409:socket_event_handler] 0-transport: disconnecting now
[2015-03-25 10:45:36.412879] I [socket.c:2409:socket_event_handler] 0-transport: disconnecting now
[2015-03-25 10:45:39.413296] I [socket.c:2409:socket_event_handler] 0-transport: disconnecting now
[2015-03-25 10:45:42.413729] I [socket.c:2409:socket_event_handler] 0-transport: disconnecting now
[2015-03-25 10:45:43.879079] I [input.c:36:cli_batch] 0-: Exiting with: 0
[2015-03-25 10:45:48.838839] I [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-03-25 10:45:48.839012] I [socket.c:2409:socket_event_handler] 0-transport: disconnecting now
[2015-03-25 10:45:48.851709] I [input.c:36:cli_batch] 0-: Exiting with: 0




[root@rhs-client44 glusterfs]# gluster v info vol1
 
Volume Name: vol1
Type: Tier
Volume ID: 3382e788-ee37-4d6c-b214-8469ca68e376
Status: Started
Number of Bricks: 5 x 1 = 5
Transport-type: tcp
Bricks:
Brick1: rhs-client37:/pavanbrick2/vol1_hot/hb2
Brick2: rhs-client44:/pavanbrick2/vol1_hot/hb2
Brick3: rhs-client44:/pavanbrick1/vol1/b1
Brick4: rhs-client38:/pavanbrick1/vol1/b1
Brick5: rhs-client37:/pavanbrick1/vol1/b1
[root@rhs-client44 glusterfs]# gluster v rebalance start vol1
Usage: volume rebalance <VOLNAME> {{fix-layout start} | {start [force]|stop|status}}
[root@rhs-client44 glusterfs]# gluster v rebalance vol1 start
volume rebalance: vol1: success: Rebalance on vol1 has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: 368050e1-75ab-4332-83c1-f5fb7c4fea41

[root@rhs-client44 glusterfs]# gluster v rebalance vol1 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0               failed               0.00
                            rhs-client38                0        0Bytes             0             0             0               failed               0.00
                            rhs-client37                0        0Bytes             0             0             0               failed               0.00
volume rebalance: vol1: success: 
[root@rhs-client44 glusterfs]# gluster v info nag_vol2
 
Volume Name: nag_vol2
Type: Tier
Volume ID: 4f00d705-0ab4-4a6e-8605-15493153db76
Status: Started
Number of Bricks: 5 x 1 = 5
Transport-type: tcp
Bricks:
Brick1: rhs-client37:/pavanbrick2/nag_vol2
Brick2: rhs-client44:/pavanbrick2/nag_vol2/hb1
Brick3: rhs-client44:/pavanbrick1/nag_vol2/b1
Brick4: rhs-client37:/pavanbrick1/nag_vol2/b1
Brick5: rhs-client38:/pavanbrick1/nag_vol2/b1
[root@rhs-client44 glusterfs]# gluster v status nag_vol2
Status of volume: nag_vol2
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick rhs-client37:/pavanbrick2/nag_vol2    49157     0          Y       32000
Brick rhs-client44:/pavanbrick2/nag_vol2/hb
1                                           49157     0          Y       32707
Brick rhs-client44:/pavanbrick1/nag_vol2/b1 49156     0          Y       32535
Brick rhs-client37:/pavanbrick1/nag_vol2/b1 49156     0          Y       31885
Brick rhs-client38:/pavanbrick1/nag_vol2/b1 49155     0          Y       625  
NFS Server on localhost                     N/A       N/A        N       N/A  
NFS Server on rhs-client38                  N/A       N/A        N       N/A  
NFS Server on rhs-client37                  N/A       N/A        N       N/A  
 
Task Status of Volume nag_vol2
------------------------------------------------------------------------------
Task                 : Rebalance           
ID                   : e8891b0d-3861-4104-bc96-1510aceed88d
Status               : failed              
 
[root@rhs-client44 glusterfs]# gluster v rebalance nag_vol2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0               failed               0.00
                            rhs-client38                0        0Bytes             0             0             0               failed               0.00
                            rhs-client37                0        0Bytes             0             0             0               failed               0.00
volume rebalance: nag_vol2: success: 
[root@rhs-client44 glusterfs]# 
[root@rhs-client44 glusterfs]# 
[root@rhs-client44 glusterfs]# 
[root@rhs-client44 glusterfs]# gluster --version
glusterfs 3.7dev built on Mar 24 2015 01:04:20
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.
[root@rhs-client44 glusterfs]# gluster pool lsit
unrecognized word: lsit (position 1)
[root@rhs-client44 glusterfs]# gluster pool list
UUID					Hostname    	State
0f5fa6d4-8545-41ec-8f5e-9612fa72262a	rhs-client38	Connected 
456e0cc9-e2fc-44fb-a4ff-aec8fe60cba2	rhs-client37	Connected 
1327654c-0521-46f8-8be3-b0f9c183d137	localhost   	Connected

Comment 1 Dan Lambright 2015-04-21 18:44:55 UTC
We do not support rebalance with tiered volumes. You need to detach a tier, then rebalance it. The CLI should probably say this and the rebalance command should fail gracefully.
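
The graceful failure suggested in this comment could look roughly like the following Python sketch. It is only an illustration of the idea, not the patch that was later posted for review: it assumes the check is done by reading the `Type:` field of `gluster v info` output, and the names `volume_type` and `check_rebalance_allowed`, as well as the error message wording, are hypothetical.

```python
TIERED_INFO = """\
Volume Name: nag_vol2
Type: Tier
Volume ID: 4f00d705-0ab4-4a6e-8605-15493153db76
Status: Started
Number of Bricks: 5 x 1 = 5
Transport-type: tcp
"""

# Hypothetical output for a plain distribute volume, for comparison.
PLAIN_INFO = """\
Volume Name: plain_vol
Type: Distribute
Status: Started
"""


def volume_type(info_output):
    """Return the value of the 'Type:' field, or None if it is missing."""
    for line in info_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Type":
            return value.strip()
    return None


def check_rebalance_allowed(info_output):
    """Return (allowed, message); refuse rebalance on a tiered volume."""
    if volume_type(info_output) == "Tier":
        return (False,
                "Rebalance is not supported on a tiered volume; "
                "detach the tier first.")
    return (True, "")


if __name__ == "__main__":
    print(check_rebalance_allowed(TIERED_INFO))
    print(check_rebalance_allowed(PLAIN_INFO))
```

With a check of this kind in front of the rebalance command, the user would get an explicit message instead of a "success" line followed by a failed status on every node.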

Comment 2 Anand Avati 2015-04-23 11:00:02 UTC
REVIEW: http://review.gluster.org/10349 (tiering: Do not allow some operations on tiered volume) posted (#1) for review on master by mohammed rafi  kc (rkavunga@redhat.com)

Comment 3 Dan Lambright 2015-04-24 11:52:36 UTC
*** Bug 1213380 has been marked as a duplicate of this bug. ***

Comment 4 Anand Avati 2015-04-30 06:15:32 UTC
REVIEW: http://review.gluster.org/10349 (tiering: Do not allow some operations on tiered volume) posted (#2) for review on master by mohammed rafi  kc (rkavunga@redhat.com)

Comment 5 Anand Avati 2015-05-05 06:57:13 UTC
REVIEW: http://review.gluster.org/10349 (tiering: Do not allow some operations on tiered volume) posted (#3) for review on master by mohammed rafi  kc (rkavunga@redhat.com)

Comment 6 Anand Avati 2015-05-05 13:28:58 UTC
REVIEW: http://review.gluster.org/10349 (tiering: Do not allow some operations on tiered volume) posted (#4) for review on master by mohammed rafi  kc (rkavunga@redhat.com)

Comment 7 Anand Avati 2015-05-06 07:25:04 UTC
REVIEW: http://review.gluster.org/10349 (tiering: Do not allow some operations on tiered volume) posted (#5) for review on master by mohammed rafi  kc (rkavunga@redhat.com)

Comment 8 Niels de Vos 2015-05-15 12:57:36 UTC
This change should not be in "ON_QA"; the patch posted for this bug is only available in the master branch and not in a release yet. Moving back to MODIFIED until there is a beta release for the next GlusterFS version.

Comment 9 Nagaprasad Sathyanarayana 2015-10-25 14:56:30 UTC
The fix for this BZ is already present in a GlusterFS release. A clone of this BZ was fixed in a GlusterFS release and closed, hence closing this mainline BZ as well.

Comment 10 Niels de Vos 2016-06-16 12:45:03 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

