Bug 808353

Summary: rebalance doesn't actually migrate the data in striped volumes
Product: [Community] GlusterFS
Reporter: M S Vishwanath Bhat <vbhat>
Component: access-control
Assignee: shishir gowda <sgowda>
Status: CLOSED CURRENTRELEASE
QA Contact: M S Vishwanath Bhat <vbhat>
Severity: high
Docs Contact:
Priority: high
Version: pre-release
CC: gluster-bugs, mzywusko, nsathyan, rfortier
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-24 17:41:54 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 817967

Description M S Vishwanath Bhat 2012-03-30 08:30:41 UTC
Description of problem:
On a striped volume, rebalance does not actually migrate the data. It only performs the fix-layout step.

Version-Release number of selected component (if applicable):
glusterfs-3.3.0qa32

How reproducible:
Consistent

Steps to Reproduce:
1. Create and start a 2*2 striped-replicated volume.
2. Add data on the mount point, for example by untarring the Linux kernel.
3. Add 4 more bricks to make it a 2*2*2 distributed-striped-replicated volume.
4. Start rebalance on the volume. It says rebalance started successfully.
5. Run rebalance status.
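
For readers who want to script the steps above, here is a minimal dry-run sketch that only builds and prints the gluster CLI command lines; it does not execute anything. The host names (server1 to server4) and brick paths are hypothetical and were not part of the original report; only the volume name hosdu is taken from it. Mounting the volume and writing data (step 2) is left as a comment.

```python
# Dry-run sketch: build the gluster CLI command lines for the steps above.
# ASSUMPTION: host names and brick paths are invented for illustration;
# only the volume name "hosdu" appears in the report.
VOLUME = "hosdu"


def bricks(hosts, path):
    """Return brick specs of the form host:/path for each host."""
    return [f"{host}:{path}" for host in hosts]


def reproduction_commands(volume, initial_bricks, added_bricks):
    """Return the command lines for steps 1, 3, 4 and 5 as argument lists."""
    return [
        # Step 1: create and start a 2*2 striped-replicated volume.
        ["gluster", "volume", "create", volume,
         "stripe", "2", "replica", "2", *initial_bricks],
        ["gluster", "volume", "start", volume],
        # Step 2 (not shown): mount the volume and write data to it.
        # Step 3: add 4 more bricks.
        ["gluster", "volume", "add-brick", volume, *added_bricks],
        # Steps 4 and 5: start rebalance, then query its status.
        ["gluster", "volume", "rebalance", volume, "start"],
        ["gluster", "volume", "rebalance", volume, "status"],
    ]


hosts = ["server1", "server2", "server3", "server4"]  # hypothetical
commands = reproduction_commands(
    VOLUME,
    bricks(hosts, "/export/brick1"),  # hypothetical initial bricks
    bricks(hosts, "/export/brick2"),  # hypothetical added bricks
)
for cmd in commands:
    print(" ".join(cmd))
```

Printing the commands rather than running them keeps the sketch safe to try on a machine without a gluster installation.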
  
Actual results:

[root@QA-25 ~]# gluster v rebalance hosdu start
Starting rebalance on volume hosdu has been successful
[root@QA-25 ~]# gluster v rebalance hosdu status
                                    Node Rebalanced-files          size       scanned         status
                               ---------      -----------   -----------   -----------   ------------
    2003c4d3-c566-4204-a441-18c2996c8018                0            0            0     completed
    a2a39d3d-57c4-45bc-af6f-6e4877c18646                0            0            0     completed
    ff343188-8c3d-401c-9c99-d0fb16a1d8dc                0            0            0     completed
    f6a6454e-b337-4339-af6f-a4a749720f3c                0            0            0     completed

Only fix-layout is done; the data is not migrated. When new data is added on the mount point, it is hashed to the newly added bricks as well.
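
The symptom in the status output above (every node "completed" with zero rebalanced files) can be checked mechanically. The following is a rough sketch that assumes the five-column layout shown in this report; other GlusterFS versions may print different columns. The function names are made up for illustration.

```python
# Sketch: detect "rebalance completed but migrated nothing" from status text.
# ASSUMPTION: the five-column layout matches the output pasted in this report.
STATUS_OUTPUT = """\
                                    Node Rebalanced-files          size       scanned         status
                               ---------      -----------   -----------   -----------   ------------
    2003c4d3-c566-4204-a441-18c2996c8018                0            0            0     completed
    a2a39d3d-57c4-45bc-af6f-6e4877c18646                0            0            0     completed
    ff343188-8c3d-401c-9c99-d0fb16a1d8dc                0            0            0     completed
    f6a6454e-b337-4339-af6f-a4a749720f3c                0            0            0     completed
"""


def parse_rebalance_status(text):
    """Return one dict per node row, skipping the header and separator lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 5 fields with numeric file and scanned counts.
        if len(fields) != 5 or not (fields[1].isdigit() and fields[3].isdigit()):
            continue
        node, files, size, scanned, status = fields
        rows.append({
            "node": node,
            "files": int(files),
            "size": size,  # kept as text; format may vary
            "scanned": int(scanned),
            "status": status,
        })
    return rows


def migrated_nothing(rows):
    """True if every node reports completion without migrating any file."""
    return bool(rows) and all(
        r["status"] == "completed" and r["files"] == 0 for r in rows
    )


rows = parse_rebalance_status(STATUS_OUTPUT)
print(len(rows), "nodes; migrated nothing:", migrated_nothing(rows))
```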

Expected results:
rebalance should actually migrate the data.

Additional info:

Entries from the rebalance log on the node where I ran the rebalance command:



[2012-03-30 04:18:29.001483] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_92
[2012-03-30 04:18:29.001529] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-1: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_92
[2012-03-30 04:18:29.001607] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-2: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_92
[2012-03-30 04:18:29.001656] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-3: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_92
[2012-03-30 04:18:29.001724] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_93
[2012-03-30 04:18:29.001768] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-1: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_93
[2012-03-30 04:18:29.001813] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-2: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_93
[2012-03-30 04:18:29.001882] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-3: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_93
[2012-03-30 04:18:29.001984] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_94
[2012-03-30 04:18:29.002041] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-1: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_94
[2012-03-30 04:18:29.002089] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-2: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_94
[2012-03-30 04:18:29.002128] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-3: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_94
[2012-03-30 04:18:29.002178] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_95
[2012-03-30 04:18:29.002218] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-1: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_95
[2012-03-30 04:18:29.002264] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-2: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_95
[2012-03-30 04:18:29.002302] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-3: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_95
[2012-03-30 04:18:29.002351] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_96
[2012-03-30 04:18:29.002391] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-1: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_96
[2012-03-30 04:18:29.002438] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-2: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_96
[2012-03-30 04:18:29.002475] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-3: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_96
[2012-03-30 04:18:29.002525] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_97
[2012-03-30 04:18:29.002565] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-1: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_97
[2012-03-30 04:18:29.002611] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-2: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_97
[2012-03-30 04:18:29.002649] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-3: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_97
[2012-03-30 04:18:29.002708] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_98
[2012-03-30 04:18:29.002751] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-1: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_98
[2012-03-30 04:18:29.002796] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-2: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_98
[2012-03-30 04:18:29.002835] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-3: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_98
[2012-03-30 04:18:29.002884] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_99
[2012-03-30 04:18:29.006004] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-1: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_99
[2012-03-30 04:18:29.006068] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-2: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_99
[2012-03-30 04:18:29.006178] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-3: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_99
[2012-03-30 04:18:29.016007] I [dht-rebalance.c:1579:gf_defrag_status_get] 0-glusterfs: Files migrated: 0, size: 0, lookups: 1
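
The log excerpt above follows a regular pattern: each file lookup fails on every client with "Invalid argument", and the parent in the path is the all-zero gfid. A small sketch for summarizing such lines is given below. The regular expression is written against the lines pasted in this report and is an assumption about the format, not a general GlusterFS log parser; the sample contains only a few of the lines above.

```python
# Sketch: summarize failed lookups that carry an all-zero parent gfid.
# ASSUMPTION: the regex is fitted to the log lines quoted in this report.
import re
from collections import Counter

NULL_GFID = "00000000-0000-0000-0000-000000000000"

WARNING_RE = re.compile(
    r"\] W \[[^\]]*client3_1_lookup_cbk\] \d+-(?P<client>\S+): "
    r"remote operation failed: (?P<error>.+?)\. "
    r"Path: <gfid:(?P<gfid>[0-9a-f-]+)>/(?P<name>\S+)"
)

SAMPLE_LOG = """\
[2012-03-30 04:18:29.001483] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_92
[2012-03-30 04:18:29.001529] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-1: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_92
[2012-03-30 04:18:29.001724] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000>/dd_file_93
[2012-03-30 04:18:29.016007] I [dht-rebalance.c:1579:gf_defrag_status_get] 0-glusterfs: Files migrated: 0, size: 0, lookups: 1
"""


def summarize_null_gfid_lookups(log_text):
    """Count null-gfid lookup failures per client and collect affected names."""
    per_client = Counter()
    names = set()
    for line in log_text.splitlines():
        match = WARNING_RE.search(line)
        if match and match.group("gfid") == NULL_GFID:
            per_client[match.group("client")] += 1
            names.add(match.group("name"))
    return per_client, names


per_client, names = summarize_null_gfid_lookups(SAMPLE_LOG)
for client, count in sorted(per_client.items()):
    print(client, count)
print("affected files:", sorted(names))
```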

Comment 1 Anand Avati 2012-05-04 06:02:06 UTC
CHANGE: http://review.gluster.com/3232 (stripe: don't send parent pointer in stripe_readdirp_lookup()) merged in master by Anand Avati (avati)

Comment 2 M S Vishwanath Bhat 2012-05-13 12:14:52 UTC
With glusterfs-3.3.0qa41, for a striped volume later changed to a distributed-striped volume, I can see that data has been moved to the newly added bricks.

With a striped-replicated volume, however, rebalance crashed with the same backtrace as in bug https://bugzilla.redhat.com/show_bug.cgi?id=820355

Comment 3 M S Vishwanath Bhat 2012-05-29 20:16:05 UTC
Fixed in glusterfs-3.3.0qa43.

I see that data is actually moved after rebalance in both cases: from striped to distributed-striped, and from striped-replicated to distributed-striped-replicated.