Bug 1138393 - rebalance is not resulting in the hash layout changes being available to nfs client
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: 3.6.0
Hardware: aarch64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Assignee: Shyamsundar
QA Contact:
URL:
Whiteboard:
Depends On: 1120456 1125824 1139997 1140338
Blocks: glusterfs-3.6.0 1125958
 
Reported: 2014-09-04 17:22 UTC by Shyamsundar
Modified: 2015-05-15 17:03 UTC
CC List: 14 users

Fixed In Version: glusterfs-3.6.0beta1
Doc Type: Bug Fix
Doc Text:
Clone Of: 1125824
Environment:
Last Closed: 2014-11-11 08:38:11 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Shyamsundar 2014-09-04 17:22:12 UTC
+++ This bug was initially created as a clone of Bug #1125824 +++

Description of problem:
Testing volume expansion and rebalance on a volume in use by an application resulted in files that could not be copied or deleted. Originally I had a 4-brick distributed-replicate volume, and expanded this to an 8-brick configuration by:
- running add-brick
- running rebalance start

The rebalance was executed during application activity (writes of cold buckets to the volume, and reads across buckets from up to 36 concurrent search sessions). The rebalance completed successfully.
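For reference, the add-brick and rebalance steps above correspond to gluster CLI commands along these lines (a sketch only; the volume name, server names, and brick paths are placeholders, not the ones from this environment):

# Placeholders: adjust the volume name and brick paths to the real setup.
gluster volume add-brick VOLNAME server3:/bricks/b5 server4:/bricks/b6 server3:/bricks/b7 server4:/bricks/b8
gluster volume rebalance VOLNAME start
gluster volume rebalance VOLNAME status

Bricks for a replica-2 volume have to be added in multiples of the replica count, which is why four bricks are added in a single command here.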

However, two problems were identified following the rebalance:

1. A subsequent benchmark test that attempts to refresh the environment by deleting existing files failed (nfs.log attached).
2. The migration of data from one of the indexers to the RHS volume started to fail, leaving the data on local disk instead of migrating it to the NFS-mounted RHS volume.

Steps to Reproduce:
1. Any attempt to delete the files listed in the nfs.log fails:

[root@focil-rhs1 rawdata]# rm slicesv2.dat 
rm: remove regular file `slicesv2.dat'? y
rm: cannot remove `slicesv2.dat': Invalid argument
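
One way to check whether the rebalance actually rewrote the on-disk directory layout is to read the DHT layout xattr on each brick's copy of the affected directory (the brick path below is a placeholder, not one from this environment):

# Placeholder brick path -- repeat for the directory on every brick.
getfattr -n trusted.glusterfs.dht -e hex /bricks/brick1/rawdata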

Actual results:
File deletion fails.

Expected results:
File access and manipulation following a rebalance should work.

Comment 1 Anand Avati 2014-09-04 18:48:01 UTC
REVIEW: http://review.gluster.org/8608 (cluster/dht: Fix dht_access treating directory like files) posted (#1) for review on release-3.6 by Shyamsundar Ranganathan (srangana)

Comment 2 Anand Avati 2014-09-04 20:25:47 UTC
REVIEW: http://review.gluster.org/8608 (cluster/dht: Fix dht_access treating directory like files) posted (#2) for review on release-3.6 by Shyamsundar Ranganathan (srangana)

Comment 3 Anand Avati 2014-09-05 15:11:12 UTC
REVIEW: http://review.gluster.org/8608 (cluster/dht: Fix dht_access treating directory like files) posted (#3) for review on release-3.6 by Shyamsundar Ranganathan (srangana)

Comment 4 Anand Avati 2014-09-05 15:20:07 UTC
REVIEW: http://review.gluster.org/8608 (cluster/dht: Fix dht_access treating directory like files) posted (#4) for review on release-3.6 by Shyamsundar Ranganathan (srangana)

Comment 5 Anand Avati 2014-09-09 17:52:49 UTC
COMMIT: http://review.gluster.org/8608 committed in release-3.6 by Vijay Bellur (vbellur) 
------
commit 7fa8f593e1375e6a917de0a24efa91f82aab05a4
Author: Shyam <srangana>
Date:   Thu Sep 4 14:10:02 2014 -0400

    cluster/dht: Fix dht_access treating directory like files
    
    When the cluster topology changes due to add-brick, all sub
    volumes of DHT will not contain the directories till a rebalance
    is completed. Till the rebalance is run, if a caller bypasses
    lookup and calls access due to saved/cached inode information
    (like NFS server does) then, dht_access misreads the error
    (ESTALE/ENOENT) from the new subvolumes and incorrectly tries
    to handle the inode as a file. This results in the directories
    in memory state in DHT to be corrupted and not heal even post
    a rebalance.
    
    This commit fixes the problem in dht_access thereby preventing
    DHT from misrepresenting a directory as a file in the case
    presented above.
    
    Change-Id: Idcdaa3837db71c8fe0a40ec0084a6c3dbe27e772
    BUG: 1138393
    Signed-off-by: Shyam <srangana>
    Reviewed-on-master: http://review.gluster.org/8462
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
    Reviewed-on: http://review.gluster.org/8608
    Reviewed-by: Jeff Darcy <jdarcy>

Comment 6 Niels de Vos 2014-09-22 12:45:42 UTC
A beta release for GlusterFS 3.6.0 has been made available [1]. Please verify whether this release resolves this bug for you. If the glusterfs-3.6.0beta1 release does not resolve this issue, leave a comment on this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure (possibly an "updates-testing" repository) for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

Comment 7 Niels de Vos 2014-11-11 08:38:11 UTC
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report.

glusterfs-3.6.1 has been announced [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html
[2] http://supercolony.gluster.org/mailman/listinfo/gluster-users

