Description of problem:
volume type 6x2
just trying to do "rm -rf *" on nfs mount

Version-Release number of selected component (if applicable):
glusterfs-3.3.0.8rhs-1.el6rhs.x86_64

How reproducible:
happened on this release

Steps to Reproduce:
1. create volume of type 6x3, start the volume
2. add some data to it
3. execute "rm -rf *" on the mount point

Actual results:
rm -rf finished, but throws error in nfs.log

nfs.log says,
[2013-05-29 09:25:32.172816] I [glusterfsd-mgmt.c:1568:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2013-05-29 09:25:32.173503] I [glusterfsd-mgmt.c:1568:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2013-05-29 09:27:33.502273] E [client3_1-fops.c:5214:client3_1_inodelk] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/cluster/replicate.so(afr_lock_rec+0x80) [0x7f150ff028e0] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/cluster/replicate.so(afr_nonblocking_inodelk+0x608) [0x7f150ff1e5b8] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/protocol/client.so(client_inodelk+0x9e) [0x7f151015279e]))) 0-: Assertion failed: 0
[2013-05-29 09:27:33.502847] E [client3_1-fops.c:5214:client3_1_inodelk] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/cluster/replicate.so(afr_lock_rec+0x80) [0x7f150ff028e0] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/cluster/replicate.so(afr_nonblocking_inodelk+0x608) [0x7f150ff1e5b8] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/protocol/client.so(client_inodelk+0x9e) [0x7f151015279e]))) 0-: Assertion failed: 0
[2013-05-29 09:27:33.502963] E [client3_1-fops.c:5214:client3_1_inodelk] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/cluster/replicate.so(afr_blocking_lock+0x84) [0x7f150ff1f354] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/cluster/replicate.so(afr_lock_blocking+0xad7) [0x7f150ff1f207] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/protocol/client.so(client_inodelk+0x9e) [0x7f151015279e]))) 0-: Assertion failed: 0
[2013-05-29 09:27:33.503050] E [client3_1-fops.c:5214:client3_1_inodelk] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/cluster/replicate.so(+0x4545e) [0x7f150ff1f45e] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/cluster/replicate.so(afr_lock_blocking+0xad7) [0x7f150ff1f207] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/protocol/client.so(client_inodelk+0x9e) [0x7f151015279e]))) 0-: Assertion failed: 0
[2013-05-29 09:27:33.503089] I [afr-lk-common.c:996:afr_lock_blocking] 0-dist-rep-replicate-1: unable to lock on even one child
[2013-05-29 09:27:33.503119] I [afr-transaction.c:1031:afr_post_blocking_inodelk_cbk] 0-dist-rep-replicate-1: Blocking inodelks failed.
[2013-05-29 09:27:33.503161] E [dht-linkfile.c:213:dht_linkfile_setattr_cbk] 0-dist-rep-dht: setattr of uid/gid on <gfid:64c844f0-1ddf-48cf-b195-330a781a2a07>/c4 :<gfid:00000000-0000-0000-0000-000000000000> failed (Invalid argument)

Expected results:
rm -rf should finish to completion

Additional info:
I'll have a look, as the summary says this is NFS related.
The log snippet which shows the assertion failure:
==============================================
[2013-05-29 09:27:33.502273] E [client3_1-fops.c:5214:client3_1_inodelk] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/cluster/replicate.so(afr_lock_rec+0x80) [0x7f150ff028e0] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/cluster/replicate.so(afr_nonblocking_inodelk+0x608) [0x7f150ff1e5b8] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/protocol/client.so(client_inodelk+0x9e) [0x7f151015279e]))) 0-: Assertion failed: 0

The assertion failure happens in client3_1_inodelk(). There are two code paths involved:
either [afr_lock_rec() / afr_nonblocking_inodelk() / client_inodelk()]
or [afr_blocking_lock() / afr_lock_blocking() / client_inodelk()].

This sounds more like an AFR issue than one that belongs to NFS. It also looks similar to BZ 800291, which is already in CLOSED state, though I am not sure what the fix was.
I would like to hear from the AFR side whether this is a known issue that has already been addressed. Reassigning to Amar.
Pranith, can you check this out? I am not sure if we already have the fix for this backported/rebased in rhs-2.1 branch.
Amar,
The following log suggests that, for the linkfile setattr, the gfid is not present in either loc->gfid or loc->inode->gfid.

[2013-05-29 09:27:33.503161] E [dht-linkfile.c:213:dht_linkfile_setattr_cbk] 0-dist-rep-dht: setattr of uid/gid on <gfid:64c844f0-1ddf-48cf-b195-330a781a2a07>/c4 :<gfid:00000000-0000-0000-0000-000000000000> failed (Invalid argument)

I could re-create the issue on master by running bug-884597.t. Here are the details:

(gdb) fr 7
#7  0x00007f0b4093561a in dht_lookup_linkfile_create_cbk (frame=0x7f0b43c523a8,
(gdb) p local->loc
$1 = {path = 0x7f0b34002220 "/2", name = 0x7f0b34002221 "2", inode = 0x7f0b3b0ab0e8, parent = 0x7f0b3b0ab04c, gfid = '\000' <repeats 15 times>, pargfid = '\000' <repeats 15 times>, "\001"}
(gdb) p local->loc->inode->gfid
$2 = '\000' <repeats 15 times>

The following patch fixes the problem. I will send the patch.

pranithk@pranithk-laptop - ~/workspace/gerrit-repo/tests/bugs (master)
15:28:59 :) ⚡ git diff
diff --git a/xlators/cluster/dht/src/dht-linkfile.c b/xlators/cluster/dht/src/dht-linkfile.c
index 39d72ae..ae5bd49 100644
--- a/xlators/cluster/dht/src/dht-linkfile.c
+++ b/xlators/cluster/dht/src/dht-linkfile.c
@@ -302,6 +302,8 @@ dht_linkfile_attr_heal (call_frame_t *frame, xlator_t *this)
              is_equal (frame->root->gid, local->stbuf.ia_gid)))
                 return 0;
 
+        uuid_copy (local->loc.gfid, local->stbuf.ia_gfid);
+
         copy = copy_frame (frame);
 
         if (!copy)
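
For readers who do not have the GlusterFS tree at hand, here is a minimal standalone sketch of the idea behind the patch: when the gfid in the loc is all zeroes, it is filled in from the stat buffer before the setattr is sent. This is only an illustration under stated assumptions, not the GlusterFS code. The names demo_loc, demo_stbuf, demo_gfid_is_null and demo_loc_set_gfid are hypothetical stand-ins, and plain memcpy()/memcmp() on 16-byte arrays are used in place of the uuid_copy() call seen in the diff.

```c
#include <assert.h>
#include <string.h>

#define DEMO_GFID_LEN 16

/* Hypothetical stand-in for the loc in the gdb output (only the gfid field). */
struct demo_loc {
    unsigned char gfid[DEMO_GFID_LEN];
};

/* Hypothetical stand-in for the stat buffer that carries ia_gfid. */
struct demo_stbuf {
    unsigned char ia_gfid[DEMO_GFID_LEN];
};

/* Returns 1 when every byte of the gfid is zero, as seen in the gdb session. */
static int demo_gfid_is_null(const unsigned char *gfid)
{
    static const unsigned char zero[DEMO_GFID_LEN];
    return memcmp(gfid, zero, DEMO_GFID_LEN) == 0;
}

/* Copies the gfid from the stat buffer into the loc, mirroring the intent
 * of the uuid_copy() line added by the patch. */
static void demo_loc_set_gfid(struct demo_loc *loc, const struct demo_stbuf *stbuf)
{
    assert(loc != NULL && stbuf != NULL);
    memcpy(loc->gfid, stbuf->ia_gfid, DEMO_GFID_LEN);
}
```

A caller would zero-initialize a demo_loc, observe that demo_gfid_is_null() returns 1 (the state shown in the gdb output), call demo_loc_set_gfid() with a populated demo_stbuf, and then see demo_gfid_is_null() return 0.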
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-1262.html