Bug 968289 - nfs: "rm -rf" throws "E [client3_1-fops.c:5214:client3_1_inodelk]" Assertion failed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterd
Version: 2.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Pranith Kumar K
QA Contact: Saurabh
URL:
Whiteboard:
Depends On:
Blocks: 971805
Reported: 2013-05-29 11:38 UTC by Saurabh
Modified: 2016-01-19 06:11 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 971805 (view as bug list)
Environment:
Last Closed: 2013-09-23 22:39:48 UTC
Embargoed:



Description Saurabh 2013-05-29 11:38:24 UTC
Description of problem:
volume type 6x2

Running "rm -rf *" on the NFS mount logs assertion failures in nfs.log.

Version-Release number of selected component (if applicable):
glusterfs-3.3.0.8rhs-1.el6rhs.x86_64

How reproducible:
happened on this release

Steps to Reproduce:
1. create volume of type 6x3, start the volume
2. add some data to it
3. execute "rm -rf *" on the mount point

Actual results:
rm -rf finished, but errors were logged in nfs.log

nfs.log says,
[2013-05-29 09:25:32.172816] I [glusterfsd-mgmt.c:1568:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2013-05-29 09:25:32.173503] I [glusterfsd-mgmt.c:1568:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2013-05-29 09:27:33.502273] E [client3_1-fops.c:5214:client3_1_inodelk] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/cluster/replicate.so(afr_lock_rec+0x80) [0x7f150ff028e0] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/cluster/replicate.so(afr_nonblocking_inodelk+0x608) [0x7f150ff1e5b8] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/protocol/client.so(client_inodelk+0x9e) [0x7f151015279e]))) 0-: Assertion failed: 0
[2013-05-29 09:27:33.502847] E [client3_1-fops.c:5214:client3_1_inodelk] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/cluster/replicate.so(afr_lock_rec+0x80) [0x7f150ff028e0] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/cluster/replicate.so(afr_nonblocking_inodelk+0x608) [0x7f150ff1e5b8] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/protocol/client.so(client_inodelk+0x9e) [0x7f151015279e]))) 0-: Assertion failed: 0
[2013-05-29 09:27:33.502963] E [client3_1-fops.c:5214:client3_1_inodelk] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/cluster/replicate.so(afr_blocking_lock+0x84) [0x7f150ff1f354] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/cluster/replicate.so(afr_lock_blocking+0xad7) [0x7f150ff1f207] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/protocol/client.so(client_inodelk+0x9e) [0x7f151015279e]))) 0-: Assertion failed: 0
[2013-05-29 09:27:33.503050] E [client3_1-fops.c:5214:client3_1_inodelk] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/cluster/replicate.so(+0x4545e) [0x7f150ff1f45e] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/cluster/replicate.so(afr_lock_blocking+0xad7) [0x7f150ff1f207] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/protocol/client.so(client_inodelk+0x9e) [0x7f151015279e]))) 0-: Assertion failed: 0
[2013-05-29 09:27:33.503089] I [afr-lk-common.c:996:afr_lock_blocking] 0-dist-rep-replicate-1: unable to lock on even one child
[2013-05-29 09:27:33.503119] I [afr-transaction.c:1031:afr_post_blocking_inodelk_cbk] 0-dist-rep-replicate-1: Blocking inodelks failed.
[2013-05-29 09:27:33.503161] E [dht-linkfile.c:213:dht_linkfile_setattr_cbk] 0-dist-rep-dht: setattr of uid/gid on <gfid:64c844f0-1ddf-48cf-b195-330a781a2a07>/c4 :<gfid:00000000-0000-0000-0000-000000000000> failed (Invalid argument)


Expected results:
rm -rf should run to completion without errors in nfs.log

Additional info:

Comment 1 santosh pradhan 2013-05-29 12:55:09 UTC
I'll have a look, as the summary says this is NFS related.

Comment 3 santosh pradhan 2013-05-30 12:10:07 UTC

The log snippet which shows assertion failure:
==============================================

[2013-05-29 09:27:33.502273] E [client3_1-fops.c:5214:client3_1_inodelk] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/cluster/replicate.so(afr_lock_rec+0x80) [0x7f150ff028e0] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/cluster/replicate.so(afr_nonblocking_inodelk+0x608) [0x7f150ff1e5b8] (-->/usr/lib64/glusterfs/3.3.0.8rhs/xlator/protocol/client.so(client_inodelk+0x9e) [0x7f151015279e]))) 0-: Assertion failed: 0


The assertion failure happens in client3_1_inodelk(). There are two code paths involved: either [afr_lock_rec() / afr_nonblocking_inodelk() / client_inodelk()] or [afr_blocking_lock() / afr_lock_blocking() / client_inodelk()].

This looks more like an AFR issue than an NFS one.

It also looks similar to BZ 800291, which is already in CLOSED state, though I am not sure what the fix was.

Comment 4 santosh pradhan 2013-05-30 13:11:21 UTC
I would like to hear from the AFR team whether this is a known issue that has already been addressed. Reassigning to Amar.

Comment 5 Amar Tumballi 2013-06-05 13:10:32 UTC
Pranith, can you check this out? I am not sure if we already have the fix for this backported/rebased in the rhs-2.1 branch.

Comment 6 Pranith Kumar K 2013-06-07 10:10:05 UTC
Amar,
    The following log suggests that, for the linkfile setattr, the gfid is present in neither loc->gfid nor loc->inode->gfid.

[2013-05-29 09:27:33.503161] E [dht-linkfile.c:213:dht_linkfile_setattr_cbk] 0-dist-rep-dht: setattr of uid/gid on <gfid:64c844f0-1ddf-48cf-b195-330a781a2a07>/c4 :<gfid:00000000-0000-0000-0000-000000000000> failed (Invalid argument)

I could re-create the issue on master by running bug-884597.t

Here are the details:
(gdb) fr 7
#7  0x00007f0b4093561a in dht_lookup_linkfile_create_cbk (frame=0x7f0b43c523a8, 

(gdb) p local->loc
$1 = {path = 0x7f0b34002220 "/2", name = 0x7f0b34002221 "2", inode = 0x7f0b3b0ab0e8, 
  parent = 0x7f0b3b0ab04c, gfid = '\000' <repeats 15 times>, 
  pargfid = '\000' <repeats 15 times>, "\001"}
(gdb) p local->loc->inode->gfid
$2 = '\000' <repeats 15 times>

The following patch fixes the problem. Will send the patch.

pranithk@pranithk-laptop - ~/workspace/gerrit-repo/tests/bugs (master)
15:28:59 :) ⚡ git diff
diff --git a/xlators/cluster/dht/src/dht-linkfile.c b/xlators/cluster/dht/src/dht-linkfile.c
index 39d72ae..ae5bd49 100644
--- a/xlators/cluster/dht/src/dht-linkfile.c
+++ b/xlators/cluster/dht/src/dht-linkfile.c
@@ -302,6 +302,8 @@ dht_linkfile_attr_heal (call_frame_t *frame, xlator_t *this)
              is_equal (frame->root->gid, local->stbuf.ia_gid)))
                 return 0;
 
+        uuid_copy (local->loc.gfid, local->stbuf.ia_gfid);
+
         copy = copy_frame (frame);
 
         if (!copy)
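
For readers following the patch, here is a minimal sketch of why an all-zero gfid leads to the "Invalid argument" failure above, and why copying the gfid into loc.gfid helps. It assumes that the client side prefers loc->inode->gfid, falls back to loc->gfid, and rejects the request with EINVAL when both are null. That behaviour is inferred from the log and gdb output in this report and is not copied from client3_1-fops.c. The sketch_* types and functions are hypothetical stand-ins, not GlusterFS APIs.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define GFID_LEN 16

/* Hypothetical stand-ins for inode_t and loc_t; only the gfid fields are modeled. */
struct sketch_inode {
        unsigned char gfid[GFID_LEN];
};

struct sketch_loc {
        struct sketch_inode *inode;
        unsigned char        gfid[GFID_LEN];
};

/* Returns 1 when all 16 bytes of the gfid are zero, 0 otherwise. */
static int
sketch_gfid_is_null (const unsigned char *gfid)
{
        static const unsigned char zero[GFID_LEN];

        return memcmp (gfid, zero, GFID_LEN) == 0;
}

/*
 * Assumed selection rule: use the inode's gfid if it is set, otherwise
 * fall back to loc->gfid. Returns 0 on success, or -EINVAL when both
 * are null (the condition this report hits).
 */
static int
sketch_select_gfid (const struct sketch_loc *loc, unsigned char *out)
{
        assert (loc != NULL && out != NULL);

        if (loc->inode && !sketch_gfid_is_null (loc->inode->gfid))
                memcpy (out, loc->inode->gfid, GFID_LEN);
        else
                memcpy (out, loc->gfid, GFID_LEN);

        return sketch_gfid_is_null (out) ? -EINVAL : 0;
}
```

With both gfids zeroed, as in the gdb output above, sketch_select_gfid() returns -EINVAL. Once loc.gfid is filled in from a known-good source (in the patch, local->stbuf.ia_gfid), the fallback branch succeeds even though the inode's gfid is still null.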

Comment 10 Scott Haines 2013-09-23 22:39:48 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html

