Bug 1321554 - assert failure happens when parallel rm -rf is issued on nfs mounts
Summary: assert failure happens when parallel rm -rf is issued on nfs mounts
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: mainline
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
Depends On:
Blocks: 1327864
Reported: 2016-03-28 11:34 UTC by Pranith Kumar K
Modified: 2016-06-16 14:01 UTC (History)

Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1327864 (view as bug list)
Last Closed: 2016-06-16 14:01:30 UTC
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:


Description Pranith Kumar K 2016-03-28 11:34:21 UTC
Description of problem:
(gdb) bt
#0  0x00007f652b091a98 in __GI_raise (sig=sig@entry=6)
    at ../sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007f652b09369a in __GI_abort () at abort.c:89
#2  0x00007f652b08a227 in __assert_fail_base (fmt=<optimized out>,
    assertion=assertion@entry=0x7f652c5a529c "inode->nlookup >= nlookup",
    file=file@entry=0x7f652c5a512a "inode.c", line=line@entry=711,
    function=function@entry=0x7f652c5a5848 <__PRETTY_FUNCTION__.10534> "__inode_forget")
    at assert.c:92
#3  0x00007f652b08a2d2 in __GI___assert_fail (
    assertion=0x7f652c5a529c "inode->nlookup >= nlookup",
    file=0x7f652c5a512a "inode.c", line=711,
    function=0x7f652c5a5848 <__PRETTY_FUNCTION__.10534> "__inode_forget")
    at assert.c:101
#4  0x00007f652c5203e8 in __inode_forget (inode=0x7f6504038aec,
    nlookup=1) at inode.c:711
#5  0x00007f652c5210f8 in inode_forget (inode=0x7f6504038aec,
    nlookup=1) at inode.c:1123
#6  0x00007f651f75258c in afr_lookup_sh_metadata_wrap (
    opaque=0x7f65180a9b3c) at afr-common.c:1928
#7  0x00007f652c54d925 in synctask_wrap (old_task=0x7f65040467b0)
    at syncop.c:375
#8  0x00007f652b0a5f10 in ?? () from /lib64/libc.so.6
#9  0x0000000000000000 in ?? ()
(gdb) f 4
#4  0x00007f652c5203e8 in __inode_forget (inode=0x7f6504038aec,
nlookup=1) at inode.c:711
711            GF_ASSERT (inode->nlookup >= nlookup);
(gdb) p inode->nlookup
$1 = 0
(gdb) p nlookup
$2 = 1

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:

Actual results:

Expected results:

Additional info:

Comment 1 Vijay Bellur 2016-03-28 12:01:00 UTC
REVIEW: http://review.gluster.org/13834 (cluster/afr: Don't lookup/forget inodes) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu@redhat.com)

Comment 2 Vijay Bellur 2016-03-28 15:27:48 UTC
REVIEW: http://review.gluster.org/13834 (cluster/afr: Don't lookup/forget inodes) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu@redhat.com)

Comment 3 Vijay Bellur 2016-03-31 12:46:37 UTC
COMMIT: http://review.gluster.org/13834 committed in master by Pranith Kumar Karampuri (pkarampu@redhat.com) 
commit b2a5eed9b17a82ec4b6366b0107fe2271328c16a
Author: Pranith Kumar K <pkarampu@redhat.com>
Date:   Mon Mar 28 16:31:12 2016 +0530

    cluster/afr: Don't lookup/forget inodes

    All inodes that are looked up in afr are always forgotten without
    fail, which removes the benefit of keeping them in the lru. The same
    code can cause crashes if the top xlator does inode_forget(0)
    between afr's inode_lookup and inode_forget.

    Don't use lookup/forget in afr. There is no benefit at the moment
    in keeping this code, and it is impossible to prevent top xlators
    from doing inode_forget(0). Found similar instances in ec and
    removed them, even though those code paths are not going to be
    executed anywhere other than the heal-daemon.

    BUG: 1321554
    Change-Id: Ia4cb236178f7f129cc898d53f0bbd26f494a2a8d
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    Reviewed-on: http://review.gluster.org/13834
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anuradha Talur <atalur@redhat.com>

Comment 4 Niels de Vos 2016-06-16 14:01:30 UTC
This bug is being closed because a release that should address the reported issue is now available. If the problem is still not fixed in glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
