Description of problem:

Version-Release number of selected component (if applicable): mainline

How reproducible: always

Steps to Reproduce:
1. Create a volume.
2. Export the volume using nfs-ganesha.
3. Mount the volume using the NFSv4 protocol.
4. Set an NFSv4 ACL on the same file twice:
   nfs4_setfacl -a "A::<user_name>:<perm_set>" <filepath>

Actual results: The brick process crashed.

Expected results: Both nfs4_setfacl calls should succeed.

Additional info:

Backtrace of the brick process crash:

#0 0x00007fd3339ba8e2 in posix_acl_ctx_update (inode=0x7fd33407275c, this=0x7fd334011fc0, buf=0x7fd331391800) at posix-acl.c:766
#1 0x00007fd3339c16d9 in posix_acl_setattr_cbk (frame=0x7fd328000d9c, cookie=0x7fd32800169c, this=0x7fd334011fc0, op_ret=0, op_errno=0, prebuf=0x7fd331391870, postbuf=0x7fd331391800, xdata=0x0) at posix-acl.c:1762
#2 0x00007fd333de9a5d in changelog_setattr_cbk (frame=0x7fd32800169c, cookie=0x7fd3280028ac, this=0x7fd33400ee20, op_ret=0, op_errno=0, preop_stbuf=0x7fd331391870, postop_stbuf=0x7fd331391800, xdata=0x0) at changelog.c:1202
#3 0x00007fd33859acef in ctr_setattr_cbk (frame=0x7fd3280028ac, cookie=0x7fd32800222c, this=0x7fd33400b8f0, op_ret=0, op_errno=0, preop_stbuf=0x7fd331391870, postop_stbuf=0x7fd331391800, xdata=0x0) at changetimerecorder.c:451
#4 0x00007fd338dddc30 in posix_setattr (frame=0x7fd32800222c, this=0x7fd334007820, loc=0x7fd32c007d7c, stbuf=0x7fd32c0082b4, valid=1, xdata=0x0) at posix.c:449
#5 0x00007fd3435c51a0 in default_setattr (frame=0x7fd32800222c, this=0x7fd334009f50, loc=0x7fd32c007d7c, stbuf=0x7fd32c0082b4, valid=1, xdata=0x0) at defaults.c:2107
#6 0x00007fd33859b2d4 in ctr_setattr (frame=0x7fd3280028ac, this=0x7fd33400b8f0, loc=0x7fd32c007d7c, stbuf=0x7fd32c0082b4, valid=1, xdata=0x0) at changetimerecorder.c:484
#7 0x00007fd333de9e2d in changelog_setattr (frame=0x7fd32800169c, this=0x7fd33400ee20, loc=0x7fd32c007d7c, stbuf=0x7fd32c0082b4, valid=1, xdata=0x0) at changelog.c:1239
#8 0x00007fd3435c51a0 in default_setattr (frame=0x7fd32800169c, this=0x7fd3340109b0, loc=0x7fd32c007d7c, stbuf=0x7fd32c0082b4, valid=1, xdata=0x0) at defaults.c:2107
#9 0x00007fd3339c1a97 in posix_acl_setattr (frame=0x7fd328000d9c, this=0x7fd334011fc0, loc=0x7fd32c007d7c, buf=0x7fd32c0082b4, valid=1, xdata=0x0) at posix-acl.c:1784
#10 0x00007fd3435c51a0 in default_setattr (frame=0x7fd328000d9c, this=0x7fd334013400, loc=0x7fd32c007d7c, stbuf=0x7fd32c0082b4, valid=1, xdata=0x0) at defaults.c:2107
#11 0x00007fd33358623d in up_setattr (frame=0x7fd328000bac, this=0x7fd334014810, loc=0x7fd32c007d7c, stbuf=0x7fd32c0082b4, valid=1, xdata=0x0) at upcall.c:368
#12 0x00007fd3435c238b in default_setattr_resume (frame=0x7fd32c00290c, this=0x7fd334015da0, loc=0x7fd32c007d7c, stbuf=0x7fd32c0082b4, valid=1, xdata=0x0) at defaults.c:1662
#13 0x00007fd3435e2870 in call_resume_wind (stub=0x7fd32c007d3c) at call-stub.c:2252
#14 0x00007fd3435ea7fa in call_resume (stub=0x7fd32c007d3c) at call-stub.c:2571
#15 0x00007fd33337a317 in iot_worker (data=0x7fd334042260) at io-threads.c:210
#16 0x000000315c8079d1 in start_thread () from /lib64/libpthread.so.0
#17 0x000000315c0e886d in clone () from /lib64/libc.so.6

The backtrace shows that the crash happened in the access-control translator, in posix_acl_ctx_update() called from posix_acl_setattr_cbk(). The translator's inode context appears to contain invalid entries:

(gdb) p ctx
$1 = (struct posix_acl_ctx *) 0x7fd328003640
(gdb) p *ctx
$2 = {uid = 0, gid = 0, perm = 32768, acl_access = 0x7fd328001850, acl_default = 0x0}
(gdb) p *ctx->acl_access
$3 = {refcnt = -1379869184, count = 8377310, entries = 0x7fd328001850}

The refcnt and count fields of acl_access hold garbage values. The brick crashed when it tried to access this invalid memory.
REVIEW: http://review.gluster.org/11633 (access_control : avoid double unrefing of acl variable in its context.) posted (#1) for review on release-3.7 by jiffin tony Thottan (jthottan)
REVIEW: http://review.gluster.org/11632 (access_control : avoid double unrefing of acl variable in its context.) posted (#3) for review on master by jiffin tony Thottan (jthottan)
REVIEW: http://review.gluster.org/11632 (access_control : avoid double unrefing of acl variable in its context.) posted (#4) for review on master by jiffin tony Thottan (jthottan)
COMMIT: http://review.gluster.org/11632 committed in master by Kaleb KEITHLEY (kkeithle)
------
commit 7f21238bb918a9b6eefcff5d76516a92a9271ae2
Author: Jiffin Tony Thottan <jthottan>
Date: Sat Jul 11 15:47:18 2015 +0530

    access_control : avoid double unrefing of acl variable in its context.

    In handling_other_acl_related_xattr(), acl variable is unrefered twice
    after updating the context of access_control translator.So the acl
    variable stored in the inmemory context will become invalid one. When
    the variable accessed again , it will result in brick crash. This patch
    fixes the same.

    Change-Id: Ib95d2e3d67b0fb20d201244a206379d6261aeb23
    BUG: 1242041
    Signed-off-by: Jiffin Tony Thottan <jthottan>
    Reviewed-on: http://review.gluster.org/11632
    Tested-by: NetBSD Build System <jenkins.org>
    Reviewed-by: Niels de Vos <ndevos>
    Reviewed-by: soumya k <skoduri>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Kaleb KEITHLEY <kkeithle>
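
For readers unfamiliar with the failure mode named in the commit message, the following is a minimal sketch of how a double unref leaves a stale pointer in a context. It is not the GlusterFS code: the structures and the names acl_new(), acl_ref(), acl_unref(), ctx_set(), update_buggy(), update_fixed() and ctx_acl_is_valid() are hypothetical stand-ins, and the sketch sets a `freed` flag instead of calling free() so that inspecting the object afterwards stays well-defined.

```c
#include <stdlib.h>

/* Hypothetical stand-ins, not the real GlusterFS structures. */
struct acl {
    int refcnt;
    int freed; /* set instead of calling free(), so the demo can inspect it */
};

struct acl_ctx {
    struct acl *acl_access;
};

/* Allocate an ACL owned by the caller (refcnt == 1). */
static struct acl *acl_new(void)
{
    struct acl *a = calloc(1, sizeof(*a));
    if (a == NULL)
        abort();
    a->refcnt = 1;
    return a;
}

static struct acl *acl_ref(struct acl *a)
{
    a->refcnt++;
    return a;
}

/* Drop one reference; the object is "released" when the count hits zero. */
static void acl_unref(struct acl *a)
{
    if (--a->refcnt == 0)
        a->freed = 1;
}

/* Store the ACL in the context; the context takes its own reference. */
static void ctx_set(struct acl_ctx *ctx, struct acl *a)
{
    struct acl *old = ctx->acl_access;
    ctx->acl_access = acl_ref(a);
    if (old != NULL)
        acl_unref(old);
}

/* Buggy pattern: the caller drops its single reference twice. */
static void update_buggy(struct acl_ctx *ctx)
{
    struct acl *a = acl_new(); /* refcnt 1 (caller) */
    ctx_set(ctx, a);           /* refcnt 2 (caller + context) */
    acl_unref(a);              /* refcnt 1 */
    acl_unref(a);              /* refcnt 0: released, context still points at it */
}

/* Fixed pattern: the caller drops only the reference it owns. */
static void update_fixed(struct acl_ctx *ctx)
{
    struct acl *a = acl_new(); /* refcnt 1 */
    ctx_set(ctx, a);           /* refcnt 2 */
    acl_unref(a);              /* refcnt 1, owned by the context */
}

/* Returns 1 if the context holds an ACL that has not been released. */
static int ctx_acl_is_valid(const struct acl_ctx *ctx)
{
    return ctx->acl_access != NULL && !ctx->acl_access->freed;
}
```

After update_buggy() the context still holds a pointer to a released object, so the next operation that reads it (the second setattr in the reproduction steps) would touch invalid memory. After update_fixed() the context owns the one remaining reference. The sketch deliberately never frees its allocations.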
Can this happen when mounting over FUSE (with "acl" mount option) or Gluster/NFS too? If so, could you add a test-case?
The fix for this BZ is already present in a GlusterFS release. A clone of this BZ was fixed in that release and has been closed, so this mainline BZ is being closed as well.
This bug is being closed because a release that should address the reported issue is now available. If the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]. Packages for several distributions should become available in the near future; keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user