Bug 1241839 - nfs-ganesha: bricks crash while executing acl related operation for named group/user
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterfs
Version: 3.1
Hardware: x86_64 Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.1.0
Assigned To: Jiffin
QA Contact: Saurabh
Depends On:
Blocks: 1202842 1242030 1242031
 
Reported: 2015-07-10 04:17 EDT by Saurabh
Modified: 2016-01-19 01:14 EST
CC: 11 users

See Also:
Fixed In Version: glusterfs-3.7.1-9
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 1242030
Environment:
Last Closed: 2015-07-29 01:11:15 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
coredump of a brick from nfs12 (630.22 KB, application/x-xz)
2015-07-10 04:17 EDT, Saurabh


External Trackers
Tracker ID: Red Hat Product Errata RHSA-2015:1495
Priority: normal | Status: SHIPPED_LIVE | Last Updated: 2015-07-29 04:26:26 EDT
Summary: Important: Red Hat Gluster Storage 3.1 update

Description Saurabh 2015-07-10 04:17:00 EDT
Created attachment 1050606: coredump of a brick from nfs12

Description of problem:
I tried to execute nfs4_setfacl and nfs4_getfacl operations on a directory as a non-root user and found that the bricks had crashed:
[root@nfs12 ~]# gluster volume status vol4
Status of volume: vol4
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.8:/rhs/brick1/d1r14          N/A       N/A        N       N/A  
Brick 10.70.46.27:/rhs/brick1/d1r24         N/A       N/A        N       N/A  
Brick 10.70.46.25:/rhs/brick1/d2r14         N/A       N/A        N       N/A  
Brick 10.70.46.29:/rhs/brick1/d2r24         N/A       N/A        N       N/A  
Brick 10.70.46.8:/rhs/brick1/d3r14          N/A       N/A        N       N/A  
Brick 10.70.46.27:/rhs/brick1/d3r24         N/A       N/A        N       N/A  
Brick 10.70.46.25:/rhs/brick1/d4r14         N/A       N/A        N       N/A  
Brick 10.70.46.29:/rhs/brick1/d4r24         N/A       N/A        N       N/A  
Brick 10.70.46.8:/rhs/brick1/d5r14          N/A       N/A        N       N/A  
Brick 10.70.46.27:/rhs/brick1/d5r24         N/A       N/A        N       N/A  
Brick 10.70.46.25:/rhs/brick1/d6r14         N/A       N/A        N       N/A  
Brick 10.70.46.29:/rhs/brick1/d6r24         N/A       N/A        N       N/A  
NFS Server on localhost                     N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       21916
NFS Server on 10.70.46.8                    N/A       N/A        N       N/A  
Self-heal Daemon on 10.70.46.8              N/A       N/A        Y       7920 
NFS Server on 10.70.46.29                   N/A       N/A        N       N/A  
Self-heal Daemon on 10.70.46.29             N/A       N/A        Y       8702 
NFS Server on 10.70.46.25                   N/A       N/A        N       N/A  
Self-heal Daemon on 10.70.46.25             N/A       N/A        Y       24895
NFS Server on 10.70.46.39                   2049      0          Y       31393
Self-heal Daemon on 10.70.46.39             N/A       N/A        Y       31402
NFS Server on 10.70.46.22                   2049      0          Y       12105
Self-heal Daemon on 10.70.46.22             N/A       N/A        Y       12113
 
Task Status of Volume vol4
------------------------------------------------------------------------------
There are no active volume tasks


Version-Release number of selected component (if applicable):
glusterfs-3.7.1-8.el6rhs.x86_64
nfs-ganesha-2.2.0-4.el6rhs.x86_64

How reproducible:
Seen for the first time.

Steps to Reproduce:
1. Create a volume of 6x2 type and start it.
2. Configure nfs-ganesha.
3. Create a group with a specific GID on all RHGS servers and the client.
4. Create a non-root user with a specific UID on all RHGS servers and the client, with the group created in step 3 as the user's group.
5. Enable ACLs and mount the volume on the client with vers=4.
6. Create a directory on the mount point.
7. chown the directory to the user and group created in steps 3 and 4.
8. Execute:
   nfs4_setfacl -a "A::acl_user1@lab.eng.blr.redhat.com:rwx" acl_user1_dir/
9. Execute:
   nfs4_getfacl acl_user1_dir/
(A shell sketch of these steps follows below.)
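
For convenience, here is a shell sketch of the steps above. It is illustrative only: the GID/UID value (6001) and the group name are assumptions, and the nfs-ganesha cluster setup is omitted; the brick list, user name, and mount endpoint are taken from the outputs elsewhere in this report.

# Step 1: 6x2 distributed-replicate volume (12 bricks, as listed above)
gluster volume create vol4 replica 2 \
    10.70.46.8:/rhs/brick1/d1r14  10.70.46.27:/rhs/brick1/d1r24 \
    10.70.46.25:/rhs/brick1/d2r14 10.70.46.29:/rhs/brick1/d2r24 \
    10.70.46.8:/rhs/brick1/d3r14  10.70.46.27:/rhs/brick1/d3r24 \
    10.70.46.25:/rhs/brick1/d4r14 10.70.46.29:/rhs/brick1/d4r24 \
    10.70.46.8:/rhs/brick1/d5r14  10.70.46.27:/rhs/brick1/d5r24 \
    10.70.46.25:/rhs/brick1/d6r14 10.70.46.29:/rhs/brick1/d6r24
gluster volume start vol4

# Step 2: configure nfs-ganesha for the cluster (HA setup omitted here;
# see the RHGS 3.1 administration guide)

# Steps 3-4: same group/user, with matching IDs, on every RHGS server
# and on the client (6001 is an arbitrary example ID)
groupadd -g 6001 acl_group1
useradd -u 6001 -g acl_group1 acl_user1

# Step 5: NFSv4 mount on the client (endpoint as shown in comment 7)
mount -t nfs -o vers=4 10.70.44.92:/vol4 /export/mnt1

# Steps 6-9: directory owned by the new user/group, then set and read an ACE
cd /export/mnt1
mkdir acl_user1_dir
chown acl_user1:acl_group1 acl_user1_dir
nfs4_setfacl -a "A::acl_user1@lab.eng.blr.redhat.com:rwx" acl_user1_dir/
nfs4_getfacl acl_user1_dir/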


Actual results:
Result of step 9:
[root@rhsauto009 vol4]# nfs4_getfacl acl_user1_dir
Invalid filename: acl_user1_dir

gluster volume status shows that all the bricks have crashed and dumped core.

Backtrace of one of the bricks:
(gdb) bt
#0  0x00007f1479cfd625 in raise () from /lib64/libc.so.6
#1  0x00007f1479cfee05 in abort () from /lib64/libc.so.6
#2  0x00007f1479d3b537 in __libc_message () from /lib64/libc.so.6
#3  0x00007f1479d40f4e in malloc_printerr () from /lib64/libc.so.6
#4  0x00007f1479d43c5d in _int_free () from /lib64/libc.so.6
#5  0x00007f147b358215 in data_destroy (data=0x7f1478731f0c) at dict.c:235
#6  0x00007f147b358c4e in dict_destroy (this=0x7f1478913d8c) at dict.c:564
#7  0x00007f146dc4278e in posix_setxattr (frame=0x7f1478f1aa80, this=<value optimized out>, loc=<value optimized out>, dict=<value optimized out>, 
    flags=0, xdata=0x7f1478913e18) at posix.c:3408
#8  0x00007f147b367d43 in default_setxattr (frame=0x7f1478f1aa80, this=0x7f1468009110, loc=0x7f14789a2cc8, dict=0x7f1478913c74, 
    flags=<value optimized out>, xdata=<value optimized out>) at defaults.c:1777
#9  0x00007f146d1fb0e1 in ctr_setxattr (frame=0x7f1478f1a31c, this=0x7f146800a730, loc=0x7f14789a2cc8, xattr=0x7f1478913c74, flags=0, 
    xdata=0x7f1478913e18) at changetimerecorder.c:1056
#10 0x00007f146cb471dd in changelog_setxattr (frame=0x7f1478f1a270, this=0x7f146800cff0, loc=0x7f14789a2cc8, dict=0x7f1478913c74, flags=0, 
    xdata=0x7f1478913e18) at changelog.c:1475
#11 0x00007f146c71a641 in br_stub_setxattr (frame=0x7f1478f1a270, this=0x7f146800ef00, loc=0x7f14789a2cc8, dict=0x7f1478913c74, flags=0, 
    xdata=0x7f1478913e18) at bit-rot-stub.c:1113
#12 0x00007f146c50f824 in posix_acl_setxattr (frame=<value optimized out>, this=0x7f1468010390, loc=0x7f14789a2cc8, xattr=0x7f1478913c74, flags=0, 
    xdata=0x7f1478913e18) at posix-acl.c:2023
#13 0x00007f147b367d43 in default_setxattr (frame=0x7f1478f1a9d4, this=0x7f1468011720, loc=0x7f14789a2cc8, dict=0x7f1478913c74, 
    flags=<value optimized out>, xdata=<value optimized out>) at defaults.c:1777
#14 0x00007f147b367d43 in default_setxattr (frame=0x7f1478f1a9d4, this=0x7f1468012aa0, loc=0x7f14789a2cc8, dict=0x7f1478913c74, 
    flags=<value optimized out>, xdata=<value optimized out>) at defaults.c:1777
#15 0x00007f147b36c433 in default_setxattr_resume (frame=0x7f1478f1a928, this=0x7f1468013f00, loc=0x7f14789a2cc8, dict=0x7f1478913c74, flags=0, 
    xdata=0x7f1478913e18) at defaults.c:1334
#16 0x00007f147b389580 in call_resume (stub=0x7f14789a2c88) at call-stub.c:2576
#17 0x00007f1467dfb541 in iot_worker (data=0x7f146804f900) at io-threads.c:215
#18 0x00007f147a449a51 in start_thread () from /lib64/libpthread.so.0
#19 0x00007f1479db396d in clone () from /lib64/libc.so.6


Crash report from the brick logs:
pending frames:
patchset: git://git.gluster.com/glusterfs.git
signal received: 6
time of crash: 
2015-07-10 07:23:48
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.1
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x7f147b35e826]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x7f147b37e3ef]
/lib64/libc.so.6(+0x3d0c2326a0)[0x7f1479cfd6a0]
/lib64/libc.so.6(gsignal+0x35)[0x7f1479cfd625]
/lib64/libc.so.6(abort+0x175)[0x7f1479cfee05]
/lib64/libc.so.6(+0x3d0c270537)[0x7f1479d3b537]
/lib64/libc.so.6(+0x3d0c275f4e)[0x7f1479d40f4e]
/lib64/libc.so.6(+0x3d0c278c5d)[0x7f1479d43c5d]
/usr/lib64/libglusterfs.so.0(data_destroy+0x55)[0x7f147b358215]
/usr/lib64/libglusterfs.so.0(dict_destroy+0x3e)[0x7f147b358c4e]
/usr/lib64/glusterfs/3.7.1/xlator/storage/posix.so(posix_setxattr+0x31e)[0x7f146dc4278e]
/usr/lib64/libglusterfs.so.0(default_setxattr+0x83)[0x7f147b367d43]
/usr/lib64/glusterfs/3.7.1/xlator/features/changetimerecorder.so(ctr_setxattr+0x191)[0x7f146d1fb0e1]
/usr/lib64/glusterfs/3.7.1/xlator/features/changelog.so(changelog_setxattr+0x17d)[0x7f146cb471dd]
/usr/lib64/glusterfs/3.7.1/xlator/features/bitrot-stub.so(br_stub_setxattr+0x281)[0x7f146c71a641]
/usr/lib64/glusterfs/3.7.1/xlator/features/access-control.so(posix_acl_setxattr+0x244)[0x7f146c50f824]
/usr/lib64/libglusterfs.so.0(default_setxattr+0x83)[0x7f147b367d43]
/usr/lib64/libglusterfs.so.0(default_setxattr+0x83)[0x7f147b367d43]
/usr/lib64/libglusterfs.so.0(default_setxattr_resume+0x143)[0x7f147b36c433]
/usr/lib64/libglusterfs.so.0(call_resume+0x80)[0x7f147b389580]
/usr/lib64/glusterfs/3.7.1/xlator/performance/io-threads.so(iot_worker+0x171)[0x7f1467dfb541]
/lib64/libpthread.so.0(+0x3d0c607a51)[0x7f147a449a51]
/lib64/libc.so.6(clone+0x6d)[0x7f1479db396d]


Expected results:
1. nfs4_setfacl should succeed and nfs4_getfacl should display the change made.
2. The bricks (and any other process) should not crash during ACL-related operations.

Additional info:
Comment 2 Jiffin 2015-07-10 14:32:01 EDT
The crash is due to a double free in the posix_setxattr() call. The fix has been sent upstream.

After fixing this crash, I noticed another crash, which is explained in https://bugzilla.redhat.com/show_bug.cgi?id=1242046

I am still trying to find the root cause of that one.
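
To make the failure mode concrete, below is a minimal, self-contained C sketch of the double-free pattern described here. It is an illustration only, not the actual GlusterFS code: the struct dict and its helpers are stand-ins for glusterfs dict_t.

/* Illustrative only: a container that owns its values frees them on
 * destroy, so a caller that also frees the same buffer triggers the
 * glibc abort seen in frames #0-#6 of the backtrace above. */
#include <stdlib.h>
#include <string.h>

struct dict { char *value; };                   /* stand-in for dict_t */

static void dict_set(struct dict *d, char *v) { /* dict takes ownership */
    d->value = v;
}

static void dict_destroy(struct dict *d) {      /* owner frees exactly once */
    free(d->value);
    d->value = NULL;
}

int main(void) {
    struct dict d = { 0 };
    char *acl_blob = strdup("system.posix_acl_access"); /* example payload */

    dict_set(&d, acl_blob);

    /* BUG pattern: uncommenting this free() makes dict_destroy() free
     * the same block a second time -> "double free or corruption" abort */
    /* free(acl_blob); */

    dict_destroy(&d);
    return 0;
}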
Comment 4 Jiffin 2015-07-12 01:57:29 EDT
The fix has been sent upstream: http://review.gluster.org/#/c/11627/
Comment 5 Jiffin 2015-07-12 11:59:02 EDT
The patch has been posted downstream: https://code.engineering.redhat.com/gerrit/#/c/52816/
Comment 6 Jiffin 2015-07-13 01:30:19 EDT
The patch has been merged downstream: https://code.engineering.redhat.com/gerrit/#/c/52816/
Comment 7 Saurabh 2015-07-13 09:08:23 EDT
After the fix for this BZ:
[root@rhsauto009 mnt1]# nfs4_setfacl -a "A::acl_user1@lab.eng.blr.redhat.com:rwx" acl_user1_dir1/
[root@rhsauto009 mnt1]# nfs4_getfacl acl_user1_dir1/
A::OWNER@:rwaDxtTcCy
A::acl_user1@lab.eng.blr.redhat.com:rwaDxtcy
A::GROUP@:rxtcy
A::EVERYONE@:rxtcy

[root@rhsauto009 mnt1]# mount | grep mnt1
10.70.44.92:/vol4 on /export/mnt1 type nfs (rw,vers=4,addr=10.70.44.92,clientaddr=10.70.36.239)

[root@nfs11 ~]# gluster volume status vol4
Status of volume: vol4
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.8:/rhs/brick1/d1r14          49156     0          Y       16720
Brick 10.70.46.27:/rhs/brick1/d1r24         49155     0          Y       31744
Brick 10.70.46.25:/rhs/brick1/d2r14         49157     0          Y       30081
Brick 10.70.46.29:/rhs/brick1/d2r24         49156     0          Y       22951
Brick 10.70.46.8:/rhs/brick1/d3r14          49157     0          Y       16738
Brick 10.70.46.27:/rhs/brick1/d3r24         49156     0          Y       31762
Brick 10.70.46.25:/rhs/brick1/d4r14         49158     0          Y       30099
Brick 10.70.46.29:/rhs/brick1/d4r24         49157     0          Y       22969
Brick 10.70.46.8:/rhs/brick1/d5r14          49158     0          Y       16756
Brick 10.70.46.27:/rhs/brick1/d5r24         49157     0          Y       31780
Brick 10.70.46.25:/rhs/brick1/d6r14         49159     0          Y       30117
Brick 10.70.46.29:/rhs/brick1/d6r24         49158     0          Y       22987
Self-heal Daemon on localhost               N/A       N/A        Y       10581
Self-heal Daemon on 10.70.46.25             N/A       N/A        Y       21878
Self-heal Daemon on 10.70.46.22             N/A       N/A        Y       1465 
Self-heal Daemon on 10.70.46.39             N/A       N/A        Y       20442
Self-heal Daemon on 10.70.46.29             N/A       N/A        Y       14763
Self-heal Daemon on 10.70.46.27             N/A       N/A        Y       24236
 
Task Status of Volume vol4
------------------------------------------------------------------------------
There are no active volume tasks
Comment 9 errata-xmlrpc 2015-07-29 01:11:15 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1495.html
