Created attachment 818125 [details]
core dump

Description of problem:
I updated my system from glusterfs-3.4.0.36rhs to glusterfs-3.4.0.37rhs, created a new volume, enabled quota, and set a limit of 100GB. I then started fs-sanity tests over an NFS mount. The gluster-nfs process is now killed because of a crash:

[2013-10-31 08:19:02.791448] E [dht-helper.c:761:dht_migration_complete_check_task] 0-dist-rep12-dht: /run23329/system_light/linux-2.6.31.1/scripts/basic/.hash.cmd: failed to get the 'linkto' xattr No data available
[2013-10-31 08:19:02.791553] W [nfs3.c:739:nfs3svc_getattr_stat_cbk] 0-nfs: c380655a: /run23329/system_light/linux-2.6.31.1/scripts/basic/.hash.cmd => -1 (Success)
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2013-10-31 08:19:02
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.4.0.37rhs
/lib64/libc.so.6[0x3cdd832960]
/usr/lib64/glusterfs/3.4.0.37rhs/xlator/nfs/server.so(nfs3_stat_to_fattr3+0x28)[0x7f68db122a78]
/usr/lib64/glusterfs/3.4.0.37rhs/xlator/nfs/server.so(nfs3_fill_getattr3res+0x35)[0x7f68db122ca5]
/usr/lib64/glusterfs/3.4.0.37rhs/xlator/nfs/server.so(nfs3_getattr_reply+0x3a)[0x7f68db110aca]
/usr/lib64/glusterfs/3.4.0.37rhs/xlator/nfs/server.so(nfs3svc_getattr_stat_cbk+0x4d)[0x7f68db1138ed]
/usr/lib64/glusterfs/3.4.0.37rhs/xlator/nfs/server.so(nfs_fop_stat_cbk+0x41)[0x7f68db107c91]
/usr/lib64/glusterfs/3.4.0.37rhs/xlator/debug/io-stats.so(io_stats_stat_cbk+0xf6)[0x7f68db350196]
/usr/lib64/libglusterfs.so.0(default_stat_cbk+0xc2)[0x7f68e047e892]
/usr/lib64/glusterfs/3.4.0.37rhs/xlator/cluster/distribute.so(dht_attr2+0x234)[0x7f68db79cda4]
/usr/lib64/glusterfs/3.4.0.37rhs/xlator/cluster/distribute.so(+0xb226)[0x7f68db773226]
/usr/lib64/libglusterfs.so.0(synctask_wrap+0x2a)[0x7f68e04a1aea]
/lib64/libc.so.6[0x3cdd843bb0]

Version-Release number of selected component (if applicable):
glusterfs-3.4.0.37rhs

How reproducible:
Found on this build.

Volume Name: dist-rep12
Type: Distributed-Replicate
Volume ID: 9d072702-1230-421c-ad9c-41c8ed1a1c97
Status: Started
Number of Bricks: 6 x 2 = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.37.58:/rhs/bricks/d1r1-n12
Brick2: 10.70.37.196:/rhs/bricks/d1r2-n12
Brick3: 10.70.37.138:/rhs/bricks/d2r1-n12
Brick4: 10.70.37.186:/rhs/bricks/d2r2-n12
Brick5: 10.70.37.58:/rhs/bricks/d3r1-n12
Brick6: 10.70.37.196:/rhs/bricks/d3r2-n12
Brick7: 10.70.37.138:/rhs/bricks/d4r1-n12
Brick8: 10.70.37.186:/rhs/bricks/d4r2-n12
Brick9: 10.70.37.58:/rhs/bricks/d5r1-n12
Brick10: 10.70.37.196:/rhs/bricks/d5r2-n12
Brick11: 10.70.37.138:/rhs/bricks/d6r1-n12
Brick12: 10.70.37.186:/rhs/bricks/d6r2-n12
Options Reconfigured:
features.quota: on

[root@nfs1 ~]# gluster volume quota dist-rep12 list
                  Path                 Hard-limit  Soft-limit     Used  Available
--------------------------------------------------------------------------------
/                                         100.0GB         80%  478.5MB     99.5GB

[root@nfs1 ~]# gluster volume status dist-rep12
Status of volume: dist-rep12
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.58:/rhs/bricks/d1r1-n12                  49159   Y       28802
Brick 10.70.37.196:/rhs/bricks/d1r2-n12                 49159   Y       23312
Brick 10.70.37.138:/rhs/bricks/d2r1-n12                 49161   Y       12080
Brick 10.70.37.186:/rhs/bricks/d2r2-n12                 49158   Y       21315
Brick 10.70.37.58:/rhs/bricks/d3r1-n12                  49160   Y       28813
Brick 10.70.37.196:/rhs/bricks/d3r2-n12                 49160   Y       23323
Brick 10.70.37.138:/rhs/bricks/d4r1-n12                 49162   Y       12091
Brick 10.70.37.186:/rhs/bricks/d4r2-n12                 49159   Y       21326
Brick 10.70.37.58:/rhs/bricks/d5r1-n12                  49161   Y       28824
Brick 10.70.37.196:/rhs/bricks/d5r2-n12                 49161   Y       23334
Brick 10.70.37.138:/rhs/bricks/d6r1-n12                 49163   Y       12102
Brick 10.70.37.186:/rhs/bricks/d6r2-n12                 49160   Y       21337
NFS Server on localhost                                 N/A     N       N/A
Self-heal Daemon on localhost                           N/A     Y       28842
Quota Daemon on localhost                               N/A     Y       28914
NFS Server on 10.70.37.186                              2049    Y       21349
Self-heal Daemon on 10.70.37.186                        N/A     Y       21357
Quota Daemon on 10.70.37.186                            N/A     Y       21400
NFS Server on 10.70.37.196                              2049    Y       23347
Self-heal Daemon on 10.70.37.196                        N/A     Y       23352
Quota Daemon on 10.70.37.196                            N/A     Y       23403
NFS Server on 10.70.37.138                              2049    Y       12114
Self-heal Daemon on 10.70.37.138                        N/A     Y       12120
Quota Daemon on 10.70.37.138                            N/A     Y       12176

There are no active volume tasks

Core dump is attached.
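For reference, the quota setup from the description maps to the following CLI steps. This is a sketch: the volume name and server addresses come from the volume info above, while the mount point and NFS options are assumptions.

```shell
# Enable quota on the volume and set a 100GB hard limit on the volume root,
# as described in the report.
gluster volume quota dist-rep12 enable
gluster volume quota dist-rep12 limit-usage / 100GB

# Confirm the limit (matches the "quota list" output above).
gluster volume quota dist-rep12 list

# NFS mount used for the fs-sanity run; mount point is hypothetical.
mount -t nfs -o vers=3 10.70.37.58:/dist-rep12 /mnt/nfs
```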
I am still seeing this on the 3.4.0.38rhs build. Note that I was running FS sanity on a glusterfs mount with no quota enabled:

[New Thread 6509]
[New Thread 6508]
[New Thread 6547]
[New Thread 6506]
[New Thread 6507]
[New Thread 6515]
Core was generated by `/usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/'.
Program terminated with signal 11, Segmentation fault.
#0  nfs3_stat_to_fattr3 (buf=0x0) at nfs3-helpers.c:287
287             if (IA_ISDIR (buf->ia_type))

Thread 6 (Thread 6515):
#0  0x00007f8564918d2d in ?? ()
No symbol table info available.
#1  0x0000000000000000 in ?? ()
No symbol table info available.

Thread 5 (Thread 6507):
#0  0x00007f85649192a5 in ?? ()
No symbol table info available.
#1  0x0000000000000000 in ?? ()
No symbol table info available.

Thread 4 (Thread 6506):
#0  0x00007f85642c5f43 in ?? ()
No symbol table info available.
#1  0x0000000000000000 in ?? ()
No symbol table info available.

Thread 3 (Thread 6547):
#0  0x00007f85642bc293 in ?? ()
No symbol table info available.
#1  0x0000000000000003 in ?? ()
No symbol table info available.
#2  0x00007f85ffffffff in ?? ()
No symbol table info available.
#3  0x0000000000000002 in ?? ()
No symbol table info available.
#4  0x00007f8550003040 in ?? ()
No symbol table info available.
#5  0x0000000000000002 in ?? ()
No symbol table info available.
#6  0x00007f85642f2c90 in ?? ()
No symbol table info available.
#7  0x0000000000000000 in ?? ()
No symbol table info available.

Thread 2 (Thread 6508):
#0  0x00007f85649157bb in ?? ()
No symbol table info available.
#1  0x0000000400000000 in ?? ()
No symbol table info available.
#2  0x000000000076b130 in ?? ()
No symbol table info available.
#3  0x000000000076b108 in ?? ()
No symbol table info available.
#4  0x0000000000000008 in ?? ()
No symbol table info available.
#5  0x000000005272a154 in ?? ()
No symbol table info available.
#6  0x00000000000c6153 in ?? ()
No symbol table info available.
#7  0x00007f856456be80 in ?? ()
No symbol table info available.
#8  0x0000000000000003 in ?? ()
No symbol table info available.
#9  0x000000000076b130 in ?? ()
No symbol table info available.
#10 0x000000000076b0d8 in ?? ()
No symbol table info available.
#11 0x00000000007674d0 in ?? ()
No symbol table info available.
#12 0x00007f8564f9ab7f in syncenv_task (proc=0x7674d0) at syncop.c:307
        env = 0x11
        task = 0x0
        sleep_till = {tv_sec = 1383244716, tv_nsec = 0}
        ret = <value optimized out>
#13 0x00007f8564f9f120 in syncenv_processor (thdata=0x7674d0) at syncop.c:385
        env = 0x7674d0
        proc = 0x7674d0
        task = <value optimized out>
#14 0x00007f8564911851 in ?? ()
No symbol table info available.
#15 0x00007f8561fe8700 in ?? ()
No symbol table info available.
#16 0x0000000000000000 in ?? ()
No symbol table info available.

Thread 1 (Thread 6509):
#0  nfs3_stat_to_fattr3 (buf=0x0) at nfs3-helpers.c:287
        fa = {type = 0, mode = 0, nlink = 0, uid = 0, gid = 0, size = 0,
          used = <value optimized out>,
          rdev = {specdata1 = <value optimized out>, specdata2 = <value optimized out>},
          fsid = <value optimized out>, fileid = <value optimized out>,
          atime = {seconds = <value optimized out>, nseconds = <value optimized out>},
          mtime = {seconds = <value optimized out>, nseconds = <value optimized out>},
          ctime = {seconds = <value optimized out>, nseconds = <value optimized out>}}
#1  0x00007f855dc88ca5 in nfs3_fill_getattr3res (res=0xcae270, stat=<value optimized out>, buf=0x0, deviceid=<value optimized out>) at nfs3-helpers.c:466
No locals.
#2  0x00007f855dc76aca in nfs3_getattr_reply (req=0x7f855d8da3fc, status=NFS3_OK, buf=0x0) at nfs3.c:681
        res = {status = NFS3_OK, getattr3res_u = {resok = {obj_attributes = {
              type = 0, mode = 0, nlink = 0, uid = 0, gid = 0, size = 0, used = 0,
              rdev = {specdata1 = 0, specdata2 = 0}, fsid = 0, fileid = 0,
              atime = {seconds = 0, nseconds = 0}, mtime = {seconds = 0, nseconds = 0},
              ctime = {seconds = 0, nseconds = 0}}}}}
        deviceid = <value optimized out>
#3  0x00007f855dc798ed in nfs3svc_getattr_stat_cbk (frame=<value optimized out>, cookie=<value optimized out>, this=<value optimized out>, op_ret=<value optimized out>, op_errno=0, buf=0x0, xdata=0x0) at nfs3.c:746
        status = NFS3_OK
        cs = 0x7f855769d7dc
        __FUNCTION__ = "nfs3svc_getattr_stat_cbk"
#4  0x00007f855dc6dc91 in nfs_fop_stat_cbk (frame=0x7f8563038e9c, cookie=<value optimized out>, this=<value optimized out>, op_ret=<value optimized out>, op_errno=<value optimized out>, buf=<value optimized out>, xdata=0x0) at nfs-fops.c:490
        nfl = 0x7f855dbf3d74
        progcbk = <value optimized out>
#5  0x00007f855deb6196 in io_stats_stat_cbk (frame=0x7f856322b0c8, cookie=<value optimized out>, this=<value optimized out>, op_ret=-1, op_errno=0, buf=0x0, xdata=0x0) at io-stats.c:1311
        fn = 0x7f855dc6dc50 <nfs_fop_stat_cbk>
        _parent = 0x7f8563038e9c
        old_THIS = 0x79a690
        __FUNCTION__ = "io_stats_stat_cbk"
#6  0x00007f8564f77892 in default_stat_cbk (frame=0x7f8563220a28, cookie=<value optimized out>, this=<value optimized out>, op_ret=-1, op_errno=0, buf=<value optimized out>, xdata=0x0) at defaults.c:47
        fn = 0x7f855deb60a0 <io_stats_stat_cbk>
        _parent = 0x7f856322b0c8
        old_THIS = 0x799480
        __FUNCTION__ = "default_stat_cbk"
#7  0x00007f855e302da4 in dht_attr2 (this=<value optimized out>, frame=0x7f856324c2e4, op_ret=<value optimized out>) at dht-inode-read.c:210
        fn = 0x7f8564f777d0 <default_stat_cbk>
        _parent = 0x7f8563220a28
        old_THIS = 0x798260
        __local = 0x7f8556af57a0
        __xl = 0x798260
        local = 0x7f8556af57a0
        subvol = 0x0
        op_errno = <value optimized out>
        __FUNCTION__ = "dht_attr2"
#8  0x00007f855e2d9226 in dht_migration_complete_check_done (op_ret=-1, frame=0x7f856324c2e4, data=<value optimized out>) at dht-helper.c:709
        local = <value optimized out>
#9  0x00007f8564f9aaea in synctask_wrap (old_task=<value optimized out>) at syncop.c:134
        task = 0xaad810
#10 0x00007f8564220bb0 in ?? ()
No symbol table info available.
#11 0x0000000000000000 in ?? ()
No symbol table info available.
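The backtrace shows two related problems: frame #0 dereferences buf (buf->ia_type) while buf is NULL, and frames #3-#5 show the callback chain propagating op_ret = -1 with buf = 0x0 yet replying with status NFS3_OK. A minimal sketch of the defensive pattern is below. The struct and function names are simplified, hypothetical stand-ins mirroring the backtrace, not the actual glusterfs code or the shipped fix.

```c
#include <stddef.h>

/* Simplified stand-ins; the real glusterfs iatt/fattr3 structs are larger. */
typedef enum { NFS3_OK = 0, NFS3ERR_SERVERFAULT = 10006 } nfsstat3_t;
struct iatt   { int ia_type; };
struct fattr3 { int type; };

/* Frame #0 crashed on buf->ia_type with buf == NULL; guarding the pointer
 * turns the segfault into an error the caller can report. */
static int
nfs3_stat_to_fattr3_guarded (const struct iatt *buf, struct fattr3 *fa)
{
        if (buf == NULL || fa == NULL)
                return -1;            /* caller maps this to an NFS error */
        fa->type = buf->ia_type;
        return 0;
}

/* Frames #3-#5 received op_ret = -1 and buf = NULL but still used NFS3_OK;
 * deriving the status from op_ret and buf first keeps the NULL stat buffer
 * out of the reply-encoding path entirely. */
static nfsstat3_t
getattr_cbk_status (int op_ret, const struct iatt *buf)
{
        if (op_ret < 0 || buf == NULL)
                return NFS3ERR_SERVERFAULT;
        return NFS3_OK;
}
```

With either guard in place, the failed migration-check path (frame #8, op_ret=-1) would produce an NFS error reply instead of a segfault in the NFS xlator.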
There is an existing bug, BZ 1010241, which is exactly the same.
*** This bug has been marked as a duplicate of bug 1010239 ***
I haven't seen this in FS sanity since the fix was merged. Marking verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-1769.html