Bug 1025604 - nfs process crashed while running FS sanity on a glusterfs mount.
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterd
Version: 2.1
Hardware: x86_64 Linux
Priority: high
Severity: urgent
Assigned To: Raghavendra G
QA Contact: Ben Turner
Keywords: Reopened, ZStream
Depends On:
Blocks:
Reported: 2013-11-01 00:34 EDT by Saurabh
Modified: 2016-01-19 01:13 EST
CC List: 6 users

See Also:
Fixed In Version: glusterfs-3.4.0.39rhs
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-11-27 10:45:38 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
core dump (7.98 MB, application/x-xz)
2013-11-01 00:34 EDT, Saurabh

Description Saurabh 2013-11-01 00:34:20 EDT
Created attachment 818125
core dump

Description of problem:

I updated my system from glusterfs-3.4.0.36rhs to glusterfs-3.4.0.37rhs,
created a new volume, enabled quota, and set a limit of 100GB.
I then started the fs-sanity tests over an NFS mount.

The gluster-nfs process has now crashed and been killed; the NFS log shows:

[2013-10-31 08:19:02.791448] E [dht-helper.c:761:dht_migration_complete_check_task] 0-dist-rep12-dht: /run23329/system_light/linux-2.6.31.1/scripts/basic/.hash.cmd: failed to get the 'linkto' xattr No data available
[2013-10-31 08:19:02.791553] W [nfs3.c:739:nfs3svc_getattr_stat_cbk] 0-nfs: c380655a: /run23329/system_light/linux-2.6.31.1/scripts/basic/.hash.cmd => -1 (Success)
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2013-10-31 08:19:02
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.4.0.37rhs
/lib64/libc.so.6[0x3cdd832960]
/usr/lib64/glusterfs/3.4.0.37rhs/xlator/nfs/server.so(nfs3_stat_to_fattr3+0x28)[0x7f68db122a78]
/usr/lib64/glusterfs/3.4.0.37rhs/xlator/nfs/server.so(nfs3_fill_getattr3res+0x35)[0x7f68db122ca5]
/usr/lib64/glusterfs/3.4.0.37rhs/xlator/nfs/server.so(nfs3_getattr_reply+0x3a)[0x7f68db110aca]
/usr/lib64/glusterfs/3.4.0.37rhs/xlator/nfs/server.so(nfs3svc_getattr_stat_cbk+0x4d)[0x7f68db1138ed]
/usr/lib64/glusterfs/3.4.0.37rhs/xlator/nfs/server.so(nfs_fop_stat_cbk+0x41)[0x7f68db107c91]
/usr/lib64/glusterfs/3.4.0.37rhs/xlator/debug/io-stats.so(io_stats_stat_cbk+0xf6)[0x7f68db350196]
/usr/lib64/libglusterfs.so.0(default_stat_cbk+0xc2)[0x7f68e047e892]
/usr/lib64/glusterfs/3.4.0.37rhs/xlator/cluster/distribute.so(dht_attr2+0x234)[0x7f68db79cda4]
/usr/lib64/glusterfs/3.4.0.37rhs/xlator/cluster/distribute.so(+0xb226)[0x7f68db773226]
/usr/lib64/libglusterfs.so.0(synctask_wrap+0x2a)[0x7f68e04a1aea]
/lib64/libc.so.6[0x3cdd843bb0]



Version-Release number of selected component (if applicable):
glusterfs-3.4.0.37rhs

How reproducible:
found on this build

Volume Name: dist-rep12
Type: Distributed-Replicate
Volume ID: 9d072702-1230-421c-ad9c-41c8ed1a1c97
Status: Started
Number of Bricks: 6 x 2 = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.37.58:/rhs/bricks/d1r1-n12
Brick2: 10.70.37.196:/rhs/bricks/d1r2-n12
Brick3: 10.70.37.138:/rhs/bricks/d2r1-n12
Brick4: 10.70.37.186:/rhs/bricks/d2r2-n12
Brick5: 10.70.37.58:/rhs/bricks/d3r1-n12
Brick6: 10.70.37.196:/rhs/bricks/d3r2-n12
Brick7: 10.70.37.138:/rhs/bricks/d4r1-n12
Brick8: 10.70.37.186:/rhs/bricks/d4r2-n12
Brick9: 10.70.37.58:/rhs/bricks/d5r1-n12
Brick10: 10.70.37.196:/rhs/bricks/d5r2-n12
Brick11: 10.70.37.138:/rhs/bricks/d6r1-n12
Brick12: 10.70.37.186:/rhs/bricks/d6r2-n12
Options Reconfigured:
features.quota: on

[root@nfs1 ~]# gluster volume quota dist-rep12 list
                  Path                   Hard-limit Soft-limit   Used  Available
--------------------------------------------------------------------------------
/                                        100.0GB       80%     478.5MB  99.5GB
[root@nfs1 ~]# 


[root@nfs1 ~]# gluster volume status dist-rep12
Status of volume: dist-rep12
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick 10.70.37.58:/rhs/bricks/d1r1-n12			49159	Y	28802
Brick 10.70.37.196:/rhs/bricks/d1r2-n12			49159	Y	23312
Brick 10.70.37.138:/rhs/bricks/d2r1-n12			49161	Y	12080
Brick 10.70.37.186:/rhs/bricks/d2r2-n12			49158	Y	21315
Brick 10.70.37.58:/rhs/bricks/d3r1-n12			49160	Y	28813
Brick 10.70.37.196:/rhs/bricks/d3r2-n12			49160	Y	23323
Brick 10.70.37.138:/rhs/bricks/d4r1-n12			49162	Y	12091
Brick 10.70.37.186:/rhs/bricks/d4r2-n12			49159	Y	21326
Brick 10.70.37.58:/rhs/bricks/d5r1-n12			49161	Y	28824
Brick 10.70.37.196:/rhs/bricks/d5r2-n12			49161	Y	23334
Brick 10.70.37.138:/rhs/bricks/d6r1-n12			49163	Y	12102
Brick 10.70.37.186:/rhs/bricks/d6r2-n12			49160	Y	21337
NFS Server on localhost					N/A	N	N/A
Self-heal Daemon on localhost				N/A	Y	28842
Quota Daemon on localhost				N/A	Y	28914
NFS Server on 10.70.37.186				2049	Y	21349
Self-heal Daemon on 10.70.37.186			N/A	Y	21357
Quota Daemon on 10.70.37.186				N/A	Y	21400
NFS Server on 10.70.37.196				2049	Y	23347
Self-heal Daemon on 10.70.37.196			N/A	Y	23352
Quota Daemon on 10.70.37.196				N/A	Y	23403
NFS Server on 10.70.37.138				2049	Y	12114
Self-heal Daemon on 10.70.37.138			N/A	Y	12120
Quota Daemon on 10.70.37.138				N/A	Y	12176
 
There are no active volume tasks


Core dump is attached.
Comment 2 Ben Turner 2013-11-01 12:09:47 EDT
I am still seeing this on the 3.4.0.38rhs build.  Note that I was running FS sanity on a glusterfs mount with no quota enabled:

[New Thread 6509]
[New Thread 6508]
[New Thread 6547]
[New Thread 6506]
[New Thread 6507]
[New Thread 6515]
Core was generated by `/usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/'.
Program terminated with signal 11, Segmentation fault.
#0  nfs3_stat_to_fattr3 (buf=0x0) at nfs3-helpers.c:287
287	        if (IA_ISDIR (buf->ia_type))

Thread 6 (Thread 6515):
#0  0x00007f8564918d2d in ?? ()
No symbol table info available.
#1  0x0000000000000000 in ?? ()
No symbol table info available.

Thread 5 (Thread 6507):
#0  0x00007f85649192a5 in ?? ()
No symbol table info available.
#1  0x0000000000000000 in ?? ()
No symbol table info available.

Thread 4 (Thread 6506):
#0  0x00007f85642c5f43 in ?? ()
No symbol table info available.
#1  0x0000000000000000 in ?? ()
No symbol table info available.

Thread 3 (Thread 6547):
#0  0x00007f85642bc293 in ?? ()
No symbol table info available.
#1  0x0000000000000003 in ?? ()
No symbol table info available.
#2  0x00007f85ffffffff in ?? ()
No symbol table info available.
#3  0x0000000000000002 in ?? ()
No symbol table info available.
#4  0x00007f8550003040 in ?? ()
No symbol table info available.
#5  0x0000000000000002 in ?? ()
No symbol table info available.
#6  0x00007f85642f2c90 in ?? ()
No symbol table info available.
#7  0x0000000000000000 in ?? ()
No symbol table info available.

Thread 2 (Thread 6508):
#0  0x00007f85649157bb in ?? ()
No symbol table info available.
#1  0x0000000400000000 in ?? ()
No symbol table info available.
#2  0x000000000076b130 in ?? ()
No symbol table info available.
#3  0x000000000076b108 in ?? ()
No symbol table info available.
#4  0x0000000000000008 in ?? ()
No symbol table info available.
#5  0x000000005272a154 in ?? ()
No symbol table info available.
#6  0x00000000000c6153 in ?? ()
No symbol table info available.
#7  0x00007f856456be80 in ?? ()
No symbol table info available.
#8  0x0000000000000003 in ?? ()
No symbol table info available.
#9  0x000000000076b130 in ?? ()
No symbol table info available.
#10 0x000000000076b0d8 in ?? ()
No symbol table info available.
#11 0x00000000007674d0 in ?? ()
No symbol table info available.
#12 0x00007f8564f9ab7f in syncenv_task (proc=0x7674d0) at syncop.c:307
        env = 0x11
        task = 0x0
        sleep_till = {tv_sec = 1383244716, tv_nsec = 0}
        ret = <value optimized out>
#13 0x00007f8564f9f120 in syncenv_processor (thdata=0x7674d0) at syncop.c:385
        env = 0x7674d0
        proc = 0x7674d0
        task = <value optimized out>
#14 0x00007f8564911851 in ?? ()
No symbol table info available.
#15 0x00007f8561fe8700 in ?? ()
No symbol table info available.
#16 0x0000000000000000 in ?? ()
No symbol table info available.

Thread 1 (Thread 6509):
#0  nfs3_stat_to_fattr3 (buf=0x0) at nfs3-helpers.c:287
        fa = {type = 0, mode = 0, nlink = 0, uid = 0, gid = 0, size = 0, used = <value optimized out>, rdev = {specdata1 = <value optimized out>, specdata2 = <value optimized out>}, fsid = <value optimized out>, fileid = <value optimized out>, atime = {seconds = <value optimized out>, nseconds = <value optimized out>}, mtime = {seconds = <value optimized out>, nseconds = <value optimized out>}, ctime = {seconds = <value optimized out>, nseconds = <value optimized out>}}
#1  0x00007f855dc88ca5 in nfs3_fill_getattr3res (res=0xcae270, stat=<value optimized out>, buf=0x0, deviceid=<value optimized out>) at nfs3-helpers.c:466
No locals.
#2  0x00007f855dc76aca in nfs3_getattr_reply (req=0x7f855d8da3fc, status=NFS3_OK, buf=0x0) at nfs3.c:681
        res = {status = NFS3_OK, getattr3res_u = {resok = {obj_attributes = {type = 0, mode = 0, nlink = 0, uid = 0, gid = 0, size = 0, used = 0, rdev = {specdata1 = 0, specdata2 = 0}, fsid = 0, fileid = 0, atime = {seconds = 0, nseconds = 0}, mtime = {seconds = 0, nseconds = 0}, ctime = {seconds = 0, nseconds = 0}}}}}
        deviceid = <value optimized out>
#3  0x00007f855dc798ed in nfs3svc_getattr_stat_cbk (frame=<value optimized out>, cookie=<value optimized out>, this=<value optimized out>, op_ret=<value optimized out>, op_errno=0, buf=0x0, xdata=0x0) at nfs3.c:746
        status = NFS3_OK
        cs = 0x7f855769d7dc
        __FUNCTION__ = "nfs3svc_getattr_stat_cbk"
#4  0x00007f855dc6dc91 in nfs_fop_stat_cbk (frame=0x7f8563038e9c, cookie=<value optimized out>, this=<value optimized out>, op_ret=<value optimized out>, op_errno=<value optimized out>, buf=<value optimized out>, xdata=0x0) at nfs-fops.c:490
        nfl = 0x7f855dbf3d74
        progcbk = <value optimized out>
#5  0x00007f855deb6196 in io_stats_stat_cbk (frame=0x7f856322b0c8, cookie=<value optimized out>, this=<value optimized out>, op_ret=-1, op_errno=0, buf=0x0, xdata=0x0) at io-stats.c:1311
        fn = 0x7f855dc6dc50 <nfs_fop_stat_cbk>
        _parent = 0x7f8563038e9c
        old_THIS = 0x79a690
        __FUNCTION__ = "io_stats_stat_cbk"
#6  0x00007f8564f77892 in default_stat_cbk (frame=0x7f8563220a28, cookie=<value optimized out>, this=<value optimized out>, op_ret=-1, op_errno=0, buf=<value optimized out>, xdata=0x0) at defaults.c:47
        fn = 0x7f855deb60a0 <io_stats_stat_cbk>
        _parent = 0x7f856322b0c8
        old_THIS = 0x799480
        __FUNCTION__ = "default_stat_cbk"
#7  0x00007f855e302da4 in dht_attr2 (this=<value optimized out>, frame=0x7f856324c2e4, op_ret=<value optimized out>) at dht-inode-read.c:210
        fn = 0x7f8564f777d0 <default_stat_cbk>
        _parent = 0x7f8563220a28
        old_THIS = 0x798260
        __local = 0x7f8556af57a0
        __xl = 0x798260
        local = 0x7f8556af57a0
        subvol = 0x0
        op_errno = <value optimized out>
        __FUNCTION__ = "dht_attr2"
#8  0x00007f855e2d9226 in dht_migration_complete_check_done (op_ret=-1, frame=0x7f856324c2e4, data=<value optimized out>) at dht-helper.c:709
        local = <value optimized out>
#9  0x00007f8564f9aaea in synctask_wrap (old_task=<value optimized out>) at syncop.c:134
        task = 0xaad810
#10 0x00007f8564220bb0 in ?? ()
No symbol table info available.
#11 0x0000000000000000 in ?? ()
No symbol table info available.
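
Reading the Thread 1 trace above: frame #5 carries op_ret=-1 with op_errno=0 and buf=0x0, frame #3 ends up with status = NFS3_OK, and frame #0 then dereferences buf. The sketch below is only an illustration of that failure mode and of one defensive guard against it. It is not the glusterfs code and not the fix that was shipped. Every name in it (sketch_status, sketch_iatt, sketch_fattr, sketch_errno_to_status, sketch_getattr_cbk) is hypothetical; only the numeric status and file-type values are taken from the NFSv3 protocol.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the glusterfs definitions.
 * The numeric values follow the NFSv3 protocol's nfsstat3 codes. */
enum sketch_status {
    SKETCH_OK              = 0,
    SKETCH_ERR_NOENT       = 2,
    SKETCH_ERR_SERVERFAULT = 10006
};

struct sketch_iatt  { int is_dir; unsigned long size; };  /* stat result   */
struct sketch_fattr { int type;   unsigned long size; };  /* reply payload */

/* Assumed errno-to-status mapping: errno 0 maps to OK. That is what makes
 * the (op_ret = -1, op_errno = 0) combination dangerous, because a failed
 * operation is then reported as a success. */
static enum sketch_status sketch_errno_to_status(int op_errno)
{
    switch (op_errno) {
    case 0:      return SKETCH_OK;
    case ENOENT: return SKETCH_ERR_NOENT;
    default:     return SKETCH_ERR_SERVERFAULT;
    }
}

/* Guarded callback: returns the status to send to the client and fills
 * *out only when the operation succeeded and buf is usable. */
static enum sketch_status sketch_getattr_cbk(int op_ret, int op_errno,
                                             const struct sketch_iatt *buf,
                                             struct sketch_fattr *out)
{
    enum sketch_status status = SKETCH_OK;

    if (op_ret == -1) {
        status = sketch_errno_to_status(op_errno);
        /* Failure reported without an errno: never let it become OK. */
        if (status == SKETCH_OK)
            status = SKETCH_ERR_SERVERFAULT;
    }

    /* Even on nominal success, refuse to dereference a NULL stat buffer. */
    if (status == SKETCH_OK && buf == NULL)
        status = SKETCH_ERR_SERVERFAULT;

    if (status != SKETCH_OK)
        return status;

    out->type = buf->is_dir ? 2 : 1;  /* NF3DIR = 2, NF3REG = 1 */
    out->size = buf->size;
    return SKETCH_OK;
}
```

With this guard, the combination seen in the core (op_ret=-1, op_errno=0, buf=NULL) yields an error status instead of reaching the attribute conversion. The callee that returns -1 without setting an errno would still need to be fixed separately.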
Comment 3 santosh pradhan 2013-11-04 00:47:18 EST
There is an existing bug BZ 1010241 which is exactly the same.
Comment 4 santosh pradhan 2013-11-04 00:51:19 EST

*** This bug has been marked as a duplicate of bug 1010239 ***
Comment 5 Ben Turner 2013-11-12 16:40:04 EST
I haven't seen this in FS sanity since the fix was merged.  Marking verified.
Comment 6 errata-xmlrpc 2013-11-27 10:45:38 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1769.html
