Description Raghavendra Bhat 2012-11-19 02:22:29 EST
Description of problem:
Created a replicate volume and mounted it via FUSE. On the FUSE mount, created a file (with touch) and a directory. Then mounted the same volume via NFS and ran rm -rf on the NFS mount.
The NFS server crashed with the backtrace below.

Program terminated with signal 6, Aborted.
#0  0x0000003a99ce8eef in poll () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.15-37.fc17.x86_64 keyutils-libs-1.5.5-2.fc17.x86_64 krb5-libs-1.10.2-2.fc17.x86_64 libcom_err-1.42.3-2.fc17.x86_64 libgcc-4.7.0-5.fc17.x86_64 libselinux-2.1.10-3.fc17.x86_64 openssl-1.0.0j-1.fc17.x86_64 zlib-1.2.5-6.fc17.x86_64
(gdb) bt
#0  0x0000003a99ce8eef in poll () from /lib64/libc.so.6
#1  0x0000003a99d26937 in svc_run () from /lib64/libc.so.6
#2  0x000000000951b9d5 in nsm_thread (argv=0x0) at ../../../../../xlators/nfs/server/src/nlmcbk_svc.c:118
#3  0x0000003a9a007d14 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003a99cf199d in clone () from /lib64/libc.so.6
(gdb) t 2
[Switching to thread 2 (LWP 6825)]
#0  0x0000003a9a00e80d in nanosleep () from /lib64/libpthread.so.0
(gdb) t 3
[Switching to thread 3 (LWP 6823)]
#0  0x0000003a9a00b902 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) t 4
[Switching to thread 4 (LWP 6822)]
#0  0x0000003a9a00b902 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) t 5
[Switching to thread 5 (LWP 6821)]
#0  0x0000003a9a00ed70 in sigwait () from /lib64/libpthread.so.0
(gdb) t 6
[Switching to thread 6 (LWP 6817)]
#0  0x0000003a99c35965 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x0000003a99c35965 in raise () from /lib64/libc.so.6
#1  0x0000003a99c37118 in abort () from /lib64/libc.so.6
#2  0x0000003a99c2e6e2 in __assert_fail_base () from /lib64/libc.so.6
#3  0x0000003a99c2e792 in __assert_fail () from /lib64/libc.so.6
#4  0x0000000008bc1036 in client3_3_opendir (frame=0x53700b0, this=0x56cb7c0, data=0x7feffed80)
    at ../../../../../xlators/protocol/client/src/client-rpc-fops.c:4226
#5  0x0000000008ba76f9 in client_opendir (frame=0x53700b0, this=0x56cb7c0, loc=0xb31e444, fd=0x9a5189c, xdata=0x0)
    at ../../../../../xlators/protocol/client/src/client.c:1085
#6  0x0000000008debc89 in afr_opendir (frame=0x5370004, this=0x56ce420, loc=0xb31e444, fd=0x9a5189c)
    at ../../../../../xlators/cluster/afr/src/afr-dir-read.c:325
#7  0x0000000009099cd5 in dht_rmdir (frame=0x536feac, this=0x56cf9e0, loc=0xb31e444, flags=0, xdata=0x0)
    at ../../../../../xlators/cluster/dht/src/dht-common.c:4554
#8  0x00000000092c7fbd in io_stats_rmdir (frame=0x536fd54, this=0x56d0f50, loc=0xb31e444, flags=0, xdata=0x0)
    at ../../../../../xlators/debug/io-stats/src/io-stats.c:1949
#9  0x00000000094edcde in nfs_fop_rmdir (nfsx=0x56d2860, xl=0x56d0f50, nfu=0x7fefff380, pathloc=0xb31e444, 
    cbk=0x94f2423 <nfs_inode_rmdir_cbk>, local=0x667a6c4) at ../../../../../xlators/nfs/server/src/nfs-fops.c:1114
#10 0x00000000094f2648 in nfs_inode_rmdir (nfsx=0x56d2860, xl=0x56d0f50, nfu=0x7fefff380, pathloc=0xb31e444, 
    cbk=0x95048bd <nfs3svc_rmdir_cbk>, local=0xb31e05c) at ../../../../../xlators/nfs/server/src/nfs-inodes.c:447
#11 0x00000000094f360f in nfs_rmdir (nfsx=0x56d2860, xl=0x56d0f50, nfu=0x7fefff380, path=0xb31e444, 
    cbk=0x95048bd <nfs3svc_rmdir_cbk>, local=0xb31e05c) at ../../../../../xlators/nfs/server/src/nfs-generics.c:268
#12 0x0000000009504b74 in nfs3_rmdir_resume (carg=0xb31e05c) at ../../../../../xlators/nfs/server/src/nfs3.c:3578
#13 0x00000000095132d5 in nfs3_fh_resolve_entry_lookup_cbk (frame=0x541ead0, cookie=0x56d0f50, this=0x56d2860, 
    op_ret=0, op_errno=117, inode=0x9b34218, buf=0xb01e8e8, xattr=0x5625474, postparent=0xb01eb18)
    at ../../../../../xlators/nfs/server/src/nfs3-helpers.c:3581
#14 0x00000000094e8fd7 in nfs_fop_lookup_cbk (frame=0x541ead0, cookie=0x56d0f50, this=0x56d2860, op_ret=0, 
    op_errno=117, inode=0x9b34218, buf=0xb01e8e8, xattr=0x5625474, postparent=0xb01eb18)
    at ../../../../../xlators/nfs/server/src/nfs-fops.c:409
#15 0x00000000092c26e4 in io_stats_lookup_cbk (frame=0x536ff58, cookie=0x536f9f8, this=0x56d0f50, op_ret=0, 
    op_errno=117, inode=0x9b34218, buf=0xb01e8e8, xdata=0x5625474, postparent=0xb01eb18)
    at ../../../../../xlators/debug/io-stats/src/io-stats.c:1479
#16 0x0000000009081c74 in dht_lookup_dir_cbk (frame=0x536f9f8, cookie=0x536fbfc, this=0x56cf9e0, op_ret=0, 
    op_errno=0, inode=0x9b34218, stbuf=0xc4a9d18, xattr=0x5625474, postparent=0xc4a9d88)
    at ../../../../../xlators/cluster/dht/src/dht-common.c:498
#17 0x0000000008e3d5da in afr_lookup_done (frame=0x536fbfc, this=0x56ce420)
    at ../../../../../xlators/cluster/afr/src/afr-common.c:1940
#18 0x0000000008e3e10f in afr_lookup_cbk (frame=0x536fbfc, cookie=0x1, this=0x56ce420, op_ret=0, op_errno=0, 
    inode=0x9b34218, buf=0x7fefffc00, xattr=0x5625474, postparent=0x7fefffb90)
    at ../../../../../xlators/cluster/afr/src/afr-common.c:2175
#19 0x0000000008bba177 in client3_3_lookup_cbk (req=0xb115198, iov=0xb1151d8, count=1, myframe=0x536fe00)
    at ../../../../../xlators/protocol/client/src/client-rpc-fops.c:2616
#20 0x0000000004eab407 in rpc_clnt_handle_reply (clnt=0x6712090, pollin=0x9a96d60)
    at ../../../../rpc/rpc-lib/src/rpc-clnt.c:775
#21 0x0000000004eab776 in rpc_clnt_notify (trans=0x9a841f0, mydata=0x67120c0, event=RPC_TRANSPORT_MSG_RECEIVED, 
    data=0x9a96d60) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:894
#22 0x0000000004ea7f2a in rpc_transport_notify (this=0x9a841f0, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x9a96d60)
    at ../../../../rpc/rpc-lib/src/rpc-transport.c:495
#23 0x0000000007f60a92 in socket_event_poll_in (this=0x9a841f0)
    at ../../../../../rpc/rpc-transport/socket/src/socket.c:1986
#24 0x0000000007f60f2a in socket_event_handler (fd=17, idx=9, data=0x9a841f0, poll_in=1, poll_out=0, poll_err=0)
    at ../../../../../rpc/rpc-transport/socket/src/socket.c:2098
#25 0x0000000004c7b85e in event_dispatch_epoll_handler (event_pool=0x530f690, events=0x6673e90, i=0)
    at ../../../libglusterfs/src/event-epoll.c:384
#26 0x0000000004c7ba43 in event_dispatch_epoll (event_pool=0x530f690)
    at ../../../libglusterfs/src/event-epoll.c:445
#27 0x0000000004c5343d in event_dispatch (event_pool=0x530f690) at ../../../libglusterfs/src/event.c:113
#28 0x000000000040840e in main (argc=11, argv=0x7ff0002b8) at ../../../glusterfsd/src/glusterfsd.c:1883
(gdb) f 4
#4  0x0000000008bc1036 in client3_3_opendir (frame=0x53700b0, this=0x56cb7c0, data=0x7feffed80)
    at ../../../../../xlators/protocol/client/src/client-rpc-fops.c:4226
4226	        GF_ASSERT_AND_GOTO_WITH_ERROR (this->name,
(gdb) l
4221	        if (!uuid_is_null (args->loc->inode->gfid))
4222	                memcpy (req.gfid,  args->loc->inode->gfid, 16);
4223	        else
4224	                memcpy (req.gfid, args->loc->gfid, 16);
4226	        GF_ASSERT_AND_GOTO_WITH_ERROR (this->name,
4227	                                       !uuid_is_null (*((uuid_t*)req.gfid)),
4228	                                       unwind, op_errno, EINVAL);
4230	        conf = this->private;
(gdb) p req.gfid
$1 = '\000' <repeats 15 times>
(gdb) p *args->loc
$2 = {path = 0x9a94b80 "/okpa", name = 0x9a94b81 "okpa", inode = 0x9b34218, parent = 0x9b3405c, 
  gfid = '\000' <repeats 15 times>, pargfid = '\000' <repeats 15 times>, "\001"}
(gdb) f 13
#13 0x00000000095132d5 in nfs3_fh_resolve_entry_lookup_cbk (frame=0x541ead0, cookie=0x56d0f50, this=0x56d2860, 
    op_ret=0, op_errno=117, inode=0x9b34218, buf=0xb01e8e8, xattr=0x5625474, postparent=0xb01eb18)
    at ../../../../../xlators/nfs/server/src/nfs3-helpers.c:3581
3581	        nfs3_call_resume (cs);
(gdb) l
3576	        if (linked_inode) {
3577	                inode_lookup (linked_inode);
3578	                inode_unref (linked_inode);
3579	        }
3580	err:
3581	        nfs3_call_resume (cs);
3582	        return 0;
3583	}
(gdb) p linked_inode->gfid
$3 = "\303\177\244\247\321\060Ko\205dm\305\063", <incomplete sequence \367\270>
(gdb) p cs->resolvedloc.inode->gfid
$4 = '\000' <repeats 15 times>

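After the entry lookup completes, cs->resolvedloc.inode still points to the older inode, whose gfid is null, while linked_inode carries the valid gfid. In the resolvedloc structure, the older inode should be unrefed and linked_inode should be used instead, so that the rmdir that follows (and the opendir it triggers) sees a non-null gfid.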
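
The failure and the proposed remedy can be illustrated with a small standalone C model. This is only a sketch under stated assumptions, not the actual patch: every toy_* type and function below is hypothetical and merely mimics the shapes seen in the gdb session above (a loc holding an inode pointer and its own gfid, and the gfid selection at client-rpc-fops.c:4221-4224). Reference counting is reduced to a plain integer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for the GlusterFS types; NOT the real definitions. */
typedef unsigned char toy_gfid_t[16];

typedef struct toy_inode {
        toy_gfid_t gfid;   /* all zeros in the old inode from the report */
        int        ref;    /* simplified reference count */
} toy_inode_t;

typedef struct toy_loc {
        toy_inode_t *inode;
        toy_gfid_t   gfid;
} toy_loc_t;

static int
toy_gfid_is_null (const unsigned char *gfid)
{
        static const toy_gfid_t zero = {0};
        return memcmp (gfid, zero, sizeof (zero)) == 0;
}

/* Mirrors the selection shown in the listing: prefer the inode's gfid,
 * fall back to the loc's gfid. Returns -1 when the result is still
 * null, which is the condition the assertion in the backtrace rejects. */
static int
toy_pick_gfid (const toy_loc_t *loc, unsigned char *out)
{
        if (!toy_gfid_is_null (loc->inode->gfid))
                memcpy (out, loc->inode->gfid, sizeof (toy_gfid_t));
        else
                memcpy (out, loc->gfid, sizeof (toy_gfid_t));
        return toy_gfid_is_null (out) ? -1 : 0;
}

/* Hypothetical helper: drop the loc's old inode and keep the linked one. */
static void
toy_loc_use_linked (toy_loc_t *loc, toy_inode_t *linked)
{
        toy_inode_t *old = loc->inode;

        assert (linked != NULL);
        linked->ref++;          /* the loc now holds a ref on the linked inode */
        loc->inode = linked;
        if (old != NULL)
                old->ref--;     /* release the ref on the old inode */
}

/* Replays the reported state: old inode and loc both have a null gfid. */
static void
toy_scenario (int *before, int *after)
{
        toy_inode_t old_inode = { {0}, 1 };
        toy_inode_t linked    = { {0xc3, 0x7f, 0xa4, 0xa7}, 1 };
        toy_loc_t   loc       = { &old_inode, {0} };
        toy_gfid_t  req_gfid;

        *before = toy_pick_gfid (&loc, req_gfid);   /* null gfid */
        toy_loc_use_linked (&loc, &linked);
        *after = toy_pick_gfid (&loc, req_gfid);    /* valid gfid */
}
```

In this toy model, toy_scenario reports -1 before the swap (the state captured in the core) and 0 after it, since the linked inode supplies a non-null gfid.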

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Create a replicate volume and start it.
2. Mount the volume via FUSE and create a file and a directory on the mount point.
3. Mount the volume via NFS and run rm -rf * on the NFS mount.

Actual results:
The NFS server crashes, which blocks commands executed on the mount point.

Expected results:
The NFS server should not crash.

Additional info:

gluster v i repl
Volume Name: repl
Type: Replicate
Volume ID: 1f89c060-18b0-4c4a-89b6-1bd2cafc869d
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Brick1: thinkpad:/export1/repl
Brick2: thinkpad:/export2/repl
Comment 1 Vijay Bellur 2012-11-19 05:51:06 EST
CHANGE: http://review.gluster.org/4205 (nfs: after resolving the entry use the linked inode instead of old inode) merged in master by Vijay Bellur (vbellur@redhat.com)

