Bug 763027 (GLUSTER-1295)

Summary: gnfs core dump in afr_local_cleanup on rmdir
Product: [Community] GlusterFS Reporter: Lakshmipathi G <lakshmipathi>
Component: nfsAssignee: Shehjar Tikoo <shehjart>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: high Docs Contact:
Priority: low    
Version: 3.1-alphaCC: gluster-bugs
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: ---
Regression: RTP Mount Type: nfs
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Lakshmipathi G 2010-08-06 05:25:36 UTC
While running nfsalpha tests gnfs core dumped.

Core was generated by `/opt/glusterfs/3.1.0qa3/sbin/glusterfsd -f /opt/users/laks/nfs/alpha/test-tcp.v'.
Program terminated with signal 6, Aborted.
#0  0x00002aaaab791ec5 in raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64	  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt full 
#0  0x00002aaaab791ec5 in raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
	pid = <value optimized out>
	selftid = 0
#1  0x00002aaaab793970 in abort () at abort.c:88
	act = {__sigaction_handler = {sa_handler = 0x2aaaab88148e <_libc_intl_domainname>, sa_sigaction = 0x2aaaab88148e <_libc_intl_domainname>}, sa_mask = {
    __val = {46912510634709, 46912512927808, 46912498577789, 140737245675680, 46912498577658, 140737245675920, 46912509846331, 206158430232, 140737245675936, 
      140737245675712, 46912509762648, 206158430256, 140737245675960, 46912646924848, 48, 46912646924928}}, sa_flags = -1417135479, sa_restorer = 0x7ffff18929f5}
	sigs = {__val = {32, 0 <repeats 15 times>}}
#2  0x00002aaaab78b11f in __assert_fail (assertion=0x2aaaaad0317d "inode->ref", file=0x2aaaaad030fa "inode.c", line=350, function=0x2aaaaad03300 "__inode_unref")
    at assert.c:78
	buf = 0x2aaab3a7ca30 "\200"
	errstr = "Unexpected error.\n"
#3  0x00002aaaaace8584 in __inode_unref (inode=0x1dcdcf0) at inode.c:350
	__PRETTY_FUNCTION__ = "__inode_unref"
#4  0x00002aaaaace9194 in inode_unref (inode=0x1dcdcf0) at inode.c:389
	table = (inode_table_t *) 0x17c8540
#5  0x00002aaaaacdb312 in loc_wipe (loc=0x2aaab74260c0) at xlator.c:1000
No locals.
#6  0x00002aaaac54ec8a in afr_local_cleanup (local=0x2aaab7426090, this=0x631d80) at afr.c:337
	i = <value optimized out>
#7  0x00002aaaac55ded3 in afr_rmdir_done (frame=<value optimized out>, this=<value optimized out>) at afr-dir-write.c:1938
	ctx = (glusterfs_ctx_t *) 0x628010
	__local = (afr_local_t *) 0x2aaab7426090
	__this = (xlator_t *) 0x631d80
#8  0x00002aaaac56bc94 in afr_unlock_common_cbk (frame=0x2aaab7434488, cookie=<value optimized out>, this=0x631d80, op_ret=<value optimized out>, 
    op_errno=<value optimized out>) at afr-transaction.c:514
	local = (afr_local_t *) 0x2aaab7426090
#9  0x00002aaaac32f924 in client3_1_entrylk_cbk (req=<value optimized out>, iov=<value optimized out>, count=<value optimized out>, myframe=0x2aaab3a74bc0)
    at client3_1-fops.c:1134
	_parent = (call_frame_t *) 0x2aaab7434488
	old_THIS = (xlator_t *) 0x62f720
	rsp = {gfs_id = 0, op_ret = 0, op_errno = 0}
	__FUNCTION__ = "client3_1_entrylk_cbk"
#10 0x00002aaaaaf1ee5d in rpc_clnt_handle_reply (clnt=<value optimized out>, pollin=<value optimized out>) at rpc-clnt.c:672
	conn = (rpc_clnt_connection_t *) 0x637b40
	saved_frame = (struct saved_frame *) 0x2aaaada06960
	request_info = (rpc_request_info_t *) 0x2aaab3a51ac0
---Type <return> to continue, or q <return> to quit--- 
	ret = 0
	req = (struct rpc_req *) 0x2aaaad90da20
	__FUNCTION__ = "rpc_clnt_handle_reply"
#11 0x00002aaaaaf1f042 in rpc_clnt_notify (trans=<value optimized out>, mydata=0x637b40, event=<value optimized out>, data=0xffffffffffffffff) at rpc-clnt.c:781
	clnt = (struct rpc_clnt *) 0x637b10
	ret = <value optimized out>
	tv = {tv_sec = 140737245676512, tv_usec = 6521152}
#12 0x00002aaaaaf1a988 in rpc_transport_notify (this=0x0, event=RPC_TRANSPORT_MSG_SENT, data=0xffffffffffffffff) at rpc-transport.c:1239
	ret = <value optimized out>
#13 0x00002aaaacc8ad8f in socket_event_poll_in (this=0x637cd0) at socket.c:1406
	ret = 0
	pollin = (rpc_transport_pollin_t *) 0x2aaab3a4fe40
#14 0x00002aaaacc8ae91 in socket_event_handler (fd=<value optimized out>, idx=2, data=0x637cd0, poll_in=1, poll_out=0, poll_err=0) at socket.c:1512
	this = (rpc_transport_t *) 0x3792
	priv = (socket_private_t *) 0x638060
	ret = 0
	__FUNCTION__ = "socket_event_handler"
#15 0x00002aaaaacf5ff7 in event_dispatch_epoll (event_pool=0x628330) at event.c:812
	events = (struct epoll_event *) 0x638a70
	i = 0
	ret = 1
	__FUNCTION__ = "event_dispatch_epoll"
#16 0x0000000000403d3b in main (argc=7, argv=0x7ffff1890b68) at glusterfsd.c:1273
	ctx = (glusterfs_ctx_t *) 0x628010
	ret = 0
(gdb)
log file
---------
D [nfs3-helpers.c:2227:nfs3_log_common_call] nfs-nfsv3: XID: b297b4bc, GETATTR: args: FH: hashcount 7, xlid 0, gen 5501895476029224925, ino 155063608
[2010-08-05 22:30:47.680686] D [nfs3-helpers.c:2412:nfs3_log_newfh_res] nfs-nfsv3: XID: a997b4bc, CREATE: NFS: 0(Call completed successfully.), POSIX: 0(Success), FH: hashcount 8, xlid 0, gen 5501895476029228936, ino 155191868
[2010-08-05 22:30:47.681578] D [nfs3-helpers.c:2437:nfs3_log_readdirp_res] nfs-nfsv3: XID: ddd1f61a, READDIRPLUS: NFS: 0(Call completed successfully.), POSIX: 0(Success), dircount: 512, maxcount: 4096, cverf: 46912588667328, is_eof: 0
[2010-08-05 22:30:47.681636] D [afr-self-heal-data.c:984:afr_sh_data_lock_cbk] mirror: locking of /nfsalpha/ip-10-212-70-131/test7/linux-2.6.35/drivers/gpu/drm/nouveau/nv04_display.c on child 0 failed: Success
[2010-08-05 22:30:47.683998] D [rpcsvc.c:1817:rpcsvc_request_create] rpc-service: RPC XID: ded1f61a, Ver: 2, Program: 100003, ProgVers: 3, Proc: 17
[2010-08-05 22:30:47.684032] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service: Actor found: NFS3 - READDIRPLUS
[2010-08-05 22:30:47.684050] D [nfs3-helpers.c:2465:nfs3_log_readdir_call] nfs-nfsv3: XID: ded1f61a, READDIRPLUS: args: FH: hashcount 5, xlid 0, gen 5501895476029098273, ino 141560788, dircount: 512, maxcount: 4096
[2010-08-05 22:30:47.707235] D [rpcsvc.c:1817:rpcsvc_request_create] rpc-service: RPC XID: b397b4bc, Ver: 2, Program: 100003, ProgVers: 3, Proc: 4
[2010-08-05 22:30:47.707576] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service: Actor found: NFS3 - ACCESS
[2010-08-05 22:30:47.707600] D [nfs3-helpers.c:2227:nfs3_log_common_call] nfs-nfsv3: XID: b397b4bc, ACCESS: args: FH: hashcount 6, xlid 0, gen 5501895476029228935, ino 155191864
[2010-08-05 22:30:47.707686] D [rpcsvc.c:1817:rpcsvc_request_create] rpc-service: RPC XID: b497b4bc, Ver: 2, Program: 100003, ProgVers: 3, Proc: 2
[2010-08-05 22:30:47.707708] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service: Actor found: NFS3 - SETATTR
[2010-08-05 22:30:47.707730] D [nfs3-helpers.c:2227:nfs3_log_common_call] nfs-nfsv3: XID: b497b4bc, SETATTR: args: FH: hashcount 8, xlid 0, gen 5501895476029228936, ino 155191868
pending frames:

patchset: v3.1.0qa3
signal received: 6
time of crash: 2010-08-05 22:30:47
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.1.0qa3
[2010-08-05 22:30:48.207631] D [name.c:155:client_fill_address_family] ec2-72-44-60-35.compute-1.amazonaws.com-1: address-family not specified, guessing it to be inet/inet6
/lib64/libc.so.6[0x2aaaab791f30]
/lib64/libc.so.6(gsignal+0x35)[0x2aaaab791ec5]
/lib64/libc.so.6(abort+0x110)[0x2aaaab793970]
/lib64/libc.so.6(__assert_fail+0xef)[0x2aaaab78b11f]
/opt/glusterfs/3.1.0qa3//lib/libglusterfs.so.0[0x2aaaaace8584]
/opt/glusterfs/3.1.0qa3//lib/libglusterfs.so.0(inode_unref+0x24)[0x2aaaaace9194]
/opt/glusterfs/3.1.0qa3//lib/libglusterfs.so.0(loc_wipe+0x12)[0x2aaaaacdb312]
/opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/xlator/cluster/replicate.so(afr_local_cleanup+0x3a)[0x2aaaac54ec8a]
/opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/xlator/cluster/replicate.so(afr_rmdir_done+0x133)[0x2aaaac55ded3]
/opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/xlator/cluster/replicate.so(afr_unlock_common_cbk+0x54)[0x2aaaac56bc94]
/opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/xlator/protocol/client.so(client3_1_entrylk_cbk+0xb4)[0x2aaaac32f924]
/opt/glusterfs/3.1.0qa3//lib/libgfrpc.so.0(rpc_clnt_handle_reply+0x8d)[0x2aaaaaf1ee5d]
/opt/glusterfs/3.1.0qa3//lib/libgfrpc.so.0(rpc_clnt_notify+0x82)[0x2aaaaaf1f042]
/opt/glusterfs/3.1.0qa3//lib/libgfrpc.so.0(rpc_transport_notify+0x28)[0x2aaaaaf1a988]
/opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/rpc-transport/socket.so(socket_event_poll_in+0x3f)[0x2aaaacc8ad8f]
/opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/rpc-transport/socket.so(socket_event_handler+0xe1)[0x2aaaacc8ae91]
/opt/glusterfs/3.1.0qa3//lib/libglusterfs.so.0[0x2aaaaacf5ff7]
/opt/glusterfs/3.1.0qa3/sbin/glusterfsd(main+0x32b)[0x403d3b]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x2aaaab77f074]
/opt/glusterfs/3.1.0qa3/sbin/glusterfsd[0x402839]
---------

Comment 1 Shehjar Tikoo 2010-08-17 10:41:10 UTC
*** Bug 1342 has been marked as a duplicate of this bug. ***

Comment 2 Shehjar Tikoo 2010-09-18 06:16:31 UTC
Havent seen this crash in many days of compilebench tests. Closing. Most probably related to 1619.