Bug 763074 (GLUSTER-1342)

Summary: gnfs crash on replicate inode unref
Product: [Community] GlusterFS Reporter: Lakshmipathi G <lakshmipathi>
Component: replicateAssignee: Pavan Vilas Sondur <pavan>
Status: CLOSED DUPLICATE QA Contact:
Severity: high Docs Contact:
Priority: low    
Version: 3.1-alphaCC: gluster-bugs, shehjart, tejas
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: ---
Regression: RTP Mount Type: nfs
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Tejas Bhise 2010-08-12 05:55:20 UTC
Pavan - please look at this and let the components at the lower components on the stack know if they are sending afr wrong values. If yes, then send it to the lower component.

Comment 1 Lakshmipathi G 2010-08-12 07:51:22 UTC
Running mixed test with 4x3 distributed-replicated with 3.1.0qa3- following core found.

(gdb) bt full
#0  0x00002aaaab567343 in pthread_mutex_lock () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x00002aaaaaceef92 in inode_unref (inode=0x2aaab0a2cd20) at inode.c:387
	table = (inode_table_t *) 0x0
#2  0x00002aaaaace0a8c in loc_wipe (loc=0x12dfc60) at xlator.c:1000
No locals.
#3  0x00002aaaac567d1e in afr_local_cleanup (local=0x12dfc30, this=0x639c10) at afr.c:337
	i = 10922
	priv = (afr_private_t *) 0x63ff00
#4  0x00002aaaac57f8df in afr_unlink_done (frame=0x12e81b8, this=0x639c10) at afr-dir-write.c:1723
	__local = (afr_local_t *) 0x12dfc30
	__this = (xlator_t *) 0x639c10
	local = (afr_local_t *) 0x12dfc30
#5  0x00002aaaac5907c7 in afr_unlock_common_cbk (frame=0x12e81b8, cookie=0x2aaab51306d0, this=0x639c10, op_ret=0, op_errno=0) at afr-transaction.c:514
	local = (afr_local_t *) 0x12dfc30
	call_count = 0
#6  0x00002aaaac3484a0 in client3_1_entrylk_cbk (req=0x2aaaae20f1a0, iov=0x2aaaae20f1e0, count=1, myframe=0x2aaab51306d0) at client3_1-fops.c:1134
	fn = (fop_entrylk_cbk_t) 0x2aaaac59074c <afr_unlock_common_cbk>
	_parent = (call_frame_t *) 0x12e81b8
	old_THIS = (xlator_t *) 0x634bb0
	frame = (call_frame_t *) 0x2aaab51306d0
	rsp = {gfs_id = 0, op_ret = 0, op_errno = 0}
	ret = 16
	__FUNCTION__ = "client3_1_entrylk_cbk"
#7  0x00002aaaaaf33044 in rpc_clnt_handle_reply (clnt=0x644f60, pollin=0x2aaab515aaf0) at rpc-clnt.c:672
	conn = (rpc_clnt_connection_t *) 0x644f90
	saved_frame = (struct saved_frame *) 0x2aaaae43c130
	request_info = (rpc_request_info_t *) 0x2aaab51abc60
	ret = 0
	req = (struct rpc_req *) 0x2aaaae20f1a0
	__FUNCTION__ = "rpc_clnt_handle_reply"
#8  0x00002aaaaaf33355 in rpc_clnt_notify (trans=0x645050, mydata=0x644f90, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x2aaab515aaf0) at rpc-clnt.c:781
	conn = (rpc_clnt_connection_t *) 0x644f90
	clnt = (struct rpc_clnt *) 0x644f60
	ret = -1
	req_info = (rpc_request_info_t *) 0x0
	pollin = (rpc_transport_pollin_t *) 0x2aaab515aaf0

tv = {tv_sec = 0, tv_usec = 0}
#9  0x00002aaaaaf3109b in rpc_transport_notify (this=0x645050, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x2aaab515aaf0) at rpc-transport.c:1239
	ret = -1
#10 0x00002aaaacef97c3 in socket_event_poll_in (this=0x645050) at socket.c:1406
	ret = 0
	pollin = (rpc_transport_pollin_t *) 0x2aaab515aaf0
#11 0x00002aaaacef9ab7 in socket_event_handler (fd=12, idx=5, data=0x645050, poll_in=1, poll_out=0, poll_err=0) at socket.c:1512
	this = (rpc_transport_t *) 0x645050
	priv = (socket_private_t *) 0x6454a0
	ret = 0
	__FUNCTION__ = "socket_event_handler"
#12 0x00002aaaaad038e4 in event_dispatch_epoll_handler (event_pool=0x629330, events=0x6489c0, i=3) at event.c:812
	event_data = (struct event_data *) 0x6489e8
	handler = (event_handler_t) 0x2aaaacef9a00 <socket_event_handler>
	data = (void *) 0x645050
	idx = 5
	ret = -1
	__FUNCTION__ = "event_dispatch_epoll_handler"
#13 0x00002aaaaad03ac5 in event_dispatch_epoll (event_pool=0x629330) at event.c:876
	events = (struct epoll_event *) 0x6489c0
	size = 5
	i = 3
	ret = 0
	__FUNCTION__ = "event_dispatch_epoll"
#14 0x00002aaaaad03ddb in event_dispatch (event_pool=0x629330) at event.c:984
	ret = -1
	__FUNCTION__ = "event_dispatch"
#15 0x0000000000405062 in main (argc=7, argv=0x7fff729154e8) at glusterfsd.c:1273
	ctx = (glusterfs_ctx_t *) 0x629010
	ret = 0

Log file
========
#tail -100  gNFS.log
[2010-08-12 03:11:05.311430] T [nfs3-helpers.c:2192:nfs3_file_open_and_resume] nfs-nfsv3: Opening: /NFSalpha28/ip-10-244-167-207/test15/fi2
[2010-08-12 03:11:05.311453] T [nfs3-helpers.c:2172:nfs3_fdcache_getfd] nfs-nfsv3: fd found in state: 21
[2010-08-12 03:11:05.311474] T [nfs3-helpers.c:1880:__nfs3_fdcache_update_entry] nfs-nfsv3: Updating fd: 0x2aaab5193b70
[2010-08-12 03:11:05.311491] T [nfs.c:409:nfs_user_create] nfs: uid: 0, gid 0, gids: 0
[2010-08-12 03:11:05.311543] T [rpc-clnt.c:1164:rpc_clnt_record] : Auth Info: pid: 2963677920, uid: 0, gid: 0, owner: 2963677920
[2010-08-12 03:11:05.311564] T [rpc-clnt.c:1056:rpc_clnt_record_build_header] rpc-clnt: Request fraglen 208, payload: 80, rpc hdr: 128
[2010-08-12 03:11:05.324709] T [rpcsvc.c:2396:rpcsvc_conn_data_poll_in] rpc-service: conn: 0x2aaab510be90, readsize: 4, dataread: 4
[2010-08-12 03:11:05.324732] T [rpcsvc.c:1064:rpcsvc_record_read_complete_fraghdr] rpc-service: Received fragment size: 100
[2010-08-12 03:11:05.324748] T [rpcsvc.c:1071:rpcsvc_record_read_complete_fraghdr] rpc-service: Regular RPC header, remaining: 100
[2010-08-12 03:11:05.324767] T [rpcsvc.c:2396:rpcsvc_conn_data_poll_in] rpc-service: conn: 0x2aaab510be90, readsize: 100, dataread: 100
[2010-08-12 03:11:05.324788] T [rpcsvc.c:2333:rpcsvc_record_update_state] rpc-service: Regular frag
[2010-08-12 03:11:05.324802] T [rpcsvc.c:1158:rpcsvc_record_read_complete_frag] rpc-service: Fragment remaining: 0
[2010-08-12 03:11:05.324837] T [rpcsvc.c:2355:rpcsvc_record_update_state] rpc-service: Full Record Received.
[2010-08-12 03:11:05.324853] T [rpcsvc-auth.c:247:rpcsvc_auth_request_init] rpc-service: Auth handler: AUTH_NULL
[2010-08-12 03:11:05.324867] D [rpcsvc.c:1817:rpcsvc_request_create] rpc-service: RPC XID: 537186f4, Ver: 2, Program: 100003, ProgVers: 3, Proc: 17
[2010-08-12 03:11:05.324883] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service: Actor found: NFS3 - READDIRPLUS
[2010-08-12 03:11:05.324902] D [nfs3-helpers.c:2465:nfs3_log_readdir_call] nfs-nfsv3: XID: 537186f4, READDIRPLUS: args: FH: hashcount 5, xlid 0, gen 5504105043428747165, ino 636420636, dircount: 512, maxcount: 4096
[2010-08-12 03:11:05.324919] T [nfs3.c:3870:nfs3_readdir] nfs-nfsv3: FH to Volume: distribute
[2010-08-12 03:11:05.324934] T [nfs3-helpers.c:2973:nfs3_fh_resolve_inode] nfs-nfsv3: FH needs inode resolution
[2010-08-12 03:11:05.324949] T [nfs3-helpers.c:2477:nfs3_fh_resolve_inode_done] nfs-nfsv3: FH inode resolved
[2010-08-12 03:11:05.324968] T [nfs3-helpers.c:1787:nfs3_dir_open_and_resume] nfs-nfsv3: Opening: /NFSalpha30/ip-10-244-167-207/test10/clients/client6
[2010-08-12 03:11:05.324982] T [nfs3-helpers.c:1790:nfs3_dir_open_and_resume] nfs-nfsv3: fd found in state: ref: 4
[2010-08-12 03:11:05.324997] T [nfs.c:409:nfs_user_create] nfs: uid: 0, gid 0, gids: 0
[2010-08-12 03:11:05.325012] T [nfs-fops.c:506:nfs_fop_readdirp] nfs: readdir
[2010-08-12 03:11:05.325043] T [rpc-clnt.c:1164:rpc_clnt_record] : Auth Info: pid: 0, uid: 0, gid: 0, owner: 0
[2010-08-12 03:11:05.325073] T [rpc-clnt.c:1056:rpc_clnt_record_build_header] rpc-clnt: Request fraglen 172, payload: 44, rpc hdr: 128
[2010-08-12 03:11:05.325112] T [rpcsvc.c:2396:rpcsvc_conn_data_poll_in] rpc-service: conn: 0x2aaab510be90, readsize: 4, dataread: 4
[2010-08-12 03:11:05.325131] T [rpcsvc.c:1064:rpcsvc_record_read_complete_fraghdr] rpc-service: Received fragment size: 65632
[2010-08-12 03:11:05.325147] T [rpcsvc.c:1067:rpcsvc_record_read_complete_fraghdr] rpc-service: Vectored RPC header, remaining: 32
[2010-08-12 03:11:05.325165] T [rpcsvc.c:2396:rpcsvc_conn_data_poll_in] rpc-service: conn: 0x2aaab510be90, readsize: 32, dataread: 32
[2010-08-12 03:11:05.325181] T [rpcsvc.c:2330:rpcsvc_record_update_state] rpc-service: Vectored frag
[2010-08-12 03:11:05.325195] T [rpcsvc.c:1176:rpcsvc_record_read_partial_frag] rpc-service: Fragment remaining: 0
[2010-08-12 03:11:05.325271] T [rpcsvc.c:2293:rpcsvc_handle_vectored_frag] rpc-service: Vectored frag complete
[2010-08-12 03:11:05.325305] T [rpcsvc.c:2072:rpcsvc_update_vectored_barerpc] rpc-service: Vectored RPC verfsz remaining: 8
[2010-08-12 03:11:05.325323] T [rpcsvc.c:2396:rpcsvc_conn_data_poll_in] rpc-service: conn: 0x2aaab510be90, readsize: 8, dataread: 8
[2010-08-12 03:11:05.325338] T [rpcsvc.c:2330:rpcsvc_record_update_state] rpc-service: Vectored frag
[2010-08-12 03:11:05.325353] T [rpcsvc.c:1176:rpcsvc_record_read_partial_frag] rpc-service: Fragment remaining: 0
[2010-08-12 03:11:05.325367] T [rpcsvc.c:2293:rpcsvc_handle_vectored_frag] rpc-service: Vectored frag complete
[2010-08-12 03:11:05.325381] T [rpcsvc.c:2030:rpcsvc_update_vectored_verfsz] rpc-service: Vectored RPC preparing call
[2010-08-12 03:11:05.325396] T [rpcsvc-auth.c:247:rpcsvc_auth_request_init] rpc-service: Auth handler: AUTH_NULL
[2010-08-12 03:11:05.325411] D [rpcsvc.c:1817:rpcsvc_request_create] rpc-service: RPC XID: 547186f4, Ver: 2, Program: 100003, ProgVers: 3, Proc: 7
[2010-08-12 03:11:05.325427] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service: Actor found: NFS3 - WRITE
[2010-08-12 03:11:05.325441] T [rpcsvc.c:1989:rpcsvc_handle_vectored_prep_rpc_call] rpc-service: Vectored RPC proc header remaining: 4
[2010-08-12 03:11:05.325459] T [rpcsvc.c:2396:rpcsvc_conn_data_poll_in] rpc-service: conn: 0x2aaab510be90, readsize: 4, dataread: 4
[2010-08-12 03:11:05.325475] T [rpcsvc.c:2330:rpcsvc_record_update_state] rpc-service: Vectored frag
[2010-08-12 03:11:05.325489] T [rpcsvc.c:1176:rpcsvc_record_read_partial_frag] rpc-service: Fragment remaining: 0
[2010-08-12 03:11:05.325503] T [rpcsvc.c:2293:rpcsvc_handle_vectored_frag] rpc-service: Vectored frag complete
[2010-08-12 03:11:05.325517] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service: Actor found: NFS3 - WRITE
[2010-08-12 03:11:05.325532] T [rpcsvc.c:2137:rpcsvc_handle_vectored_rpc_call] rpc-service: Vectored RPC proc remaining: 32
[2010-08-12 03:11:05.325551] T [rpcsvc.c:2396:rpcsvc_conn_data_poll_in] rpc-service: conn: 0x2aaab510be90, readsize: 32, dataread: 32
[2010-08-12 03:11:05.325566] T [rpcsvc.c:2330:rpcsvc_record_update_state] rpc-service: Vectored frag
[2010-08-12 03:11:05.325580] T [rpcsvc.c:1176:rpcsvc_record_read_partial_frag] rpc-service: Fragment remaining: 0
[2010-08-12 03:11:05.325595] T [rpcsvc.c:2293:rpcsvc_handle_vectored_frag] rpc-service: Vectored frag complete
[2010-08-12 03:11:05.325609] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service: Actor found: NFS3 - WRITE
[2010-08-12 03:11:05.325624] T [rpcsvc.c:2137:rpcsvc_handle_vectored_rpc_call] rpc-service: Vectored RPC proc remaining: 20
[2010-08-12 03:11:05.325642] T [rpcsvc.c:2396:rpcsvc_conn_data_poll_in] rpc-service: conn: 0x2aaab510be90, readsize: 20, dataread: 20
[2010-08-12 03:11:05.325657] T [rpcsvc.c:2330:rpcsvc_record_update_state] rpc-service: Vectored frag
[2010-08-12 03:11:05.325672] T [rpcsvc.c:1176:rpcsvc_record_read_partial_frag] rpc-service: Fragment remaining: 0
[2010-08-12 03:11:05.325686] T [rpcsvc.c:2293:rpcsvc_handle_vectored_frag] rpc-service: Vectored frag complete
[2010-08-12 03:11:05.325711] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service: Actor found: NFS3 - WRITE
[2010-08-12 03:11:05.325728] T [rpcsvc.c:2133:rpcsvc_handle_vectored_rpc_call] rpc-service: Vectored RPC buf remaining: 65536
[2010-08-12 03:11:05.325774] T [rpcsvc.c:2396:rpcsvc_conn_data_poll_in] rpc-service: conn: 0x2aaab510be90, readsize: 65536, dataread: 34652
[2010-08-12 03:11:05.325791] T [rpcsvc.c:2330:rpcsvc_record_update_state] rpc-service: Vectored frag
[2010-08-12 03:11:05.325805] T [rpcsvc.c:1176:rpcsvc_record_read_partial_frag] rpc-service: Fragment remaining: 30884
[2010-08-12 03:11:05.325956] T [rpcsvc.c:2396:rpcsvc_conn_data_poll_in] rpc-service: conn: 0x2aaab510be90, readsize: 30884, dataread: 30884
[2010-08-12 03:11:05.325975] T [rpcsvc.c:2330:rpcsvc_record_update_state] rpc-service: Vectored frag
[2010-08-12 03:11:05.325989] T [rpcsvc.c:1176:rpcsvc_record_read_partial_frag] rpc-service: Fragment remaining: 0
[2010-08-12 03:11:05.326003] T [rpcsvc.c:2293:rpcsvc_handle_vectored_frag] rpc-service: Vectored frag complete
[2010-08-12 03:11:05.326017] T [rpcsvc.c:2235:rpcsvc_update_vectored_state] rpc-service: Vectored RPC vector read
[2010-08-12 03:11:05.326031] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service: Actor found: NFS3 - WRITE
[2010-08-12 03:11:05.326047] D [nfs3-helpers.c:2348:nfs3_log_rw_call] nfs-nfsv3: XID: 547186f4, WRITE: args: FH: hashcount 4, xlid 0, gen 5504105043428745476, ino 407966271, offset: 77668352,  count: 65536, UNSTABLE
[2010-08-12 03:11:05.326064] T [nfs3.c:1837:nfs3_write] nfs-nfsv3: FH to Volume: distribute
[2010-08-12 03:11:05.326079] T [nfs3-helpers.c:2973:nfs3_fh_resolve_inode] nfs-nfsv3: FH needs inode resolution
[2010-08-12 03:11:05.326093] T [nfs3-helpers.c:2477:nfs3_fh_resolve_inode_done] nfs-nfsv3: FH inode resolved
[2010-08-12 03:11:05.326113] T [nfs3-helpers.c:2192:nfs3_file_open_and_resume] nfs-nfsv3: Opening: /NFSalpha28/ip-10-244-167-207/test15/fi2
[2010-08-12 03:11:05.326128] T [nfs3-helpers.c:2172:nfs3_fdcache_getfd] nfs-nfsv3: fd found in state: 23
[2010-08-12 03:11:05.326142] T [nfs3-helpers.c:1880:__nfs3_fdcache_update_entry] nfs-nfsv3: Updating fd: 0x2aaab5193b70
[2010-08-12 03:11:05.326157] T [nfs.c:409:nfs_user_create] nfs: uid: 0, gid 0, gids: 0
[2010-08-12 03:11:05.326186] T [rpc-clnt.c:1164:rpc_clnt_record] : Auth Info: pid: 2963531568, uid: 0, gid: 0, owner: 2963531568
[2010-08-12 03:11:05.326204] T [rpc-clnt.c:1056:rpc_clnt_record_build_header] rpc-clnt: Request fraglen 208, payload: 80, rpc hdr: 128
/lib64/libc.so.6[0x2aaaab7aaf30]
/lib64/libpthread.so.0(pthread_mutex_lock+0x13)[0x2aaaab567343]
/opt/glusterfs/3.1.0qa3//lib/libglusterfs.so.0(inode_unref+0x28)[0x2aaaaaceef92]
/opt/glusterfs/3.1.0qa3//lib/libglusterfs.so.0(loc_wipe+0x26)[0x2aaaaace0a8c]
/opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/xlator/cluster/replicate.so(afr_local_cleanup+0x59)[0x2aaaac567d1e]
/opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/xlator/cluster/replicate.so(afr_unlink_done+0x7e)[0x2aaaac57f8df]
/opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/xlator/cluster/replicate.so(afr_unlock_common_cbk+0x7b)[0x2aaaac5907c7]
/opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/xlator/protocol/client.so(client3_1_entrylk_cbk+0x1e0)[0x2aaaac3484a0]
/opt/glusterfs/3.1.0qa3//lib/libgfrpc.so.0(rpc_clnt_handle_reply+0x1f1)[0x2aaaaaf33044]
/opt/glusterfs/3.1.0qa3//lib/libgfrpc.so.0(rpc_clnt_notify+0x270)[0x2aaaaaf33355]
/opt/glusterfs/3.1.0qa3//lib/libgfrpc.so.0(rpc_transport_notify+0xd8)[0x2aaaaaf3109b]
/opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/rpc-transport/socket.so(socket_event_poll_in+0x4b)[0x2aaaacef97c3]
/opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/rpc-transport/socket.so(socket_event_handler+0xb7)[0x2aaaacef9ab7]
/opt/glusterfs/3.1.0qa3//lib/libglusterfs.so.0[0x2aaaaad038e4]
/opt/glusterfs/3.1.0qa3//lib/libglusterfs.so.0[0x2aaaaad03ac5]
/opt/glusterfs/3.1.0qa3//lib/libglusterfs.so.0(event_dispatch+0x73)[0x2aaaaad03ddb]
/opt/glusterfs/3.1.0qa3/sbin/glusterfsd(main+0xec)[0x405062]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x2aaaab798074]
/opt/glusterfs/3.1.0qa3/sbin/glusterfsd[0x4027a9]
---------

Comment 2 Shehjar Tikoo 2010-08-17 10:41:10 UTC

*** This bug has been marked as a duplicate of bug 1295 ***