Hide Forgot
part of log file can be found under /share/tickets/1315/gnfs.tar
while running mixed (fileop) tests on 4glfs server + 4 gnfs server and 8 clients with glusterfs-3.1.0qa3 - gnfs server crashed. (gdb) bt full #0 0x00002aaaab8786ab in xdrmem_getint32 () from /lib64/libc.so.6 No symbol table info available. #1 0x00002aaaab87e97c in xdr_u_quad_t () from /lib64/libc.so.6 No symbol table info available. #2 0x00002aaaab14560f in xdr_gf_common_rsp (xdrs=0x7fffabb00570, objp=0x7fffabb00610) at glusterfs3-xdr.c:1659 No locals. #3 0x00002aaaaaf34d16 in xdr_to_generic (inmsg={iov_base = 0x100000000000, iov_len = 16}, args=0x7fffabb00610, proc=0x2aaaab1455f2 <xdr_gf_common_rsp>) at rpc-common.c:40 xdr = {x_op = XDR_DECODE, x_ops = 0x2aaaabac8880, x_public = 0x2aaa06872d12 <Address 0x2aaa06872d12 out of bounds>, x_private = 0x100000000004 <Address 0x100000000004 out of bounds>, x_base = 0x100000000000 <Address 0x100000000000 out of bounds>, x_handy = 12} ret = -1 #4 0x00002aaaab147ae7 in xdr_to_common_rsp (outmsg={iov_base = 0x100000000000, iov_len = 16}, rsp=0x7fffabb00610) at glusterfs3.c:1126 No locals. #5 0x00002aaaac34832e in client3_1_entrylk_cbk (req=0x2aaaad98d3e0, iov=0x2aaaad98d420, count=1, myframe=0x2f907d20) at client3_1-fops.c:1124 frame = (call_frame_t *) 0x2f907d20 rsp = {gfs_id = 0, op_ret = 0, op_errno = 0} ret = 0 __FUNCTION__ = "client3_1_entrylk_cbk" #6 0x00002aaaaaf33044 in rpc_clnt_handle_reply (clnt=0x638ba0, pollin=0x2f9170f0) at rpc-clnt.c:672 conn = (rpc_clnt_connection_t *) 0x638bd0 saved_frame = (struct saved_frame *) 0x2aaaada4b2b0 request_info = (rpc_request_info_t *) 0x2f913e50 ret = 0 req = (struct rpc_req *) 0x2aaaad98d3e0 __FUNCTION__ = "rpc_clnt_handle_reply" #7 0x00002aaaaaf33355 in rpc_clnt_notify (trans=0x638d30, mydata=0x638bd0, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x2f9170f0) at rpc-clnt.c:781 conn = (rpc_clnt_connection_t *) 0x638bd0 clnt = (struct rpc_clnt *) 0x638ba0 ret = -1 req_info = (rpc_request_info_t *) 0x0 pollin = (rpc_transport_pollin_t *) 0x2f9170f0 tv = {tv_sec = 0, tv_usec = 0} #8 0x00002aaaaaf3109b in rpc_transport_notify (this=0x638d30, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x2f9170f0) at rpc-transport.c:1239 ret = -1 #9 0x00002aaaaccc27c3 in socket_event_poll_in (this=0x638d30) at socket.c:1406 ret = 0 pollin = (rpc_transport_pollin_t *) 0x2f9170f0 #10 0x00002aaaaccc2ab7 in socket_event_handler (fd=9, idx=1, data=0x638d30, poll_in=1, poll_out=0, poll_err=0) at socket.c:1512 this = (rpc_transport_t *) 0x638d30 priv = (socket_private_t *) 0x6390c0 ret = 0 __FUNCTION__ = "socket_event_handler" #11 0x00002aaaaad038e4 in event_dispatch_epoll_handler (event_pool=0x629330, events=0x639c00, i=0) at event.c:812 event_data = (struct event_data *) 0x639c04 handler = (event_handler_t) 0x2aaaaccc2a00 <socket_event_handler> data = (void *) 0x638d30 idx = 1 ret = -1 __FUNCTION__ = "event_dispatch_epoll_handler" #12 0x00002aaaaad03ac5 in event_dispatch_epoll (event_pool=0x629330) at event.c:876 events = (struct epoll_event *) 0x639c00 size = 1 i = 0 ret = 1 __FUNCTION__ = "event_dispatch_epoll" #13 0x00002aaaaad03ddb in event_dispatch (event_pool=0x629330) at event.c:984 ret = -1 __FUNCTION__ = "event_dispatch" #14 0x0000000000405062 in main (argc=7, argv=0x7fffabb00b58) at glusterfsd.c:1273 ctx = (glusterfs_ctx_t *) 0x629010 ret = 0 log file- ================================ [2010-08-09 10:29:37.226035] T [nfs3.c:1039:nfs3_lookup] nfs-nfsv3: FH to Volume: mirror [2010-08-09 10:29:37.226050] T [nfs3-helpers.c:2940:nfs3_fh_resolve_entry_hard] nfs-nfsv3: FH hard resolution: ino: 205328953, gen: 5502220922886637152, entry: fileop_L1_58_L2_17, hashidx: 0 [2010-08-09 10:29:37.226082] T [nfs.c:409:nfs_user_create] nfs: uid: 0, gid 0, gids: 0 [2010-08-09 10:29:37.226098] T [nfs-fops.c:279:nfs_fop_lookup] nfs: Lookup: /nfsalpha2/ip-10-212-117-143/test5/fileop_L1_58/fileop_L1_58_L2_17 [2010-08-09 10:29:37.226144] T [rpc-clnt.c:1164:rpc_clnt_record] : Auth Info: pid: 0, uid: 0, gid: 0, owner: 0 [2010-08-09 10:29:37.226162] T [rpc-clnt.c:1056:rpc_clnt_record_build_header] rpc-clnt: Request fraglen 640, payload: 512, rpc hdr: 128 [2010-08-09 10:29:37.227409] T [rpc-clnt.c:596:rpc_clnt_reply_init] rpc-clnt: RPC XID: 109522193 Program: GlusterFS 3.1, ProgVers: 310, Proc: 27 [2010-08-09 10:29:37.227446] D [afr.c:533:afr_lookup_collect_xattr] mirror: entry self-heal is pending for /nfsalpha2/ip-10-212-117-143/test5/fileop_L1_58/fileop_L1_58_L2_17. [2010-08-09 10:29:37.227468] N [afr.c:688:afr_lookup_done] mirror: background entry self-heal triggered. path: /nfsalpha2/ip-10-212-117-143/test5/fileop_L1_58/fileop_L1_58_L2_17 [2010-08-09 10:29:37.227484] T [afr-self-heal-common.c:1574:afr_self_heal] mirror: performing self heal on /nfsalpha2/ip-10-212-117-143/test5/fileop_L1_58/fileop_L1_58_L2_17 (metadata=0 data=0 entry=1) [2010-08-09 10:29:37.227508] T [afr-self-heal-common.c:1622:afr_self_heal] mirror: proceeding to metadata check on /nfsalpha2/ip-10-212-117-143/test5/fileop_L1_58/fileop_L1_58_L2_17 [2010-08-09 10:29:37.227523] T [afr-self-heal-common.c:781:afr_sh_missing_entries_done] mirror: proceeding to metadata check on /nfsalpha2/ip-10-212-117-143/test5/fileop_L1_58/fileop_L1_58_L2_17 [2010-08-09 10:29:37.227538] T [afr-self-heal-metadata.c:95:afr_sh_metadata_done] mirror: proceeding to entry check on /nfsalpha2/ip-10-212-117-143/test5/fileop_L1_58/fileop_L1_58_L2_17 [2010-08-09 10:29:37.227553] T [afr-self-heal-entry.c:2514:afr_sh_entry_lock] mirror: locking /nfsalpha2/ip-10-212-117-143/test5/fileop_L1_58/fileop_L1_58_L2_17 on subvolume ec2-204-236-197-168.compute-1.amazonaws.com-1 [2010-08-09 10:29:37.227573] T [rpc-clnt.c:1164:rpc_clnt_record] : Auth Info: pid: 0, uid: 0, gid: 0, owner: 0 [2010-08-09 10:29:37.227590] T [rpc-clnt.c:1056:rpc_clnt_record_build_header] rpc-clnt: Request fraglen 256, payload: 128, rpc hdr: 128 [2010-08-09 10:29:37.227872] T [rpc-clnt.c:596:rpc_clnt_reply_init] rpc-clnt: RPC XID: 109522194 Program: GlusterFS 3.1, ProgVers: 310, Proc: 31 pending frames: patchset: v3.1.0qa3 signal received: 11 time of crash: 2010-08-09 10:29:37 configuration details: argp 1 backtrace 1 dlfcn 1 fdatasync 1 libpthread 1 llistxattr 1 setfsid 1 spinlock 1 epoll.h 1 xattr.h 1 st_atim.tv_nsec 1 package-string: glusterfs 3.1.0qa3 /lib64/libc.so.6[0x2aaaab7aaf30] /lib64/libc.so.6[0x2aaaab8786ab] /lib64/libc.so.6(xdr_int64_t+0x8c)[0x2aaaab87e97c] /opt/glusterfs/3.1.0qa3//lib/libgfxdr.so.0(xdr_gf_common_rsp+0x1d)[0x2aaaab14560f] /opt/glusterfs/3.1.0qa3//lib/libgfrpc.so.0(xdr_to_generic+0x75)[0x2aaaaaf34d16] /opt/glusterfs/3.1.0qa3//lib/libgfxdr.so.0(xdr_to_common_rsp+0x35)[0x2aaaab147ae7] /opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/xlator/protocol/client.so(client3_1_entrylk_cbk+0x6e)[0x2aaaac34832e] /opt/glusterfs/3.1.0qa3//lib/libgfrpc.so.0(rpc_clnt_handle_reply+0x1f1)[0x2aaaaaf33044] /opt/glusterfs/3.1.0qa3//lib/libgfrpc.so.0(rpc_clnt_notify+0x270)[0x2aaaaaf33355] /opt/glusterfs/3.1.0qa3//lib/libgfrpc.so.0(rpc_transport_notify+0xd8)[0x2aaaaaf3109b] /opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/rpc-transport/socket.so(socket_event_poll_in+0x4b)[0x2aaaaccc27c3] /opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/rpc-transport/socket.so(socket_event_handler+0xb7)[0x2aaaaccc2ab7] /opt/glusterfs/3.1.0qa3//lib/libglusterfs.so.0[0x2aaaaad038e4] /opt/glusterfs/3.1.0qa3//lib/libglusterfs.so.0[0x2aaaaad03ac5] /opt/glusterfs/3.1.0qa3//lib/libglusterfs.so.0(event_dispatch+0x73)[0x2aaaaad03ddb] /opt/glusterfs/3.1.0qa3/sbin/glusterfsd(main+0xec)[0x405062] /lib64/libc.so.6(__libc_start_main+0xf4)[0x2aaaab798074] /opt/glusterfs/3.1.0qa3/sbin/glusterfsd[0x4027a9] --------- ================================
here check frame 4 #4 0x00002aaaab147ae7 in xdr_to_common_rsp (outmsg={iov_base = 0x100000000000, Clearly shows that outmsg is corrupted (or assigned)
While testing fileop with 4x3 distributed replicate with 3.1.0qa5 It passed first time when i ran with lesser depth value /opt/qa/tools/fileop -f 30 -t But ran again with higher depth it failed and gnfs crashes. Log file can be found /share/tickets/1315/1315.log core file- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (gdb) bt full #0 0x00002aaaab568343 in pthread_mutex_lock () from /lib64/libpthread.so.0 No symbol table info available. #1 0x00002aaaaacee55d in inode_ref (inode=0x4514a70) at inode.c:478 table = (inode_table_t *) 0x0 #2 0x00002aaaaace0463 in loc_copy (dst=0x4514a78, src=0x2aaaaac4b408) at xlator.c:1023 ret = -1 #3 0x00002aaaacc8c3ce in dht_lookup (frame=0x2aaaac3354a0, this=0x63d050, loc=0x2aaaaac4b408, xattr_req=0x0) at dht-common.c:815 subvol = (xlator_t *) 0x0 hashed_subvol = (xlator_t *) 0x0 cached_subvol = (xlator_t *) 0x0 local = (dht_local_t *) 0x4514a70 conf = (dht_conf_t *) 0x63f920 ret = -1 op_errno = -1 layout = (dht_layout_t *) 0x0 i = 0 call_cnt = 0 __FUNCTION__ = "dht_lookup" #4 0x00002aaaacec3518 in nfs_fop_lookup (nfsx=0x63e080, xl=0x63d050, nfu=0x40affd10, loc=0x2aaaaac4b408, cbk=0x2aaaacee3cc8 <nfs3_fh_resolve_entry_lookup_cbk>, local=0x2aaaaac4b020) at nfs-fops.c:284 _new = (call_frame_t *) 0x2aaaac3354a0 old_THIS = (xlator_t *) 0x2aaaaaf22560 tmp_cbk = (fop_lookup_cbk_t) 0x2aaaacec2e21 <nfs_fop_lookup_cbk> frame = (call_frame_t *) 0x2aaaac536e78 ret = -14 nfl = (struct nfs_fop_local *) 0x2aaaaab25280 __FUNCTION__ = "nfs_fop_lookup" #5 0x00002aaaacecb58a in nfs_lookup (nfsx=0x63e080, xl=0x63d050, nfu=0x40affd10, pathloc=0x2aaaaac4b408, cbk=0x2aaaacee3cc8 <nfs3_fh_resolve_entry_lookup_cbk>, local=0x2aaaaac4b020) at nfs-generics.c:86 ret = -14 #6 0x00002aaaacee5097 in nfs3_fh_resolve_entry_hard (cs=0x2aaaaac4b020) at nfs3-helpers.c:2949 ret = -2 nfu = {uid = 0, gids = {0 <repeats 17 times>}, ngrps = 1} __FUNCTION__ = "nfs3_fh_resolve_entry_hard" #7 0x00002aaaacee5232 in nfs3_fh_resolve_entry (cs=0x2aaaaac4b020) at nfs3-helpers.c:2995 ret = -14 #8 0x00002aaaacee537b in nfs3_fh_resolve_and_resume (cs=0x2aaaaac4b020, fh=0x40affed0, entry=0x40afff10 "fileop_dir_29_8_13", Missing separate debuginfos, use: debuginfo-install gcc.x86_64 glibc.x86_64 resum_fn=0x2aaaaced5838 <nfs3_mkdir_resume>) at nfs3-helpers.c:3022 ret = -14 #9 0x00002aaaaced5d68 in nfs3_mkdir (req=0x67f230, dirfh=0x40affed0, name=0x40afff10 "fileop_dir_29_8_13", sattr=0x40affe88) at nfs3.c:2439 vol = (xlator_t *) 0x63d050 stat = NFS3ERR_SERVERFAULT ret = -14 nfs3 = (struct nfs3_state *) 0x65a270 cs = (nfs3_call_state_t *) 0x2aaaaac4b020 __FUNCTION__ = "nfs3_mkdir" #10 0x00002aaaaced5ece in nfs3svc_mkdir (req=0x67f230) at nfs3.c:2473 name = "fileop_dir_29_8_13\000\000\000\000\000\000\000 �@\000\000\000\000\000\020\000\000\000\000\000\000,\017\177��*\000\000\000\020\000\000\000\000\000\000\200٬��*\000\000`\002�@", '\0' <repeats 21 times>, " �@\000\000\000\000\000\020\000\000\000\000\000\000,\017\177��*\000\000����", '\0' <repeats 28 times>, "P�b", '\0' <repeats 13 times>, "`\002�@\000\000\000\0001\017Ϊ�*\000\000�Y���*\000\000\034U���*\000\000\000\000\000\000\000\000\000\000��P\004\000\000\000\000p\237P\004\000\000\000\000P\000�@\000\000\000\000P\000�@\000\000\000\000"... dirfh = {ident = ":O", hashcount = 2, xlatorid = 0, gen = 5505199980916578812, ino = 634077276, entryhash = {257, 62611, 0 <repeats 19 times>}} args = {where = {dir = {data = {data_len = 26, data_val = 0x40affed0 ":O\002"}}, name = 0x40afff10 "fileop_dir_29_8_13"}, attributes = {mode = {set_it = 1, set_mode3_u = {mode = 493}}, uid = {set_it = 0, set_uid3_u = {uid = 0}}, gid = {set_it = 0, set_gid3_u = {gid = 0}}, size = {set_it = 0, set_size3_u = { size = 0}}, atime = {set_it = DONT_CHANGE, set_atime_u = {atime = {seconds = 0, nseconds = 0}}}, mtime = {set_it = DONT_CHANGE, set_mtime_u = {mtime = { seconds = 0, nseconds = 0}}}}} ret = -1 __FUNCTION__ = "nfs3svc_mkdir" #11 0x00002aaaaceebaec in nfs_rpcsvc_handle_rpc_call (conn=0x64b450) at ../../../../xlators/nfs/lib/src//rpcsvc.c:1901 actor = (rpcsvc_actor_t *) 0x2aaaad104140 req = (rpcsvc_request_t *) 0x67f230 ret = -1 #12 0x00002aaaaceec7ee in nfs_rpcsvc_record_update_state (conn=0x64b450, dataread=0) at ../../../../xlators/nfs/lib/src//rpcsvc.c:2383 rs = (rpcsvc_record_state_t *) 0x64b468 svc = (rpcsvc_t *) 0x0 __FUNCTION__ = "nfs_rpcsvc_record_update_state" #13 0x00002aaaaceec944 in nfs_rpcsvc_conn_data_poll_in (conn=0x64b450) at ../../../../xlators/nfs/lib/src//rpcsvc.c:2426 dataread = 124 readsize = 124 readaddr = 0x2aaaac134000 "���g" ret = -1 __FUNCTION__ = "nfs_rpcsvc_conn_data_poll_in" #14 0x00002aaaaceecd68 in nfs_rpcsvc_conn_data_handler (fd=21, idx=3, data=0x64b450, poll_in=1, poll_out=0, poll_err=0) at ../../../../xlators/nfs/lib/src//rpcsvc.c:2555 conn = (rpcsvc_conn_t *) 0x64b450 ret = 0 #15 0x00002aaaaad02165 in event_dispatch_epoll_handler (event_pool=0x63f2f0, events=0x659360, i=0) at event.c:812 event_data = (struct event_data *) 0x659364 handler = (event_handler_t) 0x2aaaaceecce1 <nfs_rpcsvc_conn_data_handler> data = (void *) 0x64b450 idx = 3 ret = -1 __FUNCTION__ = "event_dispatch_epoll_handler" #16 0x00002aaaaad02338 in event_dispatch_epoll (event_pool=0x63f2f0) at event.c:876 events = (struct epoll_event *) 0x659360 size = 1 i = 0 ret = 1 __FUNCTION__ = "event_dispatch_epoll" #17 0x00002aaaaad02608 in event_dispatch (event_pool=0x63f2f0) at event.c:984 ret = -1 __FUNCTION__ = "event_dispatch" #18 0x00002aaaacee8236 in nfs_rpcsvc_stage_proc (arg=0x62b300) at ../../../../xlators/nfs/lib/src//rpcsvc.c:64 stg = (rpcsvc_stage_t *) 0x62b300 #19 0x00002aaaab566407 in start_thread () from /lib64/libpthread.so.0 No symbol table info available. #20 0x00002aaaab84fb0d in clone () from /lib64/libc.so.6 No symbol table info available.
Is this bug still valid? if yes, please let me know when it happens.
I'll check with upcoming gnfs-glfs version with dvm integration and update the status.
(In reply to comment #5) > I'll check with upcoming gnfs-glfs version with dvm integration and update the > status. Hi Lakshmi, Can you check the issues now? latest release should have proper fix for gNFS with gfid changes.
> > Hi Lakshmi, > > Can you check the issues now? latest release should have proper fix for gNFS > with gfid changes. Can this be verified now?
Closing. I am confident this is not present anymore.