| Summary: | crash during nfs alpha test | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Lakshmipathi G <lakshmipathi> |
| Component: | protocol | Assignee: | Amar Tumballi <amarts> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | high | Docs Contact: | |
| Priority: | low | | |
| Version: | 3.1-alpha | CC: | gluster-bugs, shehjart, vraman |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | | Type: | --- |
| Regression: | RTP | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Looks like the resolve_and_resume path does not set bname, resulting in a crash. Re-assigning to Avati.

PATCH: http://patches.gluster.com/patch/4086 in master (argument sanity checks added in inode.c)

hey amar, we just saw this crash again, which made me think: I don't understand how this patch fixes the problem of proto/server calling the resolve function with bname as NULL. Sure, the sanity checks will fix the NULL dereference and the crash, but the fop as a whole will still fail, because resolution fails when bname is NULL. Comments?

*** Bug 1354 has been marked as a duplicate of this bug. ***

(In reply to comment #3)
> hey amar, we just saw this crash again, which made me think: I don't
> understand how this patch fixes the problem of proto/server calling the
> resolve function with bname as NULL. Sure, the sanity checks will fix the
> NULL dereference and the crash, but the fop as a whole will still fail,
> because resolution fails when bname is NULL. Comments?

Actually, if you want to do an inode_grep (), then bname should never be NULL. Yes, it's true that the root cause of this is not yet fixed, but the sanity checks ensure the process does not crash because of a mistake in a higher layer. We can open a new bug for the failing fop and investigate why bname is NULL there.
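For context, a minimal sketch of the kind of guard under discussion, assuming the inode_grep () signature shown in frame #2 of the backtrace below; this is illustrative only, not the verbatim change from patch 4086, and it presumes the libglusterfs internal types from inode.h:

```c
/* Illustrative sketch only -- not the verbatim patch 4086. The
 * signature is taken from frame #2 of the backtrace in the
 * description; inode_t, dentry_t and inode_table_t are the internal
 * types from libglusterfs/src/inode.h. */
inode_t *
inode_grep (inode_table_t *table, inode_t *parent, const char *name)
{
        inode_t  *inode  = NULL;
        dentry_t *dentry = NULL;

        /* Sanity checks: a NULL name previously fell through to
         * hash_dentry (), which dereferences it (frame #0 below).
         * Refusing the call here downgrades the SIGSEGV to a failed
         * resolution -- which is why the LOOKUP fop still fails until
         * proto/server stops passing bname = NULL. */
        if (!table || !parent || !name)
                return NULL;

        pthread_mutex_lock (&table->lock);
        {
                dentry = __dentry_grep (table, parent, name);
                if (dentry)
                        inode = dentry->inode;
        }
        pthread_mutex_unlock (&table->lock);

        return inode;
}
```

In other words, the guard is deliberately a crash fix rather than a correctness fix: resolve_entry_simple () still gets a NULL inode back and the lookup fails, matching the observation in comment #3.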
While running nfs alpha mixed tests with 4 glusterfs servers, 4 gnfs servers, and 8 clients, the following glusterfs core was found.

```
(gdb) bt full
#0  0x00002aaaaacee419 in hash_dentry (parent=0x6ecf68, name=0x0, mod=14057) at inode.c:63
        hash = 0
        ret = 0
#1  0x00002aaaaacef59e in __dentry_grep (table=0x63d428, parent=0x6ecf68, name=0x0) at inode.c:565
        hash = 0
        dentry = (dentry_t *) 0x0
        tmp = (dentry_t *) 0x0
#2  0x00002aaaaacef666 in inode_grep (table=0x63d428, parent=0x6ecf68, name=0x0) at inode.c:586
        inode = (inode_t *) 0x0
        dentry = (dentry_t *) 0x0
#3  0x00002aaaacb95069 in resolve_entry_simple (frame=0x2aaab4553500) at server-resolve.c:364
        state = (server_state_t *) 0x2aaab4686f08
        this = (xlator_t *) 0x6346d8
        resolve = (server_resolve_t *) 0x2aaab4686f78
        parent = (inode_t *) 0x6ecf68
        inode = (inode_t *) 0x0
        ret = 0
        __FUNCTION__ = "resolve_entry_simple"
#4  0x00002aaaacb951ed in server_resolve_entry (frame=0x2aaab4553500) at server-resolve.c:419
        state = (server_state_t *) 0x2aaab4686f08
        ret = 0
        loc = (loc_t *) 0x2aaab4686f28
#5  0x00002aaaacb95524 in server_resolve (frame=0x2aaab4553500) at server-resolve.c:548
        state = (server_state_t *) 0x2aaab4686f08
        resolve = (server_resolve_t *) 0x2aaab4686f78
#6  0x00002aaaacb95655 in server_resolve_all (frame=0x2aaab4553500) at server-resolve.c:605
        state = (server_state_t *) 0x2aaab4686f08
        this = (xlator_t *) 0x6346d8
        __FUNCTION__ = "server_resolve_all"
#7  0x00002aaaacb9574d in resolve_and_resume (frame=0x2aaab4553500, fn=0x2aaaacba5ff9 <server_lookup_resume>) at server-resolve.c:635
        state = (server_state_t *) 0x2aaab4686f08
#8  0x00002aaaacbab37b in server_lookup (req=0x2aaaabb85948) at server3_1-fops.c:4813
        frame = (call_frame_t *) 0x2aaab4553500
        conn = (server_connection_t *) 0x64acb8
        state = (server_state_t *) 0x2aaab4686f08
        xattr_req = (dict_t *) 0x0
        buf = 0x0
        args = {gfs_id = 27, ino = 0, par = 37684044, gen = 5502220811211473897, flags = 0,
                path = 0x7fff663d4e20 "", bname = 0x7fff663d0e20 "",
                dict = {dict_len = 0, dict_val = 0x7fff663cce20 ""}}
        ret = 0
        path = '\0' <repeats 16383 times>
        bname = '\0' <repeats 16383 times>
        dict_val = '\0' <repeats 16383 times>
        __FUNCTION__ = "server_lookup"
#9  0x00002aaaaaf2bde5 in rpcsvc_handle_rpc_call (conn=0x6568e8, msg=0x2aaab4436bb8) at rpcsvc.c:1195
        actor = (rpcsvc_actor_t *) 0x2aaaacdb4bc0
        req = (rpcsvc_request_t *) 0x2aaaabb85948
        ret = -1
        __FUNCTION__ = "rpcsvc_handle_rpc_call"
#10 0x00002aaaaaf2bfc6 in rpcsvc_notify (trans=0x65fae8, mydata=0x6568e8, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x2aaab4436bb8) at rpcsvc.c:1241
        conn = (rpcsvc_conn_t *) 0x6568e8
        ret = -1
        msg = (rpc_transport_pollin_t *) 0x2aaab4436bb8
        new_trans = (rpc_transport_t *) 0x0
        __FUNCTION__ = "rpcsvc_notify"
#11 0x00002aaaaaf3109b in rpc_transport_notify (this=0x65fae8, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x2aaab4436bb8) at rpc-transport.c:1239
        ret = -1
#12 0x00002aaaacfbe7c3 in socket_event_poll_in (this=0x65fae8) at socket.c:1406
        ret = 0
        pollin = (rpc_transport_pollin_t *) 0x2aaab4436bb8
#13 0x00002aaaacfbeab7 in socket_event_handler (fd=8, idx=3, data=0x65fae8, poll_in=1, poll_out=0, poll_err=0) at socket.c:1512
        this = (rpc_transport_t *) 0x65fae8
        priv = (socket_private_t *) 0x65acc8
        ret = 0
        __FUNCTION__ = "socket_event_handler"
#14 0x00002aaaaad038e4 in event_dispatch_epoll_handler (event_pool=0x62b218, events=0x63a988, i=0) at event.c:812
        event_data = (struct event_data *) 0x63a98c
        handler = (event_handler_t) 0x2aaaacfbea00 <socket_event_handler>
        data = (void *) 0x65fae8
        idx = 3
        ret = -1
        __FUNCTION__ = "event_dispatch_epoll_handler"
#15 0x00002aaaaad03ac5 in event_dispatch_epoll (event_pool=0x62b218) at event.c:876
        events = (struct epoll_event *) 0x63a988
        size = 1
        i = 0
        ret = 1
        __FUNCTION__ = "event_dispatch_epoll"
#16 0x00002aaaaad03ddb in event_dispatch (event_pool=0x62b218) at event.c:984
        ret = -1
        __FUNCTION__ = "event_dispatch"
#17 0x0000000000405062 in main (argc=7, argv=0x7fff663d9368) at glusterfsd.c:1273
        ctx = (glusterfs_ctx_t *) 0x629010
        ret = 0
(gdb)
```

Log file:

```
[2010-08-06 18:25:28.511788] T [server3_1-fops.c:254:server_inodelk_cbk] server-tcp: 10758552: INODELK /nfsalpha2/ip-10-245-210-193/test7/linux-2.6.35/arch/mn10300/include/asm/page_offset.h (41338621) ==> -1 (Success)
[2010-08-06 18:25:28.511808] T [rpcsvc.c:1513:rpcsvc_submit_generic] rpc-service: Tx message: 16
[2010-08-06 18:25:28.511825] T [rpcsvc.c:1319:rpcsvc_record_build_header] rpc-service: Reply fraglen 40, payload: 16, rpc hdr: 24
[2010-08-06 18:25:28.520723] T [rpcsvc-auth.c:276:rpcsvc_auth_request_init] rpc-service: Auth handler: AUTH_GLUSTERFS
[2010-08-06 18:25:28.599687] T [rpcsvc.c:1119:rpcsvc_request_create] rpc-service: RPC XID: b0dbdc, Ver: 2, Program: 1298437, ProgVers: 310, Proc: 27
[2010-08-06 18:25:28.594181] T [rpcsvc.c:1513:rpcsvc_submit_generic] rpc-service: Tx message: 332
[2010-08-06 18:25:28.593957] T [rpcsvc.c:1513:rpcsvc_submit_generic] rpc-service: Tx message: 200
[2010-08-06 18:25:28.599733] T [auth-glusterfs.c:176:auth_glusterfs_authenticate] rpc-service: Auth Info: pid: 0, uid: 0, gid: 0, owner: 0
[2010-08-06 18:25:28.599786] T [rpcsvc.c:955:rpcsvc_program_actor] rpc-service: Actor found: GlusterFS-3.1.0 - LOOKUP
pending frames:
patchset: v3.1.0qa3
signal received: 11
time of crash: 2010-08-06 18:25:28
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.1.0qa3
[2010-08-06 18:25:28.599752] T [rpcsvc.c:1319:rpcsvc_record_build_header] rpc-service: Reply fraglen 224, payload: 200, rpc hdr: 24
[2010-08-06 18:25:28.599727] T [rpcsvc.c:1319:rpcsvc_record_build_header] rpc-service: Reply fraglen 356, payload: 332, rpc hdr: 24
/lib64/libc.so.6[0x2aaaab7aaf30]
/opt/glusterfs/3.1.0qa3//lib/libglusterfs.so.0[0x2aaaaacee419]
/opt/glusterfs/3.1.0qa3//lib/libglusterfs.so.0(__dentry_grep+0x42)[0x2aaaaacef59e]
/opt/glusterfs/3.1.0qa3//lib/libglusterfs.so.0(inode_grep+0x3e)[0x2aaaaacef666]
/opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/xlator/protocol/server.so(resolve_entry_simple+0x1c9)[0x2aaaacb95069]
/opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/xlator/protocol/server.so(server_resolve_entry+0x4a)[0x2aaaacb951ed]
/opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/xlator/protocol/server.so(server_resolve+0x69)[0x2aaaacb95524]
/opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/xlator/protocol/server.so(server_resolve_all+0x76)[0x2aaaacb95655]
/opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/xlator/protocol/server.so(resolve_and_resume+0x3c)[0x2aaaacb9574d]
/opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/xlator/protocol/server.so(server_lookup+0x38b)[0x2aaaacbab37b]
/opt/glusterfs/3.1.0qa3//lib/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x19b)[0x2aaaaaf2bde5]
/opt/glusterfs/3.1.0qa3//lib/libgfrpc.so.0(rpcsvc_notify+0x167)[0x2aaaaaf2bfc6]
/opt/glusterfs/3.1.0qa3//lib/libgfrpc.so.0(rpc_transport_notify+0xd8)[0x2aaaaaf3109b]
/opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/rpc-transport/socket.so(socket_event_poll_in+0x4b)[0x2aaaacfbe7c3]
/opt/glusterfs/3.1.0qa3//lib/glusterfs/3.1.0qa3/rpc-transport/socket.so(socket_event_handler+0xb7)[0x2aaaacfbeab7]
/opt/glusterfs/3.1.0qa3//lib/libglusterfs.so.0[0x2aaaaad038e4]
/opt/glusterfs/3.1.0qa3//lib/libglusterfs.so.0[0x2aaaaad03ac5]
/opt/glusterfs/3.1.0qa3//lib/libglusterfs.so.0(event_dispatch+0x73)[0x2aaaaad03ddb]
/opt/glusterfs/3.1.0qa3/sbin/glusterfsd(main+0xec)[0x405062]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x2aaaab798074]
/opt/glusterfs/3.1.0qa3/sbin/glusterfsd[0x4027a9]
```
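Frame #0 makes the mechanism concrete: the dentry hash walks the name string byte by byte, so name=0x0 faults on the very first read, inside libglusterfs and before any resolver error handling can run. A minimal sketch of a hash of that shape, inferred from the frame's arguments and locals rather than copied from inode.c:

```c
/* Sketch of a dentry hash of the shape implied by frame #0
 * (hash_dentry (parent=0x6ecf68, name=0x0, mod=14057); locals hash = 0,
 * ret = 0). Inferred from the backtrace, not copied from inode.c. */
static int
hash_dentry (void *parent, const char *name, int mod)
{
        int hash = *name;       /* name == 0x0 => SIGSEGV on this read */

        if (hash) {
                for (name++; *name != '\0'; name++)
                        hash = (hash << 5) - hash + *name;
        }

        /* Mix in the parent pointer and clamp to the table size. */
        return (hash + (unsigned long) parent) % mod;
}
```

Because the fault happens on the first dereference of the name, the sanity checks from patch 4086 have to sit at the entry points above the hash (inode_grep () and friends) rather than in the hash itself.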