| Summary: | [glusterfs-3.3.0qa11]: nfs server crashed due to frame->local being NULL in afr | ||
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Raghavendra Bhat <rabhat> |
| Component: | replicate | Assignee: | Pranith Kumar K <pkarampu> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | high | Docs Contact: | |
| Priority: | low | ||
| Version: | pre-release | CC: | gluster-bugs |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | Type: | --- | |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
CHANGE: http://review.gluster.com/536 (Change-Id: Ica845035781f47de990e9dcfefdeb37bed99d515) merged in master by Vijay Bellur (vijay) |
gluster nfs server crashed while sanity scripts was running. Setup: 2 replica volume. 1 fuse client and 1 nfs client Both the clients were executing sanity script and volume set operation was happening parallely. Killed one of the bricks after few minutes of tests running, slept for 15-20 minutes and restarted the volume. gluster nfs server crashed with the following backtrace. Core was generated by `/usr/local/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/'. Program terminated with signal 11, Segmentation fault. #0 0x00002aaaaad0f191 in afr_changelog_post_op_cbk (frame=0x2b97ee4a0228, cookie=0x2b97ee1ee9d4, this=0x134934c0, op_ret=0, op_errno=22, xattr=0x134ac210) at ../../../../../xlators/cluster/afr/src/afr-transaction.c:350 350 call_count = --local->call_count; (gdb) bt #0 0x00002aaaaad0f191 in afr_changelog_post_op_cbk (frame=0x2b97ee4a0228, cookie=0x2b97ee1ee9d4, this=0x134934c0, op_ret=0, op_errno=22, xattr=0x134ac210) at ../../../../../xlators/cluster/afr/src/afr-transaction.c:350 #1 0x00002aaaaaac5192 in client3_1_xattrop_cbk (req=0x2aaaad72b4f8, iov=0x2aaaad72b538, count=1, myframe=0x2b97ee1ee9d4) at ../../../../../xlators/protocol/client/src/client3_1-fops.c:1425 #2 0x00002b97ed55c25e in rpc_clnt_handle_reply (clnt=0x134a6ec0, pollin=0x134f1610) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:789 #3 0x00002b97ed55c586 in rpc_clnt_notify (trans=0x134a71f0, mydata=0x134a6ef0, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x134f1610) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:902 #4 0x00002b97ed5589f3 in rpc_transport_notify (this=0x134a71f0, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x134f1610) at ../../../../rpc/rpc-lib/src/rpc-transport.c:498 #5 0x00002aaaad4daea7 in socket_event_poll_in (this=0x134a71f0) at ../../../../../rpc/rpc-transport/socket/src/socket.c:1675 #6 0x00002aaaad4db3e9 in socket_event_handler (fd=6, idx=3, data=0x134a71f0, poll_in=1, poll_out=0, poll_err=0) at ../../../../../rpc/rpc-transport/socket/src/socket.c:1790 #7 0x00002b97ed30484c in event_dispatch_epoll_handler (event_pool=0x1348b960, events=0x134a9e30, i=0) at ../../../libglusterfs/src/event.c:794 #8 0x00002b97ed304a51 in event_dispatch_epoll (event_pool=0x1348b960) at ../../../libglusterfs/src/event.c:856 #9 0x00002b97ed304dab in event_dispatch (event_pool=0x1348b960) at ../../../libglusterfs/src/event.c:956 #10 0x000000000040784d in main (argc=7, argv=0x7fffbc72d2a8) at ../../../glusterfsd/src/glusterfsd.c:1592 (gdb) f 0 #0 0x00002aaaaad0f191 in afr_changelog_post_op_cbk (frame=0x2b97ee4a0228, cookie=0x2b97ee1ee9d4, this=0x134934c0, op_ret=0, op_errno=22, xattr=0x134ac210) at ../../../../../xlators/cluster/afr/src/afr-transaction.c:350 350 call_count = --local->call_count; (gdb) p local $1 = (afr_local_t *) 0x0 (gdb) l 345 346 child_index = (long) cookie; 347 348 LOCK (&frame->lock); 349 { 350 call_count = --local->call_count; 351 } 352 UNLOCK (&frame->lock); 353 354 if (call_count == 0) { (gdb) p *frame $2 = {root = 0x2b97ee4a01a0, parent = 0x0, next = 0x0, prev = 0x0, local = 0x0, this = 0x134934c0, ret = 0, ref_count = 0, lock = 0, cookie = 0x0, complete = _gf_false, op = GF_FOP_NULL, begin = {tv_sec = 0, tv_usec = 0}, end = {tv_sec = 0, tv_usec = 0}, wind_from = 0x0, wind_to = 0x0, unwind_from = 0x0, unwind_to = 0x0} (gdb) info thr 4 Thread 13575 0x0000003eb660e838 in do_sigwait () from /lib64/libpthread.so.0 3 Thread 13576 0x0000003eb660aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 2 Thread 13577 0x0000003eb5e9a541 in nanosleep () from /lib64/libc.so.6 * 1 Thread 13574 0x00002aaaaad0f191 in afr_changelog_post_op_cbk (frame=0x2b97ee4a0228, cookie=0x2b97ee1ee9d4, this=0x134934c0, op_ret=0, op_errno=22, xattr=0x134ac210) at ../../../../../xlators/cluster/afr/src/afr-transaction.c:350 (gdb) t 3 [Switching to thread 3 (Thread 13576)]#0 0x0000003eb660aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 (gdb) bt #0 0x0000003eb660aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00002b97ed314b61 in syncenv_task (env=0x1348c490) at ../../../libglusterfs/src/syncop.c:194 #2 0x00002b97ed314ca0 in syncenv_processor (thdata=0x1348c490) at ../../../libglusterfs/src/syncop.c:232 #3 0x0000003eb660673d in start_thread () from /lib64/libpthread.so.0 #4 0x0000003eb5ed44bd in clone () from /lib64/libc.so.6 (gdb) f 1 #1 0x00002b97ed314b61 in syncenv_task (env=0x1348c490) at ../../../libglusterfs/src/syncop.c:194 194 pthread_cond_wait (&env->cond, &env->mutex); (gdb) l 189 struct synctask *task = NULL; 190 191 pthread_mutex_lock (&env->mutex); 192 { 193 while (list_empty (&env->runq)) 194 pthread_cond_wait (&env->cond, &env->mutex); 195 196 task = list_entry (env->runq.next, struct synctask, all_tasks); 197 198 list_del_init (&task->all_tasks); (gdb)