Bug 1042764 - glusterfsd process crashes while doing ltable cleanup
Summary: glusterfsd process crashes while doing ltable cleanup
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: GlusterFS
Classification: Community
Component: locks
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Pranith Kumar K
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-12-13 10:13 UTC by Raghavendra Bhat
Modified: 2014-07-11 09:11 UTC
CC List: 1 user

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-07-11 09:11:10 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Raghavendra Bhat 2013-12-13 10:13:50 UTC
Description of problem:

The glusterfsd process crashes while doing lock table (ltable) cleanup. The crash happened when a graph change was done and glusterfsd was cleaning up the ltable of the older graph.
This is the backtrace from the core:


(gdb) bt
#0  0x0000003fc580c0d0 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x00007f3880ff20e4 in __gf_free (free_ptr=0x7f384800c0b0) at ../../../libglusterfs/src/mem-pool.c:265
#2  0x00007f387c74b15a in ltable_delete_locks (ltable=0x7f38640ba6e0) at ../../../../../xlators/features/locks/src/posix.c:2553
#3  0x00007f387c74b3ad in disconnect_cbk (this=0x24b7b30, client=0x2512220) at ../../../../../xlators/features/locks/src/posix.c:2633
#4  0x00007f3881020e3d in gf_client_disconnect (client=0x2512220) at ../../../libglusterfs/src/client_t.c:374
#5  0x00007f3877993f95 in server_connection_cleanup (this=0x24bf4d0, client=0x2512220, flags=3) at ../../../../../xlators/protocol/server/src/server-helpers.c:244
#6  0x00007f3877990510 in server_rpc_notify (rpc=0x24c2730, xl=0x24bf4d0, event=RPCSVC_EVENT_DISCONNECT, data=0x2510830)
    at ../../../../../xlators/protocol/server/src/server.c:529
#7  0x00007f3880d8cc0b in rpcsvc_handle_disconnect (svc=0x24c2730, trans=0x2510830) at ../../../../rpc/rpc-lib/src/rpcsvc.c:682
#8  0x00007f3880d8cd87 in rpcsvc_notify (trans=0x2510830, mydata=0x24c2730, event=RPC_TRANSPORT_DISCONNECT, data=0x2510830) at ../../../../rpc/rpc-lib/src/rpcsvc.c:720
#9  0x00007f3880d91f61 in rpc_transport_notify (this=0x2510830, event=RPC_TRANSPORT_DISCONNECT, data=0x2510830) at ../../../../rpc/rpc-lib/src/rpc-transport.c:512
#10 0x00007f387d9ecc4b in socket_event_poll_err (this=0x2510830) at ../../../../../rpc/rpc-transport/socket/src/socket.c:1071
#11 0x00007f387d9f144e in socket_event_handler (fd=13, idx=6, data=0x2510830, poll_in=1, poll_out=0, poll_err=0) at ../../../../../rpc/rpc-transport/socket/src/socket.c:2239
#12 0x00007f388102406a in event_dispatch_epoll_handler (event_pool=0x248efe0, events=0x24ad4d0, i=0) at ../../../libglusterfs/src/event-epoll.c:384
#13 0x00007f388102424f in event_dispatch_epoll (event_pool=0x248efe0) at ../../../libglusterfs/src/event-epoll.c:445
#14 0x00007f3880ff16f1 in event_dispatch (event_pool=0x248efe0) at ../../../libglusterfs/src/event.c:113
#15 0x0000000000409248 in main (argc=19, argv=0x7fff44b8eff8) at ../../../glusterfsd/src/glusterfsd.c:1966
(gdb) f 0
#0  0x0000003fc580c0d0 in pthread_spin_lock () from /lib64/libpthread.so.0
(gdb) f 1
#1  0x00007f3880ff20e4 in __gf_free (free_ptr=0x7f384800c0b0) at ../../../libglusterfs/src/mem-pool.c:265
265	        LOCK (&xl->mem_acct.rec[type].lock);
(gdb) p type
$1 = 1208015088
(gdb) p xl->mem_acct.rec[type]
Cannot access memory at address 0x90268c830
(gdb) 
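Frames #0 and #1 point at the GF_FREE accounting path: __gf_free() reads the owning xlator and the mem-acct type out of a header that GF_CALLOC/GF_MALLOC write in front of every allocation, then takes the per-type accounting spin lock. Below is a minimal sketch of that pattern, with assumed field names (the real layout lives in libglusterfs/src/mem-pool.c); it illustrates why freeing a lock whose header belongs to the already-torn-down graph yields the garbage type (1208015088) and the unreadable rec[type] seen above.

/*
 * Illustrative sketch only, not the actual mem-pool source.
 * Assumes glusterfs internal headers (xlator.h, locking.h);
 * the field names in mem_header are guesses at the real layout.
 */
struct mem_header {
        uint32_t   type;   /* index into xl->mem_acct.rec[] */
        size_t     size;
        xlator_t  *xl;     /* xlator that allocated the block */
        uint32_t   magic;
};

void
__gf_free (void *free_ptr)
{
        struct mem_header *hdr = (struct mem_header *) free_ptr - 1;
        xlator_t          *xl  = hdr->xl;

        /*
         * If the block was allocated by an xlator of the old graph
         * and that graph (or the block itself) has already been
         * freed, hdr is stale: in this core, hdr->type read back as
         * 1208015088, so &xl->mem_acct.rec[type] pointed at unmapped
         * memory (0x90268c830) and the spin lock below faulted.
         */
        LOCK (&xl->mem_acct.rec[hdr->type].lock);
        /* ... update accounting counters, UNLOCK, release the block ... */
}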

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. create a replicate volume, start it and mount it
2. start running dbench on the mount point 
3. start doing graph changes (performance xlators on/off); a command sketch follows below
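
For convenience, the steps above can be scripted roughly as below. The hostnames, brick paths, volume name, and the choice of performance.write-behind as the toggled option are placeholders; any option toggle that forces a graph change should work:

# create, start and mount a two-brick replicate volume
gluster volume create testvol replica 2 host1:/bricks/b1 host2:/bricks/b2
gluster volume start testvol
mount -t glusterfs host1:/testvol /mnt/testvol

# keep lock-generating I/O running on the mount point
dbench -D /mnt/testvol 10 &

# trigger graph changes by toggling a performance xlator
# while dbench is still running
gluster volume set testvol performance.write-behind off
sleep 30
gluster volume set testvol performance.write-behind on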

Actual results:
glusterfsd processes crash while doing ltable cleanup

Expected results:
glusterfsd should not crash and should clean up the ltable of the older graph

Additional info:

Comment 1 Pranith Kumar K 2014-07-11 09:11:10 UTC
The lock table implementation has been replaced with the client_t implementation, so the bug is not valid on releases >= 3.5. In 3.4, mem-accounting is disabled, so the crashing accounting path in __gf_free is not reached. Considering no one reported this upstream in the released versions, I am closing the bug. Please feel free to reopen it if you come across it.

