Bug 783090

Summary: [36cedb338ec1d021e189379f30100f0d983e3e01]: client & brick pair crash with sigabrt during dict_unref
Product: [Community] GlusterFS
Reporter: Rahul C S <rahulcs>
Component: access-control
Assignee: shishir gowda <sgowda>
Status: CLOSED WORKSFORME
Severity: high
Priority: unspecified
Version: pre-release
CC: gluster-bugs, nsathyan
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2012-02-01 07:46:19 UTC

Description Rahul C S 2012-01-19 10:12:33 UTC
Description of problem:
Mounted GlusterFS with ACL support enabled (--acl) on a distributed-replicate volume and ran the POSIX compliance test.

Volinfo:
Volume Name: vol
Type: Distributed-Replicate
Volume ID: d388c638-1879-46f9-936b-03422ee8aa2c
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: dagobah:/data/export1
Brick2: dagobah:/data/export2
Brick3: dagobah:/data/export3
Brick4: dagobah:/data/export4
Options Reconfigured:
performance.stat-prefetch: off
geo-replication.indexing: on
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
features.limit-usage: /:5GB
features.quota: on

The client crashed, along with the bricks of one replica pair. All of the crashed processes show the same backtrace from dict_unref down to the abort; the only difference is that the client reaches dict_unref through afr.

In each core, prev->key in dict_destroy was "system.posix_acl_access" when the crash happened.

Client Core:
Core was generated by `/usr/local/sbin/glusterfs --acl --volfile-id=vol --volfile-server=dagobah mount'.
Program terminated with signal 6, Aborted.
#0  0x00007f684d1583a5 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007f684d1583a5 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f684d15bb0b in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f684d150d4d in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007f684db5ab95 in __gf_free (free_ptr=0x7f683c005fd0) at ../../../libglusterfs/src/mem-pool.c:273
#4  0x00007f684db21f4c in data_destroy (data=0x7f683c004320) at ../../../libglusterfs/src/dict.c:144
#5  0x00007f684db22d58 in data_unref (this=0x7f683c004320) at ../../../libglusterfs/src/dict.c:492
#6  0x00007f684db22ab8 in dict_destroy (this=0x7f683c003ca0) at ../../../libglusterfs/src/dict.c:417
#7  0x00007f684db22c02 in dict_unref (this=0x7f683c003ca0) at ../../../libglusterfs/src/dict.c:454
#8  0x00007f68498c7332 in afr_local_cleanup (local=0x7f6844be4930, this=0x6afa70) at ../../../../../xlators/cluster/afr/src/afr-common.c:916
#9  0x00007f684987cbc8 in afr_create_done (frame=0x7f684bee4afc, this=0x6afa70)
    at ../../../../../xlators/cluster/afr/src/afr-dir-write.c:258
#10 0x00007f68498ba205 in afr_unlock_common_cbk (frame=0x7f684bee4afc, cookie=0x0, this=0x6afa70, op_ret=0, op_errno=0)
    at ../../../../../xlators/cluster/afr/src/afr-lk-common.c:543
#11 0x00007f68498badf4 in afr_unlock_entrylk_cbk (frame=0x7f684bee4afc, cookie=0x0, this=0x6afa70, op_ret=0, op_errno=0)
    at ../../../../../xlators/cluster/afr/src/afr-lk-common.c:705
#12 0x00007f6849b0590a in client3_1_entrylk_cbk (req=0x7f684826a97c, iov=0x7f684826a9bc, count=1, myframe=0x7f684c16e724)
    at ../../../../../xlators/protocol/client/src/client3_1-fops.c:1314
#13 0x00007f684d906efa in rpc_clnt_handle_reply (clnt=0x6c6260, pollin=0x7f684405e890) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:789
#14 0x00007f684d90725b in rpc_clnt_notify (trans=0x6c6530, mydata=0x6c6290, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f684405e890)
    at ../../../../rpc/rpc-lib/src/rpc-clnt.c:908
#15 0x00007f684d903124 in rpc_transport_notify (this=0x6c6530, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f684405e890)
    at ../../../../rpc/rpc-lib/src/rpc-transport.c:498
#16 0x00007f684a73f2db in socket_event_poll_in (this=0x6c6530) at ../../../../../rpc/rpc-transport/socket/src/socket.c:1675
#17 0x00007f684a73f844 in socket_event_handler (fd=12, idx=3, data=0x6c6530, poll_in=1, poll_out=0, poll_err=0)
    at ../../../../../rpc/rpc-transport/socket/src/socket.c:1790
#18 0x00007f684db59b94 in event_dispatch_epoll_handler (event_pool=0x69f2d0, events=0x6a49a0, i=0) at ../../../libglusterfs/src/event.c:794
#19 0x00007f684db59da7 in event_dispatch_epoll (event_pool=0x69f2d0) at ../../../libglusterfs/src/event.c:856
#20 0x00007f684db5a11a in event_dispatch (event_pool=0x69f2d0) at ../../../libglusterfs/src/event.c:956
#21 0x0000000000407d6e in main (argc=5, argv=0x7fffcf8cf958) at ../../../glusterfsd/src/glusterfsd.c:1601
(gdb) f 3
#3  0x00007f684db5ab95 in __gf_free (free_ptr=0x7f683c005fd0) at ../../../libglusterfs/src/mem-pool.c:273
273	                GF_ASSERT (0);
(gdb) l
268	
269	        ptr = (char *)free_ptr - 8 - 4;
270	
271	        if (GF_MEM_HEADER_MAGIC != *(uint32_t *)ptr) {
272	                //Possible corruption, assert here
273	                GF_ASSERT (0);
274	        }
275	
276	        *(uint32_t *)ptr = 0;
277	
(gdb) f 6
#6  0x00007f684db22ab8 in dict_destroy (this=0x7f683c003ca0) at ../../../libglusterfs/src/dict.c:417
417	                data_unref (prev->value);
(gdb) p *prev
$1 = {hash_next = 0x7f683c0058d0, prev = 0x0, next = 0x7f683c0058d0, value = 0x7f683c004320, key = 0x7f683c004d90 "system.posix_acl_access"}
(gdb) p *prev->value
$2 = {is_static = 0 '\000', is_const = 0 '\000', is_stdalloc = 0 '\000', len = 68, vec = 0x0, data = 0x7f683c005fd0 "\002", refcount = 0, 
  lock = 1}
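
The listing at mem-pool.c:271-276 shows the free path checking a magic word in a header placed before the user pointer, asserting when it is missing, and clearing it once the check passes. The following is a minimal sketch of that general technique, not the GlusterFS implementation: DEMO_MAGIC, demo_header and the demo_* functions are hypothetical names, the header layout is an assumption, and the size 68 is only borrowed from the len field printed above. It shows one way such a check can fire (releasing the same pointer twice); the report does not establish that this is what happened here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical magic value and header layout, chosen for this sketch only. */
#define DEMO_MAGIC 0xCAFEBABEu

typedef struct {
    uint32_t magic;   /* set on allocation, cleared on release */
    uint32_t pad;     /* keeps the user data 8-byte aligned */
    uint64_t size;    /* requested size, kept for bookkeeping */
} demo_header;

/* Allocate size bytes preceded by a header carrying the magic value. */
static void *demo_alloc(size_t size)
{
    demo_header *h = malloc(sizeof(*h) + size);
    if (h == NULL)
        return NULL;
    h->magic = DEMO_MAGIC;
    h->pad = 0;
    h->size = size;
    return h + 1;   /* user pointer starts right after the header */
}

/*
 * Validate and clear the magic, as a checked free would do before
 * releasing memory. Returns 0 if the header was intact, -1 otherwise.
 * The memory is deliberately NOT freed here, so that a second call in
 * this sketch still reads valid memory.
 */
static int demo_release(void *user_ptr)
{
    assert(user_ptr != NULL);
    demo_header *h = (demo_header *)user_ptr - 1;
    if (h->magic != DEMO_MAGIC)
        return -1;  /* a real allocator might assert here instead */
    h->magic = 0;
    return 0;
}

/* Hand the underlying block back to the C library. */
static void demo_raw_free(void *user_ptr)
{
    assert(user_ptr != NULL);
    free((demo_header *)user_ptr - 1);
}

/* Returns 1 when the first release succeeds and the second is rejected. */
static int demo_double_release_detected(void)
{
    void *buf = demo_alloc(68);
    if (buf == NULL)
        return 0;
    int first = demo_release(buf);
    int second = demo_release(buf);
    demo_raw_free(buf);
    return first == 0 && second == -1;
}
```

A header overwritten by an out-of-bounds write, or a pointer that never came from the matching allocator, would fail the same comparison, so the assertion alone does not distinguish between these cases.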


Server core:

Core was generated by `/usr/local/sbin/glusterfsd -s localhost --volfile-id vol.dagobah.data-export4 -'.
Program terminated with signal 6, Aborted.
#0  0x00007f7ea78d73a5 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007f7ea78d73a5 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f7ea78dab0b in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f7ea78cfd4d in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007f7ea82d9b95 in __gf_free (free_ptr=0x1da0e50) at ../../../libglusterfs/src/mem-pool.c:273
#4  0x00007f7ea82a0f4c in data_destroy (data=0x1d5c5d0) at ../../../libglusterfs/src/dict.c:144
#5  0x00007f7ea82a1d58 in data_unref (this=0x1d5c5d0) at ../../../libglusterfs/src/dict.c:492
#6  0x00007f7ea82a1ab8 in dict_destroy (this=0x7f7e94070450) at ../../../libglusterfs/src/dict.c:417
#7  0x00007f7ea82a1c02 in dict_unref (this=0x7f7e94070450) at ../../../libglusterfs/src/dict.c:454
#8  0x00007f7ea82d3c66 in call_stub_destroy_wind (stub=0x7f7ea65cb468) at ../../../libglusterfs/src/call-stub.c:3307
#9  0x00007f7ea82d47dc in call_stub_destroy (stub=0x7f7ea65cb468) at ../../../libglusterfs/src/call-stub.c:3828
#10 0x00007f7ea82d4901 in call_resume (stub=0x7f7ea65cb468) at ../../../libglusterfs/src/call-stub.c:3859
#11 0x00007f7ea3dad7af in iot_worker (data=0x1ca6e40) at ../../../../../xlators/performance/io-threads/src/io-threads.c:138
#12 0x00007f7ea7c47efc in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#13 0x00007f7ea798289d in clone () from /lib/x86_64-linux-gnu/libc.so.6
#14 0x0000000000000000 in ?? ()
(gdb) f 7
#7  0x00007f7ea82a1c02 in dict_unref (this=0x7f7e94070450) at ../../../libglusterfs/src/dict.c:454
454	                dict_destroy (this);
(gdb) f 6
#6  0x00007f7ea82a1ab8 in dict_destroy (this=0x7f7e94070450) at ../../../libglusterfs/src/dict.c:417
417	                data_unref (prev->value);
(gdb) p *prev
$1 = {hash_next = 0x0, prev = 0x7f7e940162f0, next = 0x0, value = 0x1d5c5d0, key = 0x7f7e9408ecd0 "system.posix_acl_access"}
(gdb) f 4
#4  0x00007f7ea82a0f4c in data_destroy (data=0x1d5c5d0) at ../../../libglusterfs/src/dict.c:144
144	                                        GF_FREE (data->data);
(gdb) p *data
$2 = {is_static = 0 '\000', is_const = 0 '\000', is_stdalloc = 0 '\000', len = 68, vec = 0x0, data = 0x1da0e50 "\002", refcount = 0, 
  lock = 1}

Comment 1 shishir gowda 2012-02-01 07:46:19 UTC
Not able to reproduce the bug on git mainline.
Please reopen the bug if it is reproduced.