Bug 844332 - [96d5c52b7e9e4c4a654213e092dde9c54282fe64] - glusterd crashed because of list corruption due to garbage head pointer
Status: CLOSED CURRENTRELEASE
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: x86_64 Linux
Priority: medium, Severity: medium
Assigned To: Krutika Dhananjay
M S Vishwanath Bhat
Depends On:
Blocks:
Reported: 2012-07-30 06:24 EDT by M S Vishwanath Bhat
Modified: 2016-05-31 21:56 EDT (History)

See Also:
Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-24 13:36:49 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments
glusterd log (809.75 KB, text/x-log)
2012-07-30 06:24 EDT, M S Vishwanath Bhat
Description M S Vishwanath Bhat 2012-07-30 06:24:12 EDT
Created attachment 601203
glusterd log

Description of problem:
While I was running the top-profile sanity test, glusterd crashed due to list corruption caused by a garbage head pointer.


Version-Release number of selected component (if applicable):
git master with head at 96d5c52b7e9e4c4a654213e092dde9c54282fe64

How reproducible:
1/1, inconsistent

Steps to Reproduce:
1. Build glusterfs with the -g3 -DDEBUG compiler flags and start glusterd in debug mode.
2. Run the top_profile sanity test on a single machine.
Actual results:
glusterd crashed with the following core dump:

Core was generated by `glusterd -LDEBUG'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f667f3cb8cb in list_add_tail (new=0x263a390, head=0x1400) at ../../../../../libglusterfs/src/list.h:41
41              new->prev = head->prev;
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.47.el6_2.9.x86_64 keyutils-libs-1.4-3.el6.x86_64 krb5-libs-1.9-22.el6_2.1.x86_64 libcom_err-1.41.12-11.el6.x86_64 libgcc-4.4.6-3.el6.x86_64 libselinux-2.0.94-5.2.el6.x86_64 openssl-1.0.0-20.el6_2.3.x86_64 zlib-1.2.3-27.el6.x86_64
(gdb) bt
#0  0x00007f667f3cb8cb in list_add_tail (new=0x263a390, head=0x1400) at ../../../../../libglusterfs/src/list.h:41
#1  0x00007f667f3cff7e in glusterd_op_perform_replace_brick (volinfo=0x25ce4a0, old_brick=0x25ce0d0 "172.17.251.67:/tmp/brick3", 
    new_brick=0x25e2490 "172.17.251.67:/tmp/brick4") at ../../../../../xlators/mgmt/glusterd/src/glusterd-replace-brick.c:1334
#2  0x00007f667f3d0adf in glusterd_op_replace_brick (dict=0x7f66815fcb1c, rsp_dict=0x0)
    at ../../../../../xlators/mgmt/glusterd/src/glusterd-replace-brick.c:1534
#3  0x00007f667f38c85e in glusterd_op_commit_perform (op=GD_OP_REPLACE_BRICK, dict=0x7f66815fcb1c, op_errstr=0x7fff263d58c8, rsp_dict=0x0)
    at ../../../../../xlators/mgmt/glusterd/src/glusterd-op-sm.c:3004
#4  0x00007f667f38ade0 in glusterd_op_ac_send_commit_op (event=0x262a2d0, ctx=0x26425d0)
    at ../../../../../xlators/mgmt/glusterd/src/glusterd-op-sm.c:2350
#5  0x00007f667f390829 in glusterd_op_sm () at ../../../../../xlators/mgmt/glusterd/src/glusterd-op-sm.c:4626
#6  0x00007f667f3cc59f in glusterd_handle_replace_brick (req=0x7f667f2e6910) at ../../../../../xlators/mgmt/glusterd/src/glusterd-replace-brick.c:162
#7  0x00007f668296814b in rpcsvc_handle_rpc_call (svc=0x25be360, trans=0x2647890, msg=0x264b0e0) at ../../../../rpc/rpc-lib/src/rpcsvc.c:513
#8  0x00007f6682968503 in rpcsvc_notify (trans=0x2647890, mydata=0x25be360, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x264b0e0)
    at ../../../../rpc/rpc-lib/src/rpcsvc.c:612
#9  0x00007f668296dfb3 in rpc_transport_notify (this=0x2647890, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x264b0e0)
    at ../../../../rpc/rpc-lib/src/rpc-transport.c:495
#10 0x00007f667f0d6e1c in socket_event_poll_in (this=0x2647890) at ../../../../../rpc/rpc-transport/socket/src/socket.c:1964
#11 0x00007f667f0d73a7 in socket_event_handler (fd=12, idx=7, data=0x2647890, poll_in=1, poll_out=0, poll_err=0)
    at ../../../../../rpc/rpc-transport/socket/src/socket.c:2075
#12 0x00007f6682bca9cc in event_dispatch_epoll_handler (event_pool=0x25b9640, events=0x25ca7d0, i=0) at ../../../libglusterfs/src/event.c:784
#13 0x00007f6682bcabdb in event_dispatch_epoll (event_pool=0x25b9640) at ../../../libglusterfs/src/event.c:845
#14 0x00007f6682bcaf66 in event_dispatch (event_pool=0x25b9640) at ../../../libglusterfs/src/event.c:945
#15 0x0000000000408b08 in main (argc=2, argv=0x7fff263d6028) at ../../../glusterfsd/src/glusterfsd.c:1765
(gdb) f 0
#0  0x00007f667f3cb8cb in list_add_tail (new=0x263a390, head=0x1400) at ../../../../../libglusterfs/src/list.h:41
41              new->prev = head->prev;
(gdb) p new
$1 = (struct list_head *) 0x263a390
(gdb) p head->prev
Cannot access memory at address 0x1408
(gdb) 
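
The faulting address in the gdb session follows from the node layout. The sketch below is a minimal stand-in for the list helpers, not the glusterfs source. Only the statement on list.h line 41 is taken from the trace; the struct layout (`next` followed by `prev`), the remaining statements, and the helper names `init_list_head`, `prev_field_address` and `append_to_valid_head` are assumptions made for illustration. The first parameter is named `node` here where the trace calls it `new`. On x86_64, with 8-byte pointers, `prev` sits at offset 8, so a head of 0x1400 makes `head->prev` read from 0x1408, which matches the "Cannot access memory at address 0x1408" message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the list node type; assumed layout: next, then prev. */
struct list_head {
        struct list_head *next;
        struct list_head *prev;
};

/* An empty circular list points back at itself in both directions. */
static void
init_list_head (struct list_head *head)
{
        head->next = head;
        head->prev = head;
}

/* Append node just before head, i.e. at the tail of the circular list. */
static void
list_add_tail (struct list_head *node, struct list_head *head)
{
        assert (node != NULL && head != NULL);

        node->next = head;
        node->prev = head->prev;   /* the load that faulted in frame #0 */
        node->prev->next = node;
        node->next->prev = node;
}

/* Address that `head->prev` reads from, computed without dereferencing. */
static uintptr_t
prev_field_address (uintptr_t head)
{
        return head + offsetof (struct list_head, prev);
}

/* Append two nodes to a properly initialised head and check every link.
 * Returns 1 when the list is consistent, 0 otherwise. */
static int
append_to_valid_head (void)
{
        struct list_head head, a, b;

        init_list_head (&head);
        list_add_tail (&a, &head);
        list_add_tail (&b, &head);

        return head.next == &a && a.next == &b && b.next == &head
            && head.prev == &b && b.prev == &a && a.prev == &head;
}
```

Calling `prev_field_address (0x1400)` gives the address the crashed load touched. `append_to_valid_head ()` shows the same function working when the head is a real, initialised node. The NULL check in the sketch would not have caught this crash, since 0x1400 is non-NULL but unmapped.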


Expected results:
glusterd should not crash.

Additional info:

Below are the last few entries in the glusterd log.


[2012-07-30 09:17:15.151630] I [glusterd-replace-brick.c:1288:rb_update_dstbrick_port] 0-: adding dst-brick port no
[2012-07-30 09:17:15.151640] D [glusterd-replace-brick.c:747:rb_generate_client_volfile] 0-management: Creating volfile
[2012-07-30 09:17:15.165901] D [glusterd-replace-brick.c:694:rb_spawn_glusterfs_client] 0-: Successfully started glusterfs: brick=172.17.251.67:/tmp/brick3
[2012-07-30 09:17:15.174879] D [glusterd-replace-brick.c:710:rb_spawn_glusterfs_client] 0-: stat on mountpoint succeeded
[2012-07-30 09:17:15.183212] D [glusterd-replace-brick.c:1515:glusterd_op_replace_brick] 0-management: Received commit - will be adding dst brick and removing src brick
[2012-07-30 09:17:15.183278] D [glusterd-utils.c:234:glusterd_is_local_addr] 0-management: 172.17.251.67 
[2012-07-30 09:17:15.183310] D [glusterd-utils.c:243:glusterd_is_local_addr] 0-management: 172.17.251.67 is local
[2012-07-30 09:17:15.183324] D [glusterd-replace-brick.c:1519:glusterd_op_replace_brick] 0-management: I AM THE DESTINATION HOST
[2012-07-30 09:17:15.183371] I [glusterd-utils.c:1025:glusterd_service_stop] 0-: Stopping gluster brick running in pid: 13457
[2012-07-30 09:17:16.183662] I [mem-pool.c:567:mem_pool_destroy] 0-management: size=2236 max=0 total=0
[2012-07-30 09:17:16.183704] I [mem-pool.c:567:mem_pool_destroy] 0-management: size=124 max=0 total=0
[2012-07-30 09:17:16.183757] I [glusterd-utils.c:1025:glusterd_service_stop] 0-: Stopping gluster nfs running in pid: 13468
[2012-07-30 09:17:17.184070] E [glusterd-utils.c:2978:glusterd_nodesvc_unlink_socket_file] 0-management: Failed to remove /tmp/c01f72999909ab0897e8aac322caa8ee.socket error: Resource temporarily unavailable
[2012-07-30 09:17:17.185259] I [glusterd-utils.c:3012:glusterd_nfs_pmap_deregister] 0-: De-registered MOUNTV3 successfully
[2012-07-30 09:17:17.186007] I [glusterd-utils.c:3017:glusterd_nfs_pmap_deregister] 0-: De-registered MOUNTV1 successfully
[2012-07-30 09:17:17.186259] I [glusterd-utils.c:3022:glusterd_nfs_pmap_deregister] 0-: De-registered NFSV3 successfully
[2012-07-30 09:17:17.186400] I [glusterd-utils.c:3027:glusterd_nfs_pmap_deregister] 0-: De-registered NLM v4 successfully
[2012-07-30 09:17:17.186540] I [glusterd-utils.c:3032:glusterd_nfs_pmap_deregister] 0-: De-registered NLM v1 successfully
[2012-07-30 09:17:17.186564] D [glusterd-utils.c:721:glusterd_brickinfo_new] 0-: Returning 0
[2012-07-30 09:17:17.186576] D [glusterd-utils.c:779:glusterd_brickinfo_new_from_brick] 0-: Returning 0
[2012-07-30 09:17:17.188207] D [glusterd-utils.c:4167:glusterd_friend_find_by_hostname] 0-management: Unable to find friend: 172.17.251.67
[2012-07-30 09:17:17.188254] D [glusterd-utils.c:234:glusterd_is_local_addr] 0-management: 172.17.251.67 
[2012-07-30 09:17:17.188275] D [glusterd-utils.c:243:glusterd_is_local_addr] 0-management: 172.17.251.67 is local
[2012-07-30 09:17:17.188288] D [glusterd-utils.c:4201:glusterd_hostname_to_uuid] 0-: returning 0
[2012-07-30 09:17:17.188296] D [glusterd-utils.c:733:glusterd_resolve_brick] 0-: Returning 0
pending frames:

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2012-07-30 09:17:17
configuration details:
argp 1
backtrace 1

dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3git
glusterd(glusterfsd_print_trace+0x22)[0x4081cd]
/lib64/libc.so.6[0x379e632900]
/usr/local/lib/glusterfs/3git/xlator/mgmt/glusterd.so(+0x828cb)[0x7f667f3cb8cb]
/usr/local/lib/glusterfs/3git/xlator/mgmt/glusterd.so(+0x86f7e)[0x7f667f3cff7e]
/usr/local/lib/glusterfs/3git/xlator/mgmt/glusterd.so(glusterd_op_replace_brick+0xa81)[0x7f667f3d0adf]
/usr/local/lib/glusterfs/3git/xlator/mgmt/glusterd.so(glusterd_op_commit_perform+0xd9)[0x7f667f38c85e]
/usr/local/lib/glusterfs/3git/xlator/mgmt/glusterd.so(+0x41de0)[0x7f667f38ade0]
/usr/local/lib/glusterfs/3git/xlator/mgmt/glusterd.so(glusterd_op_sm+0x1ea)[0x7f667f390829]
/usr/local/lib/glusterfs/3git/xlator/mgmt/glusterd.so(glusterd_handle_replace_brick+0x574)[0x7f667f3cc59f]
/usr/local/lib/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x360)[0x7f668296814b]
/usr/local/lib/libgfrpc.so.0(rpcsvc_notify+0x181)[0x7f6682968503]
/usr/local/lib/libgfrpc.so.0(rpc_transport_notify+0x130)[0x7f668296dfb3]
/usr/local/lib/glusterfs/3git/rpc-transport/socket.so(socket_event_poll_in+0x54)[0x7f667f0d6e1c]
/usr/local/lib/glusterfs/3git/rpc-transport/socket.so(socket_event_handler+0x224)[0x7f667f0d73a7]
/usr/local/lib/libglusterfs.so.0(+0x4e9cc)[0x7f6682bca9cc]
/usr/local/lib/libglusterfs.so.0(+0x4ebdb)[0x7f6682bcabdb]
/usr/local/lib/libglusterfs.so.0(event_dispatch+0x88)[0x7f6682bcaf66]
glusterd(main+0x259)[0x408b08]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x379e61ecdd]
glusterd[0x404669]


I have attached the glusterd log as well.
