Bug 762756 (GLUSTER-1024)

Summary: [3.0.5rc6]: Crash in distribute
Product: [Community] GlusterFS Reporter: Raghavendra Bhat <rabhat>
Component: distribute Assignee: kaushik <kbudiger>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: high Docs Contact:
Priority: medium    
Version: 3.0.4 CC: gluster-bugs, tejas, vijay
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix

Description Raghavendra Bhat 2010-06-24 06:06:43 UTC
The distribute client crashed in dht_stat_merge. A sanity script was running while the server was being taken down and brought back up when the crash happened. The backtrace from the generated core follows.

GNU gdb 6.8
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-slackware-linux"...

warning: Can't read pathname for load map: Input/output error.
Reading symbols from /opt/glusterfs/3.0.5rc6/lib/libglusterfs.so.0...done.
Loaded symbols for /opt/glusterfs/3.0.5rc6/lib/libglusterfs.so.0
Reading symbols from /lib64/libdl.so.2...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libpthread.so.0...done.
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libc.so.6...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /opt/glusterfs/3.0.5rc6/lib/glusterfs/3.0.5rc6/xlator/protocol/client.so...done.
Loaded symbols for /opt/glusterfs/3.0.5rc6/lib/glusterfs/3.0.5rc6/xlator/protocol/client.so
Reading symbols from /opt/glusterfs/3.0.5rc6/lib/glusterfs/3.0.5rc6/xlator/cluster/distribute.so...done.
Loaded symbols for /opt/glusterfs/3.0.5rc6/lib/glusterfs/3.0.5rc6/xlator/cluster/distribute.so
Reading symbols from /opt/glusterfs/3.0.5rc6/lib/glusterfs/3.0.5rc6/xlator/mount/fuse.so...done.
Loaded symbols for /opt/glusterfs/3.0.5rc6/lib/glusterfs/3.0.5rc6/xlator/mount/fuse.so
Reading symbols from /opt/glusterfs/3.0.5rc6/lib/glusterfs/3.0.5rc6/transport/socket.so...done.
Loaded symbols for /opt/glusterfs/3.0.5rc6/lib/glusterfs/3.0.5rc6/transport/socket.so
Reading symbols from /lib64/libnss_files.so.2...done.
Loaded symbols for /lib64/libnss_files.so.2
Reading symbols from /usr/lib64/libgcc_s.so.1...done.
Loaded symbols for /usr/lib64/libgcc_s.so.1
Core was generated by `/opt/glusterfs/3.0.5rc6/sbin/glusterfs -f client1.vol /mnt/hd/ -l /tmp/tests_cl'.
Program terminated with signal 11, Segmentation fault.
[New process 4211]
[New process 4212]
[New process 4217]
#0  0x00007f740da1685e in dht_stat_merge (this=0x611200, to=0x7f740020aba8, from=0x7fff00000000, subvol=0x6103f0)
    at ../../../../../xlators/cluster/dht/src/dht-helper.c:328
328		to->st_dev      = from->st_dev;
(gdb) bt
#0  0x00007f740da1685e in dht_stat_merge (this=0x611200, to=0x7f740020aba8, from=0x7fff00000000, subvol=0x6103f0)
    at ../../../../../xlators/cluster/dht/src/dht-helper.c:328
#1  0x00007f740da18802 in dht_selfheal_dir_mkdir_cbk (frame=0x7f7400209730, cookie=0x745f10, this=0x611200, op_ret=-1, op_errno=22, 
    inode=0x7f740009fd50, stbuf=0x0, preparent=0x0, postparent=0x7fff00000000)
    at ../../../../../xlators/cluster/dht/src/dht-selfheal.c:220
#2  0x00007f740dc4dbe6 in client_mkdir (frame=0x745f10, this=0x6103f0, loc=0x7f740020a878, mode=16877)
    at ../../../../../xlators/protocol/client/src/client-protocol.c:1062
#3  0x00007f740da18bdf in dht_selfheal_dir_mkdir (frame=0x7f7400209730, loc=0x7f740020a878, layout=0x7f74000a0050, force=1)
    at ../../../../../xlators/cluster/dht/src/dht-selfheal.c:271
#4  0x00007f740da19634 in dht_selfheal_restore (frame=0x7f7400209730, dir_cbk=0x7f740da2fcde <dht_rmdir_selfheal_cbk>, 
    loc=0x7f740020a878, layout=0x7f74000a0050) at ../../../../../xlators/cluster/dht/src/dht-selfheal.c:531
#5  0x00007f740da30020 in dht_rmdir_cbk (frame=0x7f7400209730, cookie=0x850330, this=0x611200, op_ret=0, op_errno=22, 
    preparent=0x7fff98ed3e90, postparent=0x7fff98ed3e00) at ../../../../../xlators/cluster/dht/src/dht-common.c:3264
#6  0x00007f740dc5c8a3 in client_rmdir_cbk (frame=0x850330, hdr=0x868510, hdrlen=268, iobuf=0x0)
    at ../../../../../xlators/protocol/client/src/client-protocol.c:4883
#7  0x00007f740dc619dc in protocol_client_interpret (this=0x6108a0, trans=0x613c40, hdr_p=0x868510 "", hdrlen=268, iobuf=0x0)
    at ../../../../../xlators/protocol/client/src/client-protocol.c:6571
#8  0x00007f740dc626a2 in protocol_client_pollin (this=0x6108a0, trans=0x613c40)
    at ../../../../../xlators/protocol/client/src/client-protocol.c:6869
#9  0x00007f740dc62d16 in notify (this=0x6108a0, event=2, data=0x613c40)
    at ../../../../../xlators/protocol/client/src/client-protocol.c:6988
#10 0x00007f740ee142fa in xlator_notify (xl=0x6108a0, event=2, data=0x613c40) at ../../../libglusterfs/src/xlator.c:929
#11 0x00007f740cdec257 in socket_event_poll_in (this=0x613c40) at ../../../../transport/socket/src/socket.c:771
#12 0x00007f740cdec551 in socket_event_handler (fd=12, idx=0, data=0x613c40, poll_in=1, poll_out=0, poll_err=0)
    at ../../../../transport/socket/src/socket.c:871
#13 0x00007f740ee38f1f in event_dispatch_epoll_handler (event_pool=0x60a320, events=0x6171a0, i=0)
    at ../../../libglusterfs/src/event.c:804
#14 0x00007f740ee390ee in event_dispatch_epoll (event_pool=0x60a320) at ../../../libglusterfs/src/event.c:867
#15 0x00007f740ee393ff in event_dispatch (event_pool=0x60a320) at ../../../libglusterfs/src/event.c:975
#16 0x000000000040634b in main (argc=6, argv=0x7fff98ed4b48) at ../../../glusterfsd/src/glusterfsd.c:1425
(gdb) p from
$1 = (struct stat *) 0x7fff00000000
(gdb) p *from
Cannot access memory at address 0x7fff00000000
(gdb) f 2
#2  0x00007f740dc4dbe6 in client_mkdir (frame=0x745f10, this=0x6103f0, loc=0x7f740020a878, mode=16877)
    at ../../../../../xlators/protocol/client/src/client-protocol.c:1062
1062	        STACK_UNWIND (frame, -1, EINVAL, loc->inode, NULL);
(gdb) q


In client_mkdir, STACK_UNWIND_STRICT should be used instead of STACK_UNWIND. With STACK_UNWIND, the unwind at client-protocol.c:1062 passes fewer arguments than dht_selfheal_dir_mkdir_cbk expects, so postparent ends up holding a garbage address there.
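
To illustrate the failure mode: in the backtrace, the unwind in frame #2 passes only op_ret, op_errno, the inode and one NULL, while the callback in frame #1 also reads preparent and postparent, and postparent arrives as 0x7fff00000000. The sketch below is a minimal, simplified model of a type-checked unwind, not the GlusterFS macros themselves. Every name prefixed with sketch_ or SKETCH_ is hypothetical, the callback signature is trimmed down, and the idea that a strict unwind works by casting to a typed callback pointer is an assumption about the approach rather than a copy of the real implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical stand-in for a call frame; not the real call_frame_t. */
typedef void (*sketch_generic_fn)(void);

typedef struct sketch_frame {
    sketch_generic_fn ret;   /* callback stored without its real type */
    int merged;              /* number of stat merges performed */
    int last_errno;          /* errno recorded on the failure path */
} sketch_frame_t;

/* Full (trimmed-down) mkdir callback signature, assumed for this sketch. */
typedef int (*sketch_mkdir_cbk_t)(sketch_frame_t *frame, int op_ret,
                                  int op_errno, struct stat *stbuf,
                                  struct stat *preparent,
                                  struct stat *postparent);

/*
 * Typed unwind: the stored pointer is cast back to the callback type
 * before the call, so the compiler rejects a call site that passes too
 * few arguments instead of letting the callee read leftover values.
 */
#define SKETCH_UNWIND_STRICT(cbk_type, frame, ...) \
    (((cbk_type)(frame)->ret)((frame), __VA_ARGS__))

/* Callback that only touches postparent when the operation succeeded. */
static int sketch_mkdir_cbk(sketch_frame_t *frame, int op_ret, int op_errno,
                            struct stat *stbuf, struct stat *preparent,
                            struct stat *postparent)
{
    (void)stbuf;
    (void)preparent;

    if (op_ret == -1) {
        frame->last_errno = op_errno;
        return -1;
    }
    if (postparent != NULL)
        frame->merged++;
    return 0;
}

/* Failure path: every pointer argument is passed explicitly as NULL. */
static int sketch_client_mkdir_fail(sketch_frame_t *frame)
{
    assert(frame != NULL);
    return SKETCH_UNWIND_STRICT(sketch_mkdir_cbk_t, frame,
                                -1, EINVAL, NULL, NULL, NULL);
}
```

With this shape, dropping one of the trailing NULL arguments in sketch_client_mkdir_fail is a compile error rather than a crash at run time. The op_ret check in the callback is an extra defensive step in the sketch and is not taken from the patch.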

Comment 1 Anand Avati 2010-06-25 07:25:27 UTC
PATCH: http://patches.gluster.com/patch/3480 in release-3.0 (use STACK_UNWIND_STRICT to avoid postparent from having wrong address in dht_selfheal_dir_mkdir_cbk)

Comment 2 Vijay Bellur 2010-08-24 03:27:36 UTC
Kaushik,

Please confirm if the patch is available in mainline. If not, please send it across.