Bug 804745

Summary: [glusterfs-3.3.0qa29]: glusterfs client crashed since call_count was zero
Product: [Community] GlusterFS Reporter: Raghavendra Bhat <rabhat>
Component: replicate Assignee: Pranith Kumar K <pkarampu>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: mainline CC: gluster-bugs
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.4.0 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-07-24 17:21:41 UTC Type: ---
Bug Blocks: 817967    

Description Raghavendra Bhat 2012-03-19 17:52:45 UTC
Description of problem:
3-replica volume. 6 FUSE and 6 NFS clients were running different tests such as ping_pong, fs-perf-test, rdd, etc. Volume set operations and volume status commands were running in a loop in parallel, and a replace-brick was in progress.

One of the glusterfs clients crashed (core file size was 38GB).

This is the backtrace.

GNU gdb (GDB) Red Hat Enterprise Linux (7.2-50.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/local/sbin/glusterfs...done.
[New Thread 23335]
[New Thread 23336]
[New Thread 23338]
[New Thread 23352]
[New Thread 23351]
[New Thread 23337]
[New Thread 23342]
Reading symbols from /usr/local/lib/libglusterfs.so.0...done.
Loaded symbols for /usr/local/lib/libglusterfs.so.0
Reading symbols from /usr/local/lib/libgfrpc.so.0...done.
Loaded symbols for /usr/local/lib/libgfrpc.so.0
Reading symbols from /usr/local/lib/libgfxdr.so.0...done.
Loaded symbols for /usr/local/lib/libgfxdr.so.0
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /usr/local/lib/glusterfs/3.3.0qa29/xlator/mount/fuse.so...done.
Loaded symbols for /usr/local/lib/glusterfs/3.3.0qa29/xlator/mount/fuse.so
Reading symbols from /usr/local/lib/glusterfs/3.3.0qa29/rpc-transport/socket.so...done.
Loaded symbols for /usr/local/lib/glusterfs/3.3.0qa29/rpc-transport/socket.so
Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnss_files.so.2
Reading symbols from /usr/local/lib/glusterfs/3.3.0qa29/xlator/protocol/client.so...done.
Loaded symbols for /usr/local/lib/glusterfs/3.3.0qa29/xlator/protocol/client.so
Reading symbols from /usr/local/lib/glusterfs/3.3.0qa29/xlator/cluster/replicate.so...done.
Loaded symbols for /usr/local/lib/glusterfs/3.3.0qa29/xlator/cluster/replicate.so
Reading symbols from /usr/local/lib/glusterfs/3.3.0qa29/xlator/performance/write-behind.so...done.
Loaded symbols for /usr/local/lib/glusterfs/3.3.0qa29/xlator/performance/write-behind.so
Reading symbols from /usr/local/lib/glusterfs/3.3.0qa29/xlator/performance/read-ahead.so...done.
Loaded symbols for /usr/local/lib/glusterfs/3.3.0qa29/xlator/performance/read-ahead.so
Reading symbols from /usr/local/lib/glusterfs/3.3.0qa29/xlator/performance/io-cache.so...done.
Loaded symbols for /usr/local/lib/glusterfs/3.3.0qa29/xlator/performance/io-cache.so
Reading symbols from /usr/local/lib/glusterfs/3.3.0qa29/xlator/performance/quick-read.so...done.
Loaded symbols for /usr/local/lib/glusterfs/3.3.0qa29/xlator/performance/quick-read.so
Reading symbols from /usr/local/lib/glusterfs/3.3.0qa29/xlator/debug/io-stats.so...done.
Loaded symbols for /usr/local/lib/glusterfs/3.3.0qa29/xlator/debug/io-stats.so
Reading symbols from /usr/local/lib/glusterfs/3.3.0qa29/xlator/performance/md-cache.so...done.
Loaded symbols for /usr/local/lib/glusterfs/3.3.0qa29/xlator/performance/md-cache.so
Reading symbols from /lib64/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libgcc_s.so.1
Core was generated by `/usr/local/sbin/glusterfs --volfile-id=mirror --volfile-server=10.16.156.9 /mnt'.
Program terminated with signal 6, Aborted.
#0  0x00007fdbb586a885 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.47.el6_2.9.x86_64 libgcc-4.4.6-3.el6.x86_64
(gdb) bt
#0  0x00007fdbb586a885 in raise () from /lib64/libc.so.6
#1  0x00007fdbb586c065 in abort () from /lib64/libc.so.6
#2  0x00007fdbb58639fe in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007fdbb5863ac0 in __assert_fail () from /lib64/libc.so.6
#4  0x00007fdbb14820a6 in afr_sh_data_erase_pending (frame=0x2b32d3c, this=0x8cce5e0)
    at ../../../../../xlators/cluster/afr/src/afr-self-heal-data.c:417
#5  0x00007fdbb1483785 in afr_sh_data_special_file_fix (frame=0x2b32d3c, this=0x8cce5e0)
    at ../../../../../xlators/cluster/afr/src/afr-self-heal-data.c:819
#6  0x00007fdbb14839bb in afr_sh_data_fstat_cbk (frame=0x2b32d3c, cookie=0x2, this=0x8cce5e0, op_ret=-1, op_errno=107, 
    buf=0x7fff39d95370) at ../../../../../xlators/cluster/afr/src/afr-self-heal-data.c:863
#7  0x00007fdbb16ef1f1 in client3_1_fstat_cbk (req=0x8d8fc58, iov=0x0, count=0, myframe=0x7fdbb4c2ece4)
    at ../../../../../xlators/protocol/client/src/client3_1-fops.c:1186
#8  0x00007fdbb62165aa in rpc_clnt_submit (rpc=0x4a7a540, prog=0x7fdbb1913cc0, procnum=25, 
    cbkfn=0x7fdbb16eeea5 <client3_1_fstat_cbk>, proghdr=0x7fff39d95f10, proghdrcount=1, progpayload=0x0, progpayloadcount=0, 
    iobref=0x197a000, frame=0x7fdbb4c2ece4, rsphdr=0x0, rsphdr_count=0, rsp_payload=0x0, rsp_payload_count=0, rsp_iobref=0x0)
    at ../../../../rpc/rpc-lib/src/rpc-clnt.c:1542
#9  0x00007fdbb16ddaa6 in client_submit_request (this=0x8ccdc20, req=0x7fff39d95fd0, frame=0x7fdbb4c2ece4, prog=0x7fdbb1913cc0, 
    procnum=25, cbkfn=0x7fdbb16eeea5 <client3_1_fstat_cbk>, iobref=0x0, rsphdr=0x0, rsphdr_count=0, rsp_payload=0x0, 
    rsp_payload_count=0, rsp_iobref=0x0, xdrproc=0x7fdbb5ff3ef8 <xdr_gfs3_fstat_req>)
    at ../../../../../xlators/protocol/client/src/client.c:203
#10 0x00007fdbb16fa662 in client3_1_fstat (frame=0x7fdbb4c2ece4, this=0x8ccdc20, data=0x7fff39d96090)
    at ../../../../../xlators/protocol/client/src/client3_1-fops.c:3617
#11 0x00007fdbb16e18d9 in client_fstat (frame=0x7fdbb4c2ece4, this=0x8ccdc20, fd=0x91d095c)
    at ../../../../../xlators/protocol/client/src/client.c:1008
#12 0x00007fdbb1483d22 in afr_sh_data_fstat (frame=0x2b32d3c, this=0x8cce5e0)
    at ../../../../../xlators/cluster/afr/src/afr-self-heal-data.c:893
#13 0x00007fdbb1484098 in afr_sh_data_fxattrop_cbk (frame=0x2b32d3c, cookie=0x2, this=0x8cce5e0, op_ret=-1, op_errno=107, xattr=0x0)
    at ../../../../../xlators/cluster/afr/src/afr-self-heal-data.c:980
#14 0x00007fdbb16f08e6 in client3_1_fxattrop_cbk (req=0x8d90514, iov=0x0, count=0, myframe=0x7fdbb4c4e0c4)
    at ../../../../../xlators/protocol/client/src/client3_1-fops.c:1453
#15 0x00007fdbb62165aa in rpc_clnt_submit (rpc=0x4a7a540, prog=0x7fdbb1913cc0, procnum=34, 
    cbkfn=0x7fdbb16f03cc <client3_1_fxattrop_cbk>, proghdr=0x7fff39d96da0, proghdrcount=1, progpayload=0x0, progpayloadcount=0, 
    iobref=0x3a9bf10, frame=0x7fdbb4c4e0c4, rsphdr=0x7fff39d96e70, rsphdr_count=1, rsp_payload=0x0, rsp_payload_count=0, 
    rsp_iobref=0x25f7840) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:1542
#16 0x00007fdbb16ddaa6 in client_submit_request (this=0x8ccdc20, req=0x7fff39d96f70, frame=0x7fdbb4c4e0c4, prog=0x7fdbb1913cc0, 
    procnum=34, cbkfn=0x7fdbb16f03cc <client3_1_fxattrop_cbk>, iobref=0x0, rsphdr=0x7fff39d96e70, rsphdr_count=1, rsp_payload=0x0, 
    rsp_payload_count=0, rsp_iobref=0x25f7840, xdrproc=0x7fdbb5ff3047 <xdr_gfs3_fxattrop_req>)
    at ../../../../../xlators/protocol/client/src/client.c:203
#17 0x00007fdbb16fd641 in client3_1_fxattrop (frame=0x7fdbb4c4e0c4, this=0x8ccdc20, data=0x7fff39d97090)
    at ../../../../../xlators/protocol/client/src/client3_1-fops.c:4281
#18 0x00007fdbb16e3a26 in client_fxattrop (frame=0x7fdbb4c4e0c4, this=0x8ccdc20, fd=0x91d095c, flags=GF_XATTROP_ADD_ARRAY, 
    dict=0x84e698) at ../../../../../xlators/protocol/client/src/client.c:1472
#19 0x00007fdbb1484543 in afr_sh_data_fxattrop (frame=0x2b32d3c, this=0x8cce5e0, 
    fxattrop_cbk=0x7fdbb1484022 <afr_sh_data_fxattrop_cbk>) at ../../../../../xlators/cluster/afr/src/afr-self-heal-data.c:1039
#20 0x00007fdbb1484635 in afr_sh_data_big_lock_success (frame=0x2b32d3c, this=0x8cce5e0)
    at ../../../../../xlators/cluster/afr/src/afr-self-heal-data.c:1075
#21 0x00007fdbb14847c6 in afr_sh_data_post_blocking_inodelk_cbk (frame=0x2b32d3c, this=0x8cce5e0)
    at ../../../../../xlators/cluster/afr/src/afr-self-heal-data.c:1102
#22 0x00007fdbb14a006c in afr_lock_blocking (frame=0x2b32d3c, this=0x8cce5e0, child_index=3)
    at ../../../../../xlators/cluster/afr/src/afr-lk-common.c:1036
#23 0x00007fdbb149f4b5 in afr_lock_cbk (frame=0x2b32d3c, cookie=0x2, this=0x8cce5e0, op_ret=-1, op_errno=107)
    at ../../../../../xlators/cluster/afr/src/afr-lk-common.c:819
#24 0x00007fdbb149f54b in afr_blocking_inodelk_cbk (frame=0x2b32d3c, cookie=0x2, this=0x8cce5e0, op_ret=-1, op_errno=107)
    at ../../../../../xlators/cluster/afr/src/afr-lk-common.c:833
#25 0x00007fdbb16ef4f8 in client3_1_inodelk_cbk (req=0x8d90dd0, iov=0x7fff39d979b0, count=1, myframe=0x7fdbb4c3fe58)
    at ../../../../../xlators/protocol/client/src/client3_1-fops.c:1225
#26 0x00007fdbb6213ab8 in saved_frames_unwind (saved_frames=0x2d94b30) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:387
#27 0x00007fdbb6213b67 in saved_frames_destroy (frames=0x2d94b30) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:405
#28 0x00007fdbb6214121 in rpc_clnt_connection_cleanup (conn=0x4a7a570) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:567
#29 0x00007fdbb6214c02 in rpc_clnt_notify (trans=0x8cbe580, mydata=0x4a7a570, event=RPC_TRANSPORT_DISCONNECT, data=0x8cbe580)
    at ../../../../rpc/rpc-lib/src/rpc-clnt.c:870
#30 0x00007fdbb6210e7c in rpc_transport_notify (this=0x8cbe580, event=RPC_TRANSPORT_DISCONNECT, data=0x8cbe580)
    at ../../../../rpc/rpc-lib/src/rpc-transport.c:498
#31 0x00007fdbb25291c7 in socket_event_poll_err (this=0x8cbe580) at ../../../../../rpc/rpc-transport/socket/src/socket.c:694
#32 0x00007fdbb252d880 in socket_event_handler (fd=6, idx=3, data=0x8cbe580, poll_in=1, poll_out=0, poll_err=16)
    at ../../../../../rpc/rpc-transport/socket/src/socket.c:1808
#33 0x00007fdbb646a00c in event_dispatch_epoll_handler (event_pool=0x848c50, events=0x86bcb0, i=0)
    at ../../../libglusterfs/src/event.c:794
#34 0x00007fdbb646a22f in event_dispatch_epoll (event_pool=0x848c50) at ../../../libglusterfs/src/event.c:856
#35 0x00007fdbb646a5ba in event_dispatch (event_pool=0x848c50) at ../../../libglusterfs/src/event.c:956
#36 0x0000000000408057 in main (argc=4, argv=0x7fff39d98028) at ../../../glusterfsd/src/glusterfsd.c:1647
(gdb) f 4
#4  0x00007fdbb14820a6 in afr_sh_data_erase_pending (frame=0x2b32d3c, this=0x8cce5e0)
    at ../../../../../xlators/cluster/afr/src/afr-self-heal-data.c:417
417	        GF_ASSERT (call_count);
(gdb) l
412	        }
413	
414	        afr_sh_delta_to_xattr (priv, sh->delta_matrix, erase_xattr,
415	                               priv->child_count, AFR_DATA_TRANSACTION);
416	
417	        GF_ASSERT (call_count);
418	        local->call_count = call_count;
419	        for (i = 0; i < priv->child_count; i++) {
420	                if (!erase_xattr[i])
421	                        continue;
(gdb) p call_count
$1 = 0
(gdb) quit
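
The backtrace shows every callback on this path returning op_ret=-1 with op_errno=107 (ENOTCONN on Linux) after an RPC_TRANSPORT_DISCONNECT, and the assertion at afr-self-heal-data.c:417 firing because call_count is 0. A minimal sketch of a guarded alternative follows. It assumes call_count is the number of children with a pending xattr to erase, which this report does not confirm. The names count_pending and erase_pending_guarded are hypothetical and are not GlusterFS functions, and this is not the actual fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of children with a pending xattr to erase. */
static int count_pending(void *const erase_xattr[], int child_count)
{
    int n = 0;

    assert(child_count >= 0);
    for (int i = 0; i < child_count; i++) {
        if (erase_xattr[i] != NULL)
            n++;
    }
    return n;
}

/*
 * Hypothetical guarded fan-out. Returns the number of requests that would
 * be sent; 0 tells the caller to finish the self-heal with an error
 * instead of tripping an assertion.
 */
static int erase_pending_guarded(void *const erase_xattr[], int child_count)
{
    int call_count = count_pending(erase_xattr, child_count);

    if (call_count == 0)
        return 0;   /* nothing to wind, e.g. every child is disconnected */

    /* A real implementation would wind one request per non-NULL entry here. */
    return call_count;
}
```

With a check like this, a disconnect that leaves nothing to erase would end the self-heal attempt with an error rather than aborting the whole client process.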

Comment 1 Pranith Kumar K 2012-03-29 06:42:01 UTC
Fixed as part of 765373.

Comment 2 Raghavendra Bhat 2012-04-05 09:44:31 UTC
Checked with glusterfs-3.3.0qa33. This crash is not seen now.