Bug 801787 - Crash during rebalance
Product: GlusterFS
Classification: Community
Component: core
Hardware: x86_64 Linux
Priority: high
Severity: high
Assigned To: Amar Tumballi
Depends On:
Blocks: 817967
Reported: 2012-03-09 08:53 EST by shylesh
Modified: 2015-12-01 11:45 EST

See Also:
Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2013-07-24 14:03:00 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions: 3.3.0qa41
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Attachments: None
Description shylesh 2012-03-09 08:53:11 EST
Description of problem:
The glusterfs process crashed during a rebalance operation.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Create a distribute volume with a few bricks.
2. Fill it with some data.
3. Add a brick to the volume (add-brick) and start a rebalance.
4. While the rebalance is running, perform some I/O on the mount point.
5. At the same time, restart glusterd.

Actual results:
The glusterfs process crashed in glusterfs_handle_defrag (backtrace below).

Expected results:

Additional info:

(gdb) p ctx->active
$1 = (glusterfs_graph_t *) 0x0
(gdb) p *ctx->active
Cannot access memory at address 0x0

(gdb) bt
#0  0x000000000040a708 in glusterfs_handle_defrag (req=0x137ed6c) at glusterfsd-mgmt.c:765
#1  0x000000000040b1fb in glusterfs_handle_rpc_msg (req=0x137ed6c) at glusterfsd-mgmt.c:983
#2  0x00007f88243260b5 in rpcsvc_handle_rpc_call (svc=0x137ebf0, trans=0x13937f0, msg=0x1393660) at rpcsvc.c:514
#3  0x00007f8824326458 in rpcsvc_notify (trans=0x13937f0, mydata=0x137ebf0, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x1393660)
    at rpcsvc.c:610
#4  0x00007f882432be10 in rpc_transport_notify (this=0x13937f0, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x1393660)
    at rpc-transport.c:498
#5  0x00007f882103d27c in socket_event_poll_in (this=0x13937f0) at socket.c:1686
#6  0x00007f882103d800 in socket_event_handler (fd=8, idx=2, data=0x13937f0, poll_in=1, poll_out=0, poll_err=0) at socket.c:1801
#7  0x00007f8824586080 in event_dispatch_epoll_handler (event_pool=0x1379d90, events=0x1392a40, i=0) at event.c:794
#8  0x00007f88245862a3 in event_dispatch_epoll (event_pool=0x1379d90) at event.c:856
#9  0x00007f882458662e in event_dispatch (event_pool=0x1379d90) at event.c:956
#10 0x0000000000407d6d in main (argc=21, argv=0x7fffa1bb5518) at glusterfsd.c:1611

[2012-03-09 08:33:25.164547] W [client.c:2011:client_rpc_notify] 0-dist-client-2: Registering a grace timer
[2012-03-09 08:33:25.164561] I [client.c:2024:client_rpc_notify] 0-dist-client-2: disconnected
[2012-03-09 08:33:24.170727] I [dht-rebalance.c:852:dht_migrate_file] 0-dist-dht: completed migration of /linux-3.2.1/arch/arm/include/asm/hardware/ssp.h from subvolume dist-client-2 to dist-client-5
[2012-03-09 08:33:25.164561] I [client.c:2024:client_rpc_notify] 0-dist-client-2: disconnected
[2012-03-09 08:33:25.164570] W [dht-common.c:4476:dht_notify] 0-dist-dht: Received CHILD_DOWN. Exiting
ad.so.0() [0x39674077e1] (-->/usr/local/sbin/glusterfs(glusterfs_sigwaiter+0xfc) [0x40741e]))) 0-: received signum (15), shutting down
[2012-03-09 08:33:43.404701] W [socket.c:419:__socket_keepalive] 0-socket: failed to set keep idle on socket 8
t supported
pending frames:
Comment 1 Amar Tumballi 2012-03-12 05:46:54 EDT
Please update these bugs with respect to 3.3.0qa27; we need to work on them according to the target milestone set.
Comment 2 Anand Avati 2012-03-12 10:59:29 EDT
CHANGE: http://review.gluster.com/2924 (glusterfsd: handle a case of NULL dereference during rebalance) merged in master by Vijay Bellur (vijay@gluster.com)
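For readers following the backtrace: gdb shows ctx->active is NULL when glusterfs_handle_defrag runs, so the handler dereferences a graph that is not there. The following is only a minimal sketch of the kind of guard the change title describes, not the merged patch. The types ctx_sketch_t and graph_sketch_t and the function handle_defrag_sketch are hypothetical stand-ins made up for illustration; they are not GlusterFS APIs.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the real GlusterFS context and graph types.
 * Only the fields needed for this illustration are modelled. */
typedef struct graph_sketch {
    const char *top_name;   /* placeholder name for the top of the graph */
} graph_sketch_t;

typedef struct ctx_sketch {
    graph_sketch_t *active; /* NULL when no graph is currently active */
} ctx_sketch_t;

/* Sketch of a defrag request handler that refuses to touch the active
 * graph when it is missing. Returns 0 on success, -1 when the request
 * cannot be served because there is no active graph. */
int handle_defrag_sketch(const ctx_sketch_t *ctx)
{
    if (ctx == NULL || ctx->active == NULL) {
        /* This is the state seen in gdb: ctx->active == 0x0. */
        fprintf(stderr, "defrag request ignored: no active graph\n");
        return -1;
    }

    /* Safe to dereference ctx->active from here on. */
    printf("defrag request handled on graph '%s'\n", ctx->active->top_name);
    return 0;
}
```

With a guard like this, a request that arrives while no graph is active gets an error return instead of crashing the process.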
Comment 3 shylesh 2012-05-17 09:25:12 EDT
The crash no longer occurs when glusterd is restarted.
