| Summary: | Crash in glusterfs while rebalance and remove-brick triggered at the same time | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | shylesh <shmohan> | ||||
| Component: | core | Assignee: | shishir gowda <sgowda> | ||||
| Status: | CLOSED WORKSFORME | QA Contact: | |||||
| Severity: | urgent | Docs Contact: | |||||
| Priority: | urgent | ||||||
| Version: | pre-release | CC: | gluster-bugs, nsathyan | ||||
| Target Milestone: | --- | ||||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2012-04-17 09:43:11 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Attachments: |
The same crash also occurs with the following steps:
1. Create a 2-brick distribute volume from a single node.
2. Keep doing some I/O on the mount.
3. Attach another node to the cluster.
4. Add a brick from the new node to this volume.
5. Initiate fix-layout and rebalance.
These steps lead to another crash whose stack frames are almost the same.
Program terminated with signal 11, Segmentation fault.
#0 0x00007fcd94f82da4 in default_notify (this=0x2502040, event=6, data=0x24ff840) at defaults.c:1333
1333 if (parent->xlator->init_succeeded)
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.47.el6.x86_64 libgcc-4.4.6-3.el6.x86_64
======================================================================
(gdb) p this
$1 = (xlator_t *) 0x2502040
(gdb) p this->ctx
$2 = (glusterfs_ctx_t *) 0x24b4010
(gdb) p this->ctx->master
$3 = (void *) 0x0
(gdb) p this->graph
$4 = (glusterfs_graph_t *) 0x24fa420
====================================================================
(gdb) bt
#0 0x00007fcd94f82da4 in default_notify (this=0x2502040, event=6, data=0x24ff840) at defaults.c:1333
#1 0x00007fcd90842759 in dht_notify (this=0x2502040, event=6, data=0x24ff840) at dht-common.c:4703
#2 0x00007fcd90853bed in notify (this=0x2502040, event=6, data=0x24ff840) at dht.c:201
#3 0x00007fcd94f6f933 in xlator_notify (xl=0x2502040, event=6, data=0x24ff840) at xlator.c:457
#4 0x00007fcd94f82dda in default_notify (this=0x24ff840, event=6, data=0x0) at defaults.c:1334
#5 0x00007fcd90a73c7c in client_rpc_notify (rpc=0x2579d70, mydata=0x24ff840, event=RPC_CLNT_DISCONNECT,
data=0x0) at client.c:2107
#6 0x00007fcd94d4ae2b in rpc_clnt_notify (trans=0x25897d0, mydata=0x2579da0, event=RPC_TRANSPORT_DISCONNECT,
data=0x25897d0) at rpc-clnt.c:887
#7 0x00007fcd94d46ee4 in rpc_transport_notify (this=0x25897d0, event=RPC_TRANSPORT_DISCONNECT,
data=0x25897d0) at rpc-transport.c:498
#8 0x00007fcd918c81d3 in socket_event_poll_err (this=0x25897d0) at socket.c:694
#9 0x00007fcd918cc88c in socket_event_handler (fd=9, idx=4, data=0x25897d0, poll_in=1, poll_out=0,
poll_err=16) at socket.c:1808
#10 0x00007fcd94fa3640 in event_dispatch_epoll_handler (event_pool=0x24cbdb0, events=0x24f9800, i=0)
at event.c:794
#11 0x00007fcd94fa3863 in event_dispatch_epoll (event_pool=0x24cbdb0) at event.c:856
#12 0x00007fcd94fa3bee in event_dispatch (event_pool=0x24cbdb0) at event.c:956
#13 0x000000000040801c in main (argc=21, argv=0x7fff8844c918) at glusterfsd.c:1650
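The faulting line (defaults.c:1333) dereferences parent->xlator while walking the xlator's parent list. As a rough illustration only, the self-contained C sketch below models that walk with cut-down stand-in types. The struct layouts, the notified counter, and the name notify_parents_checked are assumptions made for this example; this is not the GlusterFS source. It skips NULL parent entries before reading init_succeeded. A NULL check cannot detect a dangling pointer into an already-freed graph, which may be what happened here; this report does not establish the root cause.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the GlusterFS types; only the fields
 * needed for this illustration are modelled. */
typedef struct xlator xlator_t;

typedef struct xlator_list {
    xlator_t           *xlator;
    struct xlator_list *next;
} xlator_list_t;

struct xlator {
    const char    *name;
    xlator_list_t *parents;
    int            init_succeeded;
    int            notified;   /* hypothetical counter, example only */
};

/* Walk the parent list and "notify" only parents that are non-NULL
 * and initialised. Returns the number of parents notified. */
int notify_parents_checked(xlator_t *this)
{
    int count = 0;

    assert(this != NULL);

    for (xlator_list_t *p = this->parents; p != NULL; p = p->next) {
        if (p->xlator == NULL)
            continue;   /* reading p->xlator->init_succeeded would fault */
        if (!p->xlator->init_succeeded)
            continue;   /* parent not ready, nothing to deliver */
        p->xlator->notified++;
        count++;
    }
    return count;
}
```

With one initialised parent, one uninitialised parent, and one NULL entry in the list, only the initialised parent is counted and the NULL entry is skipped rather than dereferenced.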
Can you please check whether this bug is still valid? It is not reproducible on the latest master.
Created attachment 575716
rebalance logs

Description of problem:
While a rebalance is in progress and I/O is running on the mount point, initiating remove-brick on the same volume leads to a crash.

Version-Release number of selected component (if applicable):
3.3.0qa33

How reproducible:

Steps to Reproduce:
1. Create a distribute volume with 6 bricks.
2. Initiate rebalance.
3. Do some I/O on the mount point while the rebalance is happening.
4. Initiate remove-brick on the same volume.

Actual results:
glusterfs crashed.

Expected results:
remove-brick should not start while a rebalance is happening.

Additional info:
Program terminated with signal 11, Segmentation fault.
#0 0x0000003d7b0157f8 in ?? () from /lib64/libgcc_s.so.1
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.47.el6_2.5.x86_64 libgcc-4.4.6-3.el6.x86_64
==================================================
(gdb) bt
#0 0x0000003d7b0157f8 in ?? () from /lib64/libgcc_s.so.1
#1 0x00007fba83a74933 in xlator_notify (xl=0x2274fb0, event=6, data=0x226ee10) at xlator.c:457
#2 0x00007fba83a87dda in default_notify (this=0x226ee10, event=6, data=0x0) at defaults.c:1334
#3 0x00007fba7f578c7c in client_rpc_notify (rpc=0x23a1c40, mydata=0x226ee10, event=RPC_CLNT_DISCONNECT, data=0x0) at client.c:2107
#4 0x00007fba8384fe2b in rpc_clnt_notify (trans=0x23b16a0, mydata=0x23a1c70, event=RPC_TRANSPORT_DISCONNECT, data=0x23b16a0) at rpc-clnt.c:887
#5 0x00007fba8384bee4 in rpc_transport_notify (this=0x23b16a0, event=RPC_TRANSPORT_DISCONNECT, data=0x23b16a0) at rpc-transport.c:498
#6 0x00007fba803cd1d3 in socket_event_poll_err (this=0x23b16a0) at socket.c:694
#7 0x00007fba803d188c in socket_event_handler (fd=13, idx=6, data=0x23b16a0, poll_in=1, poll_out=0, poll_err=16) at socket.c:1808
#8 0x00007fba83aa8640 in event_dispatch_epoll_handler (event_pool=0x223adb0, events=0x2268830, i=0) at event.c:794
#9 0x00007fba83aa8863 in event_dispatch_epoll (event_pool=0x223adb0) at event.c:856
#10 0x00007fba83aa8bee in event_dispatch (event_pool=0x223adb0) at event.c:956
#11 0x000000000040801c in main (argc=21, argv=0x7fffb6c96b88) at glusterfsd.c:1650
===========================================================================
Attached the logs.
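The expected result stated above (remove-brick should not start while a rebalance is happening) amounts to a pre-check on the volume's state. A minimal sketch of such a guard in C follows. Every name in it (rebal_status_t, volume_state_t, remove_brick_precheck) is hypothetical and invented for this illustration; it does not reflect how glusterd actually tracks or validates these operations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rebalance states, for illustration only. */
typedef enum {
    REBAL_NOT_STARTED,
    REBAL_IN_PROGRESS,
    REBAL_COMPLETED
} rebal_status_t;

/* Hypothetical per-volume state record. */
typedef struct {
    const char     *name;
    rebal_status_t  rebal_status;
} volume_state_t;

/* Returns 0 if remove-brick may start, -1 if it must be rejected
 * because a rebalance is still running on the volume. */
int remove_brick_precheck(const volume_state_t *vol)
{
    assert(vol != NULL);

    if (vol->rebal_status == REBAL_IN_PROGRESS)
        return -1;   /* refuse: the two operations would overlap */

    return 0;
}
```

A caller would run this check before starting remove-brick and report an error to the user when it returns -1.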