Bug 1229139
| Summary: | glusterd: glusterd crashes if the re-balance and vol status commands are run in parallel. | |||
|---|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Anand Nekkunti <anekkunt> | |
| Component: | glusterd | Assignee: | Anand Nekkunti <anekkunt> | |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | ||
| Severity: | medium | Docs Contact: | ||
| Priority: | medium | |||
| Version: | mainline | CC: | amukherj, bperkins, bugs, gluster-bugs, nsathyan, olim | |
| Target Milestone: | --- | Keywords: | Reopened, Triaged | |
| Target Release: | --- | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | glusterfs-3.8rc2 | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1230523 1230525 (view as bug list) | Environment: | ||
| Last Closed: | 2016-06-16 13:09:29 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1230523, 1230525 | |||
REVIEW: http://review.gluster.org/11120 (glusterd: Get the local txn_info based on trans_id in op_sm call backs.) posted (#1) for review on master by Anand Nekkunti (anekkunt)

(Revisions #2 through #19 of the same change were subsequently posted for review on master by Anand Nekkunti (anekkunt).)

COMMIT: http://review.gluster.org/11120 committed in master by Krishnan Parthasarathi (kparthas)

------

commit c9765bcb1557ab1e921080e7de4f3ebac1e424d5
Author: anand <anekkunt>
Date: Mon Jun 8 00:19:00 2015 +0530

    glusterd: Get the local txn_info based on trans_id in op_sm call backs.

    Issue: When two or more transactions run concurrently in op_sm, the
    global op_info can get corrupted.

    Fix: Get the local txn_info based on trans_id instead of using the
    global txn_info for commands (re-balance, profile) that use op_sm on
    the originator.

    TODO: Handle errors properly in the callbacks and remove the global
    op_info from op_sm entirely.
    Change-Id: I9d61388acc125841ddc77e2bd560cb7f17ae0a5a
    BUG: 1229139
    Signed-off-by: anand <anekkunt>
    Reviewed-on: http://review.gluster.org/11120
    Tested-by: Gluster Build System <jenkins.com>
    Tested-by: NetBSD Build System <jenkins.org>
    Reviewed-by: Krishnan Parthasarathi <kparthas>

The fix for this BZ is already present in a GlusterFS release. A clone of this BZ, fixed in a GlusterFS release, has been closed; hence this mainline BZ is being closed as well.

This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.8.0, please open a new bug report. glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
Description of problem:
glusterd crashes if you run the re-balance and vol status commands in parallel (compiled in debug mode).

Version-Release number of selected component (if applicable):

How reproducible:
Most of the time

Steps to Reproduce:
1. Compile glusterfs in debug mode (./configure --enable-debug)
2. gluster peer probe 46.101.184.191
3. gluster volume create livebackup replica 2 transport tcp 46.101.160.245:/opt/gluster_brick1 46.101.184.191:/opt/gluster_brick2 force
4. gluster volume start livebackup
5. gluster volume add-brick livebackup 46.101.160.245:/opt/gluster_brick2 46.101.184.191:/opt/gluster_brick1 force
6. gluster volume info

Volume Name: livebackup
Type: Distributed-Replicate
Volume ID: 55cf62a0-099f-4a5e-ae4a-0ddec29239b4
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 46.101.160.245:/opt/gluster_brick1
Brick2: 46.101.184.191:/opt/gluster_brick2
Brick3: 46.101.160.245:/opt/gluster_brick2
Brick4: 46.101.184.191:/opt/gluster_brick1
Options Reconfigured:
performance.readdir-ahead: on

7. mount -t glusterfs localhost:/livebackup /mnt
8. cp /var/log/* /mnt
9. gluster volume rebalance livebackup start
10. On node 2, in parallel: gluster volume status

Actual results:
glusterd crashes.

Expected results:
glusterd should not crash.
(gdb) bt
#0  0x0000003c000348c7 in raise () from /lib64/libc.so.6
#1  0x0000003c0003652a in abort () from /lib64/libc.so.6
#2  0x0000003c0002d46d in __assert_fail_base () from /lib64/libc.so.6
#3  0x0000003c0002d522 in __assert_fail () from /lib64/libc.so.6
#4  0x00007fc09938d0d5 in glusterd_volume_rebalance_use_rsp_dict (aggr=0x0, rsp_dict=0x7fc08800b68c) at glusterd-utils.c:7776
#5  0x00007fc0993969b4 in __glusterd_commit_op_cbk (req=0x7fc08800f1cc, iov=0x7fc08800f20c, count=1, myframe=0x7fc08800f0b4) at glusterd-rpc-ops.c:1333
#6  0x00007fc099393cee in glusterd_big_locked_cbk (req=0x7fc08800f1cc, iov=0x7fc08800f20c, count=1, myframe=0x7fc08800f0b4, fn=0x7fc099396419 <__glusterd_commit_op_cbk>) at glusterd-rpc-ops.c:207
#7  0x00007fc099396a9a in glusterd_commit_op_cbk (req=0x7fc08800f1cc, iov=0x7fc08800f20c, count=1, myframe=0x7fc08800f0b4) at glusterd-rpc-ops.c:1371
#8  0x00007fc0a2ebdc1b in rpc_clnt_handle_reply (clnt=0xaf58b0, pollin=0x7fc08800a7a0) at rpc-clnt.c:761
#9  0x00007fc0a2ebe010 in rpc_clnt_notify (trans=0xaf5d20, mydata=0xaf58e0, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7fc08800a7a0) at rpc-clnt.c:889
#10 0x00007fc0a2eba69a in rpc_transport_notify (this=0xaf5d20, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7fc08800a7a0) at rpc-transport.c:538
#11 0x00007fc097df912c in socket_event_poll_in (this=0xaf5d20) at socket.c:2285
#12 0x00007fc097df95d8 in socket_event_handler (fd=12, idx=2, data=0xaf5d20, poll_in=1, poll_out=0, poll_err=0) at socket.c:2398
#13 0x00007fc0a3168146 in event_dispatch_epoll_handler (event_pool=0xa77ca0, event=0x7fc096dbcea0) at event-epoll.c:567
#14 0x00007fc0a3168499 in event_dispatch_epoll_worker (data=0xa82140) at event-epoll.c:669
#15 0x0000003c0040752a in start_thread () from /lib64/libpthread.so.0
#16 0x0000003c0010079d in clone () from /lib64/libc.so.6