Bug 1211866
Summary: Concurrently detaching a peer on one node and stopping glusterd on the other node leads to a deadlock on the former node

Product: [Community] GlusterFS
Component: glusterd
Version: mainline
Hardware: x86_64
OS: Linux
Status: CLOSED WORKSFORME
Severity: high
Priority: unspecified
Reporter: SATHEESARAN <sasundar>
Assignee: Atin Mukherjee <amukherj>
QA Contact: Byreddy <bsrirama>
CC: amukherj, bsrirama, bugs, sasundar
Keywords: Triaged
Target Milestone: ---
Target Release: ---
Whiteboard: GlusterD
Doc Type: Bug Fix
Story Points: ---
Type: Bug
Last Closed: 2016-01-25 05:45:43 UTC
Attachments: cli.log file from node1
Description (SATHEESARAN, 2015-04-15 06:51:34 UTC)

Following logs are seen in cli.log:

<snip>
[2015-04-14 16:41:28.732861] I [cli-cmd-volume.c:1832:cli_check_gsync_present] 0-: geo-replication not installed
[2015-04-14 16:41:28.733712] I [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-04-14 16:41:28.733855] I [socket.c:2409:socket_event_handler] 0-transport: disconnecting now
[2015-04-14 16:41:31.732510] I [socket.c:2409:socket_event_handler] 0-transport: disconnecting now
[2015-04-14 16:41:34.733371] I [socket.c:2409:socket_event_handler] 0-transport: disconnecting now
[2015-04-14 16:41:37.734100] I [socket.c:2409:socket_event_handler] 0-transport: disconnecting now
[2015-04-14 16:41:40.734911] I [socket.c:2409:socket_event_handler] 0-transport: disconnecting now
[2015-04-14 16:41:43.735664] I [socket.c:2409:socket_event_handler] 0-transport: disconnecting now
</snip>

Created attachment 1014595 [details]
cli.log file from node1
Comment #3 (krishnan parthasarathi)

Satheesaran, could you attach a core taken on the hung glusterd process? You could do that by attaching gdb to the process and issuing gcore inside gdb.

Comment (SATHEESARAN, in reply to krishnan parthasarathi from comment #3)

I have already got that core file, but couldn't attach it to the bug because Bugzilla was slow to reach yesterday. Attaching it now.

Comment #7 (Byreddy)

Checked this issue with the latest 3.1.2 build (glusterfs-3.7.5-17) on RHEL 7:

1. Created a two-node cluster (N1 and N2).
2. Performed the steps below using the terminator tool to broadcast commands concurrently to the two nodes:
   a) On N1: gluster peer detach N2
   b) On N2: systemctl stop glusterd

No issues observed, and gluster commands issued on N1 returned responses.

Based on the above verification details, closing this bug as working in the current release.

Comment (in reply to Byreddy from comment #7)

Thanks, Byreddy, for verifying this issue with the latest RHGS 3.1.2 nightly build. I will reopen this bug if the issue happens again.