Description of problem:
=======================
glusterd hangs when a node with stale snap entries is attached to the cluster.

Version-Release number of selected component (if applicable):
=============================================================
glusterfs 3.6.0.13

How reproducible:
================
1/1

Steps to Reproduce:
===================
4 node cluster : Node1 Node2 Node3 Node4

Attach a node (Node5) to the cluster while snapshot creation (snap_vol0_1 .. snap_vol0_n) is in progress.

Check snapshots on the newly added peer.

On node 4 :
----------
After the snapshots are completed, detach the node and attach it again while starting snapshot creation again (snap1 .. snapn).

The probe is now successful, but gluster peer status hangs because glusterd is trying to start bricks of the stale snap entries (snap_vol0_1), which are not present on Node4.

E [glusterd-handshake.c:85:get_snap_volname_and_volinfo] 0-management: Failed to fetch snap snap_vol0_1
[2014-06-06 10:49:52.132065] E [glusterd-handshake.c:196:build_volfile_path] 0-management: Failed to get snap volinfo from path (/snaps/snap_vol0_1/a94839bbaf994733a5b591bb59731d9c.snapshot16.lab.eng.blr.redhat.com.var-run-gluster-snaps-a94839bbaf994733a5b591bb59731d9c-brick4-b0)

Checking gluster peer status on the newly added peer also shows the status below:

[root@snapshot01 ~]# gluster peer status
Number of Peers: 1

Hostname: 10.70.40.172
Uuid: 2c797de6-e1e0-4a9a-8729-f5d3ce2de2a1
State: Sent and Received peer request (Connected)

It shows the above status for one node only; the other nodes are not listed.

Actual results:

Expected results:

Additional info:
Logs : http://rhsqe-repo.lab.eng.blr.redhat.com/bugs_necessary_info/1105543/
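The reproduction steps above can be written out as an ordered command plan. The following is only a rough dry-run sketch, not a tested reproducer: it prints the gluster CLI commands without executing them. The volume name `vol0` is an assumption inferred from the snapshot names, running the first round from Node1 is an assumption (the report does not say which node was used), and the helper names `snapshot_commands` and `reproduction_plan` are hypothetical. The report overlaps the probe with snapshot creation; this sketch only approximates that by placing the probe after the first snapshot command.

```python
import shlex


def snapshot_commands(prefix, volume, count):
    """Build one `gluster snapshot create <snapname> <volname>` argv per snapshot."""
    return [
        ["gluster", "snapshot", "create", f"{prefix}{i}", volume]
        for i in range(1, count + 1)
    ]


def reproduction_plan(new_peer, volume, count=3):
    """Return ordered (node, argv) steps approximating the reported scenario.

    Assumptions: the first round runs on Node1, and the probe is serialised
    right after the first snapshot command instead of truly overlapping it.
    """
    plan = []

    # Round 1: probe the new node while snap_vol0_1..n are being created.
    round1 = snapshot_commands("snap_vol0_", volume, count)
    plan.append(("Node1", round1[0]))
    plan.append(("Node1", ["gluster", "peer", "probe", new_peer]))
    plan += [("Node1", cmd) for cmd in round1[1:]]

    # Check which snapshots the newly added peer knows about.
    plan.append((new_peer, ["gluster", "snapshot", "list"]))

    # Round 2 (on Node4): detach, then re-probe while snap1..n are being created.
    plan.append(("Node4", ["gluster", "peer", "detach", new_peer]))
    round2 = snapshot_commands("snap", volume, count)
    plan.append(("Node4", round2[0]))
    plan.append(("Node4", ["gluster", "peer", "probe", new_peer]))
    plan += [("Node4", cmd) for cmd in round2[1:]]

    # This is the command that was reported to hang.
    plan.append(("Node4", ["gluster", "peer", "status"]))
    return plan


if __name__ == "__main__":
    for node, argv in reproduction_plan("Node5", "vol0"):
        print(f"[{node}] {shlex.join(argv)}")
```

Keeping the plan as data rather than running it directly makes it easy to review the sequence before trying it on a disposable test cluster.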
Please review and signoff edited doc text.
The current Gluster architecture does not support implementing a fix for this issue. The fix is therefore deferred until Glusterd 2.0.