Bug 1105543 - [SNAPSHOT]: glusterd hangs when a node with stale snap entries is attached to the cluster
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: snapshot
Version: rhgs-3.0
Hardware: Unspecified
OS: Unspecified
high
medium
Target Milestone: ---
: ---
Assignee: Avra Sengupta
QA Contact: Anoop
URL:
Whiteboard: SNAPSHOT
Depends On:
Blocks: 1087818
Reported: 2014-06-06 11:30 UTC by senaik
Modified: 2016-09-17 13:01 UTC

Fixed In Version:
Doc Type: Known Issue
Doc Text:
When a node with stale snap entries is attached to the cluster, the stale entries are propagated throughout the cluster, and snapshots that are no longer present are displayed. Workaround: Do not attach a peer that has stale snap entries.
Clone Of:
Environment:
Last Closed: 2016-01-29 13:19:21 UTC
Embargoed:


Description senaik 2014-06-06 11:30:52 UTC
Description of problem:
=======================
glusterd hangs when a node with stale snap entries is attached to the cluster.


Version-Release number of selected component (if applicable):
=============================================================

glusterfs 3.6.0.13

How reproducible:
================
1/1

Steps to Reproduce:
===================

4-node cluster: Node1, Node2, Node3, Node4

Attach a node (Node5) to the cluster while snapshot creation (snap_vol0_1 .. snap_vol0_n) is in progress.
Check the snapshots on the newly added peer.


On Node4:
---------
After the snapshots are completed, detach the node and attach it again while starting another round of snapshot creation (snap1 .. snapn).


The probe now succeeds, but gluster peer status hangs because glusterd is trying to start the bricks of the stale snap entry (snap_vol0_1), which is not present on Node4.

 E [glusterd-handshake.c:85:get_snap_volname_and_volinfo] 0-management: Failed to fetch snap snap_vol0_1
[2014-06-06 10:49:52.132065] E [glusterd-handshake.c:196:build_volfile_path] 0-management: Failed to get snap volinfo from path (/snaps/snap_vol0_1/a94839bbaf994733a5b591bb59731d9c.snapshot16.lab.eng.blr.redhat.com.var-run-gluster-snaps-a94839bbaf994733a5b591bb59731d9c-brick4-b0)
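
To find which snap entries glusterd could not resolve, the log can be scanned for the "Failed to fetch snap" message shown above. The following is a minimal sketch, not part of glusterd: it assumes each log entry occupies a single line in the actual log file, and the helper name stale_snap_names is hypothetical.

```python
import re

# Matches the message logged by get_snap_volname_and_volinfo in this report.
# The snap name is assumed to be the first whitespace-free token after it.
FETCH_FAILURE = re.compile(r"Failed to fetch snap (\S+)")


def stale_snap_names(log_lines):
    """Return distinct snap names from 'Failed to fetch snap' errors, in first-seen order."""
    seen = []
    for line in log_lines:
        match = FETCH_FAILURE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# The two log entries from this report, each joined onto one line.
sample = [
    " E [glusterd-handshake.c:85:get_snap_volname_and_volinfo] 0-management: "
    "Failed to fetch snap snap_vol0_1",
    "[2014-06-06 10:49:52.132065] E [glusterd-handshake.c:196:build_volfile_path] "
    "0-management: Failed to get snap volinfo from path (/snaps/snap_vol0_1/"
    "a94839bbaf994733a5b591bb59731d9c.snapshot16.lab.eng.blr.redhat.com."
    "var-run-gluster-snaps-a94839bbaf994733a5b591bb59731d9c-brick4-b0)",
]

print(stale_snap_names(sample))  # ['snap_vol0_1']
```

Only the first message carries the snap name as a separate token, so the second entry is deliberately not matched.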


Also, checking gluster peer status on the newly added peer shows the following:

[root@snapshot01 ~]# gluster peer status
Number of Peers: 1

Hostname: 10.70.40.172
Uuid: 2c797de6-e1e0-4a9a-8729-f5d3ce2de2a1
State: Sent and Received peer request (Connected)

It shows the status of only one node; the other nodes are not listed.
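
The peer status output above can be checked mechanically for peers that never reached the settled state. This is a rough sketch under two assumptions: that the output keeps the Hostname/Uuid/State layout shown above, and that a fully joined peer reports a state beginning with "Peer in Cluster". The function names are hypothetical.

```python
def parse_peer_status(text):
    """Parse `gluster peer status` output into a list of dicts, one per peer."""
    peers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Hostname:"):
            # A Hostname line opens a new peer record.
            current = {"hostname": line.split(":", 1)[1].strip()}
            peers.append(current)
        elif current is not None and ":" in line:
            # Uuid, State and any other "Key: value" line of the current peer.
            key, value = line.split(":", 1)
            current[key.strip().lower()] = value.strip()
    return peers


def unsettled_peers(peers):
    """Hostnames of peers whose state does not start with 'Peer in Cluster'."""
    return [
        peer["hostname"]
        for peer in peers
        if not peer.get("state", "").startswith("Peer in Cluster")
    ]


# Output copied from this report.
sample_output = """\
Number of Peers: 1

Hostname: 10.70.40.172
Uuid: 2c797de6-e1e0-4a9a-8729-f5d3ce2de2a1
State: Sent and Received peer request (Connected)
"""

peers = parse_peer_status(sample_output)
print(len(peers), unsettled_peers(peers))  # 1 ['10.70.40.172']
```

For the output in this report, the single listed peer is flagged because its state is still "Sent and Received peer request".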

Actual results:


Expected results:


Additional info:

Comment 8 Shalaka 2014-06-26 15:28:59 UTC
Please review and signoff edited doc text.

Comment 12 Avra Sengupta 2016-01-29 13:19:21 UTC
The current Gluster architecture does not support implementing a fix. Therefore, this fix is deferred until Glusterd 2.0.
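
Until then, the workaround from the Doc Text (do not attach a peer with stale snap entries) can be approximated with a check before probing. The sketch below is illustrative only. It assumes that a stale entry is a snap directory on the candidate peer whose name the cluster does not list, and that the peer keeps one directory per snap under a snaps directory (commonly /var/lib/glusterd/snaps, which is an assumption here and may differ per installation). Both function names are hypothetical.

```python
import os


def local_snap_entries(snaps_dir):
    """Sorted names of sub-directories under snaps_dir; each is treated as one snap entry."""
    if not os.path.isdir(snaps_dir):
        return []
    return sorted(
        entry
        for entry in os.listdir(snaps_dir)
        if os.path.isdir(os.path.join(snaps_dir, entry))
    )


def stale_entries(local_snaps, cluster_snaps):
    """Snap names present on the candidate peer but unknown to the cluster."""
    known = set(cluster_snaps)
    return [name for name in local_snaps if name not in known]


# Example with names from this report: the peer still holds snap_vol0_1,
# while the cluster (hypothetically) only lists snap1 and snap2.
print(stale_entries(["snap1", "snap_vol0_1"], ["snap1", "snap2"]))  # ['snap_vol0_1']
```

In practice, local_snaps would come from local_snap_entries() run on the candidate peer and cluster_snaps from the snapshot names listed on an existing cluster node. A non-empty result suggests cleaning up the peer before probing it.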

