Bug 1141694 - [RHEV-RHS]add-brick failed after performing series of add-brick, rebalance, remove-brick start converting distribute-replicate to replicate and vice versa
Summary: [RHEV-RHS]add-brick failed after performing series of add-brick, rebalance, r...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: distribute
Version: rhgs-3.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Bug Updates Notification Mailing List
QA Contact: storage-qa-internal@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-09-15 09:20 UTC by SATHEESARAN
Modified: 2014-09-15 10:31 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
virt rhev integration
Last Closed: 2014-09-15 10:31:59 UTC
Embargoed:


Attachments
glusterd log from RHSS-Node-1 (449.20 KB, text/plain)
2014-09-15 09:35 UTC, SATHEESARAN

Description SATHEESARAN 2014-09-15 09:20:20 UTC
Description of problem:
-----------------------
A Gluster volume was used to host virtual machine images [RHEV-RHS Integration].
Initially I used a 2x2 distribute-replicate volume. After adding replica pairs, I ran a rebalance. I then removed replica pairs until the volume became a replicate volume, and afterwards added a few more bricks to make it a distribute-replicate volume again.

After some time, add-brick reported a failure:
"volume add-brick: failed: Staging failed on 10.70.37.116. Error: Host 10.70.37.159 is not in 'Peer in Cluster' state
Staging failed on 10.70.37.175. Error: Host 10.70.37.159 is not in 'Peer in Cluster' state"

However, every peer listed on every host has the state 'Peer in Cluster'.
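
To make the "every peer on every host" check repeatable, the output of 'gluster peer status' can be collected on each node and scanned for peers in any other state. The following is a minimal sketch, assuming the usual "Hostname: / Uuid: / State:" block layout of that output. The sample text, its UUIDs, and the non-cluster state shown for 10.70.37.159 are illustrative only and are not taken from this bug.

```python
def peers_not_in_cluster(peer_status_text):
    """Return hostnames whose State line does not start with 'Peer in Cluster'.

    Assumes each peer is printed as a block containing a 'Hostname:' line
    followed later by a 'State:' line.
    """
    bad = []
    hostname = None
    for line in peer_status_text.splitlines():
        line = line.strip()
        if line.startswith("Hostname:"):
            hostname = line.split(":", 1)[1].strip()
        elif line.startswith("State:") and hostname is not None:
            state = line.split(":", 1)[1].strip()
            if not state.startswith("Peer in Cluster"):
                bad.append(hostname)
            hostname = None
    return bad


# Illustrative sample only (placeholder UUIDs, assumed output layout).
SAMPLE_PEER_STATUS = """\
Number of Peers: 3

Hostname: 10.70.37.116
Uuid: 00000000-0000-0000-0000-000000000001
State: Peer in Cluster (Connected)

Hostname: 10.70.37.159
Uuid: 00000000-0000-0000-0000-000000000002
State: Accepted peer request (Connected)

Hostname: 10.70.37.175
Uuid: 00000000-0000-0000-0000-000000000003
State: Peer in Cluster (Connected)
"""

if __name__ == "__main__":
    print(peers_not_in_cluster(SAMPLE_PEER_STATUS))
```

An empty list from every node only confirms what the CLI reports there; staging on another node can still fail if that node cannot resolve the peer.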

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
glusterfs-3.6.0.28-1.el6rhs

How reproducible:
-----------------
Haven't tried to reproduce

Steps to Reproduce:
-------------------
1. Create a 2x2 distributed-replicate volume with 4 RHSS nodes in the cluster,
each having 4 bricks

2. After optimizing the volume for virt-store, use it to host VM images

3. Add more bricks and perform rebalance

4. Run 'remove-brick start' on the volume to make it a replicate volume

5. Add more bricks to make it a distribute-replicate volume and perform a rebalance

6. Perform a 'remove-brick start' operation on the volume

7. Repeat steps 5 and 6
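
One iteration of steps 5 and 6 can be sketched as the gluster CLI command lines it would involve. This is only an illustration: the volume name and brick paths are hypothetical (the report does not give them), and the sketch builds the argument lists without running anything or waiting for the rebalance to finish.

```python
def add_brick_cmd(volume, bricks):
    """Argument list for adding one replica pair to the volume."""
    return ["gluster", "volume", "add-brick", volume, *bricks]


def rebalance_start_cmd(volume):
    """Argument list for starting a rebalance on the volume."""
    return ["gluster", "volume", "rebalance", volume, "start"]


def remove_brick_start_cmd(volume, bricks):
    """Argument list for starting removal of one replica pair."""
    return ["gluster", "volume", "remove-brick", volume, *bricks, "start"]


def one_iteration(volume, replica_pair):
    """Commands for step 5 (add-brick, rebalance) and step 6 (remove-brick start)."""
    return [
        add_brick_cmd(volume, replica_pair),
        rebalance_start_cmd(volume),
        remove_brick_start_cmd(volume, replica_pair),
    ]


if __name__ == "__main__":
    # Hypothetical volume name and brick paths, for illustration only.
    pair = ["10.70.37.159:/rhs/brick3/b1", "10.70.37.175:/rhs/brick3/b1"]
    for cmd in one_iteration("vmstore", pair):
        print(" ".join(cmd))
```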

Actual results:
---------------
After some iterations, add-brick fails with the error message:
volume add-brick: failed: Staging failed on 10.70.37.116. Error: Host 10.70.37.159 is not in 'Peer in Cluster' state
Staging failed on 10.70.37.175. Error: Host 10.70.37.159 is not in 'Peer in Cluster' state

Expected results:
-----------------
add-brick should succeed without any errors

Comment 2 SATHEESARAN 2014-09-15 09:35:55 UTC
Created attachment 937498
glusterd log from RHSS-Node-1

Attached the glusterd log from RHSS-Node-1

Comment 3 Atin Mukherjee 2014-09-15 10:31:59 UTC
Based on the analysis done, this is not a bug. A peer update was attempted from 10.70.37.159 without an /etc/hosts entry for the IP and hostname, so hostname resolution failed on this node and on 10.70.37.175.

Adding the entry to /etc/hosts solves this problem.
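
A quick way to check for this condition on each node is to confirm that every peer's hostname has an /etc/hosts entry pointing at the expected IP. The following is a rough sketch, not part of glusterd. The hostnames are hypothetical since the report does not list them, and can_resolve() simply asks the system resolver, which may also consult DNS.

```python
import socket


def hosts_entries(hosts_text):
    """Map each name found in hosts-file text to its address (first match wins)."""
    entries = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        fields = line.split()
        if len(fields) < 2:
            continue
        address, names = fields[0], fields[1:]
        for name in names:
            entries.setdefault(name, address)
    return entries


def missing_peers(hosts_text, expected):
    """Return peer names that are absent or mapped to a different address."""
    entries = hosts_entries(hosts_text)
    return [name for name, address in expected.items()
            if entries.get(name) != address]


def can_resolve(name):
    """Return True if the system resolver can resolve the name."""
    try:
        socket.getaddrinfo(name, None)
    except socket.gaierror:
        return False
    return True


# Hypothetical hostnames; only the IPs appear in this report.
EXPECTED_PEERS = {
    "rhss-a.example.com": "10.70.37.159",
    "rhss-b.example.com": "10.70.37.175",
}

SAMPLE_HOSTS = """\
127.0.0.1   localhost
10.70.37.159 rhss-a.example.com rhss-a  # entry present
"""

if __name__ == "__main__":
    with open("/etc/hosts") as f:
        print(missing_peers(f.read(), EXPECTED_PEERS))
```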

