Bug 1036551 - [RFE] : Start glusterd even when the glusterd is unable to resolve the bricks path.
Keywords:
Status: CLOSED EOL
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterd
Version: 2.1
Hardware: Unspecified
OS: Unspecified
unspecified
medium
Target Milestone: ---
: ---
Assignee: Samikshan Bairagya
QA Contact: storage-qa-internal@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1294412
Reported: 2013-12-02 09:37 UTC by spandura
Modified: 2016-06-24 08:51 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
: 1294412 (view as bug list)
Environment:
Last Closed: 2015-12-28 06:12:44 UTC
Target Upstream Version:


Description spandura 2013-12-02 09:37:42 UTC
Description of problem:
========================
Consider the case where a storage node is stopped and, on restart, gets a new IP address or hostname. (When Amazon EC2 instances are stopped and brought back online, the hostname and the IP address change. The storage node still has the cluster information, but the brick paths change.)

When the storage node restarts, glusterd tries to start and fails because it is not able to resolve the brick path. The error messages we get are:

[2013-11-28 10:14:28.928388] E [glusterd-store.c:1905:glusterd_store_retrieve_volume] 0-: Unknown key: brick-0
[2013-11-28 10:14:28.928424] E [glusterd-store.c:1905:glusterd_store_retrieve_volume] 0-: Unknown key: brick-1
[2013-11-28 10:14:28.928441] E [glusterd-store.c:1905:glusterd_store_retrieve_volume] 0-: Unknown key: brick-2
[2013-11-28 10:14:28.928457] E [glusterd-store.c:1905:glusterd_store_retrieve_volume] 0-: Unknown key: brick-3
[2013-11-28 10:14:28.928472] E [glusterd-store.c:1905:glusterd_store_retrieve_volume] 0-: Unknown key: brick-4
[2013-11-28 10:14:28.928488] E [glusterd-store.c:1905:glusterd_store_retrieve_volume] 0-: Unknown key: brick-5
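When a log contains many such lines, it can help to pull out just the keys that glusterd rejected. The following is a minimal sketch, not part of glusterd: it assumes the line layout seen above (timestamp, level, source location, domain, message), and the function name unknown_store_keys is hypothetical.

```python
import re

# Layout assumed from the lines quoted in this report:
# [timestamp] LEVEL [file:line:function] domain: message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<function>[^\]]+)\]\s+"
    r"(?P<domain>\S*):\s+(?P<message>.*)$"
)

UNKNOWN_KEY_PREFIX = "Unknown key: "


def unknown_store_keys(lines):
    """Return the keys reported as 'Unknown key: <key>' at error level.

    Hypothetical helper: lines that do not match the assumed layout,
    or that are not error-level, are skipped.
    """
    keys = []
    for raw in lines:
        match = LOG_LINE.match(raw.strip())
        if match is None or match.group("level") != "E":
            continue
        message = match.group("message")
        if message.startswith(UNKNOWN_KEY_PREFIX):
            keys.append(message[len(UNKNOWN_KEY_PREFIX):].strip())
    return keys
```

Passing the log lines quoted above to unknown_store_keys would list the brick keys in the order they were logged, which makes it easier to see which brick entries in the stored volume information were rejected.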


glusterd should start even when it is not able to resolve the brick path.

Unless glusterd is running, we cannot perform any "peer" or "volume" operations.

Version-Release number of selected component (if applicable):
=============================================================
glusterfs 3.4.0.44.1u2rhs built on Nov 25 2013 08:17:39

How reproducible:
==================
Often

Steps to Reproduce:
====================
1. Create a 1 x 2 replicate volume. Start the volume.

2. Stop one of the storage nodes (shutdown). Restart the node. (The restart should change the IP/hostname of the node.)

Comment 2 Paul Armstrong 2013-12-21 14:16:16 UTC
After updating from glusterfs 3.4.0.33rhs to glusterfs 3.4.0.44rhs, I experienced the same errors.

I found that the peer definitions for the volumes in question were using aliases instead of FQDNs. I edited the peer files on each server to include the FQDN and glusterd now starts without error.
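Before editing peer files by hand, it may be useful to see which stored peer hostnames no longer resolve from the node. The sketch below is a read-only check, not a fix. It assumes the peer files are plain key=value text whose hostname entries use keys starting with "hostname", and that they live in a directory such as /var/lib/glusterd/peers; both are assumptions about a default installation, and the function names are hypothetical.

```python
import os
import socket


def read_store_file(path):
    """Parse a key=value text file into a dict (assumed peer file format)."""
    entries = {}
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line or "=" not in line:
                continue
            key, _, value = line.partition("=")
            entries[key] = value
    return entries


def resolves(hostname):
    """Return True if the local resolver can look up the hostname."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except (socket.gaierror, UnicodeError):
        return False


def unresolvable_peers(peers_dir, resolver=resolves):
    """List (peer file name, hostname) pairs whose hostname does not resolve.

    peers_dir is the directory holding the peer files; the caller supplies
    it because its location is an assumption, not something this report states.
    """
    problems = []
    for name in sorted(os.listdir(peers_dir)):
        path = os.path.join(peers_dir, name)
        if not os.path.isfile(path):
            continue
        for key, value in read_store_file(path).items():
            if key.startswith("hostname") and not resolver(value):
                problems.append((name, value))
    return problems
```

Calling unresolvable_peers with the peers directory returns an empty list when every stored hostname resolves. Any pair it does return points at a peer file whose hostname entry may need the kind of correction described above.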

