Bug 1023636 - Inconsistent UUIDs not causing an error that would stop the system
Summary: Inconsistent UUIDs not causing an error that would stop the system
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: 3.3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-10-26 11:59 UTC by Dean Bruhn
Modified: 2014-12-14 19:40 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-12-14 19:40:32 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Dean Bruhn 2013-10-26 11:59:15 UTC
Description of problem:
An issue came up where my log files overran my /var partition, which caused my volume configurations to become corrupt on several servers. Because of this, I needed to copy the config files from a known good host, using the gluster commands to do so.

Part of the resolution was to re-probe the servers to get them back in sync. I ended up with several servers whose UUIDs did not match the peer files across the cluster. The gluster commands would work intermittently if I spammed them, and the cluster and volume were functioning, but not optimally: they would suffer a lot of what seemed to be random disconnects, etc.

What I was seeing was that the glusterd.info file on server1 would have a UUID of 6e5df7ab-4440-49b2-9aa7-a5984391b5f5, but its peer files on each of the other 10 servers would reference either the correct 6e5df7ab-4440-49b2-9aa7-a5984391b5f5 or 35e62be3-752e-48aa-aa15-dc68abfaf06f (these UUIDs are only examples).
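
To illustrate the mismatch, here is a minimal Python sketch that compares the UUID a host reports for itself with the UUID another server has recorded for it. It assumes both files are plain key=value text with a "UUID=" line in glusterd.info and a "uuid=" line in the peer file; the sample file contents and the function names are hypothetical, and this is not how glusterd itself performs the comparison.

```python
def parse_key_values(text):
    """Parse simple key=value lines into a dict with lower-cased keys."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, _, value = line.partition("=")
        result[key.strip().lower()] = value.strip()
    return result


def peer_matches_local(glusterd_info_text, peer_file_text):
    """Return True when the peer file records the host's own UUID."""
    local = parse_key_values(glusterd_info_text).get("uuid")
    recorded = parse_key_values(peer_file_text).get("uuid")
    if local is None or recorded is None:
        return False  # a missing UUID is treated as a mismatch
    return local.lower() == recorded.lower()


# Assumed file contents, using the example UUIDs from this report.
server1_info = "UUID=6e5df7ab-4440-49b2-9aa7-a5984391b5f5\n"
good_peer = "uuid=6e5df7ab-4440-49b2-9aa7-a5984391b5f5\nhostname1=server1\n"
stale_peer = "uuid=35e62be3-752e-48aa-aa15-dc68abfaf06f\nhostname1=server1\n"

print(peer_matches_local(server1_info, good_peer))
print(peer_matches_local(server1_info, stale_peer))
```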

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:

The system ran in an inconsistent state even though the UUID configurations didn't match across the cluster.


Expected results:

The system would not function until the UUIDs were corrected.


Additional info:

It might be useful to have some sort of UUID consistency check algorithm that either stops the system and throws an error, or is intelligent enough to correct the problem.
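
As a rough idea of what such a check could look like, the Python sketch below takes the UUID each host reports for itself and the UUIDs every other server has recorded for its peers, then lists each disagreement. The data layout, the function name find_uuid_mismatches, and the host names server2 and server3 are assumptions made for illustration; this is not an existing glusterd feature.

```python
def find_uuid_mismatches(own_uuids, peer_views):
    """
    own_uuids:  {hostname: UUID the host reports for itself}
    peer_views: {observer hostname: {peer hostname: UUID recorded for that peer}}
    Returns a list of (observer, peer, recorded, expected) tuples,
    sorted by observer and then peer.
    """
    mismatches = []
    for observer, view in peer_views.items():
        for peer, recorded in view.items():
            expected = own_uuids.get(peer)
            # An unknown peer or a differing UUID both count as inconsistent.
            if expected is None or recorded.lower() != expected.lower():
                mismatches.append((observer, peer, recorded, expected))
    return sorted(mismatches, key=lambda m: (m[0], m[1]))


GOOD = "6e5df7ab-4440-49b2-9aa7-a5984391b5f5"
STALE = "35e62be3-752e-48aa-aa15-dc68abfaf06f"

own_uuids = {"server1": GOOD}
peer_views = {
    "server2": {"server1": GOOD},   # hypothetical host with a correct peer entry
    "server3": {"server1": STALE},  # hypothetical host with a stale peer entry
}

problems = find_uuid_mismatches(own_uuids, peer_views)
for observer, peer, recorded, expected in problems:
    print(f"{observer} records {peer} as {recorded}, expected {expected}")
```

A management daemon could refuse to start, or only log an error, when the returned list is not empty; which of the two is appropriate is a design decision this sketch does not settle.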

Comment 1 Niels de Vos 2014-11-27 14:54:35 UTC
The version that this bug has been reported against does not get any updates from the Gluster Community anymore. Please verify whether this report is still valid against a current (3.4, 3.5 or 3.6) release and update the version, or close this bug.

If there has been no update before 9 December 2014, this bug will get automatically closed.

