Red Hat Bugzilla – Bug 851989
Glusterd was unable to start due to missing entries in one of the peer files.
Last modified: 2015-11-03 18:04:37 EST
Description of problem: glusterd was unable to write the entries to one of its peer files, most probably because the root partition had run out of space. glusterd kept running, possibly because it already held that peer's information in memory. On restart, however, glusterd fails in init, even though the root partition now has enough space and the information about the other peers is intact. It stays in that state until the peer file is rewritten manually. Since glusterd has the information about the other peers, it should be able to obtain the missing peer's information from the peers that do have it.
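The failure mode described above (a write that fails on a full disk and leaves an empty file behind) is commonly avoided by writing to a temporary file and renaming it over the target only after the data is safely on disk. The following is a minimal sketch of that general technique, not a description of what glusterd or the linked patches do. The function name and the key=value layout are assumptions made for illustration.

```python
import os
import tempfile


def write_peer_file_atomically(path, entries):
    """Write key=value lines to a temp file, then rename it over `path`.

    Hypothetical helper: the name and file layout are illustrative only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".peer-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            for key, value in entries.items():
                tmp.write(f"{key}={value}\n")
            tmp.flush()
            # A full disk surfaces as OSError during write, flush, or fsync.
            os.fsync(tmp.fileno())
        # Atomic on POSIX when source and target share a filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        # Leave the previous file untouched and remove the partial temp file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this pattern, a failed write leaves the previous peer file in place instead of an empty one, so a later restart would still find the old entries.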
Version-Release number of selected component (if applicable): RHS-2.0.z
Steps to Reproduce:
1. Peer probe four machines.
2. Remove the entries from one of the peer files on one of the machines (do not remove the file itself).
3. Try to restart glusterd on the machine from which the entries were removed.
Actual results: glusterd fails in init.
Expected results: glusterd should start successfully.
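To locate the affected file before rewriting it manually, a small script can scan the peer directory for files that are empty or missing entries. This is a rough sketch under stated assumptions: that each peer file holds key=value lines with the keys `uuid`, `state`, and `hostname1`, and that the peer directory is `/var/lib/glusterd/peers`. Neither detail is stated in this report, and both function names are hypothetical.

```python
from pathlib import Path

# Assumed keys of a peer file; not taken from this bug report.
REQUIRED_KEYS = ("uuid", "state", "hostname1")


def parse_peer_file(path):
    """Return the key=value pairs found in one peer file."""
    entries = {}
    for line in Path(path).read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip():
            entries[key.strip()] = value.strip()
    return entries


def find_broken_peer_files(peers_dir):
    """Map each incomplete peer file name to the list of keys it is missing."""
    broken = {}
    for path in sorted(Path(peers_dir).iterdir()):
        if not path.is_file():
            continue
        entries = parse_peer_file(path)
        # A key counts as missing if it is absent or has an empty value.
        missing = [key for key in REQUIRED_KEYS if not entries.get(key)]
        if missing:
            broken[path.name] = missing
    return broken
```

Calling `find_broken_peer_files("/var/lib/glusterd/peers")` on the affected machine (path assumed) would list the file emptied in step 2 together with the keys it lacks.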
Since the bug was hit only when disk space was unavailable, it is not being considered a blocker.
http://review.gluster.com/654 and http://review.gluster.com/3726 fix the issue on master. Need to review whether these are needed for the 2.0.z branch.
Verified on glusterfs-18.104.22.168rhs-1.el6rhs.x86_64.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.