Red Hat Bugzilla – Bug 765441
volume replace-brick unstable
Last modified: 2015-11-03 18:03:51 EST
tl;dr: A replace-brick operation fails after running for some time and leaves the cluster in an inconsistent state. Trying again requires a complete restart, and possibly a redefinition of the volume.
Volume Name: store
Number of Bricks: 2
gluster volume replace-brick store 126.96.36.199:/mnt/live 188.8.131.52:/mnt/live start
gluster volume replace-brick store 184.108.40.206:/mnt/live 220.127.116.11:/mnt/live status
This was started on 10/09/2011 around 15:00 (give or take an hour). The status output and the disk activity both indicated that it was working. However, at some point before 20:30 the same day, the transfer stopped. From then on, neither cloud0 nor cloud2 could report the status of the replace-brick.
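Since the exact time the transfer stopped is unknown, a timestamped poll of the status command would help narrow it down. The following is a minimal sketch, not something that was run for this report: `poll_status` is a hypothetical helper, and `OLD_BRICK` and `NEW_BRICK` are placeholders for the source and destination bricks.

```shell
# Hypothetical helper: run a command repeatedly and prefix each result
# with a UTC timestamp, so the log shows when the status last changed
# or when the command started failing.
# Usage: poll_status INTERVAL_SECONDS COUNT COMMAND [ARGS...]
poll_status() {
    interval="$1"
    count="$2"
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s ' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')"
        # Keep polling even if the command fails; record the exit code.
        "$@" 2>&1 || echo "(status command failed with exit code $?)"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Example (placeholders, adjust to the real bricks): poll once a minute
# for an hour and keep the output in a log file.
# poll_status 60 60 gluster volume replace-brick store \
#     "$OLD_BRICK" "$NEW_BRICK" status >> rb-status.log
```

Recording failures instead of exiting is deliberate: the point at which the status command itself stops answering is part of what needs to be captured.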
The next morning (around 11:00) I tried to restart the replace-brick. I was able to abort the previous operation and start a new one. The new one reported success, but failed immediately. After several attempts, the cluster entered an inconsistent state where cloud0 was trying to initiate a replace-brick operation that cloud2 thought was already in progress. Restarting all gluster processes on cloud2 did not resolve this. I was unable to restart the gluster processes on cloud0 because it is a production machine.
Following advice in #gluster, I checked the contents of the rbstate file:
root@cloud2:/var/log/glusterfs# cat /etc/glusterd/vols/store/rbstate
root@cloud0:/etc/glusterd/vols/store# cat rbstate
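To check whether the two nodes disagree about the replace-brick state, the two rbstate files can be compared directly. This is a rough sketch that assumes the files are expected to match across peers and that both have been copied to one machine first; `compare_rbstate` and the local file names are hypothetical.

```shell
# Hypothetical helper: compare two local copies of rbstate.
# Prints "consistent" if the files are byte-identical; otherwise prints
# "inconsistent" followed by a diff, and returns a non-zero status.
compare_rbstate() {
    a="$1"
    b="$2"
    if cmp -s "$a" "$b"; then
        echo "consistent"
    else
        echo "inconsistent"
        diff "$a" "$b" || true
        return 1
    fi
}

# Example (assumes ssh access to both nodes):
# scp cloud0:/etc/glusterd/vols/store/rbstate rbstate.cloud0
# scp cloud2:/etc/glusterd/vols/store/rbstate rbstate.cloud2
# compare_rbstate rbstate.cloud0 rbstate.cloud2
```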
Attached to this bug are the complete log directories for both cloud0 and cloud2. Please keep these files confidential, as they have not been anonymized.
CHANGE: http://review.gluster.com/609 (Change-Id: Ie14492451cab821e7ed60e68dbaff22d7d78fba9) merged in release-3.2 by Vijay Bellur (firstname.lastname@example.org)
CHANGE: http://review.gluster.com/2689 (glusterd: Refactored rb subcmds code and fixed some minor issues.) merged in master by Vijay Bellur (email@example.com)