Description of problem:
When a volume is destroyed and recreated while one of its bricks is down, the new volume is unable to connect to one of the bricks.

Version-Release number of selected component (if applicable):
master

How reproducible:
Always

Steps to Reproduce:
1. glusterd
2. gluster volume create test replica 2 server:/bricks/test{1..2} force
3. gluster volume start test
4. mount -t glusterfs server:/test /gluster/test
5. kill -9 <pid of one brick>
6. umount /gluster/test
7. gluster volume stop test
8. gluster volume delete test
9. rm -rf /bricks/test*
10. gluster volume create test replica 2 server:/bricks/test{1..2} force
11. gluster volume start test
12. mount -t glusterfs server:/test /gluster/test

Actual results:
Everything appears to be fine, and even 'gluster volume status' shows all bricks as online. However, the logs report that one brick is not connected, and data written to the volume reaches only one brick.

# gluster volume status
Status of volume: test
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick server:/bricks/test_1                     49154   Y       695
Brick server:/bricks/test_2                     49155   Y       707
NFS Server on localhost                         2049    Y       721
Self-heal Daemon on localhost                   N/A     Y       727

Task Status of Volume test
------------------------------------------------------------------------------
There are no active volume tasks

logs:
[2015-01-15 12:20:04.260781] I [rpc-clnt.c:1765:rpc_clnt_reconfig] 0-test-client-1: changing port to 49153 (from 0)
[2015-01-15 12:20:04.283114] E [socket.c:2276:socket_connect_finish] 0-test-client-1: connection to 192.168.200.61:49153 failed (Connection refused)

Note that the port shown in the log does not correspond to the port shown by the 'gluster volume status' command: the port in the log is the one that brick used in the previous volume.

Expected results:
The new volume should connect successfully to the new bricks.

Additional info:
Restarting glusterd solves the problem.
This is fixed through BZ 1334270.