Description of problem:
------------------------
On a replicate volume, when the "gluster volume heal <vol_name> full" command is executed on several nodes while a few bricks are down, each node reports a different error message.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
3.3.0qa45

How reproducible:
-----------------
Often

Steps to Reproduce:
-------------------
1. Create a replicate volume (1x3: brick1 on node1, brick2 on node2, brick3 on node3) and start the volume.
2. Bring down brick1 and brick2.
3. Create a FUSE mount.
4. Execute: "dd if=/dev/urandom of=./file bs=1M count=1"
5. On machine1, machine2 and machine3 execute: gluster v heal <volume_name> full
(A consolidated reproduction sketch is included after the Additional info section below.)

Actual results:
---------------
[06/08/12 - 08:07:21 root@AFR-Server1 ~]# gluster v heal vol1 full
Operation failed on 10.16.159.196

[06/08/12 - 08:07:17 root@AFR-Server2 ~]# gluster v heal vol1 full
Operation failed on 10.16.159.196

[06/08/12 - 08:06:10 root@AFR-Server3 ~]# gluster v heal vol1 full
Heal operation on volume vol1 has been unsuccessful

Expected results:
-----------------
The error message should be consistent on all the nodes.

Additional info:
----------------
[06/08/12 - 08:09:04 root@AFR-Server3 ~]# gluster v status
Status of volume: vol1
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick 10.16.159.184:/export_b1/dir1                     24009   N       8935
Brick 10.16.159.188:/export_b1/dir1                     24009   N       12896
Brick 10.16.159.196:/export_b1/dir1                     24009   Y       28360
NFS Server on localhost                                 38467   Y       28725
Self-heal Daemon on localhost                           N/A     Y       28731
NFS Server on 10.16.159.184                             38467   Y       8867
Self-heal Daemon on 10.16.159.184                       N/A     Y       8873
NFS Server on 10.16.159.188                             38467   Y       13263
Self-heal Daemon on 10.16.159.188                       N/A     Y       13269

[06/08/12 - 08:09:08 root@AFR-Server3 ~]#
[06/08/12 - 08:09:10 root@AFR-Server3 ~]# gluster v info

Volume Name: vol1
Type: Replicate
Volume ID: e5ff8b2b-7d44-405e-8266-54e5e68b0241
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.16.159.184:/export_b1/dir1
Brick2: 10.16.159.188:/export_b1/dir1
Brick3: 10.16.159.196:/export_b1/dir1
Options Reconfigured:
cluster.eager-lock: on
performance.write-behind: on
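Reproduction sketch:
--------------------
The steps above can be collected into a rough script, run from node1 (10.16.159.184). This is only a sketch under assumptions not stated in the report: passwordless ssh between the nodes, a mount point of /mnt/vol1, and killing the glusterfsd brick process as the way to "bring down" a brick.

# Create and start the 1x3 replicate volume (brick paths taken from the report)
gluster volume create vol1 replica 3 \
    10.16.159.184:/export_b1/dir1 \
    10.16.159.188:/export_b1/dir1 \
    10.16.159.196:/export_b1/dir1
gluster volume start vol1

# Bring down brick1 (local) and brick2 (node2) by killing their brick processes
# (assumed method; any way of stopping the two bricks should do)
kill -9 $(ps aux | grep '[g]lusterfsd.*export_b1' | awk '{print $2}')
ssh 10.16.159.188 "kill -9 \$(ps aux | grep '[g]lusterfsd.*export_b1' | awk '{print \$2}')"

# FUSE-mount the volume and create a file on it
mkdir -p /mnt/vol1
mount -t glusterfs 10.16.159.196:/vol1 /mnt/vol1
dd if=/dev/urandom of=/mnt/vol1/file bs=1M count=1

# Trigger a full self-heal from each node and compare the error messages
gluster volume heal vol1 full
ssh 10.16.159.188 "gluster volume heal vol1 full"
ssh 10.16.159.196 "gluster volume heal vol1 full"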
The bug was fixed in the current release via the rebase.
All three nodes now give the same output.