Description of problem:
=========================
In a 1 x 2 replicate volume, while running ping_pong on a file, one of the brick processes went offline. Killed all the mount processes and unmounted the mount points. Brought the brick back online. Self-heal then happened from the source brick to the sink brick, but the healed-file information was logged in the sink node's glustershd.log file and reported under the sink node when the "heal info healed" command was executed.

Version-Release number of selected component (if applicable):
==============================================================
glusterfs 3.4.0.36rhs built on Oct 22 2013 10:56:18

How reproducible:
================
Often

Steps to Reproduce:
=====================
1. Create a replicate volume (1 x 2) and start it.

root@king [Sep-02-2013-12:31:44] >gluster v info

Volume Name: vol_dis_1_rep_2
Type: Replicate
Volume ID: 15a1734e-8485-4ef2-a82b-ddafff2fc97e
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: hicks.lab.eng.blr.redhat.com:/rhs/bricks/vol_dis_1_rep_2_b0
Brick2: king.lab.eng.blr.redhat.com:/rhs/bricks/vol_dis_1_rep_2_b1
Options Reconfigured:
performance.write-behind: on
cluster.self-heal-daemon: on

2. Create 4 fuse mounts.

3. From all the mount points start ping_pong: "ping_pong -rw ping_pong_testfile 6"

4. While ping_pong is in progress, get the brick PID and kill one brick (brick1) with "kill -KILL <brick_pid>" (a scripted variant is sketched after these steps).

root@rhs-client11 [Oct-25-2013- 9:33:57] >ps -ef | grep glusterfsd
root      1272     1  2 09:27 ?        00:00:08 /usr/sbin/glusterfsd -s rhs-client11 --volfile-id vol_rep.rhs-client11.rhs-bricks-b1 -p /var/lib/glusterd/vols/vol_rep/run/rhs-client11-rhs-bricks-b1.pid -S /var/run/fa6cf6fce4458a2be5fc60a4dc3bc11d.socket --brick-name /rhs/bricks/b1 -l /var/log/glusterfs/bricks/rhs-bricks-b1.log --xlator-option *-posix.glusterd-uuid=8b2090ab-c382-4c8a-85ea-ebab93df4c24 --brick-port 49153 --xlator-option vol_rep-server.listen-port=49153

kill -KILL 1272

5. After some time, kill all the mount processes and unmount the mount points.

6. Bring the brick back online by starting it from the command line. For example:

/usr/sbin/glusterfsd -s rhs-client11 --volfile-id vol_rep.rhs-client11.rhs-bricks-b1 -p /var/lib/glusterd/vols/vol_rep/run/rhs-client11-rhs-bricks-b1.pid -S /var/run/fa6cf6fce4458a2be5fc60a4dc3bc11d.socket --brick-name /rhs/bricks/b1 -l /var/log/glusterfs/bricks/rhs-bricks-b1.log --xlator-option *-posix.glusterd-uuid=8b2090ab-c382-4c8a-85ea-ebab93df4c24 --brick-port 49153 --xlator-option vol_rep-server.listen-port=49153
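Step 4 can be scripted without grepping through ps output by reading the brick's pidfile, whose path appears in the glusterfsd command line above (the -p option). A minimal sketch, assuming the vol_rep brick layout shown in that ps output:

# Read the brick PID from the pidfile glusterd maintains for this brick
BRICK_PID=$(cat /var/lib/glusterd/vols/vol_rep/run/rhs-client11-rhs-bricks-b1.pid)
# Simulate an abrupt brick failure while ping_pong is still running
kill -KILL "$BRICK_PID"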
Actual results:
=================
root@rhs-client12 [Oct-25-2013- 9:34:32] >gluster v status
Status of volume: vol_rep
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick rhs-client11:/rhs/bricks/b1               N/A     N       1272
Brick rhs-client12:/rhs/bricks/b2               49153   Y       29620
NFS Server on localhost                         2049    Y       29867
Self-heal Daemon on localhost                   N/A     Y       29875
NFS Server on rhs-client13                      2049    Y       23263
Self-heal Daemon on rhs-client13                N/A     Y       23267
NFS Server on rhs-client11                      2049    Y       1226
Self-heal Daemon on rhs-client11                N/A     Y       1237

root@rhs-client12 [Oct-25-2013- 9:35:03] >gluster v heal vol_rep info
Gathering list of entries to be healed on volume vol_rep has been successful

Brick rhs-client11:/rhs/bricks/b1
Status: Brick is Not connected
Number of entries: 0

Brick rhs-client12:/rhs/bricks/b2
Number of entries: 1
/ping_pong_testfile

root@rhs-client12 [Oct-25-2013- 9:35:09] >gluster v heal vol_rep info
Gathering list of entries to be healed on volume vol_rep has been successful

Brick rhs-client11:/rhs/bricks/b1
Number of entries: 0

Brick rhs-client12:/rhs/bricks/b2
Number of entries: 0

root@rhs-client12 [Oct-25-2013- 9:35:13] >gluster v heal vol_rep info healed
Gathering list of healed entries on volume vol_rep has been successful

Brick rhs-client11:/rhs/bricks/b1
Number of entries: 1
at                    path on brick
-----------------------------------
2013-10-25 09:35:10 <gfid:e85f328d-423e-4488-9a03-e018ee85db77>

Brick rhs-client12:/rhs/bricks/b2
Number of entries: 0

The healed entry is reported under rhs-client11 (the sink node, whose brick was killed) rather than under rhs-client12 (the source node).

Expected results:
==================
1. The files which were self-healed should be reported under the source storage node (rhs-client12 in this run).
2. The healed-file information should be logged in the source node's glustershd.log file.
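To verify independently which brick is the source and which is the sink for the pending heal, the AFR changelog xattrs can be inspected directly on the bricks: a brick holding non-zero trusted.afr.* pending counters for its peer is a source, and the accused peer is the sink. A minimal sketch, assuming the vol_rep brick paths shown above (run each command on the respective node):

# Dump the AFR changelog xattrs of the test file on each brick
getfattr -d -m trusted.afr -e hex /rhs/bricks/b1/ping_pong_testfile
getfattr -d -m trusted.afr -e hex /rhs/bricks/b2/ping_pong_testfile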
Created attachment 816078 [details] SOS Reports
The command "gluster volume heal <volume_name> info healed" is not supported anymore from the gluster build : "[root@rhs-client11 ~]# gluster --version glusterfs 3.6.0.15 built on Jun 9 2014 11:03:54" Refer to bug : https://bugzilla.redhat.com/show_bug.cgi?id=1104486 Hence this bug is not valid anymore. Moving the bug to CLOSED state.