Bug 1200252

Summary: Self heal command gives error "Launching heal operation to perform index self heal on volume vol0 has been unsuccessful"
Product: [Red Hat Storage] Red Hat Gluster Storage
Component: replicate
Version: rhgs-3.0
Hardware: x86_64
OS: Linux
Status: CLOSED NEXTRELEASE
Severity: high
Priority: unspecified
Reporter: Anil Shah <ashah>
Assignee: Ravishankar N <ravishankar>
QA Contact: Anil Shah <ashah>
CC: rhs-bugs, storage-qa-internal, vagarwal
Doc Type: Bug Fix
Type: Bug
Cloned to: 1294612
Bug Blocks: 1294612, 1302291, 1306922
Last Closed: 2015-12-29 09:39:06 UTC

Description Anil Shah 2015-03-10 06:29:31 UTC
Description of problem:

When there are multiple sources to heal, starting the self-heal daemon while one of the source bricks is down gives the error "Launching heal operation to perform index self heal on volume vol0 has been unsuccessful", even though the heal itself is successful.
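Despite the message, the heal does complete; this can be confirmed with the heal-info command (shown here as a sketch, output omitted; once healing finishes, every brick should report zero pending entries):

[root@localhost ~]# gluster volume heal vol0 info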

Version-Release number of selected component (if applicable):

[root@localhost ~]# rpm -qa | grep glusterfs
glusterfs-api-3.6.0.50-1.el6rhs.x86_64
glusterfs-geo-replication-3.6.0.50-1.el6rhs.x86_64
glusterfs-3.6.0.50-1.el6rhs.x86_64
samba-glusterfs-3.6.509-169.4.el6rhs.x86_64
glusterfs-fuse-3.6.0.50-1.el6rhs.x86_64
glusterfs-server-3.6.0.50-1.el6rhs.x86_64
glusterfs-rdma-3.6.0.50-1.el6rhs.x86_64
glusterfs-libs-3.6.0.50-1.el6rhs.x86_64
glusterfs-cli-3.6.0.50-1.el6rhs.x86_64
glusterfs-debuginfo-3.6.0.50-1.el6rhs.x86_64


How reproducible:
100%

Steps to Reproduce:
1. Create a 2x3 distributed-replicate volume and FUSE-mount it.
2. Turn off the self-heal daemon and data, metadata, and entry self-heal.
3. Kill brick 3 and brick 6.
4. Create some files on the mount point.
5. Bring brick 3 and brick 6 back up.
6. Kill brick 2 and brick 4, one from each subvolume.
7. Turn the self-heal daemon back on.
8. Trigger the heal, e.g. gluster v heal vol0 (see the sketch after this list).
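A minimal shell sketch of the sequence, assuming the volume name and brick paths from the gluster v info output below; the <pid-of-bN> arguments are placeholders to be looked up with gluster volume status vol0:

# 1. create a 2x3 distributed-replicate volume and FUSE-mount it
gluster volume create vol0 replica 3 \
    10.70.47.143:/rhs/brick1/b1 10.70.47.145:/rhs/brick1/b2 10.70.47.150:/rhs/brick1/b3 \
    10.70.47.151:/rhs/brick1/b4 10.70.47.143:/rhs/brick2/b5 10.70.47.145:/rhs/brick2/b6
gluster volume start vol0
mkdir -p /mnt/vol0
mount -t glusterfs 10.70.47.143:/vol0 /mnt/vol0

# 2. turn off the self-heal daemon and the client-side heals
gluster volume set vol0 cluster.self-heal-daemon off
gluster volume set vol0 cluster.data-self-heal off
gluster volume set vol0 cluster.metadata-self-heal off
gluster volume set vol0 cluster.entry-self-heal off

# 3. kill brick 3 and brick 6 (PIDs from 'gluster volume status vol0')
kill <pid-of-b3> <pid-of-b6>

# 4. create some files on the mount point
for i in $(seq 1 50); do echo data > /mnt/vol0/file-$i; done

# 5. bring brick 3 and brick 6 back up
gluster volume start vol0 force

# 6. kill brick 2 and brick 4
kill <pid-of-b2> <pid-of-b4>

# 7. turn the self-heal daemon back on
gluster volume set vol0 cluster.self-heal-daemon on

# 8. trigger the index heal; this is where the spurious error appears
gluster volume heal vol0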


Actual results:

"Launching heal operation to perform index self heal on volume vol0 has been unsuccessful" though self heal completes 

Expected results:

The error message should not be displayed when the heal operation is in fact launched successfully.

Additional info:


[root@localhost ~]# gluster v info
 
Volume Name: vol0
Type: Distributed-Replicate
Volume ID: d0e9e55c-a62d-4b2b-907d-d56f90e5d06f
Status: Started
Snap Volume: no
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: 10.70.47.143:/rhs/brick1/b1
Brick2: 10.70.47.145:/rhs/brick1/b2
Brick3: 10.70.47.150:/rhs/brick1/b3
Brick4: 10.70.47.151:/rhs/brick1/b4
Brick5: 10.70.47.143:/rhs/brick2/b5
Brick6: 10.70.47.145:/rhs/brick2/b6
Options Reconfigured:
cluster.quorum-type: auto
performance.readdir-ahead: on
performance.write-behind: off
performance.read-ahead: off
performance.io-cache: off
performance.quick-read: off
performance.open-behind: off
cluster.self-heal-daemon: on
cluster.data-self-heal: off
cluster.metadata-self-heal: off
cluster.entry-self-heal: off
snap-max-hard-limit: 256
snap-max-soft-limit: 90
auto-delete: disable

[root@localhost ~]# gluster v status
Status of volume: vol0
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.47.143:/rhs/brick1/b1           49152     0          Y       32485
Brick 10.70.47.145:/rhs/brick1/b2           N/A       N/A        N       N/A  
Brick 10.70.47.150:/rhs/brick1/b3           49152     0          Y       31465
Brick 10.70.47.151:/rhs/brick1/b4           49152     0          Y       16654
Brick 10.70.47.143:/rhs/brick2/b5           N/A       N/A        N       N/A  
Brick 10.70.47.145:/rhs/brick2/b6           49153     0          Y       16126
NFS Server on localhost                     2049      0          Y       19006
Self-heal Daemon on localhost               N/A       N/A        Y       19015
NFS Server on 10.70.47.145                  2049      0          Y       15920
Self-heal Daemon on 10.70.47.145            N/A       N/A        Y       15929
NFS Server on 10.70.47.150                  2049      0          Y       31251
Self-heal Daemon on 10.70.47.150            N/A       N/A        Y       31262
NFS Server on 10.70.47.151                  2049      0          Y       31815
Self-heal Daemon on 10.70.47.151            N/A       N/A        Y       31824
 
Task Status of Volume vol0
------------------------------------------------------------------------------
There are no active volume tasks