Bug 1239021 - AFR: gluster v restart force or brick process restart doesn't heal the files
Summary: AFR: gluster v restart force or brick process restart doesn't heal the files
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: replicate
Version: rhgs-3.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.1.1
Assignee: Ravishankar N
QA Contact: Shruti Sampat
URL:
Whiteboard:
Depends On:
Blocks: 1223636 1251815 1253309 1255690 1256245
 
Reported: 2015-07-03 09:35 UTC by Anil Shah
Modified: 2016-09-17 12:20 UTC
CC List: 8 users

Fixed In Version: glusterfs-3.7.1-13
Doc Type: Bug Fix
Doc Text:
Previously, the self-heal daemon crawled only the brick that had just come back up after being down. As a result, pending heals did not start immediately when the child came up, but only after the cluster.heal-timeout interval elapsed. With this fix, index heal is triggered on all local subvolumes of a replicated volume when a brick comes up.
Clone Of:
Clones: 1253309
Environment:
Last Closed: 2015-10-05 07:18:15 UTC
Embargoed:




Links:
Red Hat Product Errata RHSA-2015:1845 (priority normal, status SHIPPED_LIVE): Moderate: Red Hat Gluster Storage 3.1 update. Last updated: 2015-10-05 11:06:22 UTC

Description Anil Shah 2015-07-03 09:35:03 UTC
Description of problem:

When one of the replica bricks is down and file operations are performed on the mount, restarting the volume or the brick process does not heal the files that need to be healed.

Version-Release number of selected component (if applicable):

glusterfs-3.7.1-7.el6rhs.x86_64


How reproducible:

100%

Steps to Reproduce:

1. Create a 2x2 distributed-replicate volume
2. Fuse-mount the volume
3. Create some files on the mount point
4. Kill one of the replica bricks
5. Rename the files from the mount point
6. Check gluster v heal <volname> info
7. Restart the volume or the brick process (a command sketch follows this list)
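A minimal command sketch of the steps above, assuming the hostnames and brick paths from the volume info under Additional info; the mount point /mnt/vol0, file names, and counts are illustrative:

    # 1-2. Create, start, and fuse-mount a 2x2 distributed-replicate volume
    gluster volume create vol0 replica 2 \
        10.70.33.214:/rhs/brick1/b001 10.70.33.219:/rhs/brick1/b002 \
        10.70.33.225:/rhs/brick1/b003 10.70.44.13:/rhs/brick1/b004
    gluster volume start vol0
    mkdir -p /mnt/vol0
    mount -t glusterfs 10.70.33.214:/vol0 /mnt/vol0

    # 3. Create some files on the mount point
    for i in $(seq 1 10); do dd if=/dev/urandom of=/mnt/vol0/f$i bs=1M count=1; done

    # 4. Kill one replica brick; get its PID from the volume status output
    gluster volume status vol0
    kill <brick-pid>

    # 5. Rename the files from the mount point
    for i in $(seq 1 10); do mv /mnt/vol0/f$i /mnt/vol0/f$i.renamed; done

    # 6. The renamed entries show up as pending heals
    gluster volume heal vol0 info

    # 7. Restart the brick process; before the fix, healing did not start here
    gluster volume start vol0 force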


Actual results:

Files are not healed; healing starts only after the cluster.heal-timeout interval expires.


Expected results:

Restarting the volume or the brick process should immediately heal the files that need to be healed.
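Until a fixed build is available, a possible workaround sketch (reusing the vol0 setup above): trigger an index heal manually, or shorten the self-heal daemon's periodic crawl interval via cluster.heal-timeout:

    # Manually trigger an index heal on the volume
    gluster volume heal vol0

    # Or make the periodic self-heal crawl happen sooner (value in seconds)
    gluster volume set vol0 cluster.heal-timeout 60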

Additional info:

Volume Name: vol0
Type: Distributed-Replicate
Volume ID: 53c64343-c537-428c-b7b7-a45f198c42a0
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.33.214:/rhs/brick1/b001
Brick2: 10.70.33.219:/rhs/brick1/b002
Brick3: 10.70.33.225:/rhs/brick1/b003
Brick4: 10.70.44.13:/rhs/brick1/b004
Options Reconfigured:
performance.readdir-ahead: on
features.uss: enable
features.quota: on
features.inode-quota: on
features.quota-deem-statfs: on
server.allow-insecure: on
features.barrier: disable
cluster.enable-shared-storage: enable

Comment 6 Anuradha 2015-08-18 06:52:31 UTC
Patch posted upstream - http://review.gluster.org/11912

Comment 7 Ravishankar N 2015-08-24 07:33:58 UTC
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/56024/

Comment 8 Shruti Sampat 2015-08-28 08:49:06 UTC
Verified as fixed in glusterfs-3.7.1-13.el7rhgs.x86_64. Heals now start as soon as the volume is started with force.
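One way to check this, reusing the reproduction sketch from the description (vol0 and the brick paths are from this bug; the exact sequence is illustrative):

    # With pending heals from steps 1-6, force-start the volume
    gluster volume start vol0 force

    # The pending-heal count should drop promptly instead of waiting
    # for cluster.heal-timeout to expire
    gluster volume heal vol0 statistics heal-count
    gluster volume heal vol0 info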

Comment 9 Divya 2015-09-22 05:31:59 UTC
Ravishankar,

Made a few minor edits to the doc text. Could you review and sign-off?

Comment 10 Ravishankar N 2015-09-22 05:57:29 UTC
Looks okay to me.

Comment 12 errata-xmlrpc 2015-10-05 07:18:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1845.html

