Bug 869251 - [RHEV-RHS]: "gluster volume heal <vol-name> info healed" doesn't show entry which is healed
Status: CLOSED NOTABUG
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterfs
Version: 2.0
Hardware: x86_64 Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assigned To: Pranith Kumar K
QA Contact: Rahul Hinduja
Depends On:
Blocks:
Reported: 2012-10-23 08:04 EDT by Rahul Hinduja
Modified: 2012-11-15 05:56 EST
CC List: 4 users

See Also:
Fixed In Version:
Doc Type: Known Issue
Doc Text:
Cause: Only self-heals performed by the self-heal daemon are shown in the 'gluster volume heal <vol-name> info *' commands. Gluster native mounts and NFS server processes can also perform self-heals whenever files/directories are accessed through them.
Consequence: Self-heals completed by mount/NFS-server processes are not shown in the 'gluster volume heal <vol-name> info healed' output.
Workaround (if any): It is generally better to heal a file as soon as it is accessed, to reduce the probability of ending up in split-brain. However, if the user wants self-heals to be managed only by the self-heal daemon, the mount/NFS-server processes can be configured not to perform self-heals by setting:
gluster volume set <volname> cluster.entry-self-heal off
gluster volume set <volname> cluster.data-self-heal off
gluster volume set <volname> cluster.metadata-self-heal off
Note that once these options are set, a file remains unhealed until the self-heal daemon processes it; if the source brick (the brick with the actual file contents) goes down before the file gets a chance to heal, and the file is then written to, it can end up in split-brain.
Result: With this workaround, all self-heals are performed, and therefore reported, by the self-heal daemon.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-11-15 05:56:23 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
sosreport and glusterfs logs from rhs-client6 and rhs-client7 (5.79 MB, application/x-gzip)
2012-10-23 08:18 EDT, Rahul Hinduja

Description Rahul Hinduja 2012-10-23 08:04:42 EDT
Description of problem:
=======================

[RHEV-RHS]:  "gluster volume heal <vol-name> info healed" doesn't show entry which is healed

Version-Release number of selected component (if applicable):

[10/23/12 - 17:15:11 root@rhs-client7 images]# gluster --version 
glusterfs 3.3.0rhsvirt1 built on Oct  8 2012 15:23:00

(glusterfs-3.3.0rhsvirt1-7.el6rhs.x86_64)


Steps Carried:
==============

1. Created a 2x2 distributed-replicate volume from the following bricks:

rhs-client6.lab.eng.blr.redhat.com/disk1
rhs-client7.lab.eng.blr.redhat.com/disk1
rhs-client8.lab.eng.blr.redhat.com/disk1
rhs-client9.lab.eng.blr.redhat.com/disk1

2. Brought down brick "rhs-client6.lab.eng.blr.redhat.com/disk1"

3. Created new VMs

4. Created snapshots of the existing VMs

5. Executed "gluster volume heal <vol-name> info heal-failed" which shows 4 entries for "rhs-client7.lab.eng.blr.redhat.com/disk1"

6. Executed "gluster volume heal <vol-name> info healed", which shows the entries that were healed. One of the entries shown in the output of step 5 is also displayed here, confirming that it was healed, but the other 3 are not displayed.

7. Checked getfattr on the entries displayed at step 5. The changelog xattrs show all zeros, which confirms that they were actually healed but are not shown at step 6.


Outputs:
========

A. gluster volume heal <vol-name> info heal-failed:
===================================================

[10/23/12 - 17:23:10 root@rhs-client7 images]# gluster volume heal dis-rep info heal-failed
Heal operation on volume dis-rep has been successful

Brick rhs-client6.lab.eng.blr.redhat.com:/disk1
Number of entries: 0

Brick rhs-client7.lab.eng.blr.redhat.com:/disk1
Number of entries: 4
at                    path on brick
-----------------------------------
2012-10-23 14:50:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/e3588fda-f01c-489e-858c-9d9ab4555a9e
2012-10-23 14:50:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/844bb1de-e06a-4d04-a73e-5e121376e8eb/399f0afd-d1be-49b9-9103-cceed751277a.lease
2012-10-23 14:50:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/844bb1de-e06a-4d04-a73e-5e121376e8eb/399f0afd-d1be-49b9-9103-cceed751277a.meta
2012-10-23 14:50:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/844bb1de-e06a-4d04-a73e-5e121376e8eb

Brick rhs-client8.lab.eng.blr.redhat.com:/disk1
Number of entries: 0

Brick rhs-client9.lab.eng.blr.redhat.com:/disk1
Number of entries: 0
[10/23/12 - 17:23:11 root@rhs-client7 images]# 




B. gluster volume heal <vol-name> info healed:
==============================================


[10/23/12 - 17:24:04 root@rhs-client7 images]# gluster volume heal dis-rep info healed
Heal operation on volume dis-rep has been successful

Brick rhs-client6.lab.eng.blr.redhat.com:/disk1
Number of entries: 9
at                    path on brick
-----------------------------------
2012-10-23 17:22:09 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/2d224078-1027-431a-8cc7-1bf456c916a7/16d5a322-335f-42c4-b4fa-e21c3a8fb04e
2012-10-23 17:02:09 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/9785eb25-0cfc-456f-9dc2-cd35caf85077/1c450275-60c7-4ed8-b078-62c1a752c37c
2012-10-23 17:02:09 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/2d224078-1027-431a-8cc7-1bf456c916a7/16d5a322-335f-42c4-b4fa-e21c3a8fb04e
2012-10-23 16:42:09 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/9785eb25-0cfc-456f-9dc2-cd35caf85077/1c450275-60c7-4ed8-b078-62c1a752c37c
2012-10-23 16:22:09 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/dom_md/ids
2012-10-23 16:12:09 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/dom_md/ids
2012-10-23 15:52:09 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/2d224078-1027-431a-8cc7-1bf456c916a7/16d5a322-335f-42c4-b4fa-e21c3a8fb04e
2012-10-23 15:42:09 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/dom_md/ids
2012-10-23 14:52:08 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/dom_md/ids

Brick rhs-client7.lab.eng.blr.redhat.com:/disk1
Number of entries: 22
at                    path on brick
-----------------------------------
2012-10-23 17:20:40 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/9785eb25-0cfc-456f-9dc2-cd35caf85077/1c450275-60c7-4ed8-b078-62c1a752c37c
2012-10-23 17:00:40 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/9785eb25-0cfc-456f-9dc2-cd35caf85077/1c450275-60c7-4ed8-b078-62c1a752c37c
2012-10-23 17:00:40 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/dom_md/ids
2012-10-23 16:50:40 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/dom_md/ids
2012-10-23 15:30:40 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/dom_md/ids
2012-10-23 15:00:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/e3588fda-f01c-489e-858c-9d9ab4555a9e/bbb161b6-ab29-41a5-b522-5656e5e8356f.meta
2012-10-23 15:00:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/e3588fda-f01c-489e-858c-9d9ab4555a9e/929eb4ae-12c9-41ca-9855-b3165b0ac851.meta
2012-10-23 15:00:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/844bb1de-e06a-4d04-a73e-5e121376e8eb/399f0afd-d1be-49b9-9103-cceed751277a.lease
2012-10-23 15:00:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/2d224078-1027-431a-8cc7-1bf456c916a7/16d5a322-335f-42c4-b4fa-e21c3a8fb04e
2012-10-23 14:50:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/a3708c5c-a4d2-488c-b8bf-c2147cf6893d/e8be11d5-692b-4221-a40a-b6667a1ee282.meta
2012-10-23 14:50:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/a3708c5c-a4d2-488c-b8bf-c2147cf6893d/bf5e6dd7-f44c-4449-a2c2-7c4c4f170d3e.meta
2012-10-23 14:50:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/a3708c5c-a4d2-488c-b8bf-c2147cf6893d
2012-10-23 14:50:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/9785eb25-0cfc-456f-9dc2-cd35caf85077/1c450275-60c7-4ed8-b078-62c1a752c37c.meta
2012-10-23 14:50:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/9785eb25-0cfc-456f-9dc2-cd35caf85077/1c450275-60c7-4ed8-b078-62c1a752c37c
2012-10-23 14:50:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/9785eb25-0cfc-456f-9dc2-cd35caf85077
2012-10-23 14:50:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/6d2fe622-a493-4451-9a52-eeae4962135f/b9f4a9ed-1a0a-440c-a877-74e1b0cfe5d9.meta
2012-10-23 14:50:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/6d2fe622-a493-4451-9a52-eeae4962135f/cb7e8c5e-9f72-4643-9f84-f53882512e5d.meta
2012-10-23 14:50:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/6d2fe622-a493-4451-9a52-eeae4962135f
2012-10-23 14:50:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/aa9980ba-19ef-436b-9225-bd89212bbd54/5fc5ff1d-ae1f-4389-b393-c1262b33d221.meta
2012-10-23 14:50:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/aa9980ba-19ef-436b-9225-bd89212bbd54
2012-10-23 14:50:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/2d224078-1027-431a-8cc7-1bf456c916a7/16d5a322-335f-42c4-b4fa-e21c3a8fb04e
2012-10-23 14:50:39 /9dd7add5-dd1c-4740-b3c3-33e85db48e21/dom_md/ids





Note: comparing the outputs of A and B, the "399f0afd-d1be-49b9-9103-cceed751277a.lease" entry is shown in the heal-failed output at 14:50 and also in the "healed" output at 15:00.

The same should have been the case for entries like "399f0afd-d1be-49b9-9103-cceed751277a.meta", which is shown in the "heal-failed" output at 14:50 but never appears in "healed".


When checking getfattr on "399f0afd-d1be-49b9-9103-cceed751277a.meta", it shows all zeros:
===========================================================================

[10/23/12 - 17:29:25 root@rhs-client7 images]# getfattr -d -e hex -m . /disk1/9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/844bb1de-e06a-4d04-a73e-5e121376e8eb/399f0afd-d1be-49b9-9103-cceed751277a.meta
getfattr: Removing leading '/' from absolute path names
# file: disk1/9dd7add5-dd1c-4740-b3c3-33e85db48e21/images/844bb1de-e06a-4d04-a73e-5e121376e8eb/399f0afd-d1be-49b9-9103-cceed751277a.meta
trusted.afr.dis-rep-client-0=0x000000000000000000000000
trusted.afr.dis-rep-client-1=0x000000000000000000000000
trusted.gfid=0xaf4efdd6ec4a458a97c515b9459c3060
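
As a cross-check of the "all zeros" observation, the changelog xattr value can be decoded into its individual counters. This is a minimal sketch, assuming the standard AFR changelog layout of 12 bytes holding three big-endian 32-bit counters (pending data, metadata, and entry operations):

```shell
# Decode an AFR changelog xattr value such as trusted.afr.dis-rep-client-0.
# Assumed layout: 24 hex digits = three big-endian 32-bit counters for
# pending data, metadata, and entry operations. All zeros means no pending
# operations against that brick, i.e. the file is healed.
val="0x000000000000000000000000"   # value taken from the getfattr output above
hex=${val#0x}
data=$((16#${hex:0:8}))
metadata=$((16#${hex:8:8}))
entry=$((16#${hex:16:8}))
echo "data=$data metadata=$metadata entry=$entry"
# prints: data=0 metadata=0 entry=0
```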


ls /disk1/.glusterfs/indices/xattrop/ output
============================================

[10/23/12 - 17:30:08 root@rhs-client7 images]# ls /disk1/.glusterfs/indices/xattrop/
xattrop-b5f1d315-c48f-49cd-bdc2-a9b4a7c400c4
Comment 2 Rahul Hinduja 2012-10-23 08:18:43 EDT
Created attachment 632056
sosreport and glusterfs logs from rhs-client6 and rhs-client7
Comment 3 Pranith Kumar K 2012-11-06 06:07:41 EST
Only self-heals performed by the self-heal daemon are shown in the 'heal info *' commands. Mounts and NFS server processes can also perform self-heals whenever files/dirs are accessed through them. If the user wants self-heals to be managed only by the self-heal daemon, the mounts should be configured to not perform self-heals.
This is not a bug in afr.
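
For reference, configuring the mounts not to perform self-heals means setting the volume options listed in the Doc Text; a sketch of the workaround (replace <volname> with the actual volume name):

```shell
gluster volume set <volname> cluster.entry-self-heal off
gluster volume set <volname> cluster.data-self-heal off
gluster volume set <volname> cluster.metadata-self-heal off
```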

Vijay,
    Do you think this needs to be documented?

Pranith.
Comment 4 Vijay Bellur 2012-11-15 02:48:58 EST
Let us document this behavior.
