764232 – (GLUSTER-2500) Self Healing not working

Summary: Self Healing not working

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	GLUSTER-2500
Product:	GlusterFS
Classification:	Community
Component:	replicate
Sub Component:
Version:	3.1.2
Hardware:	x86_64
OS:	Linux
Priority:	urgent
Severity:	high
Target Milestone:	---
Assignee:	Pranith Kumar K
QA Contact:
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	764761
Depends On:
Blocks:

Reported:	2011-03-09 00:07 UTC by mohitanchlia
Modified:	2015-12-01 16:45 UTC
CC List:	8 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:
Regression:	---
Mount Type:	fuse
Documentation:	---
CRM:
Verified Versions:
Embargoed:
Dependent Products:

Description mohitanchlia 2011-03-09 00:07:17 UTC

1)
$ gluster volume info test-volume

Volume Name: test-volume
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: testefitbdb02:/gluster-test
Brick2: testefitarc01:/gluster-test
Options Reconfigured:
diagnostics.client-log-level: NONE
diagnostics.brick-log-level: NONE

2) client:

$ df -h
glusterfs#qysprfefitarc01:/test-volume
                      128G   75G   47G  62% /mnt/ext

3) server 1:

$ cat /gluster-test/g.txt
sds

4) server 2:

$ cat /gluster-test/g.txt
sds

5) kill all gluster processes on server 2 to simulate crash

6) client:

$ echo "DD" >> /mnt/ext/g.txt

7) client:

$ cat /mnt/ext/g.txt
sds
DD

8) Server 1:

$ cat /gluster-test/g.txt
sds
DD

9) Start gluster service on Server 2:

$ /etc/init.d/glusterd start

10) Client:
   $ ls -alR /mnt/ext/ 
   $ touch /mnt/ext/g.txt

11) Server 2:
$ cat /gluster-test/g.txt
sds


The "DD" change to g.txt is not replicated to server 2. However, if I edit the file again now that server 2 is back up, g.txt gets replicated.

It looks like self-heal doesn't work for files that were edited while a replica was down. The clocks are in sync as well.

Comment 1 Pranith Kumar K 2011-03-09 00:39:30 UTC

Hi Mohit,
       ls -laR does not trigger self-heal when the stat-prefetch translator is loaded. The command to use for triggering self-heal is "find". Please see our documentation on this:
http://europe.gluster.org/community/documentation/index.php/Gluster_3.1:_Triggering_Self-Heal_on_Replicate


I executed the same example on my machine and it works fine.

root@pranith-laptop:/mnt/client# cat /tmp/2/a.txt 
sds
root@pranith-laptop:/mnt/client# find .
.
./a.txt
root@pranith-laptop:/mnt/client# cat /tmp/2/a.txt 
sds
DD

I did find that if we edit the file with a text editor, the gfid of the file changes, which causes a split-brain. We shall fix that in this bug.

Pranith.
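
For readers following along, here is a minimal sketch of the find-based walk described in this comment, wrapped in a shell function. It assumes find, xargs and stat that support -print0 / -0 (GNU tools do); the function name trigger_self_heal is made up for illustration and is not part of GlusterFS. The only idea is to force a stat on every entry under the client mount so that each file is looked up through replicate.

```shell
#!/bin/sh
# Sketch: stat every entry under a client mount point so that each file is
# looked up through the replicate translator.
# The function name is hypothetical, not a GlusterFS command.
trigger_self_heal() {
    mount_point=$1
    # -print0 / -0 keep file names containing spaces or newlines intact.
    find "$mount_point" -print0 | xargs -0 stat > /dev/null
}

# Example (run on the client mount, not on a brick directory):
#   trigger_self_heal /mnt/ext
```

The function's exit status is that of xargs, so a non-zero result means at least one stat call failed. It says nothing about whether the heal itself succeeded.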

Comment 2 mohitanchlia 2011-03-09 15:31:43 UTC

(In reply to comment #1)
> hi Mohit,
>        ls -laR does not trigger the self-heal when the stat-prefetch translator
> is loaded. The command to use for triggering self-heal is "find". Please see
> our documentation of the same.
> http://europe.gluster.org/community/documentation/index.php/Gluster_3.1:_Triggering_Self-Heal_on_Replicate
> I executed the same example on my machine and it works fine.
> root@pranith-laptop:/mnt/client# cat /tmp/2/a.txt 
> sds
> root@pranith-laptop:/mnt/client# find .
> .
> ./a.txt
> root@pranith-laptop:/mnt/client# cat /tmp/2/a.txt 
> sds
> DD
> I did find that if we edit the files with a text editor, the gfid of the same
> file is changed causing the split-brain. We shall fix that in this bug.
> Pranith.

Thanks for fixing that bug!

How will I know whether self-heal worked? What's the best way to tell? I see there are two find commands, and it looks like running just one may not be sufficient in some cases. So how can we make sure that self-heal worked? Currently I am testing with one file, so it's easy to verify, but with millions of files it may not be possible.
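
One possible way to check many files at once, sketched below, is to build a checksum manifest of each brick's backend directory and diff the two. This is an illustration, not a GlusterFS feature: the function names brick_manifest and bricks_in_sync are hypothetical, and it assumes both brick directories are readable from one machine (as in a single-host test). On separate servers, brick_manifest could be run on each one and the two outputs compared afterwards.

```shell
#!/bin/sh
# Print "checksum size path" for every regular file under a brick directory.
# Paths are relative to the brick root and the list is sorted by path so that
# two manifests can be compared line by line.
brick_manifest() {
    ( cd "$1" && find . -type f -exec cksum {} + | sort -k 3 )
}

# Return 0 if both brick directories hold the same files with the same
# contents; otherwise print the differing manifest lines and return non-zero.
bricks_in_sync() {
    m1=$(mktemp) && m2=$(mktemp) || return 2
    brick_manifest "$1" > "$m1"
    brick_manifest "$2" > "$m2"
    diff "$m1" "$m2"
    rc=$?
    rm -f "$m1" "$m2"
    return $rc
}

# Example for a single-host test with two local brick directories:
#   bricks_in_sync /tmp/1 /tmp/2 && echo "bricks match"
```

Files that are being written while the manifests are taken will show up as differences, so a check like this is only meaningful on a quiet volume.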

Comment 3 raf 2011-03-10 18:52:05 UTC

Maybe gluster peers should monitor each others' state and start cluster self-healing in case of temporarily unreachable cluster members (peers).
E.g. how could an administrator be aware of a random short network downtime overnight that can cause filesystem inconsistency?

Comment 4 Vijay Bellur 2011-03-14 07:09:10 UTC

PATCH: http://patches.gluster.com/patch/6385 in master (afr-entry-self-heal: fixes to detected renames (gfid based))

Comment 5 mohitanchlia 2011-03-14 13:35:48 UTC

(In reply to comment #4)
> PATCH: http://patches.gluster.com/patch/6385 in master (afr-entry-self-heal:
> fixes to detected renames (gfid based))

Thanks! How do I install this bug? Can I take the latest release and update gluster?

Comment 6 Pranith Kumar K 2011-03-15 03:12:01 UTC

(In reply to comment #5)
> Thanks! How do I install this bug? Can I take the latest release and update
> gluster?

First, download the patch: http://patches.gluster.com/patch/6385/mbox/
Then take the 3.1.2 git repo and run git am <patch file path>. You may not want to apply it on top of the latest code, because that code is still going through QA cycles.

Comment 7 mohitanchlia 2011-03-15 20:13:51 UTC

(In reply to comment #6)
> first download the patch http://patches.gluster.com/patch/6385/mbox/
> Take the 3.1.2 git repo and do git am <patch file patch>. You may not want to
> apply it in the latest code because it is still going through the QA cycles.

Thanks! Is this not going to be ported to new code?

Comment 8 Pranith Kumar K 2011-03-16 03:14:48 UTC

(In reply to comment #7)
> Thanks! Is this not going to be ported to new code?

As I said, it is already ported to the latest code, but that code is still going through QA cycles. I am still awaiting a commit in the 3.1.x branch.

Comment 9 Anand Avati 2011-04-11 07:52:45 UTC

PATCH: http://patches.gluster.com/patch/6414 in release-3.1 (afr-entry-self-heal: fixes to detected renames (gfid based))

Comment 10 Jeff Darcy 2011-04-11 11:09:09 UTC

The "find" and "ls" approaches should - and in my testing do - generate the same lstat calls which should trigger self-heal.  The only differences are the order and timing of these calls, and if there are races etc. in stat-prefetch that create a dependency on one of these two approaches then stat-prefetch is broken.

This translator breaks so many things and has been the subject of so many "disable this and retry" suggestions in response to user problems that I disable it in all CloudFS configurations and expect to continue doing so indefinitely.  I don't need the support burden, and I suspect neither do you.

Comment 11 mohitanchlia 2011-04-11 13:16:13 UTC

(In reply to comment #10)
> The "find" and "ls" approaches should - and in my testing do - generate the
> same lstat calls which should trigger self-heal.  The only differences are the
> order and timing of these calls, and if there are races etc. in stat-prefetch
> that create a dependency on one of these two approaches then stat-prefetch is
> broken.
> This translator breaks so many things and has been the subject of so many
> "disable this and retry" suggestions in response to user problems that I
> disable it in all CloudFS configurations and expect to continue doing so
> indefinitely.  I don't need the support burden, and I suspect neither do you.

I really didn't follow. Do you mean disable self healing? Can you please help me with an example and suggested approach and how that is better? Thanks!

Comment 12 mohitanchlia 2011-04-11 13:16:55 UTC

(In reply to comment #8)
> Like I said Its already ported in latest code but it is going through QA
> cycles. I am still awaiting a commit in 3.1.x branch.

Is it available in 3.1.3? If I just download the rpm would it have this fix?

Comment 13 Pranith Kumar K 2011-04-11 13:22:19 UTC

(In reply to comment #12)
> Is it available in 3.1.3? If I just download the rpm would it have this fix?

It will be available in >= 3.1.5 and >= 3.2.0

Pranith

Comment 14 Jeff Darcy 2011-04-11 14:05:48 UTC

(In reply to comment #11) 
> I really didn't follow. Do you mean disable self healing? Can you please help
> me with an example and suggested approach and how that is better? Thanks!

I certainly don't mean that disabling self-heal would be better.  I mean that self-heal should be done when a stat/fstat VFS call is done *regardless of whether it came from "find" or "ls" or anything else*.  If any program does a stat/lstat/fstat/fstatat call, and that call doesn't propagate to AFR because of something stat-prefetch is doing, then stat-prefetch is interfering with self-heal and that effectively means lost data.  If only "find" can cause self-heal to occur correctly, as comment #1 implies, that's a pretty catastrophic problem.  I cannot in good conscience tell users they should live with that for the sake of slightly better directory-listing performance in some cases.

Comment 15 mohitanchlia 2011-04-11 14:09:29 UTC

(In reply to comment #14)
> (In reply to comment #11) 
> > I really didn't follow. Do you mean disable self healing? Can you please help
> > me with an example and suggested approach and how that is better? Thanks!
> I certainly don't mean that disabling self-heal would be better.  I mean that
> self-heal should be done when a stat/fstat VFS call is done *regardless of
> whether it came from "find" or "ls" or anything else*.  If any program does a
> stat/lstat/fstat/fstatat call, and that call doesn't propagate to AFR because
> of something stat-prefetch is doing, then stat-prefetch is interfering with
> self-heal and that effectively means lost data.  If only "find" can cause
> self-heal to occur correctly, as comment #1 implies, that's a pretty
> catastrophic problem.  I cannot in good conscience tell users they should live
> with that for the sake of slightly better directory-listing performance in some
> cases.

thanks! What's your suggestion for me?

Comment 16 Jeff Darcy 2011-04-11 14:16:10 UTC

(In reply to comment #15) 
> thanks! What's your suggestion for me?

My suggestion (perhaps for Pranith Kumar K) is to identify why "ls -alR" doesn't trigger self-heal properly when stat-prefetch is loaded, and then fix stat-prefetch so that it does, instead of saying to use "find" instead.  As far as I can tell, "ls" does issue the necessary lstat(2) calls.

Comment 17 mohitanchlia 2011-04-11 14:17:55 UTC

(In reply to comment #16)
> (In reply to comment #15) 
> > thanks! What's your suggestion for me?
> My suggestion (perhaps for Pranith Kumar K) is to identify why "ls -alR"
> doesn't trigger self-heal properly when stat-prefetch is loaded, and then fix
> stat-prefetch so that it does, instead of saying to use "find" instead.  As far
> as I can tell, "ls" does issue the necessary lstat(2) calls.

Yes I agree. Should a new bug be filed for this?

Comment 18 Jeff Darcy 2011-04-11 14:23:07 UTC

(In reply to comment #17)
> Should a new bug be filed for this?

It depends on your internal policies/preferences, but probably a new bug.  While the fact that "ls -alR" did not force self-heal does seem highly relevant to the original report, the gfid problem would have prevented it from working anyway and has already been addressed to close this bug.

Comment 19 Pranith Kumar K 2011-04-11 14:54:09 UTC

(In reply to comment #18)
> (In reply to comment #17)
> > Should a new bug be filed for this?
> 
> It depends on your internal policies/preferences, but probably a new bug. 
> While the fact that "ls -alR" did not force self-heal does seem highly relevant
> to the original report, the gfid problem would have prevented it from working
> anyway and has already been addressed to close this bug.

Jeff,
    It is not because of ls versus find. Please refer to comment #1. Here is the problem that is fixed:
> I did find that if we edit the files with a text editor, the gfid of the same
> file is changed causing the split-brain. We shall fix that in this bug.

A gfid is the GlusterFS cluster's equivalent of an inode.
This problem happens because most editors create a temporary file (which gets a different gfid) and rename it over the original file on save and exit. If one of the nodes in the replica is offline at that point, the same file path ends up with a different gfid on each replica.


Pranith
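
The save-by-rename behaviour described above can be reproduced on any local filesystem. The sketch below uses the local inode number as a stand-in for the gfid, which is an analogy only. The helper names inode_of, save_in_place and save_via_rename are hypothetical, and save_via_rename merely imitates what many editors do on save.

```shell
#!/bin/sh
# Local illustration only: the inode number stands in for the gfid.
inode_of() {
    ls -i "$1" | awk '{print $1}'
}

# Append in place: the file keeps its identity.
save_in_place() {
    printf '%s\n' "$2" >> "$1"
}

# Imitate an editor save: write a new temporary file next to the original,
# then rename it over the original. The path is unchanged, but it now refers
# to a different file object.
save_via_rename() {
    tmp="$1.tmp.$$"
    cat "$1" > "$tmp" && printf '%s\n' "$2" >> "$tmp" && mv "$tmp" "$1"
}

# Example:
#   f=$(mktemp); echo sds > "$f"
#   inode_of "$f"; save_in_place "$f" DD; inode_of "$f"   # same number twice
#   save_via_rename "$f" EE; inode_of "$f"                # a different number
```

With `>>` (as in the original report's echo test) the identity is preserved, which is why that case differs from the text-editor case.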

Comment 20 mohitanchlia 2011-04-11 15:11:20 UTC

(In reply to comment #19)
> Jeff,
>     It is not because of the ls/find.
> Please refer to the "comment #1"
> Here is the problem that is fixed:
> > I did find that if we edit the files with a text editor, the gfid of the same
> > file is changed causing the split-brain. We shall fix that in this bug.
> Gfid is equivalent to inode in the glusterfs cluster.
> This problem happens because most of the editors create a temporary file (which
> has different gfid) and rename the file to the original file on save+exit. This
> results in a scenario where same filepath on the replicas end-up with different
> gfids on each of the replicas when one of the nodes in replica is offline.
> Pranith

Hi Pranith,

Thanks for the info!

Jeff,

Still waiting for your post on extended attribute :)

Comment 21 Pranith Kumar K 2011-04-15 09:48:27 UTC

*** Bug 2745 has been marked as a duplicate of this bug. ***

Comment 22 Raghavendra Bhat 2011-06-10 06:52:06 UTC

It's fixed now. Brought down a replica child, edited a file, and brought the child back up. Triggered self-heal and checked the backend: the gfid was the same on both backends, and the data was the latest. Tried with both emacs and vim.
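
The comment does not say how the backend check was done. One way to do it, sketched here as an assumption, is to read the trusted.gfid extended attribute of the file on each brick with getfattr and compare the values. The helper names extract_gfid and gfid_of are hypothetical; reading trusted.* attributes normally requires root.

```shell
#!/bin/sh
# Extract the hex gfid value from `getfattr` output read on stdin.
extract_gfid() {
    sed -n 's/^trusted\.gfid=//p'
}

# Print the gfid of a file in a brick's backend directory. Prints nothing if
# the attribute is missing or cannot be read (for example, when not root).
gfid_of() {
    getfattr -n trusted.gfid -e hex "$1" 2>/dev/null | extract_gfid
}

# Example, run as root on each server; the two printed values should match:
#   gfid_of /gluster-test/g.txt
```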

Comment 23 Pranith Kumar K 2011-06-16 02:32:43 UTC

*** Bug 3029 has been marked as a duplicate of this bug. ***
