Bug 917686 - Self-healing does not physically replicate content of new file/dir
Summary: Self-healing does not physically replicate content of new file/dir
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: 3.3.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Pranith Kumar K
QA Contact:
URL:
Whiteboard:
Depends On: 969384
Blocks:
 
Reported: 2013-03-04 14:57 UTC by Patrick Monnerat
Modified: 2014-12-14 19:40 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Known Issue
Doc Text:
Cause: Explained in comment 3.
Clone Of:
Environment:
Last Closed: 2014-12-14 19:40:30 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Patrick Monnerat 2013-03-04 14:57:01 UTC
Description of problem:
When a file/dir has been created on host A while host B was down, bringing up B (with proactive full self-healing on) replicates the object, but not its contents.

Version-Release number of selected component (if applicable):
3.3.1 (F18 3.3.1-10)

How reproducible:
Always


Steps to Reproduce (a consolidated script follows this list):
1. Create a glusterfs volume on two empty directories: gv0 replica 2 A:/replica B:/replica
2. Stop daemon on host B
3. On host A: mount -t glusterfs A:gv0 /mnt
4. On host A: mkdir /mnt/dir
5. On host A: echo test > /mnt/dir/file
6. On Host B: start daemon, do not mount volume
7. On host B: sleep 10; ls /replica    -->   dir             (OK)
8. On host B: ls /replica/dir    --> empty         (KO)
9. On host B: gluster volume heal gv0; sleep 10; ls /replica/dir  --> file  (OK)
10. On host B: ls -l /replica/dir  -->  -rw-r--r-- 2 root root 0 Mar  4 15:15 file  --> (KO: file is empty)
11. On host B: gluster volume heal gv0; sleep 10; ls -l /replica/dir --> -rw-r--r-- 2 root root 5 Mar  4 15:16 file  (OK)
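
For convenience, the steps above can be consolidated into a single script. This is only a sketch: the host names A and B, passwordless root SSH from A to B, and the use of systemctl/pkill to stop the gluster daemons on B are assumptions of the sketch, not part of the original report.

  #!/bin/bash
  # Reproduction sketch for the steps above (run on host A).

  # 1. Create and start the replicated volume on two empty brick directories.
  gluster volume create gv0 replica 2 A:/replica B:/replica
  gluster volume start gv0

  # 2. Stop the gluster daemons on host B (assumed equivalent of "stop daemon").
  ssh B 'systemctl stop glusterd; pkill glusterfsd'

  # 3-5. Mount the volume on A and create new content while B is down.
  mount -t glusterfs A:gv0 /mnt
  mkdir /mnt/dir
  echo test > /mnt/dir/file

  # 6. Bring the daemon on B back up; the volume is not mounted there.
  ssh B 'systemctl start glusterd'

  # 7-8. After proactive self-heal, the directory appears on B but is empty.
  ssh B 'sleep 10; ls /replica; ls /replica/dir'

  # 9-10. A manual heal replicates the file name, but with zero length.
  ssh B 'gluster volume heal gv0; sleep 10; ls -l /replica/dir'

  # 11. A second heal finally replicates the 5 bytes of file content.
  ssh B 'gluster volume heal gv0; sleep 10; ls -l /replica/dir'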

Actual results:
See above

Expected results:
Initial self-healing properly replicates all objects and contents recursively, including newly created (sub)directories and files.

Additional info:
- This problem is important: it increases the chances of losing data, since the new data are not replicated as soon as the host comes up, as one would expect from a replication scheme. This is particularly true if host B does not access the new data for a long time after power-up.
- In the above "Steps to reproduce", waiting longer between actions does not alter the results.
- The above example shows that each consecutive healing request replicates a single level of new contents.

Thanks for investigating.

Comment 1 yin.yin 2013-03-11 06:07:05 UTC
Is this the same as bug https://bugzilla.redhat.com/show_bug.cgi?id=852741 ?

Comment 2 Patrick Monnerat 2013-03-11 09:12:07 UTC
I don't think so: 852741 is about self-healing not starting. This one assumes self-healing starts, updates changed objects and creates new ones, but does not process the contents of the new objects.

Comment 3 Pranith Kumar K 2013-03-15 09:55:52 UTC
Hi Patrick,
     There are two types of self-heal performed by the self-heal daemon. One is called a full crawl, which is similar to running a find command on the mount point. The other one heals only the files that need healing. Basically, the bricks remember all the files/dirs that need self-heal, but there is no way to figure out in which order the heal needs to happen.

Suppose one creates a dir 'd' and a file 'f' inside 'd' with some data in it. To heal this correctly, self-heal on 'd' must be attempted before self-heal on 'f'. But if self-heal on 'f' is attempted before 'd', only 'd' is self-healed and the file 'f' is scheduled for healing 10 minutes later, after which you should see that the file 'f' is also healed properly.

So in general, if the directory depth is x, we need at least x+1 self-heal attempts by the self-heal daemon to completely heal the data. These attempts are made automatically. If you want to trigger the heal just once and make sure the data is healed completely, please execute 'gluster volume heal <volname> full' manually.
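
For reference, the one-shot full heal suggested here, together with a status check, would look like this for the volume in this report (gv0). These are standard GlusterFS CLI commands; running them from any server in the cluster is assumed.

  # Trigger a single full crawl that heals data regardless of directory depth
  gluster volume heal gv0 full

  # List the entries that still need healing
  gluster volume heal gv0 info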

We are going to document this behavior and close the bug for now.
Please feel free to ask if you need more information.

Thanks for logging the bug.

Comment 4 Patrick Monnerat 2013-03-15 10:12:13 UTC
Understood, thanks for the explanation.
But this means that, if not a bug, this "feature" is at least a caveat and leaves glusterfs not really usable for RAID-like replication, I'm afraid.
I tried to look through the code to find a solution, but it's too "unsequential" to be completely understood without some documentation/explanation. But I wonder whether it would be possible to automatically reschedule an "immediate" (i.e. before 10 minutes) self-heal when an object cannot be healed (for the underlying reason) during a pass? Or whether the unhealed objects could be queued for later processing?

Comment 5 Patrick Monnerat 2013-05-07 15:36:33 UTC
By the way, would it be possible to have a configuration parameter that schedules a full crawl at daemon start-up? This could be a new feature request, actually :-)
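
Until such an option exists, a hypothetical workaround along the lines of this request could be a small start-up hook (not an existing GlusterFS feature); the wait loop and the volume name gv0 are assumptions of this sketch.

  #!/bin/bash
  # Hypothetical start-up hook: wait for glusterd to answer CLI requests,
  # then request one full self-heal crawl of the volume.
  VOLNAME=gv0

  # Poll until the local glusterd responds to the CLI.
  until gluster peer status >/dev/null 2>&1; do
      sleep 5
  done

  # Trigger the full crawl once.
  gluster volume heal "$VOLNAME" full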

Comment 6 Pranith Kumar K 2014-07-13 07:40:23 UTC
Patrick,
    A full crawl is extremely expensive, so instead we are coming up with algorithms to decrease the number of crawls, as in https://bugzilla.redhat.com/show_bug.cgi?id=969384. Let me know if that would suffice.

Pranith

Comment 7 Patrick Monnerat 2014-07-14 08:45:01 UTC
That already seems better!
In addition, would it be possible, as proposed in comment 4, to flag an "incomplete" crawl so that the next crawl is scheduled immediately after completion (i.e. without waiting 10 minutes)? I think such a solution might be satisfactory.

Thanks for your support.

Comment 8 Pranith Kumar K 2014-07-14 15:27:10 UTC
The "not waiting for 10 minutes" behavior is already present upstream and should be released in 3.6 if we don't find any issues.

Comment 9 Patrick Monnerat 2014-07-14 15:28:21 UTC
Thanks a lot :-)

Comment 10 Niels de Vos 2014-11-27 14:54:16 UTC
The version that this bug has been reported against does not get any updates from the Gluster Community anymore. Please verify whether this report is still valid against a current (3.4, 3.5 or 3.6) release and update the version, or close this bug.

If there has been no update before 9 December 2014, this bug will get automatically closed.

