Bug 862838
| Field | Value |
|---|---|
| Summary | Self-heal is unreliable if other volumes are present |
| Product | [Community] GlusterFS |
| Component | replicate |
| Version | mainline |
| Status | CLOSED CURRENTRELEASE |
| Severity | urgent |
| Priority | unspecified |
| Reporter | Jeff Darcy <jdarcy> |
| Assignee | Jeff Darcy <jdarcy> |
| CC | gluster-bugs, spandura |
| Hardware | Unspecified |
| OS | Unspecified |
| Fixed In Version | glusterfs-3.4.0 |
| Doc Type | Bug Fix |
| Type | Bug |
| Last Closed | 2013-07-24 17:22:18 UTC |
| Attachments | Testcase execution commands history |
Description (Jeff Darcy, 2012-10-03 17:55:49 UTC)
The problem turned out to be something completely unexpected. Part of the self-heal code was calling synctask_new from code that was already running in a synctask. The caller would then become self-deadlocked waiting for the new task to complete, because it was itself occupying the last resource available for running that task. That wouldn't happen with a single volume, because the default synctask-processor count is two, so there would in fact be a spare; with two volumes we'd run out, and the entire self-heal daemon would effectively stop. (A standalone sketch of this deadlock pattern appears near the end of this report.)

I've submitted two separate fixes for this. http://review.gluster.org/4032 fixes this in AFR, by changing _do_crawl_op_on_local_subvols to call afr_syncop_find_child_position directly instead of through synctask_new. http://review.gluster.org/4031 fixes it in the core, by adding extra code to make sure we have a processor to run the new task even if someone makes this mistake again.

Shwetha:
1) Performing "find . | xargs stat" (at step 7) on the client triggers background self-heal. Not sure why "find | xargs stat" is not triggering the self-heal; when we execute background self-heal tests we always perform "find . | xargs stat".
2) The self-heal daemon triggers self-heal within 10 minutes, but it does not trigger self-heal immediately when the brick comes back online. There is already a bug reported for this (Bug 852741).

Pranith:
Jeff, Shwetha: this code path is the result of commit e8712f36335dd3b8508914f917d74b69a2d751a1.

Shwetha:
Pranith, can you please explain why "find . | xargs stat" triggered self-heal and why "find | xargs stat" didn't trigger self-heal?

Jeff Darcy:
(In reply to comment #2)
> 1) performing "find . | xargs stat" (at step 7) on the client triggers
> background self-heal.

The bug is entirely reproducible given two preconditions:

(a) using code from git master (i.e. not GlusterFS 3.3 or Red Hat Storage 2.0) prior to the two patches mentioned above;
(b) having two or more volumes exported by each involved server.

Is it possible that you're not seeing this because you don't meet those preconditions? Also, doing "find" on the client wouldn't be background self-heal; it would be foreground.

Shwetha:
b) I had two volumes running, one pure replicate and one distributed-replicate, with 2 servers and 1 brick on each server for the pure replicate volume and 2 more servers for the distributed-replicate one. But I am still not seeing the issue when "find . | xargs stat" is performed.

Jeff Darcy:
I branched from current master (6c2eb4f2) and reverted the synctask_new fix (557637f3). Then I rebuilt, reinstalled, and re-ran the steps above. Here's the state after step 6.

*** On gfs1 (daemons had been killed and restarted)

# file: export/sdb/top
trusted.gfid=0xa42fb2a23c604aafaabdcbdd1cda4d40

# export/sdb/dir and export/sdb/dir/sub not present

*** On gfs2 (daemons had run since start of test)

# file: export/sdb/dir
trusted.afr.rep2-client-0=0x000000000000000000000001
trusted.afr.rep2-client-1=0x000000000000000000000000
trusted.gfid=0xf2c6ad80ca704f52bd926ad207c0cd69
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

# file: export/sdb/dir/sub
trusted.afr.rep2-client-0=0x000000010000000000000000
trusted.afr.rep2-client-1=0x000000000000000000000000
trusted.gfid=0x289bd4b244a949d387aac88f824e47c8

# file: export/sdb/top
trusted.afr.rep2-client-0=0x000000020000000100000000
trusted.afr.rep2-client-1=0x000000000000000000000000
trusted.gfid=0xa42fb2a23c604aafaabdcbdd1cda4d40

As you can see, /export/sdb/top had been created as an empty file with no AFR xattrs on gfs1, leaving the copy on gfs2 with a changelog of data=2, meta=1.
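For reference, a minimal decoding sketch (not GlusterFS code) for the 12-byte trusted.afr.<volume>-client-<N> changelog values shown above, assuming the usual AFR layout of three network-byte-order 32-bit counters of pending data, metadata, and entry operations:

```c
#include <stdint.h>
#include <stdio.h>

/* Decode one 12-byte AFR changelog value into its three counters. */
static void decode_afr_changelog(const uint8_t buf[12])
{
    const char *names[3] = { "data", "metadata", "entry" };

    for (int i = 0; i < 3; i++) {
        /* each counter is stored big-endian on the brick */
        uint32_t pending = ((uint32_t)buf[4 * i] << 24) |
                           ((uint32_t)buf[4 * i + 1] << 16) |
                           ((uint32_t)buf[4 * i + 2] << 8) |
                           (uint32_t)buf[4 * i + 3];
        printf("%s=%u ", names[i], pending);
    }
    printf("\n");
}

int main(void)
{
    /* trusted.afr.rep2-client-0 on gfs2's copy of /export/sdb/top:
     * 0x000000020000000100000000 */
    const uint8_t top[12] = { 0, 0, 0, 2, 0, 0, 0, 1, 0, 0, 0, 0 };
    decode_afr_changelog(top);   /* prints: data=2 metadata=1 entry=0 */
    return 0;
}
```

Run against the /export/sdb/top value, this prints data=2 metadata=1 entry=0, matching the "data=2, meta=1" reading above.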
This is the result of the entry self-heal on the volume root from gfs1's NFS daemon. On gfs2, /export/sdb/dir has a changelog of entry=1 and /export/sdb/dir/sub has a changelog of data=1. This is all as described above. At this point I did a "find . | xargs stat" on the still-mounted client (gfs4), with the following result.

*** On gfs1

# file: export/sdb/dir
trusted.afr.rep2-client-0=0x000000000000000000000000
trusted.afr.rep2-client-1=0x000000000000000000000000
trusted.gfid=0xf2c6ad80ca704f52bd926ad207c0cd69
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

# file: export/sdb/dir/sub
trusted.gfid=0x289bd4b244a949d387aac88f824e47c8

# file: export/sdb/top
trusted.gfid=0xa42fb2a23c604aafaabdcbdd1cda4d40

*** On gfs2

# file: export/sdb/dir
trusted.afr.rep2-client-0=0x000000000000000000000000
trusted.afr.rep2-client-1=0x000000000000000000000000
trusted.gfid=0xf2c6ad80ca704f52bd926ad207c0cd69
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

# file: export/sdb/dir/sub
trusted.afr.rep2-client-0=0x000000020000000100000000
trusted.afr.rep2-client-1=0x000000000000000000000000
trusted.gfid=0x289bd4b244a949d387aac88f824e47c8

# file: export/sdb/top
trusted.afr.rep2-client-0=0x000000020000000100000000
trusted.afr.rep2-client-1=0x000000000000000000000000
trusted.gfid=0xa42fb2a23c604aafaabdcbdd1cda4d40

So /export/sdb/dir got created and seems fine, while /export/sdb/dir/sub got created the same way /export/sdb/top had been previously. In other words, entry self-heal happened but data self-heal did not. Finally, I unmounted and remounted the volume on gfs4, then re-ran the find|xargs command.

*** On gfs1

# file: export/sdb/dir
trusted.afr.rep2-client-0=0x000000000000000000000000
trusted.afr.rep2-client-1=0x000000000000000000000000
trusted.gfid=0xf2c6ad80ca704f52bd926ad207c0cd69
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

# file: export/sdb/dir/sub
trusted.afr.rep2-client-0=0x000000000000000000000000
trusted.afr.rep2-client-1=0x000000000000000000000000
trusted.gfid=0x289bd4b244a949d387aac88f824e47c8

# file: export/sdb/top
trusted.afr.rep2-client-0=0x000000000000000000000000
trusted.afr.rep2-client-1=0x000000000000000000000000
trusted.gfid=0xa42fb2a23c604aafaabdcbdd1cda4d40

*** On gfs2

# file: export/sdb/dir
trusted.afr.rep2-client-0=0x000000000000000000000000
trusted.afr.rep2-client-1=0x000000000000000000000000
trusted.gfid=0xf2c6ad80ca704f52bd926ad207c0cd69
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

# file: export/sdb/dir/sub
trusted.afr.rep2-client-0=0x000000000000000000000000
trusted.afr.rep2-client-1=0x000000000000000000000000
trusted.gfid=0x289bd4b244a949d387aac88f824e47c8

# file: export/sdb/top
trusted.afr.rep2-client-0=0x000000000000000000000000
trusted.afr.rep2-client-1=0x000000000000000000000000
trusted.gfid=0xa42fb2a23c604aafaabdcbdd1cda4d40

Now all of the AFR xattrs are present, all zero, and (not shown above) the contents are correct on both nodes. In other words, data self-heal finally happened. I did use gdb to check the self-heal daemon on gfs2, and it did show the expected two threads in self-deadlock, trying to call synctask_new from within synctask_wrap. I don't know why you're seeing different behavior on your systems, but I'm also not sure what more I can do to demonstrate that this problem is reproducible without yesterday's fixes.

Shwetha:
Created attachment 621915 [details]
Testcase execution commands history
I am not sure whether I am executing some steps incorrectly, so I am attaching the output of the test-case command execution. Please let us know your thoughts.
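To make the failure mode concrete, here is a standalone sketch of the deadlock Jeff describes: a fixed pool of two worker threads (mirroring the default synctask-processor count) and one crawl task per volume that spawns a subtask and then waits for it, the way self-heal was calling synctask_new from inside a synctask. This is plain pthreads, not GlusterFS code, and all names are illustrative:

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdio.h>

#define NWORKERS 2                   /* default processor count     */
#define NVOLUMES 2                   /* two volumes trigger the bug */

typedef struct task {
    void (*fn)(void *);
    void *arg;
    struct task *next;
} task_t;

static task_t *queue_head;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

static void enqueue(task_t *t)       /* stands in for synctask_new() */
{
    pthread_mutex_lock(&lock);
    t->next = queue_head;
    queue_head = t;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);
}

static void *worker(void *arg)       /* a synctask processor */
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (!queue_head)
            pthread_cond_wait(&cond, &lock);
        task_t *t = queue_head;
        queue_head = t->next;
        pthread_mutex_unlock(&lock);
        t->fn(t->arg);               /* worker is tied up until fn returns */
    }
    return NULL;
}

static void subtask(void *arg)       /* e.g. finding a child's position */
{
    atomic_store((atomic_int *)arg, 1);
}

static void crawl(void *arg)         /* one self-heal crawl per volume */
{
    (void)arg;
    atomic_int done = 0;
    task_t sub = { subtask, &done, NULL };
    enqueue(&sub);                   /* "synctask_new" from inside a task */
    printf("crawl waiting for subtask...\n");
    while (!atomic_load(&done))      /* blocks its own worker: with all   */
        sched_yield();               /* workers stuck in crawl(), no one  */
    printf("crawl finished\n");      /* is ever free to run subtask()     */
}

int main(void)
{
    pthread_t w[NWORKERS];
    static task_t crawls[NVOLUMES];

    for (int i = 0; i < NWORKERS; i++)
        pthread_create(&w[i], NULL, worker, NULL);
    for (int i = 0; i < NVOLUMES; i++) {
        crawls[i] = (task_t){ crawl, NULL, NULL };
        enqueue(&crawls[i]);
    }
    for (int i = 0; i < NWORKERS; i++)
        pthread_join(w[i], NULL);    /* with NVOLUMES == 2: never returns */
    return 0;
}
```

With NVOLUMES set to 1 the spare worker runs the subtask and everything completes; with 2, both workers sit waiting in crawl() and the subtasks are never scheduled, which corresponds to the two self-deadlocked threads Jeff observed in gdb.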
CHANGE: http://review.gluster.org/4032 (replicate: don't use synctask_new from within a synctask) merged in master by Anand Avati (avati)

CHANGE: http://review.gluster.org/4085 (syncop: save and restore THIS from the time of context switch) merged in master by Anand Avati (avati)
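For completeness, a hedged sketch of the idea behind the core-side fix described in the description (http://review.gluster.org/4031): make sure a processor exists to run a task created from inside another task. This reuses the toy pool from the previous sketch rather than the real syncop code, and the guard logic here is only an illustration of the stated intent, not the actual patch:

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdio.h>

#define NWORKERS 2

typedef struct task { void (*fn)(void *); void *arg; struct task *next; } task_t;

static task_t *queue_head;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static _Thread_local int in_task;    /* set while a worker runs a task */
static atomic_int busy;              /* workers currently inside fn()  */

static void *worker(void *arg);

static void enqueue(task_t *t)
{
    pthread_mutex_lock(&lock);
    t->next = queue_head;
    queue_head = t;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);
    /* The safety net: if this enqueue came from inside a running task
     * and no processor is free, create a spare so the new task can
     * make progress even if someone repeats the mistake. */
    if (in_task && atomic_load(&busy) >= NWORKERS) {
        pthread_t spare;
        pthread_create(&spare, NULL, worker, NULL);
        pthread_detach(spare);
    }
}

static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (!queue_head)
            pthread_cond_wait(&cond, &lock);
        task_t *t = queue_head;
        queue_head = t->next;
        pthread_mutex_unlock(&lock);
        atomic_fetch_add(&busy, 1);
        in_task = 1;
        t->fn(t->arg);
        in_task = 0;
        atomic_fetch_sub(&busy, 1);
    }
    return NULL;
}

static void subtask(void *arg) { atomic_store((atomic_int *)arg, 1); }

static atomic_int crawls_done;

static void crawl(void *arg)
{
    (void)arg;
    atomic_int done = 0;
    task_t sub = { subtask, &done, NULL };
    enqueue(&sub);                   /* the same mistake as before...  */
    while (!atomic_load(&done))      /* ...but a spare worker now runs */
        sched_yield();               /* the subtask, so we get here    */
    atomic_fetch_add(&crawls_done, 1);
}

int main(void)
{
    pthread_t w[NWORKERS];
    static task_t crawls[2];

    for (int i = 0; i < NWORKERS; i++)
        pthread_create(&w[i], NULL, worker, NULL);
    for (int i = 0; i < 2; i++) {
        crawls[i] = (task_t){ crawl, NULL, NULL };
        enqueue(&crawls[i]);
    }
    while (atomic_load(&crawls_done) < 2)
        sched_yield();
    printf("both crawls completed\n");  /* deadlocks without the guard */
    return 0;
}
```

The same two-crawl workload that deadlocked in the previous sketch now completes, because whichever crawl enqueues its subtask while every processor is busy spawns a spare to run it.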