The following errors are seen on the replicate setup:

[2010-04-18 12:04:20] E [posix.c:509:posix_lookup] posix5: post-operation lstat on parent of .landfill/gardener-files-gardens-tangle002-active_domains_by_site.json.txt failed: No such file or directory
[2010-04-18 15:44:37] E [posix.c:2366:posix_open] posix5: open on /mnt/brick5/gardener/files/gardens/tangle002: Is a directory
[2010-04-18 15:44:38] E [posix.c:2366:posix_open] posix5: open on /mnt/brick5/gardener/files/gardens/tangle002: Is a directory
[2010-04-18 15:44:38] E [posix.c:2146:posix_truncate] posix5: truncate on /gardener/files/gardens/tangle002 failed: Is a directory

Some observations by Vikas@:

> [2010-04-18 12:04:20] E [posix.c:509:posix_lookup] posix5: post-operation lstat on parent of .landfill/gardener-files-gardens-tangle002-active_domains_by_site.json.txt failed: No such file or directory

.landfill is the replicate directory for self-heal. When self-heal decides that a file should be deleted on a subvolume, it moves (rename) it to this directory (as a safety feature). So this file was originally:

/gardener/files/gardens/tangle002/active_domains_by_site.json.txt

> [2010-04-18 15:44:37] E [posix.c:2366:posix_open] posix5: open on /mnt/brick5/gardener/files/gardens/tangle002: Is a directory
> [2010-04-18 15:44:38] E [posix.c:2366:posix_open] posix5: open on /mnt/brick5/gardener/files/gardens/tangle002: Is a directory
> [2010-04-18 15:44:38] E [posix.c:2146:posix_truncate] posix5: truncate on /gardener/files/gardens/tangle002 failed: Is a directory

Due to the above lstat failing, client has wrongly concluded that /gardener/files/gardens/tangle002 does not exist (ENOENT). Then presumably it has sent open(O_CREAT) which has led to EISDIR.
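To illustrate the last step of that chain, here is a minimal sketch, independent of GlusterFS, of what the kernel returns when a caller that believes a path is missing issues open(O_CREAT) or truncate against a path that is actually a directory. The helper names (errno_name, reproduce_eisdir) and the temporary directory standing in for the brick are assumptions made for the example; it only shows the plain POSIX behaviour on Linux, not the posix translator code.

```python
import errno
import os
import tempfile


def errno_name(call, *args):
    """Run call(*args); return the symbolic errno name, or None on success."""
    try:
        result = call(*args)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    # os.open returns a descriptor on success; close it so nothing leaks.
    if call is os.open and isinstance(result, int):
        os.close(result)
    return None


def reproduce_eisdir():
    """Open-for-create and truncate a path that is really a directory."""
    # Hypothetical stand-in for the brick; the real path was /mnt/brick5/...
    with tempfile.TemporaryDirectory() as brick:
        target = os.path.join(brick, "tangle002")
        os.mkdir(target)  # the path exists, and it is a directory
        return {
            # What a client would send after wrongly concluding ENOENT.
            "open": errno_name(os.open, target, os.O_WRONLY | os.O_CREAT, 0o644),
            "truncate": errno_name(os.truncate, target, 0),
        }


if __name__ == "__main__":
    print(reproduce_eisdir())
```

On Linux both calls are expected to fail with EISDIR, which matches the "Is a directory" messages logged by posix_open and posix_truncate.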
This can be avoided by removing the dentry from the inode table once the file is moved (i.e., renamed) to the .landfill directory. That way the file can no longer be resolved in server-protocol, which prevents these errors.
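The idea can be sketched with a toy model. This is not GlusterFS code: DentryTable, landfill_name and move_to_landfill are hypothetical names, the table is a plain set of brick-relative paths, and the naming rule (slashes replaced by hyphens) is only inferred from the single .landfill entry in the log above.

```python
import os
import tempfile

LANDFILL = ".landfill"


class DentryTable:
    """Toy stand-in for an inode table: a set of resolvable paths."""

    def __init__(self):
        self._dentries = set()

    def link(self, path):
        self._dentries.add(path)

    def unlink(self, path):
        self._dentries.discard(path)

    def resolve(self, path):
        return path in self._dentries


def landfill_name(path):
    """Flatten a brick-relative path the way the logged entry appears to be."""
    return path.strip("/").replace("/", "-")


def move_to_landfill(brick, table, path):
    """Rename the file into .landfill, then drop its dentry from the table."""
    landfill_dir = os.path.join(brick, LANDFILL)
    os.makedirs(landfill_dir, exist_ok=True)
    dest = os.path.join(landfill_dir, landfill_name(path))
    os.rename(os.path.join(brick, path.lstrip("/")), dest)
    # The proposed fix: without this step the old path would still resolve.
    table.unlink(path)
    return dest


def demo():
    with tempfile.TemporaryDirectory() as brick:
        path = "/gardener/files/gardens/tangle002/active_domains_by_site.json.txt"
        full = os.path.join(brick, path.lstrip("/"))
        os.makedirs(os.path.dirname(full))
        open(full, "w").close()
        table = DentryTable()
        table.link(path)
        dest = move_to_landfill(brick, table, path)
        return os.path.basename(dest), table.resolve(path), os.path.exists(dest)


if __name__ == "__main__":
    print(demo())
```

After the move, the original path no longer resolves through the table, while the file itself is still present under .landfill, so the safety copy is preserved.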
Seeing the same issue with 3.0.2 when testing the self-heal function (killing the second storage node while writing from the client):

[2010-07-14 12:57:25] E [posix.c:477:posix_lookup] posix: post-operation lstat on parent of .landfill/vhosts-blade-user.samh.include failed: No such file or directory
[2010-07-14 12:57:25] E [posix.c:477:posix_lookup] posix: post-operation lstat on parent of .landfill/vhosts-blade-user.blade.wildcards.include failed: No such file or directory
[2010-07-14 12:57:25] E [posix.c:477:posix_lookup] posix: post-operation lstat on parent of .landfill/vhosts-blade-user.bauwensp2.wildcards.include failed: No such file or directory
[2010-07-14 12:57:25] E [posix.c:477:posix_lookup] posix: post-operation lstat on parent of .landfill/vhosts-blade-user.beutenb.wildcards.include failed: No such file or directory
[2010-07-14 13:38:22] E [posix.c:2331:posix_open] posix: open on /data/export/.landfill/vhosts-apoc-user.rommess2.include: No such file or directory
[2010-07-14 13:38:45] E [posix.c:2331:posix_open] posix: open on /data/export/.landfill/vhosts-apoc-user.rommess2.include: No such file or directory
[2010-07-14 13:46:53] E [posix.c:2331:posix_open] posix: open on /data/export/.landfill/vhosts-apoc-user.rommess2.include: No such file or directory
Most of the self-heal (replicate-related) bugs are now fixed in the 3.1.0 branch. As we are just a week away from the GA release, we would like you to test this particular bug against the 3.1.0 RC releases and let us know if it is fixed.
Avati, can you check this and update the status accordingly?
With the 3.1 releases (check the latest codebase), this bug will no longer apply. Please update to a 3.1.x release.