Description of problem:
The test tests/bitrot/bug-1373520.t is consistently failing on master and blocking patches. Discussed the same with Kotresh. He reckons the bug is with EC not performing heals on lookup. Requesting EC component owners to fix either the script or the code, wherever the bug may be. Moving this to bad tests until then.

Version-Release number of selected component (if applicable):

How reproducible:
100%

Steps to Reproduce:
1. Just run the test in subject on the head of master
2.
3.

Actual results:

Expected results:

Additional info:
Here's a sample failure - https://build.gluster.org/job/centos6-regression/3444/console
REVIEW: https://review.gluster.org/16780 (tests: Mark tests/bitrot/bug-1373520.t until fixed) posted (#1) for review on master by Krutika Dhananjay (kdhananj)
REVIEW: https://review.gluster.org/16780 (tests: Mark tests/bitrot/bug-1373520.t bad until fixed) posted (#2) for review on master by Krutika Dhananjay (kdhananj)
Raghavendra,
I looked at the issue, and cat on a file is leading to a lookup on its hardlink:

#Delete file and all links from backend
TEST rm -rf $(find $B0/${V0}5 -inum $(stat -c %i $B0/${V0}5/FILE1))

#Access files
TEST cat $M0/FILE1 <<---- this
EXPECT_WITHIN $HEAL_TIMEOUT "$SIZE" path_size $B0/${V0}5/FILE1
TEST cat $M0/HL_FILE1
EXPECT_WITHIN $HEAL_TIMEOUT "$SIZE" path_size $B0/${V0}5/HL_FILE1

[2017-02-28 06:34:39.451080]:++++++++++ G_LOG:tests/bitrot/bug-1373520.t: TEST: 53 cat /mnt/glusterfs/0/FILE1 ++++++++++ <<<------
[2017-02-28 06:34:39.456229] W [MSGID: 122053] [ec-common.c:161:ec_check_status] 0-patchy-disperse-0: Operation failed on 1 of 6 subvolumes.(up=111111, mask=111111, remaining=000000, good=011111, bad=100000)
[2017-02-28 06:34:39.458492] W [MSGID: 122053] [ec-common.c:161:ec_check_status] 0-patchy-disperse-0: Operation failed on 1 of 6 subvolumes.(up=111111, mask=011111, remaining=000000, good=011111, bad=100000)
[2017-02-28 06:34:39.459497] W [MSGID: 114031] [client-rpc-fops.c:2926:client3_3_lookup_cbk] 0-patchy-client-5: remote operation failed. Path: /HL_FILE1 (d3b64a73-731b-4ce1-af6f-ab33bc4d07f3) [No such file or directory] <<-----
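To make the "Delete file and all links from backend" step concrete, here is a minimal sketch of how find -inum picks up every name that shares an inode with a file. The directory and the extra link names are throwaway stand-ins created under mktemp, not the test's real brick layout. On a real brick there is typically also a gfid-named hardlink under .glusterfs, which is presumably why the test deletes by inode number rather than by name.

```shell
#!/bin/bash
# Sketch only: every path below is a stand-in, not the real $B0/${V0}5 brick.

brick=$(mktemp -d)                        # stand-in for one brick directory
mkdir -p "$brick/.links"                  # hypothetical dir holding an extra link

echo "data" > "$brick/FILE1"
ln "$brick/FILE1" "$brick/HL_FILE1"       # user-visible hardlink
ln "$brick/FILE1" "$brick/.links/by-id"   # hypothetical hidden hardlink

# All three names resolve to the same inode number.
inum=$(stat -c %i "$brick/FILE1")
links=$(find "$brick" -inum "$inum" | sort)
echo "$links"

# Same shape as the command in the test: remove every name for that inode.
rm -rf $(find "$brick" -inum "$inum")

# No regular file should be left behind on the stand-in brick.
remaining=$(find "$brick" -type f | wc -l)
echo "files left: $remaining"
```

After this step neither FILE1 nor HL_FILE1 exists on that brick, so a lookup on either name there is expected to return "No such file or directory" until a heal recreates it.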
<pranithk> raghug: https://bugzilla.redhat.com/show_bug.cgi?id=1427404#c3
<raghug> pranithk, checking
<raghug> pranithk, who sent this lookup?
<raghug> on /HL_FILE1
<pranithk> raghug: cat $M0/FILE1, that command sent lookup on this file
<pranithk> raghug: which is confusing
<raghug> pranithk, I doubt it
<raghug> is it possible to prove it?
<raghug> may be by dumping fuse traffic
<raghug> or is it something gone wrong in glusterfs stack
<pranithk> raghug: Is the procedure documented somewhere? How to get that done?
<raghug> for eg., inode_path might've given that link
<raghug> though application was accessing from other link
<pranithk> raghug: ah! I remember seeing such code in EC let me also see..
<pranithk> raghug: either way can you give any link which documents how to get fuse traffic?
<raghug> pranithk, just do a git grep on "inode_parent" and inode_path
<raghug> for the entire codebase
<raghug> pranithk, let me search that
<pranithk> raghug: no no, I want to first check what came in fuse traffic
<pranithk> raghug: I want to learn that part...
<raghug> I am searching for documentation on that :)
<raghug> pranithk, https://github.com/csabahenk/parsefuse/wiki/2016-Aug-5th-presentation-screen-dumps
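The codebase search suggested in the chat above could be run along these lines. This is a small sketch, and the function name grep_inode_helpers is a hypothetical wrapper introduced here for illustration; only git grep itself and the two identifiers come from the discussion.

```shell
# Hypothetical helper: list every tracked line that mentions inode_parent or
# inode_path. $1 is the path to a git working tree (defaults to the current
# directory); output is in git grep's file:line:text form.
grep_inode_helpers() {
    git -C "${1:-.}" grep -n -e inode_parent -e inode_path
}
```

Run from the top of a glusterfs checkout as grep_inode_helpers . and narrow the result to the translator of interest by piping it through grep, for example on the EC source directory.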
COMMIT: https://review.gluster.org/16780 committed in master by Pranith Kumar Karampuri (pkarampu)
------
commit bbb03ab1a2a9f0acc02f1d252a9bf811ba854bab
Author: Krutika Dhananjay <kdhananj>
Date:   Tue Feb 28 11:45:44 2017 +0530

    tests: Mark tests/bitrot/bug-1373520.t bad until fixed

    Change-Id: Ic0b5c93c6365e26a5742184dd9445354c0a57295
    BUG: 1427404
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: https://review.gluster.org/16780
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
REVIEW: https://review.gluster.org/16792 (storage/posix: Set ret value correctly before exiting) posted (#1) for review on master by Krutika Dhananjay (kdhananj)
COMMIT: https://review.gluster.org/16792 committed in master by Jeff Darcy (jdarcy)
------
commit a2d4c928e93c95dfe2ceff450556f8494d67e654
Author: Krutika Dhananjay <kdhananj>
Date:   Wed Mar 1 12:48:10 2017 +0530

    storage/posix: Set ret value correctly before exiting

    Change-Id: I07c3a21c1c0625a517964693351356eead962571
    BUG: 1427404
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: https://review.gluster.org/16792
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.11.0, please open a new bug report.

glusterfs-3.11.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-May/000073.html
[2] https://www.gluster.org/pipermail/gluster-users/
*** Bug 1419445 has been marked as a duplicate of this bug. ***