Description of problem:
========================
While moving multiple temporary files to the same destination concurrently, writes and reads on the same destination file fail with ESTALE and ENOENT.

Version-Release number of selected component (if applicable):
3.12.2-14.el7rhgs.x86_64

How reproducible:
always

Steps to Reproduce:
====================
1) Create a distributed-replicated volume and start it.
2) FUSE mount it on multiple clients.
3) From a few clients, execute:
   while true; do uuid="`uuidgen`"; echo "some data" > "test$uuid"; mv "test$uuid" "test" -f; done
   From other clients, keep sending lookups.
4) With step 3 in progress, do writes and reads to the destination file "test".

[root@dhcp37-109 fuse]# while true; do cat /etc/redhat-release >> test;cat test;done
Red Hat Enterprise Linux Server release 7.5 (Maipo)
some data
Red Hat Enterprise Linux Server release 7.5 (Maipo)
cat: write error: Stale file handle
cat: write error: Stale file handle
cat: test: No such file or directory
some data
Red Hat Enterprise Linux Server release 7.5 (Maipo)
cat: test: Stale file handle
cat: write error: Stale file handle
cat: write error: Stale file handle
some data
some data
Red Hat Enterprise Linux Server release 7.5 (Maipo)
cat: write error: No such file or directory
cat: write error: No such file or directory
cat: test: No such file or directory

Actual results:
================
Writes and reads on the file fail with ESTALE and ENOENT.

Expected results:
==================
Writes and reads should not fail.

Additional info:
================
Will be sharing the location of sos reports and gluster-health-check reports.
can you try the test with turning off performance.open-behind?
(In reply to Raghavendra G from comment #3)
> can you try the test with turning off performance.open-behind?

I'm able to reproduce this issue with performance.open-behind: off as well. The difference I saw during this test is that, while running the script [1], only ESTALE errors are seen (with performance.open-behind: on, we are seeing both ESTALE and ENOENT).

[1] while true; do cat /etc/redhat-release >> test;cat test;done
(In reply to Prasad Desala from comment #4)
> (In reply to Raghavendra G from comment #3)
> > can you try the test with turning off performance.open-behind?
>
> I'm able to reproduce this issue with performance.open-behind: off as well.

Please collect the following debug information:
* set diagnostics.client-log-level to TRACE before starting the tests
* collect fuse-dumps during the test.

Attach the client logs and the fusedump collected to the bz. Please collect this diagnostic data with performance.open-behind off.
ping? similar to bug 1610258?
(In reply to Sahina Bose from comment #9) > ping? similar to bug 1610258? Yes. I had the following comment on bz 1610258.
(In reply to Raghavendra G from comment #10)
> (In reply to Sahina Bose from comment #9)
> > ping? similar to bug 1610258?
>
> Yes. I had the following comment on bz 1610258.

From a POSIX compliance standpoint, this is a genuine issue, as renames are expected to be atomic and the above test case is expected to pass. Also note that the sequence
1. create a tmp file
2. write to it
3. rename the tmp file to a well-known path
is a common pattern, and this pattern is repeated over and over. So, I think this bug should be fixed, but it may not be high priority.
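
For reference, here is a minimal sketch of the tmp-file-then-rename pattern described above. It runs in a scratch directory on a local filesystem, not on a Gluster mount. The helper name atomic_write and the scratch directory are illustrative assumptions and are not taken from this report. The sketch relies on rename(2) replacing the destination atomically, so a concurrent reader of the well-known path should see either the old contents or the new contents, never ENOENT.

```shell
#!/bin/bash
# Sketch only: publish new contents at a well-known path via write-then-rename.
# atomic_write is a hypothetical helper name used for illustration.
atomic_write() {
    local dest="$1" data="$2" tmp
    # Create the temp file next to the destination so the rename
    # stays within a single filesystem.
    tmp="$(mktemp "$(dirname "$dest")/tmp.XXXXXX")" || return 1
    # Write the full contents to the temp file first.
    printf '%s\n' "$data" > "$tmp" || { rm -f "$tmp"; return 1; }
    # On the same filesystem, mv uses rename(2), which atomically
    # replaces any existing destination.
    mv -f "$tmp" "$dest"
}

# Scratch directory standing in for the mount point (assumption).
workdir="$(mktemp -d)"
atomic_write "$workdir/test" "some data"
atomic_write "$workdir/test" "some more data"
cat "$workdir/test"
```

On a POSIX-compliant filesystem, a reader looping on "cat test" while this runs repeatedly would be expected to succeed on every iteration, which is the behaviour the reproducer in this bug checks for.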