Bug 1609224 - While moving multiple temporary files to the same destination concurrently, writes and reads on the same dest file fail with ESTALE and ENOENT
Summary: While moving multiple temporary files to the same destination concurrently, w...
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: distribute
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
low
medium
Target Milestone: ---
: ---
Assignee: Susant Kumar Palai
QA Contact: Prasad Desala
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-07-27 10:14 UTC by Prasad Desala
Modified: 2020-06-10 12:13 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-30 12:02:19 UTC
Embargoed:



Description Prasad Desala 2018-07-27 10:14:10 UTC
Description of problem:
========================
While moving multiple temporary files to the same destination concurrently, writes and reads on the same destination file fail with ESTALE and ENOENT.

Version-Release number of selected component (if applicable):
3.12.2-14.el7rhgs.x86_64

How reproducible:
always

Steps to Reproduce:
====================
1) Create a distributed-replicated volume and start it.
2) FUSE mount it on multiple clients.
3) From a few clients, execute:

while true; do uuid="`uuidgen`"; echo "some data" > "test$uuid"; mv "test$uuid" "test" -f; done

From other clients, keep sending lookups.
4) With step 3 in progress, write to and read from the destination file "test":

[root@dhcp37-109 fuse]# while true; do cat /etc/redhat-release >> test;cat test;done

Red Hat Enterprise Linux Server release 7.5 (Maipo)
some data
Red Hat Enterprise Linux Server release 7.5 (Maipo)
cat: write error: Stale file handle
cat: write error: Stale file handle
cat: test: No such file or directory
some data
Red Hat Enterprise Linux Server release 7.5 (Maipo)
cat: test: Stale file handle
cat: write error: Stale file handle
cat: write error: Stale file handle
some data
some data
Red Hat Enterprise Linux Server release 7.5 (Maipo)
cat: write error: No such file or directory
cat: write error: No such file or directory
cat: test: No such file or directory
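
For comparison, the steps above can be condensed into a bounded script with one writer and one reader. This is only a sketch: ITER, TESTDIR and the temporary file names are illustrative assumptions, not part of the original test, and the loop appends a fixed string instead of /etc/redhat-release so that it runs on any host. By default it runs in a local scratch directory, where rename is expected to be atomic and the failure count should stay at 0; pointing TESTDIR at a directory on the FUSE mount exercises the case reported here.

```shell
#!/bin/sh
# Bounded variant of the reproduction: one writer, one reader, same directory.
# ITER and TESTDIR are illustrative assumptions, not values from the report.
ITER="${ITER:-200}"
workdir="${TESTDIR:-$(mktemp -d)}"
cd "$workdir" || exit 1
echo "some data" > test            # make sure the destination exists up front

# Step 3: repeatedly create a temporary file and rename it over "test".
writer() {
    i=0
    while [ "$i" -lt "$ITER" ]; do
        echo "some data" > "test.$$.$i"
        mv -f "test.$$.$i" test
        i=$((i + 1))
    done
}

# Step 4: append to and read the destination, counting failed operations.
reader() {
    i=0
    failures=0
    while [ "$i" -lt "$ITER" ]; do
        echo "appended" >> test 2>/dev/null || failures=$((failures + 1))
        cat test > /dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

writer &
failures="$(reader)"
wait
echo "failed operations: $failures"
```

Running several copies of the writer from different clients, as in step 3, would come closer to the original setup.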

Actual results:
================
Writes and reads on the file fail with ESTALE and ENOENT.

Expected results:
==================
Writes and reads should not fail.

Additional info:
================
I will share the location of the sos reports and gluster-health-check reports.

Comment 3 Raghavendra G 2018-07-28 04:44:41 UTC
can you try the test with turning off performance.open-behind?

Comment 4 Prasad Desala 2018-07-30 10:45:13 UTC
(In reply to Raghavendra G from comment #3)
> can you try the test with turning off performance.open-behind?

I'm able to reproduce this issue with performance.open-behind: off as well. The only difference I saw is that, while running the script[1], only ESTALE errors are seen (with performance.open-behind: on, we see both ESTALE and ENOENT).

[1] while true; do cat /etc/redhat-release >> test;cat test;done

Comment 5 Raghavendra G 2018-07-30 13:10:31 UTC
(In reply to Prasad Desala from comment #4)
> (In reply to Raghavendra G from comment #3)
> > can you try the test with turning off performance.open-behind?
> 
> I'm able to reproduce this issue with performance.open-behind: off as well.

Please collect the following debug information:
* Set diagnostics.client-log-level to TRACE before starting the tests.
* Collect fuse dumps during the test.

Attach the client logs and the collected fuse dump to this bz. Please collect this diagnostic data with performance.open-behind off.
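
A rough sketch of how the settings above could be applied is shown here. The volume name, server, mount point and dump path are placeholders (assumptions, not values from this bug), and the script only prints the commands unless DRY_RUN=0 is set. It assumes the fuse dump is taken by mounting the client with the glusterfs --dump-fuse option.

```shell
#!/bin/sh
# Placeholders (assumptions, not from this bug): adjust for the actual setup.
VOLNAME="${VOLNAME:-testvol}"
SERVER="${SERVER:-server1}"
MOUNTPOINT="${MOUNTPOINT:-/mnt/fuse}"
FUSEDUMP="${FUSEDUMP:-/var/tmp/fusedump.bin}"

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

collect_setup() {
    # Volume options requested for the debug run.
    run gluster volume set "$VOLNAME" performance.open-behind off
    run gluster volume set "$VOLNAME" diagnostics.client-log-level TRACE
    # Mount the client with FUSE traffic dumped to a file.
    run glusterfs --volfile-server="$SERVER" --volfile-id="$VOLNAME" \
        --dump-fuse="$FUSEDUMP" "$MOUNTPOINT"
}

collect_setup
```

After the test, the client log for the mount and the file named by FUSEDUMP are the artifacts to attach.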

Comment 9 Sahina Bose 2019-11-25 07:18:17 UTC
ping? similar to bug 1610258?

Comment 10 Raghavendra G 2019-11-25 09:44:45 UTC
(In reply to Sahina Bose from comment #9)
> ping? similar to bug 1610258?

Yes. I had the following comment on bz 1610258.

Comment 11 Raghavendra G 2019-11-25 09:47:23 UTC
(In reply to Raghavendra G from comment #10)
> (In reply to Sahina Bose from comment #9)
> > ping? similar to bug 1610258?
> 
> Yes. I had the following comment on bz 1610258.

From a POSIX compliance standpoint, this is a genuine issue, as renames are expected to be atomic and the above test case is expected to pass. Also note that

1. create a tmp file
2. write to it
3. rename tmp file to a well known path

is a common pattern that applications repeat over and over. So I think this bug should be fixed, although it may not be high priority.
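
A minimal sketch of that three-step pattern follows, for illustration only. The helper name atomic_write and the temporary file naming are hypothetical. The temporary file is created in the destination's directory so that the final mv is a rename within one filesystem rather than a copy.

```shell
#!/bin/sh
# atomic_write DEST DATA (hypothetical helper):
# write DATA to a temporary file next to DEST, then rename it over DEST.
atomic_write() {
    dest="$1"
    data="$2"
    dir="$(dirname -- "$dest")"
    # 1. create a tmp file in the same directory as the destination
    tmp="$(mktemp "$dir/.tmp.XXXXXX")" || return 1
    # 2. write to it; clean up if the write fails
    printf '%s\n' "$data" > "$tmp" || { rm -f -- "$tmp"; return 1; }
    # 3. rename the tmp file to the well-known path
    mv -f -- "$tmp" "$dest"
}
```

For example, atomic_write ./test "some data" replaces ./test in one step, so a concurrent reader is expected to see either the old or the new contents, never a missing file.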

