Description of problem:
New file writes fail with "No such file or directory" errors.
Version-Release number of selected component (if applicable):
Split-brain and "No such file or directory" errors are reported for new file writes.
Split-brain and "No such file or directory" errors should not occur for new file writes.
How many FUSE clients are accessing the volume?
Also, for the files for which the errors are returned, do the file paths exist (for example, /store/partition/primary/=user/03/93/=8716871000002/store.idx), and if so, what is the gfid?
I am wondering whether FUSE cached the old gfid before this file was deleted and recreated.
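For the gfid question above, a rough client-side sketch in C follows. It assumes a Linux client and that the FUSE mount exposes the gfid through the glusterfs.gfid.string virtual extended attribute. The helper name read_gfid_string is hypothetical and is not taken from this report.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/xattr.h>

/*
 * Hypothetical helper: read the gfid of `path` as a string through the
 * glusterfs.gfid.string virtual xattr (assumed to be available on the mount).
 * Returns 0 on success, -1 on failure with errno set.
 */
int read_gfid_string(const char *path, char *buf, size_t buflen)
{
    assert(path != NULL && buf != NULL);

    if (buflen == 0) {
        errno = EINVAL;
        return -1;
    }

    /* Leave room for the terminating NUL; getxattr does not add one. */
    ssize_t n = getxattr(path, "glusterfs.gfid.string", buf, buflen - 1);
    if (n < 0)
        return -1; /* ENOENT here means the path itself does not resolve */

    buf[n] = '\0';
    return 0;
}
```

Calling this on a failing path both before and after the file is recreated would show whether the gfid changed; an ENOENT result means the path does not resolve on that client at all.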
Additional requests:
1. Ask the customer to turn off readdir-ahead.
2. Find out how many FUSE clients are accessing the volume and how many of them are performing the deletes, creates, etc.
3. If there is only one mount, please check how many instances of the application are accessing glusterfs, whether the application is multithreaded, and whether deletes, creates, renames, and recreates happen in the same thread or in multiple threads.
From Raghavendra G:
If they are willing, can the customer try mounting glusterfs with the options
--entry-timeout=0 and --attribute-timeout=0
and see whether the issue still occurs?
(In reply to Nithya Balachandran from comment #16)
> Additional requests:
> 1. Ask the customer to turn off readdir-ahead
> 2. Find out how many Fuse clients are accessing the volume and how many are
> performing the deletes/creates etc
> 3. If there is only one mount please check how many instances of application
> is accessing glusterfs and whether the application is multithreaded and
> whether deletes, creates, renames, recreates happen in same thread or
> multiple thread
Basically, we are trying to find out the access pattern of the application. I think the following two pieces of data, captured starting some time before the errors appear, will help us deduce the pattern more accurately:
* strace -fTtt -p <pid-of-mail-server> -o <strace-output-file>
* fusedump: when mounting glusterfs, provide the option --dump-fuse=<path-to-fuse-dump-file>
Please attach strace-output-file and fuse-dump-file to the case.
In the meantime, we will look at the information provided and get back to you.
Created attachment 1336713
script to do a rename while another program tries to open the file
Created attachment 1336714
program which opens a file that is being renamed in parallel by another program
Created attachment 1336715
do the rename through a C program
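The attachments themselves are not reproduced here. As a rough illustration of the kind of test they describe, the sketch below has one process repeatedly rename a temporary file over a target while another process keeps opening the target. The overwrite-by-rename pattern, the function names, and the choice to count ENOENT and ESTALE failures are all assumptions made for illustration, not details taken from the attached programs.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child process: recreate `tmp` and rename it over `target`, `rounds` times. */
static void rename_loop(const char *target, const char *tmp, int rounds)
{
    for (int i = 0; i < rounds; i++) {
        int fd = open(tmp, O_CREAT | O_WRONLY | O_TRUNC, 0644);
        if (fd < 0)
            _exit(1);
        close(fd);
        if (rename(tmp, target) != 0)
            _exit(1);
    }
    _exit(0);
}

/*
 * Hypothetical reproducer: returns how many open() calls on `target` failed
 * with ENOENT or ESTALE while the child was renaming, or -1 on setup error.
 */
int rename_open_race(const char *target, const char *tmp, int rounds)
{
    /* Make sure the target exists before the race starts. */
    int fd = open(target, O_CREAT | O_WRONLY, 0644);
    if (fd < 0)
        return -1;
    close(fd);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        rename_loop(target, tmp, rounds); /* never returns */

    int failures = 0;
    for (int i = 0; i < rounds; i++) {
        fd = open(target, O_RDONLY);
        if (fd >= 0)
            close(fd);
        else if (errno == ENOENT || errno == ESTALE)
            failures++;
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    /* Clean up; `tmp` may already be gone, so ignore unlink errors. */
    unlink(target);
    unlink(tmp);

    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return failures;
}
```

Because rename over an existing file is atomic on a POSIX file system, the target name should always resolve during such a run, so a non-zero count on a glusterfs mount would point at the client resolving a stale entry. Both paths would need to be in the same directory on the mount under test.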
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.