Bug 1492591

Summary: [GSS] Error No such file or directory for new file writes
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Abhishek Kumar <abhishku>
Component: fuse Assignee: Raghavendra G <rgowdapp>
Status: CLOSED ERRATA QA Contact: Vinayak Papnoi <vpapnoi>
Severity: high Docs Contact:
Priority: unspecified    
Version: rhgs-3.2 CC: abhishku, amarts, amukherj, bfoster, bkunal, csaba, dwojslaw, esandeen, mszeredi, nbalacha, ravishankar, rgowdapp, rhs-bugs, rwheeler, sheggodu, storage-qa-internal, swhiteho, vdas
Target Milestone: ---   
Target Release: RHGS 3.4.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.12.2-2 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-09-04 06:36:24 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1500269, 1510401, 1529084, 1529085, 1529086, 1529088, 1529089    
Bug Blocks: 1503135    
Attachments:
Description Flags
script to do rename while other program tries to open the file none
program which opens a file that is being renamed parallely by other program none
do rename through a C program none

Description Abhishek Kumar 2017-09-18 09:30:36 UTC
Description of problem:

New file writes fail with the error "No such file or directory".

Version-Release number of selected component (if applicable):

RHGS 3.2

How reproducible:

Seen in the customer environment


Actual results:

Split-brain and "No such file or directory" errors are returned for new file writes

Expected results:

Split-brain and "No such file or directory" errors should not be returned for new file writes

Additional info:

Comment 14 Nithya Balachandran 2017-09-22 09:41:06 UTC
How many FUSE clients are accessing the volume?

Comment 15 Nithya Balachandran 2017-09-22 09:44:18 UTC
Also, for the files for which the errors are returned, do the file paths exist (/store/partition/primary/=user/03/93/=8716871000002/store.idx, for example), and if yes, what is the gfid?

I am wondering if FUSE has cached the old gfid from before this file was deleted and recreated.
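
For reference, one way to read a file's gfid is to query the trusted.gfid extended attribute on the brick's backend copy of the file. This is only a sketch: the brick root below is hypothetical, and it assumes the path in the example above is relative to the volume root. Run it as root on each server that holds a copy so the values can be compared across replicas.

```shell
# Hypothetical brick root; substitute the real brick directory on this server.
BRICK=/bricks/brick1
# Assumed to be the path relative to the volume root (taken from the example above).
RELPATH='store/partition/primary/=user/03/93/=8716871000002/store.idx'

# Print the gfid stored on the backend file as a hex string.
getfattr -n trusted.gfid -e hex "$BRICK/$RELPATH"
```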

Comment 16 Nithya Balachandran 2017-09-22 10:09:24 UTC
Additional requests:
1. Ask the customer to turn off readdir-ahead
2. Find out how many FUSE clients are accessing the volume and how many are performing the deletes/creates etc.
3. If there is only one mount, please check how many instances of the application are accessing glusterfs, whether the application is multithreaded, and whether deletes, creates, renames, and recreates happen in the same thread or in multiple threads
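
For request 1, a minimal sketch of turning off readdir-ahead from any server in the trusted pool. The volume name is hypothetical.

```shell
# Hypothetical volume name; substitute the affected volume.
VOLNAME=mailvol

# Disable the readdir-ahead translator for this volume.
gluster volume set "$VOLNAME" performance.readdir-ahead off

# Confirm the current value of the option.
gluster volume get "$VOLNAME" performance.readdir-ahead
```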

Comment 17 Nithya Balachandran 2017-09-22 10:22:29 UTC
From Raghavendra G:

If they are willing, can the customer try mounting glusterfs with the options
--entry-timeout=0 and --attribute-timeout=0
and see if they still see the issue?
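
A sketch of what such a mount could look like when the glusterfs client binary is invoked directly. The server name, volume name, and mount point are hypothetical, and any other options the customer already mounts with would need to be carried over.

```shell
# Hypothetical values; substitute the real server, volume, and mount point.
SERVER=gluster1.example.com
VOLNAME=mailvol
MOUNTPOINT=/mnt/mailvol

# Unmount the existing FUSE mount first (the application must be stopped).
umount "$MOUNTPOINT"

# Remount with entry and attribute caching timeouts set to zero,
# so cached lookups and attributes are revalidated on every access.
glusterfs --volfile-server="$SERVER" --volfile-id="$VOLNAME" \
    --entry-timeout=0 --attribute-timeout=0 "$MOUNTPOINT"
```

Setting both timeouts to zero usually costs some performance, so this is meant as a diagnostic step rather than a permanent setting.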

Comment 18 Raghavendra G 2017-09-22 11:35:13 UTC
(In reply to Nithya Balachandran from comment #16)
> Additional requests:
> 1. Ask the customer to turn off readdir-ahead
> 2. Find out how many Fuse clients are accessing the volume and how many are
> performing the deletes/creates etc
> 3. If there is only one mount please check how many instances of application
> is accessing glusterfs and whether the application is multithreaded and
> whether deletes, creates, renames, recreates happen in same thread or
> multiple thread

Basically, we are trying to find out the access pattern of the application. I think the following two pieces of data, captured starting some time before the errors appear, will help us deduce the pattern more accurately:

* strace -fTtt -p <pid-of-mail-server> -o <strace-output-file>
* fusedump: while mounting glusterfs, provide the option --dump-fuse=<path-to-fuse-dump-file>

Please attach strace-output-file and fuse-dump-file to the case.

Comment 25 Nithya Balachandran 2017-09-29 07:57:36 UTC
In the meantime, we will look at the information provided and get back to you.

Comment 41 Raghavendra G 2017-10-10 09:50:35 UTC
Created attachment 1336713 [details]
script to do rename while other program tries to open the file

Comment 42 Raghavendra G 2017-10-10 09:51:43 UTC
Created attachment 1336714 [details]
program which opens a file that is being renamed parallely by other program

Comment 43 Raghavendra G 2017-10-10 09:52:22 UTC
Created attachment 1336715 [details]
do rename through a C program

Comment 101 errata-xmlrpc 2018-09-04 06:36:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607