Bug 1243372 - FSAL_GLUSTER : iozone -a fails for pnfs mount
Summary: FSAL_GLUSTER : iozone -a fails for pnfs mount
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: nfs-ganesha
Classification: Retired
Component: FSAL_GLUSTER
Version: devel
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Jiffin
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1243374
Reported: 2015-07-15 10:32 UTC by Jiffin
Modified: 2020-06-24 11:14 UTC (History)

Fixed In Version:
Clone Of:
: 1243374 (view as bug list)
Environment:
Last Closed: 2020-06-24 11:14:42 UTC
Embargoed:


Attachments
strace_output (12.23 KB, text/plain)
2015-09-01 09:13 UTC, Jiffin
nfs-client log (178.27 KB, text/plain)
2015-09-01 09:13 UTC, Jiffin

Description Jiffin 2015-07-15 10:32:55 UTC
Description of problem:

iozone -a fails on a pNFS mount with a "read: Bad file descriptor" error.

#iozone -a

	Auto Mode
	Command line used: iozone -a
	Output is in Kbytes/sec
	Time Resolution = 0.000001 seconds.
	Processor cache size set to 1024 Kbytes.
	Processor cache line size set to 32 bytes.
	File stride size set to 17 * record size.
                                                            random  random    bkwd   record   stride                                   
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
              64       4  576282 1392258
Error reading block 0 66700000
read: Bad file descriptor


Version-Release number of selected component (if applicable):
mainline

How reproducible:
almost (90%)

Steps to Reproduce:
1. Create a volume
2. Export via nfs-ganesha
3. Mount using pNFS (version = 4.1)
4. Run iozone -a on the mount point

Actual results:
Fails with a "Bad file descriptor" error

Expected results:
Should succeed

Additional info:
Only iozone -a (automatic mode) fails.

No crashes are seen in the bricks or in nfs-ganesha.

However, when the iozone tests are run individually (tests -i 0 through -i 8), all of them pass.

It seems to be a timing issue in performing the cache invalidation on the MDS.

Comment 1 Jiffin 2015-09-01 09:12:19 UTC
There are no notable errors in the ganesha, gfapi, and brick logs. From the packet trace, it seems the request is not even sent to the servers. I am attaching the strace output and the nfs-client log.

Comment 2 Jiffin 2015-09-01 09:13:05 UTC
Created attachment 1068901 [details]
strace_output

Comment 3 Jiffin 2015-09-01 09:13:46 UTC
Created attachment 1068902 [details]
nfs-client log

Comment 4 Jiffin 2015-09-10 05:44:22 UTC
On further debugging, I confirmed that it is a known timing issue.

The pNFS cluster consists of an MDS (metadata server) and DSes (data servers). The file is opened through the MDS, and all I/O goes to the corresponding DSes. So after a write operation, the UPCALL infrastructure should invalidate the cached context on the MDS (otherwise the MDS remains unaware of the modification).

In the case of "iozone -a", after performing the write operation, the client sends an open call for the next read operation. This open reaches the MDS before the upcall notification, so the MDS replies with stale information from its cache (in the GETATTR call that is part of the OPEN compound, the size of the file appears to be 0 instead of 64k), and thus iozone -a is interrupted.
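
The ordering problem described above can be illustrated with a small, deterministic toy model. This is only a sketch of the race, not nfs-ganesha or FSAL_GLUSTER code: the class names (Backend, MetadataServer, DataServer), the pending-upcall list, and the file name "testfile" are all hypothetical, and the real upcall path is asynchronous rather than an explicit queue.

```python
class Backend:
    """Stands in for the Gluster volume: holds the authoritative file sizes."""

    def __init__(self):
        self.sizes = {}


class MetadataServer:
    """Toy MDS that caches file sizes until an upcall invalidates them."""

    def __init__(self, backend):
        self.backend = backend
        self.attr_cache = {}

    def open_getattr(self, path):
        # The size returned with an open comes from the cache when present.
        if path not in self.attr_cache:
            self.attr_cache[path] = self.backend.sizes.get(path, 0)
        return self.attr_cache[path]

    def handle_upcall(self, path):
        # Invalidation: drop the cached entry so the next open refetches it.
        self.attr_cache.pop(path, None)


class DataServer:
    """Toy DS: writes go to the backend and queue an upcall for the MDS."""

    def __init__(self, backend, pending_upcalls):
        self.backend = backend
        self.pending_upcalls = pending_upcalls

    def write(self, path, nbytes):
        self.backend.sizes[path] = self.backend.sizes.get(path, 0) + nbytes
        self.pending_upcalls.append(path)


def size_seen_by_second_open(deliver_upcall_before_open):
    backend = Backend()
    pending = []
    mds = MetadataServer(backend)
    ds = DataServer(backend, pending)

    mds.open_getattr("testfile")      # first open: MDS caches size 0
    ds.write("testfile", 64 * 1024)   # I/O bypasses the MDS and goes to the DS

    if deliver_upcall_before_open:
        while pending:
            mds.handle_upcall(pending.pop(0))

    return mds.open_getattr("testfile")  # open issued for the next read


racy_size = size_seen_by_second_open(deliver_upcall_before_open=False)
ordered_size = size_seen_by_second_open(deliver_upcall_before_open=True)
print("open before upcall:", racy_size)
print("open after upcall:", ordered_size)
```

In this model, the second open sees a size of 0 when it arrives before the invalidation and 65536 when it arrives after, which mirrors the stale getattr reply observed in the report.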

Comment 5 Kaleb KEITHLEY 2020-06-24 11:14:42 UTC
If this is still an issue, please open an issue in the GitHub tracker at https://github.com/nfs-ganesha/nfs-ganesha/issues

