Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
For bugs related to Red Hat Enterprise Linux 5 product line. The current stable release is 5.10. For Red Hat Enterprise Linux 6 and above, please visit Red Hat JIRA https://issues.redhat.com/secure/CreateIssue!default.jspa?pid=12332745 to report new issues.

Bug 445560

Summary: GFS2: fstest using fsync() after each write much slower than open O_SYNC or using fdatasync()
Product: Red Hat Enterprise Linux 5 Reporter: Scott Crenshaw <crenshaw>
Component: kernelAssignee: Steve Whitehouse <swhiteho>
Status: CLOSED NOTABUG QA Contact: GFS Bugs <gfs-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 5.3CC: cluster-maint, edamato
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-07-09 13:51:58 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
fstest - test program to simulate sync writes for messaging loads
none
Updated version where -t 1 doesn't spin a thread so tracing is easier none

Description Rob Kenna 2008-05-07 16:08:38 UTC
For some reason a call to fsync() after a write is appreciably slower than
either opening a file O_SYNC or using the fdatasync() call.  ext3 and gfs1 do
not exhibit this behavior.  GFS2 is actually slower than GFS1 in this case.

Use the following to reproduce
// "-S 1" means sync on every write.
// "-A" means buffered (i.e. not open O_SYNC)
fstest -b 1k -t 1 -s 30m -A -S 1 

Compare to 
fstest -b 1k -t 1 -s 30m -A -d // fdatasync()  or
fstest -b 1k -t 1 -s 30m       // open O_SYNC

fstest.c program attached.

Comment 1 Rob Kenna 2008-05-07 16:08:38 UTC
Created attachment 304777 [details]
fstest - test program to simulate sync writes for messaging loads

Comment 2 Rob Kenna 2008-05-07 18:29:18 UTC
Here's a trace of the program.  It has the expected pattern of calls

open("test.MRMmBj", O_WRONLY)           = 3
lseek(3, 0, SEEK_SET)                   = 0
time(NULL)                              = 1210184262
gettimeofday({1210184262, 558839}, NULL) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
fsync(3)                                = 0
gettimeofday({1210184262, 559377}, NULL) = 0
gettimeofday({1210184262, 559442}, NULL) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
fsync(3)                                = 0
gettimeofday({1210184262, 559984}, NULL) = 0
gettimeofday({1210184262, 560049}, NULL) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
fsync(3)                                = 0
gettimeofday({1210184262, 560987}, NULL) = 0
gettimeofday({1210184262, 561054}, NULL) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
fsync(3)                                = 0
gettimeofday({1210184262, 561822}, NULL) = 0
close(3)                                = 0


VERSUS using "-d" in the test program and getting fdatasync() calls.
open("test.fcYQV1", O_WRONLY)           = 3
lseek(3, 0, SEEK_SET)                   = 0
time(NULL)                              = 1210184897
gettimeofday({1210184897, 828763}, NULL) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
fdatasync(3)                            = 0
gettimeofday({1210184897, 829359}, NULL) = 0
gettimeofday({1210184897, 829426}, NULL) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
fdatasync(3)                            = 0
gettimeofday({1210184897, 829977}, NULL) = 0
gettimeofday({1210184897, 830057}, NULL) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
fdatasync(3)                            = 0
gettimeofday({1210184897, 830678}, NULL) = 0
gettimeofday({1210184897, 830743}, NULL) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
fdatasync(3)                            = 0
gettimeofday({1210184897, 831429}, NULL) = 0
fsync(3)                                = 0
close(3)                                = 0


Comment 3 Rob Kenna 2008-05-07 18:31:51 UTC
Created attachment 304795 [details]
Updated version where -t 1 doesn't spin a thread so tracing is easier

This version won't spin a thread when "-t 1" is supplied.  This makes tracing
easier (using strace.)

Comment 4 Rob Kenna 2008-05-07 18:37:54 UTC
doing fsync() versus open O_SYNC is about 3-4 times slower (600-700 KBs vs. 2 MBs)

Comment 6 Steve Whitehouse 2008-07-02 15:19:01 UTC
Its not a great surprise that fsync() is slower than fdatasync() for GFS2. Also
I believe that O_SYNC is basically the same as fdatasync().

GFS1 is very similar to ext3. They both use the sync_inode method, now frowned
upon by upstream. The main difference between this and GFS2 is that in GFS2 we
sync the metadata even for I_DIRTY_SYNC, which I guess we could change, but
really I'm not sure its that much of a big deal. If someone has called fsync,
then I think we can reasonably assume that they want all data stable on disk,
even the minor inode changes. If they didn't, then they'd have called
fdatasync() instead.

So GFS2 does what the man page for fsync says, whereas GFS and ext3 ignore minor
changes (i.e. I_DIRTY_SYNC changes) which probably explains the difference in
performance.

So I don't think there is anything really to change at this stage, and I'm
intending to close this as NOTABUG shortly.