Bug 445560
| Summary: | GFS2: fstest using fsync() after each write much slower than open O_SYNC or using fdatasync() | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Scott Crenshaw <crenshaw> | ||||||
| Component: | kernel | Assignee: | Steve Whitehouse <swhiteho> | ||||||
| Status: | CLOSED NOTABUG | QA Contact: | GFS Bugs <gfs-bugs> | ||||||
| Severity: | medium | Docs Contact: | |||||||
| Priority: | medium | ||||||||
| Version: | 5.3 | CC: | cluster-maint, edamato | ||||||
| Target Milestone: | rc | ||||||||
| Target Release: | --- | ||||||||
| Hardware: | All | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2008-07-09 13:51:58 UTC | Type: | --- | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Attachments: |
|
||||||||
|
Description
Rob Kenna
2008-05-07 16:08:38 UTC
Created attachment 304777 [details]
fstest - test program to simulate sync writes for messaging loads
Here's a trace of the program. It has the expected pattern of calls
open("test.MRMmBj", O_WRONLY) = 3
lseek(3, 0, SEEK_SET) = 0
time(NULL) = 1210184262
gettimeofday({1210184262, 558839}, NULL) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
fsync(3) = 0
gettimeofday({1210184262, 559377}, NULL) = 0
gettimeofday({1210184262, 559442}, NULL) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
fsync(3) = 0
gettimeofday({1210184262, 559984}, NULL) = 0
gettimeofday({1210184262, 560049}, NULL) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
fsync(3) = 0
gettimeofday({1210184262, 560987}, NULL) = 0
gettimeofday({1210184262, 561054}, NULL) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
fsync(3) = 0
gettimeofday({1210184262, 561822}, NULL) = 0
close(3) = 0
VERSUS using "-d" in the test program and getting fdatasync() calls.
open("test.fcYQV1", O_WRONLY) = 3
lseek(3, 0, SEEK_SET) = 0
time(NULL) = 1210184897
gettimeofday({1210184897, 828763}, NULL) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
fdatasync(3) = 0
gettimeofday({1210184897, 829359}, NULL) = 0
gettimeofday({1210184897, 829426}, NULL) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
fdatasync(3) = 0
gettimeofday({1210184897, 829977}, NULL) = 0
gettimeofday({1210184897, 830057}, NULL) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
fdatasync(3) = 0
gettimeofday({1210184897, 830678}, NULL) = 0
gettimeofday({1210184897, 830743}, NULL) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
fdatasync(3) = 0
gettimeofday({1210184897, 831429}, NULL) = 0
fsync(3) = 0
close(3) = 0
Created attachment 304795 [details]
Updated version where -t 1 doesn't spin a thread so tracing is easier
This version won't spin a thread when "-t 1" is supplied. This makes tracing
easier (using strace.)
doing fsync() versus open O_SYNC is about 3-4 times slower (600-700 KBs vs. 2 MBs) Its not a great surprise that fsync() is slower than fdatasync() for GFS2. Also I believe that O_SYNC is basically the same as fdatasync(). GFS1 is very similar to ext3. They both use the sync_inode method, now frowned upon by upstream. The main difference between this and GFS2 is that in GFS2 we sync the metadata even for I_DIRTY_SYNC, which I guess we could change, but really I'm not sure its that much of a big deal. If someone has called fsync, then I think we can reasonably assume that they want all data stable on disk, even the minor inode changes. If they didn't, then they'd have called fdatasync() instead. So GFS2 does what the man page for fsync says, whereas GFS and ext3 ignore minor changes (i.e. I_DIRTY_SYNC changes) which probably explains the difference in performance. So I don't think there is anything really to change at this stage, and I'm intending to close this as NOTABUG shortly. |