Red Hat Bugzilla – Bug 248405
the open(2) man page lists an incorrect requirement on O_DIRECT buffer alignment
Last modified: 2013-04-12 15:14:30 EDT
Description of problem:
O_DIRECT I/O must be allowed to buffers aligned on a 512-byte boundary, and must
allow an I/O size that is a multiple of 512 bytes. This works properly on every
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Run the attached reproducer.
Created attachment 159344
On s390 the I/O size is not 512 bytes but 4096 bytes! Therefore this testcase
does produce an error (due to the check in fs/direct-io.c: __blockdev_direct_IO()).
Reading 512 bytes with direct-IO is therefore not supported on s390... rejecting.
I discussed this with Zach Brown, and he said that this is an ABI that the
kernel provides to userspace. Alignment on 512 byte boundaries must be
supported, so this is a bug. Surely you can read in 4k and zero out the
remainder of the block? It's not optimal, but should only be required for the
first/last partial blocks.
It is not correct that the direct-IO blocksize is always 512 bytes. It happens
to be 512 bytes on other architectures, but on s390 the dasd device blocksize
has always been 4K. Direct-IO must work with the blocksize of the device!
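To make the point above concrete, here is a minimal user-space sketch of the kind of alignment test being described. It is not the kernel's code; the function name dio_request_is_aligned and its parameters are made up for illustration, and the block size is assumed to be supplied by the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not a kernel or libc API: returns true when the
 * buffer address, file offset and transfer length are all multiples of
 * the block size that the device requires for direct I/O.
 */
bool dio_request_is_aligned(const void *buf, off_t offset, size_t len,
                            size_t blksz)
{
    if (blksz == 0 || offset < 0)
        return false;

    return (uintptr_t)buf % blksz == 0 &&
           (uint64_t)offset % blksz == 0 &&
           len % blksz == 0;
}
```

With blksz = 512 a 512-byte read from offset 0 passes this test; with blksz = 4096, as on an s390 dasd, the same request fails it, which matches the error the reproducer hits.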
Apparently the man page for open is wrong:
Under Linux 2.4 transfer sizes, and the alignment of user buffer and file offset
must all be multiples of the logical block size of the file system. Under Linux
2.6 alignment to 512-byte boundaries suffices.
I asked around, and the man page does seem to be wrong.
The change from 2.4 wasn't to force 512B as a standard minimum alignment, but to
let the app operate at disk sector alignment instead of fs block alignment. I
was confused and misled Jeff.
So apps using O_DIRECT are, supposedly, expected to figure out the sector size
of the underlying block device and align accordingly. The code seems to match
this, working with bdev_hardsect_size() when a block device for the inode is
Thanks for the update, Zach. Sorry to waste your time, Jan!
We should update the man page, but I don't know how this could be done, maybe you
open a bugzilla against the man page :)
(In reply to comment #7)
> We should update the man page, but I don't know how this could be done, maybe you
> open a bugzilla against the man page :)
Actually, I don't see that verbiage in the read(2) or write(2) man pages on my
RHEL 5 system. At which man page are you looking?
It's the open man page (man 2 open); look for O_DIRECT (formatting broken by the paste):
Try to minimize cache effects of the I/O to and from this
file. In general this will degrade performance, but it is useful
in special situations, such as when applications do their
own caching. File I/O is done directly to/from user space
buffers. The I/O is synchronous, i.e., at the completion of a read(2)
or write(2), data is guaranteed to have been transferred.
Under Linux 2.4 transfer sizes, and the alignment of user buffer
and file offset must all be multiples of the logical block size
of the file system. Under Linux 2.6 alignment to 512-byte
The last sentence is wrong. For 2.6 the alignment must fit the block size
of the device.
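For applications that need to follow this rule, one possible approach is sketched below: ask the block device for its logical sector size with the BLKSSZGET ioctl, then allocate the I/O buffer with posix_memalign(). The helper names query_sector_size and alloc_dio_buffer are invented for this example, and the sketch assumes the caller already holds a descriptor for the block device itself; finding the device behind a regular file is not covered.

```c
#include <assert.h>
#include <linux/fs.h>     /* BLKSSZGET */
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: return the logical sector size of the block
 * device open on fd, or -1 if fd is not a block device or the ioctl
 * fails.
 */
int query_sector_size(int fd)
{
    struct stat st;
    int size = 0;

    if (fstat(fd, &st) != 0 || !S_ISBLK(st.st_mode))
        return -1;
    if (ioctl(fd, BLKSSZGET, &size) != 0)
        return -1;
    return size;
}

/*
 * Hypothetical helper: allocate a buffer whose address is aligned to
 * `align` and whose length is `len` rounded up to a multiple of `align`.
 * `align` must be a power of two (512 and 4096 both qualify).
 * The rounded length is stored in *rounded_len when it is non-NULL.
 * Returns NULL on allocation failure; release the buffer with free().
 */
void *alloc_dio_buffer(size_t align, size_t len, size_t *rounded_len)
{
    void *buf = NULL;
    size_t padded;

    assert(align != 0 && (align & (align - 1)) == 0);

    padded = (len + align - 1) / align * align;
    if (posix_memalign(&buf, align, padded) != 0)
        return NULL;
    if (rounded_len != NULL)
        *rounded_len = padded;
    return buf;
}
```

On a device with 4096-byte sectors, a request for 512 bytes would be rounded up to a 4096-byte buffer here, and the caller would then read the whole sector and use only the part it needs.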
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.