Red Hat Bugzilla – Bug 101938
C write fails for records gt 2 GB
Last modified: 2007-11-30 17:06:57 EST
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; Q312461)
Description of problem:
If the block size in
write_status = write(fd, bufio, (size_t)blocksize);
exceeds 2 GB, write_status is not equal to blocksize;
it is blocksize modulo 2 GB.
There is an error in the ext2/ext3 file handling.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Have a C program that can write records larger than 2 GB.
2. Execute this program.
Actual Results: time ./rio -s 10000m -b 2500m -t -d ..
rio version = 2.24
filesize = 10000 Megabytes
blocksize = 2500 Megabytes
number of blocks = 4
number of loops = 1
directory = ..
filename = rio_04473
pattern = 8 (traditional "rio" pattern)
alignment = 8
offset = 0
tape mode : selected, sequential access only.
read-only mode : not selected, write and read will be performed.
2003-08-08 13:42:27 start sequential write
Unexpected short write at block 0.
requested blocksize=2621440000, actually written=-1
2.247u 14.271s 1:19.77 20.6% 0+0k 0+0io 6335pf+0w
Expected Results: The program continues without problems.
If you do not have a program that can write records > 2 GB,
please tell me how I can get our test program to you.
Created attachment 93515 [details]
Source program that writes records > 2 GB
Expand the attachment on your IA64 system, then:
make -f Makefile.linux                  # create the executable
time ./rio -s 10000m -b 2500m -t -d ..  # this will create a file of
approx. 10 GB, if the program does not fail
Arjan, this is a kernel bug, nothing to do with glibc.
write is passed directly to the kernel:
write(3, "RioMagic", 2621440000) = -1673527296
(this was on dolly; double-checked that strace prints the rax value as an 8-byte
integer, not 4 bytes).
-rw-r--r-- 1 root root 2621440000 Aug 10 17:35 rio_20606
and then returned (int)2621440000.
Ulrich said he saw one of davidm's patches fixing this in the 2.5.xx kernels, but he
cannot find it anymore.
Just to show some of the bugs:

ext3_file_write(struct file *file, const char *buf, size_t count, loff_t *ppos)
	int ret, err;
	struct inode *inode = file->f_dentry->d_inode;

	ret = generic_file_write(file, buf, count, ppos);

	/* Skip file flushing code if there was an error, or if nothing
	   was written. */
	if (ret <= 0)

(ret must be ssize_t, not int).

In mm/filemap.c there are many more:

generic_file_write(struct file *file, const char *buf, size_t count, loff_t *ppos)
	err = do_generic_file_write(file, buf, count, ppos);

do_generic_file_write(struct file *file, const char *buf, size_t count, loff_t *ppos)
	err = written ? written : status;

(quoted just two to illustrate).
[PATCH] Write with buffer>2GB returns broken errno (2)
[ Acked by AKPM --RR ]
From: Kazuto MIYOSHI <firstname.lastname@example.org>
On 64-bit platforms, issuing write(2) with a buffer larger than
2 GB will return -1 and a broken errno (such as 2147483640).
The requested data itself is written correctly.
That is because generic_file_write() and other related functions
store 'ssize_t written' into 'int err'. The written byte count is truncated to
int and then sign-extended to a negative ssize_t value, which
wrongly indicates an error.
(On 64-bit platforms, current glibc defines SSIZE_MAX as LONG_MAX.)
Note that this deals only with mm/filemap.c (which has probably changed a lot between
2.4 and 2.5); for the 2.4.2x kernels, at least all the important filesystems certainly
need to be audited.
Closing MODIFIED bugs as fixed. Please reopen if the problem persists.
An errata has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.