Bug 101938 - C write fails for records gt 2 GB
Summary: C write fails for records gt 2 GB
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: ia64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jeff Moyer
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2003-08-08 12:00 UTC by Winfrid Tschiedel
Modified: 2007-11-30 22:06 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-10-11 15:23:58 UTC
Target Upstream Version:
Embargoed:


Attachments
Source Program which write records > 2 GB (26.73 KB, text/plain)
2003-08-08 12:07 UTC, Winfrid Tschiedel


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2004:017 0 normal SHIPPED_LIVE Important: Updated kernel packages available for Red Hat Enterprise Linux 3 Update 1 2004-01-13 05:00:00 UTC

Description Winfrid Tschiedel 2003-08-08 12:00:47 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; Q312461)

Description of problem:
If the block size passed in
write_status = write(fd, bufio, (size_t)blocksize);
exceeds 2 GB, write_status is not equal to blocksize;
it is blocksize modulo 2 GB.

There is an error in the ext2/ext3 file handling.
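For reference, a minimal sketch of how a caller can check the result of write() without narrowing it. The helper name write_fully is hypothetical and is not taken from rio; it keeps the return value in an ssize_t and retries on short writes. It does not work around the kernel bug reported here: with a bogus negative return value it would simply report an error.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of rio): write exactly len bytes.
 * Returns 0 on success, -1 on error with errno describing the failure. */
static int write_fully(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);

    while (len > 0) {
        ssize_t n = write(fd, p, len);  /* keep the result in ssize_t, never int */

        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted before any data was written */
            return -1;
        }
        if (n == 0) {                   /* assumption: treat "no progress" as an error */
            errno = EIO;
            return -1;
        }
        p   += (size_t)n;               /* short write: advance and try again */
        len -= (size_t)n;
    }
    return 0;
}
```

A call such as write_fully(fd, bufio, (size_t)blocksize) would replace the single write() above. Current Linux kernels transfer at most 0x7ffff000 bytes per write() call, so a loop of this kind is needed for records of this size in any case.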

Version-Release number of selected component (if applicable):
2.4.21-1.1931.2.382.ent

How reproducible:
Always

Steps to Reproduce:
1. Have a C program that can write records larger than 2 GB.
2. Execute this program.

Actual Results:  time ./rio -s 10000m  -b 2500m -t -d ..
rio version      = 2.24
filesize         = 10000 Megabytes
blocksize        = 2500 Megabytes
number of blocks = 4
number of loops  = 1
directory        = ..
filename         = rio_04473
pattern          = 8 (traditional "rio" pattern)
alignment        = 8
offset           = 0
tape mode        : selected, sequential access only.
read-only mode   : not selected, write and read will be performed.
2003-08-08 13:42:27 start sequential write
Unexpected short write at block 0.
requested blocksize=2621440000, actually written=-1
2.247u 14.271s 1:19.77 20.6%    0+0k 0+0io 6335pf+0w

Expected Results:  Program continues without problems.

Additional info:

If you do not have a program that can write records > 2 GB,
please tell me how I can pass our test program to you.

Comment 1 Winfrid Tschiedel 2003-08-08 12:07:32 UTC
Created attachment 93515
Source Program which write records > 2 GB

Expand the attachment on your IA64 system, then:
cd rio-2.24
make -f Makefile.linux	# create the executable
time ./rio -s 10000m -b 2500m -t -d ..	# creates a file of approx. 10 GB if the program does not fail

Comment 2 Jakub Jelinek 2003-08-10 21:49:26 UTC
Arjan, this is a kernel bug, nothing to do with glibc.
write is passed directly to the kernel:
write(3, "RioMagic", 2621440000)        = -1673527296
(this was on dolly, double checked that strace prints rax value as 8 byte
integer, not 4 byte).
It wrote:
-rw-r--r--    1 root     root     2621440000 Aug 10 17:35 rio_20606
and then returned (int)2621440000.
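The arithmetic behind that return value can be checked with a small sketch. The helper name through_int32 is hypothetical; it keeps the low 32 bits of a byte count and sign-extends them back to 64 bits, which is what storing the count in an int and returning it as a 64-bit value amounts to. 2621440000 - 4294967296 = -1673527296, matching the strace output.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the value a 64-bit byte count turns into after being
 * squeezed through a 32-bit signed int and widened back to 64 bits.
 * Written with unsigned arithmetic so the result is fully defined by ISO C. */
static int64_t through_int32(int64_t count)
{
    uint32_t low;
    int64_t widened;

    assert(count >= 0);                     /* byte counts are never negative */

    low = (uint32_t)((uint64_t)count & 0xFFFFFFFFu);  /* keep the low 32 bits */
    widened = (int64_t)low;
    if (low >= 0x80000000u)                 /* bit 31 set: negative as an int */
        widened -= INT64_C(4294967296);     /* sign-extend: subtract 2^32 */
    return widened;
}
```

Counts below 2 GB pass through unchanged, which is why the problem only shows up for records of 2 GB and larger.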

Ulrich said he saw a patch from davidm fixing this in 2.5.xx kernels, but he
cannot find it anymore.

Comment 3 Jakub Jelinek 2003-08-10 22:00:37 UTC
Just to show some bugs:
fs/ext3/file.c:
static ssize_t
ext3_file_write(struct file *file, const char *buf, size_t count, loff_t *ppos)
{
        int ret, err;
        struct inode *inode = file->f_dentry->d_inode;

        ret = generic_file_write(file, buf, count, ppos);

        /* Skip file flushing code if there was an error, or if nothing
           was written. */
        if (ret <= 0)
                return ret;
(ret must not be int; it has to be ssize_t.)
In mm/filemap.c there are many more:
ssize_t
generic_file_write(struct file *file,const char *buf,size_t count, loff_t *ppos)
{
...
        int             err;
...
                err = do_generic_file_write(file, buf, count, ppos);
...
        return err;
}

ssize_t
do_generic_file_write(struct file *file,const char *buf,size_t count, loff_t *ppos)
{
...
        ssize_t         written;
...
        int             err;
...
        err = written ? written : status;
out:
        return err;
}

(I wrote out just two to illustrate.)
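The pattern in these excerpts can be reproduced in user space with a rough sketch. All names below (fake_do_write, wrapper_narrow, wrapper_wide) are hypothetical stand-ins, not kernel code: int64_t plays the role of ssize_t on a 64-bit platform and int32_t the role of int. The narrowing conversion is implementation-defined in ISO C; gcc and clang reduce the value modulo 2^32, which is the behaviour assumed here.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the low-level writer: pretend every byte was written. */
static int64_t fake_do_write(uint64_t count)
{
    assert(count <= (uint64_t)INT64_MAX);
    return (int64_t)count;
}

/* Buggy shape: the byte count is parked in a 32-bit signed temporary,
 * like "int err" / "int ret" in the excerpts above. */
static int64_t wrapper_narrow(uint64_t count)
{
    int32_t err = (int32_t)fake_do_write(count);  /* upper 32 bits are lost */
    return err;                                   /* sign-extended on return */
}

/* Fixed shape: the temporary has the same width as the return type. */
static int64_t wrapper_wide(uint64_t count)
{
    int64_t ret = fake_do_write(count);
    return ret;
}
```

With count = 2621440000, wrapper_narrow yields a negative value that a caller takes for an error, while wrapper_wide returns the count unchanged.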

Comment 4 Jakub Jelinek 2003-08-10 22:16:44 UTC
<rusty.au>
        [PATCH] Write with buffer>2GB returns broken errno (2)

        [ Acked by AKPM --RR ]
        From:  Kazuto MIYOSHI <miyoshi.fc.nec.co.jp>

          On 64-bit platforms, issuing write(2) with buffer larger than
          2GB will return -1 and broken errno (such as 2147483640)
          Requested data itself is written correctly.

          That is because generic_file_write() and other relating functions
          store 'ssize_t written' into 'int err'. Written byte is trimmed to
          int and then sign-extended to a negative ssize_t value, which
          wrongly indicates an error.

          (On 64bit platform, current glibc defines SSIZE_MAX as 'LONG_MAX')

http://linux.bkbits.net:8080/linux-2.5/cset@1.889.108.43?nav=index.html|tags|ChangeSet@..1.889.120.3

Note that this deals only with mm/filemap.c (which has probably changed a lot between
2.4 and 2.5); for 2.4.2x kernels, at least all the important filesystems certainly
need to be audited.

Comment 5 Bill Nottingham 2004-10-11 15:23:58 UTC
Closing MODIFIED bugs as fixed. Please reopen if the problem persists.

Comment 6 Ernie Petrides 2004-12-03 01:30:10 UTC
An errata has been issued which should help the problem 
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2004-017.html


