Bug 1122595 - istream::readsome() fails on large files
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: gcc
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Assignee: Jakub Jelinek
QA Contact: qe-baseos-tools
Depends On:
Blocks: 1298243 1413146 1420851
Reported: 2014-07-23 14:53 UTC by Ross Miller
Modified: 2017-06-22 18:03 UTC

Fixed In Version:
Doc Text:
Clone Of:
Last Closed: 2017-06-22 18:03:49 UTC

Attachments
Source code for short program to demonstrate the problem (1.45 KB, text/plain)
2014-07-23 14:53 UTC, Ross Miller

Description Ross Miller 2014-07-23 14:53:23 UTC
Created attachment 920248
Source code for short program to demonstrate the problem

Description of problem:
Using the istream::readsome function to read a file will fail if the file is larger than 4GB.

Version-Release number of selected component (if applicable):

How reproducible: Always

Steps to Reproduce:
1. Open a file stream with ifstream::open (for a file > 4GB)
2. Call readsome() repeatedly on that stream

Actual results:
readsome() will only read the first (filesize modulo 4GB) bytes.  After that, it will always return 0 (meaning no bytes were read).

Expected results:
Entire file is read

Additional info:

I'm not sure if this is a problem with libstdc++ or the kernel (or both).  Going by the output of strace, the readsome function (or more likely, one of the functions it calls) uses the FIONREAD ioctl to determine how many bytes are available.  FIONREAD, however, returns a 32-bit value.  Obviously, this is a problem for files larger than 4GB.
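
To make the arithmetic concrete, here is a simplified model of the behaviour described above. It assumes the "bytes available" count is reduced modulo 2^32 and that a reader stops as soon as that count is 0. The function names are hypothetical, and this is not the libstdc++ or kernel code.

```cpp
#include <algorithm>
#include <cstdint>

// Model of a 64-bit remaining-byte count squeezed into 32 bits.
std::uint32_t truncated_available(std::uint64_t remaining)
{
    return static_cast<std::uint32_t>(remaining);  // keeps only the low 32 bits
}

// Simulate a reader that consumes at most `chunk` bytes per call and gives up
// when the truncated count reports 0 bytes available.
std::uint64_t bytes_before_stall(std::uint64_t file_size,
                                 std::uint32_t chunk = 1u << 20)
{
    std::uint64_t remaining = file_size;
    std::uint64_t total = 0;
    for (;;) {
        std::uint32_t avail = truncated_available(remaining);
        if (avail == 0)
            break;  // looks like "no data", even if remaining is a multiple of 4GB
        std::uint32_t n = std::min(chunk, avail);
        remaining -= n;
        total += n;
    }
    return total;
}
```

Under this model a 5GB file stalls after 1GB, with exactly 4GB still unread, and a file of exactly 4GB yields nothing at all. That matches the "filesize modulo 4GB" observation.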

The attached source code compiles into a util that demonstrates the problem.

Comment 2 Ross Miller 2014-07-23 15:05:13 UTC
    The following source code compiles into a util that will demonstrate the issue with the FIONREAD ioctl.  It will read the entire file, and you can see the ioctl's output overflowing every 4GB.


    // This is a quick test to try to narrow down a possible bug in
    // ifstream::readsome()

    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>

    #include <sys/ioctl.h>

    int main( int argc, char **argv)
    {
        if (argc != 2) {
            fprintf( stderr, "Usage: %s <test_file>\n", argv[0]);
            return 1;
        }

        int fd = open(argv[1], O_RDONLY);
        if ( fd == -1) {
            fprintf( stderr, "Failed to open %s\n", argv[1]);
            return 2;
        }

        const unsigned bufLen = 1024 * 1024;  // 1MB reads
        char *buf = (char *)malloc( bufLen);

        ssize_t bytesRead;
        unsigned long long bytesAvail;
        fprintf( stderr, "*** Size of 'bytesAvail': %zu ***\n", sizeof(bytesAvail));
        do {
            // Ask how many bytes are available before each read
            if (ioctl( fd, FIONREAD, &bytesAvail) != 0) {
                perror( "IOCTL error");
            } else {
                fprintf(stderr, "Bytes Available: %llu (%f MB)\n", bytesAvail, bytesAvail/(float)(1024*1024));
            }

            bytesRead = read( fd, buf, bufLen);
        } while (bytesRead > 0);

        if (bytesRead == -1) {
            perror( "Error read from file");
        } else {
            fprintf( stderr, "Successfully read %s\n", argv[1]);
        }

        free( buf);
        return 0;
    }

Comment 3 Marek Polacek 2014-07-23 17:01:21 UTC
gcc-libraries package doesn't contain libstdc++.  Moving to gcc component.

Comment 6 Jakub Jelinek 2014-09-29 16:28:27 UTC
The FIONREAD ioctl (at least on Linux) writes an int, so it silently truncates all the upper bits.
Therefore, the #c2 testcase is wrong: passing the address of a long long there is invalid.
If it is fine if the showmanyc() method will sometimes return smaller number of bytes than actually available, perhaps libstdc++ could keep using the FIONREAD ioctl, but just not trust it if it stored there 0 and sizeof(int) < sizeof(std::streamsize), because in that case it might return 0 even when there are bytes available.
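
A rough sketch of that idea, for illustration only. This is not the libstdc++ implementation or an actual patch. The helper name available_bytes is hypothetical, and the fstat()/lseek() fallback for regular files is an assumption about what "not trusting" a stored 0 could look like.

```cpp
#include <ios>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: estimate how many bytes can be read from fd.
std::streamsize available_bytes(int fd)
{
    int num = 0;  // FIONREAD stores an int on Linux, so pass an int
    if (ioctl(fd, FIONREAD, &num) == 0 && num > 0)
        return num;  // a positive count is usable as a lower bound

    // A stored 0 (or a negative value) is ambiguous when int is narrower than
    // std::streamsize, so for regular files ask fstat() and lseek() instead.
    // Assumes a 64-bit off_t (e.g. a 64-bit target or _FILE_OFFSET_BITS=64).
    struct stat st;
    if (fstat(fd, &st) == 0 && S_ISREG(st.st_mode)) {
        off_t pos = lseek(fd, 0, SEEK_CUR);
        if (pos != -1 && st.st_size > pos)
            return static_cast<std::streamsize>(st.st_size - pos);
    }
    return 0;  // nothing known to be available
}
```

With this shape, a truncated positive value is still returned as-is, which may under-report the number of bytes available but never over-reports it.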

Comment 8 Jonathan Wakely 2014-09-29 16:48:19 UTC
(In reply to Jakub Jelinek from comment #6)
> If it is fine if the showmanyc() method will sometimes return smaller number
> of bytes than actually available,

That is fine. The value returned by showmanyc is a lower bound:

  Returns: An estimate of the number of characters available in the sequence,
  or -1. If it returns a positive value, then successive calls to underflow()
  will not return traits::eof() until at least that number of characters have
  been extracted from the stream. If showmanyc() returns -1, then calls to
  underflow() or uflow() will fail.

It's even conforming to return 0 when there are bytes available, although doing so is not very useful.

Comment 17 Chris Williams 2017-06-22 18:03:49 UTC
Closing this BZ as the support case is closed which means we need to focus on more critical customer BZs. If this RFE is still important please open a new case via the Red Hat Customer Portal, access.redhat.com and ask that this BZ be re-opened.
