Bug 162094 - read() with count > 0xffffffff panics kernel at fs/direct-io.c:886
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: ia64 Linux
Priority: medium
Severity: medium
Assigned To: Peter Staubach
QA Contact: Brian Brock
Depends On:
Blocks: 168429
Reported: 2005-06-29 16:03 EDT by David Milburn
Modified: 2010-10-21 23:07 EDT

Fixed In Version: RHSA-2006-0132
Doc Type: Bug Fix
Last Closed: 2006-03-07 14:13:41 EST

Attachments
Program to reproduce the bug (820 bytes, text/plain)
2005-06-29 16:04 EDT, David Milburn
Patch to fix (5.77 KB, patch)
2005-06-29 16:05 EDT, David Milburn
Proposed patch (2.19 KB, patch)
2005-08-26 10:14 EDT, Peter Staubach
Proposed patch (2.19 KB, patch)
2005-10-10 11:23 EDT, Peter Staubach

Description David Milburn 2005-06-29 16:03:03 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6) Gecko/20050302 Firefox/1.0.1 Fedora/1.0.1-1.3.2

Description of problem:
Using read() system call with large count (> 0xffffffff) against raw
device (or block device file that is opened with O_DIRECT) causes
kernel panic on RHEL4 with the following message:

   kernel BUG at fs/direct-io.c:886!


Version-Release number of selected component (if applicable):
kernel-2.6.9-5.EL

How reproducible:
Always

Steps to Reproduce:
1. Edit reproduce.c with appropriate FILE_NAME and recompile
2. Execute the reproduce program

Actual Results:  kernel panics with the following message:

      kernel BUG at fs/direct-io.c:886!


Expected Results:  kernel should not panic


Additional info:

Customer developed fix based upon the following three patches from linux-2.6.11-rc3

http://lia64.bkbits.net:8080/linux-ia64-release-2.6.12/cset@41f6cf91c1R7rbuggBVQLxBuD7m6Aw
http://lia64.bkbits.net:8080/linux-ia64-release-2.6.12/cset@41f71cbbbAqnp67z79i7SSVQGtmQzg
http://lia64.bkbits.net:8080/linux-ia64-release-2.6.12/cset@42026b11ti7KiDM_DMvBv5ZQH_3yLw
Comment 1 David Milburn 2005-06-29 16:04:47 EDT
Created attachment 116144
Program to reproduce the bug
Comment 2 David Milburn 2005-06-29 16:05:37 EDT
Created attachment 116145
Patch to fix
Comment 4 Peter Staubach 2005-07-22 14:38:05 EDT
This situation occurs because an unsigned int is used to store the maximum
number of contiguous blocks that can be transferred at once.  When doing a
direct-io read on a block device, the size of the transfer is set to the
minimum of the size of the block device and the requested number of bytes.

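In the test case, the program tries to read 4GB, 0x100000000.  I used a 10G
partition, so the requested count is the smaller of the two values.
Therefore, the code tried to store 0x100000000 in an unsigned int.  This
value needs 33 bits, so it won't fit.  Only the low 32 bits are kept, and
they are all zero, so the int ends up zeroed out.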
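
The truncation described above can be shown in a few lines of user-space C.
This is only a sketch of the arithmetic, not the code in fs/direct-io.c.  The
function names xfer_size_u32 and xfer_size_u64 are made up for illustration,
and fixed-width types stand in for the kernel's unsigned int.

```c
#include <stdint.h>

/* Hypothetical stand-in for the transfer-size calculation: take the smaller
 * of the requested byte count and the device size, then store the result
 * in a 32-bit field, as an unsigned int would on this platform. */
uint32_t xfer_size_u32(uint64_t requested, uint64_t device_size)
{
    uint64_t n = requested < device_size ? requested : device_size;
    return (uint32_t)n;   /* the conversion keeps only the low 32 bits */
}

/* The same calculation with a field wide enough to hold the result. */
uint64_t xfer_size_u64(uint64_t requested, uint64_t device_size)
{
    return requested < device_size ? requested : device_size;
}
```

With requested = 0x100000000 and a 10G device, xfer_size_u32() returns 0
while xfer_size_u64() returns 0x100000000.  A request of 0xffffffff still
fits and is returned unchanged by both.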

This situation can be addressed either by limiting the read count size,
as the proposed patch does, or by handling the request as several smaller
requests inside of the kernel.  The advantage of the latter approach is
that the system call semantics are maintained, the application does not
need to be aware that it is dealing with a "file" with different
characteristics, and the file struct does not have to be modified.
Comment 5 Peter Staubach 2005-08-26 10:14:45 EDT
Created attachment 118154
Proposed patch
Comment 6 Peter Staubach 2005-08-26 10:27:11 EDT
The proposed patch breaks up the original, single iovec into multiple smaller
iovecs, each with a length that can be expressed using a 32-bit integer.  This
avoids the overflow that the current code suffers from.
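
The splitting idea can be sketched in user-space C as follows.  This is not
the attached patch.  The segment struct, the split_request function, and the
SEG_MAX cap are all assumptions made for illustration, and the real patch may
pick a different per-segment limit and operates on struct iovec.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed per-segment cap for this sketch; any value that fits in
 * 32 bits would demonstrate the same point. */
#define SEG_MAX 0x7ffff000u

/* Hypothetical description of one piece of the original request. */
struct segment {
    uint64_t offset;  /* byte offset into the original buffer */
    uint32_t len;     /* segment length, never larger than SEG_MAX */
};

/* Split a request of `total` bytes into segments of at most SEG_MAX bytes.
 * Fills at most `max` entries of `out` and returns the number of segments
 * the request needs, which may be larger than `max`. */
size_t split_request(uint64_t total, struct segment *out, size_t max)
{
    size_t n = 0;
    uint64_t off = 0;

    while (off < total) {
        uint64_t left = total - off;
        uint32_t len = left < SEG_MAX ? (uint32_t)left : SEG_MAX;

        if (n < max) {
            out[n].offset = off;
            out[n].len = len;
        }
        off += len;
        n++;
    }
    return n;
}
```

For the 0x100000000 byte read in the test case, this yields three segments
of 0x7ffff000, 0x7ffff000 and 0x2000 bytes, so no single length has to be
stored in more than 32 bits.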
Comment 13 Peter Staubach 2005-10-10 11:23:54 EDT
Created attachment 119775
Proposed patch
Comment 16 Peter Staubach 2005-10-11 08:32:30 EDT
I don't understand the question.  If it is about which symbol should be used
at the user level, then I don't actually know and will have to defer to
someone else with more experience in the kernel to user level symbol
translation.
Comment 24 Red Hat Bugzilla 2006-03-07 14:13:42 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0132.html
