Bug 759575 - C++: store crashes when creating large store files on 32-bit RHEL
Summary: C++: store crashes when creating large store files on 32-bit RHEL
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: Development
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: 2.2
: ---
Assignee: Kim van der Riet
QA Contact: Leonid Zhaldybin
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-12-02 18:06 UTC by Kim van der Riet
Modified: 2014-11-09 22:38 UTC (History)

Fixed In Version: 0.14
Doc Type: Bug Fix
Doc Text:
When creating a journal with file-size=32768 on 32-bit systems, the store fails with error 22 (Invalid argument). The reason is that on 32-bit systems, this size exceeds the maximum allowed by the kernel when the O_DIRECT flag is used. The error occurs only at this value, which is coincidentally the maximum value the store allowed. The problem was fixed by reducing the maximum allowed value to 32767.
Clone Of:
Environment:
Last Closed: 2012-12-07 17:43:02 UTC
Target Upstream Version:


Attachments
simple reproducer (1.93 KB, text/plain)
2011-12-02 20:56 UTC, Kim van der Riet

Description Kim van der Riet 2011-12-02 18:06:51 UTC
When the store is set up with --jfile-size 32768 on 32-bit RHEL, creating the store files fails with:

JERR_FCNTL_WRITE: Unable to write to file. (wr_size=2097152 errno=22 (Invalid argument))

This does not happen on 64-bit systems.

Setting to medium priority, as 32-bit OSs are not common for these large-file use-cases.

Comment 1 Kim van der Riet 2011-12-02 20:56:10 UTC
Created attachment 539772
simple reproducer

Simple reproducer.

Comment 2 Kim van der Riet 2011-12-02 20:59:59 UTC
The simple reproducer seems to show that this is a write-size limit that 32-bit kernels impose when the O_DIRECT flag is used.

The open(2) man page indicates that limits on file offsets are possible when O_DIRECT is in use. We could solve this particular problem by not using the O_DIRECT flag at all. As the file is only being formatted at this point, a regular file handle may suffice.
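
For readers without access to the attachment, the behaviour described in this comment could be probed with a sketch along the following lines. This is not the attached reproducer: the function name write_direct, the 4096-byte alignment, and the error mapping for short writes are assumptions made for illustration, and the code is Linux-only because it relies on O_DIRECT.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // exposes O_DIRECT in <fcntl.h> on Linux
#endif
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

// Hypothetical helper: writes `blocks` zero-filled blocks of `block_size`
// bytes to `path` using O_DIRECT.
// Returns 0 on success, otherwise the errno of the first failing call.
int write_direct(const char* path, std::size_t block_size, std::size_t blocks) {
    // Assumption: 4096-byte alignment is enough for O_DIRECT on this system.
    const std::size_t kAlign = 4096;
    assert(block_size > 0 && block_size % kAlign == 0);

    int fd = ::open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0644);
    if (fd < 0) {
        return errno;
    }

    void* buf = nullptr;
    int rc = ::posix_memalign(&buf, kAlign, block_size);
    if (rc != 0) {
        ::close(fd);
        return rc;  // posix_memalign returns the error number directly
    }
    std::memset(buf, 0, block_size);

    int err = 0;
    for (std::size_t i = 0; i < blocks; ++i) {
        ssize_t n = ::write(fd, buf, block_size);
        if (n < 0) {
            err = errno;  // e.g. 22 (EINVAL) in the failure reported here
            break;
        }
        if (static_cast<std::size_t>(n) != block_size) {
            err = EIO;  // treat a short write as an error in this sketch
            break;
        }
    }

    ::free(buf);
    ::close(fd);
    return err;
}
```

Calling write_direct with a block size of 2097152 and 1024 blocks would attempt to write 2 GiB in 2 MiB chunks, matching the wr_size in the error above. Whether it returns 22 depends on the kernel and filesystem in use.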

Comment 3 Kim van der Riet 2011-12-05 16:17:53 UTC
As this error occurs only at the maximum allowable value of the file size (jfile-size-pgs) store parameter, the most prudent fix is to reduce this limit by one, from 32768 to 32767.
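
The arithmetic behind the off-by-one can be sketched as follows. The 64 KiB page size is an assumption (it is not stated in this report, although it is consistent with the ~2GB per-file figure in comment 4), and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: one journal page is 64 KiB.
constexpr std::uint64_t kPageBytes = 64 * 1024;

// Size in bytes of a journal file made of `pages` pages.
constexpr std::uint64_t journal_file_bytes(std::uint32_t pages) {
    return static_cast<std::uint64_t>(pages) * kPageBytes;
}

// True if the whole file fits within a signed 32-bit offset (max 2^31 - 1).
constexpr bool fits_signed_32bit_offset(std::uint32_t pages) {
    return journal_file_bytes(pages) <= static_cast<std::uint64_t>(INT32_MAX);
}

// Checks the two values discussed in this bug.
void check_limits() {
    // 32768 pages is exactly 2^31 bytes, one byte more than INT32_MAX.
    assert(journal_file_bytes(32768) == 2147483648ULL);
    assert(!fits_signed_32bit_offset(32768));
    // 32767 pages stays one page below that boundary.
    assert(journal_file_bytes(32767) == 2147418112ULL);
    assert(fits_signed_32bit_offset(32767));
}
```

Under that assumption, 32768 is the only permitted value whose file size reaches 2^31 bytes, which would explain why the error appears at exactly this setting and nowhere below it.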

Fixed in r.4485.

NOTE: This may change the numbers for max file size in the Messaging User's Guide.

Comment 4 Kim van der Riet 2011-12-05 16:23:06 UTC
QE: to reproduce, run the following on 32-bit RHEL:

./qpidd --auth no --load-module /abs/path/to/msgstore.so --store-dir /tmp/store --jfile-size 32768 --num-jfiles 4 --log-enable info+

./qpid-perftest --count 10 --durable yes

This results in a failure in the broker: JERR_FCNTL_WRITE: Unable to write to file. (wr_size=2097152 errno=22 (Invalid argument))

With the fix, --jfile-size 32768 exceeds the new maximum by 1, so the broker reduces it to 32767 and logs a warning. The test then completes without an error.

As each journal file in this test is ~2GB, the formatting of the files will take some time - be patient.
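
The expected post-fix behaviour (an oversized value is reduced to the maximum and a warning is logged) can be pictured with a minimal sketch. It is not taken from the store's source: the names kMaxJfileSizePgs and clamp_jfile_size and the wording of the warning are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>

// New maximum from this report; the constant name is hypothetical.
constexpr std::uint32_t kMaxJfileSizePgs = 32767;

// Returns the journal file size to use. If `requested` is above the
// maximum, writes a warning to `log` and returns the maximum instead.
std::uint32_t clamp_jfile_size(std::uint32_t requested, std::ostream& log) {
    if (requested > kMaxJfileSizePgs) {
        // Illustrative message only; the broker's real log text may differ.
        log << "warning: jfile-size " << requested
            << " exceeds maximum; using " << kMaxJfileSizePgs << '\n';
        return kMaxJfileSizePgs;
    }
    return requested;
}

// Exercises the value used in the reproduction steps and one below it.
void check_clamp() {
    assert(clamp_jfile_size(32768, std::cerr) == 32767);
    assert(clamp_jfile_size(32767, std::cerr) == 32767);
}
```

When verifying, the thing to look for in the broker log is a warning of this general kind, followed by the journal files being created at the reduced size.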

Comment 5 Ted Ross 2012-03-29 20:00:06 UTC
This is in the 0.14 rebase

Comment 6 Leonid Zhaldybin 2012-04-17 08:39:50 UTC
CLOSED/CRELEASE -> ASSIGNED -> ON_QA
The defect has to go through the QA process.

Comment 7 Leonid Zhaldybin 2012-04-17 08:42:50 UTC
Tested on RHEL5.8 and RHEL6.2. This problem was fixed.
Packages used for testing:

RHEL5.8:
qpid-cpp-client-0.14-16.el5
qpid-cpp-client-devel-0.14-16.el5
qpid-cpp-client-devel-docs-0.14-16.el5
qpid-cpp-client-rdma-0.14-16.el5
qpid-cpp-client-ssl-0.14-16.el5
qpid-cpp-mrg-debuginfo-0.14-16.el5
qpid-cpp-server-0.14-16.el5
qpid-cpp-server-cluster-0.14-16.el5
qpid-cpp-server-devel-0.14-16.el5
qpid-cpp-server-rdma-0.14-16.el5
qpid-cpp-server-ssl-0.14-16.el5
qpid-cpp-server-store-0.14-16.el5
qpid-cpp-server-xml-0.14-16.el5
rh-qpid-cpp-tests-0.14-16.el5

RHEL6.2:
qpid-cpp-client-0.14-15.el6.i686
qpid-cpp-client-devel-0.14-15.el6.i686
qpid-cpp-client-devel-docs-0.14-15.el6.noarch
qpid-cpp-client-rdma-0.14-15.el6.i686
qpid-cpp-client-ssl-0.14-15.el6.i686
qpid-cpp-debuginfo-0.14-15.el6.i686
qpid-cpp-server-0.14-15.el6.i686
qpid-cpp-server-cluster-0.14-15.el6.i686
qpid-cpp-server-devel-0.14-15.el6.i686
qpid-cpp-server-rdma-0.14-15.el6.i686
qpid-cpp-server-ssl-0.14-15.el6.i686
qpid-cpp-server-store-0.14-15.el6.i686
qpid-cpp-server-xml-0.14-15.el6.i686
rh-qpid-cpp-tests-0.14-15.el6.i686

-> VERIFIED

Comment 8 Kim van der Riet 2012-08-23 15:57:05 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
When creating a journal with file-size=32768 on 32-bit systems, the store fails with error 22 (Invalid argument). The reason is that on 32-bit systems, this size exceeds the maximum allowed by the kernel when the O_DIRECT flag is used. The error occurs only at this value, which is coincidentally the maximum value the store allowed. The problem was fixed by reducing the maximum allowed value to 32767.

