Bug 510747 - Out of Bounds exception when sending large QMF response
Status: CLOSED ERRATA
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-qmf
Version: 1.1.1
Hardware: All Linux
Priority: urgent
Severity: urgent
Target Milestone: 1.3
Target Release: ---
Assigned To: Ted Ross
QA Contact: Jan Sarenik
Duplicates: 508145
Depends On:
Blocks:
Reported: 2009-07-10 11:03 EDT by Matthew Farrellee
Modified: 2011-08-12 12:02 EDT (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, a QMF method would exit with a segmentation fault when the result was larger than 64kB. With this update, this method works as expected, even for larger results.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2010-10-14 12:01:58 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---


Attachments
Verification scripts (24.61 KB, application/x-gzip)
2010-10-07 10:18 EDT, Jan Sarenik
Updated verification scripts (24.72 KB, application/x-gzip)
2010-10-08 04:28 EDT, Jan Sarenik
Fixed verification scripts (24.76 KB, application/x-gzip)
2010-10-08 10:26 EDT, Jan Sarenik

Description Matthew Farrellee 2009-07-10 11:03:50 EDT
Description of problem:

When trying to send a QMF method out argument with strings that grow to over 64K, the agent will throw an exception and crash.


Version-Release number of selected component (if applicable):

qmf-0.5.752600-5.fc10.i386
qpidc-0.5.752600-5.fc10.i386


How reproducible:

100%


Steps to Reproduce:
1. start condor_job_server with a queue containing jobs with long attribute values, e.g. a big environment
2. use qpid-tool to call the server's "GetJob" method
3. watch server crash
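
One way to prepare step 1 is to generate a submit description whose environment value alone exceeds 64K. The following is a minimal sketch, not the script used by the reporter: the helper name `big_environment_submit`, the variable name `BIG`, the `/bin/sleep` job, and the 70000-byte size are all assumptions, and it is not confirmed here that every Condor version accepts an environment value of this length.

```python
def big_environment_submit(value_size=70000, jobs=1):
    """Return the text of a Condor submit description with one very long
    environment value. All names and sizes here are hypothetical."""
    big_value = "x" * value_size  # a single attribute value larger than 64K
    lines = [
        "executable = /bin/sleep",
        "arguments = 60",
        'environment = "BIG=%s"' % big_value,
        "queue %d" % jobs,
    ]
    return "\n".join(lines) + "\n"
```

The returned text can be written to a file and passed to condor_submit before calling "GetJob" from qpid-tool.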

  
Actual results:

terminate called after throwing an instance of 'qpid::framing::OutOfBounds'
  what():  Out of Bounds
Stack dump for process 22429 at timestamp 1247238103 (21 frames)
./condor_job_server(dprintf_dump_stack+0xd0)[0x80ff64c]
./condor_job_server[0x80ff80a]
[0x276400]
/lib/libc.so.6(abort+0x188)[0x71fe28]
/usr/lib/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x158)[0x7eb3c48]
/usr/lib/libstdc++.so.6[0x7eb1b35]
/usr/lib/libstdc++.so.6[0x7eb1b72]
/usr/lib/libstdc++.so.6[0x7eb1caa]
/usr/lib/libqpidcommon.so.0(_ZN4qpid7framing6Buffer10getRawDataERSsj+0xbb)[0xc8f83b]
/usr/lib/libqmfagent.so.0(_ZN4qpid10management19ManagementAgentImpl16ConnectionThread10sendBufferERNS_7framing6BufferEjRKSsS7_+0xf6)[0xeeca36]
/usr/lib/libqmfagent.so.0(_ZN4qpid10management19ManagementAgentImpl19invokeMethodRequestERNS_7framing6BufferEjSs+0x2c3)[0xef0fa3]
/usr/lib/libqmfagent.so.0(_ZN4qpid10management19ManagementAgentImpl13pollCallbacksEj+0x10c)[0xef16ec]
./condor_job_server(_Z16HandleMgmtSocketP7ServiceP6Stream+0x1c)[0x80c2353]
./condor_job_server(_ZN10DaemonCore24CallSocketHandler_workerEibP6Stream+0x177)[0x80ec889]
./condor_job_server(_ZN10DaemonCore35CallSocketHandler_worker_demarshallEPv+0x34)[0x80ecb88]
./condor_job_server(_ZN13CondorThreads8pool_addEPFvPvES0_PiPKc+0x29)[0x8141f35]
./condor_job_server(_ZN10DaemonCore17CallSocketHandlerERib+0x1b4)[0x80df028]
./condor_job_server(_ZN10DaemonCore6DriverEv+0x1959)[0x80e0a0b]
./condor_job_server(main+0x17d3)[0x80f6dc6]
/lib/libc.so.6(__libc_start_main+0xe5)[0x7096e5]
./condor_job_server[0x80bb2f1]
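
The trace shows the exception being raised in qpid::framing::Buffer::getRawData, called from ManagementAgentImpl::ConnectionThread::sendBuffer. The Qpid source and the actual buffer size are not part of this report, so the following is only a sketch of the general failure mode: a fixed-capacity buffer with bounds checks rejects any response that does not fit. The class and function names and the 64K capacity are assumptions chosen to match the threshold described above.

```python
class OutOfBounds(Exception):
    """Raised when an access would run past the end of the buffer."""


class BoundedBuffer(object):
    """Fixed-capacity byte buffer with explicit bounds checks.
    Illustrative only; this is not the Qpid implementation."""

    def __init__(self, capacity):
        self.data = bytearray(capacity)
        self.position = 0

    def put_raw(self, payload):
        # Refuse to write past the end of the fixed-size storage.
        end = self.position + len(payload)
        if end > len(self.data):
            raise OutOfBounds("Out of Bounds")
        self.data[self.position:end] = payload
        self.position = end


ASSUMED_CAPACITY = 64 * 1024  # assumption: matches the ~64K threshold in this report


def try_send(response):
    """Return True if the response fits in the buffer, False otherwise."""
    buf = BoundedBuffer(ASSUMED_CAPACITY)
    try:
        buf.put_raw(response)
    except OutOfBounds:
        return False
    return True
```

In the crash above nothing catches the exception, so the C++ runtime calls terminate and the process aborts. The sketch catches it only to make the size threshold visible.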
Comment 1 Ted Ross 2009-09-08 10:50:19 EDT
*** Bug 508145 has been marked as a duplicate of this bug. ***
Comment 6 Ted Ross 2010-03-31 17:19:14 EDT
Fix committed upstream at revision 929716.
Comment 7 Frantisek Reznicek 2010-06-04 04:36:55 EDT
May I ask for more information, please? An example would be much appreciated.
Raising NEEDINFO.
Comment 8 Matthew Farrellee 2010-06-04 06:39:41 EDT
I ran into the bug when submitting jobs to a schedd and then querying for all of them. However, the broker has an echo method. You may be able to simply send >64K of data to that method to reproduce.
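
A rough sketch of the echo-based attempt, assuming the python-qmf console API (qmf.console.Session with addBroker, getObjects and delBroker) and a broker reachable at amqp://localhost:5672. The URL, the 70000-byte payload size, and both helper names are assumptions, and this has not been run against an affected build. The crash in the description was in an agent process, so the broker echo path may not exercise the same code.

```python
def make_payload(size=70000):
    """Build a string body larger than 64K (the size is an arbitrary choice)."""
    return "x" * size


def call_broker_echo(url="amqp://localhost:5672", size=70000):
    """Call the broker's QMF echo method with an oversized body.
    Requires python-qmf and a running broker; the URL is hypothetical."""
    from qmf.console import Session  # imported lazily so the helper above works without it

    session = Session()
    broker = session.addBroker(url)
    try:
        brokers = session.getObjects(_class="broker",
                                     _package="org.apache.qpid.broker")
        if not brokers:
            raise RuntimeError("no broker object visible via QMF")
        # echo takes a sequence number and a body.
        return brokers[0].echo(1, make_payload(size))
    finally:
        session.delBroker(broker)
```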
Comment 10 Jan Sarenik 2010-10-04 09:28:11 EDT
There are some uncertainties:

 1. There is no condor_job_server in 1.1.1 Grid release.
 2. When I try to run 1.3RC Grid against a 1.1.1 broker,
    I cannot access grid objects via qpid-tool (either
    1.3RC or 1.1.1), and I think it may be caused by
    a QMF version mismatch.

How should I verify it?
Comment 11 Florian Nadge 2010-10-07 07:26:34 EDT
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Previously,the Qpid Management Framework (QMF) method  would exit with a segmentation fault when the result was larger than 10 MB.
Comment 12 Florian Nadge 2010-10-07 07:52:37 EDT
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-Previously,the Qpid Management Framework (QMF) method  would exit with a segmentation fault when the result was larger than 10 MB.
+Previously,the Qpid Management Framework (QMF) method  would exit with a segmentation fault when the result was larger than 64kB. With this update, this method works as expected, even for larger results.
Comment 13 Martin Prpič 2010-10-07 10:16:11 EDT
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-Previously,the Qpid Management Framework (QMF) method  would exit with a segmentation fault when the result was larger than 64kB. With this update, this method works as expected, even for larger results.
+Previously, a QMF method would exit with a segmentation fault when the result was larger than 64kB. With this update, this method works as expected, even for larger results.
Comment 14 Jan Sarenik 2010-10-07 10:18:20 EDT
Created attachment 452111 [details]
Verification scripts

During a phone meeting I was told that this bug does not have
to be reproduced in order to verify the fix.

[user@host bz601828]$ ./runtest.sh 
x86_64
redhat-release-5Server-5.5.0.2
condor-qmf-7.4.4-0.16.el5
python-qpid-0.7.946106-14.el5
python-qmf-0.7.946106-13.el5
Clean: .
Submit: ..Submitting job(s)............
12 job(s) submitted to cluster 1.
Verify: ...SUCCESS

I will have to finish it for RHEL4 and RHEL5 i386 tomorrow.
Comment 15 Jan Sarenik 2010-10-07 10:18:48 EDT
Cleaning NEEDINFO.
Comment 16 Jan Sarenik 2010-10-08 04:04:20 EDT
x86_64
redhat-release-4AS-9
condor-qmf-7.4.4-0.16.el4
python-qpid-0.7.946106-14.el4
python-qmf-0.7.946106-13.el4
Clean: .
Submit: ..Submitting job(s)............
12 job(s) submitted to cluster 1.
Verify: SUCCESS
Comment 17 Jan Sarenik 2010-10-08 04:07:40 EDT
i686
redhat-release-4AS-9
condor-qmf-7.4.4-0.16.el4
python-qpid-0.7.946106-14.el4
python-qmf-0.7.946106-13.el4
qpid-cpp-server-0.7.946106-17.el4
Clean: .
Submit: ..Submitting job(s)............
12 job(s) submitted to cluster 1.
Verify: SUCCESS
Comment 18 Jan Sarenik 2010-10-08 04:14:03 EDT
It is vital to set auth=no on the broker for
Condor QMF agents to appear on all the latest
qpid-cpp-server-0.7.946106-17 builds.

The test should run under an unprivileged user
which has the sudo NOPASSWD right; see the sudoers(5)
manual page for more info.
Comment 19 Jan Sarenik 2010-10-08 04:28:28 EDT
Created attachment 452296 [details]
Updated verification scripts

$ ./runtest.sh 100
i686
redhat-release-5Server-5.5.0.2
qpid-cpp-server-0.7.946106-17.el5
condor-qmf-7.4.4-0.16.el5
python-qpid-0.7.946106-14.el5
python-qmf-0.7.946106-13.el5
Clean: .
Submit: ..Submitting job(s)....................................................................................................
100 job(s) submitted to cluster 1.
Verify: SUCCESS
Comment 20 Jan Sarenik 2010-10-08 04:28:59 EDT
Verified on all supported architectures and RHEL versions.
Comment 21 Jan Sarenik 2010-10-08 10:26:44 EDT
Created attachment 452353 [details]
Fixed verification scripts

Sorry, the test was deleting /var/lib/qpidd, which contains the SASL database.
Here are the updated scripts. No need for auth=no anymore.
Comment 23 errata-xmlrpc 2010-10-14 12:01:58 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html
