Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 658936

Summary: QMF engine-based agents segfault when returning more than 64k of argument data
Product: Red Hat Enterprise MRG
Reporter: Will Benton <willb>
Component: qpid-qmf
Assignee: Ted Ross <tross>
Status: CLOSED ERRATA
QA Contact: Petr Matousek <pematous>
Severity: medium
Docs Contact:
Priority: high
Version: 1.3
CC: iboverma, jneedle, matt, mcressma, pematous, tross
Target Milestone: 1.3.2-RC2
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: qpid-cpp-mrg-0.7.946106-27
Doc Type: Bug Fix
Doc Text:
When the size of a method request or response exceeded 65535 octets (that is, when a method call returned multiple large strings that in total exceeded this size), a buffer overflow could occur, causing a QMF engine-based agent to terminate unexpectedly with a segmentation fault. This update removes this limitation from the source code underlying the Ruby and Python QMF engines, and large method requests and responses no longer cause such agents to crash.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-02-15 12:13:22 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description         Flags
issue reproducer    none
issue reproducer    none

Description Will Benton 2010-12-01 16:46:10 UTC
Description of problem:

If an agent built on the Ruby QMF engine attempts to return arguments larger than 64k in total, the QMF engine will segfault.

Version-Release number of selected component (if applicable):

0.7.946106

How reproducible:

Create a QMF agent with a method that returns more than 64k of data in an output parameter.  Invoke this method.
  
Actual results:

The agent will crash.

Expected results:

The agent should not crash or have size restrictions on actual output parameters.
Additional info:

Comment 3 Ted Ross 2011-01-13 19:13:27 UTC
Fixed upstream at revisions 1058709 and 1058710.

Comment 4 Ted Ross 2011-01-13 19:18:48 UTC
Testing notes:

AMQP 0-10 does not support string values larger than 65535 bytes.  The best way to test this fix is to use multiple large-string values, each smaller than 64K but the sum being greater than 64K.

Note also that the broker must not have the ACL module loaded *if* you will be testing large (>64K) method requests.  This is because such methods are too large for the broker to evaluate and if the ACL module is loaded, they will be rejected.

You can, however, test large method results (output arguments) with or without the ACL module.

There is a test in the make-check regression suite (included in the 1058709 commit) for this fix.
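
The testing notes above call for several string arguments that each stay within the AMQP 0-10 limit while their total exceeds it. A minimal sketch of generating such inputs follows; the helper name `make_test_strings` and the chosen sizes are hypothetical and are not taken from the attached reproducer or from the make-check test.

```python
AMQP_STR16_MAX = 65535  # largest string value AMQP 0-10 can carry, per the notes above


def make_test_strings(total_bytes, max_each=AMQP_STR16_MAX):
    """Return strings that each fit in max_each bytes and sum to total_bytes.

    Hypothetical helper for building method arguments; not part of QMF.
    """
    if total_bytes < 0 or max_each <= 0:
        raise ValueError("total_bytes must be >= 0 and max_each must be > 0")
    strings = []
    remaining = total_bytes
    while remaining > 0:
        size = min(remaining, max_each)
        strings.append("x" * size)  # ASCII filler, so characters == bytes
        remaining -= size
    return strings


# Two arguments, each below 64K, whose combined size is above 64K.
args = make_test_strings(70000, max_each=45234)
```

Each resulting string could then be passed as a separate input argument to the agent method under test.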

Comment 6 Mike Cressman 2011-01-21 19:47:33 UTC
Backported to 1.3 branch for 1.3.2 RC2 build.

Comment 7 Petr Matousek 2011-01-24 15:52:22 UTC
I am not able to reproduce the segfault on the Ruby QMF engine-based agent:

testcase:
1. send multiple large-string values (each smaller than 64K, but the sum being greater than 64K) via ruby_console.rb to agent_ruby.rb, which provides an echo method
2. return the sum of the received arguments with the agent's echo method

Version-Release number of selected component:
qpid-cpp-mrg-0.7.946106-26

The behaviour differs depending on broker configuration (cluster vs. standalone) and architecture.

Example behaviour RH5.6_64 standalone:
sum < 65293:
correct behaviour
($?=1)

65293 < sum < 65456 .. 
method echo is not initiated
(Timed out waiting for response, $?=1)

sum > 65456 .. 
CONSOLE is aborted with following message
terminate called after throwing an instance of 'qpid::framing::OutOfBounds'
  what():  Out of Bounds
Aborted
($?=134)

Version-Release number of selected component:
qpid-cpp-mrg-0.7.946106-27

The behaviour is as expected and identical on all architectures and cluster configurations.

if sum > 65535
AGENT is Aborted with the following message:
terminate called after throwing an instance of 'qpid::Exception'
  what():  Could not encode string of 65536 bytes as uint16_t string. (qpid/framing/Buffer.cpp:266)
Aborted

this is correct.
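
The abort reported for -27 is consistent with a string whose length is written into a 16-bit prefix: 65535 bytes fit, 65536 do not. The sketch below illustrates that kind of check only. It is not the code in qpid/framing/Buffer.cpp; the function name `encode_short_string` is hypothetical, and `ValueError` merely stands in for the `qpid::Exception` seen in the output.

```python
import struct

UINT16_MAX = 0xFFFF  # 65535, the largest length a 2-byte prefix can hold


def encode_short_string(data: bytes) -> bytes:
    """Encode data as a big-endian uint16 length followed by the bytes.

    Illustration only: rejects oversized input instead of overrunning.
    """
    if len(data) > UINT16_MAX:
        raise ValueError(
            f"Could not encode string of {len(data)} bytes as uint16_t string."
        )
    return struct.pack(">H", len(data)) + data


encoded = encode_short_string(b"a" * 65535)  # exactly at the limit: accepted
```

Passing 65536 bytes to the same function raises the error rather than producing a truncated or corrupt length field.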

Comment 8 Petr Matousek 2011-01-24 15:59:08 UTC
Created attachment 474976 [details]
issue reproducer

I was able to reproduce the segfault under different conditions than described above. Can you check whether the segfault is relevant, please?

[root ~/bugzillas/bz658936]# ./agent_ruby.rb 
starting
Agent Connection Established...
2011-01-24 16:48:17 notice Initial object-id bank assigned: 1.151
!Query:: user=cumin context=1 class=parent object_num=
!Method: user=cumin context=2 method=echo object_num=0-0-1-151-1 args=#<Qmf::Arguments:0xb7f1428c>
in_a length: 0
in_b length: 45234
/usr/lib/ruby/site_ruby/1.8/qmf.rb:1445: [BUG] Segmentation fault
ruby 1.8.5 (2006-08-25) [i386-linux]

Aborted (core dumped)

[root ~/bugzillas/bz658936]# ./ruby_console.rb 0 45234
Console Connection Established...
...

---- Agents ----
  => Agent embedded in broker
  => agent_test_label
----
---- parent object ----
    Pinging...!
/usr/lib/ruby/1.8/monitor.rb:102:in `stop': Interrupt
	from /usr/lib/ruby/1.8/monitor.rb:102:in `wait'
	from /usr/lib/ruby/site_ruby/1.8/qmf.rb:541:in `method_missing'
	from /usr/lib/ruby/1.8/monitor.rb:238:in `synchronize'
	from /usr/lib/ruby/site_ruby/1.8/qmf.rb:537:in `method_missing'
	from /usr/lib/ruby/site_ruby/1.8/qmf.rb:533:in `each'
	from /usr/lib/ruby/site_ruby/1.8/qmf.rb:533:in `method_missing'
	from ./ruby_console.rb:134:in `main'
	from ./ruby_console.rb:133:in `each'
	from ./ruby_console.rb:133:in `main'
	from ./ruby_console.rb:128:in `each'
	from ./ruby_console.rb:128:in `main'
	from ./ruby_console.rb:153




(gdb) info thre
  5 Thread 9150  0x00208402 in __kernel_vsyscall ()
  4 Thread 9151  0x00208402 in __kernel_vsyscall ()
  3 Thread 9152  0x00208402 in __kernel_vsyscall ()
  2 Thread 9153  0x00208402 in __kernel_vsyscall ()
* 1 Thread 9149  0x00208402 in __kernel_vsyscall ()
(gdb) thread apply all bt

Thread 5 (Thread 9150):
#0  0x00208402 in __kernel_vsyscall ()
#1  0x00d13bc5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2  0x004fc3ed in wait (this=0x95ecd40) at ../include/qpid/sys/posix/Condition.h:63
#3  qmf::engine::ResilientConnectionImpl::run (this=0x95ecd40)
    at qmf/engine/ResilientConnection.cpp:366
#4  0x00fe5761 in qpid::sys::(anonymous namespace)::runRunnable (p=0x95ecd40)
    at qpid/sys/posix/Thread.cpp:35
#5  0x00d0f832 in start_thread () from /lib/libpthread.so.0
#6  0x003810ae in clone () from /lib/libc.so.6

Thread 4 (Thread 9151):
#0  0x00208402 in __kernel_vsyscall ()
#1  0x00d172f6 in nanosleep () from /lib/libpthread.so.0
#2  0x0080f930 in thread_timer (dummy=0x0) at eval.c:11673
#3  0x00d0f832 in start_thread () from /lib/libpthread.so.0
#4  0x003810ae in clone () from /lib/libc.so.6

Thread 3 (Thread 9152):
#0  0x00208402 in __kernel_vsyscall ()
#1  0x00381726 in epoll_wait () from /lib/libc.so.6
#2  0x00fef68a in qpid::sys::Poller::wait (this=0x9613d38, timeout=...)
    at qpid/sys/epoll/EpollPoller.cpp:563
#3  0x00ff02b3 in qpid::sys::Poller::run (this=0x9613d38) at qpid/sys/epoll/EpollPoller.cpp:515
#4  0x00fe5761 in qpid::sys::(anonymous namespace)::runRunnable (p=0x9613d38)
    at qpid/sys/posix/Thread.cpp:35
#5  0x00d0f832 in start_thread () from /lib/libpthread.so.0
#6  0x003810ae in clone () from /lib/libc.so.6

Thread 2 (Thread 9153):
#0  0x00208402 in __kernel_vsyscall ()
#1  0x00d13bc5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2  0x009cf7bc in pop (this=0x9819060, timeout=...) at ../include/qpid/sys/posix/Condition.h:63
#3  qpid::sys::BlockingQueue<boost::shared_ptr<qpid::framing::FrameSet> >::pop (this=0x9819060, 
    timeout=...) at qpid/sys/BlockingQueue.h:71
#4  0x009cbdb7 in qpid::client::Dispatcher::run (this=0x98191b4) at qpid/client/Dispatcher.cpp:80
#5  0x009fe018 in qpid::client::SubscriptionManagerImpl::run (this=0x9819190)
    at qpid/client/SubscriptionManagerImpl.cpp:98
#6  0x009fc044 in qpid::client::SubscriptionManager::run (this=0x9819100)
    at qpid/client/SubscriptionManager.cpp:60
#7  0x004f99c8 in qmf::engine::RCSession::run (this=0x9818cf8)
    at qmf/engine/ResilientConnection.cpp:161
#8  0x00fe5761 in qpid::sys::(anonymous namespace)::runRunnable (p=0x9818cfc)
    at qpid/sys/posix/Thread.cpp:35
#9  0x00d0f832 in start_thread () from /lib/libpthread.so.0
#10 0x003810ae in clone () from /lib/libc.so.6

Thread 1 (Thread 9149):
#0  0x00208402 in __kernel_vsyscall ()
#1  0x002d7df0 in raise () from /lib/libc.so.6
#2  0x002d9701 in abort () from /lib/libc.so.6
#3  0x0080b4c2 in rb_bug (fmt=0x8af627 "Segmentation fault") at error.c:214
#4  0x0087780b in sigsegv (sig=11) at signal.c:537
#5  <signal handler called>
#6  0x004be92b in qmf::engine::AgentImpl::methodResponse (this=0x95f8160, sequence=2, status=0, 
    text=0x95933e8 "OK", argMap=...) at qmf/engine/Agent.cpp:374
#7  0x004bf06b in qmf::engine::Agent::methodResponse (this=0x95f7fd8, sequence=2, status=0, 
    text=0x95933e8 "OK", arguments=...) at qmf/engine/Agent.cpp:880
#8  0x00181e39 in _wrap_Agent_methodResponse (argc=4, argv=0xbfa85420, self=3086024940)
    at qmfengine.cpp:4536
#9  0x0080ede8 in call_cfunc (func=0x181d40 <_wrap_Agent_methodResponse(int, VALUE*, VALUE)>, 
    recv=3086024940, len=0, argc=157369072, argv=0xbfa85420) at eval.c:5654
#10 0x008164ab in rb_call0 (klass=3086115060, recv=3086024940, id=13905, oid=13905, argc=2, 
    argv=0xbfa85420, body=0xb7f25ed8, flags=0) at eval.c:5810
#11 0x008171c8 in rb_call (klass=3086115060, recv=3086024940, mid=13905, argc=4, argv=0xbfa85420, 
    scope=0) at eval.c:6048
#12 0x0081ee46 in rb_eval (self=3086025000, n=<value optimized out>) at eval.c:3443
#13 0x00816d7b in rb_call0 (klass=3086027340, recv=3086025000, id=10857, oid=10857, argc=0, 
    argv=0xbfa85b20, body=0xb7f29b78, flags=0) at eval.c:5954
#14 0x008171c8 in rb_call (klass=3086027340, recv=3086025000, mid=10857, argc=4, argv=0xbfa85b10, 
    scope=0) at eval.c:6048
#15 0x0081ee46 in rb_eval (self=3086025720, n=<value optimized out>) at eval.c:3443
#16 0x0081e415 in rb_eval (self=3086025720, n=<value optimized out>) at eval.c:2915
#17 0x00816d7b in rb_call0 (klass=3086026160, recv=3086025720, id=10825, oid=10825, argc=0, 
    argv=0xbfa866e4, body=0xb7f55c8c, flags=0) at eval.c:5954
#18 0x008171c8 in rb_call (klass=3086026160, recv=3086025720, mid=10825, argc=5, argv=0xbfa866d0, 
    scope=0) at eval.c:6048
#19 0x0081ee46 in rb_eval (self=3086025000, n=<value optimized out>) at eval.c:3443
#20 0x0081e415 in rb_eval (self=3086025000, n=<value optimized out>) at eval.c:2915
#21 0x0081f01a in rb_eval (self=3086025000, n=<value optimized out>) at eval.c:3097
#22 0x00816d7b in rb_call0 (klass=3086027340, recv=3086025000, id=14065, oid=14065, argc=0, 
    argv=0x0, body=0xb7f299e8, flags=0) at eval.c:5954
#23 0x008171c8 in rb_call (klass=3086027340, recv=3086025000, mid=14065, argc=0, argv=0x0, scope=2)
    at eval.c:6048
#24 0x0081eb87 in rb_eval (self=<value optimized out>, n=<value optimized out>) at eval.c:3464
#25 0x0081d670 in rb_eval (self=3086025000, n=<value optimized out>) at eval.c:3624
#26 0x0081d739 in rb_eval (self=3086025000, n=<value optimized out>) at eval.c:3132
#27 0x00816d7b in rb_call0 (klass=3086027340, recv=3086025000, id=13945, oid=13945, argc=0, 
    argv=0x0, body=0xb7f2819c, flags=0) at eval.c:5954
#28 0x008171c8 in rb_call (klass=3086027340, recv=3086025000, mid=13945, argc=0, argv=0x0, scope=2)
    at eval.c:6048
#29 0x0081eb87 in rb_eval (self=<value optimized out>, n=<value optimized out>) at eval.c:3464
#30 0x00816d7b in rb_call0 (klass=3086027340, recv=3086025000, id=11729, oid=11729, argc=0, 
    argv=0xbfa88bd8, body=0xb7f278c8, flags=0) at eval.c:5954
#31 0x008171c8 in rb_call (klass=3086027340, recv=3086025000, mid=11729, argc=2, argv=0xbfa88bd0, 
    scope=0) at eval.c:6048
#32 0x0081ee46 in rb_eval (self=3086025500, n=<value optimized out>) at eval.c:3443
#33 0x0081f91e in rb_eval (self=3086025500, n=<value optimized out>) at eval.c:3259
#34 0x0081f01a in rb_eval (self=3086025500, n=<value optimized out>) at eval.c:3097
#35 0x0081f01a in rb_eval (self=3086025500, n=<value optimized out>) at eval.c:3097
#36 0x00816d7b in rb_call0 (klass=3086036100, recv=3086025500, id=5113, oid=5113, argc=0, argv=0x0, 
    body=0xb7f48c58, flags=0) at eval.c:5954
#37 0x008171c8 in rb_call (klass=3086036100, recv=3086025500, mid=5113, argc=0, argv=0x0, scope=2)
    at eval.c:6048
#38 0x0081eb87 in rb_eval (self=<value optimized out>, n=<value optimized out>) at eval.c:3464
#39 0x00821a8c in rb_yield_0 (val=3086025180, self=3086025500, klass=<value optimized out>, flags=1, 
    avalue=2) at eval.c:4987
#40 0x0081a02e in rb_thread_start_0 (fn=0x822180 <rb_thread_yield>, arg=0xb7f101dc, th=0x95eda58)
    at eval.c:11800
#41 0x0080e9e7 in call_cfunc (func=0x81a100 <rb_thread_initialize>, recv=3086025200, 
    len=<value optimized out>, argc=0, argv=0x0) at eval.c:5651
#42 0x008164ab in rb_call0 (klass=3086385100, recv=3086025200, id=2953, oid=2953, argc=1027195397, 
    argv=0x0, body=0xb7f67f68, flags=0) at eval.c:5810
#43 0x008171c8 in rb_call (klass=3086385100, recv=3086025200, mid=2953, argc=0, argv=0x0, scope=1)
    at eval.c:6048
#44 0x00817497 in rb_obj_call_init (obj=3086025200, argc=0, argv=0x0) at eval.c:7529
#45 0x008174f2 in rb_thread_s_new (argc=0, argv=0x0, klass=3086385100) at eval.c:11913
#46 0x0080ede8 in call_cfunc (func=0x8174b0 <rb_thread_s_new>, recv=3086385100, len=0, 
    argc=157369072, argv=0x0) at eval.c:5654
#47 0x008164ab in rb_call0 (klass=3086385080, recv=3086385100, id=3337, oid=3337, argc=-1079463168, 
    argv=0x0, body=0xb7f67f90, flags=0) at eval.c:5810
#48 0x008171c8 in rb_call (klass=3086385080, recv=3086385100, mid=3337, argc=0, argv=0x0, scope=0)
    at eval.c:6048
#49 0x0081ee46 in rb_eval (self=3086025500, n=<value optimized out>) at eval.c:3443
#50 0x0082022c in rb_eval (self=3086025500, n=<value optimized out>) at eval.c:3173
#51 0x0081d384 in rb_eval (self=3086025500, n=<value optimized out>) at eval.c:3644
#52 0x00816d7b in rb_call0 (klass=3086036100, recv=3086025500, id=2953, oid=2953, argc=0, 
    argv=0xbfa8c134, body=0xb7f4993c, flags=2) at eval.c:5954
#53 0x008171c8 in rb_call (klass=3086036100, recv=3086025500, mid=2953, argc=1, argv=0xbfa8c130, 
    scope=1) at eval.c:6048
#54 0x00817497 in rb_obj_call_init (obj=3086025500, argc=1, argv=0xbfa8c130) at eval.c:7529
#55 0x0084517a in rb_class_new_instance (argc=1, argv=0xbfa8c130, klass=3086036100) at object.c:1567
#56 0x0080ede8 in call_cfunc (func=0x845140 <rb_class_new_instance>, recv=3086036100, len=0, 
    argc=157369072, argv=0xbfa8c130) at eval.c:5654
#57 0x008164ab in rb_call0 (klass=3086400780, recv=3086036100, id=3337, oid=3337, argc=62, 
    argv=0xbfa8c130, body=0xb7f6ab28, flags=0) at eval.c:5810
#58 0x008171c8 in rb_call (klass=3086400780, recv=3086036100, mid=3337, argc=1, argv=0xbfa8c130, 
    scope=0) at eval.c:6048
#59 0x0081ee46 in rb_eval (self=3086025720, n=<value optimized out>) at eval.c:3443
#60 0x0081d384 in rb_eval (self=3086025720, n=<value optimized out>) at eval.c:3644
#61 0x00816d7b in rb_call0 (klass=3086026160, recv=3086025720, id=5081, oid=5081, argc=0, argv=0x0, 
    body=0xb7f51e98, flags=0) at eval.c:5954
#62 0x008171c8 in rb_call (klass=3086026160, recv=3086025720, mid=5081, argc=0, argv=0x0, scope=0)
    at eval.c:6048
#63 0x0081ee46 in rb_eval (self=3086395900, n=<value optimized out>) at eval.c:3443
#64 0x00824a17 in ruby_exec_internal () at eval.c:1604
#65 0x00824a62 in ruby_exec () at eval.c:1624
#66 0x00824a9f in ruby_run () at eval.c:1634
#67 0x08048622 in main (argc=Cannot access memory at address 0x0
) at main.c:46

Comment 9 Petr Matousek 2011-01-24 16:01:30 UTC
I am going to retest -27, raising needinfo for last comments 7 & 8.

Comment 10 Petr Matousek 2011-01-24 16:24:31 UTC
Created attachment 474980 [details]
issue reproducer

Comment 12 Ted Ross 2011-01-27 16:53:49 UTC
Re: comment 8:  Yes, this trace is relevant and expected based on this bug.

Comment 13 Ted Ross 2011-01-27 17:21:52 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause: There's a limitation in the code underlying the Ruby and Python qmf-agent APIs that results in a buffer overrun and crash if the size of a method request or response exceeds 65535 octets.  This is caused by a method call returning multiple large strings that in total exceed 65535.
Consequence: When the limit is exceeded, the agent (or console) library will segfault.
Fix: The limitation was removed from the design.
Result: Large method requests and responses now function normally.
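
To make the cause concrete: several string values that individually fit a 16-bit length can still encode to more than 65535 octets in total, so a buffer capped at that size is too small for the whole argument list. The sketch below sizes the buffer from the computed encoded length instead. It is an assumption-laden illustration, not the upstream change in revisions 1058709 and 1058710; the names `encoded_size` and `encode_all` are hypothetical.

```python
import struct


def encoded_size(values):
    """Total octets needed: a 2-byte length prefix plus payload per value."""
    return sum(2 + len(v) for v in values)


def encode_all(values):
    """Encode every value into a buffer sized from encoded_size()."""
    buf = bytearray(encoded_size(values))  # no fixed 64K cap on the total
    offset = 0
    for v in values:
        # Each individual value must still fit a uint16 length prefix.
        struct.pack_into(">H", buf, offset, len(v))
        offset += 2
        buf[offset:offset + len(v)] = v
        offset += len(v)
    return bytes(buf)


values = [b"a" * 40000, b"b" * 40000]  # each < 64K, total > 64K
payload = encode_all(values)
```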

Comment 14 Petr Matousek 2011-02-01 15:33:43 UTC
The issue has been fixed, tested on RHEL 4.9 / 5.6 i386 / x86_64 on packages:
python-qpid-0.7.946106-15.el5
qpid-cpp-client-0.7.946106-27.el5
qpid-cpp-client-devel-0.7.946106-27.el5
qpid-cpp-client-devel-docs-0.7.946106-27.el5
qpid-cpp-client-ssl-0.7.946106-27.el5
qpid-cpp-mrg-debuginfo-0.7.946106-27.el5
qpid-cpp-server-0.7.946106-27.el5
qpid-cpp-server-cluster-0.7.946106-27.el5
qpid-cpp-server-devel-0.7.946106-27.el5
qpid-cpp-server-ssl-0.7.946106-27.el5
qpid-cpp-server-store-0.7.946106-27.el5
qpid-cpp-server-xml-0.7.946106-27.el5
qpid-java-client-0.7.946106-14.el5
qpid-java-common-0.7.946106-14.el5
qpid-java-example-0.7.946106-14.el5
qpid-tools-0.7.946106-12.el5


-> VERIFIED

Comment 15 Jaromir Hradilek 2011-02-09 16:15:17 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,4 +1 @@
-Cause: There's a limitation in the code underlying the Ruby and Python qmf-agent APIs that results in a buffer overrun and crash if the size of a method request or response exceeds 65535 octets.  This is caused by a method call returning multiple large strings that in total exceed 65535.
+When the size of a method request or response exceeded 65535 octets (that is, when a method call returned multiple large strings that in total exceeded this size), a buffer overflow could occur, causing a QMF engine-based agent to terminate unexpectedly with a segmentation fault. This update removes this limitation from the source code underlying the Ruby and Python QMF engines, and large method requests and responses no longer cause such agents to crash.
-Consequence: When the limit is exceeded, the agent (or console) library will segfault.
-Fix: The limitation was removed from the design.
-Result: Large method requests and responses now function normally.

Comment 16 errata-xmlrpc 2011-02-15 12:13:22 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0217.html