Bug 756790

Summary: broker std::bad_alloc error
Product: Red Hat Enterprise MRG Reporter: Leonid Zhaldybin <lzhaldyb>
Component: qpid-cpp Assignee: Andrew Stitcher <astitcher>
Status: CLOSED WONTFIX QA Contact: Leonid Zhaldybin <lzhaldyb>
Severity: high Docs Contact:
Priority: medium    
Version: 2.1 CC: esammons, jross
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-11-06 18:53:24 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Script to reproduce the bug. none
qpid broker log with std::bad_alloc error. none

Description Leonid Zhaldybin 2011-11-24 14:53:23 UTC
Description of problem:
If the 32-bit qpid broker has a queue with ~3GB of messages in it, trying to send anything more to this queue results in a std::bad_alloc error from the broker. The client reports a connection error:
warning Connection [127.0.0.1:59573-127.0.0.1:5672] closed
Failed to connect (reconnect disabled)
It seems that the 32-bit broker is unable to allocate memory beyond roughly 3GB.

Version-Release number of selected component (if applicable):
qpid-cpp-client-0.12-6.el6.i686
qpid-cpp-client-devel-0.12-6.el6.i686
qpid-cpp-client-rdma-0.12-6.el6.i686
qpid-cpp-client-ssl-0.12-6.el6.i686
qpid-cpp-debuginfo-0.12-6.el6.i686
qpid-cpp-server-0.12-6.el6.i686
qpid-cpp-server-cluster-0.12-6.el6.i686
qpid-cpp-server-rdma-0.12-6.el6.i686
qpid-cpp-server-ssl-0.12-6.el6.i686
qpid-cpp-server-store-0.12-6.el6.i68

How reproducible:
always


Steps to Reproduce:
1. Start up the 32-bit broker and create a queue with a capacity of more than 3GB.
2. Start a client which sends a large number of messages to this queue.
3. When the broker's memory allocation reaches 3GB, it will be unable to receive any more messages.
  
Actual results:
The broker throws a std::bad_alloc error and does not accept any more messages.

Expected results:
The broker should be able to accept a message from a client as long as there is free memory left on the machine and the target queue has not reached its capacity limit.


Additional info:
I tried to run a few test scenarios, such as:
1. Create one queue with a big capacity (4-5 GB). Try to send about 3.5 GB of messages to it. Messages were of various sizes, from 1KB up to 100MB.
2. Create a number of queues (from 2 to 10) with big capacity. Try to distribute about 3.5 GB of messages among them.
The result was always the same: as soon as the broker's memory consumption reached a value of around 3GB, it was unable to receive any more messages.
So, either there is a bug in the qpid broker, or there is a limit on its capacity which cannot be exceeded in a 32-bit environment, in which case that limit should be properly described in the user documentation.
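
For context, the ~3GB ceiling is consistent with the usual address-space arithmetic for a 32-bit process. The following is only a sketch of that arithmetic: the 1 GiB kernel reservation is an assumption (a common default split on 32-bit x86 Linux; the actual split depends on the kernel configuration), and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kGiB = std::uint64_t{1} << 30;

// Total bytes addressable with pointers of the given width (must be < 64 bits).
constexpr std::uint64_t addressableBytes(unsigned pointerBits) {
    return std::uint64_t{1} << pointerBits;
}

// Bytes of virtual address space left for a user process once the kernel's
// share is subtracted. kernelReservedBytes is an assumption supplied by the
// caller; 1 GiB is a common default on 32-bit x86 Linux.
inline std::uint64_t userSpaceBytes(unsigned pointerBits,
                                    std::uint64_t kernelReservedBytes) {
    assert(kernelReservedBytes <= addressableBytes(pointerBits));
    return addressableBytes(pointerBits) - kernelReservedBytes;
}

// 2^32 bytes is 4 GiB in total.
static_assert(addressableBytes(32) == 4 * kGiB, "32-bit pointers span 4 GiB");
```

Under that assumption, `userSpaceBytes(32, 1 * kGiB)` yields 3 GiB, and the broker's code, stacks and heap fragmentation all have to fit inside it alongside the queued messages, regardless of how much physical memory the machine has.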

Comment 1 Leonid Zhaldybin 2011-11-24 14:54:34 UTC
Created attachment 535946 [details]
Script to reproduce the bug.

Just untar it, change to the bug_bad_alloc directory and type "make run".

Comment 2 Leonid Zhaldybin 2011-11-24 14:58:30 UTC
Created attachment 535947 [details]
qpid broker log with std::bad_alloc error.

Comment 3 Justin Ross 2011-11-28 14:56:31 UTC
This is in part fundamental; it's exhausting process virtual memory.  Is there a better way to handle it?

I'm inclined to seek a better log message (if possible) and a documentation fix.

Comment 4 Andrew Stitcher 2012-04-24 18:36:24 UTC
I concur with Justin. I don't think we can do much when std::bad_alloc is thrown: it means that we can allocate no more memory. In this case it almost certainly means that we've run out of virtual address space.

We might be able to produce a log message before exiting, but we might not. The log subsystem itself needs to allocate memory.
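
One way to sidestep the allocation problem when reporting is to emit a fixed, statically stored message directly to a file descriptor. The sketch below is not qpid code: `reportOutOfMemory` and `runGuarded` are hypothetical names, and it assumes a POSIX system where `write(2)` is available.

```cpp
#include <cassert>
#include <new>
#include <unistd.h>

// The message is a static array in the binary, so reporting it does not
// require building a string on the heap.
static const char kOomMessage[] =
    "critical: std::bad_alloc caught; out of memory or virtual address space\n";

// Hypothetical helper (not a qpid API): write the static message straight to
// a file descriptor with POSIX write(2), bypassing any logging layer that
// would itself need to allocate.
inline bool reportOutOfMemory(int fd) {
    assert(fd >= 0);
    const ssize_t expected = static_cast<ssize_t>(sizeof(kOomMessage) - 1);
    return ::write(fd, kOomMessage, sizeof(kOomMessage) - 1) == expected;
}

// Hypothetical wrapper: run an operation that may allocate. If it throws
// std::bad_alloc, report the condition and return false instead of letting
// the exception propagate.
template <typename Operation>
bool runGuarded(Operation op, int logFd = STDERR_FILENO) {
    try {
        op();
        return true;
    } catch (const std::bad_alloc&) {
        reportOutOfMemory(logFd);
        return false;
    }
}
```

A caller could wrap the message-enqueue path, for example `runGuarded([&] { /* enqueue */ })`, and reject the message when it returns false. Whether the broker can safely continue after such a failure is a separate question that this sketch does not answer.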

I think that this can only be solved with documentation.

We may need to figure out safe maxima for the memory size that can be used.
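
As a rough illustration of how such a maximum might be derived, the sketch below subtracts a fixed overhead from the usable address space and then keeps a percentage of the remainder as headroom. All the figures are assumptions chosen for the example (3 GiB of user address space, 512 MiB of broker overhead, 20% headroom), not measured values, and `queueMemoryBudget` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kMiB = std::uint64_t{1} << 20;

// Hypothetical helper: bytes that could be budgeted for queued messages.
//   userSpaceBytes  - virtual address space available to the process
//   overheadBytes   - space assumed to be taken by code, stacks, buffers, etc.
//   headroomPercent - share of the remainder deliberately left unused
// The multiplication is done before the division to avoid truncation; this is
// safe from overflow for any realistic 32-bit or 64-bit address-space size.
inline std::uint64_t queueMemoryBudget(std::uint64_t userSpaceBytes,
                                       std::uint64_t overheadBytes,
                                       unsigned headroomPercent) {
    assert(overheadBytes <= userSpaceBytes);
    assert(headroomPercent <= 100);
    const std::uint64_t available = userSpaceBytes - overheadBytes;
    return available * (100 - headroomPercent) / 100;
}
```

With the example figures, `queueMemoryBudget(3072 * kMiB, 512 * kMiB, 20)` gives 2048 MiB, so the combined queue limits would be kept at or below about 2 GiB on a 32-bit broker. The right overhead and headroom values would have to come from testing.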

Comment 6 Andrew Stitcher 2012-07-31 14:09:38 UTC
I don't think this is an actual bug; I propose closing it as "won't fix."