Bug 1065420 - MongoDB error using aggregate: 16MB size limit
Summary: MongoDB error using aggregate: 16MB size limit
Keywords:
Status: CLOSED DUPLICATE of bug 1047872
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-ceilometer
Version: 4.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.0
Assignee: Eoghan Glynn
QA Contact: Shai Revivo
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-02-14 15:59 UTC by Dave Sullivan
Modified: 2018-12-04 17:29 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-02-17 19:23:16 UTC
Target Upstream Version:
Embargoed:



Description Dave Sullivan 2014-02-14 15:59:19 UTC
Description of problem:

Ceilometer seems to fail when clicking on Resource Usage in the Horizon Dashboard. The Ceilometer API logs show the following traceback:

2014-02-13 06:43:08.178 8904 ERROR wsme.api [-] Server-side error: "command SON([('aggregate', u'meter'), ('pipeline', [{'$match': {u'resource_metadata.OS-EXT-AZ:availability_zone': u'nova'}}, {'$sort': {'timestamp': -1, 'project_id': -1, 'user_id': -1}}, {'$group': {'meters_unit': {'$push': '$counter_unit'}, 'source': {'$first': '$source'}, 'project_id': {'$first': '$project_id'}, 'user_id': {'$first': '$user_id'}, 'last_sample_timestamp': {'$max': '$timestamp'}, 'meters_name': {'$push': '$counter_name'}, 'first_sample_timestamp': {'$min': '$timestamp'}, 'meters_type': {'$push': '$counter_type'}, '_id': '$resource_id', 'metadata': {'$first': '$resource_metadata'}}}])]) failed: exception: aggregation result exceeds maximum document size (16MB)". Detail:
Traceback (most recent call last):

  File "/usr/lib/python2.6/site-packages/wsmeext/pecan.py", line 72, in callfunction
    result = f(self, *args, **kwargs)

  File "/usr/lib/python2.6/site-packages/ceilometer/api/controllers/v2.py", line 965, in get_all
    for r in pecan.request.storage_conn.get_resources(**kwargs)]

  File "/usr/lib/python2.6/site-packages/ceilometer/storage/impl_mongodb.py", line 651, in get_resources
    "meters_unit": {"$push": "$counter_unit"},

  File "/usr/lib64/python2.6/site-packages/pymongo/collection.py", line 1061, in aggregate
    _use_master=use_master)

  File "/usr/lib64/python2.6/site-packages/pymongo/database.py", line 393, in command
    msg, allowable_errors)

  File "/usr/lib64/python2.6/site-packages/pymongo/helpers.py", line 147, in _check_command_response
    raise OperationFailure(msg % errmsg, code)

OperationFailure: command SON([('aggregate', u'meter'), ('pipeline', [{'$match': {u'resource_metadata.OS-EXT-AZ:availability_zone': u'nova'}}, {'$sort': {'timestamp': -1, 'project_id': -1, 'user_id': -1}}, {'$group': {'meters_unit': {'$push': '$counter_unit'}, 'source': {'$first': '$source'}, 'project_id': {'$first': '$project_id'}, 'user_id': {'$first': '$user_id'}, 'last_sample_timestamp': {'$max': '$timestamp'}, 'meters_name': {'$push': '$counter_name'}, 'first_sample_timestamp': {'$min': '$timestamp'}, 'meters_type': {'$push': '$counter_type'}, '_id': '$resource_id', 'metadata': {'$first': '$resource_metadata'}}}])]) failed: exception: aggregation result exceeds maximum document size (16MB)

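The 16MB limit comes from MongoDB: the aggregate command here returns its result as a single BSON document, and BSON documents are limited to 16MB. The $push accumulators in the $group stage add one array element per matching sample, so the result grows with the number of samples.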
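
To illustrate why the result grows, here is a minimal pure-Python sketch that emulates the $push part of the $group stage from the pipeline in the traceback. It does not talk to MongoDB. The helper names (group_with_push, approx_size, make_samples) and the sample values are hypothetical, and JSON length is used only as a rough stand-in for BSON size.

```python
import json
from collections import defaultdict


def group_with_push(samples):
    """Emulate $group by resource_id with $push accumulators.

    Each pushed list gains one element per sample, as in the pipeline
    shown in the traceback (meters_name, meters_unit, meters_type).
    """
    grouped = defaultdict(
        lambda: {"meters_name": [], "meters_unit": [], "meters_type": []}
    )
    for s in samples:
        entry = grouped[s["resource_id"]]
        entry["meters_name"].append(s["counter_name"])
        entry["meters_unit"].append(s["counter_unit"])
        entry["meters_type"].append(s["counter_type"])
    return dict(grouped)


def approx_size(result):
    # Assumption: JSON length is only a rough proxy for the BSON size.
    return len(json.dumps(result))


def make_samples(n_resources, samples_per_resource):
    # Hypothetical sample data; real samples carry many more fields.
    return [
        {
            "resource_id": "res-%d" % r,
            "counter_name": "cpu_util",
            "counter_unit": "%",
            "counter_type": "gauge",
        }
        for r in range(n_resources)
        for _ in range(samples_per_resource)
    ]


# Same number of resources, 100x more samples per resource.
small = approx_size(group_with_push(make_samples(10, 10)))
large = approx_size(group_with_push(make_samples(10, 1000)))
```

With the number of resources held constant, the emulated result size scales with the number of samples per resource, which is how a long-running deployment can eventually cross the 16MB limit.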

Version-Release number of selected component (if applicable):

RHOS 4.0 Havana

See above

This is a known upstream issue; the fix replaces aggregate with mapReduce.

There is already a stable/havana backport:

https://review.openstack.org/#/c/66861/

Comment 2 Eoghan Glynn 2014-02-16 20:57:19 UTC
This looks like a duplicate of https://bugzilla.redhat.com/1047872 which I have fixed upstream by changing the problematic mongo aggregation usage to a conventional map-reduce.

Due to the urgency, I landed that fix internally for RHOS 4.0.z A1 *prior* to my upstream fixes being landed and backported to stable:

  https://review.openstack.org/#/q/Ibef4a95acada411af385ff75ccb36c5724068b59,n,z

and then recently released upstream in 2013.2.2.

Since RHOS 4.0.z A2 has been rebased onto 2013.2.2, the fix I've landed upstream (identical for all intents and purposes to the original internal fix) will be available in RHOS at that point.

Dave - is the customer seeing the issue in a bare 4.0 install?

If so, they'll need to update to A1 immediately, or alternatively wait for A2 and pick up a number of other fixes in the process:

  https://launchpad.net/ceilometer/+milestone/2013.2.2
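
For readers unfamiliar with the map-reduce shape mentioned in comment 2, the following is a minimal pure-Python sketch of the general idea. It is not the upstream patch: MongoDB map-reduce runs JavaScript map and reduce functions on the server, and the function names here (map_sample, reduce_summaries, summarize) are hypothetical. Collecting meters into a set, so that the summary stays bounded by the number of distinct meters rather than the number of samples, is an assumption made for illustration.

```python
def map_sample(sample):
    """Map step: emit (resource_id, partial summary) for one sample."""
    meter = (
        sample["counter_name"],
        sample["counter_type"],
        sample["counter_unit"],
    )
    summary = {
        "first_sample_timestamp": sample["timestamp"],
        "last_sample_timestamp": sample["timestamp"],
        # Assumption: meters are deduplicated via a set.
        "meters": {meter},
    }
    return sample["resource_id"], summary


def reduce_summaries(a, b):
    """Reduce step: merge two partial summaries for the same resource."""
    return {
        "first_sample_timestamp": min(
            a["first_sample_timestamp"], b["first_sample_timestamp"]
        ),
        "last_sample_timestamp": max(
            a["last_sample_timestamp"], b["last_sample_timestamp"]
        ),
        "meters": a["meters"] | b["meters"],
    }


def summarize(samples):
    """Apply map to every sample and fold the results per resource_id."""
    out = {}
    for s in samples:
        key, value = map_sample(s)
        out[key] = reduce_summaries(out[key], value) if key in out else value
    return out


# Hypothetical samples: integer timestamps stand in for datetimes.
samples = [
    {"resource_id": "res-0", "counter_name": "cpu_util",
     "counter_type": "gauge", "counter_unit": "%", "timestamp": t}
    for t in range(1, 501)
]
summary = summarize(samples)
```

In this sketch the per-resource summary holds one meter entry regardless of how many samples were processed, in contrast to the $push arrays in the failing pipeline.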

