Description of problem:
Ceilometer seems to fail when clicking on Resource Usage in the Horizon Dashboard. Looking at the Ceilometer API logs, I can see the following traceback:

2014-02-13 06:43:08.178 8904 ERROR wsme.api [-] Server-side error: "command SON([('aggregate', u'meter'), ('pipeline', [{'$match': {u'resource_metadata.OS-EXT-AZ:availability_zone': u'nova'}}, {'$sort': {'timestamp': -1, 'project_id': -1, 'user_id': -1}}, {'$group': {'meters_unit': {'$push': '$counter_unit'}, 'source': {'$first': '$source'}, 'project_id': {'$first': '$project_id'}, 'user_id': {'$first': '$user_id'}, 'last_sample_timestamp': {'$max': '$timestamp'}, 'meters_name': {'$push': '$counter_name'}, 'first_sample_timestamp': {'$min': '$timestamp'}, 'meters_type': {'$push': '$counter_type'}, '_id': '$resource_id', 'metadata': {'$first': '$resource_metadata'}}}])]) failed: exception: aggregation result exceeds maximum document size (16MB)". Detail:
Traceback (most recent call last):

  File "/usr/lib/python2.6/site-packages/wsmeext/pecan.py", line 72, in callfunction
    result = f(self, *args, **kwargs)

  File "/usr/lib/python2.6/site-packages/ceilometer/api/controllers/v2.py", line 965, in get_all
    for r in pecan.request.storage_conn.get_resources(**kwargs)]

  File "/usr/lib/python2.6/site-packages/ceilometer/storage/impl_mongodb.py", line 651, in get_resources
    "meters_unit": {"$push": "$counter_unit"},

  File "/usr/lib64/python2.6/site-packages/pymongo/collection.py", line 1061, in aggregate
    _use_master=use_master)

  File "/usr/lib64/python2.6/site-packages/pymongo/database.py", line 393, in command
    msg, allowable_errors)

  File "/usr/lib64/python2.6/site-packages/pymongo/helpers.py", line 147, in _check_command_response
    raise OperationFailure(msg % errmsg, code)

OperationFailure: command SON([('aggregate', u'meter'), ('pipeline', [{'$match': {u'resource_metadata.OS-EXT-AZ:availability_zone': u'nova'}}, {'$sort': {'timestamp': -1, 'project_id': -1, 'user_id': -1}}, {'$group': {'meters_unit': {'$push': '$counter_unit'}, 'source': {'$first': '$source'}, 'project_id': {'$first': '$project_id'}, 'user_id': {'$first': '$user_id'}, 'last_sample_timestamp': {'$max': '$timestamp'}, 'meters_name': {'$push': '$counter_name'}, 'first_sample_timestamp': {'$min': '$timestamp'}, 'meters_type': {'$push': '$counter_type'}, '_id': '$resource_id', 'metadata': {'$first': '$resource_metadata'}}}])]) failed: exception: aggregation result exceeds maximum document size (16MB)

The 16MB limit comes from MongoDB, which caps the size of any BSON document, including the result of an aggregation query, at 16MB.

Version-Release number of selected component (if applicable):
RHOS 4.0 Havana

See above known upstream issue where mapReduce replaces aggregate. There is already a stable havana backport: https://review.openstack.org/#/c/66861/
This looks like a duplicate of https://bugzilla.redhat.com/1047872, which I have fixed upstream by changing the problematic mongo aggregation usage to a conventional map-reduce.

Due to the urgency, I landed that fix internally for RHOS 4.0.z A1 *prior* to my upstream fixes being landed and backported to stable:

https://review.openstack.org/#/q/Ibef4a95acada411af385ff75ccb36c5724068b59,n,z

and then recently released upstream in 2013.2.2.

Since RHOS 4.0.z A2 has been rebased onto 2013.2.2, the fix I've landed upstream (identical for all intents and purposes to the original internal fix) will be available in RHOS at that point.

Dave - is the customer seeing the issue in a bare 4.0 install? If so, they'll need to update to A1 immediately, or alternatively wait for A2 and pick up a number of other fixes in the process:

https://launchpad.net/ceilometer/+milestone/2013.2.2
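
For anyone following along, here is a rough in-memory sketch of the map-reduce idea, in plain Python rather than MongoDB's mapReduce. It is not the upstream patch. The field names are taken from the $group stage in the traceback above; the function names (map_sample, reduce_summaries, summarize_resources) are hypothetical, and keeping only distinct meters per resource (instead of one $push entry per sample) is an assumption made for illustration.

```python
from collections import defaultdict
from functools import reduce


def map_sample(sample):
    """Map step: emit (resource_id, partial summary) for a single sample."""
    meter = (
        sample["counter_name"],
        sample["counter_type"],
        sample["counter_unit"],
    )
    summary = {
        "first_sample_timestamp": sample["timestamp"],
        "last_sample_timestamp": sample["timestamp"],
        "project_id": sample["project_id"],
        "user_id": sample["user_id"],
        "source": sample["source"],
        "metadata": sample["resource_metadata"],
        # A set, so repeated samples of one meter do not grow the result.
        "meters": {meter},
    }
    return sample["resource_id"], summary


def reduce_summaries(a, b):
    """Reduce step: merge two partial summaries for the same resource."""
    # Descriptive fields come from whichever side saw the newest sample,
    # mirroring $first after a descending sort on timestamp.
    latest = a if a["last_sample_timestamp"] >= b["last_sample_timestamp"] else b
    return {
        "first_sample_timestamp": min(
            a["first_sample_timestamp"], b["first_sample_timestamp"]
        ),
        "last_sample_timestamp": max(
            a["last_sample_timestamp"], b["last_sample_timestamp"]
        ),
        "project_id": latest["project_id"],
        "user_id": latest["user_id"],
        "source": latest["source"],
        "metadata": latest["metadata"],
        "meters": a["meters"] | b["meters"],
    }


def summarize_resources(samples):
    """Group mapped samples by resource_id and reduce each group."""
    grouped = defaultdict(list)
    for sample in samples:
        key, value = map_sample(sample)
        grouped[key].append(value)
    return {
        key: reduce(reduce_summaries, values)
        for key, values in grouped.items()
    }
```

The point of the shape is that each resource is reduced to one bounded summary on its own, rather than every resource's per-sample arrays being returned together in a single aggregation result document.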