Bug 1038704 (ceil-bp-central-agent-improve) - [RFE][ceilometer]: improve the ceilometer central agent
Summary: [RFE][ceilometer]: improve the ceilometer central agent
Keywords:
Status: CLOSED NOTABUG
Alias: ceil-bp-central-agent-improve
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-ceilometer
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: ---
: ---
Assignee: Eoghan Glynn
QA Contact: Shai Revivo
URL: https://blueprints.launchpad.net/ceil...
Whiteboard: upstream_milestone_none upstream_stat...
Depends On:
Blocks: 799011 1038706
Reported: 2013-12-05 16:21 UTC by Stephen Gordon
Modified: 2016-04-27 02:45 UTC

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-08-10 14:00:41 UTC
Target Upstream Version:


Description Stephen Gordon 2013-12-05 16:21:38 UTC
Cloned from launchpad blueprint https://blueprints.launchpad.net/ceilometer/+spec/central-agent-improvement.

Description:

This is the umbrella blueprint resulting from the OpenStack Hong Kong summit design session https://etherpad.openstack.org/p/icehouse-summit-ceilometer-central-agent

Specification URL (additional information):

None

Comment 2 Eoghan Glynn 2014-01-28 13:37:44 UTC
To clarify, the main focus of this RFE is the ability to horizontally scale the ceilometer-central agent.

Currently, such scale-out is not possible without duplicating the polling of the public REST APIs from which the central agent generates sample data. Such duplication would add unnecessary traffic on the message bus, produce duplicate samples in the metering store, and could ultimately lead to double-charging the user.

Horizontal scale-out is not currently possible because there is no co-ordination mechanism to divide the workload among multiple central agents (unlike, say, the partitioned alarm evaluator).

So the focus of the implementation will be rebasing the central agent on some co-ordination protocol, with the intention that this would be sufficiently general to be implemented in Oslo[1] and shared among several services.

Tooz[2] was one potential concrete protocol considered as the underpinnings for the generic synchronization service.
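
To illustrate the kind of workload division described above, here is a minimal sketch, assuming every agent sees the same membership list (which is what a co-ordination service would be expected to provide). It does not use Tooz or any Oslo API; the names owner_index and my_share are hypothetical, and simple hash-modulo partitioning is only one possible scheme.

```python
import hashlib


def owner_index(resource_id: str, agent_count: int) -> int:
    """Map a resource to one agent slot using a stable hash."""
    digest = hashlib.sha256(resource_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % agent_count


def my_share(resources, agent_ids, me):
    """Return the subset of resources that agent `me` should poll.

    Assumption: all agents are given an identical `agent_ids` list,
    so they all agree on who owns which resource.
    """
    members = sorted(agent_ids)
    my_slot = members.index(me)
    return [r for r in resources if owner_index(r, len(members)) == my_slot]
```

With this scheme each resource is polled by exactly one agent, but a change in membership reassigns many resources at once; a real implementation might prefer a scheme that moves fewer resources when the pool changes.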

In terms of testing these improvements to the central agent, the key would be to spin up multiple instances of the agent and then check for duplication in the samples gathered for the meters that the central agent is responsible for.

For example, the image meter should be gathered once for every existing glance image in every polling period (default 600s). Samples observed more frequently than the interval defined in pipeline.yaml would suggest duplication, and hence a lack of correct support for horizontal scaling.
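
The frequency check described above could be sketched roughly as follows. This assumes the samples have already been fetched from the metering store and reduced to (resource_id, timestamp-in-seconds) pairs; the function name, the input shape and the tolerance value are all hypothetical choices for illustration.

```python
from collections import defaultdict


def find_suspect_duplicates(samples, interval_s=600, tolerance_s=5.0):
    """Flag resources whose samples arrive faster than the polling interval.

    samples: iterable of (resource_id, timestamp_seconds) pairs.
    Returns a dict of resource_id -> list of too-short gaps in seconds.
    """
    by_resource = defaultdict(list)
    for resource_id, ts in samples:
        by_resource[resource_id].append(ts)

    suspects = {}
    for resource_id, stamps in by_resource.items():
        stamps.sort()
        gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
        # A small tolerance allows for ordinary scheduling jitter.
        short = [g for g in gaps if g < interval_s - tolerance_s]
        if short:
            suspects[resource_id] = short
    return suspects
```

An empty result for a run with several agents active is consistent with the workload being divided correctly; any entry points at a resource that more than one agent is probably polling.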

Depending on the detailed mechanism used for co-ordination, testing should also assert that the pool of central agents can be grown and shrunk dynamically without causing duplication or starvation of any meters.
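
One way to phrase the "no duplication, no starvation" assertion is as an invariant over a snapshot of which agent is polling which resource. The sketch below assumes a test harness can produce such a snapshot as a plain dict; how it is obtained depends on the co-ordination mechanism, and the name check_coverage is hypothetical.

```python
from collections import Counter


def check_coverage(assignments, all_resources):
    """Report resources polled by several agents or by none.

    assignments: dict of agent_id -> iterable of resource ids.
    Returns (duplicated, starved) as two sets of resource ids.
    """
    counts = Counter(r for owned in assignments.values() for r in set(owned))
    duplicated = {r for r, n in counts.items() if n > 1}
    starved = set(all_resources) - set(counts)
    return duplicated, starved
```

A test could take a snapshot after each agent is added to or removed from the pool and assert that both returned sets are empty once the pool has settled.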

[1] https://blueprints.launchpad.net/oslo/+spec/service-sync 
[2] https://github.com/stackforge/tooz

Comment 4 Nick Barcet 2015-08-10 14:00:41 UTC
This was implemented via another RFE.

