Bug 1520694
| Summary: | Unable to calculate rates correctly when sample is handled by another controller | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | David Vallee Delisle <dvd> |
| Component: | openstack-ceilometer | Assignee: | Mehdi ABAAKOUK <mabaakou> |
| Status: | CLOSED ERRATA | QA Contact: | Sasha Smolyak <ssmolyak> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 10.0 (Newton) | CC: | djuran, jdanjou, jruzicka, mabaakou, marjones, pkundal, rlondhe, sacpatil, srevivo |
| Target Milestone: | Upstream M1 | Keywords: | FutureFeature, Triaged |
| Target Release: | 14.0 (Rocky) | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | openstack-ceilometer-10.0.1-0.20180530162349.1c02e4b.el7ost | Doc Type: | No Doc Update |
| Doc Text: | - | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-01-11 11:48:37 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
David Vallee Delisle
2017-12-05 00:29:52 UTC
We have two ways to do that:

- The current way, on the Ceilometer side: setting workload_partitioning=True. This creates many new queues on RabbitMQ to ensure that all "cpu" samples are routed to the same ceilometer-agent-notification worker. But this increases the CPU usage of ceilometer-agent-notification and the load on RabbitMQ, and adds lag to the processing. This solution is also not perfect, because samples can still arrive out of order: if a received sample is older than the previously kept one, it is dropped. The rate-of-change computation will be correct, but some points will be missing, just as when workload_partitioning=False. This feature does not have comprehensive testing, and I have reviewed many fixes upstream that have not been backported to stable versions. It also decreases the performance of Ceilometer.

- A better way, on the Gnocchi side: create a special archive policy for all rate metrics (cpu_util, network.*rate, disk.*rate, ...) that computes the "rate:last" aggregation. The calculation is better because Gnocchi keeps all the points needed to compute it correctly, so there are no more missing points in the rate-of-change computation. But it requires Gnocchi 4.X, so it cannot be used before OSP12, and the archive policy needs to be created manually.

vcpus, disk.ephemeral.size and disk.root.size are sent by nova every hour, so it is normal that you don't see them every 10 minutes. The others are the rate-metric issue I'm talking about in comments 6 and 9.

*** Bug 1525977 has been marked as a duplicate of this bug. ***

Verified, automated

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:0045
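To illustrate the limitation of the first (Ceilometer-side) approach, here is a minimal sketch of a rate-of-change transformer that keeps only the latest sample per resource and drops out-of-order samples. This is not Ceilometer's actual code; the class and method names are illustrative.

```python
class RateTransformer:
    """Compute a rate of change from cumulative samples (e.g. "cpu" time).

    Mimics the behavior described above: a sample whose timestamp is not
    newer than the previously kept one is silently dropped, so that data
    point is lost from the rate series.
    """

    def __init__(self):
        self._last = {}  # resource_id -> (timestamp, cumulative_value)

    def handle(self, resource_id, timestamp, value):
        prev = self._last.get(resource_id)
        if prev is not None and timestamp <= prev[0]:
            return None  # out-of-order sample: dropped, point is lost
        self._last[resource_id] = (timestamp, value)
        if prev is None:
            return None  # first sample for this resource: no rate yet
        return (value - prev[1]) / (timestamp - prev[0])
```

For example, with samples at t=0 and t=10 the rate is computed, but a late sample at t=5 is dropped and never contributes a point.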
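The second (Gnocchi-side) approach avoids this loss because the rate is computed server-side over all persisted points rather than sample-by-sample at ingestion. A hypothetical sketch, in the spirit of Gnocchi 4's "rate" aggregations (not Gnocchi's actual implementation):

```python
def rate_aggregate(points):
    """Compute per-interval rates over a set of stored (timestamp, value)
    points. Because the points are persisted first and sorted by timestamp
    here, samples that arrived out of order still contribute to the result.
    """
    ordered = sorted(points)  # sort by timestamp before differencing
    return [
        (t1, (v1 - v0) / (t1 - t0))
        for (t0, v0), (t1, v1) in zip(ordered, ordered[1:])
    ]
```

For example, points arriving as [(10, 50), (0, 0), (5, 20)] are sorted before differencing, so the late t=5 point still yields a rate.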