Bug 2167428
| Summary: | [RHOSP 17.1] Ceilometer doesn't send data to Gnocchi. | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Leonid Natapov <lnatapov> | ||||
| Component: | openstack-ceilometer | Assignee: | Yadnesh Kulkarni <ykulkarn> | ||||
| Status: | ON_DEV --- | QA Contact: | Leonid Natapov <lnatapov> | ||||
| Severity: | medium | Docs Contact: | mgeary <mgeary> | ||||
| Priority: | medium | ||||||
| Version: | 17.1 (Wallaby) | CC: | apevec, erpeters, jamsmith, lmadsen, mrunge, ykulkarn | ||||
| Target Milestone: | z2 | Keywords: | Reopened, Triaged | ||||
| Target Release: | 17.1 | Flags: | jamsmith: needinfo? (ykulkarn) | ||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Known Issue | |||||
| Doc Text: | During a new deployment, the keystone service is often not available when the agent-notification service is initializing. This prevents ceilometer from discovering the gnocchi endpoint. As a result, metrics are not sent to gnocchi. | Story Points: | --- | ||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2023-04-17 12:05:41 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: | |||||||
The archive policy used in the configuration is `ceilometer-high`
~~~
# cat pipeline.yaml
---
sources:
    - name: meter_source
      meters:
          - "*"
      sinks:
          - meter_sink
sinks:
    - name: meter_sink
      publishers:
          - gnocchi://?filter_project=service&archive_policy=ceilometer-high
          - notifier://172.17.1.73:5666/?driver=amqp&topic=osp17-metering

# cat event_pipeline.yaml
---
sources:
    - name: event_source
      events:
          - "*"
      sinks:
          - event_sink
sinks:
    - name: event_sink
      transformers:
      triggers:
      publishers:
          - gnocchi://?filter_project=service&archive_policy=ceilometer-high
          - notifier://172.17.1.73:5666/?driver=amqp&topic=osp17-event
~~~
No such archive policy exists in gnocchi. It should have been created during `ceilometer-upgrade`.
However, nothing is logged about the incoming metrics referencing an undefined archive policy.
~~~
$ openstack metric archive-policy list
+--------+-------------+-----------------------------------------------------------------------+---------------------------------+
| name | back_window | definition | aggregation_methods |
+--------+-------------+-----------------------------------------------------------------------+---------------------------------+
| bool | 3600 | - timespan: 365 days, 0:00:00, granularity: 0:00:01, points: 31536000 | last |
| high | 0 | - timespan: 1:00:00, granularity: 0:00:01, points: 3600 | min, mean, count, max, sum, std |
| | | - timespan: 7 days, 0:00:00, granularity: 0:01:00, points: 10080 | |
| | | - timespan: 365 days, 0:00:00, granularity: 1:00:00, points: 8760 | |
| low | 0 | - timespan: 30 days, 0:00:00, granularity: 0:05:00, points: 8640 | min, mean, count, max, sum, std |
| medium | 0 | - timespan: 7 days, 0:00:00, granularity: 0:01:00, points: 10080 | min, mean, count, max, sum, std |
| | | - timespan: 365 days, 0:00:00, granularity: 1:00:00, points: 8760 | |
+--------+-------------+-----------------------------------------------------------------------+---------------------------------+
~~~
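Since nothing is logged in this situation, one way to spot the mismatch is to compare the archive policies named in the pipeline publishers with the ones gnocchi reports. The following is a minimal sketch, not part of ceilometer: `archive_policy_of` and `missing_policies` are hypothetical helper names, and the policy names are copied by hand from the listing above.

```python
from urllib.parse import parse_qs, urlsplit


def archive_policy_of(publisher_url):
    """Return the archive_policy query parameter of a publisher URL, or None."""
    query = parse_qs(urlsplit(publisher_url).query)
    values = query.get("archive_policy")
    return values[0] if values else None


def missing_policies(publisher_urls, existing_policies):
    """Return policies that gnocchi:// publishers reference but gnocchi lacks."""
    wanted = set()
    for url in publisher_urls:
        if urlsplit(url).scheme != "gnocchi":
            continue  # only gnocchi:// publishers name an archive policy
        policy = archive_policy_of(url)
        if policy is not None:
            wanted.add(policy)
    return sorted(wanted - set(existing_policies))


# Publishers taken from pipeline.yaml in this report.
publishers = [
    "gnocchi://?filter_project=service&archive_policy=ceilometer-high",
    "notifier://172.17.1.73:5666/?driver=amqp&topic=osp17-metering",
]
# Policy names copied by hand from the archive-policy listing above.
existing = ["bool", "high", "low", "medium"]

print(missing_policies(publishers, existing))  # prints ['ceilometer-high']
```

An empty result would mean every policy referenced by a gnocchi publisher already exists.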
After restarting the notification agent on one of the controller nodes, the missing policies were created:
~~~
$ openstack metric archive-policy list
+----------------------+-------------+-----------------------------------------------------------------------+---------------------------------+
| name | back_window | definition | aggregation_methods |
+----------------------+-------------+-----------------------------------------------------------------------+---------------------------------+
| bool | 3600 | - timespan: 365 days, 0:00:00, granularity: 0:00:01, points: 31536000 | last |
| ceilometer-high | 0 | - timespan: 1:00:00, granularity: 0:00:01, points: 3600 | mean |
| | | - timespan: 1 day, 0:00:00, granularity: 0:01:00, points: 1440 | |
| | | - timespan: 365 days, 0:00:00, granularity: 1:00:00, points: 8760 | |
| ceilometer-high-rate | 0 | - timespan: 1:00:00, granularity: 0:00:01, points: 3600 | mean, rate:mean |
| | | - timespan: 1 day, 0:00:00, granularity: 0:01:00, points: 1440 | |
| | | - timespan: 365 days, 0:00:00, granularity: 1:00:00, points: 8760 | |
| ceilometer-low | 0 | - timespan: 30 days, 0:00:00, granularity: 0:05:00, points: 8640 | mean |
| ceilometer-low-rate | 0 | - timespan: 30 days, 0:00:00, granularity: 0:05:00, points: 8640 | mean, rate:mean |
| high | 0 | - timespan: 1:00:00, granularity: 0:00:01, points: 3600 | mean, count, max, min, sum, std |
| | | - timespan: 7 days, 0:00:00, granularity: 0:01:00, points: 10080 | |
| | | - timespan: 365 days, 0:00:00, granularity: 1:00:00, points: 8760 | |
| low | 0 | - timespan: 30 days, 0:00:00, granularity: 0:05:00, points: 8640 | mean, count, max, min, sum, std |
| medium | 0 | - timespan: 7 days, 0:00:00, granularity: 0:01:00, points: 10080 | mean, count, max, min, sum, std |
| | | - timespan: 365 days, 0:00:00, granularity: 1:00:00, points: 8760 | |
+----------------------+-------------+-----------------------------------------------------------------------+---------------------------------+
~~~
Couldn't reproduce this issue. Closing as NOTABUG. If it reproduces consistently, I will file a new BZ.

It seems that during deployment, keystone didn't respond to ceilometer's request to obtain the gnocchi endpoint using gnocchiclient [1]
~~~
2023-05-08 18:33:49.147 14 WARNING keystoneauth.identity.generic.base [-] Failed to discover available identity versions when contacting http://172.17.1.82:5000. Attempting to parse version from URL.: keystoneauth1.exceptions.connection.ConnectTimeout: Request to http://172.17.1.82:5000 timed out
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base [-] Unable to load publisher gnocchi://?filter_project=service&archive_policy=ceilometer-high: keystoneauth1.exceptions.discovery.DiscoveryFailure: Could not find versioned identity endpoints when attempting to authenticate. Please check that your auth_url is correct. Request to http://172.17.1.82:5000 timed out
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base Traceback (most recent call last):
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 445, in _make_request
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base six.raise_from(e, None)
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base File "<string>", line 3, in raise_from
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 440, in _make_request
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base httplib_response = conn.getresponse()
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base File "/usr/lib64/python3.9/http/client.py", line 1377, in getresponse
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base response.begin()
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base File "/usr/lib64/python3.9/http/client.py", line 320, in begin
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base version, status, reason = self._read_status()
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base File "/usr/lib64/python3.9/http/client.py", line 281, in _read_status
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base File "/usr/lib64/python3.9/socket.py", line 704, in readinto
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base return self._sock.recv_into(b)
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base socket.timeout: timed out
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base
~~~
Since ceilometer couldn't get a gnocchiclient [2] with proper auth values, it couldn't create the necessary archive policies [3].
Restarting the agent_notification service after deployment fixes this, because by that time keystone is healthy and responding.
This seems intermittent because the ceilometer and gnocchi services are spawned during deployment steps 4 and 5, by which time keystone should be completely operational.

[1] https://github.com/openstack/ceilometer/blob/stable/wallaby/ceilometer/gnocchi_client.py#L36-L39
[2] https://github.com/openstack/ceilometer/blob/stable/wallaby/ceilometer/publisher/gnocchi.py#L216-L217
[3] https://github.com/openstack/ceilometer/blob/stable/wallaby/ceilometer/publisher/gnocchi.py#L252

Since we saw this issue happening again, I am going to resurrect this bug. I will lower the priority and severity to medium, since there is a clear workaround for this issue and it seems to happen only sometimes. Probably a race condition?

Bulk moving target milestone to GA after the release of Beta on 14th June '23.

Shifting this to 17.1 z2 because z1 is only for urgent bugs and this has missed beta and GA.

Root cause of this issue:
~~~
2023-08-06 09:47:24.788 14 WARNING keystoneauth.identity.generic.base [-] Failed to discover available identity versions when contacting http://172.17.1.57:5000. Attempting to parse version from URL.: keystoneauth1.exceptions.connection.ConnectTimeout: Request to http://172.17.1.57:5000 timed out
2023-08-06 09:47:24.792 14 ERROR ceilometer.pipeline.base [-] Unable to load publisher gnocchi://?filter_project=service&archive_policy=ceilometer-high: keystoneauth1.exceptions.discovery.DiscoveryFailure: Could not find versioned identity endpoints when attempting to authenticate. Please check that your auth_url is correct. Request to http://172.17.1.57:5000 timed out
~~~
When the notification service is initialized while the keystone service is unavailable, ceilometer cannot fetch the gnocchi endpoint and assumes the gnocchi publisher is invalid.
If ceilometer fails to load a publisher, it does not send metrics to it, which is why no metrics are found in gnocchi.
Restarting the notification service reloads all publishers, which fixes this issue.
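
The failure mode described here is a one-shot load at startup that loses a race with keystone. As an illustration only, the sketch below shows how a load step could be retried with a growing delay instead of being abandoned after the first timeout. This is not ceilometer code and not the fix for this bug: `load_with_retry` and `FlakyLoader` are hypothetical names, and the attempt count and delays are arbitrary assumptions.

```python
import time


def load_with_retry(load, attempts=5, delay=2.0, sleep=time.sleep):
    """Call load() until it succeeds or the attempts run out.

    The delay doubles after each failure. If every attempt fails,
    the last exception is re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            return load()
        except Exception:
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= 2


class FlakyLoader:
    """Hypothetical stand-in for a publisher loader that needs keystone."""

    def __init__(self, failures):
        self.failures = failures  # number of initial calls that time out
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("Request to keystone timed out")
        return "gnocchi publisher loaded"


loader = FlakyLoader(failures=2)
waits = []  # record the requested delays instead of actually sleeping
result = load_with_retry(loader, sleep=waits.append)
print(result, loader.calls, waits)  # prints: gnocchi publisher loaded 3 [2.0, 4.0]
```

The `sleep` parameter is injectable only so that the example runs instantly. With the default `time.sleep`, the same call would wait 2 and then 4 seconds before the third attempt.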
Created attachment 1942553
ceilometer config

OSP17.1 | Ceilometer doesn't send data to Gnocchi.

I have freshly installed OSP17.1 with two instances up and running and ceilometer configured to send data to gnocchi.
The `gnocchi metric list` command returns an empty list.
After restarting the ceilometer container, metrics start to flow.

W/A: Restart the ceilometer container.

Attached files:
1. ceilometer conf files
2. ceilometer logs
3. gnocchi logs
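
When going through ceilometer logs like the ones attached, the publisher load failure can be picked out by searching for the `Unable to load publisher` error. A minimal sketch follows, assuming the log lines have the same shape as the ones quoted in this report. `failed_publishers` is a hypothetical helper, and the pattern is derived from those sample lines only.

```python
import re

# Pattern based solely on the ERROR lines quoted in this report (assumption).
PUBLISHER_FAILURE = re.compile(
    r"ERROR ceilometer\.pipeline\.base \[-\] "
    r"Unable to load publisher (?P<url>\S+?): "
)


def failed_publishers(lines):
    """Return publisher URLs reported as not loadable, in order, without duplicates."""
    seen = []
    for line in lines:
        match = PUBLISHER_FAILURE.search(line)
        if match and match.group("url") not in seen:
            seen.append(match.group("url"))
    return seen


# Two log lines copied from this report.
sample = [
    "2023-08-06 09:47:24.788 14 WARNING keystoneauth.identity.generic.base [-] "
    "Failed to discover available identity versions when contacting "
    "http://172.17.1.57:5000. Attempting to parse version from URL.: "
    "keystoneauth1.exceptions.connection.ConnectTimeout: "
    "Request to http://172.17.1.57:5000 timed out",
    "2023-08-06 09:47:24.792 14 ERROR ceilometer.pipeline.base [-] "
    "Unable to load publisher "
    "gnocchi://?filter_project=service&archive_policy=ceilometer-high: "
    "keystoneauth1.exceptions.discovery.DiscoveryFailure: Could not find "
    "versioned identity endpoints when attempting to authenticate.",
]

for url in failed_publishers(sample):
    print(url)
```

Any URL this prints names a publisher that, per the analysis in this report, stays unloaded until the notification agent is restarted.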