Bug 1983912 - STF | Ceilometer metrics are not delivered to the STF server after STF server was down and came up again.
Keywords:
Status: NEW
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-ceilometer
Version: 16.2 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: z2
Target Release: 17.1
Assignee: Yadnesh Kulkarni
QA Contact: Leonid Natapov
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-07-20 07:11 UTC by Leonid Natapov
Modified: 2023-08-09 06:40 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-03-16 12:36:25 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OSP-6316 0 None None None 2021-11-15 13:06:14 UTC

Description Leonid Natapov 2021-07-20 07:11:14 UTC
STF | Ceilometer metrics are not delivered to the STF server after STF server was down and came up again.

Scenario:

1. An OSP deployment with STF successfully sends ceilometer metrics to the STF server.
2. The OCP cluster hosting the STF server goes down for several hours.
3. Ceilometer cannot send metrics because the STF server is down, and the ceilometer logs show a "timed out" message.
4. The STF server comes back up, but ceilometer still cannot send metrics, and the same "timed out" message appears in the logs.

Metrics started reaching the server side only after the metrics_qdr container was restarted manually.

-----------------------------
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging [-] Could not send notification to osp162-metering. Payload={'message_id': '8f39945d-261b-4f95-9ec0-36e6e215ed52', 'publisher_id': 'telemetry.publisher.controller-1.redhat.local', 'event_type': 'metering', 'priority': 'SAMPLE', 'payload': [{'source': 'openstack', 'counter_name': 'disk.device.read.requests', 'counter_type': 'cumulative', 'counter_unit': 'request', 'counter_volume': 861, 'user_id': '0182fc3ff5eb4bf79ce6fa4cf2c57e04', 'project_id': 'bcffa5d120ee4f1cad7ed883856735b0', 'resource_id': 'c666091a-1856-46d3-930e-fe6c07df43a0-vda', 'timestamp': '2021-07-19T17:20:16.942127', 'resource_metadata': {'display_name': 'workload_instance_1', 'name': 'instance-00000005', 'instance_id': 'c666091a-1856-46d3-930e-fe6c07df43a0', 'instance_type': 'workload_flavor_1', 'host': 'e18e558a7c9da2666a88a59abf307a2f6306cc4b2b878b58ecf5ce0f', 'instance_host': 'compute-1.redhat.local', 'flavor': {'id': '42bb29f4-6de8-4352-9839-3593636a71ff', 'name': 'workload_flavor_1', 'vcpus': 1, 'ram': 512, 'disk': 5, 'ephemeral': 0, 'swap': 0}, 'status': 'active', 'state': 'running', 'task_state': '', 'image': {'id': '936668a5-1dad-4ce7-a484-1be3d4acff0c'}, 'image_ref': '936668a5-1dad-4ce7-a484-1be3d4acff0c', 'image_ref_url': None, 'architecture': 'x86_64', 'os_type': 'hvm', 'vcpus': 1, 'memory_mb': 512, 'disk_gb': 5, 'ephemeral_gb': 0, 'root_gb': 5, 'disk_name': 'vda'}, 'message_id': '9612dd84-e8b5-11eb-8263-5254007a033a', 'monotonic_time': None, 'message_signature': '4cf94a7b9134baba0e06f5f6ac75bbe009762295fe6c5e2e1dce57a50c90e258'}], 'timestamp': '2021-07-19 17:20:17.061598'}: oslo_messaging.exceptions.MessageDeliveryFailure: Notify message sent to <Target topic=osp162-metering.sample> failed: timed out
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging Traceback (most recent call last):
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging   File "/usr/lib/python3.6/site-packages/oslo_messaging/notify/messaging.py", line 69, in notify
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging     retry=retry)
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging   File "/usr/lib/python3.6/site-packages/oslo_messaging/transport.py", line 136, in _send_notification
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging     retry=retry)
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging   File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/impl_amqp1.py", line 295, in wrap
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging     return func(self, *args, **kws)
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging   File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/impl_amqp1.py", line 397, in send_notification
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging     raise rc
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging oslo_messaging.exceptions.MessageDeliveryFailure: Notify message sent to <Target topic=osp162-metering.sample> failed: timed out
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging
2021-07-19 17:20:47.797 16 WARNING oslo_messaging._drivers.amqp1_driver.controller [-] Notify message sent to <Target topic=osp162-metering.sample> failed: timed out
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging [-] Could not send notification to osp162-metering. Payload={'message_id': 'f2841cd9-2b8e-430f-a76c-23f55ddc0571', 'publisher_id': 'telemetry.publisher.controller-1.redhat.local', 'event_type': 'metering', 'priority': 'SAMPLE', 'payload': [{'source': 'openstack', 'counter_name': 'network.incoming.packets', 'counter_type': 'cumulative', 'counter_unit': 'packet', 'counter_volume': 89, 'user_id': '0182fc3ff5eb4bf79ce6fa4cf2c57e04', 'project_id': 'bcffa5d120ee4f1cad7ed883856735b0', 'resource_id': 'instance-00000005-c666091a-1856-46d3-930e-fe6c07df43a0-tapb0001a7d-b8', 'timestamp': '2021-07-19T17:20:16.981129', 'resource_metadata': {'display_name': 'workload_instance_1', 'name': 'tapb0001a7d-b8', 'instance_id': 'c666091a-1856-46d3-930e-fe6c07df43a0', 'instance_type': 'workload_flavor_1', 'host': 'e18e558a7c9da2666a88a59abf307a2f6306cc4b2b878b58ecf5ce0f', 'instance_host': 'compute-1.redhat.local', 'flavor': {'id': '42bb29f4-6de8-4352-9839-3593636a71ff', 'name': 'workload_flavor_1', 'vcpus': 1, 'ram': 512, 'disk': 5, 'ephemeral': 0, 'swap': 0}, 'status': 'active', 'state': 'running', 'task_state': '', 'image': {'id': '936668a5-1dad-4ce7-a484-1be3d4acff0c'}, 'image_ref': '936668a5-1dad-4ce7-a484-1be3d4acff0c', 'image_ref_url': None, 'architecture': 'x86_64', 'os_type': 'hvm', 'vcpus': 1, 'memory_mb': 512, 'disk_gb': 5, 'ephemeral_gb': 0, 'root_gb': 5, 'mac': 'fa:16:3e:5c:01:b8', 'fref': None, 'parameters': {'interfaceid': 'b0001a7d-b859-4e06-8c7b-66d7dc2d55f6', 'bridge': 'br-int'}, 'vnic_name': 'tapb0001a7d-b8'}, 'message_id': '96180ab6-e8b5-11eb-8263-5254007a033a', 'monotonic_time': None, 'message_signature': '861cd13453fee69c45ecab15a3a4625aab8a07ea3531ac943b13b2663e793dfd'}], 'timestamp': '2021-07-19 17:20:17.100501'}: oslo_messaging.exceptions.MessageDeliveryFailure: Notify message sent to <Target topic=osp162-metering.sample> failed: timed out
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging Traceback (most recent call last):
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging   File "/usr/lib/python3.6/site-packages/oslo_messaging/notify/messaging.py", line 69, in notify
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging     retry=retry)
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging   File "/usr/lib/python3.6/site-packages/oslo_messaging/transport.py", line 136, in _send_notification
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging     retry=retry)
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging   File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/impl_amqp1.py", line 295, in wrap
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging     return func(self, *args, **kws)
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging   File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/impl_amqp1.py", line 397, in send_notification
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging     raise rc
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging oslo_messaging.exceptions.MessageDeliveryFailure: Notify message sent to <Target topic=osp162-metering.sample> failed: timed out
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging
2021-07-19 17:20:47.799 16 WARNING oslo_messaging._drivers.amqp1_driver.controller [-] Notify message sent to <Target topic=osp162-metering.sample> failed: timed out
2021-07-19 17:20:47.799 16 ERROR oslo_messaging.notify.messaging [-] Could not send notification to osp162-metering. Payload={'message_id': '8fac7376-665d-45eb-94aa-9ad9a150b2b2', 'publisher_id': 'telemetry.publisher.controller-1.redhat.local', 'event_type': 'metering', 'priority': 'SAMPLE', 'payload': [{'source': 'openstack', 'counter_name': 'disk.device.read.bytes', 'counter_type': 'cumulative', 'counter_unit': 'B', 'counter_volume': 23407104, 'user_id': '0182fc3ff5eb4bf79ce6fa4cf2c57e04', 'project_id': 'bcffa5d120ee4f1cad7ed883856735b0', 'resource_id': 'be278c8b-b0eb-4c5d-bd00-fbc5f3305421-vda', 'timestamp': '2021-07-19T17:20:17.073441', 'resource_metadata': {'display_name': 'workload_instance_0', 'name': 'instance-00000002', 'instance_id': 'be278c8b-b0eb-4c5d-bd00-fbc5f3305421', 'instance_type': 'workload_flavor_0', 'host': 'fbf73fd808d0c19d9d9931c48e3fe04034ad6a659cb2364237bb62bb', 'instance_host': 'compute-0.redhat.local', 'flavor': {'id': 'f76ffd91-b230-48f8-b9c6-5cc97398dc96', 'name': 'workload_flavor_0', 'vcpus': 1, 'ram': 512, 'disk': 5, 'ephemeral': 0, 'swap': 0}, 'status': 'active', 'state': 'running', 'task_state': '', 'image': {'id': '6910f5ab-f613-4a25-b81f-1df38faa22b8'}, 'image_ref': '6910f5ab-f613-4a25-b81f-1df38faa22b8', 'image_ref_url': None, 'architecture': 'x86_64', 'os_type': 'hvm', 'vcpus': 1, 'memory_mb': 512, 'disk_gb': 5, 'ephemeral_gb': 0, 'root_gb': 5, 'disk_name': 'vda'}, 'message_id': '962761d2-e8b5-11eb-a6c7-5254005b0f0d', 'monotonic_time': None, 'message_signature': 'af66eb9857c9254280a99dbf6f5332a2dc19099e458978942dc678b5c8262a75'}, {'source': 'openstack', 'counter_name': 'disk.device.read.bytes', 'counter_type': 'cumulative', 'counter_unit': 'B', 'counter_volume': 23407104, 'user_id': '0182fc3ff5eb4bf79ce6fa4cf2c57e04', 'project_id': 'bcffa5d120ee4f1cad7ed883856735b0', 'resource_id': 'b7d0ad79-c92f-4ea8-a66d-02d000c2f28c-vda', 'timestamp': '2021-07-19T17:20:17.073441', 'resource_metadata': {'display_name': 'leonid', 'name': 'instance-00000008', 'instance_id': 'b7d0ad79-c92f-4ea8-a66d-02d000c2f28c', 'instance_type': 'workload_flavor_0', 'host': 'fbf73fd808d0c19d9d9931c48e3fe04034ad6a659cb2364237bb62bb', 'instance_host': 'compute-0.redhat.local', 'flavor': {'id': 'f76ffd91-b230-48f8-b9c6-5cc97398dc96', 'name': 'workload_flavor_0', 'vcpus': 1, 'ram': 512, 'disk': 5, 'ephemeral': 0, 'swap': 0}, 'status': 'active', 'state': 'running', 'task_state': '', 'image': {'id': '6910f5ab-f613-4a25-b81f-1df38faa22b8'}, 'image_ref': '6910f5ab-f613-4a25-b81f-1df38faa22b8', 'image_ref_url': None, 'architecture': 'x86_64', 'os_type': 'hvm', 'vcpus': 1, 'memory_mb': 512, 'disk_gb': 5, 'ephemeral_gb': 0, 'root_gb': 5, 'disk_name': 'vda'}, 'message_id': '96289eb2-e8b5-11eb-a6c7-5254005b0f0d', 'monotonic_time': None, 'message_signature': 'dbe9149ee8276d68f1be58762e09920796f0ae8823fb0d6f0baf579e56682266'}], 'timestamp': '2021-07-19 17:20:17.629814'}: oslo_messaging.exceptions.MessageDeliveryFailure: Notify message sent to <Target topic=osp162-metering.sample> failed: timed out
2021-07-19 17:20:47.799 16 ERROR oslo_messaging.notify.messaging Traceback (most recent call last):
2021-07-19 17:20:47.799 16 ERROR oslo_messaging.notify.messaging   File "/usr/lib/python3.6/site-packages/oslo_messaging/notify/messaging.py", line 69, in notify
2021-07-19 17:20:47.799 16 ERROR oslo_messaging.notify.messaging     retry=retry)
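
The log excerpt above is long because every failed notification dumps its full payload. For triage, it may help to reduce such a log to one row per topic and failure reason. The following is only a minimal sketch, assuming the log keeps the line format quoted in this report; the names FAILURE_RE and summarize_failures are hypothetical and not part of ceilometer or oslo.messaging. It counts only the one-line WARNING summaries, so each failure is counted once even though the ERROR lines repeat the same message.

```python
import re
from collections import Counter

# Matches the one-line WARNING summary quoted in this report, for example:
#   2021-07-19 17:20:47.797 16 WARNING oslo_messaging._drivers.amqp1_driver.controller
#   [-] Notify message sent to <Target topic=osp162-metering.sample> failed: timed out
# Assumption: the log keeps exactly this layout.
FAILURE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \d+ WARNING "
    r"\S+ \[-\] Notify message sent to <Target topic=(?P<topic>[^>]+)> "
    r"failed: (?P<reason>.+)$"
)


def summarize_failures(lines):
    """Return {(topic, reason): (count, first_timestamp, last_timestamp)}."""
    counts = Counter()
    first_seen = {}
    last_seen = {}
    for line in lines:
        match = FAILURE_RE.match(line.strip())
        if match is None:
            # ERROR lines, traceback frames and unrelated lines are skipped.
            continue
        key = (match["topic"], match["reason"])
        counts[key] += 1
        first_seen.setdefault(key, match["ts"])
        last_seen[key] = match["ts"]
    return {key: (n, first_seen[key], last_seen[key]) for key, n in counts.items()}


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python summarize.py < ceilometer-log-file
    for (topic, reason), (n, first, last) in summarize_failures(sys.stdin).items():
        print(f"{topic}: {n} x '{reason}' between {first} and {last}")
```

If the last timestamp keeps advancing after the STF server is reachable again, that would match the behaviour described in step 4 of the scenario.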

Comment 11 Leonid Natapov 2023-03-16 12:36:25 UTC
I am closing this issue as NOTABUG because I have tested with OSP 17.1 and STF 1.5. After I restarted the worker nodes on the OCP cluster and they came back up, I could see metrics flowing from ceilometer and collectd. If the described issue is reproduced at some point, I will file a new bug.

