I am closing this issue as NOTABUG. I tested with OSP 17.1 and STF 1.5: after restarting the worker nodes of the OCP cluster and waiting for them to come back up, I could see metrics flowing from both ceilometer and collectd. If the described issue is reproduced at some point, I will file a new bug.
STF | Ceilometer metrics are not delivered to the STF server after the STF server goes down and comes back up.

Scenario:
1. An OSP deployment with STF successfully sends ceilometer metrics to the STF server.
2. The OCP cluster hosting the STF server goes down for several hours.
3. Ceilometer is unable to send metrics because the STF server is down, and a "timed out" message appears in the ceilometer logs.
4. The STF server comes back up, but ceilometer is still unable to send metrics and logs the same "timed out" message.

Only after manually restarting the metrics_qdr container did ceilometer metrics start reaching the server side.
-----------------------------
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging [-] Could not send notification to osp162-metering. Payload={'message_id': '8f39945d-261b-4f95-9ec0-36e6e215ed52', 'publisher_id': 'telemetry.publisher.controller-1.redhat.local', 'event_type': 'metering', 'priority': 'SAMPLE', 'payload': [{'source': 'openstack', 'counter_name': 'disk.device.read.requests', 'counter_type': 'cumulative', 'counter_unit': 'request', 'counter_volume': 861, 'user_id': '0182fc3ff5eb4bf79ce6fa4cf2c57e04', 'project_id': 'bcffa5d120ee4f1cad7ed883856735b0', 'resource_id': 'c666091a-1856-46d3-930e-fe6c07df43a0-vda', 'timestamp': '2021-07-19T17:20:16.942127', 'resource_metadata': {'display_name': 'workload_instance_1', 'name': 'instance-00000005', 'instance_id': 'c666091a-1856-46d3-930e-fe6c07df43a0', 'instance_type': 'workload_flavor_1', 'host': 'e18e558a7c9da2666a88a59abf307a2f6306cc4b2b878b58ecf5ce0f', 'instance_host': 'compute-1.redhat.local', 'flavor': {'id': '42bb29f4-6de8-4352-9839-3593636a71ff', 'name': 'workload_flavor_1', 'vcpus': 1, 'ram': 512, 'disk': 5, 'ephemeral': 0, 'swap': 0}, 'status': 'active', 'state': 'running', 'task_state': '', 'image': {'id': '936668a5-1dad-4ce7-a484-1be3d4acff0c'}, 'image_ref': '936668a5-1dad-4ce7-a484-1be3d4acff0c', 'image_ref_url': None, 'architecture': 'x86_64', 'os_type': 'hvm', 'vcpus': 1, 
'memory_mb': 512, 'disk_gb': 5, 'ephemeral_gb': 0, 'root_gb': 5, 'disk_name': 'vda'}, 'message_id': '9612dd84-e8b5-11eb-8263-5254007a033a', 'monotonic_time': None, 'message_signature': '4cf94a7b9134baba0e06f5f6ac75bbe009762295fe6c5e2e1dce57a50c90e258'}], 'timestamp': '2021-07-19 17:20:17.061598'}: oslo_messaging.exceptions.MessageDeliveryFailure: Notify message sent to <Target topic=osp162-metering.sample> failed: timed out
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging Traceback (most recent call last):
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging File "/usr/lib/python3.6/site-packages/oslo_messaging/notify/messaging.py", line 69, in notify
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging retry=retry)
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging File "/usr/lib/python3.6/site-packages/oslo_messaging/transport.py", line 136, in _send_notification
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging retry=retry)
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/impl_amqp1.py", line 295, in wrap
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging return func(self, *args, **kws)
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/impl_amqp1.py", line 397, in send_notification
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging raise rc
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging oslo_messaging.exceptions.MessageDeliveryFailure: Notify message sent to <Target topic=osp162-metering.sample> failed: timed out
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging
2021-07-19 17:20:47.797 16 WARNING oslo_messaging._drivers.amqp1_driver.controller [-] Notify message sent to <Target topic=osp162-metering.sample> failed: timed out
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging [-] Could not send notification to osp162-metering. Payload={'message_id': 'f2841cd9-2b8e-430f-a76c-23f55ddc0571', 'publisher_id': 'telemetry.publisher.controller-1.redhat.local', 'event_type': 'metering', 'priority': 'SAMPLE', 'payload': [{'source': 'openstack', 'counter_name': 'network.incoming.packets', 'counter_type': 'cumulative', 'counter_unit': 'packet', 'counter_volume': 89, 'user_id': '0182fc3ff5eb4bf79ce6fa4cf2c57e04', 'project_id': 'bcffa5d120ee4f1cad7ed883856735b0', 'resource_id': 'instance-00000005-c666091a-1856-46d3-930e-fe6c07df43a0-tapb0001a7d-b8', 'timestamp': '2021-07-19T17:20:16.981129', 'resource_metadata': {'display_name': 'workload_instance_1', 'name': 'tapb0001a7d-b8', 'instance_id': 'c666091a-1856-46d3-930e-fe6c07df43a0', 'instance_type': 'workload_flavor_1', 'host': 'e18e558a7c9da2666a88a59abf307a2f6306cc4b2b878b58ecf5ce0f', 'instance_host': 'compute-1.redhat.local', 'flavor': {'id': '42bb29f4-6de8-4352-9839-3593636a71ff', 'name': 'workload_flavor_1', 'vcpus': 1, 'ram': 512, 'disk': 5, 'ephemeral': 0, 'swap': 0}, 'status': 'active', 'state': 'running', 'task_state': '', 'image': {'id': '936668a5-1dad-4ce7-a484-1be3d4acff0c'}, 'image_ref': '936668a5-1dad-4ce7-a484-1be3d4acff0c', 'image_ref_url': None, 'architecture': 'x86_64', 'os_type': 'hvm', 'vcpus': 1, 'memory_mb': 512, 'disk_gb': 5, 'ephemeral_gb': 0, 'root_gb': 5, 'mac': 'fa:16:3e:5c:01:b8', 'fref': None, 'parameters': {'interfaceid': 'b0001a7d-b859-4e06-8c7b-66d7dc2d55f6', 'bridge': 'br-int'}, 'vnic_name': 'tapb0001a7d-b8'}, 'message_id': '96180ab6-e8b5-11eb-8263-5254007a033a', 'monotonic_time': None, 'message_signature': '861cd13453fee69c45ecab15a3a4625aab8a07ea3531ac943b13b2663e793dfd'}], 'timestamp': '2021-07-19 17:20:17.100501'}: oslo_messaging.exceptions.MessageDeliveryFailure: Notify message sent to <Target topic=osp162-metering.sample> failed: timed out
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging Traceback (most recent call last):
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging File "/usr/lib/python3.6/site-packages/oslo_messaging/notify/messaging.py", line 69, in notify
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging retry=retry)
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging File "/usr/lib/python3.6/site-packages/oslo_messaging/transport.py", line 136, in _send_notification
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging retry=retry)
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/impl_amqp1.py", line 295, in wrap
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging return func(self, *args, **kws)
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/impl_amqp1.py", line 397, in send_notification
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging raise rc
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging oslo_messaging.exceptions.MessageDeliveryFailure: Notify message sent to <Target topic=osp162-metering.sample> failed: timed out
2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging
2021-07-19 17:20:47.799 16 WARNING oslo_messaging._drivers.amqp1_driver.controller [-] Notify message sent to <Target topic=osp162-metering.sample> failed: timed out
2021-07-19 17:20:47.799 16 ERROR oslo_messaging.notify.messaging [-] Could not send notification to osp162-metering. Payload={'message_id': '8fac7376-665d-45eb-94aa-9ad9a150b2b2', 'publisher_id': 'telemetry.publisher.controller-1.redhat.local', 'event_type': 'metering', 'priority': 'SAMPLE', 'payload': [{'source': 'openstack', 'counter_name': 'disk.device.read.bytes', 'counter_type': 'cumulative', 'counter_unit': 'B', 'counter_volume': 23407104, 'user_id': '0182fc3ff5eb4bf79ce6fa4cf2c57e04', 'project_id': 'bcffa5d120ee4f1cad7ed883856735b0', 'resource_id': 'be278c8b-b0eb-4c5d-bd00-fbc5f3305421-vda', 'timestamp': '2021-07-19T17:20:17.073441', 'resource_metadata': {'display_name': 'workload_instance_0', 'name': 'instance-00000002', 'instance_id': 'be278c8b-b0eb-4c5d-bd00-fbc5f3305421', 'instance_type': 'workload_flavor_0', 'host': 'fbf73fd808d0c19d9d9931c48e3fe04034ad6a659cb2364237bb62bb', 'instance_host': 'compute-0.redhat.local', 'flavor': {'id': 'f76ffd91-b230-48f8-b9c6-5cc97398dc96', 'name': 'workload_flavor_0', 'vcpus': 1, 'ram': 512, 'disk': 5, 'ephemeral': 0, 'swap': 0}, 'status': 'active', 'state': 'running', 'task_state': '', 'image': {'id': '6910f5ab-f613-4a25-b81f-1df38faa22b8'}, 'image_ref': '6910f5ab-f613-4a25-b81f-1df38faa22b8', 'image_ref_url': None, 'architecture': 'x86_64', 'os_type': 'hvm', 'vcpus': 1, 'memory_mb': 512, 'disk_gb': 5, 'ephemeral_gb': 0, 'root_gb': 5, 'disk_name': 'vda'}, 'message_id': '962761d2-e8b5-11eb-a6c7-5254005b0f0d', 'monotonic_time': None, 'message_signature': 'af66eb9857c9254280a99dbf6f5332a2dc19099e458978942dc678b5c8262a75'}, {'source': 'openstack', 'counter_name': 'disk.device.read.bytes', 'counter_type': 'cumulative', 'counter_unit': 'B', 'counter_volume': 23407104, 'user_id': '0182fc3ff5eb4bf79ce6fa4cf2c57e04', 'project_id': 'bcffa5d120ee4f1cad7ed883856735b0', 'resource_id': 'b7d0ad79-c92f-4ea8-a66d-02d000c2f28c-vda', 'timestamp': '2021-07-19T17:20:17.073441', 'resource_metadata': {'display_name': 'leonid', 'name': 'instance-00000008', 'instance_id': 'b7d0ad79-c92f-4ea8-a66d-02d000c2f28c', 'instance_type': 'workload_flavor_0', 'host': 'fbf73fd808d0c19d9d9931c48e3fe04034ad6a659cb2364237bb62bb', 'instance_host': 'compute-0.redhat.local', 'flavor': {'id': 'f76ffd91-b230-48f8-b9c6-5cc97398dc96', 'name': 'workload_flavor_0', 'vcpus': 1, 'ram': 512, 'disk': 5, 'ephemeral': 0, 'swap': 0}, 'status': 'active', 'state': 'running', 'task_state': '', 'image': {'id': '6910f5ab-f613-4a25-b81f-1df38faa22b8'}, 'image_ref': '6910f5ab-f613-4a25-b81f-1df38faa22b8', 'image_ref_url': None, 'architecture': 'x86_64', 'os_type': 'hvm', 'vcpus': 1, 'memory_mb': 512, 'disk_gb': 5, 'ephemeral_gb': 0, 'root_gb': 5, 'disk_name': 'vda'}, 'message_id': '96289eb2-e8b5-11eb-a6c7-5254005b0f0d', 'monotonic_time': None, 'message_signature': 'dbe9149ee8276d68f1be58762e09920796f0ae8823fb0d6f0baf579e56682266'}], 'timestamp': '2021-07-19 17:20:17.629814'}: oslo_messaging.exceptions.MessageDeliveryFailure: Notify message sent to <Target topic=osp162-metering.sample> failed: timed out
2021-07-19 17:20:47.799 16 ERROR oslo_messaging.notify.messaging Traceback (most recent call last):
2021-07-19 17:20:47.799 16 ERROR oslo_messaging.notify.messaging File "/usr/lib/python3.6/site-packages/oslo_messaging/notify/messaging.py", line 69, in notify
2021-07-19 17:20:47.799 16 ERROR oslo_messaging.notify.messaging retry=retry)
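
For anyone trying to reproduce this, the delivery failures in the log above can be tallied with a small script to check whether timeouts are still being logged after the STF server has come back. This is only a rough sketch: the function name `count_timeouts` and the `SAMPLE` list are made up for illustration, the regular expression is written against the WARNING records pasted above, and it assumes the log file has one record per line.

```python
import re
from collections import Counter

# Pattern for the WARNING records from the oslo.messaging AMQP 1.0 driver,
# modelled on the records in this report (assumption: one record per line).
TIMEOUT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \d+ WARNING "
    r"oslo_messaging\._drivers\.amqp1_driver\.controller \[-\] "
    r"Notify message sent to <Target topic=(?P<topic>[^>]+)> failed: timed out"
)


def count_timeouts(lines):
    """Return (timeouts per topic, timestamp of the last timeout seen)."""
    counts = Counter()
    last_seen = None
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            counts[match.group("topic")] += 1
            last_seen = match.group("ts")
    return counts, last_seen


# Records copied from the log in this report (the ERROR record is ignored).
SAMPLE = [
    "2021-07-19 17:20:47.797 16 WARNING oslo_messaging._drivers.amqp1_driver.controller [-] "
    "Notify message sent to <Target topic=osp162-metering.sample> failed: timed out",
    "2021-07-19 17:20:47.797 16 ERROR oslo_messaging.notify.messaging raise rc",
    "2021-07-19 17:20:47.799 16 WARNING oslo_messaging._drivers.amqp1_driver.controller [-] "
    "Notify message sent to <Target topic=osp162-metering.sample> failed: timed out",
]

counts, last_seen = count_timeouts(SAMPLE)
print(dict(counts), last_seen)
```

To use it on a real log, pass the open file object to `count_timeouts` instead of `SAMPLE`. If the last timeout timestamp is later than the time the STF server came back, that would be consistent with the behaviour described in step 4 of the scenario.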