Bug 2130078

Summary: [RHOSP 16.1] Ceilometer-agent-compute can be started on compute nodes before libvirt, which causes ceilometer to fail to collect virt metrics
Product: Red Hat OpenStack
Reporter: Yadnesh Kulkarni <ykulkarn>
Component: openstack-tripleo-heat-templates
Assignee: Yadnesh Kulkarni <ykulkarn>
Status: CLOSED ERRATA
QA Contact: Leonid Natapov <lnatapov>
Severity: medium
Docs Contact: Joanne O'Flynn <joflynn>
Priority: high
Version: 16.1 (Train)
CC: apevec, csibbitt, jelynch, mburns
Target Milestone: z9
Keywords: Triaged
Target Release: 16.1 (Train on RHEL 8.2)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-11.3.2-1.20220928053641.29a02c1.el8ost
Doc Type: Bug Fix
Doc Text:
Before this update, the libvirt service started after the ceilometer-agent-compute service, and the ceilometer-agent-compute service could not communicate with libvirt, resulting in missing libvirt metrics. With this update, the ceilometer-agent-compute service starts after the libvirt service and can poll libvirt metrics without "Permission denied" errors.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-12-07 20:29:53 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 2092088    
Bug Blocks:    

Description Yadnesh Kulkarni 2022-09-27 06:20:07 UTC
This bug was initially created as a copy of Bug #2092088


Ceilometer-agent-compute can be started on compute nodes before libvirt has populated the corresponding directories under /var/run/libvirt, which causes ceilometer to fail to collect virt metrics.

I observed this behavior on a recent OSP 16.2 deployment, where ceilometer fails to collect virt metrics for instances.

The ceilometer compute log file shows this error:

2022-05-29 16:59:40.289 15 DEBUG ceilometer.compute.virt.libvirt.utils [-] Connecting to libvirt: qemu:///system new_libvirt_connection /usr/lib/python3.6/site-packages/ceilometer/compute/virt/libvirt/utils.py:87
2022-05-29 16:59:40.290 15 DEBUG ceilometer.polling.manager [-] Skip loading extension for perf.cache.misses: Failed to connect socket to '/var/run/libvirt/libvirt-sock-ro': Permission denied _catch_extension_load_error /usr/lib/python3.6/sit

Restarting the ceilometer_agent_compute container solves the problem.
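
To confirm the symptom before restarting the container, the read-only libvirt socket can be probed directly. The following is a minimal Python sketch and is not part of ceilometer: the function name probe_unix_socket and its return strings are hypothetical, and the only value taken from this report is the socket path /var/run/libvirt/libvirt-sock-ro. It has to run in the same context (user and container) as the agent to reflect what the agent sees.

```python
import os
import socket
import stat


def probe_unix_socket(path, timeout=2.0):
    """Classify the state of a UNIX socket path (hypothetical helper).

    Returns one of: "ok", "missing", "not-a-socket",
    "permission-denied", or "error: <reason>".
    """
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        # The path has not been created yet.
        return "missing"
    except PermissionError:
        # A parent directory is not searchable by this user.
        return "permission-denied"

    if not stat.S_ISSOCK(mode):
        return "not-a-socket"

    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(path)
        except PermissionError:
            # Same class of failure as the "Permission denied" in the log.
            return "permission-denied"
        except OSError as exc:
            return f"error: {exc.strerror}"
    return "ok"


if __name__ == "__main__":
    # Socket path taken from the log excerpt in this report.
    print(probe_unix_socket("/var/run/libvirt/libvirt-sock-ro"))
```

A result of "permission-denied" or "missing" while the agent is already running would be consistent with the start-order problem described here. It does not prove the cause by itself.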

python3-ceilometer-13.1.3-2.20210802103828.20756c9.el8ost.noarch
openstack-ceilometer-common-13.1.3-2.20210802103828.20756c9.el8ost.noarch
openstack-ceilometer-polling-13.1.3-2.20210802103828.20756c9.el8ost.noarch
openstack-ceilometer-compute-13.1.3-2.20210802103828.20756c9.el8ost.noarch

Comment 7 Leonid Natapov 2022-11-06 04:09:41 UTC
Fixed.

[Unit]
Description=ceilometer_agent_compute container
After=paunch-container-shutdown.service
Wants=tripleo_nova_libvirt.service
After=tripleo_nova_libvirt.service
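
The ordering in the unit fragment above can also be checked mechanically. The following is a rough Python sketch under stated assumptions: the helper names unit_directives and starts_after are hypothetical, the unit text is copied from this comment, and the parser is deliberately simplified (it ignores systemd features such as line continuations and drop-in files).

```python
def unit_directives(unit_text, section="Unit"):
    """Collect directives from one section of a unit file (hypothetical helper).

    Repeated keys such as After= accumulate. Values are split on
    whitespace, which suits list directives like After= and Wants=.
    """
    found = {}
    current = None
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            continue
        if current == section and "=" in line:
            key, value = line.split("=", 1)
            found.setdefault(key.strip(), []).extend(value.split())
    return found


def starts_after(unit_text, dependency):
    """Return True if the unit declares After=<dependency>."""
    return dependency in unit_directives(unit_text).get("After", [])


# Unit fragment as quoted in this comment.
UNIT = """\
[Unit]
Description=ceilometer_agent_compute container
After=paunch-container-shutdown.service
Wants=tripleo_nova_libvirt.service
After=tripleo_nova_libvirt.service
"""

print(starts_after(UNIT, "tripleo_nova_libvirt.service"))  # True
```

Both directives matter here. Wants= pulls tripleo_nova_libvirt.service in when the agent unit starts, and After= delays the agent until libvirt's start job has finished.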

Comment 16 errata-xmlrpc 2022-12-07 20:29:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenStack 16.1.9 (openstack-tripleo-heat-templates) security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:8796