Bug 2116340

Summary: [RHOSP 17.1][puppet] Collectd ceph plugin can not connect to the socket since its path has been changed in ceph 5.x. As a result ceph metrics can't be collected.
Product: Red Hat OpenStack
Reporter: Leonid Natapov <lnatapov>
Component: openstack-tripleo-heat-templates
Assignee: Yadnesh Kulkarni <ykulkarn>
Status: CLOSED ERRATA
QA Contact: Leonid Natapov <lnatapov>
Severity: high
Docs Contact: mgeary <mgeary>
Priority: high
Version: 17.1 (Wallaby)
CC: eduen, fpantano, jbadiapa, jjoyce, johfulto, jschluet, lars, lmadsen, mburns, mmagr, mrunge, pgrist, slinaber, spower, tvignaud
Target Milestone: beta
Keywords: Regression, Triaged
Target Release: 17.1
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: puppet-tripleo-14.2.3-1.20230227001309.27145bc.el9ost openstack-tripleo-heat-templates-14.3.1-1.20230412011051.2e6d826.el9ost
Doc Type: Bug Fix
Doc Text:
From RHCS 5 onwards, socket paths for Ceph services are located under "/var/run/ceph/<FSID>". Because of this change, Collectd did not retrieve any Ceph metrics: it looked for socket files directly under "/var/run/ceph/". This fix accounts for the new socket path, and Collectd can now make Ceph metrics available to its consumers.
Story Points: ---
Clone Of:
: 2150938
Environment:
Last Closed: 2023-08-16 01:11:45 UTC
Type: Bug
Bug Depends On:    
Bug Blocks: 2150938    

Description Leonid Natapov 2022-08-08 10:23:46 UTC
The latest OSP 17 compose introduced Ceph 5.2 into OSP 17.
In Ceph 5.x, the socket path changed from /var/run/ceph/ to /var/run/ceph/<FSID>, where FSID is the Ceph cluster ID.

We currently use a template that creates the conf file for the collectd ceph plugin and defines the socket path directly under /var/run/ceph:


/usr/share/openstack-puppet/modules/collectd/templates/plugin/ceph.conf.erb


<Plugin ceph>
  LongRunAvgLatency <%= @longrunavglatency %>
  ConvertSpecialMetricTypes <%= @convertspecialmetrictypes %>
<% @daemons.each do |daemon| -%>
  <Daemon "<%= daemon %>">
    SocketPath "/var/run/ceph/<%= daemon %>.asok"
  </Daemon>
<% end -%>
</Plugin>

As a result, the collectd ceph plugin can't connect to the socket, because there is no socket directly in /var/run/ceph, and no Ceph metrics are collected.

Workaround:
Manually change the socket path in the collectd ceph plugin config file so that it includes the FSID.
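
The workaround could be scripted roughly as follows. This is only a sketch, not the fix that was shipped: the helper name add_fsid_to_socket_paths, the sample FSID, and the daemon name are made up for illustration, and the code works on a string rather than on the real config file.

```ruby
# Hypothetical helper: insert the cluster FSID into every SocketPath that
# still points directly at /var/run/ceph/<daemon>.asok.
# Paths that already contain a subdirectory are left untouched.
def add_fsid_to_socket_paths(config_text, fsid)
  config_text.gsub(%r{(SocketPath\s+"/var/run/ceph/)([^/"]+\.asok")}) do
    "#{Regexp.last_match(1)}#{fsid}/#{Regexp.last_match(2)}"
  end
end

# Sample values below are invented for illustration only.
SAMPLE_FSID = "00000000-0000-0000-0000-000000000000"
SAMPLE_CONFIG = <<~CONF
  <Plugin ceph>
    <Daemon "ceph-osd.0">
      SocketPath "/var/run/ceph/ceph-osd.0.asok"
    </Daemon>
  </Plugin>
CONF

puts add_fsid_to_socket_paths(SAMPLE_CONFIG, SAMPLE_FSID)
```

Running the helper a second time on its own output changes nothing, since the pattern only matches socket files that sit directly under /var/run/ceph/.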

Comment 25 Leif Madsen 2022-12-14 16:25:47 UTC
Moving this for resolution against RHOSP 17.1.

Comment 26 Matthias Runge 2023-03-06 16:22:53 UTC
In order to get this moved forward, it'll need a downstream build, the build name-version-release put in the fixed-in-version field and then the bug moved to MODIFIED

Comment 27 Leif Madsen 2023-03-06 18:37:19 UTC
(In reply to Matthias Runge from comment #26)
> In order to get this moved forward, it'll need a downstream build, the build
> name-version-release put in the fixed-in-version field and then the bug
> moved to MODIFIED

Is that because collectd is a non-importing component?

Comment 28 Matthias Runge 2023-03-10 08:59:06 UTC
(In reply to Leif Madsen from comment #27)
> (In reply to Matthias Runge from comment #26)
> > In order to get this moved forward, it'll need a downstream build, the build
> > name-version-release put in the fixed-in-version field and then the bug
> > moved to MODIFIED
> 
> Is that because collectd is a non-importing component?

Neither puppet-collectd nor collectd is imported like the upstream OpenStack bits. Only releases are imported, via RDO. You have to have it built in RDO and then tag the release via RDO info https://review.rdoproject.org/r/q/project:rdoinfo+status:open
and then it gets pulled downstream.

Patches like this are not included in new releases, and upstream projects usually also don't care about backports.

Comment 29 Matthias Runge 2023-04-13 13:39:22 UTC
The patch is merged downstream; the package needs to include the patch and needs a rebuild.

Comment 33 Leonid Natapov 2023-05-07 15:19:39 UTC
The path to the OSD sockets now includes the FSID.


From /usr/share/openstack-puppet/modules/collectd/templates/plugin/ceph.conf.erb:

---
SocketPath "/var/run/ceph/<%= @ceph_fsid %>/<%= daemon %>.asok"
---

The ceph plugin loaded successfully and I can see OSD metrics in Prometheus.
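
To see what the updated SocketPath line expands to, the template can be rendered outside Puppet with Ruby's standard ERB library. This is a minimal sketch under assumptions: the template text is pieced together from the snippets quoted in this bug (the LongRunAvgLatency and ConvertSpecialMetricTypes lines are left out), the real file may differ, and the CephPluginConfig class, the daemon name, and the FSID value are hypothetical.

```ruby
require "erb"

# Template text assembled from the snippets quoted in this bug; the file
# actually shipped in puppet-collectd may differ in detail.
TEMPLATE = <<~'ERB_TEMPLATE'
  <Plugin ceph>
  <% @daemons.each do |daemon| -%>
    <Daemon "<%= daemon %>">
      SocketPath "/var/run/ceph/<%= @ceph_fsid %>/<%= daemon %>.asok"
    </Daemon>
  <% end -%>
  </Plugin>
ERB_TEMPLATE

# Hypothetical stand-in for the Puppet scope that provides @daemons and
# @ceph_fsid to the template.
class CephPluginConfig
  def initialize(daemons, ceph_fsid)
    @daemons = daemons
    @ceph_fsid = ceph_fsid
  end

  # trim_mode "-" makes "-%>" swallow the trailing newline, as in Puppet.
  def render
    ERB.new(TEMPLATE, trim_mode: "-").result(binding)
  end
end

# Made-up daemon name and FSID, for illustration only.
puts CephPluginConfig.new(["ceph-osd.0"], "00000000-0000-0000-0000-000000000000").render
```

Each daemon passed in gets its own Daemon block whose socket path sits under the FSID directory.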

Comment 41 errata-xmlrpc 2023-08-16 01:11:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.1 (Wallaby)), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:4577