Bug 1516285 - collectd ceph plugin crashes
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: collectd
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: Upstream M1
Target Release: 13.0 (Queens)
Assigned To: Matthias Runge
QA Contact: Leonid Natapov
Keywords: Triaged
Depends On: 1558015
Blocks:
Reported: 2017-11-22 07:17 EST by Matthias Runge
Modified: 2018-06-27 10:15 EDT (History)
See Also:
Fixed In Version: collectd-5.8.0-1.1.el7ost
Last Closed: 2018-06-27 09:08:58 EDT
Type: Bug

External Trackers
Tracker ID Priority Status Summary Last Updated
Github collectd/collectd/issues/2572 None None None 2017-11-23 10:39 EST
Red Hat Product Errata RHEA-2018:2084 None None None 2018-06-27 09:10 EDT

Description Matthias Runge 2017-11-22 07:17:45 EST
Description of problem:
Nov 22 12:53:34 euler systemd: Starting Collectd statistics daemon...
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "syslog" successfully loaded.
Nov 22 12:53:34 euler collectd: [2017-11-22 12:53:35] plugin_load: plugin "logfile" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "logfile" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: syslog: invalid loglevel [debug] defaulting to 'info'
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "cpu" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "interface" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "load" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "memory" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "write_graphite" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "ceph" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "df" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "disk" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "virt" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: Systemd detected, trying to signal readyness.
Nov 22 12:53:34 euler systemd: Started Collectd statistics daemon.
Nov 22 12:53:34 euler collectd[5437]: virt plugin: reader virt-0 initialized
Nov 22 12:53:34 euler collectd[5437]: Initialization complete, entering read-loop.
Nov 22 12:53:34 euler kernel: reader#3[5446]: segfault at 0 ip 00007f1bf12698dd sp 00007f1be5272e00 error 4 in ceph.so[7f1bf1268000+5000]
Nov 22 12:53:34 euler abrt-hook-ccpp: Process 5437 (collectd) of user 0 killed by SIGSEGV - ignoring (repeated crash)
Nov 22 12:53:34 euler libvirtd: 2017-11-22 11:53:34.570+0000: 1461: error : virNetSocketReadWire:1793 : Cannot recv data: Connection reset by peer
Nov 22 12:53:34 euler systemd: collectd.service: main process exited, code=killed, status=11/SEGV
Nov 22 12:53:34 euler systemd: Unit collectd.service entered failed state.
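
The kernel line in the log above already says where inside ceph.so the crash happened: subtracting the mapping base from the instruction pointer gives the offset within the shared object. The following is only a minimal sketch of that arithmetic. The regular expression assumes the usual x86 kernel segfault message layout, and `parse_segfault` is a hypothetical helper name, not part of collectd or any other tool.

```python
import re

# Kernel log line copied from the report above (joined onto one line).
LINE = ("reader#3[5446]: segfault at 0 ip 00007f1bf12698dd "
        "sp 00007f1be5272e00 error 4 in ceph.so[7f1bf1268000+5000]")

# Assumed layout of the x86 kernel segfault message.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Extract the faulting object and the offset of the crash inside it."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    ip = int(m["ip"], 16)      # instruction pointer at the time of the fault
    base = int(m["base"], 16)  # load address of the mapped object
    return {
        "fault_addr": int(m["addr"], 16),
        "object": m["obj"],
        "offset": ip - base,
        "size": int(m["size"], 16),
        "error_code": int(m["err"]),
    }


info = parse_segfault(LINE)
print(f"{info['object']}+{info['offset']:#x} (fault address {info['fault_addr']:#x})")
# prints: ceph.so+0x18dd (fault address 0x0)
```

A fault address of 0 together with error code 4 (a user-mode read of an unmapped page) is consistent with a NULL pointer dereference in the plugin. With matching debuginfo installed, the offset 0x18dd could be passed to a symbolizer such as addr2line to find the source line.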


Version-Release number of selected component (if applicable):
collectd-5.8
Comment 3 Matthias Runge 2017-11-27 08:36:49 EST
There is a commit message in collectd-5.8: https://github.com/collectd/collectd/commit/647ac31bf9db60b1685d6d8d25be65375ba85891#diff-20b37368527caaa7f0318870e8cefd51

"""
This patch is not backward compatible with previous ceph versions.
"""
Comment 5 Matthias Runge 2017-11-29 04:12:29 EST
It seems the crash only happens when the option ConvertSpecialMetricTypes is set to true. Explicitly setting it to false makes the plugin work even with older Ceph releases.
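
To check whether a given host already carries this workaround, one could scan its collectd configuration for the option. This is only a rough sketch: `convert_special_metric_types` is a hypothetical helper, the regular expressions are a simplification of the real collectd config grammar, and only the literal values true and false are recognized.

```python
import re


def convert_special_metric_types(conf_text):
    """Return the explicit setting as a bool, or None if it is not set."""
    # Locate the <Plugin ceph> ... </Plugin> block; quotes around the
    # plugin name are treated as optional.
    block = re.search(r'<Plugin\s+"?ceph"?\s*>(.*?)</Plugin>',
                      conf_text, re.DOTALL | re.IGNORECASE)
    if block is None:
        return None
    # Match an uncommented option line inside that block.
    opt = re.search(r'^\s*ConvertSpecialMetricTypes\s+"?(true|false)"?\s*$',
                    block.group(1), re.MULTILINE | re.IGNORECASE)
    if opt is None:
        return None
    return opt.group(1).lower() == "true"


sample = """
<Plugin "ceph">
  ConvertSpecialMetricTypes false
</Plugin>
"""
print(convert_special_metric_types(sample))
```

A result of None means the option is not set explicitly, so whatever default the installed collectd version uses applies. Per the comment above, the safe state with older Ceph releases is an explicit false.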
Comment 6 Matthias Runge 2017-12-05 02:27:28 EST
Proposed fix upstream: https://github.com/collectd/collectd/commit/de05fb53fad6bc998f585b704ca0caeadc14a035
Comment 14 Leonid Natapov 2018-02-19 04:01:49 EST
Please provide instructions on how to test and configure this.

Thank you,
Comment 15 Matthias Runge 2018-02-19 05:54:36 EST
https://collectd.org/documentation/manpages/collectd.conf.5.shtml#plugin_ceph

In my case, the ceph config file looks like:

<LoadPlugin ceph>
  Globals false
</LoadPlugin>
<Plugin "ceph">
  LongRunAvgLatency false
  ConvertSpecialMetricTypes false
  <Daemon "osd.0">
    SocketPath "/var/run/ceph/ceph-osd.0.asok"
  </Daemon>
</Plugin>

You will probably need to figure out where your ceph*.asok files are stored.

Tons of Ceph-related metrics will then show up in Grafana, all beginning with "ceph_".
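
For hosts with several Ceph daemons, finding the admin sockets and writing one Daemon block per socket can be scripted. The sketch below assumes the sockets live in /var/run/ceph and are named ceph-<daemon>.asok, as in the example config above; other deployments may use a different directory or naming scheme. `daemon_name` and `render_plugin_block` are hypothetical helper names.

```python
import glob
import os


def daemon_name(socket_path):
    """Derive a daemon name such as 'osd.0' from '.../ceph-osd.0.asok'."""
    name = os.path.basename(socket_path)
    if name.endswith(".asok"):
        name = name[:-len(".asok")]
    if name.startswith("ceph-"):
        name = name[len("ceph-"):]
    return name


def render_plugin_block(socket_paths):
    """Render a <Plugin "ceph"> block with one <Daemon> entry per socket."""
    lines = ['<Plugin "ceph">',
             '  LongRunAvgLatency false',
             '  ConvertSpecialMetricTypes false']
    for path in sorted(socket_paths):
        lines.append(f'  <Daemon "{daemon_name(path)}">')
        lines.append(f'    SocketPath "{path}"')
        lines.append('  </Daemon>')
    lines.append('</Plugin>')
    return "\n".join(lines)


# Assumed socket directory; adjust to where the ceph*.asok files actually are.
print(render_plugin_block(glob.glob("/var/run/ceph/*.asok")))
```

On a host with no sockets in that directory, the output is a Plugin block without any Daemon entries, which is a hint to look for the sockets elsewhere.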
Comment 17 Leonid Natapov 2018-05-30 03:31:33 EDT
[2018-05-30 07:00:39] plugin_load: plugin "ceph" successfully loaded.
Comment 19 errata-xmlrpc 2018-06-27 09:08:58 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2084
