Bug 1516285 - collectd ceph plugin crashes
Summary: collectd ceph plugin crashes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: collectd
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: Upstream M1
Target Release: 13.0 (Queens)
Assignee: Matthias Runge
QA Contact: Leonid Natapov
URL:
Whiteboard:
Depends On: 1558015
Blocks:
Reported: 2017-11-22 12:17 UTC by Matthias Runge
Modified: 2018-06-27 14:15 UTC (History)

Fixed In Version: collectd-5.8.0-1.1.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-27 13:08:58 UTC
Target Upstream Version:


Links
System | ID | Priority | Status | Summary | Last Updated
Github | collectd collectd issues 2572 | None | None | None | 2017-11-23 15:39:54 UTC
Red Hat Product Errata | RHEA-2018:2084 | None | None | None | 2018-06-27 13:10:40 UTC

Description Matthias Runge 2017-11-22 12:17:45 UTC
Description of problem:
Nov 22 12:53:34 euler systemd: Starting Collectd statistics daemon...
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "syslog" successfully loaded.
Nov 22 12:53:34 euler collectd: [2017-11-22 12:53:35] plugin_load: plugin "logfile" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "logfile" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: syslog: invalid loglevel [debug] defaulting to 'info'
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "cpu" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "interface" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "load" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "memory" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "write_graphite" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "ceph" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "df" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "disk" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: plugin_load: plugin "virt" successfully loaded.
Nov 22 12:53:34 euler collectd[5437]: Systemd detected, trying to signal readyness.
Nov 22 12:53:34 euler systemd: Started Collectd statistics daemon.
Nov 22 12:53:34 euler collectd[5437]: virt plugin: reader virt-0 initialized
Nov 22 12:53:34 euler collectd[5437]: Initialization complete, entering read-loop.
Nov 22 12:53:34 euler kernel: reader#3[5446]: segfault at 0 ip 00007f1bf12698dd sp 00007f1be5272e00 error 4 in ceph.so[7f1bf1268000+5000]
Nov 22 12:53:34 euler abrt-hook-ccpp: Process 5437 (collectd) of user 0 killed by SIGSEGV - ignoring (repeated crash)
Nov 22 12:53:34 euler libvirtd: 2017-11-22 11:53:34.570+0000: 1461: error : virNetSocketReadWire:1793 : Cannot recv data: Connection reset by peer
Nov 22 12:53:34 euler systemd: collectd.service: main process exited, code=killed, status=11/SEGV
Nov 22 12:53:34 euler systemd: Unit collectd.service entered failed state.
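
The kernel line in the log above already narrows the crash down: "segfault at 0" points to an access to address 0 (most likely a NULL pointer dereference), and the instruction pointer lies inside the mapping of ceph.so. A minimal sketch of how that line can be decoded into an offset within ceph.so follows. The function name decode_segfault is hypothetical, and treating "ip minus mapping base" as an offset usable with a tool such as addr2line assumes the mapping starts at file offset 0 and that matching debuginfo is available.

```python
import re

# Kernel segfault line from the log above, joined onto one line.
LINE = ("Nov 22 12:53:34 euler kernel: reader#3[5446]: segfault at 0 "
        "ip 00007f1bf12698dd sp 00007f1be5272e00 error 4 "
        "in ceph.so[7f1bf1268000+5000]")

# All numeric fields in this message are hexadecimal.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Hypothetical helper: extract object name, fault address and offset."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    offset = ip - base  # position of the faulting instruction in the mapping
    if not 0 <= offset < size:
        raise ValueError("instruction pointer lies outside the reported mapping")
    return {
        "object": m.group("obj"),
        "fault_address": int(m.group("addr"), 16),
        "offset": offset,
    }


info = decode_segfault(LINE)
print(f"{info['object']} + {info['offset']:#x}, "
      f"faulting address {info['fault_address']:#x}")
# prints: ceph.so + 0x18dd, faulting address 0x0
```

With the debuginfo package installed, the resulting offset could then be looked up in ceph.so to find the source line of the crash.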


Version-Release number of selected component (if applicable):
collectd-5.8

Comment 3 Matthias Runge 2017-11-27 13:36:49 UTC
A commit in collectd-5.8 has this note in its commit message: https://github.com/collectd/collectd/commit/647ac31bf9db60b1685d6d8d25be65375ba85891#diff-20b37368527caaa7f0318870e8cefd51

"""
This patch is not backward compatible with previous ceph versions.
"""

Comment 5 Matthias Runge 2017-11-29 09:12:29 UTC
It seems the crash only happens when the option ConvertSpecialMetricTypes is set to true. Explicitly setting it to false makes the plugin work even with older ceph releases.
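
To see whether a host is exposed to this, it may help to check how the option is set in the ceph plugin block. The following is only a rough sketch: the function name convert_special_metric_types is hypothetical, the parser is deliberately naive (it strips "#" comments and looks only inside <Plugin "ceph"> blocks), and it reports "unset" rather than guessing what collectd does when the option is missing.

```python
import re


def convert_special_metric_types(conf_text):
    """Hypothetical check: return 'true', 'false' or 'unset' for the ceph plugin."""
    in_ceph = False
    value = "unset"
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # naive comment stripping
        if re.match(r'<Plugin\s+"?ceph"?\s*>', line, re.IGNORECASE):
            in_ceph = True
        elif re.match(r"</Plugin>", line, re.IGNORECASE):
            in_ceph = False
        elif in_ceph:
            m = re.match(r"ConvertSpecialMetricTypes\s+(\S+)", line, re.IGNORECASE)
            if m:
                value = m.group(1).strip('"').lower()
    return value


SAMPLE = """
<Plugin "ceph">
  ConvertSpecialMetricTypes false
  <Daemon "osd.0">
    SocketPath "/var/run/ceph/ceph-osd.0.asok"
  </Daemon>
</Plugin>
"""

print(convert_special_metric_types(SAMPLE))  # prints: false
```

A result of "true" or "unset" on an affected collectd build with an older ceph release would be a hint to set the option to false explicitly.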

Comment 6 Matthias Runge 2017-12-05 07:27:28 UTC
Proposed fix upstream: https://github.com/collectd/collectd/commit/de05fb53fad6bc998f585b704ca0caeadc14a035

Comment 14 Leonid Natapov 2018-02-19 09:01:49 UTC
Please provide instructions on how to test and configure this.

Thank you,

Comment 15 Matthias Runge 2018-02-19 10:54:36 UTC
https://collectd.org/documentation/manpages/collectd.conf.5.shtml#plugin_ceph

In my case, the ceph config file looks like:

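<LoadPlugin ceph>
  Globals false
</LoadPlugin>
<Plugin "ceph">
  LongRunAvgLatency false
  # Explicitly disabled; with this set to true the plugin crashed (see comment 5).
  ConvertSpecialMetricTypes false
  <Daemon "osd.0">
    # Admin socket of the ceph daemon to query.
    SocketPath "/var/run/ceph/ceph-osd.0.asok"
  </Daemon>
</Plugin>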

You will probably need to find out where your ceph*.asok socket is stored.

Once the plugin is running, many ceph-related metrics will show up in Grafana, all beginning with "ceph_".
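
For hosts with several ceph daemons, the Daemon blocks could be generated from whatever admin sockets are present. This is a sketch under assumptions: the helper name daemon_blocks is hypothetical, /var/run/ceph is simply the directory used in the config above, and the daemon name is derived by assuming sockets are named "<cluster>-<daemon>.asok" (as in ceph-osd.0.asok).

```python
import glob
import os


def daemon_blocks(socket_dir="/var/run/ceph"):
    """Hypothetical helper: build one <Daemon> block per *.asok file found."""
    blocks = []
    for path in sorted(glob.glob(os.path.join(socket_dir, "*.asok"))):
        stem = os.path.basename(path)[:-len(".asok")]
        # Assumption: "<cluster>-<daemon>.asok", e.g. "ceph-osd.0.asok" -> "osd.0".
        name = stem.split("-", 1)[1] if "-" in stem else stem
        blocks.append(
            f'<Daemon "{name}">\n'
            f'  SocketPath "{path}"\n'
            f"</Daemon>"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    # Prints nothing if the directory does not exist or holds no sockets.
    print(daemon_blocks())
```

The output is meant to be pasted inside the <Plugin "ceph"> block; it should be reviewed by hand, since socket naming can differ between deployments.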

Comment 17 Leonid Natapov 2018-05-30 07:31:33 UTC
[2018-05-30 07:00:39] plugin_load: plugin "ceph" successfully loaded.

Comment 19 errata-xmlrpc 2018-06-27 13:08:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2084

