Bug 2226978

Summary: collectd-write_riemann segfaults due to upstream bug
Product: [Fedora] Fedora EPEL
Reporter: Ade Rixon <ade.rixon>
Component: collectd
Assignee: Ruben Kerkhof <ruben>
Status: NEW
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: unspecified
Version: epel9
CC: jskarvad, kevin, mail, mhlavink, ruben
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Type: Bug

Description Ade Rixon 2023-07-27 09:08:57 UTC
Description of problem:
collectd-write_riemann-5.12.0 causes repeated segfaults on RHEL 9.2 due to upstream bug #4050 (https://github.com/collectd/collectd/issues/4050)

Version-Release number of selected component (if applicable):
5.12.0-24

How reproducible:
Install collectd & collectd-write_riemann, configure write_riemann plugin to send metrics to remote Riemann host via TCP.

Steps to Reproduce:
1. Install collectd & collectd-write_riemann
2. Add config for write_riemann plugin, e.g.:
LoadPlugin write_riemann
<Plugin write_riemann>
       <Node "central">
               Host "riemann.server"
               Port 5555
               Protocol TCP
               Batch true
               BatchMaxSize 8192
               StoreRates true
               AlwaysAppendDS false
               TTLFactor 2.0
               Notifications true
               CheckThresholds false
       </Node>
</Plugin>
3. Start collectd

Actual results:
Syslog shows entries like the following for three processes:
Jul 27 09:57:01 my.client kernel: writer#3[33525]: segfault at 20 ip 00007f98c48ce2fc sp 00007f98c2030088 error 4 in libc.so.6[7f98c4828000+175000]
Jul 27 09:57:01 pidmapp03.cf.ac.uk kernel: Code: c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 89 f8 62 a1 fd 00 ef c0 25 ff 0f 00 00 3d e0 0f 00 00 0f 87 34 01 00 00 <62> f3 7d 20 3f 07 00 c5 fb 93 c0 85 c0 74 55 f3 0f bc c0 c3 f3 0f
Followed by:
Jul 27 09:57:01 my.client systemd[1]: Started Process Core Dump (PID 33535/UID 0).
Jul 27 09:57:01 my.client systemd-coredump[33536]: Resource limits disable core dumping for process 33521 (collectd).
Jul 27 09:57:01 my.client systemd-coredump[33536]: Process 33521 (collectd) of user 0 dumped core.
Jul 27 09:57:01 my.client systemd[1]: collectd.service: Main process exited, code=dumped, status=11/SEGV
Jul 27 09:57:01 my.client systemd[1]: collectd.service: Failed with result 'core-dump'.
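
For anyone triaging without a core dump, the kernel segfault line above can be decoded by hand. The following is a minimal Python sketch, not part of collectd or the original report; the function name decode_segfault and the returned field names are made up for illustration. It assumes the usual x86-64 Linux message layout and the standard x86 page-fault error-code bits (bit 0: page present, bit 1: write access, bit 2: user mode). The computed offset is relative to the start of the mapping shown in brackets, which is not necessarily the library's load base.

```python
import re

# Layout of the x86-64 Linux kernel message for a user-space segfault.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Hypothetical helper: extract the key fields from a segfault line."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    return {
        "object": m["obj"],
        "fault_address": int(m["addr"], 16),
        # Offset of the faulting instruction within the mapping in brackets.
        "offset_in_object": ip - base,
        # x86 page-fault error-code bits.
        "page_present": bool(err & 1),
        "access": "write" if err & 2 else "read",
        "user_mode": bool(err & 4),
    }


# The segfault line from this report.
LINE = (
    "Jul 27 09:57:01 my.client kernel: writer#3[33525]: segfault at 20 "
    "ip 00007f98c48ce2fc sp 00007f98c2030088 error 4 "
    "in libc.so.6[7f98c4828000+175000]"
)

info = decode_segfault(LINE)
print(info["object"], hex(info["offset_in_object"]), info["access"])
```

For the line in this report, the sketch yields a user-mode read of address 0x20 on a page that is not mapped, at offset 0xa62fc into the libc.so.6 mapping. A read that close to address zero is consistent with a NULL pointer being handed to a libc routine, although the log alone does not confirm that.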

Expected results:
collectd continues to run normally, sending metrics to Riemann (per previous releases).

Additional info:
Does not occur on EL7/8. See upstream bug for workarounds or reversions.