Bug 1706350

Summary: collectd-dns no longer resolves numeric qtypes to strings after upgrade from collectd-5.8.0-3 to collectd-5.8.0-4
Product: [Fedora] Fedora EPEL Reporter: Christian Bartolomäus <bartolin>
Component: collectd Assignee: Jonathan Wright <jonathan>
Status: CLOSED EOL QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low Docs Contact:
Priority: unspecified    
Version: epel7 CC: gregswift, jskarvad, kevin, mhlavink, ruben
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2024-07-09 02:50:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Christian Bartolomäus 2019-05-04 11:52:24 UTC
Description of problem:
After a minor upgrade of collectd and collectd-dns the metrics for qtypes are no longer reported as strings (e.g. 'dns_qtype.A'), but with their numeric value (e.g. 'dns_qtype.#1').

Version-Release number of selected component (if applicable):
Old versions:
* collectd-5.8.0-3.el7.x86_64
* collectd-dns-5.8.0-3.el7.x86_64

New versions:
* collectd-5.8.0-4.el7.x86_64
* collectd-dns-5.8.0-4.el7.x86_64

How reproducible:
I'm not really sure how to reproduce this in a clean way. On our physical machines the problem is totally reproducible, but I wasn't able to reproduce it on a fully updated virtual machine (CentOS 7) with latest collectd and bind installed. I have some additional information/ideas, though (see below).

Steps to Reproduce:
$ yum downgrade collectd-5.8.0-3.el7 collectd-dns-5.8.0-3.el7  ## stringified qtype metrics are reported
$ yum install collectd-5.8.0-4.el7 collectd-dns-5.8.0-4.el7    ## numeric qtype metrics are reported

Actual results:
Metric name: host.dns.dns_qtype.#1.value 0.400000 1556968058

Expected results:
Metric name: host.dns.dns_qtype.A.value 0.400000 1556968058

Additional info:
I *guess* this problem is related to the specific configuration of the build machine that produced the packages:
* https://cbs.centos.org/koji/buildinfo?buildID=21280 build for collectd-5.8.0-3.el7.x86_64
* https://cbs.centos.org/koji/buildinfo?buildID=21737 build for collectd-5.8.0-4.el7.x86_64

I've compared the source packages for both builds and found no changes related to the functionality of the dns plugin. The build logs didn't give me a clue either.

But looking at the code of src/dns.c and src/utils/dns/dns.c, it looks like neither the symbol __NAMESER nor __BIND was defined during the later build.

The translation from numeric qtypes to stringified metrics names happens by calling qtype_str on the numeric value here: https://github.com/collectd/collectd/blob/55cf383915/src/dns.c#L364

The function qtype_str checks whether the symbols __NAMESER or __BIND are defined (and whether their values are greater than specific dates). If that's not the case, it just prepends a '#' to the numeric value: https://github.com/collectd/collectd/blob/55cf383915/src/utils/dns/dns.c#L914 That's exactly the format I get for the qtype metrics, so I conclude that the default case is executed.
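To illustrate the fallback described above, here is a minimal sketch. It is not the upstream code: the names qtype_str_sketch and HAVE_QTYPE_NAMES are hypothetical, and HAVE_QTYPE_NAMES merely stands in for the upstream checks on __NAMESER and __BIND. Only a handful of well-known qtypes are listed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the build-time condition. In collectd this role
 * is played by preprocessor checks on __NAMESER / __BIND. */
#ifndef HAVE_QTYPE_NAMES
#define HAVE_QTYPE_NAMES 0
#endif

/* Returns a name for a DNS query type. If the name table was compiled out,
 * or the type is not in the table, returns "#<number>" written into buf. */
static const char *qtype_str_sketch(int t, char *buf, size_t buflen) {
  assert(buf != NULL && buflen > 0);
#if HAVE_QTYPE_NAMES
  switch (t) {
  case 1:
    return "A";
  case 2:
    return "NS";
  case 5:
    return "CNAME";
  case 28:
    return "AAAA";
  default:
    break;
  }
#endif
  /* Fallback: no symbolic name available, so emit the numeric form. */
  snprintf(buf, buflen, "#%d", t);
  return buf;
}
```

Compiled as is, qtype_str_sketch(1, ...) yields "#1", which matches the metric names seen with collectd-5.8.0-4. Compiled with -DHAVE_QTYPE_NAMES=1, the same call yields "A".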

Now I've seen that the symbols __NAMESER and __BIND have been removed from newer versions of glibc in 2016: https://sourceware.org/git/?p=glibc.git;a=commit;h=006768c72a

Looking at one of our machines (with glibc-headers-2.17-260.el7_6.4.x86_64 installed) there is indeed no definition for __NAMESER or __BIND in /usr/include/arpa/nameser.h anymore.
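Instead of reading the header by hand, the same check can be done at compile time. The following is only a sketch under the assumption that <arpa/nameser.h> is the header that would define the symbols; the helper names nameser_macro_status and print_nameser_macro_status are made up for this example.

```c
#include <arpa/nameser.h>
#include <assert.h>
#include <stdio.h>

/* Bit flags used by the hypothetical helpers below. */
enum { HAS_NAMESER = 1, HAS_BIND = 2 };

/* Returns a bitmask telling which of the two macros were visible to the
 * preprocessor when this file was compiled. */
static int nameser_macro_status(void) {
  int status = 0;
#ifdef __NAMESER
  status |= HAS_NAMESER;
#endif
#ifdef __BIND
  status |= HAS_BIND;
#endif
  return status;
}

/* Prints the result in a human-readable form. */
static void print_nameser_macro_status(FILE *out) {
  assert(out != NULL);
  int status = nameser_macro_status();
  fprintf(out, "__NAMESER: %s\n",
          (status & HAS_NAMESER) ? "defined" : "not defined");
  fprintf(out, "__BIND: %s\n",
          (status & HAS_BIND) ? "defined" : "not defined");
}
```

Calling print_nameser_macro_status(stdout) from a small main() on a build host shows what the preprocessor saw with the installed glibc-headers. The result reflects the headers present at compile time, not at run time.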

Could you verify whether my hypothesis could be true, namely that said symbols were defined during the build of collectd-5.8.0-3, but not during the build of collectd-5.8.0-4? (According to build.log the version of glibc-headers was identical (glibc-headers-2.17-196.el7_4.2.x86_64) for both builds, but maybe the symbols were defined somewhere else?)

In the meantime I'll open an issue for upstream, because the check for __NAMESER or __BIND in src/utils/dns/dns.c should be replaced/extended.

I might be wrong with my guess -- maybe you've got a better idea what goes wrong on our machines? If you need further details, please ask. I'll try to provide them (if possible).

Comment 1 Christian Bartolomäus 2019-05-04 12:28:29 UTC
This is the upstream bug report: https://github.com/collectd/collectd/issues/3145

Comment 2 Ruben Kerkhof 2019-05-06 13:35:34 UTC
Are you sure the package comes from EPEL 7? The latest collectd version in EPEL is 5.8.1

Comment 3 Christian Bartolomäus 2019-05-06 15:26:12 UTC
Yes, I'm quite sure the package comes from EPEL 7. I had the same problem with the latest package from EPEL (5.8.1), but tried to isolate the problem by comparing the last working version (5.8.0-3) with the first problematic version (5.8.0-4).

Comment 4 Christian Bartolomäus 2019-05-07 10:15:08 UTC
Actually, I was wrong about the packages. I re-tested with these packages:
* https://koji.fedoraproject.org/koji/buildinfo?buildID=1065760 build for collectd-5.8.0-3.el7.x86_64
* https://koji.fedoraproject.org/koji/buildinfo?buildID=1082562 build for collectd-5.8.0-4.el7.x86_64

The results were identical.

But -- and that supports my analysis -- according to the root.log files for those builds the glibc (and glibc-headers) versions *were* different:
* glibc-headers-2.17-196.el7_4.2.x86_64 was used for the build of collectd-5.8.0-3
* glibc-headers-2.17-222.el7.x86_64 was used for the build of collectd-5.8.0-4

Comment 5 Christian Bartolomäus 2019-05-07 12:27:51 UTC
I just saw that my last comment was ambiguous:

> The results were identical.

I wanted to say that I got the same behaviour as with my original tests: collectd-5.8.0-3 worked as expected, with collectd-5.8.0-4 only numeric qtypes were reported.

Sorry for the noise.

Comment 6 Christian Bartolomäus 2019-06-30 07:51:43 UTC
This bug has been fixed upstream with https://github.com/collectd/collectd/pull/3156

There is no upstream release containing the fix yet.

Comment 7 Christian Bartolomäus 2019-07-29 12:29:20 UTC
This should be fixed in upstream release collectd 5.9.1: https://github.com/collectd/collectd/releases/tag/collectd-5.9.1

Comment 8 Christian Bartolomäus 2019-11-27 13:32:33 UTC
Are there any plans to provide a new collectd package that contains the fix from upstream? In the meantime collectd 5.10 has been released (https://github.com/collectd/collectd/releases/tag/5.10.0).

Comment 9 Fedora Admin user for bugzilla script actions 2024-05-17 00:14:04 UTC
This package has changed maintainer in Fedora. Reassigning to the new maintainer of this component.

Comment 10 Fedora Admin user for bugzilla script actions 2024-05-17 12:33:06 UTC
This package has changed maintainer in Fedora. Reassigning to the new maintainer of this component.

Comment 11 Troy Dawson 2024-07-09 02:50:40 UTC
EPEL 7 entered end-of-life (EOL) status on 2024-06-30.

EPEL 7 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.