Bug 1585015

Summary: machine_type_func() returns uninitialized local variable value on aarch64
Product: [Fedora] Fedora EPEL
Component: ganglia
Version: epel7
Hardware: aarch64
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: unspecified
Reporter: huan.xiong
Assignee: Nick <nick>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: jose.p.oliveira.oss, nick, tanakahda
Fixed In Version: ganglia-3.7.2-33.el7
Last Closed: 2020-10-23 22:50:45 UTC
Type: Bug
Attachments: The patch fixes the issues. (flags: none)

Description huan.xiong 2018-06-01 07:03:17 UTC
Created attachment 1446542 [details]
The patch fixes the issues.

Description of problem:

The gmond system metric module returns the value of an uninitialized local variable for the machine_type metric, which causes gmetad to fail to parse the data. As a result, no metric data can be saved.
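
For illustration only, here is a minimal sketch of how this kind of failure can arise, assuming the function uses one independent #ifdef per architecture with no fallback. The names demo_val_t, DEMO_STRING_SIZE, and the DEMO_ARCH_* macros are made up for this sketch and are not Ganglia's. The buffer is passed in and pre-filled so that the demo stays well-defined; the real function returns a local variable by value.

```c
#include <stdio.h>
#include <string.h>

#define DEMO_STRING_SIZE 64   /* hypothetical stand-in for MAX_G_STRING_SIZE */

/* Hypothetical stand-in for the metric value type; not Ganglia's definition. */
typedef struct {
    char str[DEMO_STRING_SIZE];
} demo_val_t;

/*
 * One independent #ifdef per architecture and no fallback branch.
 * DEMO_ARCH_* are made-up macros that are never defined here, which models
 * building on an architecture that is missing from the list.
 */
void demo_machine_type_unfixed(demo_val_t *val)
{
#ifdef DEMO_ARCH_I386
    snprintf(val->str, DEMO_STRING_SIZE, "x86");
#endif
#ifdef DEMO_ARCH_X86_64
    snprintf(val->str, DEMO_STRING_SIZE, "x86_64");
#endif
    (void)val; /* no branch was compiled in, so val is left untouched */
}

/* Returns 1 if the function wrote nothing, i.e. the stale bytes survive. */
int demo_stale_bytes_survive(void)
{
    demo_val_t val;

    /* Simulate leftover stack contents using the bytes seen in this report. */
    memcpy(val.str, "I\x1e\r[", 5);
    demo_machine_type_unfixed(&val);
    return memcmp(val.str, "I\x1e\r[", 5) == 0;
}
```

When no branch matches, whatever bytes were already in the buffer are what the caller gets back.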

Version-Release number of selected component (if applicable): 

It's 3.7.2.

# yum info ganglia-gmond
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * epel: mirrors.sohu.com
 * epel-debuginfo: mirrors.tongji.edu.cn
Installed Packages
Name        : ganglia-gmond
Arch        : aarch64
Version     : 3.7.2
Release     : 2.el7
Size        : 295 k
Repo        : installed
From repo   : epel
Summary     : Ganglia Monitoring daemon
URL         : http://ganglia.sourceforge.net/
License     : BSD
Description : Ganglia is a scalable, real-time monitoring and execution environment
            : with all execution requests and statistics expressed in an open
            : well-defined XML format.
            : 
            : This gmond daemon provides the ganglia service within a single cluster or
            : Multicast domain.


How reproducible:


Steps to Reproduce:

1. Make sure the gmond service is started

2. Get the machine_type metric value using nc

$ nc localhost 8649 | grep machine_type
<METRIC NAME="machine_type" VAL="I&#30;&#13;[" TYPE="string" UNITS="" TN="19" TMAX="1200" DMAX="0" SLOPE="zero">
^C

Note that the value above is garbage.

3. Start the Python version of gmetad in debug mode. It prints an error message to the screen, like the following:

Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
    self.run()
  File "/root/osgcloud/cbtool/3rd_party/monitor-core/gmetad-python/Gmetad/gmetad_gmondReader.py", line 140, in run
    xml.sax.parseString(xmlbuf, gch)
  File "/usr/lib64/python2.7/xml/sax/__init__.py", line 49, in parseString
    parser.parse(inpsrc)
  File "/usr/lib64/python2.7/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib64/python2.7/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/lib64/python2.7/xml/sax/expatreader.py", line 214, in feed
    self._err_handler.fatalError(exc)
  File "/usr/lib64/python2.7/xml/sax/handler.py", line 38, in fatalError
    raise exception
SAXParseException: <unknown>:252:34: reference to invalid character number

The invalid character reference in question is in the "I&#30;&#13;[" string shown in step 2: &#30; is a control character that XML 1.0 does not allow.

Querying gmetad as below outputs nothing, because the exception above prevents it from saving any metric data.
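
As a rough check of why the parser rejects this value, the sketch below tests a code point against the "Char" production of XML 1.0. The function name xml10_char_is_valid is hypothetical; it is not part of gmond, gmetad, or expat.

```c
#include <stdbool.h>

/*
 * Returns true if `cp` is allowed by the XML 1.0 "Char" production:
 *   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
 * A numeric character reference such as &#30; must name an allowed code point.
 */
bool xml10_char_is_valid(unsigned long cp)
{
    return cp == 0x9 || cp == 0xA || cp == 0xD ||
           (cp >= 0x20 && cp <= 0xD7FF) ||
           (cp >= 0xE000 && cp <= 0xFFFD) ||
           (cp >= 0x10000 && cp <= 0x10FFFF);
}
```

By this rule code point 30 is rejected while 13 (carriage return) is accepted, so a single stray control byte in the value is enough to make the whole document unparseable.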

$ nc localhost 8651


Actual results:

See above.

Expected results:

It should return "aarch64" on the aarch64 platform, or "unknown" on an unsupported platform.

Additional info:

The fix is attached.

Comment 1 huan.xiong 2018-06-01 07:24:12 UTC
Seems I messed up the attachment's file name, and I didn't find a way to change it. So I'll just paste the diff below. The fix uses the latest code from Ganglia upstream:

 $ cat ganglia-3.7.2-libmetric-linux.patch
--- libmetrics/linux/metrics.c.orig	2018-06-01 14:56:37.690006517 +0800
+++ libmetrics/linux/metrics.c	2018-06-01 13:45:18.149298706 +0800
@@ -593,36 +593,30 @@
 
 #ifdef __i386__
    snprintf(val.str, MAX_G_STRING_SIZE, "x86");
-#endif
-#ifdef __x86_64__
+#elif __x86_64__
    snprintf(val.str, MAX_G_STRING_SIZE, "x86_64");
-#endif
-#ifdef __ia64__
+#elif __ia64__
    snprintf(val.str, MAX_G_STRING_SIZE, "ia64");
-#endif
-#ifdef __sparc__
+#elif __sparc__
    snprintf(val.str, MAX_G_STRING_SIZE, "sparc");
-#endif
-#ifdef __alpha__
+#elif __alpha__
    snprintf(val.str, MAX_G_STRING_SIZE, "alpha");
-#endif
-#ifdef __powerpc__
+#elif __powerpc__
    snprintf(val.str, MAX_G_STRING_SIZE, "powerpc");
-#endif
-#ifdef __m68k__
+#elif __m68k__
    snprintf(val.str, MAX_G_STRING_SIZE, "m68k");
-#endif
-#ifdef __mips__
+#elif __mips__
    snprintf(val.str, MAX_G_STRING_SIZE, "mips");
-#endif
-#ifdef __arm__
+#elif __arm__
    snprintf(val.str, MAX_G_STRING_SIZE, "arm");
-#endif
-#ifdef __hppa__
+#elif __aarch64__
+   snprintf(val.str, MAX_G_STRING_SIZE, "aarch64");
+#elif __hppa__
    snprintf(val.str, MAX_G_STRING_SIZE, "hppa");
-#endif
-#ifdef __s390__
+#elif __s390__
    snprintf(val.str, MAX_G_STRING_SIZE, "s390");
+#else
+   snprintf(val.str, MAX_G_STRING_SIZE, "unknown");
 #endif
 
    return val;
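
The shape of the patched logic can be sketched standalone as below. This is an abbreviated illustration, not the Ganglia source: sketch_val_t and SKETCH_STRING_SIZE are hypothetical stand-ins, only a few architectures are listed, and it uses `defined(...)` where the patch uses bare `#elif __x86_64__` (both behave the same here, since an undefined identifier evaluates to 0 in #if/#elif).

```c
#include <stdio.h>

#define SKETCH_STRING_SIZE 64   /* hypothetical stand-in for MAX_G_STRING_SIZE */

/* Hypothetical stand-in for the metric value type; not Ganglia's definition. */
typedef struct {
    char str[SKETCH_STRING_SIZE];
} sketch_val_t;

/*
 * A single #if/#elif chain ending in #else: exactly one snprintf() is
 * compiled in on every architecture, so val.str is always initialized.
 */
sketch_val_t sketch_machine_type(void)
{
    sketch_val_t val;

#if defined(__i386__)
    snprintf(val.str, SKETCH_STRING_SIZE, "x86");
#elif defined(__x86_64__)
    snprintf(val.str, SKETCH_STRING_SIZE, "x86_64");
#elif defined(__arm__)
    snprintf(val.str, SKETCH_STRING_SIZE, "arm");
#elif defined(__aarch64__)
    snprintf(val.str, SKETCH_STRING_SIZE, "aarch64");
#else
    snprintf(val.str, SKETCH_STRING_SIZE, "unknown");
#endif

    return val;
}
```

The #else branch is what guarantees a well-formed string on platforms nobody listed, which is the "unknown" result described under "Expected results".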

Comment 2 tanakahda 2020-03-25 23:29:21 UTC
Hello,

I think we see this problem because the following commit is not applied to the EPEL Ganglia aarch64 rpm.

https://github.com/ganglia/monitor-core/commit/fcf4c7c46a7f4bfbe845018ae5fc82a07269b444

The commit was merged into the Ganglia master repo 4 years ago. However, the Ganglia project has not made a new release in the past 5 years. Is it possible to apply this commit to the EPEL Ganglia aarch64 rpm? Without this commit, the gmond process does not work at all on an aarch64 node.

Comment 3 Fedora Update System 2020-10-08 22:13:57 UTC
FEDORA-EPEL-2020-fee165c6a2 has been submitted as an update to Fedora EPEL 7. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2020-fee165c6a2

Comment 4 Fedora Update System 2020-10-09 00:02:50 UTC
FEDORA-EPEL-2020-fee165c6a2 has been pushed to the Fedora EPEL 7 testing repository.

You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2020-fee165c6a2

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 5 Fedora Update System 2020-10-23 22:50:45 UTC
FEDORA-EPEL-2020-fee165c6a2 has been pushed to the Fedora EPEL 7 stable repository.
If the problem still persists, please make a note of it in this bug report.