Created attachment 1446542
The patch fixes the issues.

Description of problem:

The gmond system metric module returns the value of an uninitialized local variable for the machine_type metric, which causes gmetad to fail to parse the data. As a result, no metric data can be saved.

Version-Release number of selected component (if applicable):

3.7.2

# yum info ganglia-gmond
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * epel: mirrors.sohu.com
 * epel-debuginfo: mirrors.tongji.edu.cn
Installed Packages
Name        : ganglia-gmond
Arch        : aarch64
Version     : 3.7.2
Release     : 2.el7
Size        : 295 k
Repo        : installed
From repo   : epel
Summary     : Ganglia Monitoring daemon
URL         : http://ganglia.sourceforge.net/
License     : BSD
Description : Ganglia is a scalable, real-time monitoring and execution environment
            : with all execution requests and statistics expressed in an open
            : well-defined XML format.
            :
            : This gmond daemon provides the ganglia service within a single cluster or
            : Multicast domain.

How reproducible:

Steps to Reproduce:
1. Make sure the gmond service is started.
2. Get the machine_type metric value using nc:

$ nc localhost 8649 | grep machine_type
<METRIC NAME="machine_type" VAL="I [" TYPE="string" UNITS="" TN="19" TMAX="1200" DMAX="0" SLOPE="zero">
^C

Note that the value above is garbage.

3. Start the Python version of gmetad in debug mode.
It prints an error message like the following:

Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
    self.run()
  File "/root/osgcloud/cbtool/3rd_party/monitor-core/gmetad-python/Gmetad/gmetad_gmondReader.py", line 140, in run
    xml.sax.parseString(xmlbuf, gch)
  File "/usr/lib64/python2.7/xml/sax/__init__.py", line 49, in parseString
    parser.parse(inpsrc)
  File "/usr/lib64/python2.7/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib64/python2.7/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/lib64/python2.7/xml/sax/expatreader.py", line 214, in feed
    self._err_handler.fatalError(exc)
  File "/usr/lib64/python2.7/xml/sax/handler.py", line 38, in fatalError
    raise exception
SAXParseException: <unknown>:252:34: reference to invalid character number

The invalid character in question is part of the "I [" string shown in step 2.

Query gmetad as below; it outputs nothing because the exception above prevents it from saving any metric data.

$ nc localhost 8651

Actual results:

See above.

Expected results:

It should return "aarch64" on the aarch64 platform, or "unknown" on an unsupported platform.

Additional info:

The fix is attached.
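To illustrate why the garbage value is fatal for the parser: XML 1.0 permits only tab, line feed and carriage return among the control characters below 0x20, and a numeric reference to any other one is rejected, which matches the "reference to invalid character number" error above. The sketch below is a hypothetical helper (has_xml_illegal_control is not part of Ganglia) that flags such bytes in a metric value. It assumes the garbage in VAL contains at least one such control byte, which the report does not show directly.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not Ganglia code.
 * Returns 1 if s contains a control byte that XML 1.0 does not allow
 * as a character (anything below 0x20 except tab, LF and CR), else 0.
 */
int has_xml_illegal_control(const char *s)
{
    const unsigned char *p;

    for (p = (const unsigned char *)s; *p != '\0'; ++p) {
        if (*p < 0x20 && *p != '\t' && *p != '\n' && *p != '\r') {
            return 1;
        }
    }
    return 0;
}
```

A check like this would pass for "aarch64" or "unknown" and fail for a buffer holding leftover stack bytes, so it could serve as a quick sanity test on string metrics before they are serialized.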
It seems I messed up the attachment's file name, and I didn't find a way to change it, so I'll just paste the diff below. The fix uses the latest code from Ganglia upstream:

$ cat ganglia-3.7.2-libmetric-linux.patch
--- libmetrics/linux/metrics.c.orig 2018-06-01 14:56:37.690006517 +0800
+++ libmetrics/linux/metrics.c 2018-06-01 13:45:18.149298706 +0800
@@ -593,36 +593,30 @@
 
 #ifdef __i386__
    snprintf(val.str, MAX_G_STRING_SIZE, "x86");
-#endif
-#ifdef __x86_64__
+#elif __x86_64__
    snprintf(val.str, MAX_G_STRING_SIZE, "x86_64");
-#endif
-#ifdef __ia64__
+#elif __ia64__
    snprintf(val.str, MAX_G_STRING_SIZE, "ia64");
-#endif
-#ifdef __sparc__
+#elif __sparc__
    snprintf(val.str, MAX_G_STRING_SIZE, "sparc");
-#endif
-#ifdef __alpha__
+#elif __alpha__
    snprintf(val.str, MAX_G_STRING_SIZE, "alpha");
-#endif
-#ifdef __powerpc__
+#elif __powerpc__
    snprintf(val.str, MAX_G_STRING_SIZE, "powerpc");
-#endif
-#ifdef __m68k__
+#elif __m68k__
    snprintf(val.str, MAX_G_STRING_SIZE, "m68k");
-#endif
-#ifdef __mips__
+#elif __mips__
    snprintf(val.str, MAX_G_STRING_SIZE, "mips");
-#endif
-#ifdef __arm__
+#elif __arm__
    snprintf(val.str, MAX_G_STRING_SIZE, "arm");
-#endif
-#ifdef __hppa__
+#elif __aarch64__
+   snprintf(val.str, MAX_G_STRING_SIZE, "aarch64");
+#elif __hppa__
    snprintf(val.str, MAX_G_STRING_SIZE, "hppa");
-#endif
-#ifdef __s390__
+#elif __s390__
    snprintf(val.str, MAX_G_STRING_SIZE, "s390");
+#else
+   snprintf(val.str, MAX_G_STRING_SIZE, "unknown");
 #endif
 
    return val;
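For readers who want to see the pattern in isolation, here is a minimal standalone sketch of what the patch does: a single #if/#elif chain ending in an #else branch, so the buffer is written on every architecture. The names sketch_val_t, SKETCH_STRING_SIZE and machine_type_sketch are stand-ins invented for this example (the real type of val and the value of MAX_G_STRING_SIZE are not shown in this report), and only a few architectures are listed.

```c
#include <stdio.h>

/* Stand-in for MAX_G_STRING_SIZE; the real value is an assumption here. */
#define SKETCH_STRING_SIZE 32

/* Stand-in for the type of `val` in the patch. */
typedef struct {
    char str[SKETCH_STRING_SIZE];
} sketch_val_t;

sketch_val_t machine_type_sketch(void)
{
    sketch_val_t val;

#if defined(__i386__)
    snprintf(val.str, sizeof val.str, "x86");
#elif defined(__x86_64__)
    snprintf(val.str, sizeof val.str, "x86_64");
#elif defined(__aarch64__)
    snprintf(val.str, sizeof val.str, "aarch64");
#else
    /* Fallback branch: without it, val.str would be returned
     * uninitialized on any architecture not listed above. */
    snprintf(val.str, sizeof val.str, "unknown");
#endif

    return val;
}
```

The sketch uses defined(...) in the #elif conditions, while the patch tests the macros directly (#elif __aarch64__). Both work with compiler-predefined architecture macros; defined(...) is simply the more explicit spelling.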
Hello, I think we see this problem because the following commit has not been applied to the EPEL Ganglia aarch64 rpm. https://github.com/ganglia/monitor-core/commit/fcf4c7c46a7f4bfbe845018ae5fc82a07269b444 The commit was merged into the Ganglia master repo 4 years ago. However, the Ganglia project has not made a new release in the past 5 years. Is it possible to apply this commit to the EPEL Ganglia aarch64 rpm? Without this commit, the gmond process does not work at all on an aarch64 node.
FEDORA-EPEL-2020-fee165c6a2 has been submitted as an update to Fedora EPEL 7. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2020-fee165c6a2
FEDORA-EPEL-2020-fee165c6a2 has been pushed to the Fedora EPEL 7 testing repository. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2020-fee165c6a2 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-EPEL-2020-fee165c6a2 has been pushed to the Fedora EPEL 7 stable repository. If problem still persists, please make note of it in this bug report.