
Bug 1688368

Summary: If dmidecode can't be executed, abrt goes into crash loop forever.
Product: Red Hat Enterprise Linux 7
Component: abrt
Version: 7.6
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: Ryan Blakley <rblakley>
Assignee: Martin Kutlak <mkutlak>
QA Contact: Martin Kyral <mkyral>
CC: ekulik, fkrska, jkubin, jwright, mkutlak, mkyral, mmccune, msuchy, pdwyer, rmetrich
Target Milestone: rc
Keywords: Patch, Reproducer
Fixed In Version: abrt-2.1.11-57.el7
Last Closed: 2020-03-31 19:42:57 UTC
Type: Bug
Bug Blocks: 1716965

Description Ryan Blakley 2019-03-13 15:49:39 UTC
Description of problem:
If dmidecode can't be executed for any reason (operation not permitted, permission denied, etc.), abrt enters a crash loop when executing the script /usr/libexec/abrt-action-generate-machine-id, which throws an unhandled exception. Each pass through the loop also runs sosreport, which caused excessive connections to a satellite for a customer. This issue appears to have been introduced by the changes from bz 1569076.


Version-Release number of selected component (if applicable):
abrt-2.1.11-52.el7.x86_64


Steps to Reproduce:
** The customer hit the issue with dmidecode returning operation not permitted, but I was able to recreate the crash loop with the steps below.
1. # chmod -x /usr/sbin/dmidecode
2. # /usr/libexec/abrt-action-generate-machine-id
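** A minimal sketch of why step 2 fails, assuming a Linux host: launching a file that has no execute bit through subprocess.check_output raises OSError before the child ever runs. The helper name try_exec_without_x_bit and the temporary script are hypothetical stand-ins for dmidecode, not part of abrt.

```python
import os
import stat
import subprocess
import tempfile


def try_exec_without_x_bit():
    """Return the errno raised when running a file with no execute bit.

    Hypothetical helper: a temporary shell script stands in for
    /usr/sbin/dmidecode after `chmod -x`.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"#!/bin/sh\necho hello\n")
        os.close(fd)
        # Read/write for the owner only, i.e. no execute bit at all.
        os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
        try:
            subprocess.check_output([path])
        except OSError as ex:
            # The exec call fails in the child and is re-raised here.
            return ex.errno
        return None
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(try_exec_without_x_bit())
```

If nothing catches that OSError, the script exits with a traceback like the one under "Actual results".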

Actual results:
root@ryan-rhel7 ~ # /usr/libexec/abrt-action-generate-machine-id
Traceback (most recent call last):
  File "/usr/libexec/abrt-action-generate-machine-id", line 174, in <module>
    machineids = generate_machine_id(requested_generators)
  File "/usr/libexec/abrt-action-generate-machine-id", line 104, in generate_machine_id
    ids[sd] = workers[sd]()
  File "/usr/libexec/abrt-action-generate-machine-id", line 56, in generate_machine_id_dmidecode
    data = check_output(["dmidecode", "-s", k]).strip()
  File "/usr/lib64/python2.7/subprocess.py", line 568, in check_output
    process = Popen(stdout=PIPE, *popenargs, **kwargs)
  File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 13] Permission denied

root@ryan-rhel7 ~ # egrep "python:|abrt-server" /var/log/messages | grep -iv email
Mar 13 10:58:58 ryan-rhel7 python: detected unhandled Python exception in '/usr/libexec/abrt-action-generate-machine-id'
Mar 13 10:59:03 ryan-rhel7 python: communication with ABRT daemon failed: timed out
Mar 13 11:00:15 ryan-rhel7 python: detected unhandled Python exception in '/usr/libexec/abrt-action-generate-machine-id'
Mar 13 11:00:20 ryan-rhel7 python: communication with ABRT daemon failed: timed out
Mar 13 11:00:20 ryan-rhel7 abrt-server: Duplicate: core backtrace
Mar 13 11:00:20 ryan-rhel7 abrt-server: DUP_OF_DIR: /var/spool/abrt/Python-2019-03-13-11:00:15-378
Mar 13 11:00:20 ryan-rhel7 abrt-server: Deleting problem directory Python-2019-03-13-10:58:58-31057 (dup of Python-2019-03-13-11:00:15-378)
Mar 13 11:00:20 ryan-rhel7 abrt-server: Undefined variable outside of [[ ]] bracket
Mar 13 11:01:37 ryan-rhel7 python: detected unhandled Python exception in '/usr/libexec/abrt-action-generate-machine-id'
Mar 13 11:01:42 ryan-rhel7 python: communication with ABRT daemon failed: timed out
Mar 13 11:01:42 ryan-rhel7 abrt-server: Duplicate: core backtrace
Mar 13 11:01:42 ryan-rhel7 abrt-server: DUP_OF_DIR: /var/spool/abrt/Python-2019-03-13-11:01:37-2239
Mar 13 11:01:42 ryan-rhel7 abrt-server: Deleting problem directory Python-2019-03-13-11:00:15-378 (dup of Python-2019-03-13-11:01:37-2239)
Mar 13 11:01:42 ryan-rhel7 abrt-server: Undefined variable outside of [[ ]] bracket
Mar 13 11:02:56 ryan-rhel7 python: detected unhandled Python exception in '/usr/libexec/abrt-action-generate-machine-id'
Mar 13 11:03:01 ryan-rhel7 python: communication with ABRT daemon failed: timed out
Mar 13 11:03:01 ryan-rhel7 abrt-server: Duplicate: core backtrace
Mar 13 11:03:01 ryan-rhel7 abrt-server: DUP_OF_DIR: /var/spool/abrt/Python-2019-03-13-11:02:56-4008
Mar 13 11:03:01 ryan-rhel7 abrt-server: Deleting problem directory Python-2019-03-13-11:01:37-2239 (dup of Python-2019-03-13-11:02:56-4008)
Mar 13 11:03:01 ryan-rhel7 abrt-server: Undefined variable outside of [[ ]] bracket


Expected results:
** I added an exception handler, posted below, which lets the script run to completion without causing a loop.
root@ryan-rhel7 ~ # /usr/libexec/abrt-action-generate-machine-id
Machine-ID generator 'sosreport_uploader-dmidecode' failed: Could not execute dmidecode due to: [Errno 13] Permission denied
systemd=940ca953abe70648a79ff135488faa65


Additional info:
** Below is how many times the crash occurred on the customer's system over a two-week period before abrtd was stopped. Abrt was initially triggered when the server hung and systemd-journal timed out, which caused abrt to dump a core.
$ grep "/usr/libexec/abrt-action-generate-machine-id" -r var/log/ | wc -l
11815

** Below is a patch I threw together that adds a catch when running dmidecode, so the crash loop doesn't occur.
# diff -up /usr/libexec/abrt-action-generate-machine-id.old /usr/libexec/abrt-action-generate-machine-id
--- /usr/libexec/abrt-action-generate-machine-id.old	2019-03-13 11:13:42.493769926 -0400
+++ /usr/libexec/abrt-action-generate-machine-id	2019-03-13 11:13:22.634114171 -0400
@@ -52,7 +52,11 @@ def generate_machine_id_dmidecode():
 
     # Run dmidecode command
     for k in keys:
-        data = check_output(["dmidecode", "-s", k]).strip()
+        try:
+            data = check_output(["dmidecode", "-s", k]).strip()
+        except OSError as ex:
+            raise RuntimeError("Could not execute dmidecode due to: {0}"
+                    .format(str(ex)))
 
         # Update the hash as we find the fields we are looking for
         machine_id.update(data)
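** The pattern in the patch above can be sketched on its own as follows. This is an illustration under stated assumptions, not the shipped fix: run_tool and generate_ids are hypothetical names, and the per-generator error message is modeled on the output shown under "Expected results".

```python
import subprocess
import sys


def run_tool(argv):
    """Run an external tool and return its stripped output.

    Hypothetical wrapper: converts OSError (missing binary, permission
    denied, ...) into RuntimeError, as the patch does for dmidecode.
    """
    try:
        return subprocess.check_output(argv).strip()
    except OSError as ex:
        raise RuntimeError(
            "Could not execute {0} due to: {1}".format(argv[0], ex))


def generate_ids(workers):
    """Call every worker and report failures instead of crashing.

    `workers` maps a generator name to a zero-argument callable
    (an assumption about the caller, for illustration only).
    """
    ids = {}
    for name, worker in sorted(workers.items()):
        try:
            ids[name] = worker()
        except RuntimeError as ex:
            sys.stderr.write(
                "Machine-ID generator '{0}' failed: {1}\n".format(name, ex))
    return ids
```

Note that check_output raises CalledProcessError, not OSError, when the tool starts but exits with a non-zero status. This sketch, like the patch, does not cover that case.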

** I tested the above issue on RHEL 8 Beta as well. The crash loop doesn't occur there, but /usr/libexec/abrt-action-generate-machine-id still throws the same error.

Comment 10 ekulik 2019-08-13 14:56:35 UTC
Oops, wrong component.

Comment 14 errata-xmlrpc 2020-03-31 19:42:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1040