Bug 1535485

Summary: RetraceWorker._fail method called twice for failing vmcores
Product: [Fedora] Fedora EPEL Reporter: Dave Wysochanski <dwysocha>
Component: retrace-serverAssignee: Dave Wysochanski <dwysocha>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low Docs Contact:
Priority: unspecified    
Version: el6CC: abrt-devel-list, jakub, michal.toman, mkutlak, mmarusak, msuchy
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: retrace-server-1.18.0-1.fc27 retrace-server-1.18.0-1.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-02-13 17:52:08 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Dave Wysochanski 2018-01-17 14:08:38 UTC
Description of problem:
There's a bug where the _fail method is called from two different places.  I noticed this when testing patches for https://bugzilla.redhat.com/show_bug.cgi?id=1440383  Originally I thought maybe that patchset introduced the problem but then after reverting to a previous build that didn't have any of those patches I still had the problem.  So then I put in traceback.print_stack() at the top of the _fail() method and found the following two places _fail is called:

==> /var/log/httpd/error_log <==
  File "/usr/lib/python2.6/site-packages/retrace/retrace_worker.py", line 743, in start_vmcore
    self._fail()
  File "/usr/lib/python2.6/site-packages/retrace/retrace_worker.py", line 101, in _fail
    traceback.print_stack()
  File "/usr/bin/retrace-server-worker", line 67, in <module>
    worker.start(kernelver=kernelver, arch=cmdline.arch)
  File "/usr/lib/python2.6/site-packages/retrace/retrace_worker.py", line 955, in start
    self._fail()
  File "/usr/lib/python2.6/site-packages/retrace/retrace_worker.py", line 101, in _fail
    traceback.print_stack()

Here's the code for the first call:
    544     def start_vmcore(self, custom_kernelver=None):
    545         self.hook_start()
    546 
    547         task = self.task
...
    581         if task.use_mock(kernelver):
...
    734         else:
    735             try:
    736                 self.hook_pre_prepare_debuginfo()
    737                 crash_cmd = task.get_crash_cmd().split()
    738                 vmlinux = task.prepare_debuginfo(vmcore, kernelver=kernelver, crash_cmd=crash_cmd)
    739                 task.set_crash_cmd(' '.join(crash_cmd))
    740                 self.hook_post_prepare_debuginfo()
    741             except Exception as ex:
    742                 log_error("prepare_debuginfo failed: %s" % str(ex))
    743                 self._fail()   <--------------- 1st call to _fail


code for the second:
    902     def start(self, kernelver=None, arch=None):
...
    947             if tasktype in [TASK_RETRACE, TASK_DEBUG, TASK_RETRACE_INTERACTIVE]:
    948                 self.start_retrace(custom_arch=arch)
    949             elif tasktype in [TASK_VMCORE, TASK_VMCORE_INTERACTIVE]:
    950                 self.start_vmcore(custom_kernelver=kernelver)  <---------- raises Exception on 741 above
    951             else:
    952                 raise Exception("Unsupported task type")
    953         except Exception as ex:   <----------- second Exception raised
    954             log_error(str(ex))
    955             self._fail()   <------------- 2nd call to _fail


Version-Release number of selected component (if applicable):
retrace-server-1.17.0.111+gb07f-1.el6.noarch
rpm -q retrace-server --changelog | head
* Wed Aug 16 2017 Dave Wysochanski <dwysocha> 1.17.0.111+gb07f-1
- Build at upstream b07f88d Merge pull request #159 from DaveWysochanskiRH/rhbz1481433-crash-directory-unreadable

* Thu Mar 30 2017 Matej Marusak <mmarusak> 1.17.0-1
- Enable creating releases with makefile
- Introduce gen-version
- Do not use fedorahosted.org as source
- Include md5sum of original archive in summary page
- Do not try to get default time when ftptask
- Fix character escape typo

How reproducible:
everytime

Steps to Reproduce:
1. submit a vmcore that will fail to retrace due to some reason (failing debuginfo, etc)

Actual results:
_fail called twice

Expected results:
_fail only called once

Additional info:
There's also an odd 'None' log line in the retrace-log file.  Example:
   2018-01-17 08:55:15 prepare_debuginfo failed: Unable to find debuginfo package and no cached vmlinux file
    2018-01-17 08:55:15 Sending e-mail to dwysocha
   2018-01-17 08:55:15 None    <--------------- Looks like the "log_error" from line 954 above
    2018-01-17 08:55:15 Sending e-mail to dwysocha  <--- duplicate

Comment 1 Dave Wysochanski 2018-01-17 15:11:49 UTC
https://github.com/abrt/retrace-server/pull/169

Comment 2 Fedora Update System 2018-02-01 15:52:11 UTC
retrace-server-1.18.0-1.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-e5741ca105

Comment 3 Fedora Update System 2018-02-01 15:52:39 UTC
retrace-server-1.18.0-1.el7 has been submitted as an update to Fedora EPEL 7. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2018-bc35ca9028

Comment 4 Fedora Update System 2018-02-02 18:24:42 UTC
retrace-server-1.18.0-1.el7 has been pushed to the Fedora EPEL 7 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2018-bc35ca9028

Comment 5 Fedora Update System 2018-02-02 18:47:58 UTC
retrace-server-1.18.0-1.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-e5741ca105

Comment 6 Fedora Update System 2018-02-13 17:52:08 UTC
retrace-server-1.18.0-1.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.

Comment 7 Fedora Update System 2018-02-20 16:24:05 UTC
retrace-server-1.18.0-1.el7 has been pushed to the Fedora EPEL 7 stable repository. If problems still persist, please make note of it in this bug report.