Bug 1266769
Summary: | RFE: refactor existing crash commands like kmem and 'foreach bt' in src/lib/retrace_worker.py into a post_retrace hook | ||
---|---|---|---|
Product: | [Fedora] Fedora EPEL | Reporter: | Dave Wysochanski <dwysocha> |
Component: | retrace-server | Assignee: | Dave Wysochanski <dwysocha> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | medium | Docs Contact: | |
Priority: | high | ||
Version: | epel7 | CC: | abrt-devel-list, mbrysa, mgandhi, michal.toman |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2020-02-21 11:30:01 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Dave Wysochanski
2015-09-27 15:15:15 UTC
After some time, we now have a retrace-server build for the hooks bz. So this should probably be the first attempt at using it. We hopefully can refactor and create a "built-in" post_retrace hook which we can include in retrace-server as an example.

If we factor these out, I think we still need a step inside retrace-server to run crash. This would address https://bugzilla.redhat.com/show_bug.cgi?id=1232019
If crash fails (exits with an error code) and the vmcore cannot be loaded, we should consider marking the task 'failed' or some status that indicates the vmcore file is likely not useful. Such vmcores come up often enough to be worth addressing, since we lose time if people think the vmcore is usable when it's not.

Now that I look at this, factoring out the various crash commands looks non-trivial due to the optional use of mock. Right now we either run the crash commands through mock or through crash directly, and save the output to a variable. Then after all the crash commands are run, each variable is saved into a file inside 'misc'. If we make a post_retrace hook script to do this, then we'd need to either ditch mock there or import some default post_retrace hook script into the mock environment.

I still like the idea of removing these post-setup crash commands since it shortens the time to complete a retrace-server task. Ideally retrace-server should send a notification that the task is ready as soon as it has the kernel identified, the kernel-debuginfo symbols set up, and crash runs without error. We shouldn't have to wait for many other crash commands to complete before getting a notification of success on the task.

Need to think about how to refactor this without creating a regression.

We need to start using hooks but there are a few problems. I think the first step is to complete this bug.
This may mean other patches, such as allowing the hook script(s) to fork while allowing retrace to finish with a notification that the vmcore can be loaded and the backtrace is available.

Here are my current thoughts about this bug. I think to support both the use case of users that just want immediate access to a vmcore / backtrace, as well as those that want to wait for all "automated analysis" via post-retrace hooks, we need to break up the existing behavior into two phases:

Phase 1. Once kernelver detection is done, the kernel-debuginfo file is set up, and the backtrace is available, we should send a notification that the vmcore is able to be loaded.

Phase 2. After the notification, we run the existing crash commands as a "built-in" post-retrace hook script. Then once all post-retrace hooks are complete, we can send a second notification.

Removal of crash commands and bt_filter merged: https://github.com/abrt/retrace-server/pull/252

For now the extra crash commands will go into external hook code. It is possible we could ship a retrace-server-hooks package with some default commands, but it is more work and the cost / benefit is unclear.

Fixed in retrace-server-1.21.0-1.el8
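
For reference, the two-phase flow discussed in this bug could look roughly like the sketch below. It is not the code that was merged: the function names (run_crash, phase_one, phase_two), the notification strings, the file names, and the command table are all hypothetical. It assumes crash can be driven non-interactively as `crash -s <vmcore> <vmlinux>` with commands fed on stdin, and it ignores mock entirely.

```python
import subprocess
from pathlib import Path

# Hypothetical table of output file name -> crash command. The bug only names
# kmem and 'foreach bt'; the exact commands here are illustrative.
DEFAULT_COMMANDS = {
    "kmem-i": "kmem -i",
    "foreach-bt": "foreach bt",
}


def run_crash(vmcore, vmlinux, command):
    """Run one crash command non-interactively; return (exit_code, stdout)."""
    proc = subprocess.run(
        ["crash", "-s", str(vmcore), str(vmlinux)],
        input=command + "\nquit\n",
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout


def phase_one(vmcore, vmlinux, notify, runner=run_crash):
    """Phase 1: check that crash can load the vmcore, then notify at once."""
    code, backtrace = runner(vmcore, vmlinux, "bt")
    if code != 0:
        # crash exited with an error: the vmcore is likely not useful.
        notify("failed")
        return None
    notify("vmcore loadable")
    return backtrace


def phase_two(vmcore, vmlinux, misc_dir, notify, runner=run_crash, commands=None):
    """Phase 2: run the extra commands as a hook and save each output in misc."""
    commands = DEFAULT_COMMANDS if commands is None else commands
    misc_dir = Path(misc_dir)
    misc_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, command in commands.items():
        code, output = runner(vmcore, vmlinux, command)
        if code == 0:
            # Only successful commands produce a file; failures are skipped.
            (misc_dir / name).write_text(output)
            written.append(name)
    # Second notification, sent once all hook commands have been attempted.
    notify("hooks complete")
    return written
```

The runner is passed in as a parameter so the same flow could be pointed at a mock-based runner, or at a stub for testing, without touching the phase logic.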