Bug 1232019
Summary: | retrace-server should fail tasks where crash exits with an error and retrace_backtrace file is tiny and contains "log: seek error" | | |
---|---|---|---|
Product: | [Fedora] Fedora EPEL | Reporter: | Dave Wysochanski <dwysocha> |
Component: | retrace-server | Assignee: | Dave Wysochanski <dwysocha> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | low | Docs Contact: | |
Priority: | medium | | |
Version: | el6 | CC: | abrt-devel-list, bubrown, jberan |
Target Milestone: | --- | Keywords: | Patch |
Target Release: | --- | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2018-12-21 15:42:29 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Attachments: | | | |
Description
Dave Wysochanski
2015-06-15 20:37:40 UTC
Marking medium priority / low severity. Assuming it's not too invasive, it would be good to remove tasks like this, which we know are useless. I have not tried a patch yet.

This has become a problem again as we have received larger 'vmem' files that take up our excess space margin. They 'succeed' because our kernelver detection can find a kernelver, but they are useless because crash fails to load them. These files need to be converted with the third-party VMware 'vmss2core' tool to be useful. The end result today is that we keep these vmem files around longer than we should (we should remove these based on DeleteFailedTaskAfter, not DeleteTaskAfter).

Probably this is simple: just look for a non-zero 'crash' exit code. But it will require a decent amount of testing to make sure no corner cases exist (marking something not useful that we should retain, confirming the crash exit code is correct, etc.). Also, we should think about whether we need another status code or can just re-use 'failed'.

FWIW, example of the failure from comment #2. The crash output has "not a supported file format", and this is fairly common:

    crash: /cores/retrace/tasks/784434840/crash/vmcore: not a supported file format
    Usage:
      crash [OPTION]... NAMELIST MEMORY-IMAGE[@ADDRESS] (dumpfile form)
      crash [OPTION]... [NAMELIST] (live system form)
    Enter "crash -h" for details.

Created attachment 1403242 [details]
v3: fail a task if crash 'sys' command exits with non-zero status and size of kernellog is less than 1024 bytes
Created attachment 1404350 [details]
v4: fail a task if crash 'sys' command exits with non-zero status and size of kernellog is less than 1024 bytes
Created attachment 1404809 [details]
v5: fail a task if crash 'sys' command exits with non-zero status and size of kernellog is less than 1024 bytes
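The check named in the v3 to v5 patch titles could be sketched roughly as follows. This is a minimal illustration only, not the merged retrace-server code: the function name `should_fail_task`, the constant name `KERNELLOG_MIN_BYTES`, and the way the exit status and log path are passed in are all assumptions. Only the two conditions (non-zero exit status from crash's 'sys' command, kernellog smaller than 1024 bytes) come from the attachment descriptions.

```python
import os

# Threshold taken from the patch titles; the constant name is hypothetical.
KERNELLOG_MIN_BYTES = 1024


def should_fail_task(crash_exit_status, kernellog_path):
    """Return True when the task should be marked failed.

    Both conditions must hold: crash's 'sys' command exited with a
    non-zero status, and the kernel log is smaller than the threshold.
    """
    if crash_exit_status == 0:
        return False
    try:
        size = os.path.getsize(kernellog_path)
    except OSError:
        # Assumption: a missing or unreadable log counts as empty.
        size = 0
    return size < KERNELLOG_MIN_BYTES
```

Requiring both conditions, rather than the exit status alone, is presumably a guard against the corner case raised in the description, where a task that is still useful would otherwise be marked failed.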
After much testing, the pull request from comment #9 has been merged and deployed in production. So far it looks good.

$ git tag --contains daea8e8
1.19.0
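For context on why the task status matters: the description notes that these tasks should age out under DeleteFailedTaskAfter rather than DeleteTaskAfter. A minimal sketch of that selection is below. It assumes both settings are expressed in hours; the function name `task_expired` and its parameters are hypothetical and are not taken from retrace-server's cleanup code.

```python
from datetime import datetime, timedelta


def task_expired(failed, finished_at, now,
                 delete_task_after, delete_failed_task_after):
    """Return True when a finished task is old enough to be removed.

    Failed tasks use the DeleteFailedTaskAfter value, all other tasks use
    DeleteTaskAfter. Both values are assumed to be in hours.
    """
    limit_hours = delete_failed_task_after if failed else delete_task_after
    return now - finished_at >= timedelta(hours=limit_hours)
```

With a shorter DeleteFailedTaskAfter, a task that is correctly marked failed releases its vmem file sooner than one that wrongly counts as successful.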