Bug 1519538
| Summary: | Smartstate request taking too long is killed because Worker Monitoring Code incorrectly thinks the busy Smartproxy Worker is not responding | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat CloudForms Management Engine | Reporter: | Jerry Keselman <jkeselma> | ||||||
| Component: | SmartState Analysis | Assignee: | Jerry Keselman <jkeselma> | ||||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Dave Johnson <dajohnso> | ||||||
| Severity: | high | Docs Contact: | |||||||
| Priority: | high | ||||||||
| Version: | 5.8.0 | CC: | jhardy, obarenbo, simaishi | ||||||
| Target Milestone: | GA | Keywords: | TestOnly | ||||||
| Target Release: | 5.10.0 | ||||||||
| Hardware: | Unspecified | ||||||||
| OS: | Unspecified | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | 5.10.0.0 | Doc Type: | If docs needed, set a value | ||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | |||||||||
| : | 1531299 1532854 (view as bug list) | Environment: | |||||||
| Last Closed: | 2018-06-21 20:47:36 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Bug Depends On: | |||||||||
| Bug Blocks: | 1531299, 1532854 | ||||||||
| Attachments: |
|
||||||||
|
Description
Jerry Keselman
2017-11-30 21:07:23 UTC
Created attachment 1361227 [details]
An example of this issue in the evm.log file.
Created attachment 1361228 [details]
Another example of the issue.
As a note of info the two cases shown in the attachments are both for Linux Instances. New commit detected on ManageIQ/manageiq/master: https://github.com/ManageIQ/manageiq/commit/906ed996fbf690844d9a9382c3019a31fb4482da commit 906ed996fbf690844d9a9382c3019a31fb4482da Author: Jerry Keselman <jerryk> AuthorDate: Tue Dec 19 10:10:07 2017 -0500 Commit: Jerry Keselman <jerryk> CommitDate: Thu Dec 21 15:48:26 2017 -0500 Add Heartbeat Thread to SmartProxy Worker In order to fix an issue where long-running Smartstate jobs get killed under the mistaken assumption that they are being unresponsive when they are actually quite busy, a separate thread is being added to the SmartProxy Worker which just heartbeats every 30 seconds. Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1519538 app/models/miq_smart_proxy_worker/runner.rb | 66 +++++++++++++++++++++++++++++ config/settings.yml | 3 +- 2 files changed, 68 insertions(+), 1 deletion(-) |