Bug 1457979

Summary: After killing reporting worker, report status still says Running
Product: Red Hat CloudForms Management Engine
Reporter: Satoe Imaishi <simaishi>
Component: Reporting
Assignee: Yuri Rudman <yrudman>
Status: CLOSED ERRATA
QA Contact: Tasos Papaioannou <tpapaioa>
Severity: high
Docs Contact:
Priority: high
Version: 5.6.0
CC: cpelland, hkataria, jhardy, jocarter, mpovolny, obarenbo, simaishi, tpapaioa, yrudman
Target Milestone: GA
Keywords: ZStream
Target Release: 5.7.4
Hardware: Unspecified
OS: Unspecified
Whiteboard: report:worker
Fixed In Version: 5.7.4.0
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1397600
Environment:
Last Closed: 2017-12-18 20:25:08 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: CFME Core
Target Upstream Version:
Embargoed:
Bug Depends On: 1397600
Bug Blocks:

Comment 2 Yuri Rudman 2017-06-01 19:36:06 UTC
PR: https://github.com/ManageIQ/manageiq/pull/15277

Comment 3 Tasos Papaioannou 2017-06-13 15:53:20 UTC
Tested on 5.7.3.1 with the following steps:

****
1.) Set task timeout check frequency and timeout values:

:task_timeout_check_frequency: 600

:active_task_timeout: 10.minutes

2.) Queue a bunch of reports.

3.) Kill the MiqReportingWorker pid(s).

4.) Repeat steps 2 and 3 a couple of times, until one of the reports gets stuck with a Running status.

5.) After ~10 minutes, see entries like the following in evm.log, and verify that the reports show a status of Error in the web UI.

[----] I, [2017-06-12T16:05:14.491076 #18861:3bd134]  INFO -- : MIQ(MiqTask#update_status) Task: [213] [Finished] [Error] [Task [213] timed out - not active for more than 10.minutes]
[----] I, [2017-06-12T16:05:14.518118 #18861:3bd134]  INFO -- : MIQ(MiqTask#update_status) Task: [214] [Finished] [Error] [Task [214] timed out - not active for more than 10.minutes]
****
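
For readers unfamiliar with the mechanism being tested, the steps above exercise a periodic sweep that marks long-inactive tasks as failed. The following is only a minimal sketch of that idea in plain Ruby, not the code from the PR: the `Task` struct, its field names, the "Active" state value, and the `expire_stale_tasks` helper are all assumptions made for illustration.

```ruby
# Hypothetical stand-in for an miq_tasks row; the field names are assumptions.
Task = Struct.new(:id, :state, :status, :message, :updated_on)

# Mark every active task that has not been updated within `timeout` seconds
# as Finished/Error, and return the tasks that were changed.
def expire_stale_tasks(tasks, timeout, now = Time.now)
  stale = tasks.select { |t| t.state == "Active" && now - t.updated_on > timeout }
  stale.each do |t|
    t.state   = "Finished"
    t.status  = "Error"
    t.message = "Task [#{t.id}] timed out - not active for more than #{timeout} seconds"
  end
  stale
end

now   = Time.utc(2017, 6, 12, 16, 5, 14)
tasks = [
  Task.new(213, "Active", "Ok", "Running", now - 900), # no update for 15 minutes
  Task.new(215, "Active", "Ok", "Running", now - 60),  # still within the timeout
]
expired = expire_stale_tasks(tasks, 600, now)
puts expired.map(&:message)
```

In this sketch a report whose worker was killed never updates its task again, so its last-update time eventually falls outside the timeout window and the sweep flips it to Error, while a task that is still being updated is left alone.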

A minor point: I think the error message should print the timeout as "10 minutes" rather than echoing the raw settings value "10.minutes". Also, nothing appears to delete the miq_tasks entries from the database; the only tasks that ever get deleted are 'performance rollup' tasks. Is that the intended behavior?

Comment 4 Yuri Rudman 2017-06-14 17:22:09 UTC
PR: https://github.com/ManageIQ/manageiq/pull/15370

Comment 6 CFME Bot 2017-06-30 20:35:59 UTC
New commit detected on ManageIQ/manageiq/euwe:
https://github.com/ManageIQ/manageiq/commit/8935ee4b9c5cc61b39745fb7a8de10f80bf75b0b

commit 8935ee4b9c5cc61b39745fb7a8de10f80bf75b0b
Author:     Gregg Tanzillo <gtanzill>
AuthorDate: Fri Jun 16 14:16:01 2017 -0400
Commit:     Satoe Imaishi <simaishi>
CommitDate: Fri Jun 30 16:29:56 2017 -0400

    Merge pull request #15370 from yrudman/fix-time-format-in-logging-wjen-task-expired
    
    Format time interval for log message
    (cherry picked from commit 0f95dd196c38d68e6a078822c3f8c3086c42e07b)
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1457979

 app/models/miq_task.rb       | 2 +-
 spec/models/miq_task_spec.rb | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

Comment 7 Tasos Papaioannou 2017-09-19 15:07:59 UTC
Verified on 5.7.4.0. With active_task_timeout set to 10.minutes, a reporting task that is stuck at Running because of a killed MiqReportingWorker process will time out:

****
[----] I, [2017-09-19T10:54:03.751367 #36182:1165138]  INFO -- : MIQ(MiqTask#update_status) Task: [163] [Finished] [Error] [Task [163] timed out - not active for more than 600 seconds]
****
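
The change in the logged value (from the raw setting "10.minutes" to "600 seconds") amounts to converting the settings string to a number of seconds before interpolating it into the message. Below is a rough, self-contained sketch of such a conversion. It is not the code from the PR: `interval_to_seconds`, `timeout_message`, and the `UNIT_SECONDS` table are hypothetical names, and the set of supported units is an assumption.

```ruby
# Seconds per unit for the units this sketch understands (an assumed subset).
UNIT_SECONDS = { "second" => 1, "minute" => 60, "hour" => 3600, "day" => 86_400 }.freeze

# Convert a settings string such as "10.minutes" into a number of seconds.
def interval_to_seconds(setting)
  match = /\A(\d+)\.(second|minute|hour|day)s?\z/.match(setting.to_s.strip)
  raise ArgumentError, "unsupported interval: #{setting.inspect}" unless match

  match[1].to_i * UNIT_SECONDS.fetch(match[2])
end

# Build a timeout message that reports the interval in seconds.
def timeout_message(task_id, setting)
  "Task [#{task_id}] timed out - not active for more than #{interval_to_seconds(setting)} seconds"
end

puts timeout_message(163, "10.minutes")
```

Reporting the interval in seconds avoids leaking the settings syntax into the log, at the cost of being less readable for long timeouts.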

Comment 10 errata-xmlrpc 2017-12-18 20:25:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3484