Bug 1535237

Summary: [RFE][S-3] Log the Worker ID of the Previous Appliance/Process that Executed an Automate Task
Product: Red Hat CloudForms Management Engine
Reporter: myoder
Component: Automate
Assignee: Lucy Fu <lufu>
Status: CLOSED ERRATA
QA Contact: Dmitry Misharov <dmisharo>
Severity: high
Docs Contact:
Priority: unspecified
Version: 5.8.0
CC: cpelland, gblomqui, gmccullo, lufu, mfeifer, mkanoor, myoder, obarenbo, simaishi, tfitzger
Target Milestone: MVP
Keywords: FutureFeature
Target Release: 5.10.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: 5.10.0.2
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-02-07 23:00:57 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: Feature
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: CFME Core
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1555371    

Description myoder 2018-01-16 22:11:55 UTC
Description of problem:

An automate task can be put back on the queue and picked up again several times before it completes.  When the task is picked back up, it would be helpful to log the ID of the worker that was processing it before it was requeued.


For example, there may be two appliances with the automate role enabled (appliance 1 and appliance 2).

If an automate task is picked up by appliance 1, the task may get put back on the queue and picked up later by appliance 2.  Once appliance 2 picks up the task, there should be a log line indicating that appliance 1 was previously working on it.

This would also apply to the re-arch of CloudForms 4.6.  Any time an automate task is being worked and is put back on the queue, we would want to keep track of the worker ID of the generic/priority worker, so that when a new worker process picks up the task, we know the prior worker ID.  This will significantly help with debugging automate tasks in a podified environment.

Version-Release number of selected component (if applicable):
all

How reproducible:
always

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

This is to help the CloudForms Support Team troubleshoot automate tasks.

Comment 7 CFME Bot 2018-06-19 21:00:04 UTC
New commits detected on ManageIQ/manageiq-automation_engine/master:

https://github.com/ManageIQ/manageiq-automation_engine/commit/a223594b6a156223db3d2c017f9132de41667760
commit a223594b6a156223db3d2c017f9132de41667760
Author:     Lucy Fu <lufu>
AuthorDate: Mon May 21 09:39:08 2018 -0400
Commit:     Lucy Fu <lufu>
CommitDate: Mon May 21 09:39:08 2018 -0400

    Keep track of the worker id that has worked on the automate task.

    https://bugzilla.redhat.com/show_bug.cgi?id=1535237

 lib/miq_automation_engine/engine/miq_ae_engine.rb | 15 +
 spec/miq_ae_engine_spec.rb | 1 +
 2 files changed, 16 insertions(+)


https://github.com/ManageIQ/manageiq-automation_engine/commit/ddd7affa343cc1d9f0e1c7d768aba1abd4c1e8a4
commit ddd7affa343cc1d9f0e1c7d768aba1abd4c1e8a4
Author:     Lucy Fu <lufu>
AuthorDate: Wed May 23 15:08:26 2018 -0400
Commit:     Lucy Fu <lufu>
CommitDate: Wed May 23 15:08:26 2018 -0400

    Keep track of the server ids where the automate task has been processed.

    https://bugzilla.redhat.com/show_bug.cgi?id=1535237

 lib/miq_automation_engine/engine/miq_ae_engine.rb | 16 +-
 spec/miq_ae_engine_spec.rb | 1 -
 2 files changed, 1 insertion(+), 16 deletions(-)

Comment 8 CFME Bot 2018-06-19 21:06:27 UTC
New commits detected on ManageIQ/manageiq/master:

https://github.com/ManageIQ/manageiq/commit/679623f048b81878b450090968118206d02a512b
commit 679623f048b81878b450090968118206d02a512b
Author:     Lucy Fu <lufu>
AuthorDate: Wed May 23 15:05:16 2018 -0400
Commit:     Lucy Fu <lufu>
CommitDate: Wed May 23 15:05:16 2018 -0400

    Keep track of the server ids where the automate task has been processed on.

    https://bugzilla.redhat.com/show_bug.cgi?id=1535237

 app/models/miq_provision/state_machine.rb | 1 +
 app/models/miq_request_task/state_machine.rb | 1 +
 app/models/mixins/miq_request_mixin.rb | 6 +
 3 files changed, 8 insertions(+)


https://github.com/ManageIQ/manageiq/commit/96ab0711b9e8f3686123993ca5e39f47b191dfad
commit 96ab0711b9e8f3686123993ca5e39f47b191dfad
Author:     Lucy Fu <lufu>
AuthorDate: Wed May 23 15:04:16 2018 -0400
Commit:     Lucy Fu <lufu>
CommitDate: Wed May 23 15:04:16 2018 -0400

    Add test cases.

    https://bugzilla.redhat.com/show_bug.cgi?id=1535237

 spec/models/service_template_provision_task_spec.rb | 18 +
 1 file changed, 18 insertions(+)

Comment 9 Dmitry Misharov 2018-07-18 08:47:01 UTC
How can this be verified? I have two appliances in one zone, both with the automate role enabled. I provisioned a VM and did not find any mention of a server ID or the "executed_on_servers" option in the logs.

Comment 10 Greg McCullough 2018-07-24 12:58:10 UTC
Per the PR: "This PR tries to store the different server ids where a task was executed in the "options" hash. Which can then be used for debugging."

The ID is written to the options hash under the :executed_on_servers key as an array of server IDs.
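
As a rough illustration of the behavior described above, the following Ruby sketch appends a server ID to an :executed_on_servers array in a plain options hash each time a task is picked up. It is not the code from the commits: the helper name record_executed_on_server, the literal server IDs, and the choice to skip consecutive duplicates are all assumptions made for the example.

```ruby
# Hypothetical helper, NOT the ManageIQ implementation: records each server
# that processes a task under options[:executed_on_servers].
def record_executed_on_server(options, server_id)
  servers = options[:executed_on_servers] ||= []
  # Assumption: skip consecutive duplicates so repeated requeues on the
  # same server add only one entry.
  servers << server_id unless servers.last == server_id
  options
end

options = {}
record_executed_on_server(options, 1) # task picked up by appliance 1
record_executed_on_server(options, 1) # requeued, picked up by appliance 1 again
record_executed_on_server(options, 2) # requeued, picked up by appliance 2

# The second-to-last entry is the server that worked on the task previously.
previous_server = options[:executed_on_servers][-2]
puts "executed_on_servers=#{options[:executed_on_servers].inspect} previous=#{previous_server.inspect}"
```

With a history like this, whoever is debugging can read the array from the task's options hash to see which servers handled the task and in what order.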

Comment 11 Dmitry Misharov 2018-07-25 06:34:15 UTC
Verified in 5.10.0.4.20180712211305_e6e4542. The ID is in the options hash.

Comment 12 errata-xmlrpc 2019-02-07 23:00:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:0212