Bug 1570950

Summary: Service and VM retirement are non-deterministic, running parallel
Product: Red Hat CloudForms Management Engine Reporter: Satoe Imaishi <simaishi>
Component: Automate Assignee: drew uhlmann <duhlmann>
Status: CLOSED NEXTRELEASE QA Contact: Dave Johnson <dajohnso>
Severity: high Docs Contact:
Priority: high    
Version: 5.8.0 CC: cpelland, dmisharo, duhlmann, gekis, jkrocil, mkanoor, obarenbo, smallamp, tfitzger
Target Milestone: GA Keywords: ZStream
Target Release: 5.9.2 Flags: gekis: needinfo?
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 5.9.2.4 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1563627 Environment:
all CFME (bug should be cloned)
Last Closed: 2018-05-03 20:34:42 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: Unknown Target Upstream Version:
Embargoed:
Bug Depends On: 1563627    
Bug Blocks:    

Comment 2 CFME Bot 2018-04-23 20:21:37 UTC
New commit detected on ManageIQ/manageiq/gaprindashvili:

https://github.com/ManageIQ/manageiq/commit/b8bc7bed2cd1a3fdbd81454cb8e176c7ec5c02d5
commit b8bc7bed2cd1a3fdbd81454cb8e176c7ec5c02d5
Author:     Greg McCullough <gmccullo>
AuthorDate: Thu Apr 19 08:47:41 2018 -0400
Commit:     Greg McCullough <gmccullo>
CommitDate: Thu Apr 19 08:47:41 2018 -0400

    Merge pull request #17280 from d-m-u/adding_lock_to_retire_now

    Add lock to retire_now start
    (cherry picked from commit 2deffa58399461396e3bbd57b93fc744f0c64e88)

    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1570950

 app/models/mixins/retirement_mixin.rb | 43 +-
 spec/models/service/retirement_management_spec.rb | 23 +-
 spec/models/vm/retirement_management_spec.rb | 19 +
 3 files changed, 67 insertions(+), 18 deletions(-)
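
The diff itself is not reproduced in this report, so the following is only a minimal sketch of the idea named in the commit title (taking a lock at the start of retire_now). The FakeService class, its fields, and the in-process Mutex are all assumptions made for illustration. A real fix would need a lock that is shared across appliances (for example, a database row lock), which a Ruby Mutex is not.

```ruby
# Hypothetical stand-in for a retirable object; not the ManageIQ model.
class FakeService
  attr_reader :events

  def initialize
    @lock = Mutex.new              # assumption: stands in for a shared lock
    @retirement_requested = false
    @events = []
  end

  # Check and set the "already requested" flag under one lock, so only
  # the first caller raises the retirement event.
  def retire_now(requester)
    @lock.synchronize do
      return false if @retirement_requested

      @retirement_requested = true
      @events << "request_service_retire (#{requester})"
      true
    end
  end
end

service = FakeService.new

# Four callers race to retire the same service, one per hypothetical zone.
results = 4.times.map do |i|
  Thread.new { service.retire_now("zone#{i + 1}") }
end.map(&:value)

puts results.count(true)   # 1
puts service.events.size   # 1
```

Because the check and the update happen inside the same critical section, the three callers that lose the race see the flag already set and return without raising a second event.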

Comment 4 Tina Fitzgerald 2018-05-03 11:53:03 UTC
This is difficult to reproduce, but probably the best setup would be: 

1. An environment where there are 4 zones.
2. A Service with a retire_on date of today.

Each of the schedule workers should initiate the Service retirement at the same time, causing the request_service_retire event to be raised 4 times.
You should be able to see the events being raised and the start_retirement Automate method of the service retirement state machine running.
The more zones, the better the odds of reproducing the customer scenario.
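
The scenario above can be modelled as a check-then-act race. This is a hedged, deterministic simulation rather than CFME code: DemoService, due_for_retirement?, and the zone names are hypothetical, and the two phases stand in for four schedule workers that all read the service state before any of them updates it.

```ruby
require "date"

# Hypothetical model of a service with a retire_on date; not the ManageIQ model.
DemoService = Struct.new(:name, :retire_on, :retirement_requested, :events)

# A worker considers the service due if its date has arrived and
# nobody has recorded a retirement request yet.
def due_for_retirement?(service)
  service.retire_on <= Date.today && !service.retirement_requested
end

service = DemoService.new("demo", Date.today, false, [])
zones = %w[zone1 zone2 zone3 zone4]

# Phase 1: every zone's worker checks at the same moment, so all of them
# see an unretired service.
ready = zones.select { |_zone| due_for_retirement?(service) }

# Phase 2: each worker acts on its now-stale check and raises the event.
ready.each do |zone|
  service.events << "request_service_retire from #{zone}"
  service.retirement_requested = true
end

puts service.events.size   # 4
```

With more zones, more workers fall into the window between the check and the update, which matches the observation that more zones improve the odds of reproducing the problem.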

Comment 5 Sudhir Mallamprabhakara 2018-05-03 20:34:42 UTC
Closing this one as it is hard for QE to validate.