Bug 1748353
Summary: | multiple workers start the same retirement when retirement date is reached | |||
---|---|---|---|---|
Product: | Red Hat CloudForms Management Engine | Reporter: | Felix Dewaleyne <fdewaley> | |
Component: | Automate | Assignee: | Tina Fitzgerald <tfitzger> | |
Status: | CLOSED NOTABUG | QA Contact: | Devidas Gaikwad <dgaikwad> | |
Severity: | medium | Docs Contact: | Red Hat CloudForms Documentation <cloudforms-docs> | |
Priority: | medium | |||
Version: | 5.10.8 | CC: | dmetzger, fdewaley, ghubale, gmccullo, mkanoor, obarenbo, simaishi, tfitzger | |
Target Milestone: | GA | Keywords: | TestOnly, ZStream | |
Target Release: | 5.12.0 | |||
Hardware: | All | |||
OS: | All | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | If docs needed, set a value | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1764197 1767824 | Environment: | |
Last Closed: | 2020-06-10 12:33:21 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | Bug | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | CFME Core | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1764197, 1767824 |
Description
Felix Dewaleyne
2019-09-03 12:22:00 UTC
Retirement as a Request (RaaR) was introduced in 5.10. Part of the RaaR work was a new method to initiate the retirement process; it was added so that the retire_now method could be kept for backward compatibility. Retirement should work properly either way, depending on the Automate model in use. Most of the details of these changes aren't significant to the reported issue. What is important is the timing of entering the retirement state machine.

retire_now is event based: a request_*_event is raised when the method is called, and the event processing includes a call to the retirement state machine. The object's retirement_state is set to retiring in the first step of the state machine. Once retirement_state == retiring, the scheduler will not try to retire the object again because it considers it already retiring.

Retirement as a request has a very different workflow. A retire request is created when the method is called, the request has to be approved, and then tasks are created to process each of the retireable objects. The tasks then enter the retirement state machine. The object's retirement_state is set to retiring in the first step of the state machine. From this point on, the scheduler will not try to retire the object again because it considers it already retiring.

Service retirement has an additional challenge in that Services are not zone based. Retirement of VMs is queued to the VM's zone, but Services are queued without a zone, which means that service retirement will run anywhere a worker is available to pick up the work.

We recently made a change to have Service retirement checked at the region level instead of the zone level. Previously, if you had a region with 3 zones, you could get 3 retirement requests for the same service. With this change, you will only have 1 retirement request for the service.

https://github.com/ManageIQ/manageiq/pull/19143

Hi Felix,
The target release was just changed to 5.11.1.
Thanks,
Tina

Hi Dennis,
The PR referenced in comment 4 will resolve this issue. The PR has been merged, but is not tagged for backporting to Hammer. Should we add the hammer label and change the target release? Please advise.
Thanks,
Tina

Updating this for inclusion in 5.10.12
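
For anyone reading this later, the timing issue described above can be illustrated with a small simulation. This is only a rough sketch in Python, not ManageIQ code: the names Service, Region, scheduler_pass, enter_state_machine and simulate are hypothetical, and the model assumes each checker (one per zone before the change, one per region after it) creates a retirement request whenever the service is due and its retirement_state is not yet retiring.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Service:
    """Hypothetical stand-in for a retireable service (not a ManageIQ class)."""
    name: str
    retirement_due: bool = True
    # Stays None until the first state machine step sets it to "retiring".
    retirement_state: Optional[str] = None


@dataclass
class Region:
    """Hypothetical region holding its zones and the requests created so far."""
    zones: List[str]
    requests: List[str] = field(default_factory=list)


def scheduler_pass(service: Service, region: Region, per_zone: bool) -> None:
    """Run one retirement check.

    per_zone=True models the check running once per zone;
    per_zone=False models a single check at the region level.
    """
    checkers = region.zones if per_zone else ["region"]
    for checker in checkers:
        # The only guard modeled here is retirement_state == "retiring".
        if service.retirement_due and service.retirement_state != "retiring":
            region.requests.append(f"retire {service.name} (checked by {checker})")


def enter_state_machine(service: Service) -> None:
    """First step of the (simulated) retirement state machine."""
    service.retirement_state = "retiring"


def simulate(per_zone: bool) -> int:
    """Return how many requests one scheduler pass creates for one due service."""
    service = Service("svc1")
    region = Region(zones=["zone_a", "zone_b", "zone_c"])
    # Request workflow: the check runs before any task has entered the
    # state machine, so retirement_state is still unset for every checker.
    scheduler_pass(service, region, per_zone)
    # Approval and task creation would happen here; only afterwards does
    # the task enter the state machine and set the guard.
    enter_state_machine(service)
    # A later pass is now blocked by the guard and adds nothing.
    scheduler_pass(service, region, per_zone)
    return len(region.requests)


if __name__ == "__main__":
    print("per-zone check:", simulate(per_zone=True), "request(s)")
    print("region-level check:", simulate(per_zone=False), "request(s)")
```

In this toy model the guard only helps once the state machine has started, which is why reducing the number of checkers from one per zone to one per region is what removes the duplicate requests.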