Description of problem: ae_max_retries does not show the correct value after one retry Version-Release number of selected component (if applicable): 5.8.2 How reproducible: customer environemnt Steps to Reproduce: 1. create a state machine that references an instance calling a method 2. make the max_retry of the call to that instance an arbitrary number such as 3 3. in the method, make sure you write into the logs the value of ae_max_retries Actual results: ae_max_retries jumps to 200 at the second run Expected results: ae_max_retries remains the same Additional info:
confirmed : the higher number of attempts seen on the second run comes from another automate that was calling this one. it seems the bug is rather that the ae_max_retries gets overwritten by the top parent on the first failure.
Investigating issue.... I see the max_retry change from 3 to 200. before retry: [----] I, [2017-12-20T10:13:13.068479 #9345:102cdc0] INFO -- : <AutomationEngine> <AEMethod vcenter_service_assigned> /v.b/Integration/VMware/VCenter/Methods/VcenterServiceAssigned - state_retries: 0, max_retries: 3, ae_state_retries: 0, ae_state_max_retries: 3 [----] I, [2017-12-20T10:13:13.098374 #9345:102cdc0] INFO -- : <AutomationEngine> <AEMethod vcenter_service_assigned> /v.b/Integration/VMware/VCenter/Methods/VcenterServiceAssigned - VM 'test_fqdn' is not still part of any service, count '0', retrying... [----] I, [2017-12-20T10:13:13.121685 #9345:eb7134] INFO -- : <AutomationEngine> <AEMethod [/v.b/Integration/VMware/VCenter/Methods/vcenter_service_assigned]> Ending first retry: ----] I, [2017-12-20T10:14:09.369384 #10492:f6db28] INFO -- : <AutomationEngine> <AEMethod vcenter_service_assigned> /v.b/Integration/VMware/VCenter/Methods/VcenterServiceAssigned - state_retries: 2, max_retries: 200, ae_state_retries: 2, ae_state_max_retries: 200 [----] I, [2017-12-20T10:14:09.401014 #10492:f6db28] INFO -- : <AutomationEngine> <AEMethod vcenter_service_assigned> /v.b/Integration/VMware/VCenter/Methods/VcenterServiceAssigned - VM 'test_fqdn' is not still part of any service, count '2', retrying...
Due to new data from the customer I wish to make an update on the steps to reproduce. you need to call two layers of state machines, the topmost using a max_retry of 50 for example, calling a state machine that will use a max_retry of 2 calling the instance that will call the method. The result is that on the second retry the value will jump to the count of the topmost and not continue from 2.
Hi Felix, Thanks for the additional information. Although the $evm.root['ae_state_max_retries'] value is incorrect, the state machine is processed with the correct max retries value. Customer log shows incorrect value of ae_state_max_retries = 200: [----] I, [2017-12-20T10:14:38.009888 #9345:102c758] INFO -- : <AEMethod vcenter_service_assigned> /v.b/Integration/VMware/VCenter/Methods/VcenterServiceAssigned - state_retries: 3, max_retries: 200, ae_state_retries: 3, ae_state_max_retries: 200 [----] I, [2017-12-20T10:14:38.017976 #9345:102c758] INFO -- : <AEMethod vcenter_service_assigned> /v.b/Integration/VMware/VCenter/Methods/VcenterServiceAssigned - VM 'test_fqdn' is not still part of any service, count '3', retrying... [----] I, [2017-12-20T10:14:38.037778 #9345:eb7134] INFO -- : <AEMethod [/v.b/Integration/VMware/VCenter/Methods/vcenter_service_assigned]> Ending [----] I, [2017-12-20T10:14:38.037900 #9345:eb7134] INFO -- : Method exited with rc=MIQ_OK .... State machine correctly aborts resolution due to the max_retries = 3: ----] I, [2017-12-20T10:15:03.083971 #9345:eb7134] INFO -- : Processing State=[state2] [----] E, [2017-12-20T10:15:03.084187 #9345:eb7134] ERROR -- : State=<state2> running raised exception: <number of retries <4> exceeded maximum of <3>> [----] W, [2017-12-20T10:15:03.084310 #9345:eb7134] WARN -- : Error in State=[state2] I'm working on a fix for the incorrect root object value. Let me know if you have any questions. Thanks, Tina
https://github.com/ManageIQ/manageiq-automation_engine/pull/137
New commit detected on ManageIQ/manageiq-automation_engine/master: https://github.com/ManageIQ/manageiq-automation_engine/commit/eb92959679fbb419d14e1625b6f201550e4189d0 commit eb92959679fbb419d14e1625b6f201550e4189d0 Author: Tina Fitzgerald <tfitzger> AuthorDate: Wed Jan 3 13:18:10 2018 -0500 Commit: Tina Fitzgerald <tfitzger> CommitDate: Wed Jan 3 17:37:54 2018 -0500 Fix incorrect root['ae_state_max_retries'] value. https://bugzilla.redhat.com/show_bug.cgi?id=1529735 .../engine/miq_ae_engine/miq_ae_state_info.rb | 2 +- .../engine/miq_ae_engine/miq_ae_state_machine.rb | 2 +- spec/engine/miq_ae_state_machine_multi_spec.rb | 10 ++++++---- 3 files changed, 8 insertions(+), 6 deletions(-)