Bug 1468635

Summary: unable to retire a bundle containing ansible tower service in 5.8
Product: Red Hat CloudForms Management Engine Reporter: Felix Dewaleyne <fdewaley>
Component: Providers Assignee: Bill Wei <bilwei>
Status: CLOSED CURRENTRELEASE QA Contact: Nandini Chandra <nachandr>
Severity: medium Docs Contact:
Priority: medium    
Version: 5.8.0 CC: ahoness, bilwei, cpelland, dajohnso, dberger, dluong, duhlmann, fdewaley, gblomqui, gekis, gmccullo, jfrey, jhardy, jwarnica, kkulkarn, nachandr, obarenbo, simaishi, tfitzger, vdulava
Target Milestone: GA Keywords: Reopened, TestOnly, ZStream
Target Release: 5.10.0   
Hardware: All   
OS: All   
Whiteboard: provider:azure:storage
Fixed In Version: 5.10.0.0 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1600670 (view as bug list) Environment:
Last Closed: 2019-02-11 13:55:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: Azure Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1600670    
Attachments:
Description Flags
service rake script none

Description Felix Dewaleyne 2017-07-07 15:18:36 UTC
Description of problem:
Unable to remove storage that was part of an Azure orchestration in 5.8, using the technique described in https://access.redhat.com/knowledge/solutions/2868121

Version-Release number of selected component (if applicable):
5.8

How reproducible:
all the time

Steps to Reproduce:
1. set up the service and retirement to use https://access.redhat.com/knowledge/solutions/2868121
2. provision
3. retire

Actual results:
The storage isn't removed, and a stack trace is left in the logs:

[----] E, [2017-07-05T09:41:52.984458 #25321:53396f4] ERROR -- : <AutomationEngine> <AEMethod remove_from_provider> The following error occurred during method evaluation:
[----] E, [2017-07-05T09:41:52.985598 #25321:53396f4] ERROR -- : <AutomationEngine> <AEMethod remove_from_provider>   NotImplementedError: raw_delete_stack must be implemented in a subclass
[----] E, [2017-07-05T09:41:52.991095 #25321:53396f4] ERROR -- : <AutomationEngine> <AEMethod remove_from_provider>   (druby://127.0.0.1:39489) /var/www/miq/vmdb/app/models/orchestration_stack.rb:121:in `raw_delete_stack'
(druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:266:in `public_send'
(druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:266:in `block in object_send'
(druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:285:in `ar_method'
(druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:295:in `ar_method'
(druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:264:in `object_send'
(druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:119:in `block (2 levels) in expose'
(druby://127.0.0.1:39489) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1624:in `perform_without_block'
(druby://127.0.0.1:39489) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1584:in `perform'
(druby://127.0.0.1:39489) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1657:in `block (2 levels) in main_loop'
(druby://127.0.0.1:39489) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1653:in `loop'
(druby://127.0.0.1:39489) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1653:in `block in main_loop'
[----] E, [2017-07-05T09:41:53.002412 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR: (druby://127.0.0.1:39489) /var/www/miq/vmdb/app/models/orchestration_stack.rb:121:in `raw_delete_stack': raw_delete_stack must be implemented in a subclass (NotImplementedError)
[----] E, [2017-07-05T09:41:53.003596 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:266:in `public_send'
[----] E, [2017-07-05T09:41:53.004662 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:266:in `block in object_send'
[----] E, [2017-07-05T09:41:53.005719 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:285:in `ar_method'
[----] E, [2017-07-05T09:41:53.010551 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:295:in `ar_method'
[----] E, [2017-07-05T09:41:53.011961 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:264:in `object_send'
[----] E, [2017-07-05T09:41:53.013182 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:119:in `block (2 levels) in expose'
[----] E, [2017-07-05T09:41:53.014298 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1624:in `perform_without_block'
[----] E, [2017-07-05T09:41:53.015283 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1584:in `perform'
[----] E, [2017-07-05T09:41:53.016358 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1657:in `block (2 levels) in main_loop'
[----] E, [2017-07-05T09:41:53.017366 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1653:in `loop'
[----] E, [2017-07-05T09:41:53.018339 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1653:in `block in main_loop'
[----] E, [2017-07-05T09:41:53.019378 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from /Rotterdam_Retirement_21_06_2017/Cloud/Orchestration/Retirement/StateMachines/Methods/remove_from_provider:20:in `<main>'
[----] I, [2017-07-05T09:41:53.083484 #25321:e1314c]  INFO -- : <AutomationEngine> <AEMethod [/Rotterdam_Retirement_21_06_2017/Cloud/Orchestration/Retirement/StateMachines/Methods/remove_from_provider]> Ending


Expected results:
The storage is removed from Azure as expected.

Additional info:
customer data provided in private note

Comment 5 Greg McCullough 2017-07-19 13:40:43 UTC
Drew, I suggest we add a noop raw_delete_stack method for the Ansible Tower subclass of OrchestrationStack.
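
As a rough illustration of that suggestion, here is a minimal sketch in plain Ruby. The class names are simplified stand-ins rather than the real ManageIQ models, and only the error message is taken from the trace above.

```ruby
# Simplified stand-in for the base model; not the real ManageIQ class body.
class OrchestrationStack
  # Mirrors the behaviour seen in the trace: the base class refuses the call.
  def raw_delete_stack
    raise NotImplementedError, "raw_delete_stack must be implemented in a subclass"
  end
end

# Hypothetical Ansible Tower job subclass with a no-op override.
class AnsibleTowerJob < OrchestrationStack
  def raw_delete_stack
    # Nothing is deleted on the provider side; returning nil lets
    # the caller carry on with the rest of the retirement.
    nil
  end
end

AnsibleTowerJob.new.raw_delete_stack # no exception is raised
```

Note that in Ruby NotImplementedError descends from ScriptError, not StandardError, so a bare `rescue => e` in the calling automate method would not catch it.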

Comment 6 Tina Fitzgerald 2017-07-31 16:20:41 UTC
Hi Felix,

We're still looking into the issue, but we haven't been able to reproduce it. 

I've attached a rake script for the customer to run on the Services in question. The script logs all of the Service information.

Could you ask the customer the following:
1. Copy the attached evm_service.rake file to the vmdb/lib/tasks folder.
2. Run the inspect_service rake command for 2 of the Orchestration Services that exhibit the reported behavior.
3. Send us the rake task output.

There are two ways to inspect a service, by service ID or by request ID:
bin/rake evm:service:inspect_service SERVICE_ID=13
bin/rake evm:service:inspect_service REQUEST_ID=208

Thanks,
Tina

Comment 7 Tina Fitzgerald 2017-07-31 16:23:15 UTC
Created attachment 1307201 [details]
service rake script

Comment 12 Felix Dewaleyne 2017-08-29 08:39:17 UTC
the issue is still present and the customer has re-started communicating with us - they provided new logs.

Comment 17 Tina Fitzgerald 2017-09-20 17:16:25 UTC
Hi Felix,

Based on the case history here with the refresh issues, I'm not confident of the state of the provisioned Services as it relates to retirement. 

I'd like the customer to start by provisioning a Service, then immediately retire the Service, sending us the logs which include the full provisioning and retirement processes along with the Service ID/name and Request ID. 

Thanks,
Tina

Comment 18 Felix Dewaleyne 2017-09-26 10:13:38 UTC
(In reply to Tina Fitzgerald from comment #17)
> Hi Felix,
> 
> Based on the case history here with the refresh issues, I'm not confident of
> the state of the provisioned Services as it relates to retirement. 
> 
> I'd like the customer to start by provisioning a Service, then immediately
> retire the Service, sending us the logs which include the full provisioning
> and retirement processes along with the Service ID/name and Request ID. 
> 
> Thanks,
> Tina

The customer said they would provide the data today; I'll share it as soon as it is available.

Comment 19 drew uhlmann 2017-10-02 19:25:43 UTC
Hey Felix! Is there any update on this issue?

Comment 20 Felix Dewaleyne 2017-10-03 14:11:44 UTC
(In reply to drew uhlmann from comment #19)
> Hey Felix! Is there any update on this issue?

I have the data! let me see how I can share it...

Comment 22 drew uhlmann 2017-10-03 16:01:59 UTC
Hey Felix! I'm so sorry about this, but the timestamps on the evm log do not make much sense to me. The event of interest starts at 20:19 on 9/27 but the evm log times only cover 3:11 to 5:23 and thus don't include the pertinent information about the event. Could you please send us a copy of the relevant logs?
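
A quick way to check whether a log covers a given moment is to compare the first and last timestamps in it. The sketch below assumes the line prefix format seen in the description of this bug and compares the timestamps as plain strings, which only works while they keep the same fixed-width layout. The helper names are made up for illustration.

```ruby
# Matches the prefix of lines such as:
# [----] E, [2017-07-05T09:41:52.985598 #25321:53396f4] ERROR -- : message
EVM_LINE = /\A\[----\] [A-Z], \[(?<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+) #\d+:\h+\]\s+\w+ -- : /

# Returns [earliest, latest] timestamp strings, or nil if no line matches.
def log_time_range(lines)
  stamps = lines.map { |line| (m = EVM_LINE.match(line)) && m[:ts] }.compact
  stamps.empty? ? nil : stamps.minmax
end

# True when the wanted timestamp falls inside the range covered by the lines.
def covers?(lines, wanted)
  first, last = log_time_range(lines)
  return false if first.nil?
  first <= wanted && wanted <= last
end

sample = [
  '[----] E, [2017-07-05T09:41:52.985598 #25321:53396f4] ERROR -- : <AutomationEngine> NotImplementedError',
  '[----] I, [2017-07-05T09:41:53.083484 #25321:e1314c]  INFO -- : <AutomationEngine> Ending'
]
puts covers?(sample, "2017-07-05T09:41:53.000000")
```

With a real file, the lines could come from `File.foreach(path)`, where `path` is wherever the evm log was saved.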

Comment 23 Felix Dewaleyne 2017-10-10 08:54:58 UTC
(In reply to drew uhlmann from comment #22)
> Hey Felix! I'm so sorry about this, but the timestamps on the evm log do not
> make much sense to me. The event of interest starts at 20:19 on 9/27 but the
> evm log times only cover 3:11 to 5:23 and thus don't include the pertinent
> information about the event. Could you please send us a copy of the relevant
> logs?

The logs I shared on the private server are all I have. I can request new data; the complete set of logs from the customer is available in the parent folder.

Comment 24 Greg McCullough 2017-10-10 10:55:33 UTC
*** Bug 1497175 has been marked as a duplicate of this bug. ***

Comment 25 Felix Dewaleyne 2017-10-11 08:24:07 UTC
please let me know if I absolutely need to request new data based on the inconsistency.

Comment 26 Bill Wei 2017-10-16 21:08:52 UTC
In the customer's setup they created a bundle which included three Ansible Tower services and one Azure Orchestration Stack service. When the bundle was retired, automate attempted to retire all services. Two problems have been reported from the retirement operation:

1. Error: method raw_delete_stack not implemented. It came from retiring the Ansible Tower services. Ansible Tower Job, although a subclass of OrchestrationStack, does not support retirement. We can fix the issue by skipping Ansible Tower Job retirement.

2. Disk not removed. The reason should be found in azure.log. I have posted the analysis in BZ 149715.
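
To make the first point concrete, here is a small self-contained sketch of the "skip" idea. It is not the automate method itself: `Stack`, its `deletable` flag and `remove_from_provider` are hypothetical stand-ins, and the bundle only mimics the shape described above (three Tower jobs, one Azure stack).

```ruby
# Hypothetical stand-in for an orchestration stack; `deletable` is an
# invented flag marking whether provider-side deletion is supported.
Stack = Struct.new(:name, :deletable) do
  def raw_delete_stack
    unless deletable
      raise NotImplementedError, "raw_delete_stack must be implemented in a subclass"
    end
    :deleted
  end
end

# Deletes what can be deleted and skips the rest instead of failing.
def remove_from_provider(stacks)
  stacks.each_with_object({}) do |stack, outcome|
    outcome[stack.name] = stack.deletable ? stack.raw_delete_stack : :skipped
  end
end

bundle = [
  Stack.new("tower-job-1", false),
  Stack.new("tower-job-2", false),
  Stack.new("tower-job-3", false),
  Stack.new("azure-stack", true)
]
outcome = remove_from_provider(bundle)
puts outcome.inspect
```

The point of the design is that one unsupported member no longer aborts retirement of the whole bundle.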

Comment 27 Bill Wei 2017-10-16 21:18:52 UTC
Sorry, the above BZ should be BZ 1497175.

Comment 30 Tina Fitzgerald 2018-02-23 15:29:59 UTC
Hi Greg,

Yes, we would have to change the retirement code for the Ansible Tower job.

I'm going to assign it to Bill.

Thanks,
Tina

Comment 33 CFME Bot 2018-03-06 16:53:23 UTC
New commit detected on ManageIQ/manageiq-providers-ansible_tower/master:

https://github.com/ManageIQ/manageiq-providers-ansible_tower/commit/91048c5b4f7f2a6abc0e0509e06b3508215c52e3
commit 91048c5b4f7f2a6abc0e0509e06b3508215c52e3
Author:     Bill Wei <bilwei>
AuthorDate: Tue Mar  6 10:49:46 2018 -0500
Commit:     Bill Wei <bilwei>
CommitDate: Tue Mar  6 10:49:46 2018 -0500

    Move #retire_now to shared code

    https://bugzilla.redhat.com/show_bug.cgi?id=1468635

 app/models/manageiq/providers/ansible_tower/shared/automation_manager/job.rb | 4 +
 spec/support/ansible_shared/automation_manager/job.rb | 7 +
 2 files changed, 11 insertions(+)

Comment 34 CFME Bot 2018-03-06 16:56:06 UTC
New commit detected on ManageIQ/manageiq/master:

https://github.com/ManageIQ/manageiq/commit/6f62e35a59e806fa5948ab28e7bc01f4605a8cc2
commit 6f62e35a59e806fa5948ab28e7bc01f4605a8cc2
Author:     Bill Wei <bilwei>
AuthorDate: Tue Mar  6 10:53:31 2018 -0500
Commit:     Bill Wei <bilwei>
CommitDate: Tue Mar  6 10:53:31 2018 -0500

    Delete #retire_now since it has been moved to shared code

    fixes https://bugzilla.redhat.com/show_bug.cgi?id=1468635

 app/models/manageiq/providers/embedded_ansible/automation_manager/job.rb | 5 -
 spec/models/manageiq/providers/embedded_ansible/automation_manager/job_spec.rb | 5 -
 2 files changed, 10 deletions(-)
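
The commit bodies are not reproduced in this bug, so the following is only a guess at the general shape of "move #retire_now to shared code": a module that both job classes include. Every name here other than `retire_now` is hypothetical, and the method body is an assumption (record the requester, mark the job retired, never call raw_delete_stack).

```ruby
# Hypothetical shared module; the real file's contents are not shown here.
module SharedAnsibleJob
  # Assumed behaviour: remember who asked, then finish retirement
  # without touching the provider.
  def retire_now(requester = nil)
    self.retirement_requester = requester
    finish_retirement
  end
end

# Minimal stand-in for the model the real job classes inherit from.
class FakeJob
  attr_accessor :retirement_requester, :retired

  def finish_retirement
    self.retired = true
  end
end

# Both flavours of job pick up the same retire_now from the module.
class TowerJob < FakeJob
  include SharedAnsibleJob
end

class EmbeddedJob < FakeJob
  include SharedAnsibleJob
end

job = TowerJob.new
job.retire_now("admin")
puts job.retired
```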

Comment 37 Dave Johnson 2018-06-26 06:12:30 UTC
No requestee for needinfo set, can you take a look and determine where this should go?

Comment 38 Dave Johnson 2018-07-03 06:22:42 UTC
No requestee for needinfo set, can you take a look and determine where this should go?

Comment 45 Tina Fitzgerald 2018-07-18 14:32:58 UTC
*** Bug 1565418 has been marked as a duplicate of this bug. ***

Comment 46 Nandini Chandra 2018-11-06 23:23:49 UTC
Verified in 5.10.0.22