Bug 1468635 - unable to retire a bundle containing ansible tower service in 5.8
Status: ON_QA
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Providers
Version: 5.8.0
Hardware: All OS: All
Priority: medium Severity: medium
Target Milestone: GA
Target Release: 5.10.0
Assigned To: Bill Wei
QA Contact: Dave Johnson
Whiteboard: provider:azure:storage
Keywords: Reopened, TestOnly, ZStream
Depends On:
Blocks: 1600670
Reported: 2017-07-07 11:18 EDT by Felix Dewaleyne
Modified: 2018-07-12 17:55 EDT

See Also:
Fixed In Version: 5.10.0.0
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1600670
Environment:
Last Closed: 2017-08-14 10:17:32 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: Azure


Attachments
service rake script (3.06 KB, text/plain)
2017-07-31 12:23 EDT, Tina Fitzgerald
Description Felix Dewaleyne 2017-07-07 11:18:36 EDT
Description of problem:
Unable to remove storage that was part of an Azure orchestration in 5.8, using the technique described in https://access.redhat.com/knowledge/solutions/2868121

Version-Release number of selected component (if applicable):
5.8

How reproducible:
all the time

Steps to Reproduce:
1. set up the service and retirement to use https://access.redhat.com/knowledge/solutions/2868121
2. provision
3. retire

Actual results:
The storage isn't removed, and a trace is left in the logs:

[----] E, [2017-07-05T09:41:52.984458 #25321:53396f4] ERROR -- : <AutomationEngine> <AEMethod remove_from_provider> The following error occurred during method evaluation:
[----] E, [2017-07-05T09:41:52.985598 #25321:53396f4] ERROR -- : <AutomationEngine> <AEMethod remove_from_provider>   NotImplementedError: raw_delete_stack must be implemented in a subclass
[----] E, [2017-07-05T09:41:52.991095 #25321:53396f4] ERROR -- : <AutomationEngine> <AEMethod remove_from_provider>   (druby://127.0.0.1:39489) /var/www/miq/vmdb/app/models/orchestration_stack.rb:121:in `raw_delete_stack'
(druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:266:in `public_send'
(druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:266:in `block in object_send'
(druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:285:in `ar_method'
(druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:295:in `ar_method'
(druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:264:in `object_send'
(druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:119:in `block (2 levels) in expose'
(druby://127.0.0.1:39489) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1624:in `perform_without_block'
(druby://127.0.0.1:39489) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1584:in `perform'
(druby://127.0.0.1:39489) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1657:in `block (2 levels) in main_loop'
(druby://127.0.0.1:39489) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1653:in `loop'
(druby://127.0.0.1:39489) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1653:in `block in main_loop'
[----] E, [2017-07-05T09:41:53.002412 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR: (druby://127.0.0.1:39489) /var/www/miq/vmdb/app/models/orchestration_stack.rb:121:in `raw_delete_stack': raw_delete_stack must be implemented in a subclass (NotImplementedError)
[----] E, [2017-07-05T09:41:53.003596 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:266:in `public_send'
[----] E, [2017-07-05T09:41:53.004662 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:266:in `block in object_send'
[----] E, [2017-07-05T09:41:53.005719 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:285:in `ar_method'
[----] E, [2017-07-05T09:41:53.010551 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:295:in `ar_method'
[----] E, [2017-07-05T09:41:53.011961 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:264:in `object_send'
[----] E, [2017-07-05T09:41:53.013182 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /var/www/miq/vmdb/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:119:in `block (2 levels) in expose'
[----] E, [2017-07-05T09:41:53.014298 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1624:in `perform_without_block'
[----] E, [2017-07-05T09:41:53.015283 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1584:in `perform'
[----] E, [2017-07-05T09:41:53.016358 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1657:in `block (2 levels) in main_loop'
[----] E, [2017-07-05T09:41:53.017366 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1653:in `loop'
[----] E, [2017-07-05T09:41:53.018339 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from (druby://127.0.0.1:39489) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1653:in `block in main_loop'
[----] E, [2017-07-05T09:41:53.019378 #25321:53396f4] ERROR -- : <AutomationEngine> Method STDERR:      from /Rotterdam_Retirement_21_06_2017/Cloud/Orchestration/Retirement/StateMachines/Methods/remove_from_provider:20:in `<main>'
[----] I, [2017-07-05T09:41:53.083484 #25321:e1314c]  INFO -- : <AutomationEngine> <AEMethod [/Rotterdam_Retirement_21_06_2017/Cloud/Orchestration/Retirement/StateMachines/Methods/remove_from_provider]> Ending


Expected results:
The storage is removed from Azure as expected.

Additional info:
customer data provided in private note
Comment 5 Greg McCullough 2017-07-19 09:40:43 EDT
Drew, I suggest we add a noop raw_delete_stack method for the Ansible Tower subclass of OrchestrationStack.
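A minimal sketch of the suggestion above, using stand-in classes rather than the real ManageIQ models. `OrchestrationStack` here only mimics the base-class behavior seen in the trace (raising `NotImplementedError` from `raw_delete_stack`). `AnsibleTowerJob` is a hypothetical name for the Ansible Tower subclass, and the no-op body is an assumption about what such an override could look like, not the fix that was eventually merged.

```ruby
# Stand-in for the base model; the message mirrors the one in the trace.
class OrchestrationStack
  def raw_delete_stack
    raise NotImplementedError, "raw_delete_stack must be implemented in a subclass"
  end
end

# Hypothetical subclass name. A Tower job has no provider-side stack to
# delete, so the override does nothing and returns nil.
class AnsibleTowerJob < OrchestrationStack
  def raw_delete_stack
    nil
  end
end

# NotImplementedError is a ScriptError, so it must be rescued by name;
# a bare `rescue` would not catch it.
begin
  OrchestrationStack.new.raw_delete_stack
rescue NotImplementedError => e
  puts "base class: #{e.message}"
end

puts "tower job: #{AnsibleTowerJob.new.raw_delete_stack.inspect}"
```

With the override in place, a retirement method that calls `raw_delete_stack` on every stack in a bundle would no longer abort on the Tower entries.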
Comment 6 Tina Fitzgerald 2017-07-31 12:20:41 EDT
Hi Felix,

We're still looking into the issue, but we haven't been able to reproduce it. 

I've attached a rake script for the customer to run on the Services in question. The script logs all of the Service information.

Could you ask the customer the following:
1. Copy the attached evm_service.rake file to vmdb/lib/tasks folder.
2. Run the inspect_service rake command for 2 of the Orchestration Services that exhibit the reported behavior.
3. Send us the rake task output.

Two ways to inspect a service, by service or request id:
bin/rake evm:service:inspect_service SERVICE_ID=13
bin/rake evm:service:inspect_service REQUEST_ID=208

Thanks,
Tina
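The attached script is not reproduced in this report. As a rough illustration only: rake copies `NAME=value` command-line arguments into `ENV`, so a task that accepts either identifier could select its lookup along these lines. The helper name `lookup_from_env` and the choice to prefer `SERVICE_ID` when both are set are assumptions, not details of the attachment.

```ruby
# Hypothetical helper: decide which record to inspect from environment-style
# arguments. Accepts any Hash-like object so it can be exercised without rake.
def lookup_from_env(env)
  if env["SERVICE_ID"]
    [:service, Integer(env["SERVICE_ID"])]
  elsif env["REQUEST_ID"]
    [:request, Integer(env["REQUEST_ID"])]
  else
    raise ArgumentError, "set SERVICE_ID or REQUEST_ID"
  end
end

p lookup_from_env("SERVICE_ID" => "13")
p lookup_from_env("REQUEST_ID" => "208")
```

Inside a real task the argument would simply be `ENV`.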
Comment 7 Tina Fitzgerald 2017-07-31 12:23 EDT
Created attachment 1307201
service rake script
Comment 12 Felix Dewaleyne 2017-08-29 04:39:17 EDT
The issue is still present, and the customer has resumed communicating with us and provided new logs.
Comment 17 Tina Fitzgerald 2017-09-20 13:16:25 EDT
Hi Felix,

Based on the case history here with the refresh issues, I'm not confident of the state of the provisioned Services as it relates to retirement. 

I'd like the customer to start by provisioning a Service, then immediately retire the Service, sending us the logs which include the full provisioning and retirement processes along with the Service ID/name and Request ID. 

Thanks,
Tina
Comment 18 Felix Dewaleyne 2017-09-26 06:13:38 EDT
(In reply to Tina Fitzgerald from comment #17)
> Hi Felix,
> 
> Based on the case history here with the refresh issues, I'm not confident of
> the state of the provisioned Services as it relates to retirement. 
> 
> I'd like the customer to start by provisioning a Service, then immediately
> retire the Service, sending us the logs which include the full provisioning
> and retirement processes along with the Service ID/name and Request ID. 
> 
> Thanks,
> Tina

The customer said they would provide the data today. I'll share it as soon as it is available.
Comment 19 drew uhlmann 2017-10-02 15:25:43 EDT
Hey Felix! Is there any update on this issue?
Comment 20 Felix Dewaleyne 2017-10-03 10:11:44 EDT
(In reply to drew uhlmann from comment #19)
> Hey Felix! Is there any update on this issue?

I have the data! Let me see how I can share it...
Comment 22 drew uhlmann 2017-10-03 12:01:59 EDT
Hey Felix! I'm so sorry about this, but the timestamps on the evm log do not make much sense to me. The event of interest starts at 20:19 on 9/27 but the evm log times only cover 3:11 to 5:23 and thus don't include the pertinent information about the event. Could you please send us a copy of the relevant logs?
Comment 23 Felix Dewaleyne 2017-10-10 04:54:58 EDT
(In reply to drew uhlmann from comment #22)
> Hey Felix! I'm so sorry about this, but the timestamps on the evm log do not
> make much sense to me. The event of interest starts at 20:19 on 9/27 but the
> evm log times only cover 3:11 to 5:23 and thus don't include the pertinent
> information about the event. Could you please send us a copy of the relevant
> logs?

The logs I shared on the private server are all I have. I can request new data; the complete set of logs from the customer is available in the parent folder.
Comment 24 Greg McCullough 2017-10-10 06:55:33 EDT
*** Bug 1497175 has been marked as a duplicate of this bug. ***
Comment 25 Felix Dewaleyne 2017-10-11 04:24:07 EDT
Please let me know if I absolutely need to request new data based on the inconsistency.
Comment 26 Bill Wei 2017-10-16 17:08:52 EDT
In the customer's setup they created a bundle which included three Ansible Tower services and one Azure Orchestration Stack service. When the bundle was retired, automate attempted to retire all services. Two problems have been reported from the retirement operation:

1. Error: method raw_delete_stack not implemented. It came from retiring the Ansible Tower services. Ansible Tower Job, although a subclass of OrchestrationStack, does not support retirement. We can fix the issue by skipping Ansible Tower Job retirement.

2. Disk not removed. The reason should be found in azure.log. I have posted the analysis in BZ 149715.
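The first problem can be sketched with stand-in classes. Nothing below is the real ManageIQ code: `Stack`, `AzureStack`, `TowerJob`, the predicate `provider_delete_supported?`, and the helper `retire_bundle` are all hypothetical names, chosen only to show a bundle of three Tower jobs and one Azure stack being retired while the Tower jobs are skipped.

```ruby
# Stand-in base class; raises like the base model in the trace.
class Stack
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # Hypothetical predicate: can this stack be deleted on the provider?
  def provider_delete_supported?
    true
  end

  def raw_delete_stack
    raise NotImplementedError, "raw_delete_stack must be implemented in a subclass"
  end
end

class AzureStack < Stack
  def raw_delete_stack
    "deleted #{name}"
  end
end

# Inherits the raising raw_delete_stack, so it opts out of provider deletion.
class TowerJob < Stack
  def provider_delete_supported?
    false
  end
end

# Retire every stack in the bundle, skipping those that opt out.
def retire_bundle(stacks)
  stacks.map do |stack|
    stack.provider_delete_supported? ? stack.raw_delete_stack : "skipped #{stack.name}"
  end
end

bundle = [TowerJob.new("tower-1"), TowerJob.new("tower-2"),
          TowerJob.new("tower-3"), AzureStack.new("azure-1")]
puts retire_bundle(bundle)
```

Without the predicate check, the first Tower job would raise and the Azure stack would never be reached.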
Comment 27 Bill Wei 2017-10-16 17:18:52 EDT
Sorry, the above BZ should be BZ 1497175.
Comment 30 Tina Fitzgerald 2018-02-23 10:29:59 EST
Hi Greg,

Yes, we would have to change the retirement code for the Ansible Tower job.

I'm going to assign it to Bill.

Thanks,
Tina
Comment 33 CFME Bot 2018-03-06 11:53:23 EST
New commit detected on ManageIQ/manageiq-providers-ansible_tower/master:

https://github.com/ManageIQ/manageiq-providers-ansible_tower/commit/91048c5b4f7f2a6abc0e0509e06b3508215c52e3
commit 91048c5b4f7f2a6abc0e0509e06b3508215c52e3
Author:     Bill Wei <bilwei@redhat.com>
AuthorDate: Tue Mar  6 10:49:46 2018 -0500
Commit:     Bill Wei <bilwei@redhat.com>
CommitDate: Tue Mar  6 10:49:46 2018 -0500

    Move #retire_now to shared code

    https://bugzilla.redhat.com/show_bug.cgi?id=1468635

 app/models/manageiq/providers/ansible_tower/shared/automation_manager/job.rb | 4 +
 spec/support/ansible_shared/automation_manager/job.rb | 7 +
 2 files changed, 11 insertions(+)
Comment 34 CFME Bot 2018-03-06 11:56:06 EST
New commit detected on ManageIQ/manageiq/master:

https://github.com/ManageIQ/manageiq/commit/6f62e35a59e806fa5948ab28e7bc01f4605a8cc2
commit 6f62e35a59e806fa5948ab28e7bc01f4605a8cc2
Author:     Bill Wei <bilwei@redhat.com>
AuthorDate: Tue Mar  6 10:53:31 2018 -0500
Commit:     Bill Wei <bilwei@redhat.com>
CommitDate: Tue Mar  6 10:53:31 2018 -0500

    Delete #retire_now since it has been moved to shared code

    fixes https://bugzilla.redhat.com/show_bug.cgi?id=1468635

 app/models/manageiq/providers/embedded_ansible/automation_manager/job.rb | 5 -
 spec/models/manageiq/providers/embedded_ansible/automation_manager/job_spec.rb | 5 -
 2 files changed, 10 deletions(-)
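The two commits above move `#retire_now` from the embedded Ansible job into code shared with the Ansible Tower job. The diffs are not included in this report, so the following is only a sketch of that pattern with stand-in names: the module `SharedJob`, the classes `TowerJob` and `EmbeddedAnsibleJob`, and the body of `retire_now` (record the requester and mark the job retired without contacting the provider) are assumptions.

```ruby
# Hypothetical shared module standing in for the shared job code.
module SharedJob
  attr_reader :retirement_requester

  # Assumed behavior: retire locally without any provider-side delete call.
  def retire_now(requester = nil)
    @retirement_requester = requester
    @retired = true
  end

  def retired?
    @retired == true
  end
end

# Both job types pick up the same retire_now by including the module,
# so neither needs its own copy.
class TowerJob
  include SharedJob
end

class EmbeddedAnsibleJob
  include SharedJob
end

[TowerJob.new, EmbeddedAnsibleJob.new].each do |job|
  job.retire_now("admin")
  puts "#{job.class}: retired=#{job.retired?} requester=#{job.retirement_requester}"
end
```

Defining the method once in shared code is what lets the second commit delete the per-class copy.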
Comment 37 Dave Johnson 2018-06-26 02:12:30 EDT
No requestee for needinfo set, can you take a look and determine where this should go?
Comment 38 Dave Johnson 2018-07-03 02:22:42 EDT
No requestee for needinfo set, can you take a look and determine where this should go?
