Bug 1788053

Summary:	Azure Managed Disk not removed with retirement
Product:	Red Hat CloudForms Management Engine	Reporter:	Mihir Lele <mlele>
Component:	Providers	Assignee:	Daniel Berger <dberger>
Status:	CLOSED ERRATA	QA Contact:	Jaroslav Henner <jhenner>
Severity:	low	Docs Contact:	Red Hat CloudForms Documentation <cloudforms-docs>
Priority:	low
Version:	5.10.10	CC:	akarol, dberger, dmetzger, gmccullo, jfrey, jhardy, jocarter, lufu, mkanoor, nansari, obarenbo, simaishi, tfitzger, tonay
Target Milestone:	GA	Keywords:	ZStream
Target Release:	5.11.5	Flags:	pm-rhel: cfme-5.11.z+
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:	5.11.5.0	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2020-05-05 13:43:09 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	Bug
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	CFME Core	Target Upstream Version:
Embargoed:

Description Mihir Lele 2020-01-06 09:13:43 UTC

Description of problem:
The managed disk is not removed when an Azure VM is retired.

Version-Release number of selected component (if applicable):
5.10

How reproducible:
all the time

Steps to Reproduce:
1. Use the Retirement automate to retire an Azure VM.

Actual results:
Storage is left behind

Expected results:
storage should be retired

Additional info:

Comment 2 Lucy Fu 2020-01-06 21:31:59 UTC

To clarify the issue: is the retirement for an orchestration service, an Azure instance, or an Azure stack?
Thanks.

Comment 3 Mihir Lele 2020-01-07 04:58:20 UTC

Hello,

The object that is being retired is an individual VM.

Thanks!

Comment 4 Daniel Berger 2020-01-07 19:06:42 UTC

If memory serves, we didn't delete storage by default because we planned on adding a storage manager that would let you manage the storage/disks separately, but that never materialized. So, we didn't delete storage (or network security groups) by default.

As per discussion with Lucy, I -think- it might be a matter of updating this line:

  https://github.com/ManageIQ/manageiq-providers-azure/blob/master/app/models/manageiq/providers/azure/cloud_manager/vm/operations.rb#L9

To this:

  provider_service.delete_associated_resources(name, resource_group.name, :storage_account => true)

Comment 5 Daniel Berger 2020-01-08 15:59:03 UTC

Actually, after thinking about this overnight, I think I remember why we don't delete the disk. The issue is that storage can be shared by multiple VMs. So, like Azure itself, we don't delete the associated storage when a VM is deleted.

Now, I'm pretty sure a deletion attempt would fail if there is anything else attached to it, so we could try to delete it and, if it fails, just log and/or raise an error. But, this is something that needs to be discussed.
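
The try-then-log approach described above could look roughly like the following sketch. It is an illustration only: `FakeDiskService`, `DiskInUseError`, and `try_delete_disk` are hypothetical names invented for this example, and none of them belong to azure-armrest or ManageIQ. It assumes the backing service rejects deletion of a disk that is still attached to something else.

```ruby
require 'logger'

# Hypothetical error raised when a disk is still attached to another VM.
class DiskInUseError < StandardError; end

# Hypothetical in-memory stand-in for a disk service; NOT the azure-armrest API.
class FakeDiskService
  # attachments: Hash of disk name => Array of VM names still using the disk
  def initialize(attachments)
    @attachments = attachments
  end

  # Deletes the disk, or raises DiskInUseError if any VM still holds it.
  def delete(disk_name)
    holders = @attachments.fetch(disk_name, [])
    raise DiskInUseError, "#{disk_name} is attached to #{holders.join(', ')}" unless holders.empty?

    @attachments.delete(disk_name)
    true
  end
end

# Attempts the deletion; on failure, logs a warning instead of aborting.
# Returns true if the disk was deleted, false if it was skipped.
def try_delete_disk(service, disk_name, logger)
  service.delete(disk_name)
  logger.info("Deleted disk #{disk_name}")
  true
rescue DiskInUseError => e
  logger.warn("Skipped disk #{disk_name}: #{e.message}")
  false
end
```

Whether a skipped disk should only be logged or should also raise is the open question noted above; this sketch only logs.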

Comment 6 Lucy Fu 2020-01-13 16:42:00 UTC

Tested azure VM provision and service provision with 5.10.10.0.
  VM was provisioned in both cases.

Tested VM/instance retirement and service retirement. 
  Disk was removed from azure portal in both cases.
  VM was archived for both cases.
  Service was retired for 2nd case.

So the disk was removed when the VM was retired. I can't reproduce the issue locally.

Comment 7 Lucy Fu 2020-01-13 16:47:27 UTC

1. How often does the customer run into the issue? 
2. Is there any customized code in automate? Does it work if all customized automate code is disabled?

If it still fails with all customized automate code disabled, please send us the following information:
1. screenshots of the VM state before and after the retirement
2. a screenshot of the failed request
3. a complete set of logs showing how the request was created and showing the failure

Comment 13 Lucy Fu 2020-01-15 21:05:27 UTC

From the log files, the customer performed service provisioning and service retirement, not retirement of an individual VM as stated in comment #3.

Local tests of service retirement with 5.10.10 work well.

[----] I, [2020-01-14T14:09:53.415149 #3872:fe0f60]  INFO -- : Q-task_id([r1000000000404_service_retire_task_1000000000410]) Invoking [inline] method [/IAAS/Service/Retirement/StateMachines/Methods/retire_service] with inputs [{}]

The customer used customized code in the /IAAS domain, which might have caused the issue.

Please test with all customized automate code disabled.

If it fails, please send us the following information:
1. screenshots of the service state before and after the retirement
2. a screenshot of the failed service request
3. a complete set of logs showing how the request was created and showing the failure

Thanks.

Comment 18 Lucy Fu 2020-01-21 19:18:25 UTC

This sounds like an Azure disk add/remove issue rather than an automate retirement issue.
Forwarding the BZ to the provider team for debugging.
Please send the BZ back if I'm wrong here.
Thanks.

Comment 19 Daniel Berger 2020-01-21 22:23:16 UTC

Alright, my memory was playing tricks on me. We DO delete the OS disk by default for managed storage, but currently the azure-armrest gem doesn't delete attached data disks.

I will have to update the azure-armrest gem, then integrate that change into the core code.
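
As a rough illustration of the gap described above (the OS disk is removed, attached data disks are left behind), the sketch below collects the disks to remove for a VM. `ManagedVm` and `disks_to_delete` are hypothetical names made up for this example; this is not the actual azure-armrest code.

```ruby
# Hypothetical plain-data model of a VM's managed disks; NOT an azure-armrest type.
ManagedVm = Struct.new(:name, :os_disk, :data_disks)

# Returns the names of the managed disks to remove along with the VM.
# include_data_disks: false models the OS-disk-only behaviour described above.
def disks_to_delete(vm, include_data_disks: true)
  disks = [vm.os_disk]
  disks.concat(vm.data_disks) if include_data_disks
  disks.compact
end

vm = ManagedVm.new('vm1', 'vm1-os', %w[vm1-data1 vm1-data2])
puts disks_to_delete(vm, include_data_disks: false).inspect # OS disk only
puts disks_to_delete(vm).inspect                            # OS disk plus data disks
```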

Comment 22 Daniel Berger 2020-02-17 13:44:38 UTC

https://github.com/ManageIQ/manageiq-providers-azure/pull/377

Comment 25 CFME Bot 2020-03-25 17:35:37 UTC

New commit detected on ManageIQ/manageiq-providers-azure/ivanchuk:

https://github.com/ManageIQ/manageiq-providers-azure/commit/cfeeb130051bb6aa5875bec40cc98822f0a5dcd1
commit cfeeb130051bb6aa5875bec40cc98822f0a5dcd1
Author:     Adam Grare <agrare>
AuthorDate: Fri Jan 24 18:36:19 2020 +0000
Commit:     Adam Grare <agrare>
CommitDate: Fri Jan 24 18:36:19 2020 +0000

    Merge pull request #377 from djberg96/attached_disks

    Delete attached data disks on vm destroy

    (cherry picked from commit d67792c0c4d2b333c0e78e91c8099a53af6e0686)

    https://bugzilla.redhat.com/show_bug.cgi?id=1788053

 app/models/manageiq/providers/azure/cloud_manager/vm/operations.rb | 2 +-
 manageiq-providers-azure.gemspec | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

Comment 26 Jaroslav Henner 2020-04-20 21:23:34 UTC

I created two VMs and attached a volume (managed disk) to each.

I retired one VM on each of these CFME appliance versions:

5.11.5.1.20200415152414_39b433a
5.11.4.2.20200309205646_632ff59

The disk of the VM retired with the newer version was removed.
The disk of the VM retired with the older version was not removed.

I am not sure this is a good idea, but it works as the customer expects.

Should this happen for volumes on OpenStack and other Cloud providers as well?

Comment 27 Daniel Berger 2020-04-21 12:42:23 UTC

Jaroslav, I don't know. This would have to be discussed either in the individual provider channels, or the core channel. If so, separate issues should be created for each provider.

Comment 30 errata-xmlrpc 2020-05-05 13:43:09 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2020