Bug 1325167 - Retired instance can be resumed from provider side and it is not powered off.
Summary: Retired instance can be resumed from provider side and it is not powered off.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Automate
Version: 5.4.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: GA
Target Release: 5.7.0
Assignee: Lucy Fu
QA Contact: Kyrylo Zvyagintsev
URL:
Whiteboard: vm:retirement
Depends On:
Blocks: 1327603 1327606 1346909
Reported: 2016-04-08 12:24 UTC by Nikhil Gupta
Modified: 2019-10-10 11:49 UTC (History)

Fixed In Version: 5.7.0.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1327603 1327606 1346909
Environment:
Last Closed: 2017-01-11 20:15:30 UTC
Category: Bug
Cloudforms Team: ---
Target Upstream Version:
Embargoed:



Description Nikhil Gupta 2016-04-08 12:24:19 UTC
Description of problem:
A retired instance can be resumed from the provider side, and it is not powered off.

When an instance running on an OpenStack provider is retired from CloudForms, it goes into the 'Suspended' state in the OpenStack UI and the 'Retired' state in the CFME UI.

Now, go back to the provider (OpenStack in this case) and resume the instance; the instance returns to the active state. However, that state is not reflected in the CloudForms UI, even after "refreshing the relationship and power states" several times. The state for that instance remains 'Retired'.


Version-Release number of selected component (if applicable):
cfme-5.4.4.2-1.el6cf.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Retire an instance (provisioned from the OpenStack UI) from the CloudForms UI; the instance state changes to 'Suspended'.
2. From the OpenStack UI, resume this instance.

Actual results:
The instance returns to the active (Running) state, but in the CFME UI it remains 'Retired'.

Expected results:
Either CFME should not allow the instance to start/power on, or the current power state should be updated in the CFME UI.

Comment 2 Tina Fitzgerald 2016-04-08 18:02:57 UTC
Hi Nikhil,

The default retirement state machine powers off the VM (if necessary), removes the VM from the provider, and deletes the VM from the VMDB.

I suspect the reason retirement is not removing the VM is that we had code that prevented VM removal unless the VM was provisioned by us or was tagged with lifecycle/full.

1. Was the vm provisioned there?
2. Was the retirement code changed not to remove the vm? 

There are 3 retirement settings shown in the VM summary: retired (which would be true/blank), retires_on (the date), and retirement_state (retiring, retired, error, or blank).

3. What is the value of those settings for the vm?
4. What is the power state of the vm afterwards?

Thanks,
Tina

Comment 3 Nikhil Gupta 2016-04-12 06:00:03 UTC
Hi Tina,

I know that CFME only deletes the instance completely from the provider if it was provisioned by CFME or is tagged as retire_full:
~~~
# Get vm from root object
vm = $evm.root['vm']
category = "lifecycle"
tag = "retire_full"
~~~
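
A minimal standalone sketch of the removal check described above, for illustration only. It is not the shipped Automate method: `FakeVm`, `provisioned_by_cfme`, and `remove_from_provider?` are hypothetical names, and only the `lifecycle` category and `retire_full` tag are taken from the snippet above.

```ruby
# Hypothetical stand-in for the VM object; NOT the real $evm service model.
FakeVm = Struct.new(:name, :provisioned_by_cfme, :tags) do
  # True when the VM carries the given category/tag pair.
  def tagged_with?(category, tag)
    tags.include?("#{category}/#{tag}")
  end
end

CATEGORY = "lifecycle"
TAG      = "retire_full"

# Returns true when retirement would delete the instance from the provider,
# i.e. when it was provisioned by CFME or carries the retire_full tag.
def remove_from_provider?(vm)
  vm.provisioned_by_cfme || vm.tagged_with?(CATEGORY, TAG)
end

created_in_openstack = FakeVm.new("osp-instance", false, [])
tagged_instance      = FakeVm.new("tagged-instance", false, ["lifecycle/retire_full"])

puts remove_from_provider?(created_in_openstack)
puts remove_from_provider?(tagged_instance)
```

Under these assumptions, an instance created directly in the OpenStack console and left untagged fails the check, which matches the behavior reported here: the instance is left on the provider rather than removed.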

Note that in this case, the instance was created using the OpenStack console and it is not tagged with the retire_full option.

We tried to retire it using the CloudForms console. As expected, the instance went into a "Suspended" state.

The retirement code has not been changed; it is the default.

After resuming this instance, the power state of the VM in CloudForms is "Retired" and in OpenStack is "active".

As I understand it, once the instance state changes to "Retired" in CloudForms, an attempt to resume it from the provider side should either fail or cause the CloudForms status to be updated to "ON".

Regards,
Nikhil

Comment 4 Tina Fitzgerald 2016-04-12 19:06:06 UTC
Hi Nikhil,

Thanks for the information.

"After resuming this instance, the power state of the VM in CloudForms is "Retired" and in OpenStack is "active"."

CloudForms should show the VM in a "Retired" state after retirement. What is the actual CloudForms power state after retirement? Is it still "Suspended"?

After resuming the instance, what is the CloudForms power state?

We have a built-in policy that should deny any power-on events and power off any retired VMs.

I'm going to run some tests here. If you have a test environment set up, could you send me the details so I can check it out?

Thanks,
Tina

Comment 8 Tina Fitzgerald 2016-04-18 19:12:58 UTC
Hi Nikhil,

Which instance can I look at that exhibits the "retired" and "on" states?

Thanks,
Tina

Comment 9 Tina Fitzgerald 2016-04-18 19:15:57 UTC
Hi Nikhil,

If there isn't an instance in that state, is there an instance I can use for testing?

Thanks,
Tina

Comment 12 Tina Fitzgerald 2016-05-02 22:03:47 UTC
Hi Nikhil,

Could you check out the environment?

I'm getting the fog error below:

Thanks,
Tina

[----] I, [2016-05-02T17:30:28.606783 #20655:130b988]  INFO -- : MIQ(MiqQueue#deliver) Message id: [99000002328516], Delivering...
[----] I, [2016-05-02T17:30:28.618542 #20655:130b988]  INFO -- : MIQ(ManageIQ::Providers::Openstack::CloudManager::Refresher#refresh) Refreshing all targets...
[----] I, [2016-05-02T17:30:28.618755 #20655:130b988]  INFO -- : MIQ(ManageIQ::Providers::Openstack::CloudManager::Refresher#refresh) EMS: [OSP7], id: [99000000000018] Refreshing targets for EMS: [OSP7], id: [99000000000018]...
[----] I, [2016-05-02T17:30:28.618869 #20655:130b988]  INFO -- : MIQ(ManageIQ::Providers::Openstack::CloudManager::Refresher#refresh) EMS: [OSP7], id: [99000000000018]   ManageIQ::Providers::Openstack::CloudManager [OSP7] id [99000000000018]
[----] E, [2016-05-02T17:30:29.543178 #20655:130b988] ERROR -- : <Fog> excon.error     #<Excon::Errors::SocketError: SSL_connect returned=1 errno=0 state=SSLv2/v3 read server hello A: unknown protocol (OpenSSL::SSL::SSLError)>

[----] W, [2016-05-02T17:30:30.313597 #20652:1151994]  WARN -- : MIQ(ManageIQ::Providers::Redhat::InfraManager::RefreshParser.host_inv_to_ip) IP lookup for host in VIM inventory data...Failed. Falling back to reverse lookup.
[----] I, [2016-05-02T17:30:30.991523 #63427:e15988]  INFO -- : MIQ(MiqScheduleWorker::Runner#do_work) Number of scheduled items to be processed: 2.

Comment 13 Tina Fitzgerald 2016-05-12 16:00:05 UTC
Hi Nikhil,

After further investigation, it appears there are multiple reasons the retired VM is not getting powered off.

1. I didn't see events coming from the OpenStack provider. There is an OpenStack configuration change that will resolve this issue.

2. Since the VM is suspended instead of powered off, our built-in policies would not stop this VM.
We need to map the incoming events:
compute.instance.resume.end --> add built-in policy to suspend the VM.
compute.instance.unpause.end --> add built-in policy to pause the VM.
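
The mapping in item 2 could be sketched as a simple lookup. This is an illustration only, not the change that was eventually committed: `RETIRED_VM_ACTIONS` and `action_for` are hypothetical names, and only the two event names come from the list above.

```ruby
# Hypothetical mapping from the two OpenStack events named above to the
# action that would return a retired instance to its previous state.
RETIRED_VM_ACTIONS = {
  "compute.instance.resume.end"  => :suspend,
  "compute.instance.unpause.end" => :pause
}.freeze

# Returns the corrective action for a provider event, or nil when the
# instance is not retired or the event is not one of the mapped ones.
def action_for(event_type, retired)
  return nil unless retired
  RETIRED_VM_ACTIONS[event_type]
end

p action_for("compute.instance.resume.end", true)
p action_for("compute.instance.unpause.end", true)
p action_for("compute.instance.resume.end", false)
```

Keeping the mapping in one table means a non-retired instance, or an event that is not listed, falls through with no action.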
 
Assigning the ticket to Lucy for event and policy changes.

Regards,
Tina

Comment 14 Nikhil Gupta 2016-05-19 05:30:16 UTC
Thanks Tina!

Comment 16 CFME Bot 2016-06-14 13:26:02 UTC
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/4a847416e666cb53dd5a1f828f39aa032f5bfd59

commit 4a847416e666cb53dd5a1f828f39aa032f5bfd59
Author:     Lucy Fu <lufu>
AuthorDate: Fri May 20 17:07:59 2016 -0400
Commit:     Lucy Fu <lufu>
CommitDate: Mon Jun 13 16:14:07 2016 -0400

    Add built in policy for event vm_resume.
    
    Retired instances may be left in suspended state on Openstack server.
    The build in policy will keep the retired instance in suspended state if it is powered on from the Openstack server.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1325167

 product/policy/built_in_policies.yml | 14 ++++++++++++++
 spec/models/miq_policy_spec.rb       | 30 ++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)

Comment 18 CFME Bot 2016-06-15 12:50:00 UTC
New commit detected on cfme/5.5.z:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=a710139802b28b5387b00d088ce63022b938ed9a

commit a710139802b28b5387b00d088ce63022b938ed9a
Author:     Lucy Fu <lufu>
AuthorDate: Fri May 20 17:07:59 2016 -0400
Commit:     Lucy Fu <lufu>
CommitDate: Tue Jun 14 16:46:42 2016 -0400

    Add built in policy for event vm_resume.
    
    Retired instances may be left in suspended state on Openstack server.
    The build in policy will keep the retired instance in suspended state if it is powered on from the Openstack server.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1325167

 app/models/miq_policy.rb       | 10 ++++++++++
 spec/models/miq_policy_spec.rb | 30 ++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+)

