Bug 1183757
Summary: | RHOS: Unable to start a suspended instance after relationships & power states refresh |
---|---|
Product: | Red Hat CloudForms Management Engine |
Reporter: | Jan Krocil <jkrocil> |
Component: | Insight |
Assignee: | Greg Blomquist <gblomqui> |
Status: | CLOSED ERRATA |
QA Contact: | Jan Krocil <jkrocil> |
Severity: | high |
Priority: | medium |
Version: | 5.3.0 |
CC: | dclarizi, gblomqui, mfeifer, nachandr |
Target Milestone: | GA |
Target Release: | 5.4.0 |
Hardware: | Unspecified |
OS: | Unspecified |
Fixed In Version: | 5.4.0.0.11 |
Doc Type: | Bug Fix |
Story Points: | --- |
Clone Of: | |
: | 1187770 |
Last Closed: | 2015-06-16 12:47:58 UTC |
Type: | Bug |
Regression: | --- |
Mount Type: | --- |
Documentation: | --- |
Category: | --- |
oVirt Team: | --- |
Cloudforms Team: | --- |
Bug Blocks: | 1187770 |

Description
Jan Krocil
2015-01-19 17:20:10 UTC
Here's a related, closed bug: https://bugzilla.redhat.com/show_bug.cgi?id=1135606

Just a note: If you do a suspend, let the instance's power state be manually set by CFME and then go for "Power > Start" without refreshing the relationships and power states (and letting the power state change to "off") the instance will start back up.

There are a few problems going on here:

1) relying on "status" from openstack is problematic because "status" can refer to transitional states as well; such as: "stopping".

2) there seems to be a discrepancy between the openstack "status" and the openstack "power_state" in this case. The "status" shows "suspended", but the "power_state" shows "shutdown". I'm guessing this is because of the underlying driver (libvirt) not correctly handling the "suspended" power status. And, this is a problem because...

3) Nova is not currently handling "status"/"power_state" discrepancies when the "status" is "suspended": https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L5912-L5918

Taking psav's original suggestion of just handling "shutdown" as if it were "suspended" is apparently the right approach. This is risky, because I'm not altogether sure what "shutdown" is really supposed to mean in openstack. There's already a "stopped" power state. So, this may come back to bite us later.

d'oh, gave credit to psav, when it was jan all along! my bad

New commit detected on manageiq/master:
https://github.com/ManageIQ/manageiq/commit/67fea97e94384c622e0642c23b3b662805e89ad9

commit 67fea97e94384c622e0642c23b3b662805e89ad9
Author: Greg Blomquist <gblomqui>
AuthorDate: Mon Feb 16 16:33:23 2015 -0500
Commit: Greg Blomquist <gblomqui>
CommitDate: Mon Feb 16 17:31:46 2015 -0500

Resume "shutdown" instances in OpenStack

There's a number of problems leading to the cause of this bug. This fix attempts to handle one of those problems. But, it's possible that this problem will creep up again, and we may have to deal with this same problem again later.

The issue is because of three things happening at the same time:

1) OpenStack does not correctly set the power_state for suspended instances. The vm_state is set to "suspended" while the power_state ends up as "shutdown". It's possible that this is specific to libvirt instances. I'm not altogether certain yet.

2) OpenStack does not handle discrepancies between vm_status and power_state when the vm_status is "suspended": https://github.com/openstack/nova/blob/47fc1a6e5674fadecca253629d36430ceb5c8471/nova/compute/manager.py#L5912-L5918

3) ManageIQ did not previously handle starting a "shutdown" openstack instance.

With this patch, ManageIQ handles "shutdown" openstack instances as if they are "suspended" to match the way OpenStack handled suspended libvirt instances.

https://bugzilla.redhat.com/show_bug.cgi?id=1183757

vmdb/app/models/vm_openstack/operations/power.rb | 1 +
.../ems_refresh/refreshers/openstack_refresher_rhos_havana_spec.rb | 1 +
2 files changed, 2 insertions(+)

New commit detected on cfme/5.3.z:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=72dff993d5f8cbe9cf08aaf68218945a58e4689d

commit 72dff993d5f8cbe9cf08aaf68218945a58e4689d
Author: Greg Blomquist <gblomqui>
AuthorDate: Mon Feb 16 16:33:23 2015 -0500
Commit: Greg Blomquist <gblomqui>
CommitDate: Tue Feb 17 12:30:02 2015 -0500

Resume "shutdown" instances in OpenStack

There's a number of problems leading to the cause of this bug. This fix attempts to handle one of those problems. But, it's possible that this problem will creep up again, and we may have to deal with this same problem again later.

The issue is because of three things happening at the same time:

1) OpenStack does not correctly set the power_state for suspended instances. The vm_state is set to "suspended" while the power_state ends up as "shutdown". It's possible that this is specific to libvirt instances. I'm not altogether certain yet.

2) OpenStack does not handle discrepancies between vm_status and power_state when the vm_status is "suspended": https://github.com/openstack/nova/blob/47fc1a6e5674fadecca253629d36430ceb5c8471/nova/compute/manager.py#L5912-L5918

3) ManageIQ did not previously handle starting a "shutdown" openstack instance.

With this patch, ManageIQ handles "shutdown" openstack instances as if they are "suspended" to match the way OpenStack handled suspended libvirt instances.

https://bugzilla.redhat.com/show_bug.cgi?id=1183757

vmdb/app/models/vm_openstack/operations/power.rb | 1 +
.../ems_refresh/refreshers/openstack_refresher_rhos_havana_spec.rb | 1 +
2 files changed, 2 insertions(+)

Verified fixed in 5.4.0.0.22 - 5.4.0.0.22.20150420163946_26004d1.

running > suspend > suspended (no rel. refresh) > start > on (OK)
running > suspend > suspended > rel. refresh > off > start > on (OK)

Note: There is also this (more general) bug, related to openstack power control: https://bugzilla.redhat.com/show_bug.cgi?id=1115557

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1100.html
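
For readers following the fix, the behaviour described in the commit message (treating a "shutdown" instance as if it were "suspended" when a start is requested) can be sketched roughly as below. This is an illustration in Python only, not the actual patch, which per the diffstat is a one-line addition to the Ruby file vmdb/app/models/vm_openstack/operations/power.rb. The names `RESUMABLE_STATES` and `start_operation` are hypothetical, and the fallback to a plain "start" for every other state is an assumption, not something this report confirms.

```python
# Hypothetical sketch: none of these names are ManageIQ or Nova APIs.

# Power states for which "Power > Start" should issue a resume.
# "shutdown" is included because, per this bug, OpenStack reports it as the
# power_state of an instance whose vm_state is "suspended".
RESUMABLE_STATES = {"suspended", "shutdown"}


def start_operation(raw_power_state: str) -> str:
    """Return the operation a "Power > Start" request should send."""
    if raw_power_state.lower() in RESUMABLE_STATES:
        return "resume"
    # Assumption: any other state is handled with a plain start.
    return "start"


# The two paths from the verification comment:
print(start_operation("suspended"))  # state seen before a refresh
print(start_operation("shutdown"))   # state seen after a refresh
```

Both verified paths end up issuing a resume under this sketch. As the report itself cautions, mapping "shutdown" to resume may need revisiting if "shutdown" turns out to mean something else in OpenStack.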