Bug 1238592 - [RFE] Nova to get valid server/instance state
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 6.0 (Juno)
Hardware: x86_64
OS: Linux
Priority: high
Severity: low
Target Milestone: ga
Target Release: 9.0 (Mitaka)
Assignee: Eoghan Glynn
QA Contact: Prasanth Anbalagan
URL: https://blueprints.launchpad.net/nova...
Whiteboard: upstream_milestone_mitaka-2 upstream_...
Duplicates: 1294771
Depends On:
Blocks: 1335634 1339506 1371975 1371976
Reported: 2015-07-02 08:46 UTC by Daniel Messer
Modified: 2020-04-15 14:14 UTC (History)

Fixed In Version: openstack-nova-13.0.0-1.el7ost
Doc Type: Enhancement
Doc Text:
Previously, the "nova list" command displayed instances as running when a compute node failed. Now, the instance state is updated when the hosting compute node is down. As a result, users can trust the "nova list" output for uptime monitoring.
Clone Of:
Clones: 1371975 1371976
Environment:
Last Closed: 2016-08-24 12:51:38 UTC
Target Upstream Version:
Embargoed:



Links
System: Red Hat Product Errata
ID: RHEA-2016:1761
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat OpenStack Platform 9 General Availability Advisory
Last Updated: 2016-08-24 16:49:52 UTC

Internal Links: 1496145

Description Daniel Messer 2015-07-02 08:46:23 UTC
Description of problem: In an OSP 6 HA deployment, two hypervisors are deployed, each running multiple instances. When one of the hypervisors is forcibly shut down, its instances still show up as running in nova list.


Version-Release number of selected component (if applicable): 6.0 A3


How reproducible:
1. Deploy RHEL OSP6 with Staypuft on 3 controllers, 2 hypervisors
2. Run multiple instances


Steps to Reproduce:
1. Power off a hypervisor
2. nova service-list reports the hypervisor as down
3. nova list

Actual results:
nova list shows all instances as running, including those on the hypervisor that was shut down


Expected results:
nova list shows instances on the shut-down hypervisor as OFF
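
To make the mismatch concrete, here is a minimal sketch in Python that cross-references the two views. The dictionaries are hand-written stand-ins for the output of nova service-list and nova list, and the helper name `instances_on_down_hosts` and the host and instance names are hypothetical; none of this is something Nova itself provides.

```python
# Hypothetical sample data; real values would come from
# `nova service-list` and `nova list`.
services = [
    {"binary": "nova-compute", "host": "hypervisor1", "state": "up"},
    {"binary": "nova-compute", "host": "hypervisor2", "state": "down"},
]
servers = [
    {"name": "vm0", "host": "hypervisor1", "status": "ACTIVE"},
    {"name": "vm1", "host": "hypervisor2", "status": "ACTIVE"},
]


def instances_on_down_hosts(services, servers):
    """Return names of instances reported ACTIVE whose compute service is down."""
    down_hosts = {
        s["host"]
        for s in services
        if s["binary"] == "nova-compute" and s["state"] == "down"
    }
    return [
        srv["name"]
        for srv in servers
        if srv["status"] == "ACTIVE" and srv["host"] in down_hosts
    ]


print(instances_on_down_hosts(services, servers))
```

With the sample data, only the instance on the powered-off hypervisor is flagged, which is the case where the nova list output cannot be trusted on its own.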


Additional info:

Comment 4 Eoghan Glynn 2015-07-10 13:41:40 UTC
Note that nova's tracking of the state of the compute service, and the power state of the VMs, are quite separate by design.

It sounds like you're envisaging a "host power state" or some-such additional attribute to be included in the state of the instance, which is not the case currently.

Also note that there's a new mechanism[1] being implemented for Liberty (RHEL-OSP 8) to expose an API to allow marking a host as down, in advance of it being detected as down by the servicegroups. Again, this will not cause the power state of the VMs on that host to be changed.

As far as I'm aware, there are no plans upstream for Liberty to make substantive changes to the way VM state is represented in these circumstances. Given that the nova-specs deadline for Liberty has already passed, the earliest such changes could realistically be made upstream would be the M* cycle.

Finally, note that Staypuft will not be supported in any newer version of RHEL-OSP after version 6. (It has been superseded by a new deployment tool called RHEL-OSP director.)


[1] https://blueprints.launchpad.net/nova/+spec/mark-host-down
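
As a rough illustration of the mark-host-down mechanism in [1], the Python sketch below builds (but does not send) the kind of request an external monitor might issue. It assumes the request shape as it landed upstream, as far as I recall it: a PUT to os-services/force-down with a forced_down flag, requested with API microversion 2.11. Treat the path, header value, and body fields as assumptions to verify against the Compute API reference. The function name `build_force_down_request`, the token, and the host name are placeholders.

```python
import json


def build_force_down_request(base_url, token, host, forced_down=True):
    """Describe, without sending, a request that marks a compute service down."""
    return {
        "method": "PUT",
        # Assumption: endpoint path taken from the upstream force-down API.
        "url": base_url.rstrip("/") + "/os-services/force-down",
        "headers": {
            "X-Auth-Token": token,
            "Content-Type": "application/json",
            # Assumption: force-down needs microversion 2.11 or later.
            "X-OpenStack-Nova-API-Version": "2.11",
        },
        "body": json.dumps(
            {"host": host, "binary": "nova-compute", "forced_down": forced_down}
        ),
    }


req = build_force_down_request("http://X.X.X.X:8774/v2.1/", "TOKEN", "serverB")
print(req["method"], req["url"])
print(req["body"])
```

Note that, as stated above, such a call only affects how the compute service is reported; it does not change the power state recorded for the VMs on that host.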

Comment 5 Dan Smith 2015-07-17 14:37:57 UTC
Here is the upstream spec to provide this behavior, but we likely won't see it implemented until at least M:

https://review.openstack.org/#/c/192246/

Comment 6 Daniel Messer 2015-07-22 09:22:53 UTC
(In reply to Eoghan Glynn from comment #4)
> Note that nova's tracking of the state of the compute service, and the power
> state of the VMs, are quite separate by design.

Agreed, this became evident during testing because the host power state was correctly determined.

> It sounds like you're envisaging a "host power state" or some-such
> additional attribute to be included in the state of the instance, which is
> not the case currently.

Yes, and this is why I should probably change this BZ to an RFE instead of a bug. Right now, the behaviour is as expected.

However, the user story plays a role here as well, and I think it's safe to assume that a user running "nova list" to check instance state should be able to trust the output in every case. With the current implementation, this is not possible.

Comment 7 Nikola Dipanov 2015-07-31 14:21:14 UTC
It seems that the spec linked in comment #5 is how upstream plans to solve this particular wart.

Moving this to an RFE and targeting a future release (likely RHOS 9).

Comment 8 Stephen Gordon 2016-02-02 19:33:55 UTC
*** Bug 1294771 has been marked as a duplicate of this bug. ***

Comment 9 Mike McCune 2016-03-28 22:33:16 UTC
This bug was accidentally moved from POST to MODIFIED via an error in automation. Please see mmccune with any questions.

Comment 10 Edu Alcaniz 2016-06-14 16:24:18 UTC
Could we have an update on this RFE, please?

Comment 15 Edu Alcaniz 2016-08-16 19:12:31 UTC
Could you update this BZ, please?
Thanks very much

Comment 17 Prasanth Anbalagan 2016-08-18 20:47:31 UTC
[root@serverX]# yum list installed | grep openstack-nova
openstack-nova-api.noarch            1:13.1.1-2.el7ost       @rhelosp-9.0-puddle
openstack-nova-cert.noarch           1:13.1.1-2.el7ost       @rhelosp-9.0-puddle
openstack-nova-common.noarch         1:13.1.1-2.el7ost       @rhelosp-9.0-puddle
openstack-nova-compute.noarch        1:13.1.1-2.el7ost       @rhelosp-9.0-puddle
openstack-nova-conductor.noarch      1:13.1.1-2.el7ost       @rhelosp-9.0-puddle
openstack-nova-console.noarch        1:13.1.1-2.el7ost       @rhelosp-9.0-puddle
openstack-nova-novncproxy.noarch     1:13.1.1-2.el7ost       @rhelosp-9.0-puddle
openstack-nova-scheduler.noarch      1:13.1.1-2.el7ost       @rhelosp-9.0-puddle
[root@serverX]# 


[root@serverX]# curl -i -X GET -H 'X-Auth-Token: 5f2c2ee7485b48a4925b7e0833d086ce' -H 'X-OpenStack-Nova-API-Version: 2.16' http://X.X.X.X:8774/v2.1/1e8409b372294934841eeb5e5ef5cde4/servers/5af72428-05cf-4992-af34-ef38f86b1add | grep host
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2009  100  2009    0     0   5421      0 --:--:-- --:--:-- --:--:--  5429
{"server": {"status": "ACTIVE", "OS-EXT-SRV-ATTR:ramdisk_id": "", "updated": "2016-08-18T19:36:43Z", "hostId": "e9f1f41f2beb6c49d5cbd50bc55ffbdd72d234f61684666eae3ab950", "OS-EXT-SRV-ATTR:host": "serverB", "addresses": {"public": [{"OS-EXT-IPS-MAC:mac_addr": "fa:16:3e:d7:fd:85", "version": 4, "addr": "172.24.4.229", "OS-EXT-IPS:type": "fixed"}]}, "links": [{"href": "http://X.X.X.X:8774/v2.1/1e8409b372294934841eeb5e5ef5cde4/servers/5af72428-05cf-4992-af34-ef38f86b1add", "rel": "self"}, {"href": "http://X.X.X.X:8774/1e8409b372294934841eeb5e5ef5cde4/servers/5af72428-05cf-4992-af34-ef38f86b1add", "rel": "bookmark"}], "key_name": null, "image": {"id": "9c63f09c-dffc-467d-bb86-c926a822f6a4", "links": [{"href": "http://X.X.X.X:8774/1e8409b372294934841eeb5e5ef5cde4/images/9c63f09c-dffc-467d-bb86-c926a822f6a4", "rel": "bookmark"}]}, "OS-EXT-SRV-ATTR:user_data": null, "OS-EXT-STS:task_state": null, "OS-EXT-STS:vm_state": "active", "OS-EXT-SRV-ATTR:instance_name": "instance-00000057", "OS-EXT-SRV-ATTR:root_device_name": "/dev/vda", "OS-SRV-USG:launched_at": "2016-08-18T19:36:43.000000", "locked": false, "flavor": {"id": "1", "links": [{"href": "http://X.X.X.X:8774/1e8409b372294934841eeb5e5ef5cde4/flavors/1", "rel": "bookmark"}]}, "id": "5af72428-05cf-4992-af34-ef38f86b1add", "security_groups": [{"name": "default"}], "OS-SRV-USG:terminated_at": null, "OS-EXT-SRV-ATTR:kernel_id": "", "host_status": "UP", "OS-EXT-AZ:availability_zone": "nova", "user_id": "e62260f2a0b94f519e1e9cd9239c14b7", "name": "vm0", "OS-EXT-SRV-ATTR:launch_index": 0, "created": "2016-08-18T19:36:27Z", "tenant_id": "1e8409b372294934841eeb5e5ef5cde4", "OS-DCF:diskConfig": "MANUAL", "OS-EXT-SRV-ATTR:hypervisor_hostname": "serverB", "os-extended-volumes:volumes_attached": [], "accessIPv4": "", "accessIPv6": "", "OS-EXT-SRV-ATTR:reservation_id": "r-ynqstlia", "OS-EXT-SRV-ATTR:hostname": "vm0", "progress": 0, "OS-EXT-STS:power_state": 1, "config_drive": "", "metadata": {}}}


******************
AFTER HOST IS DOWN
******************

[root@serverX]# 
[root@serverX]# curl -i -X GET -H 'X-Auth-Token: 5f2c2ee7485b48a4925b7e0833d086ce' -H 'X-OpenStack-Nova-API-Version: 2.16' http://X.X.X.X:8774/v2.1/1e8409b372294934841eeb5e5ef5cde4/servers/5af72428-05cf-4992-af34-ef38f86b1add
HTTP/1.1 200 OK
Content-Length: 2014
Content-Type: application/json
X-Openstack-Nova-Api-Version: 2.16
Vary: X-OpenStack-Nova-API-Version
X-Compute-Request-Id: req-bcf0a485-3066-42ad-943b-af8343235f8f
Date: Thu, 18 Aug 2016 20:34:27 GMT

{"server": {"status": "ACTIVE", "OS-EXT-SRV-ATTR:ramdisk_id": "", "updated": "2016-08-18T19:36:43Z", "hostId": "e9f1f41f2beb6c49d5cbd50bc55ffbdd72d234f61684666eae3ab950", "OS-EXT-SRV-ATTR:host": "serverB", "addresses": {"public": [{"OS-EXT-IPS-MAC:mac_addr": "fa:16:3e:d7:fd:85", "version": 4, "addr": "172.24.4.229", "OS-EXT-IPS:type": "fixed"}]}, "links": [{"href": "http://X.X.X.X:8774/v2.1/1e8409b372294934841eeb5e5ef5cde4/servers/5af72428-05cf-4992-af34-ef38f86b1add", "rel": "self"}, {"href": "http://X.X.X.X:8774/1e8409b372294934841eeb5e5ef5cde4/servers/5af72428-05cf-4992-af34-ef38f86b1add", "rel": "bookmark"}], "key_name": null, "image": {"id": "9c63f09c-dffc-467d-bb86-c926a822f6a4", "links": [{"href": "http://X.X.X.X:8774/1e8409b372294934841eeb5e5ef5cde4/images/9c63f09c-dffc-467d-bb86-c926a822f6a4", "rel": "bookmark"}]}, "OS-EXT-SRV-ATTR:user_data": null, "OS-EXT-STS:task_state": null, "OS-EXT-STS:vm_state": "active", "OS-EXT-SRV-ATTR:instance_name": "instance-00000057", "OS-EXT-SRV-ATTR:root_device_name": "/dev/vda", "OS-SRV-USG:launched_at": "2016-08-18T19:36:43.000000", "locked": false, "flavor": {"id": "1", "links": [{"href": "http://X.X.X.X:8774/1e8409b372294934841eeb5e5ef5cde4/flavors/1", "rel": "bookmark"}]}, "id": "5af72428-05cf-4992-af34-ef38f86b1add", "security_groups": [{"name": "default"}], "OS-SRV-USG:terminated_at": null, "OS-EXT-SRV-ATTR:kernel_id": "", "host_status": "UNKNOWN", "OS-EXT-AZ:availability_zone": "nova", "user_id": "e62260f2a0b94f519e1e9cd9239c14b7", "name": "vm0", "OS-EXT-SRV-ATTR:launch_index": 0, "created": "2016-08-18T19:36:27Z", "tenant_id": "1e8409b372294934841eeb5e5ef5cde4", "OS-DCF:diskConfig": "MANUAL", "OS-EXT-SRV-ATTR:hypervisor_hostname": "serverB", "os-extended-volumes:volumes_attached": [], "accessIPv4": "", "accessIPv6": "", "OS-EXT-SRV-ATTR:reservation_id": "r-ynqstlia", "OS-EXT-SRV-ATTR:hostname": "vm0", "progress": 0, "OS-EXT-STS:power_state": 1, "config_drive": "", "metadata": {}}}
[root@serverX]#
[root@serverX]#
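
The only field that differs between the two responses is host_status (UP before, UNKNOWN after); status, vm_state and power_state are unchanged. A minimal Python sketch of how a monitoring script might use that field follows. The two JSON strings are trimmed copies of the responses above, and the helper name `state_is_trustworthy` is hypothetical. The sketch assumes host_status appears only when microversion 2.16 or later is requested, as in the curl calls, and that policy allows the caller to see it.

```python
import json

# Trimmed copies of the two responses above; only the relevant fields are kept.
before = json.loads(
    '{"server": {"status": "ACTIVE", "host_status": "UP",'
    ' "OS-EXT-STS:power_state": 1}}'
)
after = json.loads(
    '{"server": {"status": "ACTIVE", "host_status": "UNKNOWN",'
    ' "OS-EXT-STS:power_state": 1}}'
)


def state_is_trustworthy(response):
    """Treat the reported instance state as reliable only while the host is UP."""
    server = response["server"]
    # A missing host_status (older microversion or hidden by policy) is
    # treated as "cannot tell", so the state is not trusted.
    return server.get("host_status") == "UP"


for label, resp in (("before", before), ("after", after)):
    server = resp["server"]
    print(label, server["status"], server.get("host_status"),
          state_is_trustworthy(resp))
```

In other words, a client that wants reliable uptime information would need to read status together with host_status rather than status alone.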

Comment 19 errata-xmlrpc 2016-08-24 12:51:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-1761.html

