1672972 – [RFE] After a Nova compute shutdown, the instances' state held by the hypervisor is not updated

Bug 1672972 - [RFE] After a Nova compute shutdown, the instances' state held by the hypervisor is not updated

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	openstack-nova
Sub Component:
Version:	10.0 (Newton)
Hardware:	Unspecified
OS:	Unspecified
Priority:	low
Severity:	high
Target Milestone:	ga
Target Release:	17.1
Assignee:	melanie witt
QA Contact:	OSP DFG:Compute
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1994072 2150082

Reported:	2019-02-06 11:15 UTC by vivek koul
Modified:	2023-10-20 20:07 UTC
CC List:	22 users
Fixed In Version:	openstack-nova-23.2.2-1.20221021161917.a9e8162.el9ost
Doc Type:	Enhancement
Doc Text:	This enhancement helps cloud users determine whether they are unable to access an "ACTIVE" instance because the Compute node that hosts the instance is unreachable. RHOSP administrators can now configure the following parameters to enable a custom policy that shows a status in the `host_status` field to cloud users when they run the `openstack server show` command, if the host Compute node is unreachable:
* `NovaApiHostStatusPolicy`: Specifies the role the custom policy applies to.
* `NovaShowHostStatus`: Specifies the level of host status to show to the cloud user, for example, "UNKNOWN".
Clone Of:
Clones:	1994072
Environment:
Last Closed:	2023-08-16 01:09:21 UTC
Target Upstream Version:	Ussuri
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
OpenStack gerrit	679181	'None'	MERGED	Add new policy rule for viewing host status UNKNOWN	2023-05-16 19:08:53 UTC
Red Hat Issue Tracker	OSP-3146	None	None	None	2022-03-25 18:59:13 UTC
Red Hat Product Errata	RHEA-2023:4577	None	None	None	2023-08-16 01:10:48 UTC

Description vivek koul 2019-02-06 11:15:25 UTC

Description of problem: 
The customer noticed that after a Nova compute node is shut down, the state of the instances it hosts is not updated: they still appear active and running when in fact they are not.


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Spawn an instance.
2. Shut down the compute node on which the instance was spawned:
     # shutdown -h now
3. Run nova list. The instances are still shown in the ACTIVE state.

Actual results:
Instances are still in the ACTIVE state.

Expected results:
Instances should be shown in a shutdown state.

Additional info:

Comment 2 Matthew Booth 2019-02-08 15:40:51 UTC

This is something we could look to improve cosmetically in Nova. Possibly marking instances unknown if the host state is also unknown?

Comment 10 melanie witt 2019-05-08 16:25:21 UTC

As discussed in earlier comments on this BZ, the host status is available in its own field:

$ nova list --fields name,status,task_state,power_state,host_status

+--------------------------------------+---------+---------+------------+-------------+-------------+
| ID                                   | Name    | Status  | Task State | Power State | Host Status |
+--------------------------------------+---------+---------+------------+-------------+-------------+
| 4xxx750b-6ba5-4284-xxxd-128c0bfda783 | vivek   | ACTIVE  | None       | Running     | UNKNOWN     |
+--------------------------------------+---------+---------+------------+-------------+-------------+

with the 'host_status' field being controlled by policy. If you want non-admins to be able to view the 'host_status' field, the policy.json must be adjusted.

Hypervisor details cannot be known if nova-api is not able to communicate with nova-compute. At best, the status is "unknown". The 'host_status' field contains this information. Policy defaults to not revealing hypervisor details in server fields and such details are only available to admin. If the customer wishes to reveal this information to end users, they must adjust their policy.json.
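The reasoning above, that the API side can at best report "unknown" once it stops hearing from nova-compute, can be illustrated with a minimal Python sketch. This is not Nova's code: the function name, its parameters, and the 60-second timeout are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a host is considered unreachable after 60 seconds without
# a heartbeat. The value is illustrative, not taken from this report.
HEARTBEAT_TIMEOUT = timedelta(seconds=60)


def host_status(last_heartbeat, now, forced_down=False):
    """Classify a compute host from the point of view of the API side.

    last_heartbeat: time the compute service last reported in.
    now: current time (passed in so the function is easy to test).
    forced_down: True if an operator explicitly marked the host down.
    """
    if forced_down:
        # An operator has stated the host is down, so this is known.
        return "DOWN"
    if now - last_heartbeat > HEARTBEAT_TIMEOUT:
        # No recent heartbeat: the host may be powered off or merely
        # partitioned, so the honest answer is UNKNOWN.
        return "UNKNOWN"
    return "UP"


now = datetime(2019, 5, 8, 16, 25, tzinfo=timezone.utc)
print(host_status(now - timedelta(minutes=10), now))
```

The point of the sketch is that a missing heartbeat cannot distinguish a powered-off host from a network problem, which is why the instance status itself is left unchanged.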

Comment 11 melanie witt 2019-06-06 22:26:02 UTC

To add more detail: the way that server status is shown is expected behavior and not a bug. There is code for detecting the power state of a server and updating the status accordingly, but that code runs on the compute host, in the nova-compute service. A periodic task queries libvirt for the virtual machine state, and if the state is found to be powered down, the nova server status is updated. If the periodic task happens to run while the compute host is powering down, after the libvirt domain is shut down but before nova-compute is stopped, the server status is updated to SHUTOFF (this is fortunate timing when it happens).
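The periodic task described above can be sketched roughly as follows. This is a simplified illustration, not the nova-compute implementation: the `Instance` class, `sync_power_states`, and the `query_power_state` callable (a stand-in for asking libvirt about a domain) are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    status: str  # e.g. "ACTIVE" or "SHUTOFF"


def sync_power_states(instances, query_power_state):
    """Run one pass of a hypothetical power-state sync on the compute host.

    query_power_state(name) is an assumed callable returning "running"
    or "shutdown" for the named virtual machine.
    Returns the names of the instances whose status was changed.
    """
    changed = []
    for inst in instances:
        if inst.status == "ACTIVE" and query_power_state(inst.name) == "shutdown":
            # The hypervisor says the VM is off, so record that.
            inst.status = "SHUTOFF"
            changed.append(inst.name)
    return changed


vms = [Instance("vivek", "ACTIVE")]
# The pass only happens while the compute service is still running.
print(sync_power_states(vms, lambda name: "shutdown"), vms[0].status)
```

Because this loop lives on the compute host, it never runs once the host itself is powered off, so the last recorded status (ACTIVE) is what the API keeps returning.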

Today, the only way to get additional information about the server is via the 'host_status' field, which is accessible by running the 'nova list --fields name,status,task_state,power_state,host_status' command.

With 'host_status', the host status can reflect the state UNKNOWN, which indicates that the host could have been powered down. Host status will also reveal whether the hypervisor is UP or DOWN (forced_down for maintenance). Note that the policy.json must be adjusted if it is considered acceptable to expose 'host_status' to non-admin users. The relevant rule is:

  "os_compute_api:servers:show:host_status": "rule:admin_api"

If the policy is not set to allow the user, they will get the error: "ERROR (CommandError): Non-existent fields are specified: [u'host_status']".
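As an illustration of the policy.json adjustment mentioned above, here is a small Python sketch that rewrites the rule in a policy document. The helper name and the replacement value "rule:admin_or_owner" are assumptions: the replacement must be a rule that actually exists in the deployment's policy, and the right value depends on who should be allowed to see the field.

```python
import json

HOST_STATUS_RULE = "os_compute_api:servers:show:host_status"


def allow_host_status(policy_text, new_rule="rule:admin_or_owner"):
    """Return policy JSON text with the host_status rule replaced.

    policy_text: current contents of policy.json (may be empty).
    new_rule: assumed replacement value; pick one valid for your cloud.
    """
    policy = json.loads(policy_text) if policy_text.strip() else {}
    policy[HOST_STATUS_RULE] = new_rule
    return json.dumps(policy, indent=2, sort_keys=True)


current = '{"os_compute_api:servers:show:host_status": "rule:admin_api"}'
print(allow_host_status(current))
```

The function only transforms text, so the result can be reviewed before it is written back to the policy file.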

Comment 12 melanie witt 2019-06-06 22:33:24 UTC

I started a thread to get feedback from the upstream community about changing the way server status is shown:

http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006526.html

There was some amount of support for the idea of showing server status as UNKNOWN if the underlying host status is UNKNOWN. This would not leak any details about hypervisors to end users.
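The idea floated in that thread can be expressed as a tiny Python sketch. It shows only the proposal as described here, not merged Nova behavior, and the function name is hypothetical.

```python
def displayed_server_status(server_status, host_status):
    """Return the status an end user would see under the proposal.

    If the host's state cannot be determined, the server's last recorded
    status is not trustworthy, so UNKNOWN is shown instead. No other
    hypervisor detail is revealed.
    """
    if host_status == "UNKNOWN":
        return "UNKNOWN"
    return server_status


print(displayed_server_status("ACTIVE", "UNKNOWN"))
print(displayed_server_status("ACTIVE", "UP"))
```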

If the community accepts the proposal, it would require a new nova compute API microversion to add the functionality. So, the change would not be backportable.

I'm working on a draft for a nova spec proposal presently.

Comment 13 melanie witt 2019-06-18 22:37:17 UTC

Spec has been proposed here: https://review.opendev.org/666181

Comment 18 melanie witt 2019-09-09 21:03:11 UTC

This blueprint was deferred to the U (Ussuri) release upstream:

https://blueprints.launchpad.net/nova/+spec/policy-rule-for-host-status-unknown

Comment 20 melanie witt 2019-11-04 23:20:53 UTC

https://review.opendev.org/679181 has merged upstream.
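The merged change is titled "Add new policy rule for viewing host status UNKNOWN". A rough Python sketch of how such a two-level policy could filter the field is shown below. The function and parameter names are hypothetical, and returning an empty string for non-UNKNOWN values under the restricted rule is an assumption made for illustration.

```python
def visible_host_status(host_status, can_see_all, can_see_unknown_only):
    """Decide what a user sees in the host_status field.

    can_see_all: user passes the full host_status policy rule.
    can_see_unknown_only: user passes only the restricted rule.
    Returns None when the field should be omitted entirely.
    """
    if can_see_all:
        return host_status
    if can_see_unknown_only:
        # Assumption: anything other than UNKNOWN is blanked out so no
        # further hypervisor detail leaks to the user.
        return host_status if host_status == "UNKNOWN" else ""
    return None


print(visible_host_status("UNKNOWN", False, True))
```

With this shape, an ordinary user can learn that the host is unreachable without learning whether it is otherwise up or down.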

Comment 27 spower 2022-07-05 15:08:41 UTC

TRAC team have stated there will be no RFEs in zstreams for OSP 17.0 so moving this to 17.1. Any questions please contact rhos-trac.

Comment 49 Joanne O'Flynn 2023-06-09 09:30:14 UTC

This enhancement helps cloud users determine whether they are unable to access an "ACTIVE" instance because the Compute node that hosts the instance is unreachable. RHOSP administrators can now configure the `NovaShowHostStatus` and `NovaApiHostStatusPolicy` parameters to enable a custom policy that displays the "UNKNOWN" `host_status` to cloud users when they run the `openstack server show` command, if the host Compute node is unreachable.

Comment 57 errata-xmlrpc 2023-08-16 01:09:21 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.1 (Wallaby)), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:4577
