
Bug 2309703

Summary: [OSP17.1] Invalid state of instance files on shared storage when nova host (or single) evacuate
Product: Red Hat OpenStack
Reporter: Ricardo Ramos Thomas <riramos>
Component: openstack-nova
Assignee: OSP DFG:Compute <osp-dfg-compute>
Status: CLOSED DUPLICATE
QA Contact: OSP DFG:Compute <osp-dfg-compute>
Severity: high
Priority: unspecified
Version: 17.1 (Wallaby)
CC: dasmith, dhill, eglynn, enothen, jhakimra, kchamart, sbauza, sgordon, vromanso
Hardware: Unspecified
OS: Linux
Last Closed: 2024-09-06 11:58:31 UTC
Type: Bug

Description Ricardo Ramos Thomas 2024-09-04 12:32:16 UTC
Description of problem:

- OSP 17.1
- no instanceHA
- Ceph as the backend

When we evacuate a host or a single instance, the following error is reported on the current compute-hci node and the instance does not migrate:
| OS-EXT-STS:vm_state | error |
| fault               | "...InvalidSharedStorage: Invalid state of instance files on shared storage" |
| status              | ERROR |



~~~
0020-sosreport-cpt-hci-04-03918220-2024-08-29-zntirea.tar.xz/sosreport-cpt-hci-04-03918220-2024-08-29-zntirea/var/log/containers/nova/nova-compute.log:2024-08-29 16:53:06.723 2 ERROR nova.compute.manager [req-a4e3e17f-6ec9-4db8-a669-e1a5334f29a8 3d92a319387141e5bfc8217d7c60c167 2eef95fe9e5a4ca890be4a54b8c69cab - default default] [instance: 15bba80f-4273-4bd0-848f-bd58faae8076] Setting instance vm_state to ERROR: nova.exception.InvalidSharedStorage: Invalid state of instance files on shared storage
0020-sosreport-cpt-hci-04-03918220-2024-08-29-zntirea.tar.xz/sosreport-cpt-hci-04-03918220-2024-08-29-zntirea/var/log/containers/nova/nova-compute.log:2024-08-29 16:53:06.723 2 ERROR nova.compute.manager [instance: 15bba80f-4273-4bd0-848f-bd58faae8076]     raise exception.InvalidSharedStorage(
0020-sosreport-cpt-hci-04-03918220-2024-08-29-zntirea.tar.xz/sosreport-cpt-hci-04-03918220-2024-08-29-zntirea/var/log/containers/nova/nova-compute.log:2024-08-29 16:53:06.723 2 ERROR nova.compute.manager [instance: 15bba80f-4273-4bd0-848f-bd58faae8076] nova.exception.InvalidSharedStorage: Invalid state of instance files on shared storage
0020-sosreport-cpt-hci-04-03918220-2024-08-29-zntirea.tar.xz/sosreport-cpt-hci-04-03918220-2024-08-29-zntirea/var/log/containers/nova/nova-compute.log:2024-08-29 16:53:06.953 2 ERROR oslo_messaging.rpc.server [req-a4e3e17f-6ec9-4db8-a669-e1a5334f29a8 3d92a319387141e5bfc8217d7c60c167 2eef95fe9e5a4ca890be4a54b8c69cab - default default] Exception during message handling: nova.exception.InvalidSharedStorage: Invalid state of instance files on shared storage
0020-sosreport-cpt-hci-04-03918220-2024-08-29-zntirea.tar.xz/sosreport-cpt-hci-04-03918220-2024-08-29-zntirea/var/log/containers/nova/nova-compute.log:2024-08-29 16:53:06.953 2 ERROR oslo_messaging.rpc.server     raise exception.InvalidSharedStorage(
0020-sosreport-cpt-hci-04-03918220-2024-08-29-zntirea.tar.xz/sosreport-cpt-hci-04-03918220-2024-08-29-zntirea/var/log/containers/nova/nova-compute.log:2024-08-29 16:53:06.953 2 ERROR oslo_messaging.rpc.server nova.exception.InvalidSharedStorage: Invalid state of instance files on shared storage
0020-sosreport-cpt-hci-04-03918220-2024-08-29-zntirea.tar.xz/sosreport-cpt-hci-04-03918220-2024-08-29-zntirea/var/log/containers/nova/nova-compute.log:2024-08-29 16:53:07.887 2 ERROR nova.compute.manager [req-ac2f3fe5-05ff-4b69-8cd9-143e2a66a0f1 3d92a319387141e5bfc8217d7c60c167 2eef95fe9e5a4ca890be4a54b8c69cab - default default] [instance: 996ed549-bc1b-4532-9a52-81e120a5661d] Setting instance vm_state to ERROR: nova.exception.InvalidSharedStorage: Invalid state of instance files on shared storage
~~~


Three cases were tested:

1 - nova evacuate & nova host-evacuate - Fail
2 - openstack server evacuate - Fail
3 - openstack server evacuate --shared-storage - OK/success

Using --shared-storage should not be needed.

Version-Release number of selected component (if applicable):

Red Hat OpenStack Platform release 17.1.2 (Wallaby)

How reproducible:

Try to evacuate an instance.

Steps to Reproduce:
1. Try to evacuate an instance.

Actual results:
- openstack server evacuate: evacuation fails if the command does not include --shared-storage

- nova evacuate: always fails

Expected results:

Evacuation of a host and of a single server succeeds.

Additional info:

- Test info
- SOS report with Nova debug mode enabled (issue reproduced) is available

Comment 2 David Hill 2024-09-06 00:53:29 UTC
The issue is that overcloudrc is requesting nova microversion 2.10 (or is it 2.1?), and whether the instance is on shared storage is only auto-calculated in 2.14 and later. I'm pretty sure that if you set OS_COMPUTE_API_VERSION to 2.14 or 2.15 in the overcloudrc, it will work automatically.
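
To make the 2.1 / 2.10 / 2.14 distinction concrete, here is a minimal sketch of a microversion check. It is not nova or client code; the function names and the constant are hypothetical and exist only for illustration. The one fact it relies on is that from compute API microversion 2.14 the evacuate action no longer takes an on-shared-storage argument and nova detects shared storage itself. Microversions have to be compared as integer pairs, because as strings "2.9" would sort after "2.14".

```python
# Hypothetical helper names; this is an illustration, not nova code.

# From this compute API microversion on, evacuate no longer accepts an
# explicit on-shared-storage argument and nova detects it automatically.
AUTO_SHARED_STORAGE_MIN = (2, 14)


def parse_microversion(value):
    """Parse a 'MAJOR.MINOR' string such as '2.14' into a tuple of ints.

    Values like '2.latest' are not handled by this sketch and raise ValueError.
    """
    major, minor = value.strip().split(".")
    return int(major), int(minor)


def evacuate_autodetects_shared_storage(requested):
    """Return True if the requested microversion is 2.14 or later."""
    return parse_microversion(requested) >= AUTO_SHARED_STORAGE_MIN
```

Under this check both "2.1" and "2.10" fall below the threshold, so either reading of the overcloudrc value is consistent with the failures reported above, while "2.14" and "2.15" pass.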

Comment 3 David Hill 2024-09-06 01:13:13 UTC
[dhill@knox tripleo_common]$ git show df8edc662
commit df8edc662feebd6700618a5d68d7c966ecae9256
Author: Alexon Oliveira <alolivei>
Date:   Wed Aug 23 12:11:57 2023 -0300

    Changed the DEFAULT_COMPUTE_API_VERSION value in the constants.py
    
    The default value defined for the variable DEFAULT_COMPUTE_API_VERSION
    in the ~/.overcloudrc file generated by overcloudrc.py, which pulls
    this variable value from constants.py, is '2.latest', but this
    generates a warning message everytime a user uses this standard config
    file, like this one below:
    
    /usr/lib/python3.9/site-packages/openstack/config/cloud_region.py:452:
    UserWarning: You have a configured API_VERSION with 'latest' in it.
    In the context of openstacksdk this doesn't make any sense.
      warnings.warn(
    
    To avoid that, I changed the value '2.latest' to simply '2.1',
    which is the current API version in use after a fresh install.
    
    Change-Id: Idbcfbf29626d9a66427b78067813915caf0e9cd5
    Signed-off-by: Alexon Oliveira <alolivei>

diff --git a/tripleo_common/constants.py b/tripleo_common/constants.py
index 9406dae8..ffc7b306 100644
--- a/tripleo_common/constants.py
+++ b/tripleo_common/constants.py
@@ -120,7 +120,7 @@ DO_NOT_ROTATE_LIST = (
 DEFAULT_IDENTITY_API_VERSION = '3'
 
 # The default version of the Compute API to set in overcloudrc.
-DEFAULT_COMPUTE_API_VERSION = '2.latest'
+DEFAULT_COMPUTE_API_VERSION = '2.1'
 
 # The default version of the Image API to set in overcloudrc.
 DEFAULT_IMAGE_API_VERSION = '2'
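
To check which value a given overcloudrc actually carries, something like the following sketch could be used. It assumes the file sets the variable on a line of the form `export OS_COMPUTE_API_VERSION=<value>`, optionally quoted; that layout, and the function name, are assumptions made for illustration and are not taken from the tripleo-common source.

```python
import re

# Assumed line format: [export ]OS_COMPUTE_API_VERSION=<value>, optionally quoted.
_RC_LINE = re.compile(
    r"""^\s*(?:export\s+)?OS_COMPUTE_API_VERSION=(["']?)([^"'\s]+)\1\s*$""",
    re.MULTILINE,
)


def find_compute_api_version(rc_text):
    """Return the last OS_COMPUTE_API_VERSION value set in rc_text, or None."""
    matches = _RC_LINE.findall(rc_text)
    if not matches:
        return None
    # A later assignment overrides an earlier one when the file is sourced.
    return matches[-1][1]
```

For example, passing the contents of the rc file as a string returns '2.1' for a file generated after the commit above and '2.latest' for one generated before it, if the assumed line format holds.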

Comment 5 David Hill 2024-09-06 11:58:31 UTC

*** This bug has been marked as a duplicate of bug 2259167 ***