Bug 2161733
Summary: | Make sure errors of nova-manage attachment refresh command are shown | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Artom Lifshitz <alifshit> |
Component: | openstack-nova | Assignee: | Amit Uniyal <auniyal> |
Status: | CLOSED ERRATA | QA Contact: | OSP DFG:Compute <osp-dfg-compute> |
Severity: | low | Docs Contact: | |
Priority: | medium | ||
Version: | 16.2 (Train) | CC: | dasmith, eglynn, jhakimra, kchamart, sbauza, sgordon, udesale, vromanso |
Target Milestone: | z6 | Keywords: | Patch, Triaged |
Target Release: | 16.2 (Train on RHEL 8.4) | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | openstack-nova-20.6.2-2.20230713165111.8a24acd.el8ost | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2023-11-08 19:18:31 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Comment 2
Artom Lifshitz
2023-01-17 19:51:57 UTC
Work item: fix handling of instance locking. From the email thread:
>> 3- VMs in locked state
>>
>> This may be by design, but I'll say it here and let the compute team
>> decide on the correct behavior.
>>
>> On some failures, like the one from step #1, the refresh script leaves
>> the instance in a locked state instead of clearing it.
>
> ya that kind of a bug.
> we put it in the locked state to make sure the end user cannot make any action like hard rebooting the instace
> while we are messing with the db. that is also why we require the vm to be off so that they cant power it off
> by sshing in.
>
> regardless of the success or failure the reshsh command shoudl restore the lock state
>
> so if it was locked before leave it locked and if it was unlocked leave it unlocked.
> so this sound like a bug in our error handeling and clean up
+1
Work item: disconnecting the volume from the correct host. From the email thread:
>> 5- Disconnecting from the wrong host
>>
>> There were cases where the instance said to live in compute#1 but the
>> connection_info in the BDM record was for compute#2, and when the script
>> called `remote_volume_connection` then nova would call os-brick on
>> compute#1 (the wrong node) and try to detach it.
>>
>> In some case os-brick would mistakenly think that the volume was
>> attached (because the target and lun matched an existing volume on the
>> host) and would try to disconnect, resulting in errors on the compute
>> logs.
>>
>> It wasn't a problem (besides creating some confusion and noise) because
>> the removal of the multipath failed since it was in use by an instance.
>>
>> I believe it may be necessary to change the code here:
>>
>> compute_rpcapi.remove_volume_connection(
>> cctxt, instance, volume_id, instance.host)
>>
>> To use the "host" from the connector properties in the
>> bdb.connection_info if it is present.
>
> ya that also sound like a clear bug
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform 16.2.6 (Train) bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2023:6307 |