Bug 2161733
| Summary: | Make sure errors of nova-manage attachment refresh command are shown | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Artom Lifshitz <alifshit> |
| Component: | openstack-nova | Assignee: | Amit Uniyal <auniyal> |
| Status: | ON_QA | QA Contact: | OSP DFG:Compute <osp-dfg-compute> |
| Severity: | low | Docs Contact: | |
| Priority: | medium | | |
| Version: | 16.2 (Train) | CC: | dasmith, eglynn, jhakimra, kchamart, sbauza, sgordon, udesale, vromanso |
| Target Milestone: | z6 | Keywords: | Patch, Triaged |
| Target Release: | 16.2 (Train on RHEL 8.4) | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | openstack-nova-20.6.2-2.20230713165111.8a24acd.el8ost | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Comment 2
Artom Lifshitz
2023-01-17 19:51:57 UTC
Work item: fix handling of instance locking. From the email thread:
>> 3- VMs in locked state
>>
>> This may be by design, but I'll say it here and let the compute team
>> decide on the correct behavior.
>>
>> On some failures, like the one from step #1, the refresh script leaves
>> the instance in a locked state instead of clearing it.
>
> ya that's kind of a bug.
> we put it in the locked state to make sure the end user cannot take any action like hard rebooting the instance
> while we are messing with the db. that is also why we require the vm to be off, so that they can't power it off
> by sshing in.
>
> regardless of success or failure, the refresh command should restore the lock state:
>
> so if it was locked before, leave it locked, and if it was unlocked, leave it unlocked.
> so this sounds like a bug in our error handling and cleanup.
+1
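
A minimal sketch of the lock-restore behavior described above, assuming the command can read and write the instance's `locked` flag; the `do_refresh` callable and the exact save calls are illustrative placeholders, not the actual nova-manage implementation:

```python
def refresh_volume_attachment(ctxt, instance, volume_id, do_refresh):
    """Run do_refresh() while the instance is locked, then restore the
    operator's original lock state regardless of success or failure."""
    # Remember whether the instance was already locked by the operator.
    was_locked = instance.locked
    instance.locked = True
    instance.save()
    try:
        do_refresh(ctxt, instance, volume_id)  # may raise on failure
    finally:
        # If it was locked before, leave it locked; if it was unlocked,
        # unlock it again. The finally block covers the failure paths
        # that currently leave the instance stuck in the locked state.
        instance.locked = was_locked
        instance.save()
```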
Work item: disconnecting the volume from the correct host. From the email thread:
>> 5- Disconnecting from the wrong host
>>
>> There were cases where the instance said to live in compute#1 but the
>> connection_info in the BDM record was for compute#2, and when the script
>> called `remove_volume_connection`, nova would call os-brick on
>> compute#1 (the wrong node) and try to detach it.
>>
>> In some cases os-brick would mistakenly think that the volume was
>> attached (because the target and lun matched an existing volume on the
>> host) and would try to disconnect, resulting in errors on the compute
>> logs.
>>
>> It wasn't a problem (besides creating some confusion and noise) because
>> the removal of the multipath failed since it was in use by an instance.
>>
>> I believe it may be necessary to change the code here:
>>
>> compute_rpcapi.remove_volume_connection(
>> cctxt, instance, volume_id, instance.host)
>>
>> To use the "host" from the connector properties in the
>> bdm.connection_info if it is present.
>
> ya that also sounds like a clear bug
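
A sketch of the host selection suggested above, assuming the connector properties sit under a `connector` key in the BDM's JSON-encoded `connection_info` (as the thread describes); the `_pick_connection_host` helper is hypothetical:

```python
from oslo_serialization import jsonutils


def _pick_connection_host(bdm, instance):
    # connection_info is stored as a JSON string on the BDM record.
    info = jsonutils.loads(bdm.connection_info or '{}')
    connector = info.get('connector') or {}
    # Prefer the host the volume was actually connected from; fall back
    # to the host the instance record says it lives on.
    return connector.get('host', instance.host)


# The RPC call from the thread would then become something like:
# compute_rpcapi.remove_volume_connection(
#     cctxt, instance, volume_id, _pick_connection_host(bdm, instance))
```

This keeps the detach on the node that actually holds the connection, avoiding the spurious os-brick disconnect attempts on the wrong compute host.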