Bug 2178507

Summary: 17.1 - #5- Disconnecting from the wrong host - Fix issues with nova-manage volume_attachment subcommand
Product: Red Hat OpenStack
Reporter: Amit Uniyal <auniyal>
Component: openstack-nova
Assignee: Amit Uniyal <auniyal>
Status: CLOSED DUPLICATE
QA Contact: OSP DFG:Compute <osp-dfg-compute>
Severity: high
Docs Contact:
Priority: medium
Version: 17.1 (Wallaby)
CC: alifshit, dasmith, eglynn, jelynch, jhakimra, kchamart, sbauza, sgordon, vromanso
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The compute host recorded in the connector file is not correct. Consequence: The volume refresh fails. Fix: Verify whether the compute host given in the connector file is correct, and let the user know if it is not. Result: The volume refresh should only run for the correct compute host.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-08-11 15:39:08 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Amit Uniyal 2023-03-15 06:24:26 UTC
This bug was initially created as a copy of Bug #2178501

I am copying this bug because: 

Work item: disconnecting the volume from the correct host. From the email thread:

>> 5- Disconnecting from the wrong host
>>
>> There were cases where the instance was said to live on compute#1 but the
>> connection_info in the BDM record was for compute#2, and when the script
>> called `remove_volume_connection` then nova would call os-brick on
>> compute#1 (the wrong node) and try to detach it.
>>
>> In some cases os-brick would mistakenly think that the volume was
>> attached (because the target and lun matched an existing volume on the
>> host) and would try to disconnect, resulting in errors in the compute
>> logs.
>>
>> It wasn't a problem (besides creating some confusion and noise) because
>> the removal of the multipath failed since it was in use by an instance.
>>
>> I believe it may be necessary to change the code here:
>>
>>                  compute_rpcapi.remove_volume_connection(
>>                      cctxt, instance, volume_id, instance.host)
>>
>> To use the "host" from the connector properties in the
>> bdm.connection_info if it is present.
> 
> ya that also sounds like a clear bug
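
The change suggested in the thread can be sketched as a small host-selection helper. This is only an illustration and not the actual nova patch: the helper name `resolve_disconnect_host` is hypothetical, and it assumes that the BDM connection_info is a JSON string which may carry the connector under a "connector" key with a "host" entry. If that information is missing or unreadable, it falls back to instance.host, which matches the current behaviour.

```python
import json
from typing import Optional


def resolve_disconnect_host(connection_info: Optional[str],
                            instance_host: str) -> str:
    """Pick the host to pass to remove_volume_connection.

    Hypothetical helper: assumes connection_info is the JSON string stored
    in the BDM record and that the connector, when present, lives under
    the "connector" key with a "host" entry.
    """
    if not connection_info:
        # No connection_info recorded: keep the current behaviour.
        return instance_host
    try:
        info = json.loads(connection_info)
    except (TypeError, ValueError):
        # Unparseable connection_info: fall back to the instance host.
        return instance_host
    if not isinstance(info, dict):
        return instance_host
    connector = info.get("connector")
    if not isinstance(connector, dict):
        return instance_host
    # Prefer the host the volume was actually connected on, if recorded.
    return connector.get("host") or instance_host
```

With such a helper, the call quoted above would pass the resolved host instead of instance.host as its last argument, so os-brick is invoked on the node that actually holds the connection.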

This bug was initially created as a copy of Bug #2161733

I am copying this bug because: 

Description of problem:

Gorka had to make heavy use of the `nova-manage volume_attachment` commands in resolving an escalation for Ericsson, and he had some feedback for us. We'd like to implement that feedback.

Version-Release number of selected component (if applicable):

From master all the way down to 16.2.

How reproducible:

N/A

Steps to Reproduce:

N/A

Actual results:

N/A

Expected results:

N/A

Additional info:

The thread where Gorka explains his feedback is at [1]. I'll try to break it down into specific fixes/work items in subsequent comments in this BZ.

[1] https://lists.corp.redhat.com/archives/rhos-compute/2022-December/000883.html