Bug 1752734

Summary: Invalid bdm record remains when reserve_block_device_name rpc call times out
Product: Red Hat OpenStack
Reporter: Takashi Kajinami <tkajinam>
Component: openstack-nova
Assignee: Lee Yarwood <lyarwood>
Status: CLOSED WONTFIX
QA Contact: OSP DFG:Compute <osp-dfg-compute>
Severity: high
Priority: medium
Version: 13.0 (Queens)
CC: dasmith, eglynn, jhakimra, kchamart, lyarwood, sbauza, sgordon, stephenfin, vromanso
Target Milestone: z12
Keywords: Triaged, ZStream
Target Release: 13.0 (Queens)
Hardware: x86_64
OS: Linux
Last Closed: 2020-03-06 18:10:20 UTC
Type: Bug

Description Takashi Kajinami 2019-09-17 06:22:10 UTC
Description of problem:

When attaching a volume to an instance, nova updates the bdm (block device mapping) record according to the following flow.

 1. nova-api sends a reserve_block_device_name request to nova-compute over rpc
 2. nova-compute decides the block device name and creates a new bdm record without connection info
 3. nova-api sends a second request to nova-compute to perform the attach operation
 4. nova-compute updates the bdm record with the retrieved connection info
 5. if an error such as a timeout is detected during steps 3-4, the bdm record created in step 2 is removed

https://github.com/openstack/nova/blob/stable/queens/nova/compute/api.py#L4000-L4022

The problem is that if an rpc timeout happens during steps 1-2, for example because nova-compute is busy,
an invalid bdm record with connection_info:None is left behind.
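
The flow and the failure mode can be illustrated with a small toy model. This is only a sketch, not nova code: FakeCompute, api_attach and RpcTimeout are hypothetical names, the bdm table is a plain dict, and it assumes that nova-compute still finishes the reservation after the caller has already given up waiting.

```python
class RpcTimeout(Exception):
    """Hypothetical stand-in for the timeout the rpc caller sees."""


class FakeCompute:
    """Toy nova-compute: keeps bdm records in a dict keyed by volume id."""

    def __init__(self):
        self.bdms = {}

    def reserve_block_device_name(self, volume_id):
        # Step 2: pick a device name and create the record without connection info.
        self.bdms[volume_id] = {"device_name": "/dev/vdb", "connection_info": None}
        return self.bdms[volume_id]

    def attach_volume(self, volume_id, fail=False):
        # Step 4: fill in connection info, or fail before doing so.
        if fail:
            raise RpcTimeout("attach_volume timed out")
        self.bdms[volume_id]["connection_info"] = {"driver_volume_type": "iscsi"}


def api_attach(compute, volume_id, reserve_times_out=False, attach_fails=False):
    """Toy nova-api side of the attach flow."""
    # Step 1: the reservation call. Assumption: on a timeout the compute side
    # still completes its work, but the caller never sees the reply.
    compute.reserve_block_device_name(volume_id)
    if reserve_times_out:
        raise RpcTimeout("reserve_block_device_name timed out")
    try:
        # Step 3: the actual attach request.
        compute.attach_volume(volume_id, fail=attach_fails)
    except Exception:
        # Step 5: cleanup only covers failures in steps 3-4.
        del compute.bdms[volume_id]
        raise


compute = FakeCompute()
try:
    api_attach(compute, "vol-1", reserve_times_out=True)
except RpcTimeout:
    pass
print(compute.bdms)  # the record for vol-1 is still there, with connection_info None
```

In this model a failure in steps 3-4 removes the record, while a timeout in steps 1-2 never reaches the cleanup path, which matches the behaviour described in this report.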

Once this bdm record has been created, it cannot be removed through the api; the bdm record
has to be updated manually in the database.
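
As a rough illustration of what such a manual cleanup involves, the sketch below finds records that never received connection info and flags them as deleted. It runs against a simplified stand-in table in an in-memory SQLite database; the table name bdm, its columns and the helper functions are illustrative assumptions, not the real nova schema or tooling. A record without connection info can also belong to an attach that is still in progress, so any real cleanup would need to rule that out first.

```python
import sqlite3

# Simplified stand-in for the bdm table; NOT the real nova schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bdm (id INTEGER PRIMARY KEY, instance_uuid TEXT, "
    "volume_id TEXT, connection_info TEXT, deleted INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO bdm (instance_uuid, volume_id, connection_info) VALUES (?, ?, ?)",
    [
        ("inst-1", "vol-ok", '{"driver_volume_type": "iscsi"}'),
        ("inst-1", "vol-stale", None),  # left behind by the timed-out reservation
    ],
)


def find_stale_bdms(conn, instance_uuid):
    """Return (id, volume_id) for live records that never got connection info."""
    return conn.execute(
        "SELECT id, volume_id FROM bdm "
        "WHERE instance_uuid = ? AND connection_info IS NULL AND deleted = 0",
        (instance_uuid,),
    ).fetchall()


def mark_deleted(conn, bdm_id):
    """Flag one record as deleted in the stand-in table."""
    conn.execute("UPDATE bdm SET deleted = 1 WHERE id = ?", (bdm_id,))


stale = find_stale_bdms(conn, "inst-1")
for bdm_id, _volume_id in stale:
    mark_deleted(conn, bdm_id)
```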

How reproducible:
 Always

Steps to Reproduce:
1. Attach a volume to an instance while causing a timeout in the rpc communication between nova-api and nova-compute
2. See "openstack server show <instance>"

Actual results:
The invalid bdm record remains, and volumes_attached shows the volume added by the operation

Expected results:
The invalid bdm record is deleted and volumes_attached shows no change

Additional info: