Bug 1752734 - Invalid bdm record remains when reserve_block_device_name rpc call times out
Summary: Invalid bdm record remains when reserve_block_device_name rpc call times out
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 13.0 (Queens)
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: z12
Target Release: 13.0 (Queens)
Assignee: Lee Yarwood
QA Contact: OSP DFG:Compute
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-09-17 06:22 UTC by Takashi Kajinami
Modified: 2023-09-07 20:41 UTC
CC List: 9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-06 18:10:20 UTC
Target Upstream Version:
Embargoed:




Links
  Launchpad bug 1425352 (last updated 2020-01-22 11:53:11 UTC)
  OpenStack gerrit change 693537: MERGED, "compute: Use long_rpc_timeout in reserve_block_device_name" (last updated 2020-12-16 01:25:15 UTC)
  Red Hat Issue Tracker OSP-28268 (last updated 2023-09-07 20:41:41 UTC)

Description Takashi Kajinami 2019-09-17 06:22:10 UTC
Description of problem:

When attaching a volume to an instance, nova updates the BDM record according to the following flow (a simplified code sketch follows the link below):

 1. nova-api sends an RPC request to nova-compute to reserve_block_device_name
 2. nova-compute decides the block device name and creates a new BDM record without connection info
 3. nova-api sends a second request to nova-compute to perform the attach operation
 4. nova-compute updates the BDM with the retrieved connection info
 5. if an error such as a timeout is detected between steps 3 and 4, the BDM record created in step 2 is removed

https://github.com/openstack/nova/blob/stable/queens/nova/compute/api.py#L4000-L4022
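
For illustration, the sketch below mirrors the structure of the linked nova-api code path, with hypothetical parameter names; it is not the verbatim upstream implementation. The rollback in the except block only covers the attach call (steps 3-4): a timeout in reserve_block_device_name (steps 1-2) raises before the try block, while nova-compute may still create the BDM a moment later.

    # Simplified sketch of nova-api's volume attach path. Names follow
    # nova's compute API/rpcapi, but this is illustrative, not exact code.
    from oslo_utils import excutils

    def attach_volume(compute_rpcapi, context, instance, volume_id, device):
        # Steps 1-2: synchronous RPC to nova-compute, which picks the device
        # name and creates the BDM record (connection_info is still None).
        # If this call times out on the API side, the exception propagates
        # from here, yet nova-compute may still create the BDM afterwards.
        volume_bdm = compute_rpcapi.reserve_block_device_name(
            context, instance, device, volume_id)
        try:
            # Steps 3-4: ask nova-compute to perform the actual attach and
            # fill in the connection_info on the BDM.
            compute_rpcapi.attach_volume(context, instance, volume_bdm)
        except Exception:
            # Step 5: rollback deletes the BDM, but only for failures of the
            # attach call above, not for reserve_block_device_name timeouts.
            with excutils.save_and_reraise_exception():
                volume_bdm.destroy()
        return volume_bdm.device_name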

The problem is that if an RPC timeout occurs during steps 1-2, for example because nova-compute is busy,
an invalid BDM record with connection_info=None is left behind.
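
The upstream change listed in the Links section above (OpenStack gerrit 693537, "compute: Use long_rpc_timeout in reserve_block_device_name") addresses the timeout side by preparing this RPC call with nova's long_rpc_timeout. The following is a hedged sketch of that direction; the router/client helpers, option names and call shape mirror nova's compute rpcapi, but the snippet is illustrative rather than the exact merged patch.

    # Sketch of the fix direction from the linked gerrit change: give the
    # reserve_block_device_name call a longer timeout (CONF.long_rpc_timeout)
    # while call_monitor_timeout still detects an unresponsive compute quickly.
    # Illustrative method of nova's compute rpcapi class, not the exact patch.
    import nova.conf

    CONF = nova.conf.CONF

    def reserve_block_device_name(self, ctxt, instance, device, volume_id,
                                  disk_bus=None, device_type=None):
        client = self.router.client(ctxt)
        cctxt = client.prepare(
            server=instance.host,
            call_monitor_timeout=CONF.rpc_response_timeout,
            timeout=CONF.long_rpc_timeout)
        return cctxt.call(ctxt, 'reserve_block_device_name',
                          instance=instance, device=device,
                          volume_id=volume_id, disk_bus=disk_bus,
                          device_type=device_type)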

Once this BDM record is created, it cannot be removed through the API; the record has to be
fixed manually in the database.
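
For operators hitting this, a possible cleanup is sketched below. This is a hypothetical example using nova's objects layer rather than raw SQL, and it assumes a node with nova's configuration loaded; the report itself only states that the record has to be fixed manually in the database.

    # Hypothetical cleanup sketch: locate and soft-delete the orphaned BDM
    # through nova's objects layer instead of editing the database by hand.
    # Assumes nova's config (including the database connection) is loaded.
    from nova import context as nova_context
    from nova import objects

    objects.register_all()

    def remove_orphaned_bdm(instance_uuid, volume_id):
        ctxt = nova_context.get_admin_context()
        bdm = objects.BlockDeviceMapping.get_by_volume_and_instance(
            ctxt, volume_id, instance_uuid)
        # Only remove the record if it is the invalid one left without
        # connection info by the timed-out reserve_block_device_name call.
        if bdm.connection_info is None:
            bdm.destroy()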

How reproducible:
 Always

Steps to Reproduce:
1. Attach a volume to an instance while causing a timeout in the RPC communication between nova-api and nova-compute (for example, by temporarily lowering rpc_response_timeout on the nova-api node, or by making nova-compute busy)
2. See "openstack server show <instance>"

Actual results:
An invalid BDM record remains, and the volume added by the failed operation appears in volumes_attached

Expected results:
The invalid BDM record is deleted and there is no change in volumes_attached

Additional info:

