Bug 1752734 - Invalid bdm record remains when reserve_block_device_name rpc call times out
Summary: Invalid bdm record remains when reserve_block_device_name rpc call times out
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 13.0 (Queens)
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: z12
Target Release: 13.0 (Queens)
Assignee: Lee Yarwood
QA Contact: OSP DFG:Compute
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-09-17 06:22 UTC by Takashi Kajinami
Modified: 2023-09-07 20:41 UTC
CC List: 9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-06 18:10:20 UTC
Target Upstream Version:
Embargoed:




Links
  Launchpad bug 1425352 (last updated 2020-01-22 11:53:11 UTC)
  OpenStack gerrit change 693537: MERGED, "compute: Use long_rpc_timeout in reserve_block_device_name" (last updated 2020-12-16 01:25:15 UTC)
  Red Hat Issue Tracker OSP-28268 (last updated 2023-09-07 20:41:41 UTC)

Description Takashi Kajinami 2019-09-17 06:22:10 UTC
Description of problem:

When attaching a volume to an instance, nova updates the BDM record according to the following flow (a simplified code sketch follows the link below):

 1. nova-api sends an RPC request to nova-compute to reserve_block_device_name
 2. nova-compute decides the block device name and creates a new BDM record without connection info
 3. nova-api sends a second request to nova-compute to perform the attach operation
 4. nova-compute updates the BDM with the retrieved connection info
 5. if an error such as a timeout is detected between steps 3 and 4, the BDM record created in step 2 is removed

https://github.com/openstack/nova/blob/stable/queens/nova/compute/api.py#L4000-L4022
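
For illustration, the sketch below mirrors the structure of the linked nova-api code path, with hypothetical parameter names; it is not the verbatim upstream implementation. The rollback in the except block only covers the attach call (steps 3-4): a timeout in reserve_block_device_name (steps 1-2) raises before the try block, while nova-compute may still create the BDM a moment later.

    # Simplified sketch of nova-api's volume attach path. Names follow
    # nova's compute API/rpcapi, but this is illustrative, not exact code.
    from oslo_utils import excutils

    def attach_volume(compute_rpcapi, context, instance, volume_id, device):
        # Steps 1-2: synchronous RPC to nova-compute, which picks the device
        # name and creates the BDM record (connection_info is still None).
        # If this call times out on the API side, the exception propagates
        # from here, yet nova-compute may still create the BDM afterwards.
        volume_bdm = compute_rpcapi.reserve_block_device_name(
            context, instance, device, volume_id)
        try:
            # Steps 3-4: ask nova-compute to perform the actual attach and
            # fill in the connection_info on the BDM.
            compute_rpcapi.attach_volume(context, instance, volume_bdm)
        except Exception:
            # Step 5: rollback deletes the BDM, but only for failures of the
            # attach call above, not for reserve_block_device_name timeouts.
            with excutils.save_and_reraise_exception():
                volume_bdm.destroy()
        return volume_bdm.device_name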

The problem is that if an RPC timeout occurs during steps 1-2, for example because nova-compute is busy,
an invalid BDM record with connection_info=None is left behind.
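
The upstream change listed in the Links section above (OpenStack gerrit 693537, "compute: Use long_rpc_timeout in reserve_block_device_name") addresses the timeout side by preparing this RPC call with nova's long_rpc_timeout. The following is a hedged sketch of that direction; the router/client helpers, option names and call shape mirror nova's compute rpcapi, but the snippet is illustrative rather than the exact merged patch.

    # Sketch of the fix direction from the linked gerrit change: give the
    # reserve_block_device_name call a longer timeout (CONF.long_rpc_timeout)
    # while call_monitor_timeout still detects an unresponsive compute quickly.
    # Illustrative method of nova's compute rpcapi class, not the exact patch.
    import nova.conf

    CONF = nova.conf.CONF

    def reserve_block_device_name(self, ctxt, instance, device, volume_id,
                                  disk_bus=None, device_type=None):
        client = self.router.client(ctxt)
        cctxt = client.prepare(
            server=instance.host,
            call_monitor_timeout=CONF.rpc_response_timeout,
            timeout=CONF.long_rpc_timeout)
        return cctxt.call(ctxt, 'reserve_block_device_name',
                          instance=instance, device=device,
                          volume_id=volume_id, disk_bus=disk_bus,
                          device_type=device_type)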

Once this BDM record is created, it cannot be removed through the API; the record has to be
fixed manually in the database.
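
For operators hitting this, a possible cleanup is sketched below. This is a hypothetical example using nova's objects layer rather than raw SQL, and it assumes a node with nova's configuration loaded; the report itself only states that the record has to be fixed manually in the database.

    # Hypothetical cleanup sketch: locate and soft-delete the orphaned BDM
    # through nova's objects layer instead of editing the database by hand.
    # Assumes nova's config (including the database connection) is loaded.
    from nova import context as nova_context
    from nova import objects

    objects.register_all()

    def remove_orphaned_bdm(instance_uuid, volume_id):
        ctxt = nova_context.get_admin_context()
        bdm = objects.BlockDeviceMapping.get_by_volume_and_instance(
            ctxt, volume_id, instance_uuid)
        # Only remove the record if it is the invalid one left without
        # connection info by the timed-out reserve_block_device_name call.
        if bdm.connection_info is None:
            bdm.destroy()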

How reproducible:
 Always

Steps to Reproduce:
1. Attach a volume to an instance while causing a timeout in the RPC communication between nova-api and nova-compute (for example, by temporarily lowering rpc_response_timeout on the nova-api node, or by making nova-compute busy)
2. See "openstack server show <instance>"

Actual results:
An invalid BDM record remains, and the volume added by the failed operation appears in volumes_attached

Expected results:
The invalid BDM record is deleted and there is no change in volumes_attached

Additional info:

