Description Takashi Kajinami
2019-09-17 06:22:10 UTC
Description of problem:
When attaching a volume to an instance, nova updates the bdm (block device mapping) record in the following flow:
1. nova-api requests nova-compute over RPC to run reserve_block_device_name
2. nova-compute decides the block device name and creates a new bdm record without connection info
3. nova-api sends a second request to nova-compute to perform the attach operation
4. nova-compute updates the bdm with the retrieved connection info
5. if an error such as a timeout is detected during steps 3-4, the bdm record created in step 2 is removed
https://github.com/openstack/nova/blob/stable/queens/nova/compute/api.py#L4000-L4022
The problem is that, if an RPC timeout happens during steps 1-2, for example because nova-compute is busy,
an invalid bdm record with connection_info:None is left behind.
Once this bdm record has been created, we cannot remove it through the API and have to update the bdm record
manually in the database.
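The failure mode can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not nova code: `RPCTimeout`, `FakeCompute`, `attach_volume_api` and the in-memory `db` list are hypothetical stand-ins for the RPC layer, nova-compute, the API-side attach path and the bdm table.

```python
class RPCTimeout(Exception):
    """Hypothetical stand-in for an RPC messaging timeout."""


class FakeCompute:
    """Hypothetical stand-in for nova-compute, backed by an in-memory list."""

    def __init__(self, db, slow=False):
        self.db = db
        self.slow = slow

    def reserve_block_device_name(self, instance, volume_id):
        # Step 2: the bdm record is created without connection info.
        bdm = {"instance": instance, "volume_id": volume_id,
               "connection_info": None}
        self.db.append(bdm)
        if self.slow:
            # The record already exists, but the caller gives up waiting.
            raise RPCTimeout()
        return bdm

    def attach_volume(self, bdm):
        # Step 4: the bdm is updated with connection info.
        bdm["connection_info"] = '{"driver_volume_type": "fake"}'


def attach_volume_api(compute, db, instance, volume_id):
    # Steps 1-2: no cleanup guards this call.
    bdm = compute.reserve_block_device_name(instance, volume_id)
    try:
        # Steps 3-4.
        compute.attach_volume(bdm)
    except Exception:
        # Step 5: cleanup only covers failures in steps 3-4.
        db.remove(bdm)
        raise


def demo():
    db = []
    try:
        attach_volume_api(FakeCompute(db, slow=True), db, "vm1", "vol1")
    except RPCTimeout:
        pass
    return db
```

In this sketch, `demo()` returns a list that still contains the record with `connection_info` set to `None`, which corresponds to the stale bdm described above.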
How reproducible:
Always
Steps to Reproduce:
1. Attach a volume to an instance while causing a timeout in the RPC communication between nova-api and nova-compute
2. See "openstack server show <instance>"
Actual results:
The invalid bdm record remains, and volumes_attached shows the volume added by the operation
Expected results:
The invalid bdm record is deleted, and volumes_attached shows no change
Additional info:
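One possible way to reach the expected result is to clean up on the caller side when the reserve call times out. The sketch below is an illustration only and is not necessarily how nova would fix this: `RPCTimeout`, `reserve_with_cleanup`, `make_slow_reserve` and the in-memory `db` list are all hypothetical names.

```python
class RPCTimeout(Exception):
    """Hypothetical stand-in for an RPC messaging timeout."""


def reserve_with_cleanup(reserve, db, instance, volume_id):
    """Call reserve(); on timeout, drop half-created bdm records."""
    try:
        return reserve(instance, volume_id)
    except RPCTimeout:
        # Keep every record except those for this instance/volume pair
        # that never received connection info.
        db[:] = [
            r for r in db
            if not (r["instance"] == instance
                    and r["volume_id"] == volume_id
                    and r["connection_info"] is None)
        ]
        raise


def make_slow_reserve(db):
    """Build a reserve() that creates the record and then times out."""
    def reserve(instance, volume_id):
        db.append({"instance": instance, "volume_id": volume_id,
                   "connection_info": None})
        raise RPCTimeout()
    return reserve


def demo_cleanup():
    db = []
    try:
        reserve_with_cleanup(make_slow_reserve(db), db, "vm1", "vol1")
    except RPCTimeout:
        pass
    return db
```

Here `demo_cleanup()` returns an empty list, matching the expected result. A real fix would also have to handle the case where nova-compute creates the record only after the caller has already timed out and cleaned up, which this sketch does not cover.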