Created attachment 1504725 [details]
Cinder and Nova logs

Description of problem:

Version-Release number of selected component (if applicable):
puppet-cinder-13.3.1-0.20181013114719.25b1ba3.el7ost.noarch
python2-os-brick-2.5.3-0.20180816081254.641337b.el7ost.noarch
openstack-cinder-13.0.1-0.20181013185427.31ff628.el7ost.noarch
python2-cinderclient-4.0.1-0.20180809133302.460229c.el7ost.noarch
python-cinder-13.0.1-0.20181013185427.31ff628.el7ost.noarch

How reproducible:
Unsure

Steps to Reproduce:
1. Boot an instance (AZ=nova)

2. Create a tripleo (lvm, AZ=nova) volume:
cinder create 1 --volume-type tripleo --availability-zone nova --name vol-lvm2
+--------------------------------+---------------------------------------+
| Property                       | Value                                 |
+--------------------------------+---------------------------------------+
| attachments                    | []                                    |
| availability_zone              | nova                                  |
| bootable                       | false                                 |
| consistencygroup_id            | None                                  |
| created_at                     | 2018-11-12T11:40:30.000000            |
| description                    | None                                  |
| encrypted                      | False                                 |
| id                             | 23ecbc2a-8942-4ebf-92fe-5ac9d781f74f  |
| metadata                       | {}                                    |
| migration_status               | None                                  |
| multiattach                    | False                                 |
| name                           | vol-lvm2                              |
| os-vol-host-attr:host          | hostgroup@tripleo_iscsi#tripleo_iscsi |
| os-vol-mig-status-attr:migstat | None                                  |
| os-vol-mig-status-attr:name_id | None                                  |
| os-vol-tenant-attr:tenant_id   | 50bc97ff576b4a60b81eca7830eee529      |
| replication_status             | None                                  |
| size                           | 1                                     |
| snapshot_id                    | None                                  |
| source_volid                   | None                                  |
| status                         | available                             |
| updated_at                     | 2018-11-12T11:40:31.000000            |
| user_id                        | 1015400afd2f4c429bb2241282b297c9      |
| volume_type                    | tripleo                               |
+--------------------------------+---------------------------------------+

3. Attach the volume to the instance:
nova volume-attach 2d300b8d-8805-4677-a9e0-bf33386963a7 23ecbc2a-8942-4ebf-92fe-5ac9d781f74f auto
+----------+--------------------------------------+
| Property | Value                                |
+----------+--------------------------------------+
| device   | /dev/vdb                             |
| id       | 23ecbc2a-8942-4ebf-92fe-5ac9d781f74f |
| serverId | 2d300b8d-8805-4677-a9e0-bf33386963a7 |
| volumeId | 23ecbc2a-8942-4ebf-92fe-5ac9d781f74f |
+----------+--------------------------------------+

The volume gets attached:
| 23ecbc2a-8942-4ebf-92fe-5ac9d781f74f | in-use | vol-lvm2 | 1 | tripleo | false | 2d300b8d-8805-4677-a9e0-bf33386963a7 |

4. Migrate to nfs (other AZ):
cinder migrate 23ecbc2a-8942-4ebf-92fe-5ac9d781f74f controller-0@nfs
Request to migrate volume 23ecbc2a-8942-4ebf-92fe-5ac9d781f74f has been accepted.

5. The migration works and the new nfs volume is in use, but the source volume remains available:
cinder list
+--------------------------------------+-----------+----------+------+-------------+----------+--------------------------------------+
| ID                                   | Status    | Name     | Size | Volume Type | Bootable | Attached to                          |
+--------------------------------------+-----------+----------+------+-------------+----------+--------------------------------------+
| 23ecbc2a-8942-4ebf-92fe-5ac9d781f74f | available | vol-lvm2 | 1    | tripleo     | false    |                                      |
| 73ed2a59-a0cf-4110-9b9f-091c1cb24b29 | in-use    | vol-lvm2 | 1    | tripleo     | false    | 2d300b8d-8805-4677-a9e0-bf33386963a7 |
+--------------------------------------+-----------+----------+------+-------------+----------+--------------------------------------+

Details of the volumes:

Source tripleo (lvm) volume:
cinder show 23ecbc2a-8942-4ebf-92fe-5ac9d781f74f
+--------------------------------+---------------------------------------+
| Property                       | Value                                 |
+--------------------------------+---------------------------------------+
| attached_servers               | []                                    |
| attachment_ids                 | []                                    |
| availability_zone              | nova                                  |
| bootable                       | false                                 |
| consistencygroup_id            | None                                  |
| created_at                     | 2018-11-12T11:40:30.000000            |
| description                    | None                                  |
| encrypted                      | False                                 |
| id                             | 23ecbc2a-8942-4ebf-92fe-5ac9d781f74f  |
| metadata                       |                                       |
| migration_status               | migrating                             |
| multiattach                    | False                                 |
| name                           | vol-lvm2                              |
| os-vol-host-attr:host          | hostgroup@tripleo_iscsi#tripleo_iscsi |
| os-vol-mig-status-attr:migstat | migrating                             |
| os-vol-mig-status-attr:name_id | None                                  |
| os-vol-tenant-attr:tenant_id   | 50bc97ff576b4a60b81eca7830eee529      |
| replication_status             | None                                  |
| size                           | 1                                     |
| snapshot_id                    | None                                  |
| source_volid                   | None                                  |
| status                         | available                             |
| updated_at                     | 2018-11-12T11:43:21.000000            |
| user_id                        | 1015400afd2f4c429bb2241282b297c9      |
| volume_type                    | tripleo                               |
+--------------------------------+---------------------------------------+

Target nfs volume:
cinder show 73ed2a59-a0cf-4110-9b9f-091c1cb24b29
+--------------------------------+---------------------------------------------+
| Property                       | Value                                       |
+--------------------------------+---------------------------------------------+
| attached_servers               | ['2d300b8d-8805-4677-a9e0-bf33386963a7']    |
| attachment_ids                 | ['f1eaed7a-81ec-4e3e-90d6-7cc9ac96088f']    |
| availability_zone              | dc2                                         |
| bootable                       | false                                       |
| consistencygroup_id            | None                                        |
| created_at                     | 2018-11-12T11:40:30.000000                  |
| description                    | None                                        |
| encrypted                      | False                                       |
| id                             | 73ed2a59-a0cf-4110-9b9f-091c1cb24b29        |
| metadata                       |                                             |
| migration_status               | target:23ecbc2a-8942-4ebf-92fe-5ac9d781f74f |
| multiattach                    | False                                       |
| name                           | vol-lvm2                                    |
| os-vol-host-attr:host          | controller-0@nfs#nfs                        |
| os-vol-mig-status-attr:migstat | target:23ecbc2a-8942-4ebf-92fe-5ac9d781f74f |
| os-vol-mig-status-attr:name_id | None                                        |
| os-vol-tenant-attr:tenant_id   | 50bc97ff576b4a60b81eca7830eee529            |
| replication_status             | None                                        |
| size                           | 1                                           |
| snapshot_id                    | None                                        |
| source_volid                   | None                                        |
| status                         | in-use                                      |
| updated_at                     | 2018-11-12T11:43:16.000000                  |
| user_id                        | 1015400afd2f4c429bb2241282b297c9            |
| volume_type                    | tripleo                                     |
+--------------------------------+---------------------------------------------+

Actual results:
The volume is migrated/attached, but the original volume remains available.
If I migrate an unattached volume, the source volume is deleted as expected.

Expected results:
The source volume should be deleted, as happens when migrating a non-attached volume.

Additional info:
Not sure, but this might be an os-brick issue.
Another tip: the tripleo (lvm) backend's volume_backend_name={} is empty; maybe this is related in some odd way.
A non-attached volume migrated fine back and forth between the same LVM/NFS backends.
When the volume is attached, cinder calls nova's update_server_volume so that nova can copy the data and swap the volumes. Afterwards, nova is responsible for calling cinder's migrate_volume_completion, which ultimately cleans up the source volume. The logs show nova executing a swap_volume, but there's no sign of cinder's migrate_volume_completion being invoked. Need nova folk to take a look.
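To make that handshake concrete, here is a minimal Python sketch of the control flow described above. The function names loosely mirror the real entry points (swap_volume, migrate_volume_completion), but the bodies are illustrative stand-ins, not actual cinder/nova code:

```python
# Minimal sketch of the attached-volume migration handshake.
# Names mirror the real entry points loosely; bodies are stand-ins.

completions = []

def migrate_volume_completion(src_id, dest_id):
    # Cinder's completion step: swaps the records and cleans up the
    # source volume.
    completions.append((src_id, dest_id))

def swap_volume(src_id, dest_id, is_cinder_migration):
    # Nova copies the data, then, for cinder-initiated swaps only,
    # calls back into cinder so the source gets cleaned up.
    if is_cinder_migration:
        migrate_volume_completion(src_id, dest_id)

# Healthy flow: nova recognises the cinder-initiated migration and
# the completion callback fires.
swap_volume('23ecbc2a', '73ed2a59', is_cinder_migration=True)
assert completions == [('23ecbc2a', '73ed2a59')]

# The behaviour seen in this report: the swap happens, but the
# callback never fires, so the source volume stays 'available'.
swap_volume('23ecbc2a', '73ed2a59', is_cinder_migration=False)
assert len(completions) == 1  # no completion for the second swap
```

The logs match the second case: a swap_volume with no subsequent migrate_volume_completion.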
Following Alan's comment, adding Nova's versions:
python-nova-18.0.3-0.20181011032835.d1243fe.el7ost.noarch
openstack-nova-api-18.0.3-0.20181011032835.d1243fe.el7ost.noarch
python2-novaclient-11.0.0-0.20180809174649.f1005ce.el7ost.noarch
puppet-nova-13.3.1-0.20181013120141.8ab435c.el7ost.noarch
openstack-nova-common-18.0.3-0.20181011032835.d1243fe.el7ost.noarch
python-novajoin-1.0.20-0.20181011121757.b9098eb.el7ost.noarch
Firstly, note that the design of this API is indefensible, so I won't try to defend it. It is what it is. It's also an external API, and specifically an interface between cinder and nova, which means that while not strictly set in stone, in practice changing it would take multiple releases. It's not an option here.

It looks like nova doesn't recognise this as a cinder-initiated volume migration, and consequently doesn't call the callback. In ComputeManager.swap_volume we do:

    # Yes this is a tightly-coupled state check of what's going on inside
    # cinder, but we need this while we still support old (v1/v2) and
    # new style attachments (v3.44). Once we drop support for old style
    # attachments we could think about cleaning up the cinder-initiated
    # swap volume API flows.
    is_cinder_migration = (
        True if old_volume['status'] in ('retyping', 'migrating')
        else False)

I suspect that the status of the source volume is something other than 'retyping' or 'migrating'. I can't find anywhere in the logs where we directly log volume.status. Please can you look for some evidence of what the source volume status is when calling nova's volume-update? It could be that we need to update our tightly-coupled state check.
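For reference, evaluating that check against the source volume state captured in the cinder show output above (status 'available', migration_status 'migrating') shows the miss. This is a standalone sketch, not the real ComputeManager code:

```python
# Standalone sketch: evaluate nova's current check against the source
# volume state shown in this report (not the real ComputeManager code).
old_volume = {'status': 'available', 'migration_status': 'migrating'}

is_cinder_migration = (
    True if old_volume['status'] in ('retyping', 'migrating') else False)

# The check only looks at 'status'. For a cinder-initiated migration of
# an attached volume, 'status' stays 'available'/'in-use' and only
# 'migration_status' becomes 'migrating', so the swap is treated as
# user-initiated and the completion callback is skipped.
assert is_cinder_migration is False
```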
Based on what I see at [1], cinder sets the 'migration_status' to 'migrating', not the 'status'. And based on what I see at L2370, I think your is_cinder_migration should be True when either:

- old_volume['status'] == 'retyping'
- old_volume['migration_status'] == 'migrating'

[1] https://github.com/openstack/cinder/blob/master/cinder/volume/manager.py#L2373
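A sketch of that two-condition check, exercised against the states involved (illustrative only, not the actual nova patch):

```python
# Illustrative sketch of the suggested two-condition check
# (not the actual nova patch).
def is_cinder_migration(old_volume):
    return (old_volume['status'] == 'retyping' or
            old_volume['migration_status'] == 'migrating')

# Attached cinder-initiated migration, as seen in this report:
assert is_cinder_migration(
    {'status': 'available', 'migration_status': 'migrating'})

# Retype flow is still detected via 'status':
assert is_cinder_migration(
    {'status': 'retyping', 'migration_status': None})

# A plain user-initiated swap is not treated as a cinder migration:
assert not is_cinder_migration(
    {'status': 'in-use', 'migration_status': None})
```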
The gerrit review history of this change:

Originally introduced here:
https://review.openstack.org/#/c/456971/8/nova/compute/manager.py

Pulled out of initial POC work by jgriffith here:
https://review.openstack.org/#/c/330285/50/nova/compute/manager.py@5094

It has been upstream since 16.0.0 (Pike, OSP12).
The only tempest test I can see for this is test_volume_migrate_attached, which doesn't check the state of the volumes after the migration completes.
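For what it's worth, here is a sketch of the kind of post-migration assertion that test is missing. It uses plain Python over a hypothetical volume-list shape, not the real tempest client API:

```python
# Hypothetical sketch (not the real tempest client) of the assertion
# test_volume_migrate_attached could make: once the source is cleaned
# up, exactly one volume should remain, in-use, on the destination host.
def check_migration_result(volumes, dest_host):
    assert len(volumes) == 1, 'source volume was not cleaned up'
    vol = volumes[0]
    assert vol['status'] == 'in-use'
    assert vol['host'].startswith(dest_host)

# The state captured in this report would fail the check: the source
# volume is still listed as 'available' alongside the migrated one.
observed = [
    {'id': '23ecbc2a-8942-4ebf-92fe-5ac9d781f74f', 'status': 'available',
     'host': 'hostgroup@tripleo_iscsi#tripleo_iscsi'},
    {'id': '73ed2a59-a0cf-4110-9b9f-091c1cb24b29', 'status': 'in-use',
     'host': 'controller-0@nfs#nfs'},
]
try:
    check_migration_result(observed, 'controller-0@nfs')
    caught = False
except AssertionError:
    caught = True
assert caught  # the buggy end state would be detected
```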
Fixed the wrong links - the BZ pointed to the abandoned changes.
*** This bug has been marked as a duplicate of bug 1670344 ***