Bug 1648931

Summary: Migrating an attached volume to other AZ leaves original volume intact.
Product: Red Hat OpenStack
Reporter: Tzach Shefi <tshefi>
Component: openstack-nova
Assignee: OSP DFG:Compute <osp-dfg-compute>
Status: CLOSED DUPLICATE
QA Contact: OSP DFG:Compute <osp-dfg-compute>
Severity: high
Docs Contact:
Priority: medium
Version: 14.0 (Rocky)
CC: abishop, dasmith, eglynn, jhakimra, kchamart, lyarwood, mbooth, mkopec, sbauza, sgordon, vromanso
Target Milestone: ---
Keywords: Triaged, ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Last Closed: 2019-06-13 09:36:36 UTC
Type: Bug
Attachments:
Cinder and Nova logs (flags: none)

Description Tzach Shefi 2018-11-12 13:31:17 UTC
Created attachment 1504725
Cinder and Nova logs

Description of problem: 


Version-Release number of selected component (if applicable):
puppet-cinder-13.3.1-0.20181013114719.25b1ba3.el7ost.noarch
python2-os-brick-2.5.3-0.20180816081254.641337b.el7ost.noarch
openstack-cinder-13.0.1-0.20181013185427.31ff628.el7ost.noarch
python2-cinderclient-4.0.1-0.20180809133302.460229c.el7ost.noarch
python-cinder-13.0.1-0.20181013185427.31ff628.el7ost.noarch


How reproducible:
Unsure

Steps to Reproduce:
1. Boot an instance (AZ=nova)
2. Create a tripleo (lvm, AZ=nova) volume
cinder create 1 --volume-type tripleo   --availability-zone nova --name vol-lvm2
+--------------------------------+---------------------------------------+
| Property                       | Value                                 |
+--------------------------------+---------------------------------------+
| attachments                    | []                                    |
| availability_zone              | nova                                  |
| bootable                       | false                                 |
| consistencygroup_id            | None                                  |
| created_at                     | 2018-11-12T11:40:30.000000            |
| description                    | None                                  |
| encrypted                      | False                                 |
| id                             | 23ecbc2a-8942-4ebf-92fe-5ac9d781f74f  |
| metadata                       | {}                                    |
| migration_status               | None                                  |
| multiattach                    | False                                 |
| name                           | vol-lvm2                              |
| os-vol-host-attr:host          | hostgroup@tripleo_iscsi#tripleo_iscsi |
| os-vol-mig-status-attr:migstat | None                                  |
| os-vol-mig-status-attr:name_id | None                                  |
| os-vol-tenant-attr:tenant_id   | 50bc97ff576b4a60b81eca7830eee529      |
| replication_status             | None                                  |
| size                           | 1                                     |
| snapshot_id                    | None                                  |
| source_volid                   | None                                  |
| status                         | available                             |
| updated_at                     | 2018-11-12T11:40:31.000000            |
| user_id                        | 1015400afd2f4c429bb2241282b297c9      |
| volume_type                    | tripleo                               |
+--------------------------------+---------------------------------------+

3. Attach vol to instance
nova volume-attach 2d300b8d-8805-4677-a9e0-bf33386963a7 23ecbc2a-8942-4ebf-92fe-5ac9d781f74f auto 
+----------+--------------------------------------+
| Property | Value                                |
+----------+--------------------------------------+
| device   | /dev/vdb                             |
| id       | 23ecbc2a-8942-4ebf-92fe-5ac9d781f74f |
| serverId | 2d300b8d-8805-4677-a9e0-bf33386963a7 |
| volumeId | 23ecbc2a-8942-4ebf-92fe-5ac9d781f74f |
+----------+--------------------------------------+

The volume gets attached:
| 23ecbc2a-8942-4ebf-92fe-5ac9d781f74f | in-use    | vol-lvm2 | 1    | tripleo     | false    | 2d300b8d-8805-4677-a9e0-bf33386963a7


4. Migrate to nfs (other AZ)
cinder migrate 23ecbc2a-8942-4ebf-92fe-5ac9d781f74f controller-0@nfs 
Request to migrate volume 23ecbc2a-8942-4ebf-92fe-5ac9d781f74f has been accepted. 

5. Migration works, new nfs vol is in use, but source vol remains available.


cinder list
+--------------------------------------+-----------+----------+------+-------------+----------+--------------------------------------+
| ID                                   | Status    | Name     | Size | Volume Type | Bootable | Attached to                          |
+--------------------------------------+-----------+----------+------+-------------+----------+--------------------------------------+
| 23ecbc2a-8942-4ebf-92fe-5ac9d781f74f | available | vol-lvm2 | 1    | tripleo     | false    |                                      |
| 73ed2a59-a0cf-4110-9b9f-091c1cb24b29 | in-use    | vol-lvm2 | 1    | tripleo     | false    | 2d300b8d-8805-4677-a9e0-bf33386963a7 |
+--------------------------------------+-----------+----------+------+-------------+----------+--------------------------------------+


Details of volumes:
Source tripleo lvm vol:
cinder show 23ecbc2a-8942-4ebf-92fe-5ac9d781f74f                                                
+--------------------------------+---------------------------------------+                                                          
| Property                       | Value                                 |                                                          
+--------------------------------+---------------------------------------+                                                          
| attached_servers               | []                                    |                                                          
| attachment_ids                 | []                                    |                                                          
| availability_zone              | nova                                  |                                                          
| bootable                       | false                                 |                                                          
| consistencygroup_id            | None                                  |                                                          
| created_at                     | 2018-11-12T11:40:30.000000            |                                                          
| description                    | None                                  |                                                          
| encrypted                      | False                                 |
| id                             | 23ecbc2a-8942-4ebf-92fe-5ac9d781f74f  |
| metadata                       |                                       |
| migration_status               | migrating                             |
| multiattach                    | False                                 |
| name                           | vol-lvm2                              |
| os-vol-host-attr:host          | hostgroup@tripleo_iscsi#tripleo_iscsi |
| os-vol-mig-status-attr:migstat | migrating                             |
| os-vol-mig-status-attr:name_id | None                                  |
| os-vol-tenant-attr:tenant_id   | 50bc97ff576b4a60b81eca7830eee529      |
| replication_status             | None                                  |
| size                           | 1                                     |
| snapshot_id                    | None                                  |
| source_volid                   | None                                  |
| status                         | available                             |
| updated_at                     | 2018-11-12T11:43:21.000000            |
| user_id                        | 1015400afd2f4c429bb2241282b297c9      |
| volume_type                    | tripleo                               |
+--------------------------------+---------------------------------------+

Target nfs volume:
cinder show 73ed2a59-a0cf-4110-9b9f-091c1cb24b29
+--------------------------------+---------------------------------------------+
| Property                       | Value                                       |
+--------------------------------+---------------------------------------------+
| attached_servers               | ['2d300b8d-8805-4677-a9e0-bf33386963a7']    |
| attachment_ids                 | ['f1eaed7a-81ec-4e3e-90d6-7cc9ac96088f']    |
| availability_zone              | dc2                                         |
| bootable                       | false                                       |
| consistencygroup_id            | None                                        |
| created_at                     | 2018-11-12T11:40:30.000000                  |
| description                    | None                                        |
| encrypted                      | False                                       |
| id                             | 73ed2a59-a0cf-4110-9b9f-091c1cb24b29        |
| metadata                       |                                             |
| migration_status               | target:23ecbc2a-8942-4ebf-92fe-5ac9d781f74f |
| multiattach                    | False                                       |
| name                           | vol-lvm2                                    |
| os-vol-host-attr:host          | controller-0@nfs#nfs                        |
| os-vol-mig-status-attr:migstat | target:23ecbc2a-8942-4ebf-92fe-5ac9d781f74f |
| os-vol-mig-status-attr:name_id | None                                        |
| os-vol-tenant-attr:tenant_id   | 50bc97ff576b4a60b81eca7830eee529            |
| replication_status             | None                                        |
| size                           | 1                                           |
| snapshot_id                    | None                                        |
| source_volid                   | None                                        |
| status                         | in-use                                      |
| updated_at                     | 2018-11-12T11:43:16.000000                  |
| user_id                        | 1015400afd2f4c429bb2241282b297c9            |
| volume_type                    | tripleo                                     |
+--------------------------------+---------------------------------------------+

 


Actual results:
The volume is migrated and attached, but the original volume remains available.
If I migrate an unattached volume, the source volume gets deleted as expected.

Expected results:
The source volume should be deleted, as happens when migrating an unattached volume.

Additional info:
Not sure, but this might be an os-brick issue.
Another hint: the tripleo (lvm) backend's volume_backend_name={} is empty; maybe this is related in some odd way. An unattached volume migrated fine back and forth between the same LVM and NFS backends.

Comment 2 Alan Bishop 2018-11-12 17:27:46 UTC
When the volume is attached, cinder calls nova's update_server_volume so that nova can copy the data and swap the volumes. Afterwards, nova is responsible for calling cinder's migrate_volume_completion, which ultimately cleans up the source volume.

The logs show nova executing a swap_volume, but there's no sign of cinder's migrate_volume_completion being invoked. Need nova folk to take a look.
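
The hand-off described above can be sketched as a toy model. This is a heavily simplified illustration, not cinder or nova code: the classes ToyCinder and ToyNova and the calls_back switch are hypothetical, and only the method names update_server_volume and migrate_volume_completion come from this comment.

```python
class ToyCinder:
    """Toy stand-in for cinder; it only tracks which volumes exist."""

    def __init__(self):
        self.volumes = {}

    def migrate_volume_completion(self, source_id, target_id):
        # Cleanup step: the source volume goes away once the swap is done.
        self.volumes.pop(source_id, None)


class ToyNova:
    """Toy stand-in for nova's side of the swap."""

    def __init__(self, cinder, calls_back):
        self.cinder = cinder
        # Hypothetical switch: whether nova recognises the request as a
        # cinder-initiated migration and therefore calls cinder back.
        self.calls_back = calls_back

    def update_server_volume(self, source_id, target_id):
        # Stand-in for "copy the data and swap the volumes".
        self.cinder.volumes[target_id]['status'] = 'in-use'
        self.cinder.volumes[source_id]['status'] = 'available'
        if self.calls_back:
            self.cinder.migrate_volume_completion(source_id, target_id)


def migrate_attached(calls_back):
    """Run one attached-volume migration and return the surviving IDs."""
    cinder = ToyCinder()
    cinder.volumes = {'src': {'status': 'in-use'},
                      'dst': {'status': 'available'}}
    ToyNova(cinder, calls_back).update_server_volume('src', 'dst')
    return sorted(cinder.volumes)


print(migrate_attached(calls_back=True))   # ['dst']
print(migrate_attached(calls_back=False))  # ['dst', 'src']
```

The second call mirrors what the report shows: the swap succeeds, but without the callback the source volume is left behind in the available state.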

Comment 3 Tzach Shefi 2018-11-13 06:29:23 UTC
Following Alan's comment, adding the Nova package versions.

python-nova-18.0.3-0.20181011032835.d1243fe.el7ost.noarch
openstack-nova-api-18.0.3-0.20181011032835.d1243fe.el7ost.noarch
python2-novaclient-11.0.0-0.20180809174649.f1005ce.el7ost.noarch
puppet-nova-13.3.1-0.20181013120141.8ab435c.el7ost.noarch
openstack-nova-common-18.0.3-0.20181011032835.d1243fe.el7ost.noarch
python-novajoin-1.0.20-0.20181011121757.b9098eb.el7ost.noarch

Comment 4 Matthew Booth 2018-11-16 12:06:45 UTC
Firstly, note that the design of this API is indefensible, so I won't try to defend it. It is what it is. It's also an external API, and specifically an interface between cinder and nova, which means that while not strictly set in stone, in practice changing it would take multiple releases. It's not an option here.

It looks like Nova doesn't recognise it as a cinder-initiated volume migration, and consequently doesn't call the callback. In ComputeManager.swap_volume we do:

        # Yes this is a tightly-coupled state check of what's going on inside
        # cinder, but we need this while we still support old (v1/v2) and
        # new style attachments (v3.44). Once we drop support for old style
        # attachments we could think about cleaning up the cinder-initiated
        # swap volume API flows.
        is_cinder_migration = (
            True if old_volume['status'] in ('retyping',
                                             'migrating') else False)

I suspect that the status of the source volume is something other than 'retyping' or 'migrating'. I can't find anywhere in the logs that we directly log volume.status. Please can you look for some evidence of what the source volume status is when calling nova's volume-update? It could be that we need to update our tightly-coupled state check.

Comment 5 Alan Bishop 2018-11-16 13:15:54 UTC
Based on what I see at [1], cinder sets the 'migration_status' to 'migrating', not the 'status'. And based on what I see at L2370, I think your is_cinder_migration should be True when either:

- old_volume['status'] == 'retyping'
- old_volume['migration_status'] == 'migrating'

[1] https://github.com/openstack/cinder/blob/master/cinder/volume/manager.py#L2373
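
A minimal sketch of the two checks, for comparison. The function names are hypothetical and this is not the merged fix. The field values are copied from the 'cinder show' output for the source volume in the description, which was captured after the swap; the exact status at the time nova was called is not logged, so treat the dict as an assumption.

```python
def is_cinder_migration_current(old_volume):
    """The check quoted in comment 4: it looks only at 'status'."""
    return old_volume['status'] in ('retyping', 'migrating')


def is_cinder_migration_proposed(old_volume):
    """The check suggested in this comment: also look at 'migration_status'."""
    return (old_volume['status'] == 'retyping' or
            old_volume.get('migration_status') == 'migrating')


# Values taken from the 'cinder show' output of the source volume.
source_volume = {
    'id': '23ecbc2a-8942-4ebf-92fe-5ac9d781f74f',
    'status': 'available',
    'migration_status': 'migrating',
}

print(is_cinder_migration_current(source_volume))   # False
print(is_cinder_migration_proposed(source_volume))  # True
```

With these values the status-only check never fires, which would explain why the completion callback is skipped for a plain migration while a retype is still detected.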

Comment 6 Matthew Booth 2018-11-16 14:18:34 UTC
Gerrit review history of this change. It was originally introduced here:

https://review.openstack.org/#/c/456971/8/nova/compute/manager.py

Pulled out of initial POC work by jgriffith here:

https://review.openstack.org/#/c/330285/50/nova/compute/manager.py@5094

It has been upstream since 16.0.0 (Pike, OSP12).

Comment 7 Matthew Booth 2018-11-16 14:26:02 UTC
The only tempest test I can see for this is test_volume_migrate_attached, which doesn't test the resulting state of the migrated volume after migration.
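
As a rough idea of what a post-migration assertion could look at, here is a sketch over plain dicts shaped like the 'cinder show' output in the description. The helper post_migration_problems is hypothetical and does not use any tempest API. Treating 'success' as the expected final migration_status is an assumption.

```python
def post_migration_problems(volumes, name):
    """Return human-readable problems found after a migration finishes."""
    problems = []
    matching = [v for v in volumes if v['name'] == name]
    if len(matching) != 1:
        problems.append('expected 1 volume named %s, found %d'
                        % (name, len(matching)))
    for vol in matching:
        mig = vol.get('migration_status')
        # Assumption: a finished migration leaves None or 'success' here.
        if mig not in (None, 'success'):
            problems.append('volume %s still has migration_status %s'
                            % (vol['id'], mig))
    return problems


# State reported in the description after migrating the attached volume.
reported = [
    {'id': '23ecbc2a-8942-4ebf-92fe-5ac9d781f74f', 'name': 'vol-lvm2',
     'status': 'available', 'migration_status': 'migrating'},
    {'id': '73ed2a59-a0cf-4110-9b9f-091c1cb24b29', 'name': 'vol-lvm2',
     'status': 'in-use',
     'migration_status': 'target:23ecbc2a-8942-4ebf-92fe-5ac9d781f74f'},
]

for problem in post_migration_problems(reported, 'vol-lvm2'):
    print(problem)
```

Run against the reported state, this flags the duplicate volume name and both unfinished migration markers, which is the kind of check the existing test does not make.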

Comment 10 Martin Kopec 2019-06-06 20:47:55 UTC
Fixed the wrong links: the BZ pointed to abandoned changes.

Comment 12 Lee Yarwood 2019-06-13 09:36:36 UTC

*** This bug has been marked as a duplicate of bug 1670344 ***