Bug 1655490
| Summary: | Cinder retype does not use driver assisted volume migration | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Gregory Charot <gcharot> |
| Component: | openstack-cinder | Assignee: | Gorka Eguileor <geguileo> |
| Status: | CLOSED ERRATA | QA Contact: | Tzach Shefi <tshefi> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 14.0 (Rocky) | CC: | abishop, eharney, geguileo, gfidente, jobernar, ltoscano, nlevinki, pgrist, scohen, senrique, tshefi |
| Target Milestone: | Alpha | Keywords: | Reopened, Triaged |
| Target Release: | 17.0 | ||
| Hardware: | x86_64 | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | openstack-cinder-18.2.1-0.20220605050357.9a473fd.el9ost | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-09-21 12:07:43 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Gregory Charot
2018-12-03 09:55:17 UTC
I believe this is a limitation of how the features were designed.
Currently the driver optimized retype is only called if the volume is not encrypted and the backend doesn't change.
When the backend is different and migration is enabled, we call the normal migration but tell it the volume has a new type, and this prevents the manager from calling the driver optimized migration because of this check in the volume manager's `migrate_volume` method:
    if not force_host_copy and new_type_id is None:
This is because a driver retype doesn't migrate a volume, and a migration doesn't retype a volume.
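For context, here is a simplified, hypothetical sketch of that decision (paraphrased, not verbatim cinder code):

```python
# Simplified, hypothetical sketch of the decision in cinder's
# VolumeManager.migrate_volume (paraphrased, not verbatim cinder code).
def choose_migration_path(force_host_copy, new_type_id,
                          driver_migrate, generic_migrate):
    moved = False
    if not force_host_copy and new_type_id is None:
        # Optimized path: the driver moves the volume itself (e.g. an RBD
        # same-cluster copy) and reports moved=False if it cannot help.
        moved, _model_update = driver_migrate()
    if not moved:
        # Generic host-copy migration. During a retype new_type_id is set,
        # so this branch is always taken, which is this bug.
        generic_migrate(new_type_id)
```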
From a quick look I see 2 alternatives:
- Allow the driver to opt in to a two-step process: optimized migration followed by a driver-specific retype.
- Call optimized migration on retype when the only difference between the types is the destination backend.
In my opinion we should enable the second option in any case, and it shouldn't be too complicated.
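A minimal sketch of the check the second option needs (a hypothetical helper, not the actual upstream patch):

```python
# Hypothetical helper (not the actual upstream patch): a retype can be
# handed to driver assisted migration when the only extra-spec difference
# between the two volume types is the target backend.
def retype_is_pure_migration(old_specs, new_specs):
    def strip(specs):
        return {k: v for k, v in specs.items() if k != 'volume_backend_name'}
    return strip(old_specs) == strip(new_specs)

# A type with no extra specs and a type that only sets volume_backend_name
# differ only in the backend, so the retype qualifies.
assert retype_is_pure_migration({}, {'volume_backend_name': 'backend2'})
# Any other extra-spec difference disqualifies it.
assert not retype_is_pure_migration({'compression': 'true'},
                                    {'volume_backend_name': 'backend2'})
```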
I have opened a bug upstream for the second case I mentioned in comment #9, and I have proposed a fix that should resolve most of the use cases: it will call driver assisted migration when the volume types change only the backend, which covers almost all of the cases described in this BZ's scenario.

Verified on:
openstack-cinder-18.2.1-0.20220605050357.9a473fd.el9ost.noarch
1. Defined two backends, each using its own dedicated Ceph backend/RBD pool:
(overcloud) [stack@undercloud-0 ~]$ cinder type-list
+--------------------------------------+-----------------+-------------+-----------+
| ID | Name | Description | Is_Public |
+--------------------------------------+-----------------+-------------+-----------+
| aeb5337d-7090-42d7-9067-29f99b336066 | tripleo_default | - | True |
| ec3de734-a560-4c3e-b442-2c301b1c83b6 | tripleo2 | - | True |
+--------------------------------------+-----------------+-------------+-----------+
(overcloud) [stack@undercloud-0 ~]$ cinder extra-specs-list
+--------------------------------------+-----------------+----------------------------------------------+
| ID | Name | extra_specs |
+--------------------------------------+-----------------+----------------------------------------------+
| aeb5337d-7090-42d7-9067-29f99b336066 | tripleo_default | {} | -> the default type uses the default "tripleo_ceph" backend
| ec3de734-a560-4c3e-b442-2c301b1c83b6 | tripleo2 | {'volume_backend_name': 'tripleo_ceph_vol2'} |
+--------------------------------------+-----------------+----------------------------------------------+
Cinder service-list:
..
| cinder-volume | hostgroup@tripleo_ceph | nova | enabled | up | 2022-07-07T12:08:26.000000 | - |
| cinder-volume | hostgroup@tripleo_ceph_vol2 | nova | enabled | up | 2022-07-07T12:08:26.000000 | - |
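For reference, the same two types can also be created programmatically; here is a hedged python-cinderclient sketch (the Keystone endpoint and credentials are placeholders; only the type names and the extra spec match the listing above):

```python
# Hedged sketch using python-cinderclient; the Keystone endpoint and
# credentials below are placeholders for this environment.
from keystoneauth1.identity import v3
from keystoneauth1 import session
from cinderclient import client

auth = v3.Password(auth_url='http://keystone:5000/v3',   # placeholder
                   username='admin', password='secret',  # placeholders
                   project_name='admin',
                   user_domain_id='default', project_domain_id='default')
cinder = client.Client('3', session=session.Session(auth=auth))

# tripleo_default has no extra specs, so it lands on the default backend.
cinder.volume_types.create('tripleo_default')

# tripleo2 pins volumes to the second RBD backend via volume_backend_name.
tripleo2 = cinder.volume_types.create('tripleo2')
tripleo2.set_keys({'volume_backend_name': 'tripleo_ceph_vol2'})
```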
2. Created two volumes, one on each backend:
(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+----------------------+------+-----------------+----------+-------------+
| ID | Status | Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+----------------------+------+-----------------+----------+-------------+
| 2d627d92-f101-41ab-82e2-c1bfbec4ebc0 | available | tripleo_default_volA | 1 | tripleo_default | false | |
| fe5fd6b7-ade9-4171-9db1-c3392a996a4f | available | tripleo2_volB | 1 | tripleo2 | false | |
+--------------------------------------+-----------+----------------------+------+-----------------+----------+-------------+
3. Retyped each volume to the other type/backend:
(overcloud) [stack@undercloud-0 ~]$ cinder retype --migration-policy on-demand tripleo_default_volA tripleo2
And the second volume:
(overcloud) [stack@undercloud-0 ~]$ cinder retype --migration-policy on-demand tripleo2_volB tripleo_default
(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+----------------------+------+-----------------+----------+-------------+
| ID | Status | Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+----------------------+------+-----------------+----------+-------------+
| 2d627d92-f101-41ab-82e2-c1bfbec4ebc0 | available | tripleo_default_volA | 1 | tripleo2 | false | |
| fe5fd6b7-ade9-4171-9db1-c3392a996a4f | available | tripleo2_volB | 1 | tripleo_default | false | |
+--------------------------------------+-----------+----------------------+------+-----------------+----------+-------------+
As can be seen, both volumes were retyped and switched over to the other volume type/backend.
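The same retypes can be issued programmatically; a hedged sketch using python-cinderclient (reusing the `cinder` client object from the sketch above):

```python
# Hedged sketch: retype with on-demand migration, mirroring the CLI
# commands above (`cinder` is the client object from the earlier sketch).
vol_a = cinder.volumes.find(name='tripleo_default_volA')
cinder.volumes.retype(vol_a, 'tripleo2', 'on-demand')

vol_b = cinder.volumes.find(name='tripleo2_volB')
cinder.volumes.retype(vol_b, 'tripleo_default', 'on-demand')
```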
Checking the logs, here is one of the migrations completing with "Successful RBD assisted volume migration":
2022-07-07 12:13:05.456 11 DEBUG cinder.volume.manager [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] Issue driver.migrate_volume. migrate_volume /usr/lib/python3.9/site-packages/cinder/volume/manager.py:2609
2022-07-07 12:13:05.457 11 DEBUG cinder.volume.drivers.rbd [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] Attempting RBD assisted volume migration. volume: 2d627d92-f101-41ab-82e2-c1bfbec4ebc0, host: {'host': 'hostgroup@tripleo_ceph_vol2#tripleo_ceph_vol2', 'cluster_name': None, 'capabilities': {'vendor_name': 'Open Source', 'driver_version': '1.2.0', 'storage_protocol': 'ceph', 'total_capacity_gb': 133.06, 'free_capacity_gb': 133.06, 'reserved_percentage': 0, 'multiattach': True, 'thin_provisioning_support': True, 'max_over_subscription_ratio': '20.0', 'location_info': 'ceph:/etc/ceph/ceph.conf:fd7ad824-a913-5b2f-acfc-fabb670e0ebc:openstack:vol2', 'backend_state': 'up', 'volume_backend_name': 'tripleo_ceph_vol2', 'replication_enabled': False, 'allocated_capacity_gb': 1, 'filter_function': None, 'goodness_function': None, 'timestamp': '2022-07-07T12:12:45.394331'}}, status=retyping. migrate_volume /usr/lib/python3.9/site-packages/cinder/volume/drivers/rbd.py:1924
2022-07-07 12:13:05.458 11 DEBUG os_brick.initiator.linuxrbd [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] opening connection to ceph cluster (timeout=-1). connect /usr/lib/python3.9/site-packages/os_brick/initiator/linuxrbd.py:70
2022-07-07 12:13:05.482 11 DEBUG cinder.volume.drivers.rbd [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] connecting to openstack@ceph (conf=/etc/ceph/ceph.conf, timeout=-1). _do_conn /usr/lib/python3.9/site-packages/cinder/volume/drivers/rbd.py:480
2022-07-07 12:13:05.504 11 DEBUG cinder.volume.drivers.rbd [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] connecting to openstack@ceph (conf=/etc/ceph/ceph.conf, timeout=-1). _do_conn /usr/lib/python3.9/site-packages/cinder/volume/drivers/rbd.py:480
2022-07-07 12:13:05.689 11 DEBUG cinder.volume.drivers.rbd [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] connecting to openstack@ceph (conf=/etc/ceph/ceph.conf, timeout=-1). _do_conn /usr/lib/python3.9/site-packages/cinder/volume/drivers/rbd.py:480
2022-07-07 12:13:05.731 11 DEBUG cinder.volume.drivers.rbd [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] volume has no backup snaps _delete_backup_snaps /usr/lib/python3.9/site-packages/cinder/volume/drivers/rbd.py:1104
2022-07-07 12:13:05.732 11 DEBUG cinder.volume.drivers.rbd [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] Volume volume-2d627d92-f101-41ab-82e2-c1bfbec4ebc0 is not a clone. _get_clone_info /usr/lib/python3.9/site-packages/cinder/volume/drivers/rbd.py:1127
2022-07-07 12:13:05.736 11 DEBUG cinder.volume.drivers.rbd [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] deleting rbd volume volume-2d627d92-f101-41ab-82e2-c1bfbec4ebc0 delete_volume /usr/lib/python3.9/site-packages/cinder/volume/drivers/rbd.py:1247
2022-07-07 12:13:05.863 11 INFO cinder.volume.drivers.rbd [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] Successful RBD assisted volume migration.
2022-07-07 12:13:05.879 11 INFO cinder.volume.manager [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] Migrate volume completed successfully.
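The log sequence matches the same-cluster path of RBD assisted migration: the driver parses the destination's `location_info`, checks that the cluster fsid matches its own, copies the image into the destination pool, and deletes the source image. A hedged sketch of that check (paraphrased, not verbatim driver code):

```python
# Hedged sketch (not verbatim cinder code) of the same-cluster check behind
# "Attempting RBD assisted volume migration" in the log above.
def can_rbd_assist(location_info, source_fsid):
    # Format seen in the log:
    # 'ceph:/etc/ceph/ceph.conf:<cluster fsid>:<rados user>:<dest pool>'
    _driver, _conf, dest_fsid, _user, _dest_pool = location_info.split(':')
    return dest_fsid == source_fsid

# The destination pool 'vol2' is in the same Ceph cluster as the source,
# so the driver copies the image there and deletes the source image.
assert can_rbd_assist(
    'ceph:/etc/ceph/ceph.conf:fd7ad824-a913-5b2f-acfc-fabb670e0ebc:openstack:vol2',
    'fd7ad824-a913-5b2f-acfc-fabb670e0ebc')
```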
Good to verify; this time the RBD assisted driver migration was used rather than the generic migration used previously.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Release of components for Red Hat OpenStack Platform 17.0 (Wallaby)), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2022:6543