Description of problem:
OSP14 introduces the RBD Cinder migration driver (BZ 1262068). When performing a cinder migrate, the driver is indeed used; however, when doing a cinder retype it is not, and the generic driver is used instead (the volume is migrated through the controllers).

Version-Release number of selected component (if applicable): 14

How reproducible: Always

Steps to Reproduce:
1. Define two Cinder volume types pointing to two different Cinder backends hosted on the same Ceph cluster (different pools):

cinder type-list
+--------------------------------------+----------+----------------------+-----------+
| ID                                   | Name     | Description          | Is_Public |
+--------------------------------------+----------+----------------------+-----------+
| 7b51de01-37c1-419b-85ca-dd0de7df3b2e | fast     | Fast Volume Type     | True      |
| ec995825-5232-417b-aca6-3aab621c0f7f | standard | Standard Volume Type | True      |
+--------------------------------------+----------+----------------------+-----------+

2. Do a cinder retype:

cinder retype --migration-policy on-demand test fast

3. The migration process uses the generic driver.

Actual results:
Migration is not driver assisted.

Expected results:
Migration should use the relevant (RBD in this case) driver.

Additional info:
Cinder volume debug logs: http://pastebin.test.redhat.com/677486
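For anyone scripting the reproduction, here is a minimal sketch of the same two-type setup using python-cinderclient; the Keystone session and the backend names are assumptions for illustration, not taken from this report:

from cinderclient import client

# Assumes an already-authenticated keystoneauth1 session; adjust to your environment.
cinder = client.Client('3', session=keystone_session)

# Two types, each pinned to a different backend via volume_backend_name.
# The backend names below are placeholders for the two RBD backends/pools.
fast = cinder.volume_types.create(name='fast', description='Fast Volume Type')
fast.set_keys({'volume_backend_name': 'ceph_fast'})

standard = cinder.volume_types.create(name='standard', description='Standard Volume Type')
standard.set_keys({'volume_backend_name': 'ceph_standard'})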
I believe this is a limitation of how the features were designed. Currently the driver-optimized retype is only called if the volume is not encrypted and the backend doesn't change. When the backend is different and migrations are enabled, we call the normal migration but tell it that it has a new type, and this prevents the manager from calling the driver-optimized migration because of this check in the volume manager's `migrate_volume` method:

if not force_host_copy and new_type_id is None:

This is because a driver retype doesn't migrate a volume, and a migration doesn't retype a volume. From a quick look I see 2 alternatives:

- Allow drivers to opt in to a 2-step process: optimized migration followed by a driver-specific retype.
- Call optimized migration on retype when the only difference between the types is the destination backend.

In my opinion we should enable the second option in any case, and it shouldn't be too complicated.
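As an illustration of that second option (a rough sketch only, not the actual upstream patch; the helper name and its placement are hypothetical), the retype path could strip volume_backend_name from both types' extra specs and, if everything else matches, request the migration without a new type so the check above no longer blocks the driver-assisted path:

def _types_differ_only_in_backend(old_specs, new_specs):
    """Illustrative helper: True when two volume types' extra specs are
    identical apart from volume_backend_name."""
    def strip_backend(specs):
        return {k: v for k, v in specs.items() if k != 'volume_backend_name'}
    return strip_backend(old_specs) == strip_backend(new_specs)


# Hypothetical use inside the retype flow: when only the backend changes,
# pass no new_type_id so "if not force_host_copy and new_type_id is None"
# in migrate_volume evaluates True and driver.migrate_volume gets a chance.
if _types_differ_only_in_backend(old_type['extra_specs'], new_type['extra_specs']):
    new_type_id = None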
I have opened a bug upstream for the second option I mentioned in comment #9 and have proposed a fix that should resolve most of the use cases: it calls driver-assisted migration when the volume types only change the backend, which covers almost all the cases described in this BZ's scenario.
Verified on:
openstack-cinder-18.2.1-0.20220605050357.9a473fd.el9ost.noarch

1. Defined two backends, each using its own dedicated Ceph backend/RBD pool:

(overcloud) [stack@undercloud-0 ~]$ cinder type-list
+--------------------------------------+-----------------+-------------+-----------+
| ID                                   | Name            | Description | Is_Public |
+--------------------------------------+-----------------+-------------+-----------+
| aeb5337d-7090-42d7-9067-29f99b336066 | tripleo_default | -           | True      |
| ec3de734-a560-4c3e-b442-2c301b1c83b6 | tripleo2        | -           | True      |
+--------------------------------------+-----------------+-------------+-----------+

(overcloud) [stack@undercloud-0 ~]$ cinder extra-specs-list
+--------------------------------------+-----------------+----------------------------------------------+
| ID                                   | Name            | extra_specs                                  |
+--------------------------------------+-----------------+----------------------------------------------+
| aeb5337d-7090-42d7-9067-29f99b336066 | tripleo_default | {}                                           |  -> the default type uses the default "tripleo_ceph" backend
| ec3de734-a560-4c3e-b442-2c301b1c83b6 | tripleo2        | {'volume_backend_name': 'tripleo_ceph_vol2'} |
+--------------------------------------+-----------------+----------------------------------------------+

Cinder service-list:
..
| cinder-volume | hostgroup@tripleo_ceph      | nova | enabled | up | 2022-07-07T12:08:26.000000 | - |
| cinder-volume | hostgroup@tripleo_ceph_vol2 | nova | enabled | up | 2022-07-07T12:08:26.000000 | - |

2. Created two volumes, one on each backend:

(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+----------------------+------+-----------------+----------+-------------+
| ID                                   | Status    | Name                 | Size | Volume Type     | Bootable | Attached to |
+--------------------------------------+-----------+----------------------+------+-----------------+----------+-------------+
| 2d627d92-f101-41ab-82e2-c1bfbec4ebc0 | available | tripleo_default_volA | 1    | tripleo_default | false    |             |
| fe5fd6b7-ade9-4171-9db1-c3392a996a4f | available | tripleo2_volB        | 1    | tripleo2        | false    |             |
+--------------------------------------+-----------+----------------------+------+-----------------+----------+-------------+

3. Now retype each one to the other type/backend:

(overcloud) [stack@undercloud-0 ~]$ cinder retype --migration-policy on-demand tripleo_default_volA tripleo2

And the second one too:

(overcloud) [stack@undercloud-0 ~]$ cinder retype --migration-policy on-demand tripleo2_volB tripleo_default

(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+----------------------+------+-----------------+----------+-------------+
| ID                                   | Status    | Name                 | Size | Volume Type     | Bootable | Attached to |
+--------------------------------------+-----------+----------------------+------+-----------------+----------+-------------+
| 2d627d92-f101-41ab-82e2-c1bfbec4ebc0 | available | tripleo_default_volA | 1    | tripleo2        | false    |             |
| fe5fd6b7-ade9-4171-9db1-c3392a996a4f | available | tripleo2_volB        | 1    | tripleo_default | false    |             |
+--------------------------------------+-----------+----------------------+------+-----------------+----------+-------------+

As can be seen, both volumes were retyped and switched over to the other volume type/backend.
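The same retype can also be driven programmatically, for example with python-cinderclient (a sketch that assumes an existing, authenticated Keystone session; the volume and type names are the ones from the listing above):

from cinderclient import client

cinder = client.Client('3', session=keystone_session)  # assumed pre-authenticated session

# The 'on-demand' policy lets Cinder migrate the volume to the other backend,
# which is what exercises the RBD assisted migration path.
vol = cinder.volumes.find(name='tripleo_default_volA')
cinder.volumes.retype(vol, 'tripleo2', 'on-demand')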
Checking the logs, here is one of the retypes ending with "Successful RBD assisted volume migration":

2022-07-07 12:13:05.456 11 DEBUG cinder.volume.manager [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] Issue driver.migrate_volume. migrate_volume /usr/lib/python3.9/site-packages/cinder/volume/manager.py:2609
2022-07-07 12:13:05.457 11 DEBUG cinder.volume.drivers.rbd [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] Attempting RBD assisted volume migration. volume: 2d627d92-f101-41ab-82e2-c1bfbec4ebc0, host: {'host': 'hostgroup@tripleo_ceph_vol2#tripleo_ceph_vol2', 'cluster_name': None, 'capabilities': {'vendor_name': 'Open Source', 'driver_version': '1.2.0', 'storage_protocol': 'ceph', 'total_capacity_gb': 133.06, 'free_capacity_gb': 133.06, 'reserved_percentage': 0, 'multiattach': True, 'thin_provisioning_support': True, 'max_over_subscription_ratio': '20.0', 'location_info': 'ceph:/etc/ceph/ceph.conf:fd7ad824-a913-5b2f-acfc-fabb670e0ebc:openstack:vol2', 'backend_state': 'up', 'volume_backend_name': 'tripleo_ceph_vol2', 'replication_enabled': False, 'allocated_capacity_gb': 1, 'filter_function': None, 'goodness_function': None, 'timestamp': '2022-07-07T12:12:45.394331'}}, status=retyping. migrate_volume /usr/lib/python3.9/site-packages/cinder/volume/drivers/rbd.py:1924
2022-07-07 12:13:05.458 11 DEBUG os_brick.initiator.linuxrbd [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] opening connection to ceph cluster (timeout=-1). connect /usr/lib/python3.9/site-packages/os_brick/initiator/linuxrbd.py:70
2022-07-07 12:13:05.482 11 DEBUG cinder.volume.drivers.rbd [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] connecting to openstack@ceph (conf=/etc/ceph/ceph.conf, timeout=-1). _do_conn /usr/lib/python3.9/site-packages/cinder/volume/drivers/rbd.py:480
2022-07-07 12:13:05.504 11 DEBUG cinder.volume.drivers.rbd [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] connecting to openstack@ceph (conf=/etc/ceph/ceph.conf, timeout=-1). _do_conn /usr/lib/python3.9/site-packages/cinder/volume/drivers/rbd.py:480
2022-07-07 12:13:05.689 11 DEBUG cinder.volume.drivers.rbd [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] connecting to openstack@ceph (conf=/etc/ceph/ceph.conf, timeout=-1). _do_conn /usr/lib/python3.9/site-packages/cinder/volume/drivers/rbd.py:480
2022-07-07 12:13:05.731 11 DEBUG cinder.volume.drivers.rbd [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] volume has no backup snaps _delete_backup_snaps /usr/lib/python3.9/site-packages/cinder/volume/drivers/rbd.py:1104
2022-07-07 12:13:05.732 11 DEBUG cinder.volume.drivers.rbd [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] Volume volume-2d627d92-f101-41ab-82e2-c1bfbec4ebc0 is not a clone. _get_clone_info /usr/lib/python3.9/site-packages/cinder/volume/drivers/rbd.py:1127
2022-07-07 12:13:05.736 11 DEBUG cinder.volume.drivers.rbd [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] deleting rbd volume volume-2d627d92-f101-41ab-82e2-c1bfbec4ebc0 delete_volume /usr/lib/python3.9/site-packages/cinder/volume/drivers/rbd.py:1247
2022-07-07 12:13:05.863 11 INFO cinder.volume.drivers.rbd [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] Successful RBD assisted volume migration.
2022-07-07 12:13:05.879 11 INFO cinder.volume.manager [req-506a06c2-b1e7-4923-8c1a-6cf0ac37f9f9 85321f14fc344d228df218d103c178a8 78cfade4d5854083964e41ac71921565 - - -] Migrate volume completed successfully.

Good to verify: this time the migration used the RBD assisted driver rather than the generic driver used previously.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Release of components for Red Hat OpenStack Platform 17.0 (Wallaby)), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2022:6543