Description of problem:

cinder retype from netapp to hp3par is failing with the following error:

|__Flow 'volume_create_manager'
2020-05-12 13:26:58.969 650012 ERROR cinder.volume.manager Traceback (most recent call last):
2020-05-12 13:26:58.969 650012 ERROR cinder.volume.manager   File "/usr/lib/python2.7/site-packages/taskflow/engines/action_engine/executor.py", line 53, in _execute_task
2020-05-12 13:26:58.969 650012 ERROR cinder.volume.manager     result = task.execute(**arguments)
2020-05-12 13:26:58.969 650012 ERROR cinder.volume.manager   File "/usr/lib/python2.7/site-packages/cinder/volume/flows/manager/create_volume.py", line 866, in execute
2020-05-12 13:26:58.969 650012 ERROR cinder.volume.manager     context, volume, **volume_spec)
2020-05-12 13:26:58.969 650012 ERROR cinder.volume.manager   File "/usr/lib/python2.7/site-packages/cinder/volume/flows/manager/create_volume.py", line 493, in _create_from_source_volume
2020-05-12 13:26:58.969 650012 ERROR cinder.volume.manager     model_update = self.driver.create_cloned_volume(volume, srcvol_ref)
2020-05-12 13:26:58.969 650012 ERROR cinder.volume.manager   File "/usr/lib/python2.7/site-packages/cinder/utils.py", line 895, in trace_logging_wrapper
2020-05-12 13:26:58.969 650012 ERROR cinder.volume.manager     result = f(*args, **kwargs)
2020-05-12 13:26:58.969 650012 ERROR cinder.volume.manager   File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/hpe/hpe_3par_fc.py", line 205, in create_cloned_volume
2020-05-12 13:26:58.969 650012 ERROR cinder.volume.manager     return common.create_cloned_volume(volume, src_vref)
2020-05-12 13:26:58.969 650012 ERROR cinder.volume.manager   File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/hpe/hpe_3par_common.py", line 2072, in create_cloned_volume
2020-05-12 13:26:58.969 650012 ERROR cinder.volume.manager     raise exception.NotFound()
2020-05-12 13:26:58.969 650012 ERROR cinder.volume.manager NotFound: Resource could not be found.
2020-05-12 13:26:58.969 650012 ERROR cinder.volume.manager
2020-05-12 13:26:58.972 650012 DEBUG cinder.volume.manager [req-ab122c05-6327-43a0-a7f2-f53edd6f5627 07330a460401434bb02fed5acb0e3d3c 1b6c62dd0a6a4f7c8917304292298cb0 - default default] Task 'cinder.volume.flows.manager.create_volume.CreateVolumeFromSpecTask;volume:create' (2ad0c70a-3d5f-456f-9c38-488fcbfb3822) transitioned into state 'REVERTING' from state 'FAILURE' _task_receiver /usr/lib/python2.7/site-packages/taskflow/listeners/logging.py:189

Version-Release number of selected component (if applicable):
openstack-cinder-9.1.4-3.el7ost.noarch

How reproducible:
Always

Steps to Reproduce:
1. Configure a hp3par backend
2. Configure a netapp backend
3. cinder retype netapp -> hp3par

Actual results:
Fails

Expected results:
Succeeds

Additional info:
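For context on how the 3PAR driver's create_cloned_volume ends up in this traceback at all: the create-volume taskflow dispatches on the volume spec, and a spec built from a volume that still carries source_volid takes the clone path. Below is a simplified, hypothetical paraphrase of that dispatch; the function and key names (dispatch_create, source_volref) are illustrative, not the actual Cinder identifiers. The real logic is in cinder/volume/flows/manager/create_volume.py, visible in the traceback above.

# Hypothetical, simplified paraphrase of the dispatch performed by
# CreateVolumeFromSpecTask in cinder/volume/flows/manager/create_volume.py.
# Names (dispatch_create, source_volref) are illustrative only.

def dispatch_create(driver, context, volume, volume_spec):
    """Pick a creation strategy based on the volume spec.

    A spec built from a volume that still carries source_volid takes the
    'source_vol' branch, i.e. driver.create_cloned_volume(), which is the
    frame seen in the traceback above. A migration's destination volume
    should instead take the plain create_volume() branch.
    """
    create_type = volume_spec.get('type')
    if create_type == 'source_vol':
        # Clone path: only valid when the source volume exists on the
        # same backend as the new volume.
        return driver.create_cloned_volume(volume, volume_spec['source_volref'])
    if create_type == 'snap':
        return driver.create_volume_from_snapshot(volume, volume_spec['snapshot'])
    # Plain creation, with no dependency on any existing volume.
    return driver.create_volume(volume)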
From the logs and the Newton code, I can see that:

1) The create clone request reached the hpe3par driver:

2020-05-12 13:26:56.765 650012 DEBUG cinder.volume.drivers.hpe.hpe_3par_common [req-ab122c05-6327-43a0-a7f2-f53edd6f5627 07330a460401434bb02fed5acb0e3d3c 1b6c62dd0a6a4f7c8917304292298cb0 - default default] Creating a clone of volume, using online copy. create_cloned_volume /usr/lib/python2.7/site-packages/cinder/volume/drivers/hpe/hpe_3par_common.py:2020

https://github.com/openstack/cinder/blob/newton-eol/cinder/volume/drivers/hpe/hpe_3par_common.py#L2020

2) Temporary snapshot creation started:

2020-05-12 13:26:56.767 650012 INFO cinder.volume.drivers.hpe.hpe_3par_common [req-ab122c05-6327-43a0-a7f2-f53edd6f5627 07330a460401434bb02fed5acb0e3d3c 1b6c62dd0a6a4f7c8917304292298cb0 - default default] Creating temp snapshot tss-vhPqCxGOScWzTTqX.suqlA from volume osv-ZRKdaIjARTOuvmqkBz7jAA

https://github.com/openstack/cinder/blob/newton-eol/cinder/volume/drivers/hpe/hpe_3par_common.py#L1986

3) The next debug log should have come from _copy_volume, but before that point the client disconnected (because some exception occurred), so it never appears in the logs:

https://github.com/openstack/cinder/blob/newton-eol/cinder/volume/drivers/hpe/hpe_3par_common.py#L1904

I suspect some exception occurred during these two operations[1], which caused the disconnect[2]. Maybe the snapshot wasn't created successfully and we weren't able to fetch the volume.

[1] https://github.com/openstack/cinder/blob/newton-eol/cinder/volume/drivers/hpe/hpe_3par_common.py#L1989-L1990
[2] https://github.com/openstack/cinder/blob/newton-eol/cinder/volume/drivers/hpe/hpe_3par_fc.py#L207
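To make the NotFound above concrete, here is a hedged sketch of the driver's exception-translation pattern. The classes below are stand-ins for hpeexceptions.HTTPNotFound and cinder.exception.NotFound, and the body is simplified; the real handler is at the hpe_3par_common.py line in the traceback. The source volume exists only on the netapp backend, so any REST lookup of it on the 3PAR array answers 404, which the driver re-raises as cinder's generic NotFound.

# Hedged sketch of the HPE 3PAR driver's exception translation
# (simplified). HTTPNotFound / NotFound are stand-ins for
# hpeexceptions.HTTPNotFound and cinder.exception.NotFound.

class HTTPNotFound(Exception):
    """Stand-in for the 3PAR REST client's 404 exception."""

class NotFound(Exception):
    """Stand-in for cinder.exception.NotFound."""
    message = "Resource could not be found."

def create_cloned_volume(client, volume, src_vref):
    try:
        # In this bug the source volume was never created on the 3PAR
        # (it lives on the netapp), so the array answers 404 here.
        client.copyVolume(src_vref['name'], volume['name'])
    except HTTPNotFound:
        # The generic translation hides WHICH resource was missing,
        # which is why the log above is so uninformative.
        raise NotFound()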
The problem here, I think, is that it's trying to clone a volume that lives on another backend... does that make sense? Otherwise, the function names in the traceback aren't very meaningful.
The expected migration flow is: create a new volume on the destination and copy the data over. From the migration code, before the create volume call, I see that source_volid wasn't skipped when copying the volume properties; because source_volid was carried over, create_cloned_volume is called instead of create_volume. With a little searching I found this fix[1], and IMO it would solve the issue.

[1] https://review.opendev.org/#/c/316086/
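For illustration, here is a minimal sketch of the kind of property filtering that fix applies when building the destination volume during a generic migration. SKIP_PROPERTIES and build_new_volume_values are hypothetical names for this sketch, not the identifiers from the actual patch; see the review linked above for the real change.

# Minimal sketch of the property filtering applied when building the
# destination volume for a generic migration. Names are hypothetical.
SKIP_PROPERTIES = {
    'id', 'name_id', 'host', 'status', 'migration_status',
    'attach_status',
    'source_volid',  # the key addition: without it, the destination's
                     # create flow takes the create_cloned_volume path
}

def build_new_volume_values(src_volume):
    """Copy properties from the source volume record, dropping fields
    that must not carry over to the destination volume."""
    return {k: v for k, v in src_volume.items() if k not in SKIP_PROPERTIES}

# Example: a source volume on the netapp backend being migrated.
src = {'id': 'abc', 'size': 10, 'source_volid': 'abc',
       'host': 'netapp@backend#pool', 'display_name': 'vol1'}
print(build_new_volume_values(src))  # {'size': 10, 'display_name': 'vol1'}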
While it may not be obvious from the description, this is actually a duplicate of bug #1456355. The issue is that the source volume's ID is not filtered out by the migration code that creates the destination volume, which in turn makes the system try to create the destination volume (on the 3par) as a clone of the source volume (on the netapp). The clone operation fails for obvious reasons, which causes the retype/migration to fail.

Kudos to Rajat for pointing out the upstream patch in comment #4 that fixes this. That patch is what fixed bug #1456355, and it shipped in osp10z4. Unfortunately, this customer hit the same issue because they're running an even older z3. So, the customer's situation can be addressed by performing a minor update.

*** This bug has been marked as a duplicate of bug 1456355 ***