Created attachment 1669332 [details]
Cinder and Nova compute logs

Description of problem:
While verifying a Cinder bz (https://bugzilla.redhat.com/show_bug.cgi?id=1752560#c5), attaching a retyped Cinder volume fails.

Adding Alan's initial investigation: the core issue is that retyping a volume results in it being assigned a new UUID internally, with metadata that maps it to its original UUID. Here, it seems Nova is trying to use the original UUID without looking up its mapping.

Version-Release number of selected component (if applicable):
rhel7.7
openstack-nova-migration-14.1.0-50.el7ost.noarch
openstack-nova-common-14.1.0-50.el7ost.noarch
openstack-nova-conductor-14.1.0-50.el7ost.noarch
python-nova-14.1.0-50.el7ost.noarch
openstack-nova-api-14.1.0-50.el7ost.noarch
openstack-nova-novncproxy-14.1.0-50.el7ost.noarch
python-novaclient-6.0.2-2.el7ost.noarch
openstack-nova-cert-14.1.0-50.el7ost.noarch
puppet-nova-9.6.0-9.el7ost.noarch
openstack-nova-console-14.1.0-50.el7ost.noarch
openstack-nova-compute-14.1.0-50.el7ost.noarch
openstack-nova-scheduler-14.1.0-50.el7ost.noarch

How reproducible:
Probably every time

Steps to Reproduce:
1. In my case the backend was NFS; both volume types ("none" and Legacy) use the same NFS backend, as part of the Cinder bz verification steps.

2. Create an empty Cinder volume:

[stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+
| ID                                   | Status    | Name         | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+
| 12a2dc46-687c-41c9-a390-49cddd77bb3e | available | EmptyVolType | 5    | -           | false    |             |
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+

3. Retype this volume from "none" to the Legacy volume type:

[stack@undercloud-0 ~]$ cinder retype 12a2dc46-687c-41c9-a390-49cddd77bb3e Legacy --migration-policy on-demand

4. Once migration has finished, attaching the volume to an instance fails:

[stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+
| ID                                   | Status    | Name         | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+
| 12a2dc46-687c-41c9-a390-49cddd77bb3e | available | EmptyVolType | 5    | Legacy      | false    |             |
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+

[stack@undercloud-0 ~]$ nova list
+--------------------------------------+-------+--------+------------+-------------+-----------------------------------+
| ID                                   | Name  | Status | Task State | Power State | Networks                          |
+--------------------------------------+-------+--------+------------+-------------+-----------------------------------+
| 192ff28f-b592-4b4e-9c42-69d3beedca5c | inst1 | ACTIVE | -          | Running     | internal=192.168.0.16, 10.0.0.218 |
+--------------------------------------+-------+--------+------------+-------------+-----------------------------------+

[stack@undercloud-0 ~]$ nova volume-attach 192ff28f-b592-4b4e-9c42-69d3beedca5c 12a2dc46-687c-41c9-a390-49cddd77bb3e auto
+----------+--------------------------------------+
| Property | Value                                |
+----------+--------------------------------------+
| device   | /dev/vdb                             |
| id       | 12a2dc46-687c-41c9-a390-49cddd77bb3e |
| serverId | 192ff28f-b592-4b4e-9c42-69d3beedca5c |
| volumeId | 12a2dc46-687c-41c9-a390-49cddd77bb3e |
+----------+--------------------------------------+

Actual results:
Volume attachment fails:

/var/log/nova/nova-compute.log:25098:2020-03-11 11:06:31.639 44544 ERROR nova.virt.libvirt.driver [req-de56387d-a625-4ddd-bff0-031e8f1a09be bb83a7a259dc444ebf4da26964d011bc 36d3464abfe0476cad8f668b2261cd8e - - -] [instance: 192ff28f-b592-4b4e-9c42-69d3beedca5c] Failed to attach volume at mountpoint: /dev/vdb

Expected results:
Volume should attach without issues.

Additional info:
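To make the internal UUID remapping concrete, here is a minimal Python sketch, not Cinder source, of the fallback Cinder uses to derive a volume's on-disk file name. VOLUME_NAME_TEMPLATE mirrors the volume_name_template config default, and 'new-internal-uuid' is a placeholder for the real internal UUID, which isn't shown for this volume:

# Simplified model of Cinder volume naming. A retype with migration keeps
# the API-facing id but records the new internal UUID in name_id, so the
# backing file on the share becomes volume-<name_id>.
VOLUME_NAME_TEMPLATE = 'volume-%s'  # default of the volume_name_template option

class Volume(object):
    def __init__(self, volume_id, name_id=None):
        self.id = volume_id       # API-facing UUID, unchanged by retype
        self._name_id = name_id   # set when a migrating retype moves the data

    @property
    def name_id(self):
        # Fall back to the original id for volumes that never migrated.
        return self._name_id or self.id

    @property
    def name(self):
        return VOLUME_NAME_TEMPLATE % self.name_id

vol = Volume('12a2dc46-687c-41c9-a390-49cddd77bb3e')
print(vol.name)  # volume-12a2dc46-... (the backing file before the retype)

vol._name_id = 'new-internal-uuid'  # placeholder for the UUID the retype assigns
print(vol.name)  # volume-new-internal-uuid, the file n-cpu must actually mount

The failure above suggests the attach path still resolves the original volume-12a2dc46-... file; Alan's analysis in the next comment confirms this from the connection_info.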
BTW, this new Nova issue doesn't happen on OSP15; it didn't hit me when I was verifying the OSP15 clone of the original Cinder bz: https://bugzilla.redhat.com/show_bug.cgi?id=1749491#c5
This isn't an attached retype, so Nova has no idea that the volume has moved to a new file on the backend. c-api provides the following bogus connection_info, so n-cpu does the correct thing and attempts to attach the original file to the instance:

2020-03-11 11:06:30.654 44544 DEBUG cinderclient.v2.client [req-de56387d-a625-4ddd-bff0-031e8f1a09be bb83a7a259dc444ebf4da26964d011bc 36d3464abfe0476cad8f668b2261cd8e - - -] RESP: [200] X-Compute-Request-Id: req-d55babb2-6b43-4c00-b4ba-50be74e47300 Content-Type: application/json Content-Length: 281 X-Openstack-Request-Id: req-d55babb2-6b43-4c00-b4ba-50be74e47300 Date: Wed, 11 Mar 2020 11:06:30 GMT
RESP BODY: {"connection_info": {"driver_volume_type": "nfs", "mount_point_base": "/var/lib/cinder/nfs", "data": {"name": "volume-12a2dc46-687c-41c9-a390-49cddd77bb3e", "encrypted": false, "qos_specs": null, "export": "10.35.160.111:/export/ins_cinder", "access_mode": "rw", "options": null}}}
_http_log_response /usr/lib/python2.7/site-packages/keystoneauth1/session.py:390

[..]

libvirtError: Cannot access storage file '/var/lib/nova/mnt/47266020eacec99097bdec49f2451d38/volume-12a2dc46-687c-41c9-a390-49cddd77bb3e': No such file or directory

AFAIK there has never been any lookup on the n-cpu side; c-api is responsible for this lookup (cinder.volume.drivers.remotefs.RemoteFSDriver.get_active_image_from_info) and for providing the correct filename via the connection_info returned to n-cpu from our call to initialize_connection.

Looking at the share, I think the issue is that the volume-12a2dc46-687c-41c9-a390-49cddd77bb3e.info file isn't even present for c-vol to find the correct filename given the volume_id of 12a2dc46-687c-41c9-a390-49cddd77bb3e:

# ll /var/lib/cinder/nfs/47266020eacec99097bdec49f2451d38/volume-12a2dc46-687c-41c9-a390-49cddd77bb3e*
ls: cannot access /var/lib/cinder/nfs/47266020eacec99097bdec49f2451d38/volume-12a2dc46-687c-41c9-a390-49cddd77bb3e*: No such file or directory

Moving this back to openstack-cinder for review.
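For reference, here is a hedged Python paraphrase of the lookup described above; the real code lives in cinder.volume.drivers.remotefs, and the parameter names here (volume_dir in particular) are illustrative rather than the literal driver signature:

import json
import os

def get_active_image_from_info(volume_dir, volume):
    """Resolve the backing file name that initialize_connection should report.

    The remotefs driver keeps a volume-<id>.info JSON file on the share
    recording which file in the snapshot chain is currently 'active'.
    When that file is missing, as observed on this share, the driver can
    only fall back to volume.name, which has to be derived from name_id
    for migrated volumes to resolve to the right backing file.
    """
    info_path = os.path.join(volume_dir, 'volume-%s.info' % volume.id)
    if os.path.exists(info_path):
        with open(info_path) as f:
            return json.load(f)['active']
    # No info file: plain (snapshot-less) volume, named after name_id.
    return volume.name

In the failing case the .info file is absent and the connection_info still carries the original volume-<id> name, which no longer exists on the share after the migration, hence the libvirtError above.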
The cinder squad discussed this and will investigate. @Sofi, you can ping Eric or me for more context.
Retesting with a newer Cinder version:

puppet-cinder-9.5.0-7.el7ost.noarch
python-cinderclient-1.9.0-6.el7ost.noarch
python-cinder-9.1.4-55.el7ost.noarch
openstack-cinder-9.1.4-55.el7ost.noarch

Seems to be working OK with the newer version.

Create a volume with an empty type:

[stack@undercloud-0 ~]$ cinder create 1 --name vol1
+--------------------------------+--------------------------------------+
| Property                       | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | nova                                 |
| bootable                       | false                                |
| consistencygroup_id            | None                                 |
| created_at                     | 2020-03-19T09:09:47.000000           |
| description                    | None                                 |
| encrypted                      | False                                |
| id                             | 8a9b6ee8-faf0-41c6-8bd5-ec28447d4faf |
| metadata                       | {}                                   |
| migration_status               | None                                 |
| multiattach                    | False                                |
| name                           | vol1                                 |
| os-vol-host-attr:host          | None                                 |
| os-vol-mig-status-attr:migstat | None                                 |
| os-vol-mig-status-attr:name_id | None                                 |
| os-vol-tenant-attr:tenant_id   | 11ab865bd9414f29acfb761e1939cfb9     |
| replication_status             | disabled                             |
| size                           | 1                                    |
| snapshot_id                    | None                                 |
| source_volid                   | None                                 |
| status                         | creating                             |
| updated_at                     | None                                 |
| user_id                        | 6ec9ce5dcc844a05ba55167fa7e0796e     |
| volume_type                    | None                                 |
+--------------------------------+--------------------------------------+

Show volume details:

[stack@undercloud-0 ~]$ cinder show 8a9b6ee8-faf0-41c6-8bd5-ec28447d4faf
+--------------------------------+--------------------------------------+
| Property                       | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | nova                                 |
| bootable                       | false                                |
| consistencygroup_id            | None                                 |
| created_at                     | 2020-03-19T09:09:47.000000           |
| description                    | None                                 |
| encrypted                      | False                                |
| id                             | 8a9b6ee8-faf0-41c6-8bd5-ec28447d4faf |
| metadata                       | {}                                   |
| migration_status               | None                                 |
| multiattach                    | False                                |
| name                           | vol1                                 |
| os-vol-host-attr:host          | hostgroup@nfs#nfs                    | --> backend is NFS
| os-vol-mig-status-attr:migstat | None                                 |
| os-vol-mig-status-attr:name_id | None                                 |
| os-vol-tenant-attr:tenant_id   | 11ab865bd9414f29acfb761e1939cfb9     |
| replication_status             | disabled                             |
| size                           | 1                                    |
| snapshot_id                    | None                                 |
| source_volid                   | None                                 |
| status                         | available                            |
| updated_at                     | 2020-03-19T09:09:51.000000           |
| user_id                        | 6ec9ce5dcc844a05ba55167fa7e0796e     |
| volume_type                    | None                                 | -> volume type none/empty
+--------------------------------+--------------------------------------+

So we have a "none"-typed, NFS-backed volume; now let's retype it to the Legacy volume type (Legacy is also on NFS). The above only reproduces the initial state of the volume before we try to attach it.
Retype the volume to the Legacy type:

[stack@undercloud-0 ~]$ cinder retype 8a9b6ee8-faf0-41c6-8bd5-ec28447d4faf Legacy --migration-policy on-demand

During the retype operation:

[stack@undercloud-0 ~]$ cinder show 8a9b6ee8-faf0-41c6-8bd5-ec28447d4faf
+--------------------------------+--------------------------------------+
| Property                       | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | nova                                 |
| bootable                       | false                                |
| consistencygroup_id            | None                                 |
| created_at                     | 2020-03-19T09:09:47.000000           |
| description                    | None                                 |
| encrypted                      | False                                |
| id                             | 8a9b6ee8-faf0-41c6-8bd5-ec28447d4faf |
| metadata                       | {}                                   |
| migration_status               | completing                           |
| multiattach                    | False                                |
| name                           | vol1                                 |
| os-vol-host-attr:host          | hostgroup@nfs#nfs                    |
| os-vol-mig-status-attr:migstat | completing                           |
| os-vol-mig-status-attr:name_id | 460da76e-cabc-4627-b802-9ac8406771c4 |
| os-vol-tenant-attr:tenant_id   | 11ab865bd9414f29acfb761e1939cfb9     |
| replication_status             | disabled                             |
| size                           | 1                                    |
| snapshot_id                    | None                                 |
| source_volid                   | None                                 |
| status                         | retyping                             |
| updated_at                     | 2020-03-19T09:17:13.000000           |
| user_id                        | 6ec9ce5dcc844a05ba55167fa7e0796e     |
| volume_type                    | None                                 |
+--------------------------------+--------------------------------------+

After the volume retype, the volume is Legacy/NFS:

[stack@undercloud-0 ~]$ cinder show 8a9b6ee8-faf0-41c6-8bd5-ec28447d4faf
+--------------------------------+--------------------------------------+
| Property                       | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | nova                                 |
| bootable                       | false                                |
| consistencygroup_id            | None                                 |
| created_at                     | 2020-03-19T09:09:47.000000           |
| description                    | None                                 |
| encrypted                      | False                                |
| id                             | 8a9b6ee8-faf0-41c6-8bd5-ec28447d4faf |
| metadata                       | {}                                   |
| migration_status               | success                              |
| multiattach                    | False                                |
| name                           | vol1                                 |
| os-vol-host-attr:host          | hostgroup@nfs#nfs                    |
| os-vol-mig-status-attr:migstat | success                              |
| os-vol-mig-status-attr:name_id | 460da76e-cabc-4627-b802-9ac8406771c4 |
| os-vol-tenant-attr:tenant_id   | 11ab865bd9414f29acfb761e1939cfb9     |
| replication_status             | disabled                             |
| size                           | 1                                    |
| snapshot_id                    | None                                 |
| source_volid                   | None                                 |
| status                         | available                            |
| updated_at                     | 2020-03-19T09:17:13.000000           |
| user_id                        | 6ec9ce5dcc844a05ba55167fa7e0796e     |
| volume_type                    | Legacy                               |
+--------------------------------+--------------------------------------+

Now let's try to attach the volume to an instance:

[stack@undercloud-0 ~]$ nova volume-attach 6d29edc6-8e14-4c54-bac6-fe7b7c491ff8 8a9b6ee8-faf0-41c6-8bd5-ec28447d4faf
+----------+--------------------------------------+
| Property | Value                                |
+----------+--------------------------------------+
| device   | /dev/vdb                             |
| id       | 8a9b6ee8-faf0-41c6-8bd5-ec28447d4faf |
| serverId | 6d29edc6-8e14-4c54-bac6-fe7b7c491ff8 |
| volumeId | 8a9b6ee8-faf0-41c6-8bd5-ec28447d4faf |
+----------+--------------------------------------+

[stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+--------------+------+-------------+----------+--------------------------------------+
| ID                                   | Status    | Name         | Size | Volume Type | Bootable | Attached to                          |
+--------------------------------------+-----------+--------------+------+-------------+----------+--------------------------------------+
| 8a9b6ee8-faf0-41c6-8bd5-ec28447d4faf | in-use    | vol1         | 1    | Legacy      | false    | 6d29edc6-8e14-4c54-bac6-fe7b7c491ff8 |
+--------------------------------------+-----------+--------------+------+-------------+----------+--------------------------------------+

As can be seen, the retyped volume now attaches as expected.
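As a sanity check (hypothetical, reusing the mount_point_base from the earlier connection_info and the UUIDs from this comment), one could confirm directly on the share that the backing file is now named after name_id rather than the API-facing id:

# Hypothetical check of the share contents after the migrating retype.
import glob

MNT = '/var/lib/cinder/nfs'  # mount_point_base reported by c-api
API_ID = '8a9b6ee8-faf0-41c6-8bd5-ec28447d4faf'   # cinder 'id'
NAME_ID = '460da76e-cabc-4627-b802-9ac8406771c4'  # os-vol-mig-status-attr:name_id

# Expected: nothing is left under the API-facing id once the data moved...
print(glob.glob('%s/*/volume-%s*' % (MNT, API_ID)))
# ...while the actual backing file carries name_id.
print(glob.glob('%s/*/volume-%s*' % (MNT, NAME_ID)))

With the fixed packages the attach succeeds, presumably because initialize_connection now reports the name_id-based file.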
I'd verify this, but it didn't pass through the usual flow; either it didn't need to, or some other Cinder fix resolved it along the way.
Good to hear that the package update fixed the problem. When debugging tigris01 I found that some backports were missed, for example Alan's patch [1]. As Eric suggested to me on IRC, I will close this as NOTABUG since there's already a bug covering this fix and we don't want people reading this one hoping for info on this bz.

[1] https://review.opendev.org/678278/