Bug 1741453 - [16.1] NFS snapshot deletion issue
Summary: [16.1] NFS snapshot deletion issue
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-cinder
Version: 15.0 (Stein)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: z8
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: Sofia Enriquez
QA Contact: Tzach Shefi
Docs Contact: RHOS Documentation Team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-08-15 08:07 UTC by Tzach Shefi
Modified: 2022-03-24 10:59 UTC (History)
4 users

Fixed In Version: openstack-cinder-15.4.0-1.20220107213400.58f0e73.el8ost
Doc Type: Bug Fix
Doc Text:
Before this update, the OpenStack NFS driver blocked attempts to delete snapshots in an error state when snapshot support was disabled. Snapshots are placed in an error state when snapshot support is disabled, but users could not remove these failed snapshots. With this update, users can remove NFS snapshots in error status.
Clone Of:
Environment:
Last Closed: 2022-03-24 10:59:07 UTC
Target Upstream Version:
Embargoed:


Attachments
c-vol log look towards the end (502.00 KB, text/plain), 2019-08-15 09:10 UTC, Tzach Shefi
c-vol log (223.06 KB, text/plain), 2019-08-19 15:36 UTC, Tzach Shefi


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1842088 0 None None None 2019-08-30 17:50:47 UTC
OpenStack gerrit 679138 0 'None' MERGED Allow removing NFS snapshots in error status 2021-11-10 18:18:04 UTC
OpenStack gerrit 810406 0 None MERGED Allow removing NFS snapshots in error status 2021-11-10 18:18:07 UTC
Red Hat Issue Tracker OSP-701 0 None None None 2021-11-10 19:07:57 UTC
Red Hat Product Errata RHBA-2022:0986 0 None None None 2022-03-24 10:59:41 UTC

Description Tzach Shefi 2019-08-15 08:07:52 UTC
Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Tzach Shefi 2019-08-15 09:08:58 UTC
^ a case of quick fingers on Enter key ;)

RFE (only if it's a minor change): cinder snapshot-delete --force should also be able/allowed to handle snapshots in error_deleting state,
not just snapshots in available or error status, which it handles today.

OSP15 
Happens every time. 

While testing THT NFS snapshot support, in one of the cases snapshot support was set to false.
As a result, a snapshot was created in an error state due to the lack of backend snapshot support - expected.
I'm just explaining how to end up with a snapshot in an error state.

When I tried to delete that snapshot it failed.
Yes, I admit I should have used --force right at this stage,
but I genuinely forgot about that option in the heat of the moment.
And so my tiny problem has now worsened into a DB hacking exercise.

This:
(overcloud) [stack@undercloud-0 ~]$ cinder snapshot-list
+--------------------------------------+--------------------------------------+--------+------------+------+
| ID                                   | Volume ID                            | Status | Name       | Size |
+--------------------------------------+--------------------------------------+--------+------------+------+
| 6638c6db-e7af-4ce8-94e6-8f8fd811e41b | 5b021950-efe0-4a74-ac93-10042adc60ae | error  | ShouldFail | 1    |
+--------------------------------------+--------------------------------------+--------+------------+------+

after this:
(overcloud) [stack@undercloud-0 ~]$ cinder snapshot-delete 6638c6db-e7af-4ce8-94e6-8f8fd811e41b

Turned into a mess:
(overcloud) [stack@undercloud-0 ~]$ cinder snapshot-list
+--------------------------------------+--------------------------------------+----------------+------------+------+
| ID                                   | Volume ID                            | Status         | Name       | Size |
+--------------------------------------+--------------------------------------+----------------+------------+------+
| 6638c6db-e7af-4ce8-94e6-8f8fd811e41b | 5b021950-efe0-4a74-ac93-10042adc60ae | error_deleting | ShouldFail | 1    |
+--------------------------------------+--------------------------------------+----------------+------------+------+ 

The above I can't delete even with --force (too late); it remains in the same status.
And I can't remove the source volume because of the snapshot.
I'm left with DB hacking should I wish to get rid of the snapshot/volume.
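
(For reference, the usual way to pull a snapshot out of error_deleting without editing the database by hand is an admin state reset, something like the following; the snapshot ID is the one from the listing above:

(overcloud) [stack@undercloud-0 ~]$ cinder snapshot-reset-state --state error 6638c6db-e7af-4ce8-94e6-8f8fd811e41b
(overcloud) [stack@undercloud-0 ~]$ cinder snapshot-delete --force 6638c6db-e7af-4ce8-94e6-8f8fd811e41b

With the behavior described in this bug the delete still fails in the driver, so the reset only avoids direct DB edits; it does not actually remove the snapshot until the driver-side fix lands.)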


I'm "privileged" (as QE) to know/care less about how things work internally, and I tend to wear a "user's" red hat, if I may.
 
The lack of backend support didn't stop the Cinder API (or whoever) from creating a failed snapshot.
Why, then, should the lack of backend snapshot support arise as an issue during snapshot delete or --force delete?
IMHO, if the system was "smart" enough to create this snapshot, it should be smart enough to manage to delete it.

c-vol log, 

2019-08-15 08:32:37.119 64 DEBUG cinder.manager [req-d0071aaf-042c-4d59-a3cf-2c41d829377f - - - - -] Notifying Schedulers of capabilities ... _publish_service_capabilities /usr/lib/python3.6/site-packages/cinder/manager.py:194
2019-08-15 08:32:54.721 64 DEBUG cinder.coordination [req-397a2162-54e9-4e0f-8785-11d7d8149a85 f70c599581b948f19ada5d2d9ae153dc 12b572011ebe42ac9f2ba28eb2cb9b43 - default default] Lock "/var/lib/cinder/cinder-6638c6db-e7af-4ce8-94e6-8f8fd811e41b-delete_snapshot" acquired by "delete_snapshot" :: waited 0.001s _synchronized /usr/lib/python3.6/site-packages/cinder/coordination.py:150
2019-08-15 08:32:54.757 64 DEBUG cinder.coordination [req-397a2162-54e9-4e0f-8785-11d7d8149a85 f70c599581b948f19ada5d2d9ae153dc 12b572011ebe42ac9f2ba28eb2cb9b43 - default default] Lock "/var/lib/cinder/cinder-nfs-5b021950-efe0-4a74-ac93-10042adc60ae" acquired by "delete_snapshot" :: waited 0.000s _synchronized /usr/lib/python3.6/site-packages/cinder/coordination.py:150
2019-08-15 08:32:54.758 64 DEBUG cinder.coordination [req-397a2162-54e9-4e0f-8785-11d7d8149a85 f70c599581b948f19ada5d2d9ae153dc 12b572011ebe42ac9f2ba28eb2cb9b43 - default default] Lock "/var/lib/cinder/cinder-nfs-5b021950-efe0-4a74-ac93-10042adc60ae" released by "delete_snapshot" :: held 0.001s _synchronized /usr/lib/python3.6/site-packages/cinder/coordination.py:162
2019-08-15 08:32:54.764 64 DEBUG cinder.coordination [req-397a2162-54e9-4e0f-8785-11d7d8149a85 f70c599581b948f19ada5d2d9ae153dc 12b572011ebe42ac9f2ba28eb2cb9b43 - default default] Lock "/var/lib/cinder/cinder-6638c6db-e7af-4ce8-94e6-8f8fd811e41b-delete_snapshot" released by "delete_snapshot" :: held 0.043s _synchronized /usr/lib/python3.6/site-packages/cinder/coordination.py:162
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server [req-397a2162-54e9-4e0f-8785-11d7d8149a85 f70c599581b948f19ada5d2d9ae153dc 12b572011ebe42ac9f2ba28eb2cb9b43 - default default] Exception during message handling: cinder.exception.VolumeDriverException: Volume driver reported an error: NFS driver snapshot support is disabled in cinder.conf.
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 166, in _process_incoming
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 265, in dispatch
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server   File "<decorator-gen-239>", line 2, in delete_snapshot
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/cinder/coordination.py", line 151, in _synchronized
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server     return f(*a, **k)
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/cinder/volume/manager.py", line 1225, in delete_snapshot
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server     snapshot.save()
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server     self.force_reraise()
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server     raise value
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/cinder/volume/manager.py", line 1215, in delete_snapshot
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server     self.driver.delete_snapshot(snapshot)
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server   File "<decorator-gen-265>", line 2, in delete_snapshot
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/cinder/coordination.py", line 151, in _synchronized
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server     return f(*a, **k)
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/cinder/volume/drivers/nfs.py", line 576, in delete_snapshot
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server     self._check_snapshot_support()
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/cinder/volume/drivers/nfs.py", line 554, in _check_snapshot_support
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server     raise exception.VolumeDriverException(message=msg)
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server cinder.exception.VolumeDriverException: Volume driver reported an error: NFS driver snapshot support is disabled in cinder.conf.
2019-08-15 08:32:54.764 64 ERROR oslo_messaging.rpc.server

Comment 2 Tzach Shefi 2019-08-15 09:10:39 UTC
Created attachment 1604027 [details]
c-vol log look towards the end

Comment 3 Tzach Shefi 2019-08-19 15:36:02 UTC
Created attachment 1605802 [details]
c-vol log

Not sure this is only an RFE; it looks like cinder snapshot-delete --force is broken-ish and doesn't do much.


While retesting an OSP14 clone of the original NFS THT snapshot support BZ.

As expected, the snapshot was created in a failed state (status = error).
However, this time around I did remember to use --force on the initial snapshot-delete request.
It didn't help much; the snapshot failed to delete and its status again reached error_deleting.

(overcloud) [stack@undercloud-0 ~]$ cinder snapshot-list    
+--------------------------------------+--------------------------------------+--------+------+------+
| ID                                   | Volume ID                            | Status | Name | Size |
+--------------------------------------+--------------------------------------+--------+------+------+
| 8d6024f9-b98d-425d-b416-0b12ecbaf0ef | fe867488-bb38-4251-aada-b086f78a8a74 | error  | -    | 1    |
+--------------------------------------+--------------------------------------+--------+------+------+



(overcloud) [stack@undercloud-0 ~]$ cinder snapshot-delete --force 8d6024f9-b98d-425d-b416-0b12ecbaf0ef
(overcloud) [stack@undercloud-0 ~]$ cinder snapshot-list
+--------------------------------------+--------------------------------------+----------------+------+------+
| ID                                   | Volume ID                            | Status         | Name | Size |
+--------------------------------------+--------------------------------------+----------------+------+------+
| 8d6024f9-b98d-425d-b416-0b12ecbaf0ef | fe867488-bb38-4251-aada-b086f78a8a74 | error_deleting | -    | 1    |
+--------------------------------------+--------------------------------------+----------------+------+------+

Meaning cinder snapshot-delete --force isn't working great, or at least doesn't handle all the cases where the snapshot status is error (not just error_deleting).

Comment 4 Tzach Shefi 2019-08-20 04:37:35 UTC
Since the comment above suggests snapshot-delete --force isn't working,
I removed RFE from the title; --force doesn't need to be enhanced but rather fixed.

Comment 5 Eric Harney 2019-08-20 15:48:38 UTC
I don't think this bug is about "snapshot-delete --force"; it's more that if NFS snapshot support is disabled, Cinder will still register a snapshot (and then mark it as error rather than creating it), and the deletion process doesn't account for this.
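
To make the shape of the eventual fix concrete, here is a minimal sketch of the driver-side guard that the linked Gerrit changes ("Allow removing NFS snapshots in error status") describe. This is an illustration only, not the merged patch, and it assumes cinder's exception module and i18n _() are in scope as usual:

# Sketch only: let delete_snapshot() succeed for snapshots that exist only
# as error records when snapshot support is off, instead of unconditionally
# raising as in the traceback above.
def delete_snapshot(self, snapshot):
    if not self.configuration.nfs_snapshot_support:
        if snapshot.status in ('error', 'error_deleting'):
            # Nothing was ever created on the backend, so there is
            # nothing to clean up; return and let the volume manager
            # finish removing the DB record.
            return
        msg = _("NFS driver snapshot support is disabled in cinder.conf.")
        raise exception.VolumeDriverException(message=msg)
    # ... normal snapshot deletion path continues here ...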

Comment 10 Sofia Enriquez 2021-11-10 19:42:03 UTC
- 16.2 https://bugzilla.redhat.com/show_bug.cgi?id=2022121

Comment 15 Tzach Shefi 2022-03-07 12:55:28 UTC
Verified on:
openstack-cinder-15.4.0-1.20220114193342.58f0e73.el8ost.noarch

On a deployment using the generic NFS backend for Cinder,
we've set nfs_snapshot_support=false.
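
For context, the backend section in cinder.conf on the controller looks roughly like the following. The backend name matches the os-vol-host-attr:host value shown below; the shares config path is an assumption:

[tripleo_nfs]
volume_backend_name = tripleo_nfs
volume_driver = cinder.volume.drivers.nfs.NfsDriver
nfs_shares_config = /etc/cinder/nfs_shares
nfs_snapshot_support = false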

Let's create an empty volume:
(overcloud) [stack@undercloud-0 ~]$ cinder create 1 --name volA
+--------------------------------+--------------------------------------+
| Property                       | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | nova                                 |
| bootable                       | false                                |
| consistencygroup_id            | None                                 |
| created_at                     | 2022-03-07T12:48:57.000000           |
| description                    | None                                 |
| encrypted                      | False                                |
| id                             | 2182d2ab-181c-425b-a545-5dd56c62a0d8 |
| metadata                       | {}                                   |
| migration_status               | None                                 |
| multiattach                    | False                                |
| name                           | volA                                 |
| os-vol-host-attr:host          | hostgroup@tripleo_nfs#tripleo_nfs    |
| os-vol-mig-status-attr:migstat | None                                 |
| os-vol-mig-status-attr:name_id | None                                 |
| os-vol-tenant-attr:tenant_id   | 1d0c7f0ebdbb454785e5506fa6e92295     |
| replication_status             | None                                 |
| size                           | 1                                    |
| snapshot_id                    | None                                 |
| source_volid                   | None                                 |
| status                         | creating                             |
| updated_at                     | 2022-03-07T12:48:57.000000           |
| user_id                        | 0f74a44ac5e24ea08b27bddaedad7754     |
| volume_type                    | tripleo                              |
+--------------------------------+--------------------------------------+

(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+------+------+-------------+----------+-------------+
| ID                                   | Status    | Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+------+------+-------------+----------+-------------+
| 2182d2ab-181c-425b-a545-5dd56c62a0d8 | available | volA | 1    | tripleo     | false    |             |
+--------------------------------------+-----------+------+------+-------------+----------+-------------+

Now let's create a snapshot; it should fail because we disabled snapshot support:

(overcloud) [stack@undercloud-0 ~]$ cinder snapshot-create 2182d2ab-181c-425b-a545-5dd56c62a0d8 --name snap1
+-------------+--------------------------------------+
| Property    | Value                                |
+-------------+--------------------------------------+
| created_at  | 2022-03-07T12:50:59.614944           |
| description | None                                 |
| id          | e9b05d85-0c0b-4d52-b5e2-1a76e27b77f4 |
| metadata    | {}                                   |
| name        | snap1                                |
| size        | 1                                    |
| status      | creating                             |
| updated_at  | None                                 |
| volume_id   | 2182d2ab-181c-425b-a545-5dd56c62a0d8 |
+-------------+--------------------------------------+
(overcloud) [stack@undercloud-0 ~]$ cinder snapshot-list
+--------------------------------------+--------------------------------------+--------+-------+------+
| ID                                   | Volume ID                            | Status | Name  | Size |
+--------------------------------------+--------------------------------------+--------+-------+------+
| e9b05d85-0c0b-4d52-b5e2-1a76e27b77f4 | 2182d2ab-181c-425b-a545-5dd56c62a0d8 | error  | snap1 | 1    |
+--------------------------------------+--------------------------------------+--------+-------+------+


Now let's try to delete this failed snapshot:
(overcloud) [stack@undercloud-0 ~]$ cinder snapshot-delete e9b05d85-0c0b-4d52-b5e2-1a76e27b77f4
(overcloud) [stack@undercloud-0 ~]$ cinder snapshot-list
+----+-----------+--------+------+------+
| ID | Volume ID | Status | Name | Size |
+----+-----------+--------+------+------+
+----+-----------+--------+------+------+

Great, as opposed to before the fix,
this time I did manage to delete the snapshot.


Let's try again, this time deleting with --force.
Create a new snapshot; again, it should fail:
(overcloud) [stack@undercloud-0 ~]$ cinder snapshot-create 2182d2ab-181c-425b-a545-5dd56c62a0d8 --name snap2
+-------------+--------------------------------------+
| Property    | Value                                |
+-------------+--------------------------------------+
| created_at  | 2022-03-07T12:52:52.655385           |
| description | None                                 |
| id          | 93e97820-2e1b-442e-9c8c-a2200ee113be |
| metadata    | {}                                   |
| name        | snap2                                |
| size        | 1                                    |
| status      | creating                             |
| updated_at  | None                                 |
| volume_id   | 2182d2ab-181c-425b-a545-5dd56c62a0d8 |
+-------------+--------------------------------------+
(overcloud) [stack@undercloud-0 ~]$ cinder snapshot-list
+--------------------------------------+--------------------------------------+--------+-------+------+
| ID                                   | Volume ID                            | Status | Name  | Size |
+--------------------------------------+--------------------------------------+--------+-------+------+
| 93e97820-2e1b-442e-9c8c-a2200ee113be | 2182d2ab-181c-425b-a545-5dd56c62a0d8 | error  | snap2 | 1    |
+--------------------------------------+--------------------------------------+--------+-------+------+


Again, let's try to delete this failed snapshot, this time using the --force flag:
(overcloud) [stack@undercloud-0 ~]$ cinder snapshot-delete --force  93e97820-2e1b-442e-9c8c-a2200ee113be 

(overcloud) [stack@undercloud-0 ~]$ cinder snapshot-list
+----+-----------+--------+------+------+
| ID | Volume ID | Status | Name | Size |
+----+-----------+--------+------+------+
+----+-----------+--------+------+------+


Yep, good to verify: we successfully managed to delete a failed snapshot.

Comment 20 errata-xmlrpc 2022-03-24 10:59:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.8 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:0986

