Bug 1777906 - cinder-manage fails to purge deleted volumes due to constraint
Summary: cinder-manage fails to purge deleted volumes due to constraint
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-cinder
Version: 13.0 (Queens)
Hardware: x86_64
OS: All
medium
high
Target Milestone: z12
: 13.0 (Queens)
Assignee: Rajat Dhasmana
QA Contact: Tzach Shefi
Chuck Copello
URL:
Whiteboard:
: 1565460 (view as bug list)
Depends On: 1565460
Blocks: 1794693 1796144
TreeView+ depends on / blocked
 
Reported: 2019-11-28 16:09 UTC by David Hill
Modified: 2024-02-04 04:25 UTC (History)
7 users (show)

Fixed In Version: openstack-cinder-12.0.10-5.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1794693 (view as bug list)
Environment:
Last Closed: 2020-06-24 11:51:47 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 696676 0 'None' MERGED Make volume soft delete more thorough 2021-02-03 22:21:13 UTC
Red Hat Issue Tracker OSP-5111 0 None None None 2022-02-01 23:43:34 UTC
Red Hat Knowledge Base (Solution) 4619831 0 None None None 2019-11-28 16:11:08 UTC
Red Hat Product Errata RHBA-2020:2722 0 None None None 2020-06-24 11:52:14 UTC

Description David Hill 2019-11-28 16:09:30 UTC
Description of problem:

()[root@overcloud-controller-01 /]# cat /var/spool/cron/cinder 
# HEADER: This file was autogenerated at 2019-11-16 08:33:00 +0000 by puppet.
# HEADER: While it can still be managed manually, it is definitely not recommended.
# HEADER: Note particularly that the comments starting with 'Puppet Name' should
# HEADER: not be deleted, as doing so could cause duplicate cron jobs.
# Puppet Name: cinder-manage db purge
PATH=/bin:/usr/bin:/usr/sbin SHELL=/bin/sh
1 0 * * * cinder-manage db purge 30 >>/var/log/cinder/cinder-rowsflush.log 2>&1
()[root@overcloud-controller-01 /]# cinder-manage db purge 30
Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT".
Purge command failed, check cinder-manage logs for more details.
()[root@overcloud-controller-01 /]# cinder-manage logs errors
Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT".
DEPRECATED: The log commands are deprecated since Queens and are not maintained. They will be removed in an upcoming release.
/var/log/cinder/cinder-manage.log:-
Line 14 : 2019-11-28 14:19:10.181 1649 ERROR cinder.db.sqlalchemy.api [req-f9a046ec-1386-4485-948d-3318cc74982a - - - - -] DBError detected when purging from volumes: (pymysql.err.IntegrityError) (1451, u'Cannot delete or update a parent row: a foreign key constraint fails (`cinder`.`volume_attachment`, CONSTRAINT `volume_attachment_ibfk_1` FOREIGN KEY (`volume_id`) REFERENCES `volumes` (`id`))') [SQL: u'DELETE FROM volumes WHERE volumes.deleted_at < %(deleted_at_1)s'] [parameters: {u'deleted_at_1': datetime.datetime(2019, 10, 29, 14, 19, 6, 707533)}] (Background on this error at: http://sqlalche.me/e/gkpj).: DBReferenceError: (pymysql.err.IntegrityError) (1451, u'Cannot delete or update a parent row: a foreign key constraint fails (`cinder`.`volume_attachment`, CONSTRAINT `volume_attachment_ibfk_1` FOREIGN KEY (`volume_id`) REFERENCES `volumes` (`id`))') [SQL: u'DELETE FROM volumes WHERE volumes.deleted_at < %(deleted_at_1)s'] [parameters: {u'deleted_at_1': datetime.datetime(2019, 10, 29, 14, 19, 6, 707533)}] (Background on this error at: http://sqlalche.me/e/gkpj)


Version-Release number of selected component (if applicable):
Latest

How reproducible:
Now

Steps to Reproduce:
1. Don't know yet
2.
3.

Actual results:
Fails to archive deleted rows and they are accumulating until cinder-volume stops working

Expected results:
Never fails to archive deleted rows

Additional info:

Comment 1 Eric Harney 2019-12-02 14:31:28 UTC
Same issue as reported in bug 1565460.

Comment 4 Eric Harney 2019-12-13 15:58:48 UTC
*** Bug 1565460 has been marked as a duplicate of this bug. ***

Comment 6 Andreas Karis 2020-01-08 11:18:11 UTC
The customer said that they are not yet affected by https://access.redhat.com/solutions/4619831, but given that the described issue may be a direct result of this bug, is there an ETA for this specific BZ?

Comment 7 Andreas Karis 2020-01-08 11:24:07 UTC
I see that this wasn't merged upstream, yet:
https://review.opendev.org/#/c/696676/

Can we push for this to be merged and then backport it to OSP 13?

Thanks,

Andreas

Comment 9 Andreas Karis 2020-01-10 09:14:14 UTC
Hi,

I'd still like to push for some news.

THanks,

Andreas

Comment 10 Rajat Dhasmana 2020-01-13 07:30:04 UTC
Hi Andreas,

The patch is currently under review upstream and has a +2 on it.
When it merges (can't propose an ETA), we will backport it to queens.

Thanks and Regards
Rajat Dhasmana

Comment 11 Rajat Dhasmana 2020-02-10 09:19:05 UTC
smykhail: duplicate bug has been closed
akaris: patch has been merged upstream and currently in the upstream backport process to stable branches

Comment 18 Tzach Shefi 2020-06-09 12:45:35 UTC
Verified on:
openstack-cinder-12.0.10-10.el7ost.noarch

Create a volume:
(overcloud) [stack@undercloud-0 ~]$ cinder create 4 --name Tshefi
+--------------------------------+--------------------------------------+
| Property                       | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | nova                                 |
| bootable                       | false                                |
| consistencygroup_id            | None                                 |
| created_at                     | 2020-06-09T11:32:53.000000           |
| description                    | None                                 |
| encrypted                      | False                                |
| id                             | b760df67-d786-4a00-b1a7-c3b53f57b01a |
| metadata                       | {}                                   |
| migration_status               | None                                 |
| multiattach                    | False                                |
| name                           | Tshefi                               |
| os-vol-host-attr:host          | controller-0@nfs#nfs                 |
| os-vol-mig-status-attr:migstat | None                                 |
| os-vol-mig-status-attr:name_id | None                                 |
| os-vol-tenant-attr:tenant_id   | db6d21ea62474d62811b2354371875ce     |
| replication_status             | None                                 |
| size                           | 4                                    |
| snapshot_id                    | None                                 |
| source_volid                   | None                                 |
| status                         | creating                             |
| updated_at                     | 2020-06-09T11:32:53.000000           |
| user_id                        | 9449585f76de4556b7c31d9d001f4618     |
| volume_type                    | tripleo                              |
+--------------------------------+--------------------------------------+


2. Set metadata on volume:
(overcloud) [stack@undercloud-0 ~]$ cinder metadata b760df67-d786-4a00-b1a7-c3b53f57b01a set a=b


3. Validate metadata set:
(overcloud) [stack@undercloud-0 ~]$ cinder show b760df67-d786-4a00-b1a7-c3b53f57b01a | grep meta
| metadata                       | a : b                                |

4. Manipulate DB, set a "fake" delete date, notice we didn't actuality delete the volume yet:
MariaDB [cinder]> update volumes set deleted_at='2020-03-31 07:29:14' where id='b760df67-d786-4a00-b1a7-c3b53f57b01a';

5. Now lets try to purge record db using:
cinder-manage.log reports the original problem reproduced:

()[root@controller-0 cinder]# cinder-manage db purge 0
Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT".
Purge command failed, check cinder-manage logs for more details.

2020-06-09 12:36:47.501 7421 ERROR cinder.db.sqlalchemy.api [req-54759fe4-6136-49dd-8725-be59ab46f6b6 - - - - -] DBError detected when purging from volumes: (pymysql.err.IntegrityError) (1451, u'Cannot delete or update a parent row: a foreign key constraint fails (`cinder`.`volume_metadata`, CONSTRAINT `volume_metadata_ibfk_1` FOREIGN KEY (`volume_id`) REFERENCES `volumes` (`id`))') [SQL: u'DELETE FROM volumes WHERE volumes.deleted_at < %(deleted_at_1)s'] [parameters: {u'deleted_at_1': datetime.datetime(2020, 6, 9, 12, 36, 47, 492991)}] (Background on this error at: http://sqlalche.me/e/gkpj).: DBReferenceError: (pymysql.err.IntegrityError) (1451, u'Cannot delete or update a parent row: a foreign key constraint fails (`cinder`.`volume_metadata`, CONSTRAINT `volume_metadata_ibfk_1` FOREIGN KEY (`volume_id`) REFERENCES `volumes` (`id`))') [SQL: u'DELETE FROM volumes WHERE volumes.deleted_at < %(deleted_at_1)s'] [parameters: {u'deleted_at_1': datetime.datetime(2020, 6, 9, 12, 36, 47, 492991)}] (Background on this error at: http://sqlalche.me/e/gkpj)

And well unsurprisingly as purge failed we retain the volume's DB record:
| 2020-06-09 11:32:53 | 2020-06-09 11:32:58 | 2020-03-31 07:29:14 |       0 | b760df67-d786-4a00-b1a7-c3b53f57b01a | NULL   | 9449585f76de4556b7c31d9d001f4618 | db6d21ea62474d62811b2354371875ce | controller-0@nfs#nfs |    4 | nova              | available | detached      | 2020-06-09 11:32:53 | 2020-06-09 11:32:58 | NULL          | Tshefi       | NULL                | 10.35.160.111:/export/ins_cinder | NULL          | NULL        | ef58deca-4022-47c2-b56e-8d2a5e58f1cc | NULL         |        0 | NULL              | NULL     | NULL              | NULL             | NULL               | NULL                        | NULL                    | NULL                | NULL        |           0 | NULL            | NULL         | NULL     | 2d22950a-7c10-4344-8ec0-6ed789b65369 |              0 |


6. This time delete the volume for real
(overcloud) [stack@undercloud-0 ~]$ cinder delete b760df67-d786-4a00-b1a7-c3b53f57b01a
Request to delete volume b760df67-d786-4a00-b1a7-c3b53f57b01a has been accepted.

7. Now re-purge the DB again:
()[root@controller-0 cinder]# cinder-manage db purge 0
Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT".

This time around the purge prunes the DB as expected. 

2020-06-09 12:42:22.126 7497 INFO cinder.db.sqlalchemy.api [req-fe9d5b06-54e4-4592-8560-63c8ed5f0ba4 - - - - -] Deleted 1 rows from table=volumes
2020-06-09 12:42:22.127 7497 INFO cinder.db.sqlalchemy.api [req-fe9d5b06-54e4-4592-8560-63c8ed5f0ba4 - - - - -] Purging deleted rows older than age=0 days from table=volume_types


As expected volume is purged from DB, despite having the metadata set. 


MariaDB [cinder]> MariaDB [cinder]> select * from volumes where id='b760df67-d786-4a00-b1a7-c3b53f57b01a';
Empty set (0.00 sec)

..

Comment 20 errata-xmlrpc 2020-06-24 11:51:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2722

Comment 21 Red Hat Bugzilla 2024-02-04 04:25:33 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days


Note You need to log in before you can comment on or make changes to this bug.