Bug 1704338

Summary: [RFE] Implement date-based archive-deleted-rows
Product: Red Hat OpenStack Reporter: Priscila <pveiga>
Component: openstack-nova Assignee: OSP DFG:Compute <osp-dfg-compute>
Status: CLOSED DUPLICATE QA Contact: OSP DFG:Compute <osp-dfg-compute>
Severity: medium Docs Contact:
Priority: medium    
Version: 13.0 (Queens) CC: alifshit, dasmith, egallen, eglynn, jhakimra, jjoyce, jschluet, kchamart, lyarwood, mschuppe, nlevinki, sbauza, sgordon, slinaber, tvignaud, vromanso
Target Milestone: --- Keywords: FutureFeature, Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-10-04 15:42:15 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Priscila 2019-04-29 15:16:59 UTC
Description of problem:

Recently we noticed that our database was growing too fast and making the environment very slow. Further investigation showed that the nova archive job that runs in the "nova_api_cron" containers is not working.

We have since run it manually and it worked:
/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py:332: NotSupportedWarning: Configuration option(s) ['use_tpool'] not supported
  exception.NotSupportedWarning
+-------------------------+-------------------------+
| Table                   | Number of Rows Archived |
+-------------------------+-------------------------+
| instance_actions_events | 5000                    |
+-------------------------+-------------------------+

Version-Release number of selected component (if applicable):

As a workaround, we manually purge the DB:

'nova-manage db archive_deleted_rows --all-cells --purge --verbose --until-complete'

It seems the nova-api cron job does not run in containers.

Bugs related:
 https://bugzilla.redhat.com/show_bug.cgi?id=1500362 and https://bugzilla.redhat.com/show_bug.cgi?id=1154875

How reproducible: Always


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results: No need to run the purge manually


Additional info:

Comment 2 Matthew Booth 2019-05-17 12:33:08 UTC
Cron from an OSP13 system:

    ()[root@controller-0 /]$ cat /var/spool/cron/nova
    # HEADER: This file was autogenerated at 2019-05-15 12:27:58 +0000 by puppet.
    # HEADER: While it can still be managed manually, it is definitely not recommended.
    # HEADER: Note particularly that the comments starting with 'Puppet Name' should
    # HEADER: not be deleted, as doing so could cause duplicate cron jobs.
    # Puppet Name: nova-manage db archive_deleted_rows
    PATH=/bin:/usr/bin:/usr/sbin SHELL=/bin/sh
    1 0 * * * nova-manage db archive_deleted_rows --max_rows 100  >>/var/log/nova/nova-rowsflush.log 2>&1

Comment 3 Matthew Booth 2019-05-17 12:43:21 UTC
I can see from the cron logs that the job is running successfully. It looks like the issue is that we're defaulting to archiving only 100 rows per day. This should almost certainly be time based rather than row based.
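
To illustrate why a fixed row cap can fall behind, here is a rough back-of-the-envelope sketch. Only the 100 rows per daily run comes from the cron line in comment 2; the backlog size and the daily rate of newly deleted rows are hypothetical figures, and `days_to_clear` is a made-up helper, not part of nova.

```python
import math


def days_to_clear(backlog_rows, rows_per_run, new_deleted_per_day=0):
    """Days a once-a-day archive job needs to drain a backlog.

    Returns None when the job can never catch up.
    """
    if backlog_rows <= 0:
        return 0
    # Net progress per day: rows archived minus rows newly soft-deleted.
    net = rows_per_run - new_deleted_per_day
    if net <= 0:
        return None
    return math.ceil(backlog_rows / net)


# Hypothetical backlog of 5000 rows, nothing new being deleted.
print(days_to_clear(5000, 100))       # 50
# Hypothetical: 150 rows soft-deleted per day, only 100 archived.
print(days_to_clear(5000, 100, 150))  # None
```

With a cap that is lower than the rate at which rows are soft-deleted, the backlog only grows, which matches the database growth described in the report.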

Comment 4 Matthew Booth 2019-05-17 13:28:23 UTC
Note that while purge supports time-based deletion in master, archive_deleted_rows does not. However, it could.
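
As a minimal sketch of what date-based archiving could look like, the following moves soft-deleted rows older than a cutoff into a shadow table. This is an illustration against an in-memory SQLite database, not nova's implementation: the `archive_before` helper, the reduced three-column schema and the sample rows are all assumptions made for the example.

```python
import sqlite3


def archive_before(conn, table, cutoff):
    """Move soft-deleted rows with deleted_at < cutoff into shadow_<table>.

    Illustration only; table names are trusted here, not user input.
    """
    shadow = "shadow_" + table
    where = "WHERE deleted != 0 AND deleted_at < ?"
    with conn:  # single transaction: copy first, then delete
        conn.execute(
            f"INSERT INTO {shadow} SELECT * FROM {table} {where}", (cutoff,))
        cur = conn.execute(f"DELETE FROM {table} {where}", (cutoff,))
    return cur.rowcount


conn = sqlite3.connect(":memory:")
for name in ("instances", "shadow_instances"):
    conn.execute(
        f"CREATE TABLE {name} (id INTEGER, deleted INTEGER, deleted_at TEXT)")
conn.executemany("INSERT INTO instances VALUES (?, ?, ?)", [
    (1, 1, "2019-01-10 00:00:00"),  # soft-deleted long ago
    (2, 2, "2019-05-01 00:00:00"),  # soft-deleted recently
    (3, 0, None),                   # still live
])

archived = archive_before(conn, "instances", "2019-04-01 00:00:00")
print(archived)  # 1
```

The point of the cutoff is that the amount of work per run follows the age of the data rather than an arbitrary row count, so a daily job keeps up regardless of how many rows were deleted that day.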

Comment 5 Matthew Booth 2019-05-17 14:14:30 UTC
Martin, please can you describe a workaround for how to bump this until we can do it properly?

Comment 7 Martin Schuppert 2019-05-20 09:30:32 UTC
(In reply to Matthew Booth from comment #5)
> Martin, please can you describe a workaround for how to bump this until we
> can do it properly?

First of all, two things:
1) The puppet-nova default for max_rows is 100, while the nova-manage default is 1000 [1].
2) '--all-cells' and '--purge' are not options for archive_deleted_rows in OSP13.

> nova-manage db archive_deleted_rows --all-cells --purge --verbose --until-complete

()[root@controller-0 /]$ nova-manage db archive_deleted_rows --help
usage: nova-manage db archive_deleted_rows [-h] [--max_rows <number>]
                                           [--verbose] [--until-complete]

optional arguments:
  -h, --help           show this help message and exit
  --max_rows <number>  Maximum number of deleted rows to archive. Defaults to
                       1000.
  --verbose            Print how many rows were archived per table.
  --until-complete     Run continuously until all deleted rows are archived.
                       Use max_rows as a batch size for each iteration.

So what probably made the manual run succeed is the larger number of rows combined with --until-complete.
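
The difference between a single capped batch and --until-complete can be sketched as a small simulation. This is a simplified model based only on the help text above (max_rows as the batch size for each iteration); `archive_run` is a hypothetical helper and the backlog of 5000 rows is an assumed figure.

```python
def archive_run(backlog, max_rows, until_complete=False):
    """Simulate one invocation; return (rows archived, rows left)."""
    archived = 0
    while backlog > 0:
        batch = min(max_rows, backlog)
        backlog -= batch
        archived += batch
        if not until_complete:
            break  # a plain run stops after one batch
    return archived, backlog


# One capped batch, as in the default cron job.
print(archive_run(5000, 100))                        # (100, 4900)
# Batches of 1000 repeated until nothing is left.
print(archive_run(5000, 1000, until_complete=True))  # (5000, 0)
```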

We can configure the archive_deleted_rows cron job in OSP13 using the following parameters for when it should run:

  NovaCronDBArchivedMinute:
    type: string
    description: >
        Cron to move deleted instances to another table that doesn't need backup - Minute
    default: '1'
  NovaCronDBArchivedHour:
    type: string
    description: >
        Cron to move deleted instances to another table that doesn't need backup - Hour
    default: '0'
  NovaCronDBArchivedMonthday:
    type: string
    description: >
        Cron to move deleted instances to another table that doesn't need backup - Month Day
    default: '*'
  NovaCronDBArchivedMonth:
    type: string
    description: >
        Cron to move deleted instances to another table that doesn't need backup - Month
    default: '*'
  NovaCronDBArchivedWeekday:
    type: string
    description: >
        Cron to move deleted instances to another table that doesn't need backup - Week Day
    default: '*'

In addition, we can override the default hiera values for max_rows and/or until_complete:
#  [*max_rows*]
#    (optional) Maximum number of deleted rows to archive.
#    Defaults to '100'.
#
#  [*until_complete*]
#    (optional) Adds --until-complete to the archive command
#    Defaults to false.

I'd try using:
    ControllerExtraConfig:
        nova::cron::archive_deleted_rows::max_rows: 1000
        nova::cron::archive_deleted_rows::until_complete: True
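
For orientation, here is a sketch of the crontab entry these settings would be expected to produce. The command template is an assumption inferred from the cron line shown in comment 2, not taken from the puppet-nova source, and `cron_line` is a hypothetical helper; check the generated /var/spool/cron/nova after deployment rather than relying on this.

```python
def cron_line(minute="1", hour="0", monthday="*", month="*", weekday="*",
              max_rows=100, until_complete=False,
              destination="/var/log/nova/nova-rowsflush.log"):
    """Build a crontab entry modelled on the one in comment 2 (assumed template)."""
    # An empty flag leaves a double space, as seen in the OSP13 cron file.
    flag = "--until-complete" if until_complete else ""
    schedule = " ".join([minute, hour, monthday, month, weekday])
    return (f"{schedule} nova-manage db archive_deleted_rows "
            f"--max_rows {max_rows} {flag} >>{destination} 2>&1")


# Defaults: the NovaCronDBArchived* schedule defaults and max_rows 100.
print(cron_line())
# With the suggested ControllerExtraConfig overrides.
print(cron_line(max_rows=1000, until_complete=True))
```
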

[1] https://github.com/openstack/nova/blob/stable/queens/nova/cmd/manage.py#L493

Comment 8 Priscila 2019-05-22 13:41:32 UTC
We have recently upgraded the cloud to the latest and it seems that the crontask works now. However, we still have the issue that the "nova_cell0" database does not get archived/cleaned up.

Comment 9 Martin Schuppert 2019-05-22 13:53:19 UTC
(In reply to Priscila from comment #8)
> We have recently upgraded the cloud to the latest and it seems that the
> crontask works now. However, we still have the issue that the "nova_cell0"
> database does not get archived/cleaned up.

Check [1] and [2] regarding nova_cell0.

[1] https://access.redhat.com/solutions/4088311
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1703091

Comment 10 Artom Lifshitz 2022-10-04 15:42:15 UTC

*** This bug has been marked as a duplicate of bug 1763329 ***