Description of problem:
When checking on the cell controller of e.g. cell1, the deleted cell instances don't get archived:

MariaDB [nova]> select count(*) from instances;
+----------+
| count(*) |
+----------+
|    86255 |
+----------+
1 row in set (0.021 sec)

Running instances:
MariaDB [nova]> select count(*) from instances where deleted_at is null;
+----------+
| count(*) |
+----------+
|        3 |
+----------+
1 row in set (0.023 sec)

Deleted:
MariaDB [nova]> select count(*) from instances where deleted_at is not null;
+----------+
| count(*) |
+----------+
|    86266 |
+----------+
1 row in set (0.024 sec)

(The counts differ slightly because a rally job was still running while the queries were taken.)

Version-Release number of selected component (if applicable):
OSP16

How reproducible:
always

Steps to Reproduce:
1. create instances in an additional cell
2. delete instances from the cell
3. check if the deleted instances get archived

Actual results:
The deleted instances stay in the additional cell's database and are never archived.

Expected results:
Deleted instances are archived in the additional cell databases as well.

Additional info:
we should change the current archive_deleted_rows cron command to something like:

()[root@controller-0 /]$ nova-manage db archive_deleted_rows --before `date --date='today - 2 days' +\%F` --until-complete --all-cells --verbose
Archiving....................................complete
+--------------------------------+-------------------------+
| Table                          | Number of Rows Archived |
+--------------------------------+-------------------------+
| API_DB.instance_group_member   | 0                       |
| API_DB.instance_mappings       | 11270                   |
| API_DB.request_specs           | 11270                   |
| cell1.block_device_mapping     | 7792                    |
| cell1.instance_actions         | 7792                    |
| cell1.instance_extra           | 3896                    |
| cell1.instance_id_mappings     | 3896                    |
| cell1.instance_info_caches     | 3896                    |
| cell1.instance_system_metadata | 35064                   |
| cell1.instances                | 3896                    |
| cell2.block_device_mapping     | 1684                    |
| cell2.instance_actions         | 1684                    |
| cell2.instance_actions_events  | 1684                    |
| cell2.instance_extra           | 842                     |
| cell2.instance_id_mappings     | 842                     |
| cell2.instance_info_caches     | 842                     |
| cell2.instance_system_metadata | 7578                    |
| cell2.instances                | 842                     |
| cell4.block_device_mapping     | 5734                    |
| cell4.instance_actions         | 5734                    |
| cell4.instance_actions_events  | 5734                    |
| cell4.instance_extra           | 2867                    |
| cell4.instance_id_mappings     | 2867                    |
| cell4.instance_info_caches     | 2867                    |
| cell4.instance_system_metadata | 25803                   |
| cell4.instances                | 2867                    |
| cell5.block_device_mapping     | 3194                    |
| cell5.instance_actions         | 3194                    |
| cell5.instance_actions_events  | 3194                    |
| cell5.instance_extra           | 1597                    |
| cell5.instance_id_mappings     | 1597                    |
| cell5.instance_info_caches     | 1597                    |
| cell5.instance_system_metadata | 14373                   |
| cell5.instances                | 1597                    |
| cell6.block_device_mapping     | 3594                    |
| cell6.instance_actions         | 3594                    |
| cell6.instance_actions_events  | 3594                    |
| cell6.instance_extra           | 1797                    |
| cell6.instance_id_mappings     | 1797                    |
| cell6.instance_info_caches     | 1797                    |
| cell6.instance_system_metadata | 16173                   |
| cell6.instances                | 1797                    |
| cell7.block_device_mapping     | 542                     |
| cell7.instance_actions         | 542                     |
| cell7.instance_actions_events  | 542                     |
| cell7.instance_extra           | 271                     |
| cell7.instance_id_mappings     | 271                     |
| cell7.instance_info_caches     | 271                     |
| cell7.instance_system_metadata | 2439                    |
| cell7.instances                | 271                     |
+--------------------------------+-------------------------+
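For reference, the --before cutoff in the command above is just a `date` substitution; a minimal shell sketch of what it expands to (note that the backslash in `+\%F` is cron syntax rather than shell syntax: cron treats an unescaped % as a line separator, so crontab entries must write \%):

```shell
#!/bin/sh
# Compute the cutoff passed to --before: two days ago, in ISO YYYY-MM-DD
# format (%F). Interactively the % needs no escaping; only inside a crontab
# entry does it have to be written as \%.
cutoff=$(date --date='today - 2 days' +%F)
echo "archiving rows deleted before ${cutoff}"
```

Keeping a two-day margin means rows deleted very recently stay in the main tables, which is what protects against archiving an instance whose compute host has not yet processed the deletion.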
current cron jobs:

[root@controller-0 cron]# cat /var/lib/config-data/puppet-generated/nova/var/spool/cron/nova
# HEADER: This file was autogenerated at 2019-11-19 11:16:27 +0000 by puppet.
# HEADER: While it can still be managed manually, it is definitely not recommended.
# HEADER: Note particularly that the comments starting with 'Puppet Name' should
# HEADER: not be deleted, as doing so could cause duplicate cron jobs.
# Puppet Name: nova-manage db archive_deleted_rows
PATH=/bin:/usr/bin:/usr/sbin
SHELL=/bin/sh
1 0 * * * sleep `expr ${RANDOM} \% 3600`; nova-manage db archive_deleted_rows --max_rows 100 --until-complete >>/var/log/nova/nova-rowsflush.log 2>&1
# Puppet Name: nova-manage db purge
PATH=/bin:/usr/bin:/usr/sbin
SHELL=/bin/sh
0 5 * * * sleep `expr ${RANDOM} \% 3600`; nova-manage db purge --before `date --date='today - 14 days' +\%D` >>/var/log/nova/nova-rowspurge.log 2>&1
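Putting the proposed flags together, the updated archive entry could look something like the following. This is a sketch only, keeping the existing preamble, random jitter, and log file from the Puppet-generated crontab above; in practice the entry would be changed through the Puppet template rather than edited by hand:

```
# Puppet Name: nova-manage db archive_deleted_rows
PATH=/bin:/usr/bin:/usr/sbin
SHELL=/bin/sh
1 0 * * * sleep `expr ${RANDOM} \% 3600`; nova-manage db archive_deleted_rows --before `date --date='today - 2 days' +\%F` --until-complete --all-cells --max_rows 100 >>/var/log/nova/nova-rowsflush.log 2>&1
```

The purge entry would similarly gain --all-cells so the cell databases are purged along with the main one.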
I agree that we need to add the --before and --all-cells options to the 'nova-manage db archive_deleted_rows' and 'nova-manage db purge' commands in our cron jobs. We need --before to prevent the orphaning of libvirt guests if/when nova-compute is down when a db archive cron job fires, and we need --all-cells to (1) ensure the cell0 database is archived in a single-cell deployment and (2) ensure additional cell databases are archived in a multi-cell deployment.

These are the rhbz's (which are also cloned for OSP15, OSP14, and OSP13) that could be added as dependencies for this rhbz:

--before: https://bugzilla.redhat.com/show_bug.cgi?id=1749382
--all-cells: https://bugzilla.redhat.com/show_bug.cgi?id=1703091

Also, I had just opened this rhbz yesterday to add the --all-cells option to our cron jobs: https://bugzilla.redhat.com/show_bug.cgi?id=1778905 and I think you could close it as a duplicate of this rhbz and let Paras know, as he should be the QE contact here.
*** Bug 1778905 has been marked as a duplicate of this bug. ***
If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field. The documentation team will review, edit, and approve the text. If this bug does not require doc text, please set the 'requires_doc_text' flag to '-'.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0655