Description of problem:

For a reason that is not yet understood, we can end up with duplicate primary keys across the prod and shadow nova tables, which makes the cleanup cron job fail:

~~~
1 */12 * * * nova-manage db archive_deleted_rows --max_rows 100 >>/dev/null 2>&1
~~~

This was reported on OSP9 and OSP11.

In the broken state we have:

- 1062 deleted instances in the non-shadow tables:

~~~
# mysql -u root nova -e "select count(*) from instances where deleted_at is not null\G;"
*************************** 1. row ***************************
count(*): 1062
~~~

- 1250 deleted instances that were previously migrated from the prod tables to the shadow tables:

~~~
# mysql -u root nova -e "select count(*) from shadow_instances\G;"
*************************** 1. row ***************************
count(*): 1250
~~~

- running the job manually fails with a duplicate primary key '3' in the shadow_reservations table:

~~~
# su - nova -s /bin/bash -c "nova-manage db archive_deleted_rows --max_rows 100"
Option "verbose" from group "DEFAULT" is deprecated for removal. Its value may be silently ignored in the future.
Option "notification_topics" from group "DEFAULT" is deprecated. Use option "topics" from group "oslo_messaging_notifications".
error: (pymysql.err.IntegrityError) (1062, u"Duplicate entry '3' for key 'PRIMARY'") [SQL: u'INSERT INTO shadow_reservations (created_at, updated_at, deleted_at, deleted, id, uuid, usage_id, project_id, user_id, resource, delta, expire) SELECT reservations.created_at, reservations.updated_at, reservations.deleted_at, reservations.deleted, reservations.id, reservations.uuid, reservations.usage_id, reservations.project_id, reservations.user_id, reservations.resource, reservations.delta, reservations.expire \nFROM reservations \nWHERE reservations.deleted != %s ORDER BY reservations.id \n LIMIT %s'] [parameters: (0, 100)]
~~~

Checking id='3' in the shadow_reservations and reservations tables:

~~~
# mysql -u root nova -e "select * from shadow_reservations where id='3'\G;"
*************************** 1. row ***************************
created_at: 2017-05-31 18:44:58
updated_at: NULL
deleted_at: 2017-05-31 18:44:58
        id: 3
      uuid: 693727f4-a06e-4b0f-be68-9696e6367e1b
  usage_id: 3
project_id: 1a03f74536e0494681379940c80cf994
  resource: instances
     delta: 1
    expire: 2017-06-01 18:44:58
   deleted: 3
   user_id: 2f33b551cc1d4d3ea85880d1f5e07428

# mysql -u root nova -e "select * from reservations where id='3'\G;"
*************************** 1. row ***************************
created_at: 2017-07-05 18:38:46
updated_at: NULL
deleted_at: 2017-07-05 18:38:49
        id: 3
      uuid: e9d7983b-9413-4b6b-894b-23828a7a3e93
  usage_id: 180
project_id: 29417c4074cf40699337f392c20b35ee
  resource: instances
     delta: -1
    expire: 2017-07-06 18:38:46
   deleted: 3
   user_id: 45aadbf32f8da0be1ff49749f2e80e96a117b125e68ebc65b2f422665011325e
~~~

The primary key sequence seems to have restarted from the beginning at some point. Looking at the last entry in the shadow_reservations table, the last successful run was a day before the new duplicate entry was created in the reservations table:

~~~
*************************** 6350. row ***************************
created_at: 2017-07-04 18:33:53
updated_at: NULL
deleted_at: 2017-07-04 18:33:53
        id: 17907
      uuid: c468f4cc-7e84-43cf-8a33-0b54f69bcab1
  usage_id: 186
project_id: 29417c4074cf40699337f392c20b35ee
  resource: cores
     delta: 1
    expire: 2017-07-05 18:33:53
   deleted: 17907
   user_id: 45aadbf32f8da0be1ff49749f2e80e96a117b125e68ebc65b2f422665011325e
~~~

Version-Release number of selected component (if applicable):
openstack-nova-13.1.4-7

How reproducible:
unknown at the moment

Actual results:

Expected results:

Additional info:
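To see how far the overlap between a prod table and its shadow copy goes, a query along these lines can be used (a sketch, written for reservations only; the same join works for any of the affected table pairs):

~~~
-- Sketch: list primary keys present in both the prod table and its shadow
-- copy, i.e. the rows that would trigger ER_DUP_ENTRY on the next archive run.
SELECT r.id
FROM reservations r
JOIN shadow_reservations s ON s.id = r.id
WHERE r.deleted != 0
ORDER BY r.id;
~~~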
I was able to reproduce the duplicate primary key situation by starting the archive job on two nodes at the same time, but that does not lead to the situation where information is moved to the shadow tables and the primary key starts fresh.

~~~
# su - nova -s /bin/bash -c "nova-manage --verbose db archive_deleted_rows --max_rows 100 --until-complete"
Option "verbose" from group "DEFAULT" is deprecated for removal. Its value may be silently ignored in the future.
An error has occurred:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nova/cmd/manage.py", line 1606, in main
    ret = fn(*fn_args, **fn_kwargs)
  File "/usr/lib/python2.7/site-packages/nova/cmd/manage.py", line 681, in archive_deleted_rows
    run = db.archive_deleted_rows(max_rows)
  File "/usr/lib/python2.7/site-packages/nova/db/api.py", line 2040, in archive_deleted_rows
    return IMPL.archive_deleted_rows(max_rows=max_rows)
  File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 6564, in archive_deleted_rows
    tablename, max_rows=max_rows - total_rows_archived)
  File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 6513, in _archive_deleted_rows_for_table
    conn.execute(insert)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 914, in execute
    return meth(self, multiparams, params)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py", line 323, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1010, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1146, in _execute_context
    context)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1337, in _handle_dbapi_exception
    util.raise_from_cause(newraise, exc_info)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 200, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
    context)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 450, in do_execute
    cursor.execute(statement, parameters)
  File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 166, in execute
    result = self._query(query)
  File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 322, in _query
    conn.query(q)
  File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 841, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1029, in _read_query_result
    result.read()
  File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1312, in read
    first_packet = self.connection._read_packet()
  File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 991, in _read_packet
    packet.check_error()
  File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 393, in check_error
    err.raise_mysql_exception(self._data)
  File "/usr/lib/python2.7/site-packages/pymysql/err.py", line 107, in raise_mysql_exception
    raise errorclass(errno, errval)
DBDuplicateEntry: (pymysql.err.IntegrityError) (1062, u"Duplicate entry '4553' for key 'PRIMARY'") [SQL: u'INSERT INTO shadow_instance_actions_events (created_at, updated_at, deleted_at, deleted, id, event, action_id, start_time, finish_time, result, traceback, host, details) SELECT instance_actions_events.created_at, instance_actions_events.updated_at, instance_actions_events.deleted_at, instance_actions_events.deleted, instance_actions_events.id, instance_actions_events.event, instance_actions_events.action_id, instance_actions_events.start_time, instance_actions_events.finish_time, instance_actions_events.result, instance_actions_events.traceback, instance_actions_events.host, instance_actions_events.details \nFROM instance_actions_events \nWHERE instance_actions_events.deleted != %(deleted_1)s ORDER BY instance_actions_events.id \n LIMIT %(param_1)s'] [parameters: {u'param_1': 100, u'deleted_1': 0}]
~~~

This issue was also reported as seen in the nova-rowsflush.log, but is likely related to the fact that we start the archive jobs on all controllers at the same time. There is no random delay, unlike the keystone token_flush cron job, which we create via keystone::cron::token_flush::maxdelay with:

~~~
if $maxdelay == 0 {
  $sleep = ''
} else {
  $sleep = "sleep `expr \${RANDOM} \\% ${maxdelay}`; "
}
~~~
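An analogous randomized startup delay for the nova archive cron entry could be sketched in plain shell (the 12-hour maxdelay value below is an assumed example, not taken from any deployment):

```shell
# Sketch: pick a random delay in [0, maxdelay) before starting the archive
# job, so the controllers do not all hit the database at the same moment.
maxdelay=43200                                      # assumed example: up to 12h
rand=$(od -An -N2 -tu2 /dev/urandom | tr -d ' ')    # random integer 0..65535
delay=$(( rand % maxdelay ))
echo "$delay"
# sleep "$delay" && nova-manage db archive_deleted_rows --max_rows 100
```

The `od`-based source is used instead of `$RANDOM` so the sketch also works under a plain POSIX sh.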
Was able to reproduce with the following sequence:

1) create x instances
2) delete x instances
3) archive the instances
4) restart mariadb
5) create a new instance
6) delete the instance
7) archive the instance

=>

~~~
# su - nova -s /bin/bash -c "nova-manage db archive_deleted_rows --max_rows 100"
Option "verbose" from group "DEFAULT" is deprecated for removal. Its value may be silently ignored in the future.
Option "notification_topics" from group "DEFAULT" is deprecated. Use option "topics" from group "oslo_messaging_notifications".
error: (pymysql.err.IntegrityError) (1062, u"Duplicate entry '1' for key 'PRIMARY'") [SQL: u'INSERT INTO shadow_instance_actions_events (created_at, updated_at, deleted_at, deleted, id, event, action_id, start_time, finish_time, result, traceback, host, details) SELECT instance_actions_events.created_at, instance_actions_events.updated_at, instance_actions_events.deleted_at, instance_actions_events.deleted, instance_actions_events.id, instance_actions_events.event, instance_actions_events.action_id, instance_actions_events.start_time, instance_actions_events.finish_time, instance_actions_events.result, instance_actions_events.traceback, instance_actions_events.host, instance_actions_events.details \nFROM instance_actions_events \nWHERE instance_actions_events.deleted != %s ORDER BY instance_actions_events.id \n LIMIT %s'] [parameters: (0, 100)]
~~~

From discussing this with Melanie Witt:

> While I was looking into this, I learned that the InnoDB engine stores its
> AUTO_INCREMENT counter only in memory, and after a restart it selects the
> maximum value and increments it by one to initialize the counter again.
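Melanie's observation can be demonstrated on a scratch InnoDB table (a sketch; the table `t` is made up and is not part of the nova schema):

~~~
CREATE TABLE t (id INT AUTO_INCREMENT PRIMARY KEY) ENGINE=InnoDB;
INSERT INTO t VALUES (NULL), (NULL), (NULL);  -- rows get ids 1, 2, 3
DELETE FROM t WHERE id > 1;                   -- analogue of archive-then-delete
-- ... restart mysqld here ...
-- on startup InnoDB re-initializes the counter to MAX(id) + 1 = 2
INSERT INTO t VALUES (NULL);                  -- new row gets id 2 again, an id
                                              -- a shadow copy may already hold
~~~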
Martin, is the customer okay with just dumping the shadow tables before each archive run? Because that's the easy workaround here. The alternative is ... not easy.
(In reply to Dan Smith from comment #3)
> Martin, is the customer okay with just dumping the shadow tables before each
> archive run? Because that's the easy workaround here. The alternative is ...
> not easy.

Yes, I think that should be fine. We might want to back up the tables first, in case any of the information is needed at some later point, e.g.:

~~~
DATE=$(date -I)
mkdir -p /srv/$DATE
# -N -s suppresses the column-header line, so the loop only sees real table names
for i in $(mysql -N -s -u root nova -e "show tables like 'shadow%';"); do
    echo "backup and truncate table $i"
    mysqldump -u root nova $i > /srv/$DATE/$i && mysql -u root nova -e "truncate table $i;"
done
~~~
I'm closing this as a duplicate of bz1154875, which covers the DB purging functionality that has been merged in master and will be released in OSP14.

*** This bug has been marked as a duplicate of bug 1154875 ***
(In reply to Martin Schuppert from comment #2)
> Was able to reproduce with the following sequence:
>
> 1) create x instances
> 2) delete x instances
> 3) archive the instances
> 4) restart mariadb

In case it's not clear to others (because it took me a while to figure out), this is the problematic step...

> 5) create new instance
> 6) delete instance
> 7) archive the instance
> =>
> # su - nova -s /bin/bash -c "nova-manage db archive_deleted_rows --max_rows
> 100"
> Option "verbose" from group "DEFAULT" is deprecated for removal. Its value
> may be silently ignored in the future.
> Option "notification_topics" from group "DEFAULT" is deprecated. Use option
> "topics" from group "oslo_messaging_notifications".
> error: (pymysql.err.IntegrityError) (1062, u"Duplicate entry '1' for key
> 'PRIMARY'") [SQL: u'INSERT INTO shadow_instance_actions_events (created_at,
> updated_at, deleted_at, deleted, id, event, action_id, start_time,
> finish_time, result, traceback, host, details) SELECT
> instance_actions_events.created_at, instance_actions_events.updated_at,
> instance_actions_events.deleted_at, instance_actions_events.deleted,
> instance_actions_events.id, instance_actions_events.event,
> instance_actions_events.action_id, instance_actions_events.start_time,
> instance_actions_events.finish_time, instance_actions_events.result,
> instance_actions_events.traceback, instance_actions_events.host,
> instance_actions_events.details \nFROM instance_actions_events \nWHERE
> instance_actions_events.deleted != %s ORDER BY instance_actions_events.id \n
> LIMIT %s'] [parameters: (0, 100)]
>
> From discussing this with Melanie Witt:
> > While I was looking into this, I learned that the InnoDB engine stores its
> > AUTO_INCREMENT counter only in memory, and after a restart it selects the
> > maximum value and increments it by one to initialize the counter again.
...because of this ^

As noted, InnoDB stores the AUTO_INCREMENT counter only in memory, and after a restart it selects the maximum value in the table and increments it by one to initialize the counter again.

By archiving instances, we copied lots of rows from the prod tables into the shadow tables and then deleted those rows from the prod tables. After restarting mariadb, the AUTO_INCREMENT counter gets re-initialized and may be set to a lower value than before, since some of the rows with the highest IDs may have been archived. This isn't an issue for the prod tables themselves, since they no longer contain entries with IDs greater than the counter (those rows were deleted), but it is an issue for the shadow tables, which still do.

There are a few possible solutions: we could manually set the AUTO_INCREMENT start values for the prod tables based on the max value in the corresponding shadow table, possibly by way of an 'init_file' [1]; we could use a different engine that doesn't have this in-memory AUTO_INCREMENT counter behavior (I'm not aware of any); or we could suggest people either never archive or always archive and purge. None of those are particularly attractive resolutions, but the third one is the least invasive and therefore the one we've opted for here.

[1] https://dev.mysql.com/doc/refman/5.6/en/server-system-variables.html#sysvar_init_file
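For the first option, the re-seeding would have to be done per table with a literal value, since ALTER TABLE ... AUTO_INCREMENT does not accept a subquery; a sketch for the reservations pair, using the id 17907 seen in the shadow table earlier in this report:

~~~
-- Sketch: push the prod table's counter past everything already archived.
SELECT MAX(id) FROM shadow_reservations;          -- 17907 in the example above
ALTER TABLE reservations AUTO_INCREMENT = 17908;  -- MAX(id) + 1
~~~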