Bug 1245033 - Openstack DB size & cleanup
Summary: Openstack DB size & cleanup
Keywords:
Status: CLOSED EOL
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: mariadb-galera
Version: 5.0 (RHEL 7)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 5.0 (RHEL 7)
Assignee: Michael Bayer
QA Contact: yeylon@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-07-21 05:25 UTC by Dave Maley
Modified: 2019-07-11 09:40 UTC
CC: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-08-12 20:40:54 UTC
Target Upstream Version:
Embargoed:



Description Dave Maley 2015-07-21 05:25:06 UTC
Description of problem:
The customer's MySQL DB file has grown to 53 GB even though the environment has a limited number of tenants and VMs.  It was determined this was due to the size of the keystone token table (nearly 3,000,000 entries).  Attempts to free space using 'keystone-manage token_flush' failed, and when our developers who were onsite attempted a similar operation they encountered locking errors.

It was suggested that the customer could truncate the table to remedy the situation; however, they requested that the suggestion be run by our MariaDB experts.

The customer is also requesting guidance on the best way to keep the table cleaned up on an ongoing basis.
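
For anyone retracing the diagnosis, a query along these lines shows where the space is going (a sketch, assuming the Keystone database is named "keystone"; table_rows is only approximate for InnoDB):

    -- per-table on-disk footprint, largest first
    SELECT table_name,
           ROUND((data_length + index_length) / 1024 / 1024) AS size_mb,
           table_rows AS approx_rows
    FROM information_schema.tables
    WHERE table_schema = 'keystone'
    ORDER BY (data_length + index_length) DESC;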

Comment 2 Dave Maley 2015-07-21 05:30:29 UTC
> Can we get a confirmation from someone about if truncate of tokens table in
> keystone database would  help us reduce the size of this DB?

So yes, a TRUNCATE would definitely reduce the size of the database.  However, the actual ibdata1 file that was previously diagnosed as very large on the filesystem would probably not shrink; this file only ever grows and is never shrunk.  To make the actual ibdata1 file smaller, the whole database would need to be dumped and restored into an empty data directory (I can provide instructions on this).  That said, you would certainly want to get rid of the millions of unwanted rows before you do this.

One thing to consider with TRUNCATE is that it blanks out the entire table; that is, any keystone tokens that are not yet expired would also be deleted.  The keystone-manage command, by contrast, limits the rows it deletes to those whose "expires" value is prior to the current time.  If the customer is OK with losing all non-expired tokens then a plain TRUNCATE would be fine.  However, there are other ways to get around this; see below.
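
To make the difference concrete, a rough sketch of the two operations discussed above (the DELETE approximates what keystone-manage token_flush does, per the description above):

    -- token_flush-style cleanup: remove only rows that are already expired
    DELETE FROM token WHERE expires < UTC_TIMESTAMP();

    -- TRUNCATE: empty the whole table, expired and unexpired tokens alike
    TRUNCATE TABLE token;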

> Also, is there a procedure that we need to follow when this table is 
> truncated? I mean do we need to shutdown any particular services before
> we do this?

So there are actually some options here.  TRUNCATE, since it falls under data definition language (DDL) in MySQL, needs to lock the entire table to do its work; once it has that lock, it is extremely fast.  So if you do a straight TRUNCATE of the table while Keystone is still running and using it, there might be a delay while TRUNCATE waits to acquire the lock.  You would probably be fine with Keystone running, though if shutting it down is an option I'd prefer that, as it at least guarantees that the TRUNCATE can lock the table and proceed immediately.

So the straight TRUNCATE approach is fine if the customer can lose non-expired tokens and can also tolerate a potential hiccup, since Keystone operations that proceed while the TRUNCATE takes place may see data they just persisted vanish.

The other option here is a common technique for large MySQL data operations: use table renames to swap the very large table out to a throwaway name and put a freshly created replacement in its place; the renamed-away table can then be TRUNCATEd safely or, more likely, just dropped entirely.  Since the number of unexpired tokens in this table is likely much smaller than the number of expired ones (I can give you a query that shows this comparison if you need one), you would follow these steps (a SQL sketch of the whole sequence follows the list):

1. create a new table with CREATE TABLE matching the definition of the current "token" table, under an alternate name, e.g. "token_new" (I can provide the exact statement if you want to try this approach).

2. INSERT rows into the new table that are SELECTed from the existing "token" table, with criteria that selects only the *unexpired* rows; e.g. "INSERT INTO token_new (id, expires, extra, valid, user_id, trust_id) SELECT id, expires, extra, valid, user_id, trust_id FROM token WHERE expires > UTC_TIMESTAMP()"

3. issue RENAME statements to rename token to token_old and token_new to token; as it turns out, MySQL has an extremely handy syntax to do this in one step, "RENAME TABLE token TO token_old, token_new TO token", and it even does it atomically using table locks, so the rest of the environment never sees the "token" table missing.

4. you can then TRUNCATE or DROP token_old, as this table is no longer used.
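
Putting those steps together, a sketch of the full sequence (run against the keystone database; "token_new" and "token_old" are just placeholder names):

    -- optional: compare expired vs. unexpired counts before starting
    SELECT SUM(expires <= UTC_TIMESTAMP()) AS expired_rows,
           SUM(expires >  UTC_TIMESTAMP()) AS unexpired_rows
    FROM token;

    -- 1. empty copy of the token table, same definition, alternate name
    CREATE TABLE token_new LIKE token;

    -- 2. copy only the unexpired rows into the new table
    INSERT INTO token_new (id, expires, extra, valid, user_id, trust_id)
    SELECT id, expires, extra, valid, user_id, trust_id
    FROM token
    WHERE expires > UTC_TIMESTAMP();

    -- 3. atomic swap; the "token" name is never missing from the schema
    RENAME TABLE token TO token_old, token_new TO token;

    -- 4. the old, bloated table is no longer referenced and can be dropped
    DROP TABLE token_old;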

With the above approach, the Keystone API service would be able to keep running without issue, and you would retain all of your unexpired tokens, with the edge-case exception of tokens that were added to "token" between the INSERT and the RENAME; if you really wanted to, you could locate those tokens in "token_old" and INSERT them into "token" before dropping it, depending on how critical it is to retain every token.
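
If those in-between tokens do matter, a sketch of the backfill just described, to be run before token_old is dropped (copies unexpired rows that exist only in the old table):

    INSERT INTO token (id, expires, extra, valid, user_id, trust_id)
    SELECT o.id, o.expires, o.extra, o.valid, o.user_id, o.trust_id
    FROM token_old AS o
    LEFT JOIN token AS t ON t.id = o.id
    WHERE t.id IS NULL
      AND o.expires > UTC_TIMESTAMP();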

The RENAME TABLE operation also wants to lock the table fully, so again, keeping Keystone running means the operation might be delayed while it waits to acquire a full table lock.  I'd be less concerned about keeping Keystone running with the RENAME approach than with the TRUNCATE approach, because the RENAME itself, once it acquires the lock, should complete with no delay at all.  Note that Galera will replicate the operation (either TRUNCATE or RENAME) to the other nodes *before* it actually runs the DDL locally; this is due to the mechanics of how it replicates non-transactional DDL statements.  The effect is that the operation should occur on all nodes at basically the same time.

Whichever approach we choose, if playing it safe is a priority here, it may be worthwhile to run the whole operation on a separate environment first, ensuring that all the operations succeed quickly and keystone remains unimpacted.

Comment 3 Dave Maley 2015-07-21 05:34:17 UTC
Comment 2 is mbayer's response to the customer's initial questions.  Below is the customer's response to comment 2:

-----
1. For the initial cleanup, I am OK with option 2 so that we have minimal disruption to the system. We can work with the rename approach.

I am concerned about the detail that the ibdata file will not shrink and that it requires a restore into an empty DB. I need very clear details on how to do this and what it means for our running system.

Also, is the file size growth limited if we run nightly token cleanup jobs, or is this file expected to grow no matter what type of cleanup we do?

We need a solution that can achieve this with minimal (if not zero) disruption, as we have a running production system.
-----

Comment 4 Michael Bayer 2015-07-21 16:49:30 UTC
So with regard to the ibdata1 file, I'm going to try an experiment locally to see if we can take advantage of Galera's replication to do an easier, no-downtime version of the ibdata1 shrink operation, where each node is taken offline individually.

As far as file growth with nightly jobs goes, the answer is yes: if you prevent the database itself from growing too large, then the ibdata1 file will also not grow beyond a certain size.  *However*, there is also the option of setting innodb_file_per_table, which changes the engine to generate an individual datafile for each table, no longer using ibdata1 as the primary data store (it is still used for internal data such as the undo logs).  This is the common technique for managing ibdata1 files that would otherwise grow unbounded, so it should be considered.
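
A sketch of what that setting change looks like (file path may vary; the cron line is purely illustrative of a nightly token_flush job):

    # e.g. /etc/my.cnf.d/server.cnf -- affects only tables created or rebuilt
    # after the change; it does not shrink the existing ibdata1 file
    [mysqld]
    innodb_file_per_table = 1

    # illustrative /etc/cron.d entry for a nightly cleanup (runs as keystone)
    0 4 * * * keystone keystone-manage token_flush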

Comment 5 Michael Bayer 2015-07-21 16:51:21 UTC
for the RENAME operation, I need to know the exact version of Keystone that is in use so that I can identify the correct table structure.

Comment 6 Michael Bayer 2015-07-21 17:14:13 UTC
The table structure hasn't changed since Icehouse, save for two indexes added in Juno, so I will use the structure that's common to Juno/Kilo.

Comment 18 Michael Bayer 2015-08-12 20:40:54 UTC
this issue has been resolved with the customer.

