Bug 1207980 - Split up large Liquibase changesets to avoid partial updates
Summary: Split up large Liquibase changesets to avoid partial updates
Alias: None
Product: Zanata
Classification: Retired
Component: DatabaseChange
Version: 3.7
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
: 3.7
Assignee: Alex Eng
QA Contact: Ding-Yi Chen
Depends On: 1207994
TreeView+ depends on / blocked
Reported: 2015-04-01 07:14 UTC by Sean Flanigan
Modified: 2015-07-22 02:20 UTC (History)
5 users (show)

Fixed In Version: 3.7.0-SNAPSHOT (git-server-3.6.2-425-g01da786)
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2015-07-22 02:20:15 UTC

Attachments (Terms of Use)

System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1207994 0 unspecified CLOSED RFE/TECH DEBT: Upgrade to Liquibase 3.3 2021-02-22 00:41:40 UTC

Internal Links: 1207994

Description Sean Flanigan 2015-04-01 07:14:25 UTC
Description of problem:

Several of the changesets in 3.7 can take a very long time to run.  (That is a problem in itself, for another time.)  If the long operation is cancelled (eg appserver times out), mysql finishes the current change and autocommits it, but the DATABASECHANGELOGLOCK is never cleaned up.  Eventually, the admin will have to clear the lock, but if the changeset includes multiple changes, Liquibase will attempt the first change again, and fail.  Recovering from this is quite slow and painful, especially for anyone who isn't familiar with Liquibase or Zanata.

I suggest that we replace the affected changesets with multiple smaller changesets:

- one change per changeset.
- using a precondition to skip the change if it has already been applied.

Also, we may be able to speed up the changes overall by making multiple alterations to the same table in a single "change" (eg adding several indexes or columns to the same table).

Finally, we should log a message before beginning each large changeset, so that the admin will have some idea why Zanata is taking a long time to start.

Comment 1 Sean Flanigan 2015-04-01 07:34:27 UTC
Bug 1207994 might help with the logging.  (Or if we have to write a simple logging extension, better to use the latest API.)

Comment 2 Sean Flanigan 2015-04-21 01:11:38 UTC
Assigning to pahuang, since most of the affected changesets are his.

I recommend we move the replacements for changeset aeng:3 (adding columns to HTextFlowTarget/History) to appear before the replacements for pahuang:2 and pahuang:3 (adding indexes to HTextFlowTarget/History).  Adding columns before indexes should save the database from having to redo the indexing work.

When it comes to writing the preconditions for the split-up changesets, I recommend using <not><changeSetExecuted></not>.  This should be fairly quick to execute.  Preconditions which check for the existence of schema elements (eg <foreignKeyConstraintExists>) can be very slow in Liquibase, because at least some of them are implemented using database snapshots, which are quite slow in some circumstances.

I originally suggested adding multiple indexes to a single table in one changeset as a potential optimisation.  This can be faster on MySQL 5.5 (or 5.1 with an updated InnoDB plugin), but I don't think Liquibase will structure the SQL to take advantage of it, and it could lead to problems on MySQL 5.1 without the plugin.  So I think we should just stick to the rule: one change per changeset.

As for the logging problem, Liquibase doesn't seem to have any good hooks for this sort of logging.  Liquibase itself logs a message before each changeset, but the level is DEBUG, and the category is "liquibase", so DEBUG level will include far too many irrelevant messages.  I'll see if I can make an extension which will help.

Comment 3 Sean Flanigan 2015-04-21 04:01:02 UTC
I have added an extension for pre-changeset logging as part of https://github.com/zanata/zanata-server/pull/773

Comment 4 Sean Flanigan 2015-05-08 04:26:48 UTC
Reassigning to aeng, since the only changeset which is really, really slow is db.changelog-3.7.xml::3::aeng.  

We definitely need this fixed for 3.7, but it really needs to be split up as soon as possible, because it really wastes a lot of time when the changeset fails part-way through for whatever reason.  (In my case, I was trying to get Zanata working on WildFly 9 and didn't have the longer deployment timeout configured.)

Reminder: the "add column" changesets should be moved above the "add index" changesets, to save the database from indexing twice.

Comment 5 Alex Eng 2015-05-26 00:46:39 UTC
Pull request:


Comment 6 Ding-Yi Chen 2015-06-03 00:15:47 UTC
VERIFIED with Zanata 3.7.0-SNAPSHOT (git-server-3.6.2-425-g01da786)

Note You need to log in before you can comment on or make changes to this bug.