Bug 1207980

Summary: Split up large Liquibase changesets to avoid partial updates
Product: [Retired] Zanata Reporter: Sean Flanigan <sflaniga>
Component: DatabaseChangeAssignee: Alex Eng <aeng>
Status: CLOSED CURRENTRELEASE QA Contact: Ding-Yi Chen <dchen>
Severity: high Docs Contact:
Priority: high    
Version: 3.7CC: aeng, camunoz, mkim, pahuang, zanata-bugs
Target Milestone: ---Keywords: screened
Target Release: 3.7   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 3.7.0-SNAPSHOT (git-server-3.6.2-425-g01da786) Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-07-22 02:20:15 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1207994    
Bug Blocks:    

Description Sean Flanigan 2015-04-01 07:14:25 UTC
Description of problem:

Several of the changesets in 3.7 can take a very long time to run.  (That is a problem in itself, for another time.)  If the long operation is cancelled (eg appserver times out), mysql finishes the current change and autocommits it, but the DATABASECHANGELOGLOCK is never cleaned up.  Eventually, the admin will have to clear the lock, but if the changeset includes multiple changes, Liquibase will attempt the first change again, and fail.  Recovering from this is quite slow and painful, especially for anyone who isn't familiar with Liquibase or Zanata.

I suggest that we replace the affected changesets with multiple smaller changesets:

- one change per changeset.
- using a precondition to skip the change if it has already been applied.

Also, we may be able to speed up the changes overall by making multiple alterations to the same table in a single "change" (eg adding several indexes or columns to the same table).

Finally, we should log a message before beginning each large changeset, so that the admin will have some idea why Zanata is taking a long time to start.

Comment 1 Sean Flanigan 2015-04-01 07:34:27 UTC
Bug 1207994 might help with the logging.  (Or if we have to write a simple logging extension, better to use the latest API.)

Comment 2 Sean Flanigan 2015-04-21 01:11:38 UTC
Assigning to pahuang, since most of the affected changesets are his.

I recommend we move the replacements for changeset aeng:3 (adding columns to HTextFlowTarget/History) to appear before the replacements for pahuang:2 and pahuang:3 (adding indexes to HTextFlowTarget/History).  Adding columns before indexes should save the database from having to redo the indexing work.

When it comes to writing the preconditions for the split-up changesets, I recommend using <not><changeSetExecuted></not>.  This should be fairly quick to execute.  Preconditions which check for the existence of schema elements (eg <foreignKeyConstraintExists>) can be very slow in Liquibase, because at least some of them are implemented using database snapshots, which are quite slow in some circumstances.


I originally suggested adding multiple indexes to a single table in one changeset as a potential optimisation.  This can be faster on MySQL 5.5 (or 5.1 with an updated InnoDB plugin), but I don't think Liquibase will structure the SQL to take advantage of it, and it could lead to problems on MySQL 5.1 without the plugin.  So I think we should just stick to the rule: one change per changeset.


As for the logging problem, Liquibase doesn't seem to have any good hooks for this sort of logging.  Liquibase itself logs a message before each changeset, but the level is DEBUG, and the category is "liquibase", so DEBUG level will include far too many irrelevant messages.  I'll see if I can make an extension which will help.

Comment 3 Sean Flanigan 2015-04-21 04:01:02 UTC
I have added an extension for pre-changeset logging as part of https://github.com/zanata/zanata-server/pull/773

Comment 4 Sean Flanigan 2015-05-08 04:26:48 UTC
Reassigning to aeng, since the only changeset which is really, really slow is db.changelog-3.7.xml::3::aeng.  

We definitely need this fixed for 3.7, but it really needs to be split up as soon as possible, because it really wastes a lot of time when the changeset fails part-way through for whatever reason.  (In my case, I was trying to get Zanata working on WildFly 9 and didn't have the longer deployment timeout configured.)

Reminder: the "add column" changesets should be moved above the "add index" changesets, to save the database from indexing twice.

Comment 5 Alex Eng 2015-05-26 00:46:39 UTC
Pull request:

https://github.com/zanata/zanata-server/pull/827

Comment 6 Ding-Yi Chen 2015-06-03 00:15:47 UTC
VERIFIED with Zanata 3.7.0-SNAPSHOT (git-server-3.6.2-425-g01da786)