Bug 1120417 - Break up transaction of updatePluginConfigurationDefinition into smaller pieces
Keywords:
Status: ON_QA
Alias: None
Product: RHQ Project
Classification: Other
Component: Installer
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: GA
Target Release: RHQ 4.13
Assignee: Jay Shaughnessy
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-07-16 21:32 UTC by Elias Ross
Modified: 2022-03-31 04:27 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:



Description Elias Ross 2014-07-16 21:32:23 UTC
Description of problem:

The following transaction can take a long, long time:

public class PluginConfigurationMetadataManagerBean implements PluginConfigurationMetadataManagerLocal {

    public void updatePluginConfigurationDefinition(ResourceType existingType, ResourceType newType) {

...

                    //Use CriteriaQuery to automatically chunk/page through criteria query results
                    CriteriaQueryExecutor<Resource, ResourceCriteria> queryExecutor = new CriteriaQueryExecutor<Resource, ResourceCriteria>() {
                        @Override
                        public PageList<Resource> execute(ResourceCriteria criteria) {
                            return resourceMgr.findResourcesByCriteria(overlord, criteria);
                        }
                    };

                    CriteriaQuery<Resource, ResourceCriteria> resources = new CriteriaQuery<Resource, ResourceCriteria>(
                        criteria, queryExecutor);

                    for (Resource resource : resources) {
                        updateResourcePluginConfiguration(resource, updateReport);
                    }

If there are thousands of resources, the loop calling updateResourcePluginConfiguration runs inside a single transaction and can take a very long time (more than 10 minutes).
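
One way to picture "smaller pieces" is to process the resources in fixed-size batches, each in its own transaction. The following is only a minimal sketch under stated assumptions, not the RHQ implementation: TxRunner, the integer resource ids, and the batch size are all hypothetical stand-ins, and in a real EJB container the transaction boundary would typically come from a container-managed method rather than a hand-written interface.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BatchedUpdateSketch {

    /** Hypothetical stand-in for "run this work in its own transaction". */
    interface TxRunner {
        void inNewTx(Runnable work);
    }

    /** Splits ids into consecutive batches of at most batchSize elements. */
    static List<List<Integer>> partition(List<Integer> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<Integer>> batches = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += batchSize) {
            int end = Math.min(i + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(i, end)));
        }
        return batches;
    }

    /**
     * Applies updateOne to every id, using one transaction per batch
     * instead of one transaction for the whole population.
     * Returns the number of transactions used.
     */
    static int updateInBatches(List<Integer> ids, int batchSize,
                               TxRunner tx, Consumer<Integer> updateOne) {
        int txCount = 0;
        for (List<Integer> batch : partition(ids, batchSize)) {
            tx.inNewTx(() -> batch.forEach(updateOne));
            txCount++;
        }
        return txCount;
    }
}
```

With 10 ids and a batch size of 4, this sketch uses three transactions (4, 4, and 2 resources), so a failure or timeout affects at most one batch rather than the whole update.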

Version-Release number of selected component (if applicable): 4.12


How reproducible: Depending on number of resources


Steps to Reproduce:
1. Create 10,000+ resources with config
2. Try to update resource type

Actual results: Transaction hangs


Expected results: Transaction completes



Additional info:

Comment 1 Jay Shaughnessy 2014-07-17 14:31:16 UTC
I think in general we are in timeout danger when updating plugin types at scale.  We currently try to update all of the types in one Tx, which for something like the AS7 plugin and its 200+ types could be an issue even with the extended, 30-minute timeout we apply on the outer Tx.  We may want to look at performing one type update per Tx.  At least that way, if one fails, we may have made progress towards updating the plugin overall.

Furthermore, we could apply the extended timeout to each type.  And past that we can still optimize as suggested above, using nested transactions as necessary, to break up the work and reduce the rollback logging for any single Tx.
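
The "one type update per Tx" idea can be sketched as follows. This is an illustrative simulation only: the in-memory committed map, the staged-changes "transaction", and the choice to continue with later types after a failure are all assumptions made for the example, not behavior confirmed by this bug.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PerTypeTxSketch {

    /** Hypothetical in-memory "database": type name -> committed version. */
    final Map<String, Integer> committed = new LinkedHashMap<>();

    /**
     * Simulates one transaction for a single type: changes are staged and
     * become visible only if the work completes without an exception.
     */
    boolean updateTypeInNewTx(String typeName, int newVersion, boolean failDuringUpdate) {
        Map<String, Integer> staged = new LinkedHashMap<>();
        try {
            staged.put(typeName, newVersion);
            if (failDuringUpdate) {
                throw new IllegalStateException("simulated failure updating " + typeName);
            }
            committed.putAll(staged); // commit
            return true;
        } catch (RuntimeException e) {
            return false; // rollback: staged changes are simply discarded
        }
    }

    /**
     * Updates each type in its own simulated transaction and returns the
     * names of the types that failed. Types committed earlier stay committed.
     */
    List<String> updatePlugin(List<String> typeNames, int newVersion, String failingType) {
        List<String> failed = new ArrayList<>();
        for (String typeName : typeNames) {
            boolean ok = updateTypeInNewTx(typeName, newVersion, typeName.equals(failingType));
            if (!ok) {
                failed.add(typeName);
            }
        }
        return failed;
    }
}
```

The point of the design is that a failure in one type rolls back only that type's work; with a single outer Tx, the same failure would discard every type update made so far.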

Looking to see how difficult this re-working may be...

Comment 2 Jay Shaughnessy 2014-07-18 17:29:33 UTC
Elias, thanks for reporting this. Please feel free to review the commit.


master commit 26e5712b8cefc7601f6ee95091922de667ee3752
Author: Jay Shaughnessy <jshaughn>
Date:   Fri Jul 18 13:25:47 2014 -0400

    Another round of scalability enhancements for updating plugin metadata.  In
    the past we broke the update of each Plugin into its own Tx.  Later we split
    registering types and removing [obsolete] types into separate Tx and applied
    a 30 minute timeout to the type registration.  With this pass we now update
    each type in its own Tx and allow up to 30 minutes per type.  This can be
    necessary if updating plugin configurations for a large existing resource
    population.

Comment 3 Jay Shaughnessy 2014-07-25 22:29:20 UTC
The recent changes brought to the surface a few other things, leading to these commits.

master commit 001a3d1ed50713172774c4ae94effd34523aa9ca
Author: Jay Shaughnessy <jshaughn>
Date:   Fri Jul 25 18:19:04 2014 -0400

    Another pass here given some Oracle test failures in the CI env.
    - Fix an issue with PropertyDefinitionSimple.removeEnumeratedValues.
      An unexpected problem brought out, I guess, by the Tx reworking:
      we must be careful not to replace the Hibernate proxy when dealing
      with orphanRemoval.
    - remove unnecessary REQUIRES_NEW that could lead to locking issues
    - remove some dead code
    - start shortening xxxInNewTransaction to xxxInNewTx, purely for selfish
      reasons.


master commit 0e8b7ed217d5ec91ac1313e5fc87829c330b4ca9
Author: Jay Shaughnessy <jshaughn>
Date:   Fri Jul 25 18:22:44 2014 -0400

    In a recent commit for [1120417] (and the resulting Oracle test failures) we
    added protection against Hibernate errors related to detached sets when
    orphanRemoval=true.  This applies similar changes to entity classes
    outside of the BZ work.
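
The commits themselves are not reproduced here, but a common way to guard against this class of problem is to mutate the managed collection in place instead of assigning a new collection instance to the field. The sketch below uses plain Java with no Hibernate dependency; the Definition class and its method names are hypothetical, and the comments describe what a real orphanRemoval=true mapping would be expected to do, not what this code enforces.

```java
import java.util.ArrayList;
import java.util.List;

class OrphanRemovalSketch {

    /**
     * Hypothetical stand-in for a parent entity whose child collection
     * would be mapped with orphanRemoval=true in a real Hibernate mapping.
     */
    static class Definition {
        private List<String> enumeratedValues = new ArrayList<>();

        List<String> getEnumeratedValues() {
            return enumeratedValues;
        }

        /**
         * Risky with orphanRemoval: swaps out the collection instance that
         * the persistence provider is tracking for this entity.
         */
        void replaceEnumeratedValues(List<String> newValues) {
            this.enumeratedValues = newValues;
        }

        /**
         * Safer: keeps the tracked collection instance and changes only
         * its contents.
         */
        void setEnumeratedValues(List<String> newValues) {
            if (newValues == enumeratedValues) {
                return; // same instance, nothing to do (and clear() would empty it)
            }
            enumeratedValues.clear();
            if (newValues != null) {
                enumeratedValues.addAll(newValues);
            }
        }
    }
}
```

After setEnumeratedValues, getEnumeratedValues() still returns the same list instance it returned before the call, which is the property the in-place approach is meant to preserve.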

