Bug 905632
Summary: | removing large numbers of plugin types causes foreign key constraint violations | ||
---|---|---|---|
Product: | [Other] RHQ Project | Reporter: | Simeon Pinder <spinder> |
Component: | Core Server | Assignee: | Simeon Pinder <spinder> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Mike Foley <mfoley> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 4.5 | CC: | hrupp |
Target Milestone: | --- | ||
Target Release: | RHQ 4.5.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2013-09-01 10:06:05 UTC | Type: | Bug |
Description
Simeon Pinder
2013-01-29 19:48:00 UTC
Finally figured out the problem here after looking through this code for hours. The ResourceCriteria query used during type deletion was not using an unlimited page size but defaulting to return only 200 results. This means that if a specific type had more than 200 instances, constraint violations would occur, because the type removal logic must first:
i) remove all resources/instances of that type, along with removing relationships/etc.
ii) then finally remove the type

After the first 200 resources were removed, the constraint violation occurred because not all instances of the type had been removed, as the type removal logic assumed.

This will likely happen in larger installations when they attempt to remove a plugin from their management domain. Because it depends on the number of resources returned by the query, this error would occur over time with any moderately sized deployment. The result of the bug is that only the first 200 resources are removed and the plugin/type is not actually removed.

Interestingly enough, a workaround for smaller installs is to keep updating the timestamp on the offending plugin and redeploying, which clears out the offending types in 200-resource increments. Wonder how many other places this subtle criteria default may be biting us.

This is fixed with commit 9e70606deeb4654ae6117 to master. Moving this to ON_QA.

FYI, no indication this is a recent regression.

Improved on the earlier fix to instead chunk through the available pages, so as to defend against excessive memory consumption in large deployments. This is applied with commit 4b1de4c61a8a5118d to master. Leaving this ON_QA. If this is cherry-picked to other branches, both commits need to be applied in order.

Bulk closing of items that are on_qa and in old RHQ releases, which have been out for a long time and where the issue has not been re-opened since.
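For anyone reading this later, the failure mode and the chunked approach described above can be sketched roughly as follows. This is only an illustration and not the code from either commit: the class `PagedTypeRemoval`, the in-memory list standing in for the resource table, and the helper `findByType` are all hypothetical, and no real RHQ API is used. The only detail taken from this report is the default page size of 200.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, self-contained simulation; none of these names come from RHQ.
public class PagedTypeRemoval {
    // Default page size reported in this bug.
    static final int DEFAULT_PAGE_SIZE = 200;

    // Stand-in for a criteria query: returns at most pageSize ids from the given page.
    static List<Integer> findByType(List<Integer> store, int pageNumber, int pageSize) {
        int from = Math.min(pageNumber * pageSize, store.size());
        int to = Math.min(from + pageSize, store.size());
        return new ArrayList<>(store.subList(from, to));
    }

    // Buggy shape: a single query with the default page size, then the type removal.
    // Returns true only if no resources of the type remain, i.e. the type can be deleted.
    static boolean removeTypeSinglePage(List<Integer> store) {
        List<Integer> page = findByType(store, 0, DEFAULT_PAGE_SIZE);
        store.removeAll(page);
        return store.isEmpty();
    }

    // Chunked shape: keep fetching and deleting one page at a time until none are left.
    // Page 0 is always requested because each deletion shifts the remaining rows forward.
    static boolean removeTypeChunked(List<Integer> store, int pageSize) {
        while (true) {
            List<Integer> page = findByType(store, 0, pageSize);
            if (page.isEmpty()) {
                break;
            }
            store.removeAll(page);
        }
        return store.isEmpty();
    }

    // Builds a fake resource table with ids 0..count-1.
    static List<Integer> makeResources(int count) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            ids.add(i);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Integer> a = makeResources(450);
        boolean removedA = removeTypeSinglePage(a);
        // prints: single page: type removed = false, resources left = 250
        System.out.println("single page: type removed = " + removedA + ", resources left = " + a.size());

        List<Integer> b = makeResources(450);
        boolean removedB = removeTypeChunked(b, DEFAULT_PAGE_SIZE);
        // prints: chunked: type removed = true, resources left = 0
        System.out.println("chunked: type removed = " + removedB + ", resources left = " + b.size());
    }
}
```

Working one page at a time keeps at most one page of results in memory, which is the concern the second commit mentions. Re-querying the first page after each deletion, rather than advancing the page number, avoids skipping rows that move forward once earlier rows are gone.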