Bug 817604 - Trying to delete a Dynagroup Definition gives db constraint errors
Summary: Trying to delete a Dynagroup Definition gives db constraint errors
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: RHQ Project
Classification: Other
Component: Resource Grouping
Version: 4.3
Hardware: All
OS: All
Priority: high
Severity: medium
Target Milestone: ---
Target Release: RHQ 4.5.0
Assignee: Jay Shaughnessy
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On:
Blocks: jon310-sprint11, rhq44-sprint11 815869
 
Reported: 2012-04-30 15:34 UTC by Larry O'Leary
Modified: 2013-09-01 10:14 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 815869
Environment:
Last Closed: 2013-09-01 10:14:07 UTC
Embargoed:



Description Larry O'Leary 2012-04-30 15:34:36 UTC
+++ This bug was initially created as a clone of JBoss ON Bug #815869 +++

Description of problem:
I created a new dynagroup with the following customized expression:
"resource.name.endsWith = .sh"

Recursive flag is checked and recalculation interval is 0.
Now I can't delete this group. The JON server reports: "Failed to delete the selected group definitions"
Here is the stacktrace tail:

"Message: Failed to delete the selected group definitions
Severity: Error

Time: Thursday, April 19, 2012 2:31:11 PM Etc/GMT+3
Detail:
java.lang.RuntimeException:[1334856671827] java.lang.RuntimeException:javax.transaction.RollbackException: [com.arjuna.ats.internal.jta.transaction.arjunacore.commitwhenaborted] [com.arjuna.ats.internal.jta.transaction.arjunacore.commitwhenaborted] Can't commit because the transaction is in aborted state -> javax.transaction.RollbackException:[com.arjuna.ats.internal.jta.transaction.arjunacore.commitwhenaborted] [com.arjuna.ats.internal.jta.transaction.arjunacore.commitwhenaborted] Can't commit because the transaction is in aborted state -> javax.persistence.EntityExistsException:org.hibernate.exception.ConstraintViolationException: Could not execute JDBC batch update -> org.hibernate.exception.ConstraintViolationException:Could not execute JDBC batch update -> java.sql.BatchUpdateException:Batch entry 0 delete from RHQ_RESOURCE_GROUP where ID='10131' was aborted.  Call getNextException to see the cause.[SQLException=Batch entry 0 delete from RHQ_RESOURCE_GROUP where ID='10131' was aborted.  Call getNextException to see the cause. -> ERROR: update or delete on table "rhq_resource_group" violates foreign key constraint "rhq_resource_group_cluster_resource_group_id_fkey" on table "rhq_resource_group"
  Detail: Key (id)=(10131) is still referenced from table "rhq_resource_group".(error-code=0,sql-state=23503)]

Somehow a cluster group was created; I had to manually delete the cluster group and the main group definition from rhq_resource_group.
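
The constraint name in the error suggests that rhq_resource_group refers to itself through a cluster_resource_group_id column, so a row cannot be removed while another row still points at it. The following is only an in-memory sketch of that rule, not RHQ code: the class GroupTableSketch, its methods, and the child id 10132 are hypothetical, and only the id 10131 comes from the error message above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a table with a self-referencing foreign key. */
class GroupTableSketch {
    // id -> cluster_resource_group_id (null when the row has no parent group)
    private final Map<Integer, Integer> parentById = new HashMap<>();

    void insert(int id, Integer clusterResourceGroupId) {
        if (clusterResourceGroupId != null && !parentById.containsKey(clusterResourceGroupId)) {
            throw new IllegalStateException("parent " + clusterResourceGroupId + " does not exist");
        }
        parentById.put(id, clusterResourceGroupId);
    }

    /** Rejects the delete while any other row still references the given id. */
    void delete(int id) {
        for (Map.Entry<Integer, Integer> row : parentById.entrySet()) {
            if (Integer.valueOf(id).equals(row.getValue())) {
                throw new IllegalStateException(
                        "Key (id)=(" + id + ") is still referenced by row " + row.getKey());
            }
        }
        parentById.remove(id);
    }

    boolean contains(int id) {
        return parentById.containsKey(id);
    }

    public static void main(String[] args) {
        GroupTableSketch table = new GroupTableSketch();
        table.insert(10131, null);   // group generated by the definition
        table.insert(10132, 10131);  // hypothetical cluster group hanging off it
        try {
            table.delete(10131);     // rejected: 10132 still references 10131
        } catch (IllegalStateException e) {
            System.out.println("delete rejected: " + e.getMessage());
        }
        table.delete(10132);         // remove the referencing row first
        table.delete(10131);         // now the parent can go
        System.out.println("parent still present: " + table.contains(10131));
    }
}
```

This mirrors the manual workaround described above: the cluster group row has to go before the group it references.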


Version-Release number of selected component (if applicable):
4.2

Comment 1 Mike Foley 2012-04-30 16:02:10 UTC
triaged 4/30/2012 by loleary, ccrouch, mfoley

Comment 2 Jay Shaughnessy 2012-05-02 18:30:03 UTC
Hmm, I wonder if there is a general problem with recursive compat
group removal after it's been navigated in the UI... I'll check on
that.

Comment 3 Jay Shaughnessy 2012-05-07 17:33:11 UTC
It's definitely not a general problem.  I have not been able to reproduce
the issue and an inspection of the code indicates that we seem to be doing
the right thing. 

But, after just about giving up, I looked more closely at the case
attachments and I see why the error is generated.  I don't know yet how
this can happen but basically, there is an autocluster group hanging off of
a mixed group (the mixed group having been generated by the group def). That
should not happen.

When a group is recalculated we do have logic in place to ensure that
if it goes from compat to mixed that we clean up autoclusters.  Perhaps if the
definition is actually changed completely we have an issue, I'll keep 
looking...

Comment 4 Jay Shaughnessy 2012-05-07 18:30:29 UTC
OK, I recreated this but it was not easy. Perhaps there is an easier way
but I couldn't find it. I'm surprised this could come up in general
practice.

1) Create a group definition that generates a recursive compatible group

   For example:
   resource.type.plugin = JBossAS5
   resource.type.name = Web Application (WAR)

   recursive?  yes

2) Navigate to the new group, then drill down into a child node (aka
   an autocluster node)

   In our example: Web Application Context

3) In a second GUI session, navigate to the new group and ** do not yet
   navigate to the autocluster node **.

4) Back in session 1, return to the group definition and change it
   completely, this time to generate a mixed group.

   For example:
   resource.type.plugin = JBossAS5
   resource.name.contains = t

5) Back in session 2, which still shows the compat group tree, navigate to
   the Web Application Context autocluster node.  This unwittingly creates
   the compatible backing group for the autocluster, and links it to the
   original, now mixed, resource group.  And now we have a problem.
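
The five steps above can be sketched as a small simulation. This is an illustrative model only, assuming a group object that is reused across the definition change and a GUI session that trusts the tree it rendered earlier; StaleTreeSketch and every name inside it are hypothetical and do not correspond to RHQ classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the stale compat-tree race described in steps 1-5. */
class StaleTreeSketch {
    enum Category { COMPATIBLE, MIXED }

    static class Group {
        Category category;
        final List<Group> autoClusterBackingGroups = new ArrayList<>();

        Group(Category category) {
            this.category = category;
        }
    }

    /** What a GUI session remembers about the group from when it rendered the tree. */
    static class Session {
        final Group group;
        final Category categoryWhenRendered;

        Session(Group group) {
            this.group = group;
            this.categoryWhenRendered = group.category;
        }

        /** Unguarded: trusts the rendered tree, not the group's current category. */
        Group openAutoClusterNode() {
            Group backing = new Group(Category.COMPATIBLE);
            group.autoClusterBackingGroups.add(backing);
            return backing;
        }

        boolean isStale() {
            return categoryWhenRendered != group.category;
        }
    }

    /** Definition change treated as a recalculation: the same group object is kept. */
    static void recalculateAsMixed(Group group) {
        group.category = Category.MIXED;
        group.autoClusterBackingGroups.clear(); // existing compat-to-mixed cleanup
    }

    static boolean isInconsistent(Group group) {
        return group.category == Category.MIXED && !group.autoClusterBackingGroups.isEmpty();
    }

    public static void main(String[] args) {
        Group group = new Group(Category.COMPATIBLE);   // step 1
        Session session1 = new Session(group);
        session1.openAutoClusterNode();                 // step 2
        Session session2 = new Session(group);          // step 3
        recalculateAsMixed(group);                      // step 4
        System.out.println("inconsistent after recalculation: " + isInconsistent(group));
        session2.openAutoClusterNode();                 // step 5, from the stale tree
        System.out.println("session 2 stale: " + session2.isStale());
        System.out.println("inconsistent after step 5: " + isInconsistent(group));
    }
}
```

In this model the recalculation leaves the group clean, and it is only the late click from the stale session that attaches a compatible backing group to a mixed parent.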

The thing that surprises me is that when we change the group definition we
actually re-use the resource group from the previous definition.  I guess we
just consider it a recalculation and that logic must preserve groups that
have the same name after the recalculation. Since there is nothing
changing that would affect naming, like a change to a group-by (pivot),
we end up keeping the same group for a completely different resource set.

I don't recommend we change this behavior.

We already protect against changes from Compat to Mixed, which is why
recreating this is difficult (at least in all ways I could think of).  It
required that stale tree in a second window to force the issue.

We could protect further, I guess, in one of two ways.  Validate that the
autocluster root node is valid (not mixed) at create time.  Or, we could
potentially just try to clean up this sort of case under the covers, at
delete time.
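
A rough sketch of what those two options could look like, assuming a simplified group model. None of these names (AutoClusterGuardsSketch, createBackingGroup, deleteWithCleanup) are RHQ APIs, and this is not a description of the change that was eventually committed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical guards for the two options: validate at create time, clean up at delete time. */
class AutoClusterGuardsSketch {
    enum Category { COMPATIBLE, MIXED }

    static class Group {
        final int id;
        Category category;
        final List<Group> backingGroups = new ArrayList<>();

        Group(int id, Category category) {
            this.id = id;
            this.category = category;
        }
    }

    /** Option 1: refuse to create an autocluster backing group under a non-compatible root. */
    static Group createBackingGroup(Group root, int newId) {
        if (root.category != Category.COMPATIBLE) {
            throw new IllegalStateException(
                    "group " + root.id + " is no longer compatible; refresh the tree");
        }
        Group backing = new Group(newId, Category.COMPATIBLE);
        root.backingGroups.add(backing);
        return backing;
    }

    /** Option 2: remove any stray backing groups before the group itself; returns ids in deletion order. */
    static List<Integer> deleteWithCleanup(Group group) {
        List<Integer> deleted = new ArrayList<>();
        for (Group backing : group.backingGroups) {
            deleted.add(backing.id);
        }
        group.backingGroups.clear();
        deleted.add(group.id);
        return deleted;
    }

    public static void main(String[] args) {
        Group mixed = new Group(1, Category.MIXED);
        try {
            createBackingGroup(mixed, 2);
        } catch (IllegalStateException e) {
            System.out.println("create rejected: " + e.getMessage());
        }
        // Simulate a stray backing group that slipped in before any guard existed.
        mixed.backingGroups.add(new Group(2, Category.COMPATIBLE));
        System.out.println("deletion order: " + deleteWithCleanup(mixed));
    }
}
```

The create-time check stops the bad link from forming, while the delete-time cleanup tolerates links that already exist; the two are not mutually exclusive.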

I'll take a look.  I suggest the easiest possible solution, as this, as
far as I can tell, should be super-rare.

Comment 5 Jay Shaughnessy 2012-05-07 21:00:14 UTC
This stale group tree scenario is not totally far-fetched. Although unlikely,
the more I think about it, it is possible to get a stale group tree when
the group changes type.  Due to dynagroup recalculation, or just a
manual member edit, if a session is looking at a compat tree when the group
changes to mixed you can get the problem above.  You can also generate
an unhandled exception by trying to show a context menu on the root node.

I've put in some handling for both cases.  In both cases the GUI will refresh
the view automatically, bringing up the mixed group tree, to indicate the
state change and give the user a valid tree.

Note: This is not a general group tree state change solution.  It covers
only these scenarios.  If and when a server-side change notification
mechanism is put in place then this is one more state change we
should respond to.

Comment 6 Jay Shaughnessy 2012-05-08 15:42:04 UTC
master commit 2f9c4823ea11cfdc6f289d865661b06ed4063c6d

Protect against a couple of issues that result from a stale Compat Group
tree in the GUI.  If a compatible group changes to mixed *while a GUI
session is actively navigating the compat group tree* then bad things can
happen, including this BZ.

The only way to really protect against this would be a live (non-polling)
server-side change event listener for the GUI.  I'm not sure that's
possible but it is something to investigate.

Comment 7 Heiko W. Rupp 2013-09-01 10:14:07 UTC
Bulk closing of items that are on_qa and in old RHQ releases, which have been out for a long time and where the issue has not been re-opened since.

