Bug 864850

Summary: ClusterManagerBean#createAutoClusterBackingGroup may create backing group twice, causing "Failed to create or update autocluster backing group" later
Product: [Other] RHQ Project
Reporter: Andreas Veithen <andreas.veithen>
Component: Core Server
Assignee: Jay Shaughnessy <jshaughn>
Status: CLOSED WONTFIX
QA Contact: Mike Foley <mfoley>
Severity: low
Priority: unspecified
Version: 4.5
CC: hrupp
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2015-12-21 09:01:47 EST
Type: Bug

Description Andreas Veithen 2012-10-10 05:33:47 EDT
I have a resource group that is a descendant of a compatible group (with recursive=true) and that is inaccessible in the GUI: an attempt to display the summary fails with "Failed to create or update autocluster backing group".

In the server logs, I see the following error:

Caused by: javax.persistence.NonUniqueResultException: result returns 2 elements
        at org.hibernate.ejb.QueryImpl.getSingleResult(QueryImpl.java:85)
        at org.rhq.enterprise.server.resource.cluster.ClusterManagerBean.createAutoClusterBackingGroup(ClusterManagerBean.java:96)

I checked the database, and for that resource group, there are indeed two entries in rhq_resource_group:

rhq=> select id, ctime, mtime from rhq_resource_group where cluster_key='13231::10247:JVM::10249:java.lang%3atype=Memory::10260:java.lang%3aname=class storage,type=MemoryPool';
  id   |     ctime     |     mtime
-------+---------------+---------------
 13541 | 1349781276785 | 1349781276785
 13551 | 1349781285229 | 1349781285229
(2 rows)

It looks like there is a race condition in ClusterManagerBean#createAutoClusterBackingGroup that in some circumstances (probably a user clicking repeatedly on groups in the tree because the server takes too long to respond) causes multiple resource groups with the same cluster key to be created.
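
To make the suspected interleaving concrete, here is a minimal sketch of a check-then-act race. It is not the RHQ code: the class BackingGroupRaceSketch, its methods, and the in-memory list standing in for rhq_resource_group are all hypothetical, and the two overlapping requests are simulated in a single thread so the outcome is deterministic. The getOrCreate variant shows one possible shape of a fix (lookup and creation as a single atomic step). A unique constraint on cluster_key might play a similar role in the database, but whether that suits the RHQ schema is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical in-memory stand-in for rhq_resource_group; none of this is RHQ code. */
final class BackingGroupRaceSketch {
    // Table without a uniqueness constraint: one entry per inserted row, holding its cluster key.
    private final List<String> rows = new ArrayList<>();
    // Alternative store in which a cluster key can map to at most one group id.
    private final Map<String, Integer> uniqueGroups = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger(1);

    /** Step 1 of check-then-act: look for an existing backing group. */
    boolean exists(String clusterKey) {
        return rows.contains(clusterKey);
    }

    /** Step 2 of check-then-act: create the group, trusting the earlier lookup. */
    void insert(String clusterKey) {
        rows.add(clusterKey);
    }

    /** Number of rows that carry the given cluster key. */
    long count(String clusterKey) {
        return rows.stream().filter(clusterKey::equals).count();
    }

    /** Lookup and creation as one atomic step, so repeated requests share one group. */
    int getOrCreate(String clusterKey) {
        return uniqueGroups.computeIfAbsent(clusterKey, k -> nextId.getAndIncrement());
    }

    public static void main(String[] args) {
        BackingGroupRaceSketch table = new BackingGroupRaceSketch();
        String key = "example-cluster-key"; // placeholder, not a real cluster key

        // Two overlapping requests both run the lookup before either one inserts.
        boolean firstSawGroup = table.exists(key);
        boolean secondSawGroup = table.exists(key);
        if (!firstSawGroup) table.insert(key);
        if (!secondSawGroup) table.insert(key);
        System.out.println("rows after check-then-act: " + table.count(key));

        // With the atomic variant both requests receive the same group id.
        boolean sameGroup = table.getOrCreate(key) == table.getOrCreate(key);
        System.out.println("same group from getOrCreate: " + sameGroup);
    }
}
```

In this sketch the check-then-act path leaves two rows for the same key, which is the state in which a single-result lookup on cluster_key would fail with NonUniqueResultException.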

Note that the creation times of the two groups are roughly 8.4 seconds apart (ctime 1349781276785 vs. 1349781285229, in milliseconds), which is compatible with the delay caused by the issue described here:

Comment 1 Jay Shaughnessy 2015-12-21 09:01:47 EST
This is likely implicitly fixed already or has not otherwise been a problem.