Bug 1211407

Summary: Agent plug-in deletion fails if resource group or auto-group operation was used
Product: [JBoss] JBoss Operations Network
Reporter: Larry O'Leary <loleary>
Component: Inventory
Assignee: Libor Zoubek <lzoubek>
Status: CLOSED ERRATA
QA Contact: Sunil Kondkar <skondkar>
Severity: high
Docs Contact:
Priority: unspecified
Version: JON 3.3.2
CC: fbrychta, jshaughn, lzoubek, skondkar, spinder, theute
Target Milestone: ER01
Keywords: Triaged
Target Release: JON 3.3.3
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-07-30 16:41:50 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Larry O'Leary 2015-04-13 23:11:09 UTC
Description of problem:
You are unable to purge/delete an agent plug-in if a group operation was performed on a resource group containing a member that uses one of the plug-in's resource types.

Version-Release number of selected component (if applicable):
3.3.2

How reproducible:
Always

Steps to Reproduce:
1. Install, configure and start JBoss EAP 6.3 standalone server.
2. Install, configure and start JBoss ON system.

   In addition to the core plug-ins, only install the AS7 plug-in from the JBoss EAP plug-in pack. This is to ensure simplicity when deleting / purging plug-ins.
3. Import the JBoss EAP resource into inventory and set its connection settings.
4. Create a resource group that includes the newly imported JBoss EAP standalone server.
5. From the resource group's operation page, invoke the "Reload" operation.
6. Verify the reload is successfully executed.
7. From the agent plug-ins administration page, delete the AS7 plug-in.

   Please note that if using JBoss ON 3.3.1 or earlier, you will first need to manually work around BZ-1200579 before you can call this issue "reproduced."

Actual results:
Every 5 minutes a database integrity error is logged indicating that RHQ_OPERATION_DEF is referenced by RHQ_OPERATION_HISTORY.

The following database query returns data:

        SELECT rhq_operation_history.group_id AS op_his_grp_id
        FROM   rhq_plugin
               LEFT JOIN rhq_resource_type
                      ON rhq_resource_type.plugin = rhq_plugin.NAME
               LEFT JOIN rhq_operation_def
                      ON rhq_operation_def.resource_type_id = rhq_resource_type.id
               RIGHT JOIN rhq_operation_history
                       ON rhq_operation_history.operation_def_id = rhq_operation_def.id AND rhq_operation_history.group_id IS NOT NULL
        WHERE  rhq_plugin.status = 'DELETED'; 
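
The integrity error itself can be illustrated outside of JBoss ON. The following is a minimal sketch, not the actual RHQ schema: it assumes two heavily simplified stand-in tables in an in-memory SQLite database, and the ids (100, 200, 300) and the helper name try_delete_operation_def are made up for illustration. It shows that a definition row cannot be deleted while a group history row still references it.

```python
import sqlite3

# Toy stand-ins for two RHQ tables; the real schemas have many more columns.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
    CREATE TABLE rhq_operation_def (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE rhq_operation_history (
        id INTEGER PRIMARY KEY,
        operation_def_id INTEGER NOT NULL REFERENCES rhq_operation_def (id),
        group_id INTEGER
    );
    INSERT INTO rhq_operation_def VALUES (100, 'reload');
    INSERT INTO rhq_operation_history VALUES (200, 100, 300);
""")


def try_delete_operation_def(conn, def_id):
    """Hypothetical helper: return True if the definition row could be deleted."""
    try:
        conn.execute("DELETE FROM rhq_operation_def WHERE id = ?", (def_id,))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        # The definition is still referenced by a history row.
        conn.rollback()
        return False


# First attempt is blocked by the leftover group history row.
first_attempt = try_delete_operation_def(conn, 100)

# Once the group history is purged, the same delete goes through.
conn.execute("DELETE FROM rhq_operation_history WHERE group_id IS NOT NULL")
conn.commit()
second_attempt = try_delete_operation_def(conn, 100)

print(first_attempt, second_attempt)
```

In this toy setup the first delete is rejected and the second succeeds, which mirrors the recurring error: the purge job keeps retrying a delete that cannot succeed until the group history row is gone.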


Expected results:
No errors and AS7 plug-in is successfully deleted.

Additional info:
This appears to be an issue where a resource group remains even though all of its resources have been deleted.

Although we probably should just delete the resource group as it will be empty, it is not necessary that we do so. Instead, perhaps the best fix here is to simply purge the operation history associated with the resource group?

In the end, the expectation is that when the resource type is deleted, the operation definition associated with the resource type is also deleted. For the operation definition to be deleted, the operation history must also be deleted. The current state of things looks like this:

operation history:
.. resource group:
   .. resource 1 (member)
   .. resource 2 (member)

After an attempt is made to delete the resource type we end up like so:

operation history:
.. resource group:

The resources no longer appear in the operation history table because their corresponding resources have been deleted. But, the group still remains. The fix here could simply be to see if there are any resource group entries in the operation history table that do not have a corresponding resource entry. If so, delete the row.
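
The check proposed above could look roughly like the following. This is a sketch under stated assumptions, not the fix that was implemented: it uses a simplified single-table model in SQLite, assumes member rows carry dtype 'resource' and point at their group row through group_history_id, and the ids and the function name purge_orphaned_group_histories are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rhq_operation_history (
        id INTEGER PRIMARY KEY,
        dtype TEXT NOT NULL,          -- 'group' or 'resource' (assumed values)
        group_history_id INTEGER,     -- set on member rows, points at the group row
        group_id INTEGER,
        resource_id INTEGER
    );
    -- Group history 1 still has member rows; group history 4 has lost all of them.
    INSERT INTO rhq_operation_history VALUES (1, 'group', NULL, 500, NULL);
    INSERT INTO rhq_operation_history VALUES (2, 'resource', 1, NULL, 601);
    INSERT INTO rhq_operation_history VALUES (3, 'resource', 1, NULL, 602);
    INSERT INTO rhq_operation_history VALUES (4, 'group', NULL, 501, NULL);
""")

# Group rows that no member row points at any more.
ORPHAN_SQL = """
    SELECT g.id
    FROM rhq_operation_history g
    LEFT JOIN rhq_operation_history r ON r.group_history_id = g.id
    WHERE g.dtype = 'group' AND r.id IS NULL
    ORDER BY g.id
"""


def purge_orphaned_group_histories(conn):
    """Delete group history rows without member rows; return the deleted ids."""
    orphan_ids = [row[0] for row in conn.execute(ORPHAN_SQL)]
    conn.executemany(
        "DELETE FROM rhq_operation_history WHERE id = ?",
        [(history_id,) for history_id in orphan_ids],
    )
    conn.commit()
    return orphan_ids


purged = purge_orphaned_group_histories(conn)
remaining = [row[0] for row in conn.execute(
    "SELECT id FROM rhq_operation_history ORDER BY id")]
print(purged, remaining)
```

One caveat of this approach is that it cannot tell a group history whose members were deleted apart from one that never had member rows, so a real implementation would probably need to scope the purge to the resource types being removed.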

Comment 1 Larry O'Leary 2015-04-21 19:25:23 UTC
This same issue occurs if an operation had been invoked on an auto-group. For a JBoss EAP 6 standalone server and its platform's auto-group, the rhq_resource_group table will have an entry for the _JBossAS7 Standalone Servers_ auto-group:

        id                             10041
        name                           jon3-3-0-server-psql-01372689 ( JBossAS7 Standalone Server )
        description                    <null>
        resource_type_id               10919
        ctime                          1429642814816
        mtime                          1429642814816
        modified_by                    admin
        recursive                      false
        group_definition_id            <null>
        group_by                       <null>
        category                       COMPATIBLE
        subject_id                     2
        cluster_key                    <null>
        cluster_resource_group_id      <null>
        auto_group_parent_resource_id  10001

In this case the group id is 10041 and its parent is the platform resource, which has an id of 10001.

The rhq_operation_history table will have an entry for the group operation that was performed on the _JBossAS7 Standalone Servers_ auto-group:

        id                    10031
        dtype                 group
        job_name              rhq-group-10041-109757538-1429643133932
        job_group             rhq-group-10041
        operation_def_id      11014
        status                SUCCESS
        error_message         <null>
        subject_name          rhqadmin
        ctime                 1429643134035
        stime                 1429643134035
        mtime                 1429643134385
        parameters_config_id  <null>
        resource_id           <null>
        results_config_id     <null>
        group_history_id      <null>
        group_id              10041

In this case the operation history id is 10031 and the group the operation was invoked on has the id 10041. This row also has a reference to the operation definition with the id 11014.
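
When reading rows like the one above, it can help to decode the time columns. The sketch below assumes that ctime and mtime are epoch timestamps in milliseconds, which is not stated in this report but is consistent with the date of this comment; the helper name epoch_ms_to_utc is made up for illustration.

```python
from datetime import datetime, timezone


def epoch_ms_to_utc(ms):
    """Convert an epoch-milliseconds value to a UTC datetime (whole seconds)."""
    return datetime.fromtimestamp(ms // 1000, tz=timezone.utc)


# ctime and mtime of operation history row 10031, copied from the row above.
ctime_ms = 1429643134035
mtime_ms = 1429643134385

created = epoch_ms_to_utc(ctime_ms).strftime("%Y-%m-%d %H:%M:%S")
elapsed_ms = mtime_ms - ctime_ms

print(created, elapsed_ms)
```

Under that assumption the group operation was created at 2015-04-21 19:05:34 UTC and its row was last modified 350 ms later.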

To reproduce this issue with an auto-group:

1.  Install, configure and start JBoss EAP 6 standalone server.
2.  Install, configure and start JBoss ON 3.3 system.
3.  Import EAP 6 server into inventory.
4.  Configure connection settings for EAP server.
5.  From the _JBossAS7 Standalone Servers_ auto-group node, found under the platform resource, invoke the _Start_ operation.

    This is the auto-group node for resources of type _JBossAS7 Standalone Server_, NOT the individual resource.
6.  From the agent plug-ins administration page, delete the AS7 plug-in.

    Please note that if using JBoss ON 3.3.1 or earlier, you will first need to manually work around BZ-1200579 before you can call this issue "reproduced."

Comment 2 Jay Shaughnessy 2015-04-21 19:50:44 UTC
I'd recommend deleting all compat groups for the doomed type, as part of the type removal.  I don't think this is too heavy handed, even if the groups had been manually created.

Comment 3 Libor Zoubek 2015-04-27 16:12:01 UTC
I was able to reproduce the issue on a manually created group.

Based on an IRC chat with Jay, I am going to fix this within ResourceGroupManagerBean#removeCompatibleGroupConstructs. This gets called whenever a compatible group becomes empty and switches to a MIXED group. We're going to delete GroupOperationHistories, but keep/detach ResourceOperationHistories. The easiest way to achieve that is to expose the OperationManagerBean#deleteOperationHistory_helper method in the Local interface and put optional detach logic into it.
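
The "delete the group histories, detach the resource histories" idea can be sketched at the table level as follows. This is only an illustration in Python with SQLite, not the Java code in ResourceGroupManagerBean or OperationManagerBean: the simplified table, the 'resource' dtype value, the resource ids and the function name remove_group_operation_histories are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE rhq_operation_history (
        id INTEGER PRIMARY KEY,
        dtype TEXT NOT NULL,
        group_history_id INTEGER REFERENCES rhq_operation_history (id),
        group_id INTEGER,
        resource_id INTEGER
    );
    INSERT INTO rhq_operation_history VALUES (1, 'group', NULL, 10041, NULL);
    INSERT INTO rhq_operation_history VALUES (2, 'resource', 1, NULL, 20001);
    INSERT INTO rhq_operation_history VALUES (3, 'resource', 1, NULL, 20002);
""")


def remove_group_operation_histories(conn, group_id):
    """Delete a group's history rows but keep (detach) the per-resource rows."""
    group_history_ids = [row[0] for row in conn.execute(
        "SELECT id FROM rhq_operation_history "
        "WHERE dtype = 'group' AND group_id = ?", (group_id,))]
    for history_id in group_history_ids:
        # Detach first, otherwise the self-referencing FK blocks the delete.
        conn.execute(
            "UPDATE rhq_operation_history SET group_history_id = NULL "
            "WHERE group_history_id = ?", (history_id,))
        conn.execute(
            "DELETE FROM rhq_operation_history WHERE id = ?", (history_id,))
    conn.commit()
    return len(group_history_ids)


deleted = remove_group_operation_histories(conn, 10041)
rows = conn.execute(
    "SELECT id, dtype, group_history_id FROM rhq_operation_history ORDER BY id"
).fetchall()
print(deleted, rows)
```

The ordering matters in this sketch: the resource rows are detached before the group row is deleted, so the per-resource history survives and no dangling reference is left behind.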

Comment 4 Libor Zoubek 2015-04-29 18:19:19 UTC
branch:  master
link:    https://github.com/rhq-project/rhq/commit/c3b9a2374
time:    2015-04-29 14:28:22 +0200
commit:  c3b9a237403afa066bcf13cfa24bdff323e65404
author:  Libor Zoubek - lzoubek
message: Bug 1211407 - Agent plug-in deletion fails if resource group or
         auto-group operation was used
         Whenever Compatible group transforms to mixed group (ie.
         becomes empty) we now delete GroupOperationHistory rows, but
         keep ResourceOperationHistory. This fixes Bug 1211407, because
         agent plug-in deletion means resourceType deletion, which means
         resource deletion, which means removing resources from groups -
         removing resource of same resourceType from group makes it
         empty => mixed. This is when this fix comes to play.
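
The chain of events described in the commit message can be modelled in a few lines. The following is a toy in-memory model rather than RHQ code; the Group class, its fields, the function delete_resource_type and the resource id 20001 are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    id: int
    category: str = "COMPATIBLE"
    members: set = field(default_factory=set)
    group_histories: list = field(default_factory=list)

    def remove_member(self, resource_id):
        self.members.discard(resource_id)
        if not self.members and self.category == "COMPATIBLE":
            # An empty compatible group becomes MIXED; per the commit message,
            # this is the point where group operation histories get deleted.
            self.category = "MIXED"
            self.group_histories.clear()


def delete_resource_type(groups, resource_ids_of_type):
    """Deleting a type deletes its resources, which removes them from groups."""
    for group in groups:
        for resource_id in resource_ids_of_type:
            group.remove_member(resource_id)


group = Group(id=10041, members={20001}, group_histories=[10031])
delete_resource_type([group], {20001})
print(group.category, group.group_histories)
```

After the only member is removed, the group in this model is MIXED and holds no group histories, so nothing is left that references the operation definitions of the deleted type.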

Comment 6 Simeon Pinder 2015-07-10 18:55:40 UTC
Available for test with 3.3.3 ER01 build: 
https://brewweb.devel.redhat.com/buildinfo?buildID=446732
 *Note: jon-server-patch-3.3.0.GA.zip maps to ER01 build of
 jon-server-3.3.0.GA-update-03.zip.

Comment 7 Sunil Kondkar 2015-07-15 10:18:35 UTC
Verified on JBoss ON 3.3.3 ER01 build

Verified AS7 plug-in deletion after operation invoked on compatible resource group of JBossAS7 Standalone Server and also for JBossAS7 Standalone Servers auto-group.

The plug-in deletion is successful without errors.

Comment 9 errata-xmlrpc 2015-07-30 16:41:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1525.html