Red Hat Bugzilla – Bug 697061
group config change histories are broken when one fails
Last modified: 2013-08-05 20:39:19 EDT
Description of problem:
If a group configuration change includes a resource whose update fails, the config change histories for the other resources never leave the "inprogress" state, even though those updates really did finish. For the ones that showed inprogress in my test, I went to the agent machine and checked each resource's actual config; it had been changed successfully.
How to reproduce:
Create a group config change with multiple resources. Ensure one of them fails. Notice that some of the others are still shown as in progress (though if you check the actual resource's config, you'll see they did change).
I was unable to reproduce this. I created a compat group of DefaultDS datasources from three different AS6 server Resources. I then stopped one of the AS6 servers, edited the group config, made a change, and saved it. The group update completed with a status of failure, and the members had the expected statuses - two were success and the one corresponding to the down server was failure. None of them were stuck in in-progress.
Mazz, if you give more detailed reproduction steps, I can try again.
This does not appear to be easily reproducible.
I just submitted another, and my group resource history shows INPROGRESS but all individual resources show end states (success or failure).
Jay S. tried it and saw the same.
Possible problem - we may need ConfigurationManagerBean.executeResourceConfigUpdate to check whether the update is part of a group and, if so, whether all individual histories are done; if they are, it should update the group status.
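A minimal sketch of the check proposed above, assuming a simplified three-state status model. The class, enum, and method names here (GroupUpdateStatusSketch, Status, rollUp) are hypothetical and are not taken from the RHQ code base; the real logic lives in ConfigurationManagerBean and may differ.

```java
import java.util.Arrays;
import java.util.List;

public class GroupUpdateStatusSketch {
    // Hypothetical simplified status model for illustration only.
    enum Status { INPROGRESS, SUCCESS, FAILURE }

    // Derive the group update status from the member update statuses.
    // The group stays INPROGRESS until every member has reached an end state;
    // once all are done, any member failure makes the group a FAILURE.
    // An empty member list yields SUCCESS in this sketch.
    static Status rollUp(List<Status> memberStatuses) {
        boolean anyFailed = false;
        for (Status s : memberStatuses) {
            if (s == Status.INPROGRESS) {
                return Status.INPROGRESS;
            }
            if (s == Status.FAILURE) {
                anyFailed = true;
            }
        }
        return anyFailed ? Status.FAILURE : Status.SUCCESS;
    }

    public static void main(String[] args) {
        // Two members succeeded and one failed: the group should end as FAILURE.
        System.out.println(rollUp(Arrays.asList(
                Status.SUCCESS, Status.SUCCESS, Status.FAILURE)));
        // One member is still running: the group should remain INPROGRESS.
        System.out.println(rollUp(Arrays.asList(
                Status.SUCCESS, Status.INPROGRESS)));
    }
}
```

The point of the sketch is that this roll-up has to run after every member update reaches an end state, including the failure path; if it only runs on success, the group can stay INPROGRESS even though all members are done.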
I tried again and reproduced. I think I didn't see it the first time, because the CheckForTimedOutConfigUpdatesJob reaper job, which runs every 10 minutes, happened to run very soon after I initiated the group update.
In any case, it is fixed now - [master c643d2b].
Verified on rhq4 release build (Version: 4.0.0 Build Number: db0c817)
Created a compatible group with multiple resources, stopped one of the resources, and made a group configuration change. The group update completed with a failure status, and the members display the expected statuses.
Marking as verified.
1) in ConfigurationManagerBean.executeResourceConfigurationUpdate(), if the remote call to an Agent to update a config fails, make sure to call checkForCompletedGroupResourceConfigurationUpdate() to update the status of the parent group config update, in the case that the update is part of a group update
2) in ConfigurationServerServiceImpl.persistUpdatedResourceConfiguration(), remove a LOG.isDebugEnabled() check that was erroneously preventing a return call from executing when debug logging was not enabled
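To illustrate the kind of mistake described in item 2, here is a hypothetical reconstruction; the actual RHQ code is not shown in this report, so the method names, parameters, and return values below are assumptions made only to show the pattern of a return statement trapped inside a debug-logging guard.

```java
public class DebugGuardSketch {
    // Buggy pattern: the early return sits inside the debug check, so it only
    // executes when debug logging is enabled. With debug logging off, control
    // falls through and the failed update is handled as if it had succeeded.
    static String persistBuggy(boolean updateFailed, boolean debugEnabled) {
        if (updateFailed) {
            if (debugEnabled) {
                System.out.println("DEBUG: update failed, not persisting");
                return "FAILURE";
            }
        }
        return "PERSISTED";
    }

    // Fixed pattern: only the log statement is guarded; the return always runs.
    static String persistFixed(boolean updateFailed, boolean debugEnabled) {
        if (updateFailed) {
            if (debugEnabled) {
                System.out.println("DEBUG: update failed, not persisting");
            }
            return "FAILURE";
        }
        return "PERSISTED";
    }

    public static void main(String[] args) {
        // Failed update with debug logging disabled: the two variants disagree.
        System.out.println(persistBuggy(true, false));
        System.out.println(persistFixed(true, false));
    }
}
```

A bug of this shape only shows up when debug logging is off, which is the usual production setting, so it is easy to miss during development with debug logging on.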
Setting this to ON_QA so it can be retested, after ips also committed into release-4.0.0 branch:
Verified on build#38 (Version: 4.0.1-SNAPSHOT Build Number: a104cdf)
Created a compatible group with two RHQ agents. Stopped one agent and navigated to the compatible group. Clicking the 'Configuration' tab displayed the message below in the UI, and the tab did not show the configuration details.
Failed to retrieve member Resource configurations for [ResourceGroup[id=10031, name=Group-Agent, category=COMPATIBLE, type=RHQ Agent, isDynaGroup=false, isClusterGroup=false]]
The server log displays:
2011-05-20 19:02:23,730 WARN [gwt-log] Sending exception to client: 
java.lang.Exception: Current group Resource configuration for 10031 cannot be calculated, because one or more of this group's member Resources are DOWN.
at sun.reflect.GeneratedMethodAccessor1251.invoke(Unknown Source)
It throws an exception and does not allow us to edit anything if one of the group's member resources is down, which is working as expected, so marking it verified.
This can be verified as follows:
1) make sure both the RHQ Agent resources in the group are UP
2) go to the group's Configuration>Current tab and wait for the page to fully load
3) stop the Agent corresponding to one of the RHQ Agent resources
4) wait a minute or so, then click the Save button to save the group config
5) from the History subtab, verify that the member update fails for the Agent that is down and that the group update also fails
Pushing back to ON_QA.
Verified on Version: 4.0.1 Build Number: ecd91b2
Created a group of RHQ Agents. Navigated to group's Configuration>Current tab and waited for the page to fully load.
Stopped one of the member agents, made a change in the group config, and saved. The group config history shows the failed status, and the view member history page displays a success status for the agent that is up and a failed status for the agent that is down.
Marking as verified.
Bookkeeping - closing bug - fixed in recent release.