Bug 727686

Summary: "Failed to load group Resource configuration" when attempting to click Configure tab for compat group of Cron Resources
Product: [Other] RHQ Project
Reporter: Robert Buck <rbuck>
Component: Core Server
Assignee: John Mazzitelli <mazz>
Status: CLOSED NOTABUG
QA Contact: Mike Foley <mfoley>
Severity: medium
Docs Contact:
Priority: low
Version: 1.2
CC: cwelton, hrupp, ian.springer, jshaughn, mazz
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: All   
URL: http://jira.rhq-project.org/browse/RHQ-1908
Doc Type: Bug Fix
Clone Of: RHQ-1908
Last Closed: 2011-08-19 15:51:01 UTC
Bug Blocks: 729848, 730796
Attachments:
Description: stack trace; Flags: none

Description Robert Buck 2011-08-02 21:15:53 UTC
I just reproduced this bug in the performance environment; it looks identical to the one that was closed?

+++ This bug was initially created as a clone of Bug #535185 +++

User is unable to click the configuration tab for a group of RHQ Agents.  This seems to work with other groups for which "Configure" is a valid path, such as postgresql servers.

Repro steps:

1. Create a compatible group of 3 or more RHQ Agents.  I selected non-recursive, if that matters.
2. Resource > Compatible Groups > $group_name
3. Attempt to view the Configure page for the group (either by icons in the leftnav or via the "Configure" tab)
4. View results.

Current results:

Failed to load group Resource configuration. Cause: javax.ejb.EJBException:java.lang.RuntimeException: Failed to retrieve live Resource configurations. -> java.lang.RuntimeException:Failed to retrieve live Resource configurations. -> java.util.concurrent.ExecutionException:java.lang.Exception: Failed to obtain live Resource configuration for Resource[id=501051, type=RHQ Agent, key=jonqa.rdu.redhat.com RHQ Agent, name=jonqa.rdu.redhat.com RHQ Agent, parent=jonqa.rdu.redhat.com, version=1.2.0-SNAPSHOT]. -> java.lang.Exception:Failed to obtain live Resource configuration for Resource[id=501051, type=RHQ Agent, key=jonqa.rdu.redhat.com RHQ Agent, name=jonqa.rdu.redhat.com RHQ Agent, parent=jonqa.rdu.redhat.com, version=1.2.0-SNAPSHOT].

Expected results:
Can view/edit configurations.


Other notes: 

See the attached traceback from the server log.

--- Additional comment from cwelton on 2009-04-01 14:23:13 EDT ---

Attaching screenshot, to note that all agents are up, and as far as I can tell had been and continued to be.

--- Additional comment from cwelton on 2009-04-01 15:55:37 EDT ---

This occurs when the agent is misconfigured and defaults to an endpoint of 127.0.0.1.  The agent can communicate with the server, but not vice versa.
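
A quick way to spot this kind of misconfiguration is to check whether the endpoint an agent advertises is a loopback address, since the server cannot call back to it. The following is only a minimal sketch: the class and method names are hypothetical and are not part of RHQ, and it only relies on the standard `java.net.InetAddress` API.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of RHQ: flags agent endpoints the server cannot reach.
final class EndpointCheckSketch {

    /**
     * Returns true if the given host or IP literal resolves to a loopback
     * address (for example 127.0.0.1 or ::1). Such an endpoint is only
     * reachable from the agent's own machine.
     */
    static boolean isLoopbackEndpoint(String host) {
        try {
            return InetAddress.getByName(host).isLoopbackAddress();
        } catch (UnknownHostException e) {
            // An unresolvable endpoint is a configuration problem of its own.
            throw new IllegalArgumentException("Cannot resolve endpoint: " + host, e);
        }
    }
}
```

An agent whose advertised endpoint passes this check can still send requests to the server, which matches the one-way communication described above.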

Reducing the priority of this; it is certainly not a major 2.2 issue.


--- Additional comment from mazz on 2009-04-01 16:42:24 EDT ---

created RHQ-1911 in response to this issue.

--- Additional comment from bugzilla on 2009-11-10 15:49:14 EST ---

This bug was previously known as http://jira.rhq-project.org/browse/RHQ-1908
Imported an attachment (id=368688)
Imported an attachment (id=368689)
This bug relates to RHQ-1911


--- Additional comment from whayutin on 2010-02-16 11:55:21 EST ---

Temporarily adding the keyword "SubBug" so we can be sure we have accounted for all the bugs.

keyword:
new = Tracking + FutureFeature + SubBug

--- Additional comment from whayutin on 2010-02-16 12:00:38 EST ---

making sure we're not missing any bugs in rhq_triage

--- Additional comment from mazz on 2011-03-25 15:13:07 EDT ---

Just tested this: the new GWT UI will still show something; it won't error out.

The latest known config will be shown if some agents in the group are down or unreachable. If the agents are reachable, their live configs are used.

Comment 1 Robert Buck 2011-08-02 21:16:41 UTC
Created attachment 516394 [details]
stack trace

Comment 2 Ian Springer 2011-08-16 16:23:22 UTC
Bob, notice the stack trace you attached is missing the stack for the root cause (the java.util.concurrent.ExecutionException). Please try to reproduce again and then try to grab the full stack trace from the Server or Agent log.

Also, do you only see this for Cron compat groups, or do you see it for other types of compat groups too?
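
For reference, the root cause requested above sits at the end of the exception's cause chain. The sketch below shows one way to walk that chain with the standard `Throwable.getCause()` method; the class and method names are hypothetical and do not come from the RHQ code base.

```java
// Hypothetical helper, not part of RHQ: walks an exception's cause chain.
final class CauseChainSketch {

    /** Returns the innermost cause of the given throwable (the throwable itself if it has none). */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    /** Renders the chain as "outer -> ... -> root", one "class: message" entry per link. */
    static String describeChain(Throwable t) {
        StringBuilder sb = new StringBuilder();
        for (Throwable current = t; current != null; current = current.getCause()) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(current.getClass().getName()).append(": ").append(current.getMessage());
        }
        return sb.toString();
    }
}
```

Calling `rootCause(e).printStackTrace()` on the logged exception would print the stack of the innermost cause, which is the part missing from the attachment.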

Comment 3 Ian Springer 2011-08-16 19:51:50 UTC
I updated the title of this BZ, since it looks like Bob is seeing an error for a group of Cron resources, not a group of RHQ Agent resources.

Note, I suspect this is a bug in the cron plugin and not in the RHQ platform or GUI. Waiting to hear back from Bob with the full stack trace and whether he can reproduce the issue with other types of compat groups.

Comment 4 Jay Shaughnessy 2011-08-17 15:18:04 UTC
I saw this today. It seems rooted in the fact that an agent is down.  There was a corresponding *WARN level* server stack, shown below.  The problem may be that we're not getting the right message to the user.  Here is the [very abbreviated] stack:

09:17:24,771 WARN  [gwt-log] Sending exception to client: [1313587044768]
java.lang.Exception: Current group Resource configuration for 10056 cannot be calculated, because one or more of this group's member Resources are DOWN.
        at org.rhq.enterprise.server.configuration.ConfigurationManagerBean.getResourceConfigurationsForCompatibleGroup(ConfigurationManagerBean.java:610)

Comment 5 John Mazzitelli 2011-08-17 16:52:07 UTC
I just tried this and the second time I went to the group config page, I got this:

java.lang.RuntimeException:One or more null or empty member connection settings was returned by the Server.
--- STACK TRACE FOLLOWS ---
One or more null or empty member connection settings was returned by the Server.
    at Unknown.com_google_gwt_core_client_impl_StackTraceCreator$CollectorMoz_collect__Lcom_google_gwt_core_client_JsArrayString_2(Unknown source:0)
...
    at Unknown.org_rhq_enterprise_gui_coregui_client_inventory_groups_detail_configuration_GroupResourceConfigurationEditView$3$1_$onSuccess__Lorg_rhq_enterprise_gui_coregui_client_inventory_groups_detail_configuration_GroupResourceConfigurationEditView$3$1_2Ljava_util_Map_2V(Unknown source:0)
    at Unknown.org_rhq_enterprise_gui_coregui_client_inventory_groups_detail_configuration_GroupResourceConfigurationEditView$3$1_onSuccess__Ljava_lang_Object_2V(Unknown source:0)
    at Unknown.com_google_gwt_user_client_rpc_impl_RequestCallbackAdapter_onResponseReceived__Lcom_google_gwt_http_client_Request_2Lc
...

Note that BOTH of my agents were up and running.

Comment 6 John Mazzitelli 2011-08-17 17:00:04 UTC
I put down bug 719731 and bug 719999 in the see-also section, but I don't think that this is truly related to those. This "cannot get live config" is actually to be expected IF the agent cannot be accessed for whatever reason. If the server fails to communicate with the agent successfully and/or fails to retrieve the live configuration successfully for whatever reason (maybe the managed resource was down? maybe there is a bug in the plugin container or the plugin itself?) then we will not show any group config in this subtab. The reason is that we do not want to show potentially erroneous config data to the user. Unless we can be sure we have the most current LIVE config for all group members, we will not show the group config, lest the user think the data that is being shown is the true up-to-date config for all resources (which it isn't, because we failed to get at least one resource's live config).
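
The all-or-nothing policy described above can be sketched roughly as follows. This is an illustration only, not the actual ConfigurationManagerBean code: the class name, the method name, the loader function, and the use of plain string maps for configurations are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch, not RHQ code: show a group config only if every member's live config loads.
final class GroupConfigSketch {

    /**
     * Tries to load the live configuration of every group member.
     * Returns the per-member configs if all lookups succeed, or an empty
     * Optional as soon as a single lookup fails or yields nothing, so that
     * stale or partial data is never presented as the group's current config.
     */
    static Optional<Map<Integer, Map<String, String>>> loadGroupConfig(
            List<Integer> memberIds,
            Function<Integer, Map<String, String>> liveConfigLoader) {
        Map<Integer, Map<String, String>> configs = new LinkedHashMap<>();
        for (Integer id : memberIds) {
            Map<String, String> config;
            try {
                config = liveConfigLoader.apply(id);
            } catch (RuntimeException e) {
                // Agent unreachable, resource down, plugin error: give up on the whole group.
                return Optional.empty();
            }
            if (config == null) {
                return Optional.empty();
            }
            configs.put(id, config);
        }
        return Optional.of(configs);
    }
}
```

With this shape, a single failing member is enough to suppress the group view, which is why the error can appear even when every agent reports UP.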

So, I would say: make sure you did not get any error messages in the server and/or agent logs that indicate problems with the agent communication (even though the agent may show UP, it's possible that particular comm request failed for some reason).

Lowering the priority of this, but I will look at those other two bugs, as they do not appear to have any explanation yet.

Comment 7 John Mazzitelli 2011-08-19 15:51:01 UTC
This is to be expected when you can't get the live config. Whether or not it is what we want is another question, but the code as it is today works.