Bug 835113

Summary: EJB2 MDBs are DOWN in JON UI
Product: [Other] RHQ Project
Reporter: bkramer <bkramer>
Component: Plugins
Assignee: Jay Shaughnessy <jshaughn>
Status: CLOSED CURRENTRELEASE
QA Contact: Mike Foley <mfoley>
Severity: high
Docs Contact:
Priority: high
Version: 4.4
CC: ccrouch, hrupp, jshaughn, loleary
Target Milestone: ---   
Target Release: RHQ 4.5.0   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: 4.5
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 835111
Environment:
Last Closed: 2013-09-01 10:12:51 UTC
Type: Bug
Bug Depends On:    
Bug Blocks: 835111    
Attachments:
Description Flags
zip file containing files from "Additional info" in the Description. none

Description bkramer 2012-06-25 15:06:32 UTC
Created attachment 594208 [details]
zip file containing files from "Additional info" in the Description.

Description of problem:
For some reason, JON does not recognise an EJB2 MDB that is UP and running properly. The JMX console shows StateString = "Started" and State = "3" for this bean. At the same time, the same MDB is detected in JON, but its availability stays DOWN.

Version-Release number of selected component (if applicable):
JON 3.0.1

How reproducible:
Always

Steps to Reproduce:
1. Install EJB2 MDB in JBoss EAP 5.1.1;
2. Start and run JON 3.0.1 Server and Agent;
3. In JON UI, navigate to the newly installed Bean.
  
Actual results:
Although this MDB works properly, JON reports it as DOWN.


Expected results:
The availability of this MDB should be reported as UP.

Additional info:
Attached files:
1) mdb_down_in_JON.png;
2) jmx_view.png
3) destination_view.png
4) agent.log file in TRACE level;
5) AuctionOffer.jar 

** To deploy MDB, copy AuctionOffer.jar into deploy folder and add the following into ../deploy/messaging/destinations-service.xml:


<!--
      Queue Auction
-->
<mbean code="org.jboss.jms.server.destination.QueueService"
       name="jboss.messaging.destination:service=Queue,name=Auction"
       xmbean-dd="xmdesc/Queue-xmbean.xml">
   <depends optional-attribute-name="ServerPeer">jboss.messaging:service=ServerPeer</depends>
   <depends>jboss.messaging:service=PostOffice</depends>
</mbean>

Comment 1 Ian Springer 2012-06-25 16:20:01 UTC
Worked around in master by always translating a run state of UNKNOWN to an avail of UP:

http://git.fedorahosted.org/git/?p=rhq/rhq.git;a=commitdiff;h=841e1f0

Comment 2 JBoss JIRA Server 2012-06-26 13:29:53 UTC
Ian Springer <ian.springer> made a comment on jira JBPAPP-9397

Did you check whether this issue also exists for EJB2 stateless session beans, stateful session beans, and/or entity beans?

Comment 3 JBoss JIRA Server 2012-06-26 14:00:59 UTC
Biljana Kramer <bkramer> made a comment on jira JBPAPP-9397

No, I had and tested only MDB.

Comment 4 JBoss JIRA Server 2012-06-26 19:02:38 UTC
Larry O'Leary <loleary> made a comment on jira JBPAPP-9397

After more consideration, we cannot conclude that the RunState for the EJB2 MDB ManagedComponent is actually UNKNOWN. More investigation is being done in the upstream RHQ Bug 835113 and the findings should be updated here when available.

Comment 5 Larry O'Leary 2012-06-26 19:04:55 UTC
(In reply to comment #1)
> Worked around in master by always translating a run state of UNKNOWN to an
> avail of UP:
> 
> http://git.fedorahosted.org/git/?p=rhq/rhq.git;a=commitdiff;h=841e1f0

This workaround is insufficient as it will result in all ManagedComponents which failed to be deployed (are missing from AS) returning a status of UP.

The workaround should be backed out until a more suitable solution is found.

Comment 6 Ian Springer 2012-06-26 19:06:54 UTC
>> This workaround is insufficient as it will result in all ManagedComponents 
>> which failed to be deployed (are missing from AS) returning a status of UP.

Not all MCs, just EJB2 MDB MCs - right?

Comment 7 Larry O'Leary 2012-06-26 19:09:52 UTC
(In reply to comment #6)
> >> This workaround is insufficient as it will result in all ManagedComponents 
> >> which failed to be deployed (are missing from AS) returning a status of UP.
> 
> Not all MCs, just EJB2 MDB MCs - right?

Sorry. You are correct, EJB2 MDB MCs only.

Comment 8 Jay Shaughnessy 2012-06-27 15:15:16 UTC
The root cause is now understood.  The agent logs attached to the case are more helpful than those attached to this BZ.  These logs show two things about calls to getAvailability() for types based on class ManagedComponentComponent. First, it seems that in some cases attempts to get the ManagedComponent return null.  This throws exceptions that are caught by the availability checker and produce messages like:

"Failed to collect availability on Resource..."

In other cases the runState of the returned MC is null.

These situations may be due to bug 795918.  It's not clear, but the situation tends to correct itself and subsequent requests return an MC with a valid runState.  Any fixes for that problem may prove beneficial in addition to the changes made for this BZ.

The first problem to be handled here has to do with the fact that the code defaulted the runState to UNKNOWN.  This is actually not a valid thing to do because UNKNOWN is a valid runState, and we need to be able to distinguish a null runState from UNKNOWN.  So, fix #1 is to change the default to null and handle appropriately such that a null runState always reports DOWN avail.
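To illustrate fix #1, here is a minimal sketch of the null-versus-UNKNOWN distinction. It is not the committed plugin code: SketchRunState, SketchAvail and AvailMapper are hypothetical stand-ins, and mapping only RUNNING to UP is an assumption made for the example (this BZ only states that a null runState must report DOWN).

```java
// Hypothetical stand-ins for the real run state and availability types.
enum SketchRunState { RUNNING, STOPPED, UNKNOWN }

enum SketchAvail { UP, DOWN }

final class AvailMapper {
    private AvailMapper() {
    }

    /**
     * Maps a run state to an availability.
     * A null run state means no valid state was obtained, so it is always DOWN
     * and is never confused with the valid state UNKNOWN.
     */
    static SketchAvail toAvail(SketchRunState runState) {
        if (runState == null) {
            return SketchAvail.DOWN;
        }
        // Assumption for this sketch: only RUNNING counts as UP.
        switch (runState) {
            case RUNNING:
                return SketchAvail.UP;
            default:
                return SketchAvail.DOWN;
        }
    }
}
```

With a null default, a log message can report "runState [null]" and make it obvious that the state was never obtained, rather than suggesting the server returned UNKNOWN.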

The second issue was that the override of getManagedComponent() in Ejb2BeanComponent did not work with the same semantics as the super's impl. This was mainly due to the super performing various side-effects, making an override dangerous.  The result was that the MC was never marked as being fetched, and also that failures to fetch the MC were never reported as Exceptions.

The code has been reworked to centralize the error handling, add proper Javadoc, and make the override point perform only the actual "get" and not worry about error handling or side-effects.
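The shape of that rework can be sketched roughly as follows. This is an illustration only, with hypothetical names (SketchBaseComponent, fetchManagedComponent, SketchEjb2Component) and a plain Object standing in for the ManagedComponent; the real classes and signatures in the plugin may differ.

```java
// Hypothetical base class: owns the side-effects and the error handling.
abstract class SketchBaseComponent {
    private boolean fetched;

    /** Override point: perform only the lookup. May return null if nothing is found. */
    protected abstract Object fetchManagedComponent();

    /** Centralized handling: final, so subclasses cannot bypass it. */
    public final Object getManagedComponent() {
        Object mc;
        try {
            mc = fetchManagedComponent();
        } catch (RuntimeException e) {
            throw new IllegalStateException("Failed to fetch managed component", e);
        }
        if (mc == null) {
            // A failed fetch is always reported as an exception.
            throw new IllegalStateException("Managed component not found");
        }
        fetched = true; // side-effect lives in the base class only
        return mc;
    }

    public final boolean wasFetched() {
        return fetched;
    }
}

// Hypothetical subclass: overrides the lookup and nothing else.
final class SketchEjb2Component extends SketchBaseComponent {
    private final Object lookupResult;

    SketchEjb2Component(Object lookupResult) {
        this.lookupResult = lookupResult;
    }

    @Override
    protected Object fetchManagedComponent() {
        return lookupResult;
    }
}
```

Because the public method is final, a subclass with a different lookup strategy still gets the "marked as fetched" bookkeeping and the exception on failure for free.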

Comment 9 Jay Shaughnessy 2012-06-27 17:28:07 UTC
master commit 8e3709f8d7a7ffce5b44322a656836be13a7a652

Restructure getAvailability() and getManagedComponent() to remove
ambiguity in UNKNOWN runState handling as well as provide consistent
Exception throwing/handling, and ManagedComponent refresh.  Simplify the
override point for getManagedComponent.

Remove Ejb2BeanComponent's special-handling for UNKNOWN runState, previously committed for this fix, in favor of the new base handling.


Test Notes:
Note that all relevant logging for issues discussed here is at DEBUG level.
Note that if we fail to get the ManagedComponent it will be (by default) 15 minutes before we try again, so avail will be DOWN for that resource for at least that long.
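For testers, the effect of that retry delay can be pictured with the sketch below. It is a simplified model and not the plugin's actual refresh logic: SketchRefreshingHolder, its fields and the injectable clock are all hypothetical, and only the 15 minute default comes from the note above.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical holder that re-fetches a value only after an interval has elapsed.
final class SketchRefreshingHolder<T> {
    private final Supplier<T> fetcher;
    private final LongSupplier clockMillis;
    private final long refreshIntervalMillis;
    private T cached;
    private long lastAttemptMillis;
    private boolean attempted;

    SketchRefreshingHolder(Supplier<T> fetcher, LongSupplier clockMillis, long refreshIntervalMillis) {
        this.fetcher = fetcher;
        this.clockMillis = clockMillis;
        this.refreshIntervalMillis = refreshIntervalMillis;
    }

    /** Returns the cached value, fetching again only when the interval has passed. */
    T get() {
        long now = clockMillis.getAsLong();
        if (!attempted || now - lastAttemptMillis >= refreshIntervalMillis) {
            attempted = true;
            lastAttemptMillis = now;
            cached = fetcher.get(); // may be null if the fetch finds nothing
        }
        return cached;
    }
}
```

In this model a fetch that returns null at time zero keeps returning null until the interval has passed, which is why a resource can stay DOWN for at least that long even after the server side has recovered.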

Comment 10 Jay Shaughnessy 2012-06-27 20:55:44 UTC
I guess I'm not sure whether this BZ should really be ON_QA.  I don't think the root problem is necessarily solved by the code changes here.  The root problem is really that in several cases we're not successfully fetching the ManagedComponent for the resource. I don't really know why that is, but it is perhaps related to bug 795918 as mentioned above.

So, given that the code changes thus far only correct some errant code, I expect that the resources will continue to report as DOWN, but handling and logging should be more coherent. In fact, it's possible additional resources will show as DOWN now that we've corrected the UNKNOWN runState default problem.

Comment 11 Larry O'Leary 2012-06-29 04:45:29 UTC
I tested the commits from this bug against the test case described in the bug description and found that the commits do fix this issue for the EJB2 MDB. However, as suspected, the result is that other services are now marked as DOWN because we were explicitly interpreting runState UNKNOWN as UP and now it is [null]. The only resources I see this on in my current test environment are web application contexts:

2012-06-28 23:13:16,854 DEBUG [InventoryManager.discovery-1] (org.rhq.plugins.jbossas5.WebApplicationContextComponent)- Returning DOWN avail for ComponentType{type=MBean, subtype=WebApplicationManager} component 'jboss.web:host=localhost,path=/admin-console,type=Manager' with runState [null].
2012-06-28 23:13:16,854 DEBUG [InventoryManager.discovery-1] (rhq.core.pc.inventory.RuntimeDiscoveryExecutor)- Availability of [Resource[id=10073, type=Web Application Context, key=localhost, name=//localhost/admin-console, parent=admin-console.war]] is not UP, cannot perform service scan on it.
2012-06-28 23:13:16,854 DEBUG [InventoryManager.discovery-1] (org.rhq.plugins.jbossas5.WebApplicationContextComponent)- Returning DOWN avail for ComponentType{type=MBean, subtype=WebApplicationManager} component 'jboss.web:host=localhost,path=/admin-console,type=Manager' with runState [null].
2012-06-28 23:13:16,855 DEBUG [InventoryManager.discovery-1] (rhq.core.pc.inventory.RuntimeDiscoveryExecutor)- Availability of [Resource[id=10073, type=Web Application Context, key=localhost, name=//localhost/admin-console, parent=admin-console.war]] is not UP, cannot perform service scan on it.


I am not completely certain that this changed behavior makes sense though. Could these changes have caused this?

Comment 12 Larry O'Leary 2012-06-29 04:52:20 UTC
Actually, strike that last comment. It appears to have been a result of the ProfileService.load() issue in which the component types get wiped away temporarily. Everything is now reported as UP. I missed the messages stating that the component failed to load due to the missing component type.

As such, I think this BZ can be moved to ON_QA.

Comment 13 JBoss JIRA Server 2012-06-29 04:58:28 UTC
Larry O'Leary <loleary> updated the status of jira JBPAPP-9397 to Closed

Comment 14 JBoss JIRA Server 2012-06-29 04:58:28 UTC
Larry O'Leary <loleary> made a comment on jira JBPAPP-9397

Closing this as the state returned by ProfileService was actually RUNNING. It was RHQ that was setting the RunState to UNKNOWN due to a misbehaving override.

Comment 15 Heiko W. Rupp 2013-09-01 10:12:51 UTC
Bulk closing of items that are on_qa and in old RHQ releases, which are out for a long time and where the issue has not been re-opened since.