Bug 701375

Summary: [JBossAS4] JVM metrics stop being collected after a restart of EAP server
Product: [Other] RHQ Project Reporter: Larry O'Leary <loleary>
Component: PluginsAssignee: Ian Springer <ian.springer>
Status: CLOSED CURRENTRELEASE QA Contact: Mike Foley <mfoley>
Severity: high Docs Contact:
Priority: urgent    
Version: 4.2CC: ccrouch, hrupp, ian.springer, rsoares, skondkar
Target Milestone: ---   
Target Release: JON 3.0.0, RHQ 4.3.0   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: 4.3 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
JON 2.4.1 EAP 4.3 CP06 EAP started using the following system properties: -Djboss.platform.mbeanserver -Dcom.sun.management.jmxremote -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl
Last Closed: 2012-02-07 19:23:13 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 745494    
Attachments:
Description Flags
patch to upgrade RHQ to EMS 1.3 none

Description Larry O'Leary 2011-05-02 18:00:40 UTC
Description of problem:
If -Djboss.platform.mbeanserver and -Dcom.sun.management.jmxremote were used to start an EAP 4.3 server, restarting the EAP server will result in JMV metrics no longer being collected. The only way to get the missing metrics collected is to restart the JON agent.

Version-Release number of selected component (if applicable):
JON 2.4.1

How reproducible:
Always

Steps to Reproduce:
   #. Install JDK 1.5 if not already
   #. Install JBoss EAP 4.3.CP06
   #. Create a new profile
         JBOSS_HOME=/opt/jboss/eap/jboss-eap-4.3.CP06/jboss-as
         JBOSS_BASE_PROFILE=default
         JBOSS_PROFILE=00024643
         rm -rf "${JBOSS_HOME}/server/${JBOSS_PROFILE}"
         cp -a "${JBOSS_HOME}/server/${JBOSS_BASE_PROFILE}" "${JBOSS_HOME}/server/${JBOSS_PROFILE}"
         cp -a "${JBOSS_HOME}/bin/run.conf" "${JBOSS_HOME}/server/${JBOSS_PROFILE}" 
   #. Configure EAP to use JDK 1.5 if not already
         JAVA_HOME=/opt/java/jdk1.5.0_22
         safe_replacement=$(printf "%s\n" "${JAVA_HOME}" | sed 's/[\&/]/\\&/g')
         sed -i 's/^#JAVA_HOME=.*$/JAVA_HOME="'"${safe_replacement}"'"/g' "${JBOSS_HOME}/server/${JBOSS_PROFILE}/run.conf"
   #. Enable EAP JMX admin account
         sed -i 's/^.*admin=admin$/admin=admin/' "${JBOSS_HOME}/server/${JBOSS_PROFILE}/conf/props/jmx-console-users.properties"
   #. Add required JBoss MBean Server configuration to EAP so JON can monitor JVM
         echo 'JAVA_OPTS="$JAVA_OPTS -Djboss.platform.mbeanserver"' >>"${JBOSS_HOME}/server/${JBOSS_PROFILE}/run.conf"
   #. Add additional JMX/MBean configuration for JConsole to EAP
         echo 'JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote"' >>"${JBOSS_HOME}/server/${JBOSS_PROFILE}/run.conf"
   #. Add additional JMX/MBean configuration to workaround classloader timing issues for JDK1.5 to EAP
         echo 'JAVA_OPTS="$JAVA_OPTS -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl"' >>"${JBOSS_HOME}/server/${JBOSS_PROFILE}/run.conf"
   #. Start EAP
         cd "${JBOSS_HOME}/bin" && eval "./run.sh -c ${JBOSS_PROFILE} -b 0.0.0.0"
   #. Force JON discovery and import new EAP instance
   #. Allow JVM metrics to be collected and confirm they have been collected:
         #. Navigate to JBossAS Server -> vm121.gsslab.rdu.redhat.com:1099 0024643 -> JVM -> Memory Subsystem -> Memory Pool -> PS Eden Space >> Monitor >> Tables
         #. Set "Metric Display Range" Last: 60 Minutes
         #. Default collection interval is 10 minutes
   #. Shutdown EAP
         cd "${JBOSS_HOME}/bin" && ./shutdown.sh -S -s jnp://localhost:1099 -u admin -p admin
   #. Start EAP
         cd "${JBOSS_HOME}/bin" && eval "./run.sh -c ${JBOSS_PROFILE} -b 0.0.0.0"
   #. Allow JVM metrics to be collected and confirm they have been collected:
         #. Navigate to JBossAS Server -> vm121.gsslab.rdu.redhat.com:1099 0024643 -> JVM -> Memory Subsystem -> Memory Pool -> PS Eden Space >> Monitor >> Tables
         #. Set "Metric Display Range" Last: 60 Minutes
         #. No new metrics will be collected
  
Actual results:
JVM metrics are not collected after EAP server is restarted

The following DEBUG message appears in the agent.log file:

DEBUG [MeasurementManager.collector-1] (rhq.core.pc.inventory.ResourceContainer$ResourceComponentInvocationHandler)- Call to [org.rhq.plugins.jmx.MBeanResourceComponent.getValues()] with args [[org.rhq.core.domain.measurement.MeasurementReport@12170ce, [ScheduledMeasurementInfo[res=11574, name={Usage.max}, sched=23537], ScheduledMeasurementInfo[res=11574, name=Type, sched=23547], ScheduledMeasurementInfo[res=11574, name=MemoryManagerNames, sched=23542], ScheduledMeasurementInfo[res=11574, name={Usage.init}, sched=23532], ScheduledMeasurementInfo[res=11574, name={Usage.used}, sched=23522], ScheduledMeasurementInfo[res=11574, name={Usage.committed}, sched=23527]]]] failed.
java.util.concurrent.ExecutionException: java.lang.RuntimeException: Unable to load attributes on bean [java.lang:name=PS Eden Space,type=MemoryPool] Connection failure java.lang:type=MemoryPool,name=PS Eden Space is not registered.
	at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:232)
	at java.util.concurrent.FutureTask.get(FutureTask.java:91)
	at org.rhq.core.pc.inventory.ResourceContainer$ResourceComponentInvocationHandler.invokeInNewThreadWithLock(ResourceContainer.java:446)
	at org.rhq.core.pc.inventory.ResourceContainer$ResourceComponentInvocationHandler.invoke(ResourceContainer.java:434)
	at $Proxy74.getValues(Unknown Source)
	at org.rhq.core.pc.measurement.MeasurementCollectorRunner.getValues(MeasurementCollectorRunner.java:111)
	at org.rhq.core.pc.measurement.MeasurementCollectorRunner.call(MeasurementCollectorRunner.java:91)
	at org.rhq.core.pc.measurement.MeasurementManager$MeasurementCollectionRequester.run(MeasurementManager.java:171)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.RuntimeException: Unable to load attributes on bean [java.lang:name=PS Eden Space,type=MemoryPool] Connection failure java.lang:type=MemoryPool,name=PS Eden Space is not registered.
	at org.mc4j.ems.impl.jmx.connection.bean.DMBean.refreshAttributes(DMBean.java:268)
	at org.rhq.plugins.jmx.MBeanResourceComponent.getBeanProperties(MBeanResourceComponent.java:320)
	at org.rhq.plugins.jmx.MBeanResourceComponent.getValues(MBeanResourceComponent.java:278)
	at org.rhq.plugins.jmx.MBeanResourceComponent.getValues(MBeanResourceComponent.java:245)
	at sun.reflect.GeneratedMethodAccessor110.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.rhq.core.pc.inventory.ResourceContainer$ComponentInvocationThread.call(ResourceContainer.java:525)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	... 3 more
Caused by: org.mc4j.ems.connection.EmsConnectException: Connection failure java.lang:type=MemoryPool,name=PS Eden Space is not registered.
	at org.mc4j.ems.impl.jmx.connection.support.providers.proxy.GenericMBeanServerProxy.invoke(GenericMBeanServerProxy.java:160)
	at $Proxy109.getAttributes(Unknown Source)
	at org.mc4j.ems.impl.jmx.connection.bean.DMBean.refreshAttributes(DMBean.java:229)
	... 12 more
Caused by: javax.management.InstanceNotFoundException: java.lang:type=MemoryPool,name=PS Eden Space is not registered.
	at org.jboss.mx.server.registry.BasicMBeanRegistry.get(BasicMBeanRegistry.java:523)
	at org.jboss.mx.server.MBeanServerImpl.getAttributes(MBeanServerImpl.java:562)
	at org.jboss.system.server.jmx.LazyMBeanServer.getAttributes(LazyMBeanServer.java:268)
	at sun.reflect.GeneratedMethodAccessor101.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:592)
	at org.jboss.jmx.connector.invoker.InvokerAdaptorService.invoke(InvokerAdaptorService.java:266)
	at sun.reflect.GeneratedMethodAccessor85.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:592)
	at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155)
	at org.jboss.mx.server.Invocation.dispatch(Invocation.java:94)
	at org.jboss.mx.interceptor.AbstractInterceptor.invoke(AbstractInterceptor.java:133)
	at org.jboss.mx.server.Invocation.invoke(Invocation.java:88)
	at org.jboss.mx.interceptor.ModelMBeanOperationInterceptor.invoke(ModelMBeanOperationInterceptor.java:142)
	at org.jboss.jmx.connector.invoker.SerializableInterceptor.invoke(SerializableInterceptor.java:74)
	at org.jboss.jmx.connector.invoker.AuthenticationInterceptor.invoke(AuthenticationInterceptor.java:108)
	at org.jboss.mx.server.Invocation.invoke(Invocation.java:88)
	at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264)
	at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659)
	at org.jboss.system.server.jmx.LazyMBeanServer.invoke(LazyMBeanServer.java:291)
	at org.jboss.invocation.jrmp.server.JRMPProxyFactory.invoke(JRMPProxyFactory.java:180)
	at sun.reflect.GeneratedMethodAccessor84.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:592)
	at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155)
	at org.jboss.mx.server.Invocation.dispatch(Invocation.java:94)
	at org.jboss.mx.server.Invocation.invoke(Invocation.java:86)
	at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264)
	at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659)
	at org.jboss.system.server.jmx.LazyMBeanServer.invoke(LazyMBeanServer.java:291)
	at org.jboss.invocation.jrmp.server.JRMPInvoker$MBeanServerAction.invoke(JRMPInvoker.java:818)
	at org.jboss.invocation.jrmp.server.JRMPInvoker.invoke(JRMPInvoker.java:419)
	at sun.reflect.GeneratedMethodAccessor83.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:592)
	at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:294)
	at sun.rmi.transport.Transport$1.run(Transport.java:153)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.rmi.transport.Transport.serviceCall(Transport.java:149)
	at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:466)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:707)
	at java.lang.Thread.run(Thread.java:595)


The root cause is also logged at WARN level.

This seems to occur for all the JVM level metrics.

Expected results:
JVM metrics should be collected as per their collection schedule even after an EAP restart

Additional info:
I can confirm that this issue does not occur if the system properties -Dcom.sun.management.jmxremote and -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl are not used. I believe the issue is actually being introduced by MBeanServerBuilderImpl as this is a delegate class to load JBoss MBeans into the JVM's MbeanServer. Although our connection object remains valid, the MBeans within it are no longer valid upon restart.

At the moment, this has only been tested against EAP 4.3 CP06. It is unknown if the same issue exists for a later 4.3 release or the 5.x release of EAP (JBoss AS5 plug-in).

It is also worth noting that availability has nothing to do with the issue. For example, if you shutdown the EAP server and force an availability check (or wait for it) the server resource will be reported as DOWN. Once the EAP server is started back up, the availability check goes back to UP but metrics are still not be collected. 

It also does not matter if you restart the EAP instance from outside of JON or use the server resource's stop/start/restart operations.

Comment 1 Charles Crouch 2011-10-07 19:20:12 UTC
I thought we had done work to improve the connection pooling/dropping for EMS

Comment 2 Ian Springer 2011-10-28 23:22:59 UTC
This was in a bug in EMS. EMS caches the MBeanInfos for MBeans it manages. When EMS re-established the remote connection to an MBeanServer that was restarted, it was not clearing the now-defunct cached MBeanInfos. When it later tried to make calls on the cached MBeanInfos, they would fail with InstanceNotFoundExceptions, e.g.:

Caused by: javax.management.InstanceNotFoundException:
java.lang:type=MemoryPool,name=PS Eden Space is not registered.

or other exceptions due to the MBeanInfo being defunct.

I have fixed the issue in EMS 1.3, which clears all cached MBeanInfos after a reconnect. 

The attached patch upgrades RHQ to EMS 1.3. I will apply it once RHQ 4.2 has been released and master is open for business.

Comment 3 Ian Springer 2011-10-28 23:24:53 UTC
Created attachment 530751 [details]
patch to upgrade RHQ to EMS 1.3

Comment 4 Ian Springer 2011-10-28 23:29:53 UTC
For the record, I reproduced this bug using EAP 4.3.0.GA_CP09 on JDK 6.0u29 and RHQ 4.2 on JDK 6.0u29, and verified my fix on the same.

Comment 5 Ian Springer 2011-11-04 16:27:10 UTC
[master 9c437b0] (http://git.fedorahosted.org/git/?p=rhq/rhq.git;a=commitdiff;h=9c437b0) upgrades EMS from 1.2.16 to 1.3 in the root pom, which should fix this issue.

Comment 6 Ian Springer 2011-11-04 16:43:40 UTC
Fix cherry-picked to release_jon3.x branch - commit dc4a80e.

Comment 7 Sunil Kondkar 2011-11-23 12:24:24 UTC
Verified on build#100 (Version: 4.2.0.JON300-SNAPSHOT Build Number: 5d0d95e) in release_jon3.x branch.

Verified on JBoss EAP 4.3.CP06 on JDK 6u18.

Followed the steps and navigated to JBossAS Server -> EAP suniltest.usersys.redhat.com:1099 00024643 -> JVM -> Memory Subsystem -> Memory Pool -> PS Eden Space >> Monitor>> Tables

Verified that the JVM metrics are collected after restart of EAP server.

Comment 8 Mike Foley 2012-02-07 19:23:13 UTC
changing status of VERIFIED BZs for JON 2.4.2 and JON 3.0 to CLOSED/CURRENTRELEASE