Bug 701375 - [JBossAS4] JVM metrics stop being collected after a restart of EAP server
Summary: [JBossAS4] JVM metrics stop being collected after a restart of EAP server
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: RHQ Project
Classification: Other
Component: Plugins
Version: 4.2
Hardware: All
OS: All
urgent
high
Target Milestone: ---
: JON 3.0.0,RHQ 4.3.0
Assignee: Ian Springer
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On:
Blocks: jon30-sprint8
TreeView+ depends on / blocked
 
Reported: 2011-05-02 18:00 UTC by Larry O'Leary
Modified: 2018-11-14 20:37 UTC (History)
5 users (show)

Fixed In Version: 4.3
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
JON 2.4.1 EAP 4.3 CP06 EAP started using the following system properties: -Djboss.platform.mbeanserver -Dcom.sun.management.jmxremote -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl
Last Closed: 2012-02-07 19:23:13 UTC
Embargoed:


Attachments (Terms of Use)
patch to upgrade RHQ to EMS 1.3 (604 bytes, patch)
2011-10-28 23:24 UTC, Ian Springer
no flags Details | Diff


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Legacy) 55049 0 None None None Never

Description Larry O'Leary 2011-05-02 18:00:40 UTC
Description of problem:
If -Djboss.platform.mbeanserver and -Dcom.sun.management.jmxremote were used to start an EAP 4.3 server, restarting the EAP server will result in JMV metrics no longer being collected. The only way to get the missing metrics collected is to restart the JON agent.

Version-Release number of selected component (if applicable):
JON 2.4.1

How reproducible:
Always

Steps to Reproduce:
   #. Install JDK 1.5 if not already
   #. Install JBoss EAP 4.3.CP06
   #. Create a new profile
         JBOSS_HOME=/opt/jboss/eap/jboss-eap-4.3.CP06/jboss-as
         JBOSS_BASE_PROFILE=default
         JBOSS_PROFILE=00024643
         rm -rf "${JBOSS_HOME}/server/${JBOSS_PROFILE}"
         cp -a "${JBOSS_HOME}/server/${JBOSS_BASE_PROFILE}" "${JBOSS_HOME}/server/${JBOSS_PROFILE}"
         cp -a "${JBOSS_HOME}/bin/run.conf" "${JBOSS_HOME}/server/${JBOSS_PROFILE}" 
   #. Configure EAP to use JDK 1.5 if not already
         JAVA_HOME=/opt/java/jdk1.5.0_22
         safe_replacement=$(printf "%s\n" "${JAVA_HOME}" | sed 's/[\&/]/\\&/g')
         sed -i 's/^#JAVA_HOME=.*$/JAVA_HOME="'"${safe_replacement}"'"/g' "${JBOSS_HOME}/server/${JBOSS_PROFILE}/run.conf"
   #. Enable EAP JMX admin account
         sed -i 's/^.*admin=admin$/admin=admin/' "${JBOSS_HOME}/server/${JBOSS_PROFILE}/conf/props/jmx-console-users.properties"
   #. Add required JBoss MBean Server configuration to EAP so JON can monitor JVM
         echo 'JAVA_OPTS="$JAVA_OPTS -Djboss.platform.mbeanserver"' >>"${JBOSS_HOME}/server/${JBOSS_PROFILE}/run.conf"
   #. Add additional JMX/MBean configuration for JConsole to EAP
         echo 'JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote"' >>"${JBOSS_HOME}/server/${JBOSS_PROFILE}/run.conf"
   #. Add additional JMX/MBean configuration to workaround classloader timing issues for JDK1.5 to EAP
         echo 'JAVA_OPTS="$JAVA_OPTS -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl"' >>"${JBOSS_HOME}/server/${JBOSS_PROFILE}/run.conf"
   #. Start EAP
         cd "${JBOSS_HOME}/bin" && eval "./run.sh -c ${JBOSS_PROFILE} -b 0.0.0.0"
   #. Force JON discovery and import new EAP instance
   #. Allow JVM metrics to be collected and confirm they have been collected:
         #. Navigate to JBossAS Server -> vm121.gsslab.rdu.redhat.com:1099 0024643 -> JVM -> Memory Subsystem -> Memory Pool -> PS Eden Space >> Monitor >> Tables
         #. Set "Metric Display Range" Last: 60 Minutes
         #. Default collection interval is 10 minutes
   #. Shutdown EAP
         cd "${JBOSS_HOME}/bin" && ./shutdown.sh -S -s jnp://localhost:1099 -u admin -p admin
   #. Start EAP
         cd "${JBOSS_HOME}/bin" && eval "./run.sh -c ${JBOSS_PROFILE} -b 0.0.0.0"
   #. Allow JVM metrics to be collected and confirm they have been collected:
         #. Navigate to JBossAS Server -> vm121.gsslab.rdu.redhat.com:1099 0024643 -> JVM -> Memory Subsystem -> Memory Pool -> PS Eden Space >> Monitor >> Tables
         #. Set "Metric Display Range" Last: 60 Minutes
         #. No new metrics will be collected
  
Actual results:
JVM metrics are not collected after EAP server is restarted

The following DEBUG message appears in the agent.log file:

DEBUG [MeasurementManager.collector-1] (rhq.core.pc.inventory.ResourceContainer$ResourceComponentInvocationHandler)- Call to [org.rhq.plugins.jmx.MBeanResourceComponent.getValues()] with args [[org.rhq.core.domain.measurement.MeasurementReport@12170ce, [ScheduledMeasurementInfo[res=11574, name={Usage.max}, sched=23537], ScheduledMeasurementInfo[res=11574, name=Type, sched=23547], ScheduledMeasurementInfo[res=11574, name=MemoryManagerNames, sched=23542], ScheduledMeasurementInfo[res=11574, name={Usage.init}, sched=23532], ScheduledMeasurementInfo[res=11574, name={Usage.used}, sched=23522], ScheduledMeasurementInfo[res=11574, name={Usage.committed}, sched=23527]]]] failed.
java.util.concurrent.ExecutionException: java.lang.RuntimeException: Unable to load attributes on bean [java.lang:name=PS Eden Space,type=MemoryPool] Connection failure java.lang:type=MemoryPool,name=PS Eden Space is not registered.
	at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:232)
	at java.util.concurrent.FutureTask.get(FutureTask.java:91)
	at org.rhq.core.pc.inventory.ResourceContainer$ResourceComponentInvocationHandler.invokeInNewThreadWithLock(ResourceContainer.java:446)
	at org.rhq.core.pc.inventory.ResourceContainer$ResourceComponentInvocationHandler.invoke(ResourceContainer.java:434)
	at $Proxy74.getValues(Unknown Source)
	at org.rhq.core.pc.measurement.MeasurementCollectorRunner.getValues(MeasurementCollectorRunner.java:111)
	at org.rhq.core.pc.measurement.MeasurementCollectorRunner.call(MeasurementCollectorRunner.java:91)
	at org.rhq.core.pc.measurement.MeasurementManager$MeasurementCollectionRequester.run(MeasurementManager.java:171)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.RuntimeException: Unable to load attributes on bean [java.lang:name=PS Eden Space,type=MemoryPool] Connection failure java.lang:type=MemoryPool,name=PS Eden Space is not registered.
	at org.mc4j.ems.impl.jmx.connection.bean.DMBean.refreshAttributes(DMBean.java:268)
	at org.rhq.plugins.jmx.MBeanResourceComponent.getBeanProperties(MBeanResourceComponent.java:320)
	at org.rhq.plugins.jmx.MBeanResourceComponent.getValues(MBeanResourceComponent.java:278)
	at org.rhq.plugins.jmx.MBeanResourceComponent.getValues(MBeanResourceComponent.java:245)
	at sun.reflect.GeneratedMethodAccessor110.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.rhq.core.pc.inventory.ResourceContainer$ComponentInvocationThread.call(ResourceContainer.java:525)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	... 3 more
Caused by: org.mc4j.ems.connection.EmsConnectException: Connection failure java.lang:type=MemoryPool,name=PS Eden Space is not registered.
	at org.mc4j.ems.impl.jmx.connection.support.providers.proxy.GenericMBeanServerProxy.invoke(GenericMBeanServerProxy.java:160)
	at $Proxy109.getAttributes(Unknown Source)
	at org.mc4j.ems.impl.jmx.connection.bean.DMBean.refreshAttributes(DMBean.java:229)
	... 12 more
Caused by: javax.management.InstanceNotFoundException: java.lang:type=MemoryPool,name=PS Eden Space is not registered.
	at org.jboss.mx.server.registry.BasicMBeanRegistry.get(BasicMBeanRegistry.java:523)
	at org.jboss.mx.server.MBeanServerImpl.getAttributes(MBeanServerImpl.java:562)
	at org.jboss.system.server.jmx.LazyMBeanServer.getAttributes(LazyMBeanServer.java:268)
	at sun.reflect.GeneratedMethodAccessor101.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:592)
	at org.jboss.jmx.connector.invoker.InvokerAdaptorService.invoke(InvokerAdaptorService.java:266)
	at sun.reflect.GeneratedMethodAccessor85.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:592)
	at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155)
	at org.jboss.mx.server.Invocation.dispatch(Invocation.java:94)
	at org.jboss.mx.interceptor.AbstractInterceptor.invoke(AbstractInterceptor.java:133)
	at org.jboss.mx.server.Invocation.invoke(Invocation.java:88)
	at org.jboss.mx.interceptor.ModelMBeanOperationInterceptor.invoke(ModelMBeanOperationInterceptor.java:142)
	at org.jboss.jmx.connector.invoker.SerializableInterceptor.invoke(SerializableInterceptor.java:74)
	at org.jboss.jmx.connector.invoker.AuthenticationInterceptor.invoke(AuthenticationInterceptor.java:108)
	at org.jboss.mx.server.Invocation.invoke(Invocation.java:88)
	at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264)
	at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659)
	at org.jboss.system.server.jmx.LazyMBeanServer.invoke(LazyMBeanServer.java:291)
	at org.jboss.invocation.jrmp.server.JRMPProxyFactory.invoke(JRMPProxyFactory.java:180)
	at sun.reflect.GeneratedMethodAccessor84.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:592)
	at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155)
	at org.jboss.mx.server.Invocation.dispatch(Invocation.java:94)
	at org.jboss.mx.server.Invocation.invoke(Invocation.java:86)
	at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264)
	at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659)
	at org.jboss.system.server.jmx.LazyMBeanServer.invoke(LazyMBeanServer.java:291)
	at org.jboss.invocation.jrmp.server.JRMPInvoker$MBeanServerAction.invoke(JRMPInvoker.java:818)
	at org.jboss.invocation.jrmp.server.JRMPInvoker.invoke(JRMPInvoker.java:419)
	at sun.reflect.GeneratedMethodAccessor83.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:592)
	at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:294)
	at sun.rmi.transport.Transport$1.run(Transport.java:153)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.rmi.transport.Transport.serviceCall(Transport.java:149)
	at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:466)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:707)
	at java.lang.Thread.run(Thread.java:595)


The root cause is also logged at WARN level.

This seems to occur for all the JVM level metrics.

Expected results:
JVM metrics should be collected as per their collection schedule even after an EAP restart

Additional info:
I can confirm that this issue does not occur if the system properties -Dcom.sun.management.jmxremote and -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl are not used. I believe the issue is actually being introduced by MBeanServerBuilderImpl as this is a delegate class to load JBoss MBeans into the JVM's MbeanServer. Although our connection object remains valid, the MBeans within it are no longer valid upon restart.

At the moment, this has only been tested against EAP 4.3 CP06. It is unknown if the same issue exists for a later 4.3 release or the 5.x release of EAP (JBoss AS5 plug-in).

It is also worth noting that availability has nothing to do with the issue. For example, if you shutdown the EAP server and force an availability check (or wait for it) the server resource will be reported as DOWN. Once the EAP server is started back up, the availability check goes back to UP but metrics are still not be collected. 

It also does not matter if you restart the EAP instance from outside of JON or use the server resource's stop/start/restart operations.

Comment 1 Charles Crouch 2011-10-07 19:20:12 UTC
I thought we had done work to improve the connection pooling/dropping for EMS

Comment 2 Ian Springer 2011-10-28 23:22:59 UTC
This was in a bug in EMS. EMS caches the MBeanInfos for MBeans it manages. When EMS re-established the remote connection to an MBeanServer that was restarted, it was not clearing the now-defunct cached MBeanInfos. When it later tried to make calls on the cached MBeanInfos, they would fail with InstanceNotFoundExceptions, e.g.:

Caused by: javax.management.InstanceNotFoundException:
java.lang:type=MemoryPool,name=PS Eden Space is not registered.

or other exceptions due to the MBeanInfo being defunct.

I have fixed the issue in EMS 1.3, which clears all cached MBeanInfos after a reconnect. 

The attached patch upgrades RHQ to EMS 1.3. I will apply it once RHQ 4.2 has been released and master is open for business.

Comment 3 Ian Springer 2011-10-28 23:24:53 UTC
Created attachment 530751 [details]
patch to upgrade RHQ to EMS 1.3

Comment 4 Ian Springer 2011-10-28 23:29:53 UTC
For the record, I reproduced this bug using EAP 4.3.0.GA_CP09 on JDK 6.0u29 and RHQ 4.2 on JDK 6.0u29, and verified my fix on the same.

Comment 5 Ian Springer 2011-11-04 16:27:10 UTC
[master 9c437b0] (http://git.fedorahosted.org/git/?p=rhq/rhq.git;a=commitdiff;h=9c437b0) upgrades EMS from 1.2.16 to 1.3 in the root pom, which should fix this issue.

Comment 6 Ian Springer 2011-11-04 16:43:40 UTC
Fix cherry-picked to release_jon3.x branch - commit dc4a80e.

Comment 7 Sunil Kondkar 2011-11-23 12:24:24 UTC
Verified on build#100 (Version: 4.2.0.JON300-SNAPSHOT Build Number: 5d0d95e) in release_jon3.x branch.

Verified on JBoss EAP 4.3.CP06 on JDK 6u18.

Followed the steps and navigated to JBossAS Server -> EAP suniltest.usersys.redhat.com:1099 00024643 -> JVM -> Memory Subsystem -> Memory Pool -> PS Eden Space >> Monitor>> Tables

Verified that the JVM metrics are collected after restart of EAP server.

Comment 8 Mike Foley 2012-02-07 19:23:13 UTC
changing status of VERIFIED BZs for JON 2.4.2 and JON 3.0 to CLOSED/CURRENTRELEASE


Note You need to log in before you can comment on or make changes to this bug.