Description of problem: If -Djboss.platform.mbeanserver and -Dcom.sun.management.jmxremote were used to start an EAP 4.3 server, restarting the EAP server will result in JMV metrics no longer being collected. The only way to get the missing metrics collected is to restart the JON agent. Version-Release number of selected component (if applicable): JON 2.4.1 How reproducible: Always Steps to Reproduce: #. Install JDK 1.5 if not already #. Install JBoss EAP 4.3.CP06 #. Create a new profile JBOSS_HOME=/opt/jboss/eap/jboss-eap-4.3.CP06/jboss-as JBOSS_BASE_PROFILE=default JBOSS_PROFILE=00024643 rm -rf "${JBOSS_HOME}/server/${JBOSS_PROFILE}" cp -a "${JBOSS_HOME}/server/${JBOSS_BASE_PROFILE}" "${JBOSS_HOME}/server/${JBOSS_PROFILE}" cp -a "${JBOSS_HOME}/bin/run.conf" "${JBOSS_HOME}/server/${JBOSS_PROFILE}" #. Configure EAP to use JDK 1.5 if not already JAVA_HOME=/opt/java/jdk1.5.0_22 safe_replacement=$(printf "%s\n" "${JAVA_HOME}" | sed 's/[\&/]/\\&/g') sed -i 's/^#JAVA_HOME=.*$/JAVA_HOME="'"${safe_replacement}"'"/g' "${JBOSS_HOME}/server/${JBOSS_PROFILE}/run.conf" #. Enable EAP JMX admin account sed -i 's/^.*admin=admin$/admin=admin/' "${JBOSS_HOME}/server/${JBOSS_PROFILE}/conf/props/jmx-console-users.properties" #. Add required JBoss MBean Server configuration to EAP so JON can monitor JVM echo 'JAVA_OPTS="$JAVA_OPTS -Djboss.platform.mbeanserver"' >>"${JBOSS_HOME}/server/${JBOSS_PROFILE}/run.conf" #. Add additional JMX/MBean configuration for JConsole to EAP echo 'JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote"' >>"${JBOSS_HOME}/server/${JBOSS_PROFILE}/run.conf" #. Add additional JMX/MBean configuration to workaround classloader timing issues for JDK1.5 to EAP echo 'JAVA_OPTS="$JAVA_OPTS -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl"' >>"${JBOSS_HOME}/server/${JBOSS_PROFILE}/run.conf" #. Start EAP cd "${JBOSS_HOME}/bin" && eval "./run.sh -c ${JBOSS_PROFILE} -b 0.0.0.0" #. Force JON discovery and import new EAP instance #. Allow JVM metrics to be collected and confirm they have been collected: #. Navigate to JBossAS Server -> vm121.gsslab.rdu.redhat.com:1099 0024643 -> JVM -> Memory Subsystem -> Memory Pool -> PS Eden Space >> Monitor >> Tables #. Set "Metric Display Range" Last: 60 Minutes #. Default collection interval is 10 minutes #. Shutdown EAP cd "${JBOSS_HOME}/bin" && ./shutdown.sh -S -s jnp://localhost:1099 -u admin -p admin #. Start EAP cd "${JBOSS_HOME}/bin" && eval "./run.sh -c ${JBOSS_PROFILE} -b 0.0.0.0" #. Allow JVM metrics to be collected and confirm they have been collected: #. Navigate to JBossAS Server -> vm121.gsslab.rdu.redhat.com:1099 0024643 -> JVM -> Memory Subsystem -> Memory Pool -> PS Eden Space >> Monitor >> Tables #. Set "Metric Display Range" Last: 60 Minutes #. No new metrics will be collected Actual results: JVM metrics are not collected after EAP server is restarted The following DEBUG message appears in the agent.log file: DEBUG [MeasurementManager.collector-1] (rhq.core.pc.inventory.ResourceContainer$ResourceComponentInvocationHandler)- Call to [org.rhq.plugins.jmx.MBeanResourceComponent.getValues()] with args [[org.rhq.core.domain.measurement.MeasurementReport@12170ce, [ScheduledMeasurementInfo[res=11574, name={Usage.max}, sched=23537], ScheduledMeasurementInfo[res=11574, name=Type, sched=23547], ScheduledMeasurementInfo[res=11574, name=MemoryManagerNames, sched=23542], ScheduledMeasurementInfo[res=11574, name={Usage.init}, sched=23532], ScheduledMeasurementInfo[res=11574, name={Usage.used}, sched=23522], ScheduledMeasurementInfo[res=11574, name={Usage.committed}, sched=23527]]]] failed. java.util.concurrent.ExecutionException: java.lang.RuntimeException: Unable to load attributes on bean [java.lang:name=PS Eden Space,type=MemoryPool] Connection failure java.lang:type=MemoryPool,name=PS Eden Space is not registered. at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:232) at java.util.concurrent.FutureTask.get(FutureTask.java:91) at org.rhq.core.pc.inventory.ResourceContainer$ResourceComponentInvocationHandler.invokeInNewThreadWithLock(ResourceContainer.java:446) at org.rhq.core.pc.inventory.ResourceContainer$ResourceComponentInvocationHandler.invoke(ResourceContainer.java:434) at $Proxy74.getValues(Unknown Source) at org.rhq.core.pc.measurement.MeasurementCollectorRunner.getValues(MeasurementCollectorRunner.java:111) at org.rhq.core.pc.measurement.MeasurementCollectorRunner.call(MeasurementCollectorRunner.java:91) at org.rhq.core.pc.measurement.MeasurementManager$MeasurementCollectionRequester.run(MeasurementManager.java:171) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.RuntimeException: Unable to load attributes on bean [java.lang:name=PS Eden Space,type=MemoryPool] Connection failure java.lang:type=MemoryPool,name=PS Eden Space is not registered. at org.mc4j.ems.impl.jmx.connection.bean.DMBean.refreshAttributes(DMBean.java:268) at org.rhq.plugins.jmx.MBeanResourceComponent.getBeanProperties(MBeanResourceComponent.java:320) at org.rhq.plugins.jmx.MBeanResourceComponent.getValues(MBeanResourceComponent.java:278) at org.rhq.plugins.jmx.MBeanResourceComponent.getValues(MBeanResourceComponent.java:245) at sun.reflect.GeneratedMethodAccessor110.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.rhq.core.pc.inventory.ResourceContainer$ComponentInvocationThread.call(ResourceContainer.java:525) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) ... 3 more Caused by: org.mc4j.ems.connection.EmsConnectException: Connection failure java.lang:type=MemoryPool,name=PS Eden Space is not registered. at org.mc4j.ems.impl.jmx.connection.support.providers.proxy.GenericMBeanServerProxy.invoke(GenericMBeanServerProxy.java:160) at $Proxy109.getAttributes(Unknown Source) at org.mc4j.ems.impl.jmx.connection.bean.DMBean.refreshAttributes(DMBean.java:229) ... 12 more Caused by: javax.management.InstanceNotFoundException: java.lang:type=MemoryPool,name=PS Eden Space is not registered. at org.jboss.mx.server.registry.BasicMBeanRegistry.get(BasicMBeanRegistry.java:523) at org.jboss.mx.server.MBeanServerImpl.getAttributes(MBeanServerImpl.java:562) at org.jboss.system.server.jmx.LazyMBeanServer.getAttributes(LazyMBeanServer.java:268) at sun.reflect.GeneratedMethodAccessor101.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:592) at org.jboss.jmx.connector.invoker.InvokerAdaptorService.invoke(InvokerAdaptorService.java:266) at sun.reflect.GeneratedMethodAccessor85.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:592) at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155) at org.jboss.mx.server.Invocation.dispatch(Invocation.java:94) at org.jboss.mx.interceptor.AbstractInterceptor.invoke(AbstractInterceptor.java:133) at org.jboss.mx.server.Invocation.invoke(Invocation.java:88) at org.jboss.mx.interceptor.ModelMBeanOperationInterceptor.invoke(ModelMBeanOperationInterceptor.java:142) at org.jboss.jmx.connector.invoker.SerializableInterceptor.invoke(SerializableInterceptor.java:74) at org.jboss.jmx.connector.invoker.AuthenticationInterceptor.invoke(AuthenticationInterceptor.java:108) at org.jboss.mx.server.Invocation.invoke(Invocation.java:88) at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264) at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659) at org.jboss.system.server.jmx.LazyMBeanServer.invoke(LazyMBeanServer.java:291) at org.jboss.invocation.jrmp.server.JRMPProxyFactory.invoke(JRMPProxyFactory.java:180) at sun.reflect.GeneratedMethodAccessor84.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:592) at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155) at org.jboss.mx.server.Invocation.dispatch(Invocation.java:94) at org.jboss.mx.server.Invocation.invoke(Invocation.java:86) at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264) at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659) at org.jboss.system.server.jmx.LazyMBeanServer.invoke(LazyMBeanServer.java:291) at org.jboss.invocation.jrmp.server.JRMPInvoker$MBeanServerAction.invoke(JRMPInvoker.java:818) at org.jboss.invocation.jrmp.server.JRMPInvoker.invoke(JRMPInvoker.java:419) at sun.reflect.GeneratedMethodAccessor83.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:592) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:294) at sun.rmi.transport.Transport$1.run(Transport.java:153) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:149) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:466) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:707) at java.lang.Thread.run(Thread.java:595) The root cause is also logged at WARN level. This seems to occur for all the JVM level metrics. Expected results: JVM metrics should be collected as per their collection schedule even after an EAP restart Additional info: I can confirm that this issue does not occur if the system properties -Dcom.sun.management.jmxremote and -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl are not used. I believe the issue is actually being introduced by MBeanServerBuilderImpl as this is a delegate class to load JBoss MBeans into the JVM's MbeanServer. Although our connection object remains valid, the MBeans within it are no longer valid upon restart. At the moment, this has only been tested against EAP 4.3 CP06. It is unknown if the same issue exists for a later 4.3 release or the 5.x release of EAP (JBoss AS5 plug-in). It is also worth noting that availability has nothing to do with the issue. For example, if you shutdown the EAP server and force an availability check (or wait for it) the server resource will be reported as DOWN. Once the EAP server is started back up, the availability check goes back to UP but metrics are still not be collected. It also does not matter if you restart the EAP instance from outside of JON or use the server resource's stop/start/restart operations.
I thought we had done work to improve the connection pooling/dropping for EMS
This was in a bug in EMS. EMS caches the MBeanInfos for MBeans it manages. When EMS re-established the remote connection to an MBeanServer that was restarted, it was not clearing the now-defunct cached MBeanInfos. When it later tried to make calls on the cached MBeanInfos, they would fail with InstanceNotFoundExceptions, e.g.: Caused by: javax.management.InstanceNotFoundException: java.lang:type=MemoryPool,name=PS Eden Space is not registered. or other exceptions due to the MBeanInfo being defunct. I have fixed the issue in EMS 1.3, which clears all cached MBeanInfos after a reconnect. The attached patch upgrades RHQ to EMS 1.3. I will apply it once RHQ 4.2 has been released and master is open for business.
Created attachment 530751 [details] patch to upgrade RHQ to EMS 1.3
For the record, I reproduced this bug using EAP 4.3.0.GA_CP09 on JDK 6.0u29 and RHQ 4.2 on JDK 6.0u29, and verified my fix on the same.
[master 9c437b0] (http://git.fedorahosted.org/git/?p=rhq/rhq.git;a=commitdiff;h=9c437b0) upgrades EMS from 1.2.16 to 1.3 in the root pom, which should fix this issue.
Fix cherry-picked to release_jon3.x branch - commit dc4a80e.
Verified on build#100 (Version: 4.2.0.JON300-SNAPSHOT Build Number: 5d0d95e) in release_jon3.x branch. Verified on JBoss EAP 4.3.CP06 on JDK 6u18. Followed the steps and navigated to JBossAS Server -> EAP suniltest.usersys.redhat.com:1099 00024643 -> JVM -> Memory Subsystem -> Memory Pool -> PS Eden Space >> Monitor>> Tables Verified that the JVM metrics are collected after restart of EAP server.
changing status of VERIFIED BZs for JON 2.4.2 and JON 3.0 to CLOSED/CURRENTRELEASE