Bug 536468 (RHQ-813)
Summary: | Thread dump operation fails against JBossAS | ||
---|---|---|---|
Product: | [Other] RHQ Project | Reporter: | Greg Hinkle <ghinkle> |
Component: | No Component | Assignee: | John Mazzitelli <mazz> |
Status: | CLOSED WONTFIX | QA Contact: | |
Severity: | medium | Docs Contact: | |
Priority: | high | ||
Version: | 1.0.1 | CC: | asantos, ccrouch, dsteigne, jginzburg, jshaughn, loleary, mazz, tao |
Target Milestone: | --- | Keywords: | SubBug |
Target Release: | --- | ||
Hardware: | All | ||
OS: | All | ||
URL: | http://jira.rhq-project.org/browse/RHQ-813 | ||
Whiteboard: | |||
Fixed In Version: | 1.4 | Doc Type: | Bug Fix |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2014-09-02 18:38:45 UTC | Type: | --- |
Bug Blocks: | 565628 |
Description
Greg Hinkle
2008-09-09 18:04:00 UTC
Also fails on 1.1.2. See case 246555.

When connecting to the JBoss MBean without using EMS, it seems to work:

    JMXServiceURL url = new JMXServiceURL("rmi", "", 0, urlPath);
    this.jmxc = JMXConnectorFactory.connect(url);
    this.server = jmxc.getMBeanServerConnection();
    tmbean = newPlatformMXBeanProxy(server, ManagementFactory.THREAD_MXBEAN_NAME, ThreadMXBean.class);
    long[] tids = tmbean.getAllThreadIds();
    ThreadInfo[] tinfos = tmbean.getThreadInfo(tids, Integer.MAX_VALUE); // it works

This bug is not reproducible against a JBoss JVM started with:

    -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=8005 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false

I still see this on 1.2.

Still broken on rev4508 - now on QA's watchlist.

Pushed to 1.4.

This bug was previously known as http://jira.rhq-project.org/browse/RHQ-813

This bug is duplicated by RHQ-1227.

Temporarily adding the keyword "SubBug" so we can be sure we have accounted for all the bugs. keyword: new = Tracking + FutureFeature + SubBug

Making sure we're not missing any bugs in rhq_triage.

What I think is happening is that the javax.management implementation classes in JBoss are different from the JMX implementation getting picked up by the agent. We probably need to figure out how to work around that.
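For reference, the EMS-free approach from the description can be sketched as a small standalone class. This is only a sketch: the class and method names (ThreadDumpSketch, dumpThreads) are made up for illustration, and main() runs against the local platform MBeanServer rather than a remote JBossAS connector, so it shows the newPlatformMXBeanProxy call path but does not reproduce the failure reported here.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import javax.management.MBeanServerConnection;

// Hypothetical helper, not part of RHQ or EMS.
final class ThreadDumpSketch {

    // Builds a platform MXBean proxy on the given connection and asks it for
    // full stack traces of all live threads.
    static ThreadInfo[] dumpThreads(MBeanServerConnection server) {
        try {
            ThreadMXBean threads = ManagementFactory.newPlatformMXBeanProxy(
                    server, ManagementFactory.THREAD_MXBEAN_NAME, ThreadMXBean.class);
            long[] ids = threads.getAllThreadIds();
            return threads.getThreadInfo(ids, Integer.MAX_VALUE);
        } catch (IOException e) {
            throw new IllegalStateException("Could not talk to the MBean server", e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the local platform MBeanServer stands in for the remote connection.
        MBeanServerConnection local = ManagementFactory.getPlatformMBeanServer();
        for (ThreadInfo info : dumpThreads(local)) {
            if (info != null) { // entries are null for threads that have already exited
                System.out.println(info.getThreadName() + " " + info.getThreadState());
            }
        }
    }
}
```

Against a remote server, the MBeanServerConnection would come from JMXConnectorFactory.connect(...) as in the description instead of from ManagementFactory.getPlatformMBeanServer().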
One thing that I'm finding odd is the following. First look at the exception message:

    org.mc4j.ems.connection.EmsConnectException: java.io.StreamCorruptedException: javax.management.openmbean.OpenDataException: item value [Ljavax.management.openmbean.CompositeData;@38df8f23 for item name lockedMonitors is not a javax.management.openmbean.ArrayType [Ljavax.management.openmbean.CompositeData; 1-dimensional array of javax.management.openmbean.CompositeType
      name=className type=javax.management.openmbean.SimpleType:java.lang.String
      name=identityHashCode type=javax.management.openmbean.SimpleType:java.lang.Integer
      name=lockedStackDepth type=javax.management.openmbean.SimpleType:java.lang.Integer
      name=lockedStackFrame type=javax.management.openmbean.CompositeType
        name=className type=javax.management.openmbean.SimpleType:java.lang.String
        name=fileName type=javax.management.openmbean.SimpleType:java.lang.String
        name=lineNumber type=javax.management.openmbean.SimpleType:java.lang.Integer
        name=methodName type=javax.management.openmbean.SimpleType:java.lang.String
        name=nativeMethod type=javax.management.openmbean.SimpleType:java.lang.Boolean

Notice that it lists the names of the composite type's items - "fileName", "lineNumber", etc. I do not see a "lockedMonitors" name. It appears to be missing, but it should be expected.

*** Bug 535496 has been marked as a duplicate of this bug. ***

I think I found something.

http://download.oracle.com/javase/1.5.0/docs/api/java/lang/management/ThreadInfo.html

shows there is no "locked monitors" metric available on ThreadInfo for JDK 5. However, for JDK 6:

http://download-llnw.oracle.com/javase/6/docs/api/java/lang/management/ThreadInfo.html

it IS there. Obviously, something is going on with this. Is it possible that the JBoss AS VM is using JDK 5 platform MBean metrics but we try to deserialize this data on the JDK 6 agent? That seems odd because both my server and my agent are running on JDK 6.
But clearly, this locked monitors stuff is new in JDK 6. The exception message also seems to want the value to be an ArrayType when it's really an array of CompositeData.

I also notice that the JDK's CompositeDataSupport doesn't even have a readObject method, but JBossAS's implementation does, and it throws StreamCorruptedException. So this exception is coming from the JBossAS code; I can't say yet whether it's coming from the server side or the client side. But this explains why this works for other JVM Server resources (like the Agent): they aren't using the JBoss implementation of this class.

This is definitely happening on the client side - I'm able to step through this code while attached to the agent via JPDA. I see why the problem is happening, but not how to fix it.

The class is javax.management.openmbean.CompositeDataSupport, specifically JBossAS's implementation of it (found in jboss-jmx.jar):

    private void readObject(ObjectInputStream in)
        throws IOException, ClassNotFoundException
    {
        ObjectInputStream.GetField getField = in.readFields();
        SortedMap contents = (SortedMap)getField.get("contents", null);
        CompositeType compositeType = (CompositeType)getField.get("compositeType", null);

The data just got streamed over from the server. readObject pulled out "compositeType", which is an object of type "javax.management.openmbean.CompositeType". Inside it is a map, "nameToType", which maps the item names to their open types.
I looked in that map, and this is what is in it for lockedMonitors:

    lockedMonitors=javax.management.openmbean.ArrayType [Ljavax.management.openmbean.CompositeData; 1-dimensional array of javax.management.openmbean.CompositeType
      name=className type=javax.management.openmbean.SimpleType:java.lang.String
      name=identityHashCode type=javax.management.openmbean.SimpleType:java.lang.Integer
      name=lockedStackDepth type=javax.management.openmbean.SimpleType:java.lang.Integer
      name=lockedStackFrame type=javax.management.openmbean.CompositeType
        name=className type=javax.management.openmbean.SimpleType:java.lang.String
        name=fileName type=javax.management.openmbean.SimpleType:java.lang.String
        name=lineNumber type=javax.management.openmbean.SimpleType:java.lang.Integer
        name=methodName type=javax.management.openmbean.SimpleType:java.lang.String
        name=nativeMethod type=javax.management.openmbean.SimpleType:java.lang.Boolean

Notice it says the type is "ArrayType". Now looking in the deserialized "contents" object that came over the wire (which is a map of the data itself, not the type information), I find this:

    lockedMonitors=[Ljavax.management.openmbean.CompositeData;@5af478a1

which is not the type "compositeType" says it is. We're told it is an ArrayType, but the actual data is of type "CompositeData[]".

The method javax.management.openmbean.CompositeDataSupport.init(CompositeType, Map) compares those two types, sees a mismatch and bombs:

    OpenType openType = compositeType.getType(key);
    Object value = items.get(key);
    if ((value != null) && (!(openType.isValue(value)))) {
        throw new OpenDataException("item value " + value + " for item name " + key + " is not a " + openType);

That OpenDataException bubbles up as the StreamCorruptedException that we see.
    try {
        init(compositeType, contents);
    } catch (Exception e) {
        throw new StreamCorruptedException(e.toString());
    }

Now I have to find out why ArrayType does not match CompositeData[] - the latter sure looks like an array to me :)

Stepping through further, if I go into:

    if ((value != null) && (!(openType.isValue(value)))) {

openType is an instance of JBossAS's implementation of javax.management.openmbean.ArrayType.
value is the javax.management.openmbean.CompositeData[] value (it's actually a 0-length array in my debugging; it's empty, but a valid array).

Going to ArrayType.isValue:

    try {
        thisClass = Thread.currentThread().getContextClassLoader().loadClass(getClassName());
    } catch (ClassNotFoundException e) {
        return false;
    }

That loadClass throws a ClassNotFoundException:

    java.lang.ClassNotFoundException: [Ljavax.management.openmbean.CompositeData;

getClassName() did indeed return the string "[Ljavax.management.openmbean.CompositeData;" because that is the type of the value. So ArrayType.className (and typeName) are both that "[L..." string, which is odd, because that's not an allowed type. From OpenType (which ArrayType extends):

    public static final String[] ALLOWED_CLASSNAMES = {
        Void.class.getName(),
        Boolean.class.getName(),
        Character.class.getName(),
        Byte.class.getName(),
        Short.class.getName(),
        Integer.class.getName(),
        Long.class.getName(),
        Float.class.getName(),
        Double.class.getName(),
        String.class.getName(),
        Date.class.getName(),
        BigDecimal.class.getName(),
        BigInteger.class.getName(),
        ObjectName.class.getName(),
        CompositeData.class.getName(),
        TabularData.class.getName()
    };

So this client code has two problems. First, the class name is not one of the allowed strings. Second, it uses the "[L" notation, and you can't load a class with that string as its name; for example, ClassLoader.findClass("[Ljava.lang.String;") results in "ClassNotFoundException: [Ljava.lang.String;".
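The array-name point can be checked in isolation with a few lines of plain JDK code. This is a sketch with a made-up class name (ArrayClassNameSketch); it only compares ClassLoader.loadClass with Class.forName for a "[L...;" name and involves no JBoss or EMS classes. On Sun/Oracle JDK 6 and later I would expect loadClass to reject the array name unless sun.lang.ClassLoader.allowArraySyntax is set, but treat that as an expectation to verify on the JVM in question.

```java
// Hypothetical demo class, not part of RHQ, EMS or JBossAS.
final class ArrayClassNameSketch {

    // True if ClassLoader.loadClass accepts the given name on this JVM.
    static boolean loadableViaLoadClass(String name) {
        try {
            ClassLoader.getSystemClassLoader().loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // True if Class.forName accepts the given name; forName understands
    // the "[L...;" array notation.
    static boolean loadableViaForName(String name) {
        try {
            Class.forName(name, false, ClassLoader.getSystemClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String arrayName = String[].class.getName(); // "[Ljava.lang.String;"
        System.out.println("name:      " + arrayName);
        System.out.println("loadClass: " + loadableViaLoadClass(arrayName));
        System.out.println("forName:   " + loadableViaForName(arrayName));
    }
}
```

If the two calls disagree on the agent's JVM, that matches the ClassNotFoundException seen in ArrayType.isValue, which goes through loadClass on the context classloader.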
So, this JMX client code looks bad all around. This is the JBossAS JMX implementation code, which is why it affects MBeans in a JBossAS server and nowhere else.

I just found this in JBossAS's run.bat:

    rem Set sun.lang.ClassLoader.allowArraySyntax to true to avoid deserialization bottleneck in arrays for JDK 1.6
    set JAVA_OPTS=%JAVA_OPTS% -Dsun.lang.ClassLoader.allowArraySyntax=true

I wonder if this will be fixed if all we do is add that sysprop to the agent? That setting appears to allow the "[L" notation in class names (at least, that's what the sysprop name implies).

Hmm... setting that system property got slightly further. The "[L..." class loaded; however, immediately after, ArrayType checks assignability of the expected and actual class:

    if (!(thisClass.isAssignableFrom(clazz)))
        return false;

"thisClass" is what was loaded using the array notation. They both (of type java.lang.Class) have a toString of:

    class [Ljavax.management.openmbean.CompositeData;

but that isAssignableFrom returns false. These are their classloaders (as returned by getClassLoader()):

    thisClass=null (probably bootstrap classloader?)
    clazz=org.mc4j.ems.connection.support.classloader.ChildFirstClassloader@5568c5f9

Thread.currentThread().getContextClassLoader() returns the plugin classloader. Loading JMX classes like "[Ljavax.management.openmbean.CompositeData;" via that plugin classloader will load them from the system classloader (because that's where our JMX classes live). So when deserializing, we load ArrayType from the EMS classloader, but this check uses the context classloader, which is our plugin classloader; thus the classes are different.

It looks like we need to have EMS change the context classloader to its child classloader somehow, somewhere before making remote calls to JBoss. Yet another classloader hell hole.
OK, if I go up the stack to the last EMS method that invoked the JBoss client, it's this:

    org.mc4j.ems.impl.jmx.connection.support.providers.proxy.GenericMBeanServerProxy.invoke(Object, Method, Object[])

and curiously, I find that the code that sets the context classloader to the EMS classloader has been commented out:

    ClassLoader ctxLoader = Thread.currentThread().getContextClassLoader();
    try {
        roundTrips++;
        // Thread.currentThread().setContextClassLoader(this.getClass().getClassLoader());
        ...
        returnValue = method.invoke(this.remoteServer, args);
        ...
    } finally {
        // Thread.currentThread().setContextClassLoader(ctxLoader);
    }

That "method.invoke" is calling into the remote JBoss JMX server and is where the exception is thrown. Notice how setting the context classloader is commented out. If that were not commented out, I think this would have worked (along with that allowArraySyntax system property) because this.getClass().getClassLoader() would have returned the EMS classloader.

We need to investigate who commented out that code and why. We don't want to blindly change it because it obviously was done for a reason and we don't want to break whatever it was trying to fix.

Looking at the svn history of that file, the commented-out code was in the original code and hasn't changed since April 2006.

We should experiment to see if setting the context classloader is OK to do and, if so, fix EMS. I think it's worth a shot. It should work, unless uncommenting that code breaks something else that I am not aware of.
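The save/set/restore idiom that the commented-out lines implement can be written as a small generic helper. This is a sketch only: ContextClassLoaderSketch and callWith are hypothetical names, not EMS code, and the empty URLClassLoader in main() merely stands in for the EMS child-first classloader.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.function.Supplier;

// Hypothetical helper illustrating the idiom; not taken from EMS.
final class ContextClassLoaderSketch {

    // Runs the call with the given loader as the thread context classloader
    // and always restores the previous loader, even if the call throws.
    static <T> T callWith(ClassLoader loader, Supplier<T> call) {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            return call.get();
        } finally {
            current.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) {
        // Assumption: a throwaway loader stands in for the EMS classloader.
        ClassLoader standIn = new URLClassLoader(new URL[0]);
        ClassLoader seenInside = callWith(standIn,
                () -> Thread.currentThread().getContextClassLoader());
        System.out.println("inside call: " + (seenInside == standIn));
        System.out.println("restored:    "
                + (Thread.currentThread().getContextClassLoader() != standIn));
    }
}
```

Restoring in finally matters here because the agent reuses its threads; a leaked context classloader would affect whatever plugin code runs on that thread next.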
Specifically, I'm talking about this patch to org.mc4j.ems.impl.jmx.connection.support.providers.proxy.GenericMBeanServerProxy:

Index: src/ems-impl/org/mc4j/ems/impl/jmx/connection/support/providers/proxy/GenericMBeanServerProxy.java
===================================================================
--- src/ems-impl/org/mc4j/ems/impl/jmx/connection/support/providers/proxy/GenericMBeanServerProxy.java (revision 616)
+++ src/ems-impl/org/mc4j/ems/impl/jmx/connection/support/providers/proxy/GenericMBeanServerProxy.java (working copy)
@@ -108,7 +108,7 @@
         ClassLoader ctxLoader = Thread.currentThread().getContextClassLoader();
         try {
             roundTrips++;
-//            Thread.currentThread().setContextClassLoader(this.getClass().getClassLoader());
+            Thread.currentThread().setContextClassLoader(this.getClass().getClassLoader());
             boolean isJBossConnection = (this.provider instanceof JBossConnectionProvider);
             if (isJBossConnection) {
                 JBossConnectionProvider jbossProvider = (JBossConnectionProvider) this.provider;
@@ -167,7 +167,7 @@
             //e.printStackTrace();
             throw e;
         } finally {
-//            Thread.currentThread().setContextClassLoader(ctxLoader);
+            Thread.currentThread().setContextClassLoader(ctxLoader);
         }
     }

I made the changes to EMS and the agent scripts and I get past the classloader error, but now get this:

    Caused by: java.lang.ClassCastException: [Ljavax.management.openmbean.CompositeData; cannot be cast to [Ljava.lang.management.ThreadInfo;
        at $Proxy87.getThreadInfo(Unknown Source)
        at org.rhq.plugins.jmx.ThreadDataMeasurementComponent.invokeOperation(ThreadDataMeasurementComponent.java:48)

The ThreadMXBean interface method that is used to obtain the Platform MBeanServer information is this:

    ThreadInfo[] java.lang.management.ThreadMXBean.getThreadInfo(long[], int)

However, the underlying implementation uses Open MBeans and the underlying code returns CompositeData[]. Somehow, the translation from CompositeData[] to ThreadInfo[] is not being done. I don't know if the JBoss client code is supposed to do this or EMS.
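For what it's worth, the JDK itself exposes the missing translation step as ThreadInfo.from(CompositeData). The sketch below shows the conversion in isolation. The class and method names are hypothetical, and it calls the local platform MBeanServer through the generic invoke() API (which returns the open-type CompositeData[]) rather than going through the JBossAS client, so it says nothing about where in the EMS/JBoss proxy chain the conversion ought to live.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

// Hypothetical helper, not part of RHQ, EMS or JBossAS.
final class ThreadInfoConversionSketch {

    // Calls getThreadInfo(long[], int) through the generic MBean API, which
    // hands back the open-type form (CompositeData[]) instead of ThreadInfo[].
    static CompositeData[] rawThreadInfo(MBeanServerConnection server) {
        try {
            ObjectName threading = new ObjectName(ManagementFactory.THREAD_MXBEAN_NAME);
            long[] ids = (long[]) server.getAttribute(threading, "AllThreadIds");
            return (CompositeData[]) server.invoke(threading, "getThreadInfo",
                    new Object[] {ids, Integer.MAX_VALUE},
                    new String[] {long[].class.getName(), int.class.getName()});
        } catch (Exception e) {
            throw new IllegalStateException("getThreadInfo invocation failed", e);
        }
    }

    // Translates each CompositeData element to a ThreadInfo; null entries
    // (threads that exited in the meantime) stay null.
    static ThreadInfo[] toThreadInfos(CompositeData[] raw) {
        ThreadInfo[] result = new ThreadInfo[raw.length];
        for (int i = 0; i < raw.length; i++) {
            result[i] = (raw[i] == null) ? null : ThreadInfo.from(raw[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        MBeanServerConnection local = ManagementFactory.getPlatformMBeanServer();
        for (ThreadInfo info : toThreadInfos(rawThreadInfo(local))) {
            if (info != null) {
                System.out.println(info.getThreadName());
            }
        }
    }
}
```

Casting the CompositeData[] directly to ThreadInfo[] is exactly what fails in the stack trace above; the element-by-element ThreadInfo.from call is the step the proxy is skipping.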
In short, the classloading hell seems to be circumvented with the new EMS code and the allowArraySyntax sysprop; however, there is another bug cropping up.

jmx-client.jar has an implementation of javax.management.MBeanServerInvocationHandler. It's invoking that ThreadMXBean method, which is returning a CompositeData[] (since the implementation apparently follows the OpenMBean way of doing things). But the public API of ThreadMXBean says it should be returning ThreadInfo[]. Somewhere in the call chain of these proxies, that CompositeData[] needs to be translated to the expected return value of ThreadInfo[], and I think this needs to be done in JBossAS's MBeanServerInvocationHandler implementation, because it's getting back a CompositeData[] but that dynamic proxy knows it needs to pass back a ThreadInfo[] as its return type. It's possible the JBossAS implementation was coded up to not support Open MBeans.

After looking at this a little closer, I'm thinking JBossAS 4.3's JMX client does not support Open MBean dynamic MBeans, at least not through its MBean invoker proxy. I don't see how jmx-client.jar's implementation of javax.management.MBeanServerInvocationHandler can support calling the ThreadMXBean (which is an OpenMBean returning CompositeData[] but whose public API returns ThreadInfo[]).

All that said, I'm not even sure if EMS should be the thing doing the translation from CompositeData[] to ThreadInfo[]. I don't think that it should, but I'm not sure. We'd need to write a simple test using just the JBoss JMX client (no EMS) and see if we can replicate the issue. You'd have to call getThreadInfo via the MBeanServerConnection interface and see if it can return a valid ThreadInfo[] array.

So I don't lose this, here's the patch to the agent startup scripts that sets the sysprop needed to allow Sun VM classloaders to load array classes.
---------------------------
diff --git a/modules/enterprise/agent/src/etc/java-service-wrapper/rhq-agent-wrapper.conf b/modules/enterprise/agent/src/etc/java-service-wrapper/rhq-agen
index 25c99aa..2826815 100644
--- a/modules/enterprise/agent/src/etc/java-service-wrapper/rhq-agent-wrapper.conf
+++ b/modules/enterprise/agent/src/etc/java-service-wrapper/rhq-agent-wrapper.conf
@@ -70,6 +70,7 @@ wrapper.java.additional.3=-Xmx128m
 wrapper.java.additional.4=-Di18nlog.dump-stack-traces=false
 wrapper.java.additional.5=-Dsigar.nativeLogging=false
 wrapper.java.additional.6="-Djava.endorsed.dirs=%RHQ_AGENT_HOME%/lib/endorsed"
+wrapper.java.additional.7=-Dsun.lang.ClassLoader.allowArraySyntax=true
 
 # We want to make sure the agent starts in its install directory (quotes not needed)
 wrapper.working.dir=%RHQ_AGENT_HOME%
diff --git a/modules/enterprise/agent/src/etc/rhq-agent.bat b/modules/enterprise/agent/src/etc/rhq-agent.bat
index 59b7676..a3555ae 100644
--- a/modules/enterprise/agent/src/etc/rhq-agent.bat
+++ b/modules/enterprise/agent/src/etc/rhq-agent.bat
@@ -110,6 +110,7 @@ rem ----------------------------------------------------------------------
 if not defined RHQ_AGENT_JAVA_OPTS (
    set RHQ_AGENT_JAVA_OPTS=-Xms64m -Xmx128m -Djava.net.preferIPv4Stack=true
 )
+set RHQ_AGENT_JAVA_OPTS="%RHQ_AGENT_JAVA_OPTS% -Dsun.lang.ClassLoader.allowArraySyntax=true"
 if defined RHQ_AGENT_DEBUG echo RHQ_AGENT_JAVA_OPTS: %RHQ_AGENT_JAVA_OPTS%
 
 if "%RHQ_AGENT_JAVA_ENDORSED_DIRS%" == "none" (
diff --git a/modules/enterprise/agent/src/etc/rhq-agent.sh b/modules/enterprise/agent/src/etc/rhq-agent.sh
index e0122bd..905769f 100755
--- a/modules/enterprise/agent/src/etc/rhq-agent.sh
+++ b/modules/enterprise/agent/src/etc/rhq-agent.sh
@@ -159,6 +159,7 @@ done
 if [ "x$RHQ_AGENT_JAVA_OPTS" = "x" ]; then
    RHQ_AGENT_JAVA_OPTS="-Xms64m -Xmx128m -Djava.net.preferIPv4Stack=true"
 fi
+RHQ_AGENT_JAVA_OPTS="${RHQ_AGENT_JAVA_OPTS} -Dsun.lang.ClassLoader.allowArraySyntax=true"
 debug_msg "RHQ_AGENT_JAVA_OPTS: $RHQ_AGENT_JAVA_OPTS"
 
 if [ "$RHQ_AGENT_JAVA_ENDORSED_DIRS" = "none" ]; then

Here's more info. I don't know how relevant it is, but it seems important to know.

First, this:

http://download.oracle.com/javase/1.5.0/docs/api/java/lang/management/ManagementFactory.html#newPlatformMXBeanProxy%28javax.management.MBeanServerConnection,%20java.lang.String,%20java.lang.Class%29

says "MBeanServerInvocationHandler or its newProxyInstance method cannot be used to create a proxy for a platform MXBean. The proxy object created by MBeanServerInvocationHandler does not handle the properties of the platform MXBeans described in the class specification."

So, using MBeanServerInvocationHandler isn't supposed to work for things like the ThreadMXBean. But EMS seems to take care of this - see org.mc4j.ems.impl.jmx.connection.bean.DAdvancedBean.getProxy(Class<T>):

    public <T> T getProxy(Class<T> beanInterface) {
        if (mbeanProxy == null) {
            try {
                // 1.5 only stuff
                Class c = Class.forName("java.lang.management.ManagementFactory");
                Class mbsc = Class.forName("javax.management.MBeanServerConnection");
                Method m = c.getMethod("newPlatformMXBeanProxy", mbsc, String.class, Class.class);
                return (T) m.invoke(null, connectionProvider.getMBeanServer(), getBeanName().getCanonicalName(), beanInterface);
            } catch (Exception e) {
                // Expected if its not a platform mbean
                // e.printStackTrace();
            }
            MBeanServer server = getConnectionProvider().getMBeanServer();
            mbeanProxy = MBeanServerInvocationHandler.newProxyInstance(server, getObjectName(), beanInterface, getNotifications().size() > 0);

Notice the comment "Expected if its not a platform mbean". If it's a platform MXBean (like the ThreadMXBean), newPlatformMXBeanProxy is used, which is correct. If it's not a platform MXBean, then it falls back to MBeanServerInvocationHandler. So this EMS code seems to be coded up properly - it knows not to use MBeanServerInvocationHandler for those platform MXBeans.

Mazz, does this have any current relevance? If not please close.
(In reply to Jay Shaughnessy from comment #32)
> Mazz, does this have any current relevance? If not please close.

This probably still happens for JBossAS 4 servers in inventory.

Closing due to inactivity.