Bug 1751985 - Slower execution when running JVM in debug mode with debug agent attached
Summary: Slower execution when running JVM in debug mode with debug agent attached
Keywords:
Status: ON_QA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: java-1.8.0-openjdk
Version: 7.4
Hardware: x86_64
OS: Linux
high
high
Target Milestone: rc
: ---
Assignee: Roman Kennke
QA Contact: OpenJDK QA
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2019-09-13 11:22 UTC by Simeon Andreev
Modified: 2020-05-25 00:29 UTC (History)
9 users (show)

Fixed In Version: java-1.8.0-openjdk-1.8.0.252.b09-4.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
ayadav: needinfo-


Attachments (Terms of Use)
A snippet that loads many classes multiple times and measures running time. (4.55 MB, application/zip)
2019-09-13 11:22 UTC, Simeon Andreev
no flags Details
Simplified patch, after examining the Android JVM changes in "classTrack.c". (120.30 KB, patch)
2019-09-19 09:37 UTC, Simeon Andreev
no flags Details | Diff
Simplified and fixed patch, after contacting author of Android JDK patch. (120.80 KB, patch)
2019-09-22 07:57 UTC, Simeon Andreev
no flags Details | Diff
Script to compare performance of GC with and without debugger, when many classes are loaded and classes are being unloaded. (48.16 KB, application/zip)
2019-11-29 15:39 UTC, Simeon Andreev
no flags Details
Crash with the updated JDK. (1.22 MB, text/plain)
2020-01-09 07:34 UTC, Simeon Andreev
no flags Details
Crash log with debug build and patch from 20.12.2019. (343.16 KB, text/plain)
2020-01-15 18:42 UTC, Simeon Andreev
no flags Details
Video showing the JDK on-debugger-attach crash with Eclipse. (1.19 MB, video/mp4)
2020-01-16 07:25 UTC, Simeon Andreev
no flags Details


Links
System ID Priority Status Summary Last Updated
openjdk bug system JDK-8227269 None None None 2019-09-16 07:30:47 UTC
openjdk bug system JDK-8241750 None None None 2020-05-23 11:02:28 UTC

Description Simeon Andreev 2019-09-13 11:22:48 UTC
Created attachment 1614789 [details]
A snippet that loads many classes multiple times and measures running time.

Description of problem:

We observe the slowdown in our product (an xtext and Eclipse based DSL development environment). We have many loaded classes and believe we observe OpenJDK bug 8227269. So we believe this OpenJDK ticket is the actual cause:

https://bugs.openjdk.java.net/browse/JDK-8227269

Version-Release number of selected component (if applicable):

socbl958:/home/sandreev$ java -version
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)

socbl958:/home/sandreev$ cat /etc/redhat-release
Red Hat Enterprise Linux Workstation release 7.4 (Maipo)

How to reproducible:

1. Execute the main method found in the attached "Looper.zip". Observe the total running time (on my HP Z620 its about 30 seconds).
2. Execute the same main method with a debug agent ("-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8100" command line argument) and attach some debugger. E.g. I use the Eclipse one. Observe the total running time (on my HP Z620, about 50 seconds).

Actual results:

Execution time with an attached debugger, when loading classes, is much higher than without an attached debugger.

Expected results:

If the debugger installs no breakpoints and doesn't interfere with the execution, execution time should be at most marginally higher.

Execution in debug mode (with a debug agent attached) should not be taking much longer than normal execution, if the agent is not pausing the JVM or has installed breakpoints.

Additional info:

We would like a fix for 
https://bugs.openjdk.java.net/browse/JDK-8227269 in Java 1.8 to avoid the problem described above.

Comment 3 Simeon Andreev 2019-09-16 14:47:56 UTC
(In reply to Simeon Andreev from comment #2)
> From our R&D, a suggestion of an Android patch that might be helpful:
> 
> https://android.googlesource.com/platform/external/oj-libjdwp/+/refs/heads/
> master/src/share/back/classTrack.c /
> https://github.com/sbwml/external_oj-libjdwp/commit/
> e039297a3e877b562320a1ead160687944d43c80

To correct this, the commit which improves performance is:

https://github.com/sbwml/external_oj-libjdwp/commit/c53a7db69ec9c262fcc227403e1336a1f48bfd01

This is still not ideal, since execution time is still high and we don't install a listener for unload class events. But it is better than the current state. When I increase the outer iterations in the attached "Looper.zip" example to 20 (10 iterations show somewhat varying total running time), with this patch I see 200 seconds, without an agent I see 160 seconds. With the agent and without the patch, I see 260 seconds. I use an HP Z620 workstation.

Comment 4 Simeon Andreev 2019-09-16 15:29:38 UTC
Ideally we would not cache classes as long as nobody is interested in the class unload event (I believe our Eclipse based product is not).


--- ./jdk/src/share/back/eventHandler.bak       2019-09-16 14:56:20.834623000 +0200
+++ ./jdk/src/share/back/eventHandler.c 2019-09-16 14:14:27.500919000 +0200
@@ -583,7 +583,9 @@
         garbageCollected = 0;
 
         /* Analyze which class unloads occurred */
-        unloadedSignatures = classTrack_processUnloads(env);
+        if (getHandlerChain(EI_GC_FINISH)->first != NULL) {
+            unloadedSignatures = classTrack_processUnloads(env);
+        }
 
         debugMonitorExit(handlerLock);
 
@@ -641,7 +643,7 @@
         char        *classname;
 
         /* We must keep track of all classes prepared to know what's unloaded */
-        if (evinfo->ei == EI_CLASS_PREPARE) {
+        if (evinfo->ei == EI_CLASS_PREPARE && getHandlerChain(EI_GC_FINISH)->first != NULL) {
             classTrack_addPreparedClass(env, evinfo->clazz);
         }

This needs some more changes to function correctly (using listener lock, also if you see a listener and there were none store all loaded classes). But in the case of no attached debugger, I see no impact of the debug agent in Looper.zip. With an attached debugger we are at the same bad performance though, so the check must be more fine grained.

I'm not sure how to make this more concrete though. The method that is supposed to check whether a listener is interested in an event requires also a class name or a class name pattern. But the pattern is limited so it cannot be "match everything". And I'm not really finding where the event type plays a role. See eventFilter.c:

/* Determine if this event is interesting to this handler.  Do so
 * by checking each of the handler's filters.  Return false if any
 * of the filters fail, true if the handler wants this event.
 * Special version of filter for unloads since they don't have an
 * event structure or a jclass.
 *
 * If shouldDelete is returned true, a count filter has expired
 * and the corresponding node should be deleted.
 */
jboolean
eventFilterRestricted_passesUnloadFilter(JNIEnv *env,
                                         char *classname,
                                         HandlerNode *node,
                                         jboolean *shouldDelete)
{
    Filter *filter = FILTERS_ARRAY(node);
    int i;

    *shouldDelete = JNI_FALSE;
    for (i = 0; i < FILTER_COUNT(node); ++i, ++filter) {
        switch (filter->modifier) {

            case JDWP_REQUEST_MODIFIER(Count): {
                JDI_ASSERT(filter->u.Count.count > 0);
                if (--filter->u.Count.count > 0) {
                    return JNI_FALSE;
                }
                *shouldDelete = JNI_TRUE;
                break;
            }

            case JDWP_REQUEST_MODIFIER(ClassMatch): {
                if (!patternStringMatch(classname,
                        filter->u.ClassMatch.classPattern)) {
                    return JNI_FALSE;
                }
                break;
            }

            case JDWP_REQUEST_MODIFIER(ClassExclude): {
                if (patternStringMatch(classname,
                       filter->u.ClassExclude.classPattern)) {
                    return JNI_FALSE;
                }
                break;
            }

            default:
                EXIT_ERROR(AGENT_ERROR_ILLEGAL_ARGUMENT,"Invalid filter modifier");
                return JNI_FALSE;
        }
    }
    return JNI_TRUE;
}

Could it be that listeners for "EI_GC_FINISH" are the problem here? I.e. those are the listeners which want to be notified of unloaded classes?

Comment 6 Simeon Andreev 2019-09-17 17:36:12 UTC
Looks like I've had a problem with incremental builds, I think I understand the process now. Anyway, here are our observations from today.

The patch above, which does not do work if there is no class unload listener so far doesn't help with Eclipse, since JDT *does* install such a listener. We are in the process of determining why. If we remove the listener from Eclipse JDT (we don't know whether this is actually OK to do, but assume it is not) the resulting performance during attached debug is similar to no debug agent.

We have also tried the patch from the Android JDK. As far as our reproduction snippets go, the patch is good. We are validating with our product and should have results tomorrow.

The patch is (note that we've taken nearly everything 1:1, including comments):

https://github.com/sbwml/external_oj-libjdwp/commit/c53a7db69ec9c262fcc227403e1336a1f48bfd01

--- ./jdk/src/share/back/classTrack.cbak        2019-09-17 18:56:10.909734000 +0200
+++ ./jdk/src/share/back/classTrack.c   2019-09-17 18:56:25.949737000 +0200
@@ -32,123 +32,109 @@
  * this module, any classes no longer present are known to
  * have been unloaded.
  *
- * For efficient access, classes are keep in a hash table.
- * Each slot in the hash table has a linked list of KlassNode.
+ * ANDROID-CHANGED: This module is almost totally re-written
+ * for android. On android, we have a limited number of jweak
+ * references that can be around at any one time. In order to
+ * preserve this limited resource for user-code use we keep
+ * track of the status of classes using JVMTI tags.
  *
- * Comparing current set of classes is compared with previous
- * set by transferring all classes in the current set into
- * a new table, any that remain in the old table have been
- * unloaded.
+ * We keep a linked-list of the signatures of loaded classes
+ * associated with the tag we gave to that class. The tag is
+ * simply incremented every time we add a new class.
+ *
+ * We also request (on the separate tracking jvmtiEnv) an
+ * ObjectFree event be called for each of these classes. This
+ * allows us to keep a running list of all the classes known to
+ * have been collected since the last call to
+ * classTrack_processUnloads. On each call to processUnloads we
+ * iterate through this list and remove from the main list all
+ * the objects that have been collected. We then return a list of
+ * the class-signatures that have been collected.
+ *
+ * For efficiency and simplicity we don't bother retagging or                                                                                                                                                                               
+ * re-using old tags, instead relying on the fact that no                                                                                                                                                                                   
+ * program will ever be able to exhaust the (2^64 - 1) possible                                                                                                                                                                             
+ * tag values (which would require that many class-loads).                                                                                                                                                                                  
+ *                                                                                                                                                                                                                                          
+ * This relies on the tagging and ObjectFree implementation being                                                                                                                                                                           
+ * relatively efficient for performance. It has the advantage of                                                                                                                                                                            
+ * not requiring any jweaks.                                                                                                                                                                                                                
+ *                                                                                                                                                                                                                                          
+ * All calls into any function of this module must be either                                                                                                                                                                                
+ * done before the event-handler system is setup or done while                                                                                                                                                                              
+ * holding the event handlerLock. The list of freed classes is                                                                                                                                                                              
+ * protected by the classTagLock.                                                                                                                                                                                                           
  */                                                                                                                                                                                                                                         
                                                                                                                                                                                                                                             
 #include "util.h"                                                                                                                                                                                                                           
 #include "bag.h"                                                                                                                                                                                                                            
 #include "classTrack.h"                                                                                                                                                                                                                     
                                                                                                                                                                                                                                             
-/* ClassTrack hash table slot count */                                                                                                                                                                                                      
-#define CT_HASH_SLOT_COUNT 263    /* Prime which eauals 4k+3 for some k */
-
 typedef struct KlassNode {
-    jclass klass;            /* weak global reference */
+    jlong klass_tag;         /* Tag the klass has in the tracking-env */
     char *signature;         /* class signature */
     struct KlassNode *next;  /* next node in this slot */
 } KlassNode;
 
 /*
- * Hash table of prepared classes.  Each entry is a pointer
- * to a linked list of KlassNode.
+ * pointer to first node of a linked list of prepared classes KlassNodes.
  */
-static KlassNode **table;
+static KlassNode *list;
 
 /*
- * Return slot in hash table to use for this class.
+ * The JVMTI env we use to keep track of klass tags which allows us to detect class-unloads.
  */
-static jint
-hashKlass(jclass klass)
-{
-    jint hashCode = objectHashCode(klass);
-    return abs(hashCode) % CT_HASH_SLOT_COUNT;
-}
+static jvmtiEnv *trackingEnv;
 
 /*
- * Transfer a node (which represents klass) from the current
- * table to the new table.
+ * The current highest tag number in use by the trackingEnv.
+ *
+ * No need for synchronization since everything is done under the handlerLock.
  */
-static void
-transferClass(JNIEnv *env, jclass klass, KlassNode **newTable) {
-    jint slot = hashKlass(klass);
-    KlassNode **head = &table[slot];
-    KlassNode **newHead = &newTable[slot];
-    KlassNode **nodePtr;
-    KlassNode *node;
+static jlong currentKlassTag;
 
-    /* Search the node list of the current table for klass */
-    for (nodePtr = head; node = *nodePtr, node != NULL; nodePtr = &(node->next)) {
-        if (isSameObject(env, klass, node->klass)) {
-            /* Match found transfer node */
-
-            /* unlink from old list */
-            *nodePtr = node->next;
-
-            /* insert in new list */
-            node->next = *newHead;
-            *newHead = node;
+/*
+ * A lock to protect access to 'deletedTagBag'
+ */
+static jrawMonitorID deletedTagLock;
 
-            return;
-        }
-    }
+/*
+ * A bag containing all the deleted klass_tags ids. This must be accessed under the
+ * deletedTagLock.
+ *
+ * It is cleared each time classTrack_processUnloads is called.
+ */
+struct bag* deletedTagBag;
 
-    /* we haven't found the class, only unloads should have happenned,
-     * so the only reason a class should not have been found is
-     * that it is not prepared yet, in which case we don't want it.
-     * Asset that the above is true.
-     */
-/**** the HotSpot VM doesn't create prepare events for some internal classes ***
-    JDI_ASSERT_MSG((classStatus(klass) &
-                (JVMTI_CLASS_STATUS_PREPARED|JVMTI_CLASS_STATUS_ARRAY))==0,
-               classSignature(klass));
-***/
+/*
+ * The callback for when classes are freed. Only classes are called because this is registered with
+ * the trackingEnv which only tags classes.
+ */
+static void JNICALL
+cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
+{
+    debugMonitorEnter(deletedTagLock);
+    *(jlong*)bagAdd(deletedTagBag) = tag;
+    debugMonitorExit(deletedTagLock);
 }
 
 /*
- * Delete a hash table of classes.
- * The signatures of classes in the table are returned.
+ * Returns true (thus continuing the iteration) if the item is not the searched for tag.
  */
-static struct bag *
-deleteTable(JNIEnv *env, KlassNode *oldTable[])
+static jboolean
+isNotTag(void* item, void* needle)
 {
-    struct bag *signatures = bagCreateBag(sizeof(char*), 10);
-    jint slot;
-
-    if (signatures == NULL) {
-        EXIT_ERROR(AGENT_ERROR_OUT_OF_MEMORY,"signatures");
-    }
-
-    for (slot = 0; slot < CT_HASH_SLOT_COUNT; slot++) {
-        KlassNode *node = oldTable[slot];
-
-        while (node != NULL) {
-            KlassNode *next;
-            char **sigSpot;
-
-            /* Add signature to the signature bag */
-            sigSpot = bagAdd(signatures);
-            if (sigSpot == NULL) {
-                EXIT_ERROR(AGENT_ERROR_OUT_OF_MEMORY,"signature bag");
-            }
-            *sigSpot = node->signature;
-
-            /* Free weak ref and the node itself */
-            JNI_FUNC_PTR(env,DeleteWeakGlobalRef)(env, node->klass);
-            next = node->next;
-            jvmtiDeallocate(node);
-
-            node = next;
-        }
-    }
-    jvmtiDeallocate(oldTable);
+    return *(jlong*)item != *(jlong*)needle;
+}
 
-    return signatures;
+/*
+ * This requires that deletedTagLock and the handlerLock are both held.
+ */
+static jboolean
+isClassUnloaded(jlong tag)
+{
+    /* bagEnumerateOver returns true if 'func' returns true on all items and aborts early if not. */
+    return !bagEnumerateOver(deletedTagBag, isNotTag, &tag);
 }
 
 /*
@@ -156,72 +142,78 @@
  * of currently loaded prepared classes.
  * The signatures of classes which were unloaded (not present in the
  * new table) are returned.
+ *
+ * NB This relies on addPreparedClass being called for every class loaded after the
+ * classTrack_initialize function is called. We will not request all loaded classes again after
+ * that. It also relies on not being called concurrently with any classTrack_addPreparedClass or
+ * other classTrack_processUnloads calls.
  */
 struct bag *
 classTrack_processUnloads(JNIEnv *env)
 {
-    KlassNode **newTable;
-    struct bag *unloadedSignatures;
+    /* We could optimize this somewhat by holding the deletedTagLock for a much shorter time,
+     * replacing it as soon as we enter and then destroying it once we are done with it. This will
+     * cause a lot of memory churn and this function is not expected to be called that often.
+     * Furthermore due to the check for an empty bag (which should be very common) normally this
+     * will finish very quickly. In cases where there is a concurrent GC occuring and a class is
+     * being collected the GC-ing threads could be blocked until we are done but this is expected to
+     * be very rare.
+     */
+    debugMonitorEnter(deletedTagLock);
+    /* Take and return the deletedTagBag */
+    struct bag* deleted = bagCreateBag(sizeof(char*), bagSize(deletedTagBag));
+    /* The deletedTagBag is going to be much shorter than the klassNode list so we should walk the
+     * KlassNode list once and scan the deletedTagBag each time. We only need to this in the rare
+     * case that there was anything deleted though.
+     */
+    if (bagSize(deletedTagBag) != 0) {
+        KlassNode* node = list;
+        KlassNode** previousNext = &list;
 
-    unloadedSignatures = NULL;
-    newTable = jvmtiAllocate(CT_HASH_SLOT_COUNT * sizeof(KlassNode *));
-    if (newTable == NULL) {
-        EXIT_ERROR(AGENT_ERROR_OUT_OF_MEMORY, "classTrack table");
-    } else {
-
-        (void)memset(newTable, 0, CT_HASH_SLOT_COUNT * sizeof(KlassNode *));
-
-        WITH_LOCAL_REFS(env, 1) {
-
-            jint classCount;
-            jclass *classes;
-            jvmtiError error;
-            int i;
-
-            error = allLoadedClasses(&classes, &classCount);
-            if ( error != JVMTI_ERROR_NONE ) {
-                jvmtiDeallocate(newTable);
-                EXIT_ERROR(error,"loaded classes");
+        while (node != NULL) {
+            if (isClassUnloaded(node->klass_tag)) {
+                /* Update the previous node's next pointer to point after this node. Note that we
+                 * update the value pointed to by previousNext but not the value of previousNext
+                 * itself.
+                 */
+                *previousNext = node->next;
+                /* Put this nodes signature into the deleted bag */
+                *(char**)bagAdd(deleted) = node->signature;
+                /* Deallocate the node */
+                jvmtiDeallocate(node);
             } else {
-
-                /* Transfer each current class into the new table */
-                for (i=0; i<classCount; i++) {
-                    jclass klass = classes[i];
-                    transferClass(env, klass, newTable);
-                }
-                jvmtiDeallocate(classes);
-
-                /* Delete old table, install new one */
-                unloadedSignatures = deleteTable(env, table);
-                table = newTable;
+                /* This node will become the previous node so update the previousNext pointer to
+                 * this nodes next pointer.
+                 */
+                previousNext = &(node->next);
             }
-
-        } END_WITH_LOCAL_REFS(env)
-
+            node = *previousNext;
+        }
+        bagDeleteAll(deletedTagBag);
     }
-
-    return unloadedSignatures;
+    debugMonitorExit(deletedTagLock);
+    return deleted;
 }
 
 /*
- * Add a class to the prepared class hash table.
+ * Add a class to the prepared class list.
  * Assumes no duplicates.
  */
 void
 classTrack_addPreparedClass(JNIEnv *env, jclass klass)
 {
-    jint slot = hashKlass(klass);
-    KlassNode **head = &table[slot];
     KlassNode *node;
     jvmtiError error;
 
     if (gdata->assertOn) {
         /* Check this is not a duplicate */
-        for (node = *head; node != NULL; node = node->next) {
-            if (isSameObject(env, klass, node->klass)) {
-                JDI_ASSERT_FAILED("Attempting to insert duplicate class");
-                break;
-            }
+        jlong tag;
+        error = JVMTI_FUNC_PTR(trackingEnv,GetTag)(trackingEnv, klass, &tag);
+        if (error != JVMTI_ERROR_NONE) {
+            EXIT_ERROR(error,"unable to get-tag with class trackingEnv!");
+        }
+        if (tag != 0l) {
+            JDI_ASSERT_FAILED("Attempting to insert duplicate class");
         }
     }
 
@@ -234,15 +226,42 @@
         jvmtiDeallocate(node);
         EXIT_ERROR(error,"signature");
     }
-    if ((node->klass = JNI_FUNC_PTR(env,NewWeakGlobalRef)(env, klass)) == NULL) {
+    node->klass_tag = ++currentKlassTag;
+    error = JVMTI_FUNC_PTR(trackingEnv,SetTag)(trackingEnv, klass, node->klass_tag);
+    if (error != JVMTI_ERROR_NONE) {
         jvmtiDeallocate(node->signature);
         jvmtiDeallocate(node);
-        EXIT_ERROR(AGENT_ERROR_NULL_POINTER,"NewWeakGlobalRef");
+        EXIT_ERROR(error,"SetTag");
     }
 
     /* Insert the new node */
-    node->next = *head;
-    *head = node;
+    node->next = list;
+    list = node;
+}
+
+static jboolean
+setupEvents()
+{
+    jvmtiCapabilities caps;
+    memset(&caps, 0, sizeof(caps));
+    caps.can_generate_object_free_events = 1;
+    jvmtiError error = JVMTI_FUNC_PTR(trackingEnv,AddCapabilities)(trackingEnv, &caps);
+    if (error != JVMTI_ERROR_NONE) {
+        return JNI_FALSE;
+    }
+    jvmtiEventCallbacks cb;
+    memset(&cb, 0, sizeof(cb));
+    cb.ObjectFree = cbTrackingObjectFree;
+    error = JVMTI_FUNC_PTR(trackingEnv,SetEventCallbacks)(trackingEnv, &cb, sizeof(cb));
+    if (error != JVMTI_ERROR_NONE) {
+        return JNI_FALSE;
+    }
+    error = JVMTI_FUNC_PTR(trackingEnv,SetEventNotificationMode)
+            (trackingEnv, JVMTI_ENABLE, JVMTI_EVENT_OBJECT_FREE, NULL);
+    if (error != JVMTI_ERROR_NONE) {
+        return JNI_FALSE;
+    }
+    return JNI_TRUE;
 }
 
 /*
@@ -251,6 +270,22 @@
 void
 classTrack_initialize(JNIEnv *env)
 {
+    /* ANDROID_CHANGED: Setup the tracking env and the currentKlassTag */
+    trackingEnv = getSpecialJvmti();
+    if ( trackingEnv == NULL ) {
+        EXIT_ERROR(AGENT_ERROR_INTERNAL,"Failed to allocate tag-tracking jvmtiEnv");
+    }
+    /* We want to create these before turning on the events or tagging anything. */
+    deletedTagLock = debugMonitorCreate("Deleted class tag lock");
+    deletedTagBag = bagCreateBag(sizeof(jlong), 10);
+    /* ANDROID-CHANGED: Setup the trackingEnv's ObjectFree event */
+    if (!setupEvents()) {
+        /* On android classes are usually not unloaded too often so this is not a huge loss. */
+        ERROR_MESSAGE(("Unable to setup class ObjectFree tracking! Class unloads will not "
+                       "be reported!"));
+    }
+    currentKlassTag = 0l;
+    list = NULL;
     WITH_LOCAL_REFS(env, 1) {
 
         jint classCount;
@@ -260,24 +295,17 @@
 
         error = allLoadedClasses(&classes, &classCount);
         if ( error == JVMTI_ERROR_NONE ) {
-            table = jvmtiAllocate(CT_HASH_SLOT_COUNT * sizeof(KlassNode *));
-            if (table != NULL) {
-                (void)memset(table, 0, CT_HASH_SLOT_COUNT * sizeof(KlassNode *));
-                for (i=0; i<classCount; i++) {
-                    jclass klass = classes[i];
-                    jint status;
-                    jint wanted =
-                        (JVMTI_CLASS_STATUS_PREPARED|JVMTI_CLASS_STATUS_ARRAY);
-
-                    /* We only want prepared classes and arrays */
-                    status = classStatus(klass);
-                    if ( (status & wanted) != 0 ) {
-                        classTrack_addPreparedClass(env, klass);
-                    }
+            for (i=0; i<classCount; i++) {
+                jclass klass = classes[i];
+                jint status;
+                jint wanted =
+                    (JVMTI_CLASS_STATUS_PREPARED|JVMTI_CLASS_STATUS_ARRAY);
+
+                /* We only want prepared classes and arrays */
+                status = classStatus(klass);
+                if ( (status & wanted) != 0 ) {
+                    classTrack_addPreparedClass(env, klass);
                 }
-            } else {
-                jvmtiDeallocate(classes);
-                EXIT_ERROR(AGENT_ERROR_OUT_OF_MEMORY,"KlassNode");
             }
             jvmtiDeallocate(classes);
         } else {

--- ./jdk/src/share/back/util.cbak      2019-09-17 18:54:31.869658000 +0200
+++ ./jdk/src/share/back/util.c 2019-09-17 18:56:30.025675000 +0200
@@ -1744,7 +1744,7 @@
 }
 
 /* Get the jvmti environment to be used with tags */
-static jvmtiEnv *
+jvmtiEnv *
 getSpecialJvmti(void)
 {
     jvmtiEnv  *jvmti;

--- ./jdk/src/share/back/util.hbak      2019-09-17 18:54:28.563701000 +0200
+++ ./jdk/src/share/back/util.h 2019-09-17 18:56:34.247641000 +0200
@@ -429,4 +429,9 @@
 void saveGlobalRef(JNIEnv *env, jobject obj, jobject *pobj);
 void tossGlobalRef(JNIEnv *env, jobject *pobj);
 
+/* ANDROID_CHANGED: Expose this method publicly.
+ * This returns a newly allocated jvmtiEnv* with the can_tag_objects capability.
+ */
+jvmtiEnv *getSpecialJvmti(void);
+
 #endif

Comment 7 Simeon Andreev 2019-09-19 09:37:03 UTC
Created attachment 1616650 [details]
Simplified patch, after examining the Android JVM changes in "classTrack.c".

Having examined the Android patches, we observe that their focus lies elsewhere, and that the improved performance comes from one small optimization they did to the code.

Namely, they don't change the actual complexity of the GC event processing in "classTrack.c", but they do keep track of unloaded classes and omit the entire operation if unloaded classes don't exist. This is sufficient to "fix" our snippets and is likely sufficient to fix the performance problem in our product. We have cycles of loading classes, doing a lot of work, unloading the classes and loading new classes. That once in a while a GC will result in more work is OK, but we want to avoid that *every* GC causes a lot of work (and so unexpectedly long pauses in execution).

The adjusted patch is quite simple, see attachment "bug1751985.patch".

We are in the process of validating the change with our product.

Comment 8 Andrew John Hughes 2019-09-19 17:36:52 UTC
Have you spoken to the developers of the Android patch? They'd be best placed to take such a fix upstream and be aware of any known issues with it.

Comment 9 Roman Kennke 2019-09-19 17:46:42 UTC
I will also look into this issue.

Comment 11 Simeon Andreev 2019-09-20 06:01:55 UTC
(In reply to Andrew John Hughes from comment #8)
> Have you spoken to the developers of the Android patch? They'd be best
> placed to take such a fix upstream and be aware of any known issues with it.

I didn't think to do this, I thought they work on a fork that is not related. I'll ask the author of the patch.

(In reply to Roman Kennke from comment #9)
> I will also look into this issue.

Thanks!

Comment 12 Simeon Andreev 2019-09-20 06:15:01 UTC
(In reply to Andrew John Hughes from comment #8)

Also thanks for updating the OpenJDK ticket!

Comment 13 Simeon Andreev 2019-09-20 13:24:58 UTC
The attached patch has a typo:

+ * A flag indicating whether classes have been unloaded.
+ *
+ * It is cleared each time classTrack_processUnloads is called.
+ */
+jboolean* hasUnloadedClasses;

This should not be a pointer.

Comment 14 Simeon Andreev 2019-09-22 07:57:58 UTC
Created attachment 1617722 [details]
Simplified and fixed patch, after contacting author of Android JDK patch.

> The patch you posted is broken and will result in all class-unload events
> being missed. The ObjectFree event will only be sent for objects which have
> received a SetTag call. Their patch doesn't contain any of these calls so
> the event will never be triggered and class unloads will be missed. To fix
> the patch they need to add a call to SetTag on the trackingenv in the
> classTrack_addPreparedClass with a non-zero tag value.
>
> I wish you luck in fixing your issues.

Here is the feedback from the author of the patch in the Android JDK. I've updated the patch accordingly and have checked that the code that sends class unload events runs with the attached snippet. I have also double checked that this code indeed doesn't run with the previous patch.

Note that when attaching with the Eclipse debugger, the Eclipse debugger has a bug that prevents unloading classes:

https://bugs.eclipse.org/bugs/show_bug.cgi?id=449791

This is a separate issue, I'm noting it FYI.

Comment 15 Simeon Andreev 2019-09-23 07:30:16 UTC
Since the OpenJDK 1.8 code freeze is nearing (this Wednesday as per https://wiki.openjdk.java.net/display/jdk8u), it seems highly unlikely we'll have our patch in the next 1.8 release. So we must look for alternatives.

One option would be to receive the patch with a build from RHEL (so far we've not had a response on this in our support ticket: https://access.redhat.com/support/cases/#/case/02471359).

The other option would be to patch the OpenJDK ourselves. We are checking whether the OpenJDK license permits this, and if so under what conditions.

For the second option, we believe the RHEL OpenJDK version which we have already has some patches on top of it. We would like to add 1 more patch (as attached, https://bugzilla.redhat.com/attachment.cgi?id=1617722) and do retests, ASAP. Could you point me to the repository from which RHEL OpenJDK is built? OpenJDK has GPL 2 license so we guess the patch code must be available somewhere.

Comment 16 Andrew John Hughes 2019-09-23 23:34:48 UTC
(In reply to Simeon Andreev from comment #15)
> Since the OpenJDK 1.8 code freeze is nearing (this Wednesday as per
> https://wiki.openjdk.java.net/display/jdk8u), it seems highly unlikely we'll
> have our patch in the next 1.8 release. So we must look for alternatives.
> 

As maintainer of OpenJDK 8 upstream, I can say it would not just be highly unlikely, but a definite no at this stage. The October release is already at a stage where we only accept critical fixes and regressions. A performance enhancement would not fit that criteria and certainly not one that has not yet made it into the current JDK release. As Roman mentions, such a patch needs to be extensively reviewed, included & tested in the current JDK version, before being backported to OpenJDK 8.

I'm sorry to be the bearer of bad news, but that likely means the earliest you'd get this into an upstream 8u release would probably be in time for the April 2020 release and that depends on how upstream reviews fare. OpenJDK 8 is a stable release that many people rely on and we have to be very careful about what is allowed in, and that implies a certain level of maturity for changes.

The comments from the Android developer don't make me very optimistic. Did they mention anything about pushing this upstream themselves? Perhaps you could CC me on that mail (gnu.andrew@redhat.com) so I can speak to them about it directly.

> One option would be to receive the patch with a build from RHEL (so far
> we've not had a response on this in our support ticket:
> https://access.redhat.com/support/cases/#/case/02471359).

As I understand it, the comments here feed into that ticket as well, so I don't usually comment on them directly. We could look into the viability of providing a custom build with this patch for your own use, but you seem to suggest on the ticket that you need it available in RHEL itself. My concerns about this are just the same as for upstream, in that we are providing the same package to all customers. We can possibly include it in RPMs before it makes an upstream release, but we'd only want to do that after it has met the criteria above and is guaranteed to be in a forthcoming release i.e. it's not a means to circumvent the above.

> 
> The other option would be to patch the OpenJDK ourselves. We are checking
> whether the OpenJDK license permits this, and if so under what conditions.
> 
> For the second option, we believe the RHEL OpenJDK version which we have
> already has some patches on top of it. We would like to add 1 more patch (as
> attached, https://bugzilla.redhat.com/attachment.cgi?id=1617722) and do
> retests, ASAP. Could you point me to the repository from which RHEL OpenJDK
> is built? OpenJDK has GPL 2 license so we guess the patch code must be
> available somewhere.

I doubt it would be possible to include it in RHEL if the right to obtain and modify the source code was unavailable :)

The licensing for OpenJDK is https://openjdk.java.net/legal/assembly-exception.html

The OpenJDK 8 RPMs use a downstream of the main OpenJDK 8 trees which include the AArch64 port and Shenandoah garbage collector:

https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/

I will say that building OpenJDK is not trivial, especially if you want a drop-in replacement for the RPM version. If you can use something other than from the official RHEL repositories, than it would be preferable for us to just supply you with a patched set of RPMs.

Incidentally, I notice this bug is filed against RHEL 7.4. Any update to the official RHEL packages would be to those in the future RHEL 7.8 and (possibly) 7.7. We don't provide updates for earlier versions.

Comment 17 Simeon Andreev 2019-09-24 08:05:51 UTC
(In reply to Andrew John Hughes from comment #16)
> The comments from the Android developer don't make me very optimistic. Did
> they mention anything about pushing this upstream themselves? Perhaps you
> could CC me on that mail (gnu.andrew@redhat.com) so I can speak to them
> about it directly.

I have addressed the author of the Android JDK patch directly on their commit at their github repository:

https://github.com/sbwml/external_oj-libjdwp/commit/c53a7db69ec9c262fcc227403e1336a1f48bfd01

Unfortunately I have no e-mail address to share; the notifications (for comments) are done by github.

> I will say that building OpenJDK is not trivial, especially if you want a
> drop-in replacement for the RPM version. If you can use something other than
> from the official RHEL repositories, than it would be preferable for us to
> just supply you with a patched set of RPMs.

I think this will be our preferred course of action, having missed the OpenJDK 1.8 release. April 2020 is way too far in the future.

> Incidentally, I notice this bug is filed against RHEL 7.4. Any update to the
> official RHEL packages would be to those in the future RHEL 7.8 and
> (possibly) 7.7. We don't provide updates for earlier versions.

I'll discuss with our sys admins. Our next planned RHEL update is not soon, and I think it might be RHEL 7.6.

Comment 18 Simeon Andreev 2019-09-24 12:06:22 UTC
OK, I discussed with R&D, we would like a build from RHEL that contains the proposed patch.

We plan to provide this build in addition to the official OpenJDK build, for customers to use as a last resort in case of performance problems with our product. We will use the build in that manner until an official OpenJDK 1.8 version is available with the fix, which we fervently hope will be early next year.

What would the timelines be for a build with the patch, from RHEL?

Comment 19 Simeon Andreev 2019-10-15 16:41:20 UTC
Hi all,

I would like to ask how to run tests for a patched OpenJDK.

As there is no activity here, and we are nearing deadlines, we've built the JDK from the RHEL source packages:

java-1.8.0-openjdk-src-1.8.0.181-3.b13.el7_5.x86_64
java-1.8.0-openjdk-src-debug-1.8.0.181-3.b13.el7_5.x86_64

Our R&D observed that its fairly simple to add a patch on top of the already present patches from RHEL and then use the RHEL build scripts.

We'll be running our product tests in the next days. We ran the available tests from the mercurial OpenJDK repository:

Test results: passed: 3,116; failed: 2
...
make[1]: *** No rule to make target `/java/re/jck/8/promoted/latest/binaries', needed by `check-jck'.

We observe the same 2 failures also without our suggested patch. However the test count seems somewhat low, so we expect there are more tests to run.

Could you give us pointers on how to run more tests?

For compatibility we found  https://openjdk.java.net/groups/conformance/JckAccess/, but are not sure whether we can apply.

Best regards,
Simeon

Comment 20 Simeon Andreev 2019-10-22 08:00:44 UTC
We have rolled out an extra JDK version, patched by us. The OpenJDK build was done with RHEL scripts, our suggested patch was added on top of already present patches. Our product will use the official OpenJDK release, unless customers specify the experimental, patched, build.

Next action items for this ticket:

1. We require a RHEL build with the patch, so that we can replace our self-patched build.
2. We require an official OpenJDK build with the fix.
2.1. The build should show increased performance with a debug agent and debugger attached, in case of an application with frequent GC and many loaded classes (100k-200k loaded classes), where the debugger requests class unload events from the JVM.
2.2. The build should show further increased performance if no debugger is attached, in case of an application with frequent GC and many loaded classes (100k-200k loaded classes).

Comment 22 Andrey Loskutov 2019-11-12 09:24:41 UTC
(In reply to Andrew John Hughes from comment #16)
> As maintainer of OpenJDK 8 upstream, I can say it would not just be highly
> unlikely, but a definite no at this stage. The October release is already at
> a stage where we only accept critical fixes and regressions. A performance
> enhancement would not fit that criteria and certainly not one that has not
> yet made it into the current JDK release. As Roman mentions, such a patch
> needs to be extensively reviewed, included & tested in the current JDK
> version, before being backported to OpenJDK 8.
> 
> I'm sorry to be the bearer of bad news, but that likely means the earliest
> you'd get this into an upstream 8u release would probably be in time for the
> April 2020 release and that depends on how upstream reviews fare. OpenJDK 8
> is a stable release that many people rely on and we have to be very careful
> about what is allowed in, and that implies a certain level of maturity for
> changes.

Thanks Andrew, I see & we understand that.

The (simple) patch that we've applied to the custom JVM build improves performance a lot, but is still not providing "expected" performance. Therefore we are still looking for a "fast enough" (== performance near to the execution without debug agent attached) patch for this bug in the *near* future (mid 2020 latest).

We have an (experimental) option to run our software on Java 11 instead of Java 8.
If we set target for requested patch to Java 11 (current LTS) release first - could *this* speed up / simplify the process?

Comment 25 Andrew John Hughes 2019-11-22 18:26:17 UTC
(In reply to Simeon Andreev from comment #20)
> We have rolled out an extra JDK version, patched by us. The OpenJDK build
> was done with RHEL scripts, our suggested patch was added on top of already
> present patches. Our product will use the official OpenJDK release, unless
> customers specify the experimental, patched, build.
> 
> Next action items for this ticket:
> 
> 1. We require a RHEL build with the patch, so that we can replace our
> self-patched build.

I can provide this. I would need to know which RHEL version it should be built against. Please note what I said earlier about any public build being against a current version of RHEL (currently RHEL 7.7).

> 2. We require an official OpenJDK build with the fix.
> 2.1. The build should show increased performance with a debug agent and
> debugger attached, in case of an application with frequent GC and many
> loaded classes (100k-200k loaded classes), where the debugger requests class
> unload events from the JVM.
> 2.2. The build should show further increased performance if no debugger is
> attached, in case of an application with frequent GC and many loaded classes
> (100k-200k loaded classes).

This is trickier, as the patch needs upstream review and, at this point, I'm not even sure who the work is attributable to. I'll try and comment on that GitHub post.

Aiming for 8u & 11u doesn't make much difference, as the process is much the same; the patch has to go into the current development repositories first (OpenJDK 14 at present) and then be backported. We'd probably backport it to both 8 & 11 either way. As far as I recall, this code hasn't changed much between 8 & 11.

Comment 26 Simeon Andreev 2019-11-25 07:09:32 UTC
Please note that I did not express myself all that accurately.

> 1. We require a RHEL build with the patch, so that we can replace our self-patched build.
> 2. We require an official OpenJDK build with the fix.
> 2.1. The build should show increased performance with a debug agent and debugger attached, in case of an application with frequent GC and many loaded classes (100k-200k loaded classes), where the debugger requests class unload events from the JVM.
> 2.2. The build should show further increased performance if no debugger is attached, in case of an application with frequent GC and many loaded classes (100k-200k loaded classes).

A RHEL build with my patch suggestion would be fine. We use RHEL 7.4 and plan to update to RHEL 7.6 in near future. But I'm guessing an OpenJDK build against RHEL 7.7 should be fine too. We are using the newest OpenJDK 11.0.5 package from RHEL on our RHEL 7.4, we don't see problems.

For the official OpenJDK fix, unfortunately our patch suggestion is not enough. We still see performance degradation due to iterating over all loaded classes on GC. I.e. with our patch suggestion, the degradation is lesser but still exists. I.e. for an official fix we require more than just the current suggested patch. I wrote points 2.1 and 2.2 to clarify this but my use of "the fix" and "the patch" are misleading when I re-read my comment; sorry about this.

Comment 27 Simeon Andreev 2019-11-26 06:40:12 UTC
Hi Andrew John Hughes,

do you need more info?

Best regards,
Simeon

Comment 29 amit yadav 2019-11-26 10:37:33 UTC
(In reply to Andrew John Hughes from comment #25)
> (In reply to Simeon Andreev from comment #20)
> > We have rolled out an extra JDK version, patched by us. The OpenJDK build
> > was done with RHEL scripts, our suggested patch was added on top of already
> > present patches. Our product will use the official OpenJDK release, unless
> > customers specify the experimental, patched, build.
> > 
> > Next action items for this ticket:
> > 
> > 1. We require a RHEL build with the patch, so that we can replace our
> > self-patched build.
> 
> I can provide this. I would need to know which RHEL version it should be
> built against. Please note what I said earlier about any public build being
> against a current version of RHEL (currently RHEL 7.7).
> 
> > 2. We require an official OpenJDK build with the fix.
> > 2.1. The build should show increased performance with a debug agent and
> > debugger attached, in case of an application with frequent GC and many
> > loaded classes (100k-200k loaded classes), where the debugger requests class
> > unload events from the JVM.
> > 2.2. The build should show further increased performance if no debugger is
> > attached, in case of an application with frequent GC and many loaded classes
> > (100k-200k loaded classes).
> 
> This is trickier, as the patch needs upstream review and, at this point, I'm
> not even sure who the work is attributable to. I'll try and comment on that
> GitHub post.
> 
> Aiming for 8u & 11u doesn't make much difference, as the process is much the
> same; the patch has to go into the current development repositories first
> (OpenJDK 14 at present) and then be backported. We'd probably backport it to
> both 8 & 11 either way. As far as I recall, this code hasn't changed much
> between 8 & 11.

@Andrew John Hughes 

The customer has already answered the query, hence I am clearing the NNO flag for the customer.

Comment 30 Andrew John Hughes 2019-11-27 04:40:35 UTC
This bug is assigned to java-1.8.0-openjdk. Is the customer now looking for a fix for java-11-openjdk? If so, the bug should be updated to reflect this.

I can supply a test build of java-11-openjdk against RHEL 7.7. Roman is currently testing his own simpler fix (see final comment on https://bugs.openjdk.java.net/browse/JDK-8227269) so it may be better to provide the build using that. If that works, it will eliminate a lot of the issues with getting this upstream in a reasonable time frame.

Comment 31 Simeon Andreev 2019-11-27 07:17:44 UTC
(In reply to Andrew John Hughes from comment #30)
> This bug is assigned to java-1.8.0-openjdk. Is the customer now looking for
> a fix for java-11-openjdk? If so, the bug should be updated to reflect this.
> 
> I can supply a test build of java-11-openjdk against RHEL 7.7. Roman is
> currently testing his own simpler fix (see final comment on
> https://bugs.openjdk.java.net/browse/JDK-8227269) so it may be better to
> provide the build using that. If that works, it will eliminate a lot of the
> issues with getting this upstream in a reasonable time frame.

Hi Roman Kennke, Andrew Hughes,

we are very happy to see you working on this bug. However we are somewhat concerned to see yet another suggestion to merge this patch to OpenJDK: http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.00/

As I already explained, this patch definitely helps but its minimal and safe (from our POV). I want to stress that the patch doesn't fix the problem, only lessens its impact; we still see performance degradation that is not in acceptable bounds (for us and our customers). Merging this patch to OpenJDK would not resolve the problem for us. Is there some miscommunication going on right now?

Note that we would still like a build from RHEL with the minimal patch (as a workaround while waiting for the fix), since we expect quite a delay between actually fixing the OpenJDK and the next OpenJDK release that has the fix. We still have our own build with the patch and we are somewhat uneasy with patching the JVM ourselves.

Best regards and thanks,
Simeon

Comment 32 Simeon Andreev 2019-11-27 07:22:17 UTC
On a side note, would it be possible to add either me or someone from Advantest (e.g. Andrey Loskutov) as a watcher to the OpenJDK bug (https://bugs.openjdk.java.net/browse/JDK-8227269)? We missed the comment from Roman Kennke:

> A much simpler fix candidate than the Android one:
>
> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.00/
> 
> Currently testing.

Comment 34 Roman Kennke 2019-11-27 10:23:00 UTC
'My' simplified patch is the one that has been provided here, brought into shape to match JDK 14 (where the development would have to happen). Yes, I am aware that it is not sufficient, and I am working on an improved one. It will take non-trivial amount of time, though, it needs to be developed, then tested & then we can take it upstream to JDK 14, and then backport it to JDK 11u and 8u. We can roll it into our own builds as soon as it's tested, and while taking it upstream. Anyhow, I am now focusing on making a complete solution based on the simplified patch.

Regards,
Roman

Comment 35 Roman Kennke 2019-11-27 15:43:57 UTC
I studied the Android patch a little, and I think it's better than our 'simplified' patch in that it also improves the case when we're looking at actual class-unload-events. It doesn't throw away the whole cache of prepared classes, but instead incrementally updates it. I took the liberty and re-implemented it -- it is pretty straightforward. I also extended it to not do anything if debug agent is not interested in CLASS_UNLOAD (EI_GC_FINISH) events. Unfortunately, jdb seems to register a CLASS_UNLOAD listener right after connection, so I could not really test this, but I expect this to bring performance to the level as if no debug agent were attached because it makes class-tracking a no-op.

The preliminary patch against jdk14 (requires more testing and perhaps some cleanups) is here:

http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/

After some more testing, and if we agree that this is what we need, I'm going to make backports to make it fit jdk11 and jdk8.

Let me know.
Roman

Comment 36 Roman Kennke 2019-11-27 15:54:36 UTC
Before I forget, my performance numbers with and without the patch, and with and without debug agent attached are:

unpatched: no debug: 84s with debug: 225s
patched:   no debug: 85s with debug: 95s

As stated above, it doesn't make a difference whether we don't activate class-tracking when no CLASS_UNLOAD listeners are there, because jdb installs such a listener at connection-setup. Not sure about other debug agents. Also, there's no difference between the simple patch and the more complicated patch performance wise, because it doesn't seem to do any class-unloading, so the basic performance properties are the same.

Roman

Comment 37 Simeon Andreev 2019-11-27 16:06:55 UTC
This sounds great! We'll try to build a JDK with the patch and validate performance with out product.

As for no CLASS_UNLOAD listeners, this is also a great improvement. Eclipse will install such a listener, but we don't always attach the Eclipse debugger (or another debugger) to our application; we always have a debug agent as a JVM parameter. So we would gain "full" performance whenever there is no actual debugger attached.

I'll comment here as soon as we have something with this: http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/

Comment 38 Roman Kennke 2019-11-27 20:57:59 UTC
First rounds of testing looks positive. The provided patch applies without issues on jdk11u too.

Comment 39 Roman Kennke 2019-11-27 21:31:44 UTC
The backport to jdk8u is straightforward:
http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev-jdk8u.00/

Comment 40 Andrew John Hughes 2019-11-27 21:38:07 UTC
Simeon, I'm happy to build an RPM for you if you let me know whether you want to test with java-1.8.0-openjdk still or java-11-openjdk.

Thanks.

Comment 41 Andrey Loskutov 2019-11-27 21:40:56 UTC
(In reply to Andrew John Hughes from comment #40)
> Simeon, I'm happy to build an RPM for you if you let me know whether you
> want to test with java-1.8.0-openjdk still or java-11-openjdk.

Andrew, would be great to have 1.8 build, because 11 version is still "optional" one for our customers and not all of their software works with that. Many thanks!

Comment 42 Simeon Andreev 2019-11-29 15:39:16 UTC
Created attachment 1640688 [details]
Script to compare performance of GC with and without debugger, when many classes are loaded and classes are being unloaded.

With the latest patch (http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev-jdk8u.00/) we see performance as expected with a debug agent but no debugger attached. And we see improved performance compared to our minimal patch, when a debugger is attached.

So far we have checked only with snippets. Next we are validating the patch in our product.

I've attached the latest snippet we are using. With the latest patch this snippet shows about 2x slower execution when a debugger is attached. We hope very much that our application does not run into such a clinical case. I'll update the ticket as soon as we have performance numbers.

Comment 43 Andrew John Hughes 2019-12-04 22:38:09 UTC
Great, I'm currently building java-1.8.0-openjdk with the patch for RHEL 7.7 and will let you have them to test ASAP.

Sorry for the delay, but I was travelling at the beginning of this week.

Comment 44 Roman Kennke 2019-12-20 09:33:31 UTC
It turns out that the previous implementation was flawed too. I reworked the implementation again to make it O(1), and also added proper synchronization to avoid datastructure-corruption and proper cleanups to avoid memory leaks, both of which was missing in all previous implementations. See upstream review thread:

https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/030171.html

and actual webrev:

http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/

Comment 45 Andrey Loskutov 2019-12-20 10:04:31 UTC
(In reply to Roman Kennke from comment #44)
> and actual webrev:
> 
> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/

I assume this is for 11 or 14. Could you share patch against 8?
The patch uses jdk.jdwp.agent/share/native/libjdwp paths that aren't available in 8 (we have share/back/ instead). If I update paths accordingly, patch still doesn't apply cleanly.

Comment 46 Roman Kennke 2019-12-20 11:18:10 UTC
(In reply to Andrey Loskutov from comment #45)
> (In reply to Roman Kennke from comment #44)
> > and actual webrev:
> > 
> > http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
> 
> I assume this is for 11 or 14. Could you share patch against 8?
> The patch uses jdk.jdwp.agent/share/native/libjdwp paths that aren't
> available in 8 (we have share/back/ instead). If I update paths accordingly,
> patch still doesn't apply cleanly.

Find the jdk8u-based webrev here:
http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev-jdk8u.03/

Roman

Comment 47 Simeon Andreev 2019-12-20 13:06:02 UTC
(In reply to Roman Kennke from comment #46)
> (In reply to Andrey Loskutov from comment #45)
> > (In reply to Roman Kennke from comment #44)
> > > and actual webrev:
> > > 
> > > http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
> > 
> > I assume this is for 11 or 14. Could you share patch against 8?
> > The patch uses jdk.jdwp.agent/share/native/libjdwp paths that aren't
> > available in 8 (we have share/back/ instead). If I update paths accordingly,
> > patch still doesn't apply cleanly.
> 
> Find the jdk8u-based webrev here:
> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev-jdk8u.03/
> 
> Roman

I've tried to use the patch on top of our 1.8 181 build, when I run the attached script for performance testing I get:

JDWP exit error JVMTI_ERROR_INVALID_MONITOR(50): on raw monitor enter [util.c:1054]
FATAL ERROR in native method: JDWP on raw monitor enter, jvmtiError=JVMTI_ERROR_INVALID_MONITOR(50)
        at java.lang.Class.getDeclaredConstructors0(Native Method)
        at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
        at java.lang.Class.getConstructor0(Class.java:3075)
        at java.lang.Class.newInstance(Class.java:412)
        at test.TestJDKClassUnloadingPerformance.loadTestClasses(TestJDKClassUnloadingPerformance.java:69)
        at test.TestJDKClassUnloadingPerformance.main(TestJDKClassUnloadingPerformance.java:45)

I'm not sure what that means? Did I mess up the build somehow?

Also note that the patch uses "for (int idx = 0" which the RHEL OpenJDK build doesn't allow (defining the int in the loop is a newer feature than the used compiler standard).

Comment 48 Roman Kennke 2019-12-20 13:29:18 UTC
> I've tried to use the patch on top of our 1.8 181 build, when I run the
> attached script for performance testing I get:
> 
> JDWP exit error JVMTI_ERROR_INVALID_MONITOR(50): on raw monitor enter
> [util.c:1054]
> FATAL ERROR in native method: JDWP on raw monitor enter,
> jvmtiError=JVMTI_ERROR_INVALID_MONITOR(50)
>         at java.lang.Class.getDeclaredConstructors0(Native Method)
>         at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
>         at java.lang.Class.getConstructor0(Class.java:3075)
>         at java.lang.Class.newInstance(Class.java:412)
>         at
> test.TestJDKClassUnloadingPerformance.
> loadTestClasses(TestJDKClassUnloadingPerformance.java:69)
>         at
> test.TestJDKClassUnloadingPerformance.main(TestJDKClassUnloadingPerformance.
> java:45)
> 
> I'm not sure what that means? Did I mess up the build somehow?
> 
> Also note that the patch uses "for (int idx = 0" which the RHEL OpenJDK
> build doesn't allow (defining the int in the loop is a newer feature than
> the used compiler standard).

That's probably my fault. I should have tested the patch more thoroughly before sending it out. I assumed because it basically applied cleanly, it would work as expected.

I'll send an update in a bit.

Roman

Comment 49 Roman Kennke 2019-12-21 21:32:12 UTC
(In reply to Simeon Andreev from comment #47)
> (In reply to Roman Kennke from comment #46)
> > (In reply to Andrey Loskutov from comment #45)
> > > (In reply to Roman Kennke from comment #44)
> > > > and actual webrev:
> > > > 
> > > > http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
> > > 
> > > I assume this is for 11 or 14. Could you share patch against 8?
> > > The patch uses jdk.jdwp.agent/share/native/libjdwp paths that aren't
> > > available in 8 (we have share/back/ instead). If I update paths accordingly,
> > > patch still doesn't apply cleanly.
> > 
> > Find the jdk8u-based webrev here:
> > http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev-jdk8u.03/
> > 
> > Roman
> 
> I've tried to use the patch on top of our 1.8 181 build, when I run the
> attached script for performance testing I get:
> 
> JDWP exit error JVMTI_ERROR_INVALID_MONITOR(50): on raw monitor enter
> [util.c:1054]
> FATAL ERROR in native method: JDWP on raw monitor enter,
> jvmtiError=JVMTI_ERROR_INVALID_MONITOR(50)
>         at java.lang.Class.getDeclaredConstructors0(Native Method)
>         at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
>         at java.lang.Class.getConstructor0(Class.java:3075)
>         at java.lang.Class.newInstance(Class.java:412)
>         at
> test.TestJDKClassUnloadingPerformance.
> loadTestClasses(TestJDKClassUnloadingPerformance.java:69)
>         at
> test.TestJDKClassUnloadingPerformance.main(TestJDKClassUnloadingPerformance.
> java:45)
> 
> I'm not sure what that means? Did I mess up the build somehow?
> 
> Also note that the patch uses "for (int idx = 0" which the RHEL OpenJDK
> build doesn't allow (defining the int in the loop is a newer feature than
> the used compiler standard).

Find a fixed patch here:
http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev-jdk8u.04/

Thanks,
Roman

Comment 50 Andrew John Hughes 2019-12-23 16:31:31 UTC
Simeon & Andrey, how were the previous builds? Do you need new builds with Roman's updated patch?

Thanks.

Comment 51 Simeon Andreev 2019-12-24 06:23:41 UTC
(In reply to Andrew John Hughes from comment #50)
> Simeon & Andrey, how were the previous builds? Do you need new builds with
> Roman's updated patch?
> 
> Thanks.

Ah sorry, I've somehow missed the last comment from Roman. We'll build our own 1.8 181 with it, since we have a good basis for comparison with the 181 build.

As for the previous patch, the performance results are inconclusive. We see improvement in a number of cases, but the biggest case still has wild performance swings. We'll try with the improved patch, though that will be after the holidays.

Comment 52 Simeon Andreev 2019-12-24 09:33:09 UTC
Tested the new patch with the attached snippet, performance is as expected both with attached and not attached debugger. Also ran our test for class unloading events (as received by the Eclipse JDT debugger), no problems there either.

We'll again run performance tests with our product after the holidays, I'll post results as soon as we have them.

Comment 53 Simeon Andreev 2020-01-09 07:34:25 UTC
Created attachment 1650921 [details]
Crash with the updated JDK.

We got a crash during our performance tests with the latest patch, see the attached log.

Stack: [0x00007ffe7df69000,0x00007ffe7e06a000],  sp=0x00007ffe7e068910,  free space=1022k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libjdwp.so+0xcdff]
C  [libjdwp.so+0xd466]
C  [libjdwp.so+0x1668d]
C  [libjdwp.so+0x4f14]
C  [libjdwp.so+0x11ec8]
C  [libjdwp.so+0x24e81]
V  [libjvm.so+0x752f8b]
V  [libjvm.so+0xa788a2]
V  [libjvm.so+0x8beb82]
C  [libpthread.so.0+0x7e25]  start_thread+0xc5

Is there any other info you need from the crash?

Comment 54 Roman Kennke 2020-01-09 08:06:22 UTC
(In reply to Simeon Andreev from comment #53)
> Created attachment 1650921 [details]
> Crash with the updated JDK.
> 
> We got a crash during our performance tests with the latest patch, see the
> attached log.
> 
> Stack: [0x00007ffe7df69000,0x00007ffe7e06a000],  sp=0x00007ffe7e068910, 
> free space=1022k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native
> code)
> C  [libjdwp.so+0xcdff]
> C  [libjdwp.so+0xd466]
> C  [libjdwp.so+0x1668d]
> C  [libjdwp.so+0x4f14]
> C  [libjdwp.so+0x11ec8]
> C  [libjdwp.so+0x24e81]
> V  [libjvm.so+0x752f8b]
> V  [libjvm.so+0xa788a2]
> V  [libjvm.so+0x8beb82]
> C  [libpthread.so.0+0x7e25]  start_thread+0xc5
> 
> Is there any other info you need from the crash?

Is there any way I could reproduce this?

Comment 55 Simeon Andreev 2020-01-09 08:14:58 UTC
(In reply to Roman Kennke from comment #54)
> Is there any way I could reproduce this?

I doubt it. Maybe we can build with debug symbols so that its clear where the crash is? If you need this, let me know what I need to specify for such a build.

In the extreme case we can try to reduce our product to a snippet which produces the crash. That will not be trivial.

Right now I've taken your patch on top of OpenJDK 1.8 181, we'll also try with OpenJDK 1.8 232 (in case something in 181 doesn't work with the patch).

Comment 56 Roman Kennke 2020-01-09 08:22:04 UTC
You can try to build jdk8 with --disable-zip-debug-info (when calling configure). If that doesn't help, build with --with-debug-level=fastdebug, that will be slower than release build.

Thanks,
Roman

Comment 57 Simeon Andreev 2020-01-10 08:09:43 UTC
I'm having problems with building OpenJDK 1.8 232 from the RHEL source RPM, as well as building OpenJDK 1.8 181 with "--with-debug-level=fastdebug".

For 232:

    Too many levels of recursion in macro expansion. It is likely caused by recursive macro declaration.
    Bad exit status from /var/tmp/rpm-tmp.2qU6K2 (%install)

There are a ton of lines with "Too many levels of recursion in macro expansion", not just one.

I downloaded the source RPM from here:

https://access.redhat.com/downloads/content/java-1.8.0-openjdk-src/1.8.0.232.b09-2.el8_1/x86_64/fd431d51/package

For 181:

I changed the spec file to hard-code the debug level, my guess is that this is now the correct way to change the debug level. Two builds are created, likely my change messes up one of them.

In particular, I changed the line (1494 in my case):

    --with-debug-level=$debugbuild \

To:

--with-debug-level=fastdebug \

I.e. at about line 1500 of the "SPECS/java-1.8.0-openjdk.spec" file I have: 

%ifnarch %{jit_arches}
    --with-jvm-variants=zero \
%endif
    --disable-zip-debug-info \
    --with-milestone="fcs" \
    --with-update-version=%{updatever} \
    --with-build-number=%{buildver} \
    --with-boot-jdk=/usr/lib/jvm/java-openjdk \
    --with-debug-level=fastdebug \
    --enable-unlimited-crypto \
    --enable-system-nss \
    --with-zlib=system \
    --with-libjpeg=system \
    --with-giflib=system \
    --with-libpng=system \
    --with-lcms=bundled \
    --with-stdc++lib=dynamic \
    --with-extra-cxxflags="$EXTRA_CPP_FLAGS" \
    --with-extra-cflags="$EXTRA_CFLAGS" \
    --with-extra-ldflags="%{ourldflags}" \
    --with-num-cores="$NUM_PROC"

I'm using the RHEL source RPM since that integrates well in our RPM build and roll-out process. As well as adding the patch for this bug is trivial.

Comment 58 Simeon Andreev 2020-01-10 08:57:34 UTC
> my guess is that this is now the correct way to change the debug level

Sorry, I meant to say "this is not the correct way".

Comment 59 jiri vanek 2020-01-10 11:52:49 UTC
Hi Simeon, " Too many levels of recursion in macro expansion. It is likely caused by recursive macro declaration." should be only non fatal warning. The casue will be somewehre else. And error wil be printed somewhere above. For curiosity, the warning is spawned by hardcoded limit of recursion in rpmbuild. It is already fixed in newer rpmbuilds.

What you did should be really enough to create fastdebug jdk image, but not enough to make rpms.

From "Bad exit status from /var/tmp/rpm-tmp.2qU6K2 (%install)" I'm guessing you passed the %build successfully, so in your buildroot is already fastdebug jdk image. Only creation of rpm from it failed.
This image should be enough for you, can it?

If not, then to fix the whole rpmbuild to produce fastdeug rpms will be quite more substitions. Maybe s/slowdebug/fastdebug in spec will do most of them, but thenhunting one by one will be necessary.

gnu_andrew, I thought that we started building fastdebug rpms alongside slowdebug for cases like this.  Have this idea faded away?

Comment 60 Simeon Andreev 2020-01-10 13:21:41 UTC
ASSERT FAILED: classTrack.c : 159 - Attempting to insert duplicate class
JDWP exit error AGENT_ERROR_INTERNAL(181): Assertion Failed [error_messages.c:132]
FATAL ERROR in native method: JDWP Assertion Failed, jvmtiError=AGENT_ERROR_INTERNAL(181)
Current thread is 140730794264320
Dumping core ...

I'll have to analyse this in terms of our product; I'm only somewhat familiar with the recent developments in the area of product code which crashes. I'll check which classes are being loaded and unloaded.

Comment 61 Andrey Loskutov 2020-01-10 13:26:31 UTC
(In reply to Simeon Andreev from comment #60)
> ASSERT FAILED: classTrack.c : 159 - Attempting to insert duplicate class

I assume this could be due the fact that our application *re-writes* classes and loads / unloads a massive amount of re-written classes that have *exactly same* fully qualified names. So if the new logic keeps caches based on FQN class names it might get into the state where the "same" class is both loaded and unloaded multiple times.

Comment 62 Roman Kennke 2020-01-10 14:15:33 UTC
Ok, this helps a lot! I will check my changes and come back to you. Thanks!

Comment 63 Simeon Andreev 2020-01-10 14:43:53 UTC
We run 1 JVM and attach multiple times to it, executing some code. I see the crash after the 2nd attach. I think the crash even occurs on attach, I'm checking this.

Comment 64 Simeon Andreev 2020-01-10 15:29:00 UTC
The crash occurs after:

"Worker-75: Test program execution" #1769 prio=5 os_prio=0 tid=0x0000000004afc000 nid=0xb2ed at breakpoint[0x00007ff3386a2000]
   java.lang.Thread.State: RUNNABLE
        at org.eclipse.jdi.internal.MirrorImpl.requestVM(MirrorImpl.java:190)
        at org.eclipse.jdi.internal.MirrorImpl.requestVM(MirrorImpl.java:230)
        at org.eclipse.jdi.internal.request.EventRequestImpl.enable(EventRequestImpl.java:261)
        - locked <0x00007ffc02261a38> (a org.eclipse.jdi.internal.request.ThreadStartRequestImpl)
        at org.eclipse.jdt.internal.debug.core.model.JDIDebugTarget$ThreadStartHandler.createRequest(JDIDebugTarget.java:2226)
        at org.eclipse.jdt.internal.debug.core.model.JDIDebugTarget$ThreadStartHandler.<init>(JDIDebugTarget.java:2214)
        at org.eclipse.jdt.internal.debug.core.model.JDIDebugTarget.initializeRequests(JDIDebugTarget.java:611)
        at org.eclipse.jdt.internal.debug.core.model.JDIDebugTarget.initialize(JDIDebugTarget.java:550)
        ...
        at org.eclipse.core.internal.jobs.Worker.run(Worker.java:63)

The packet to send is:

[4820] ER_SET

Not sure why we would crash on enabling a ThreadStartEvent, possibly some code in the latest patch reacts on any request enable and results in a crash.

I'm still not able to reproduce this with a simple snippet and Eclipse though. Our debug launch is somewhat different than the Remote Java Application Attach launch from Eclipse, probably this is why.

Let me know if there are other infos I can retrieve; e.g. if you have some patch with more verbosity in case the assertion error is reached.

Comment 65 Roman Kennke 2020-01-15 15:50:20 UTC
I was sure I sent a note here earlier. Apparently it hasn't been sent. Well anyway, are you sure that you are running with the latest patch? Because the assert that you step on is not present there anymore. Please be sure to use this version of the fix:

http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev-jdk8u.04/


Thanks,
Roman

Comment 66 Simeon Andreev 2020-01-15 18:00:56 UTC
(In reply to Roman Kennke from comment #65)
> I was sure I sent a note here earlier. Apparently it hasn't been sent. Well
> anyway, are you sure that you are running with the latest patch? Because the
> assert that you step on is not present there anymore. Please be sure to use
> this version of the fix:
> 
> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev-jdk8u.04/
> 
> 
> Thanks,
> Roman

Stack: [0x00007ffe71eb6000,0x00007ffe71fb7000],  sp=0x00007ffe71fb5810,  free space=1022k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libjdwp.so+0xccef]  setupEvents+0x5f
C  [libjdwp.so+0xd356]  classTrack_activate+0x26
C  [libjdwp.so+0x1658d]  installHandler.part.3+0xad
C  [libjdwp.so+0x4e04]  setCommand+0x1a4
C  [libjdwp.so+0x11dc8]  debugLoop_run+0x2b8
C  [libjdwp.so+0x24d81]  acceptThread+0x101
V  [libjvm.so+0xbeb2ff]  JvmtiAgentThread::call_start_function()+0x15f
V  [libjvm.so+0x110ca4a]  JavaThread::thread_main_inner()+0x15a
V  [libjvm.so+0x110d072]  JavaThread::run()+0x3d2
V  [libjvm.so+0xe26e72]  java_start(Thread*)+0xf2
C  [libpthread.so.0+0x7e25]  start_thread+0xc5

Sorry, I had a patch from late November (27. November I think). Is this enough? I can also run with a debug build, the reproduction is trivial in our product.

Comment 67 Simeon Andreev 2020-01-15 18:42:50 UTC
Created attachment 1652540 [details]
Crash log with debug build and patch from 20.12.2019.

Comment 68 Roman Kennke 2020-01-15 19:02:49 UTC
Thank you! Can you describe what lead to that happening? Repeated connects with the debug agent or something like that?

Comment 69 Simeon Andreev 2020-01-16 07:25:47 UTC
Created attachment 1652649 [details]
Video showing the JDK on-debugger-attach crash with Eclipse.

(In reply to Roman Kennke from comment #68)
> Thank you! Can you describe what lead to that happening? Repeated connects
> with the debug agent or something like that?

We start a JVM, attach to it and run some code (classes are loaded prior to running and possibly unloaded once the code is ran), i.e. we start a debug execution. Then on the next debug execution, the JVM crashes as soon as the Eclipse JDT debugger is attached.

Also see comment 63 and comment 64.

> We run 1 JVM and attach multiple times to it, executing some code. I see the crash after the 2nd attach. I think the crash even occurs on attach, I'm checking this.


> The crash occurs after:
>
> "Worker-75: Test program execution" #1769 prio=5 os_prio=0 tid=0x0000000004afc000 nid=0xb2ed at breakpoint[0x00007ff3386a2000]
>    java.lang.Thread.State: RUNNABLE
>         at org.eclipse.jdi.internal.MirrorImpl.requestVM(MirrorImpl.java:190)
>         at org.eclipse.jdi.internal.MirrorImpl.requestVM(MirrorImpl.java:230)
>         at org.eclipse.jdi.internal.request.EventRequestImpl.enable(EventRequestImpl.java:261)
>         - locked <0x00007ffc02261a38> (a org.eclipse.jdi.internal.request.ThreadStartRequestImpl)
>         at org.eclipse.jdt.internal.debug.core.model.JDIDebugTarget$ThreadStartHandler.createRequest(JDIDebugTarget.java:2226)
>         at org.eclipse.jdt.internal.debug.core.model.JDIDebugTarget$ThreadStartHandler.<init>(JDIDebugTarget.java:2214)
>         at org.eclipse.jdt.internal.debug.core.model.JDIDebugTarget.initializeRequests(JDIDebugTarget.java:611)
>         at org.eclipse.jdt.internal.debug.core.model.JDIDebugTarget.initialize(JDIDebugTarget.java:550)
>         ...
>         at org.eclipse.core.internal.jobs.Worker.run(Worker.java:63)
>
> The packet to send is:
>
> [4820] ER_SET
>
> Not sure why we would crash on enabling a ThreadStartEvent, possibly some code in the latest patch reacts on any request enable and results in a crash.
>
> I'm still not able to reproduce this with a simple snippet and Eclipse though. Our debug launch is somewhat different than the Remote Java Application Attach launch from Eclipse, probably this is why.
>
> Let me know if there are other infos I can retrieve; e.g. if you have some patch with more verbosity in case the assertion error is reached.

My previous attempts to reproduce the problem in Eclipse might have been missing the Debug view in Eclipse, I think the thread start listener is only registered if that view is open.

Here is the snippet:

package test;


import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Paths;

public class TestClassUnloadingCrash {
	
	private static final int NUMBER_OF_CLASSES_IN_JAR = 1_000;

	public static void main(String[] args) throws Exception {
		System.out.println(System.getProperty("java.home"));
		String classesJarPath = "/home/sandreev/development_workspaces/contributor_workspace/ws/TestJDKClassUnloading/lib/classes.jar";
		URL[] urls = {
				Paths.get(classesJarPath).toUri().toURL()
		};
		
		int iterations = 1;
		int numberOfClasses = 5;

		int numberOfClassLoaders = numberOfClasses / NUMBER_OF_CLASSES_IN_JAR;
		long start = System.currentTimeMillis();
		
		for (int iteration = 0; iteration < iterations; ++iteration) {
			System.out.println("Iteration " + iteration + "/" + iterations + " starting.");
			URLClassLoader[] classLoaders = new URLClassLoader[numberOfClassLoaders];
			for (int i = 0; i < numberOfClassLoaders; ++i) {
				classLoaders[i] = new URLClassLoader(urls, null);
			}
			System.out.println("Class loaders created. Loading classes.");
			Thread[] t = new Thread[classLoaders.length];
			for (int i = 0; i < classLoaders.length; ++i) {
				final URLClassLoader classLoader = classLoaders[i];
				t[i] = new Thread(new Runnable() {
					public void run() {
						loadTestClasses(classLoader);
					}
				});
				t[i].start();
			}
			for (Thread tr : t) {
				tr.join();
			}
			System.out.println("Loading classes done. Unloading classes.");
			for (int i = 0; i < numberOfClassLoaders; ++i) {
				classLoaders[i].close();
				classLoaders[i] = null;
			}
			System.gc();
			System.runFinalization();
			classLoaders = null;
			System.gc();
			System.runFinalization();
			System.out.println("Iteration " + iteration + "/" + iterations + " done.");
		}
		
		long end = System.currentTimeMillis();
		long elapsed = end - start;
		System.out.println("Total time: " + elapsed + " ms");
		while (true) {
			Thread.sleep(500);
			System.out.println("sleeping...");
		}
	}

	private static void loadTestClasses(URLClassLoader classLoader) {
		try {
			for (int j = 0; j < NUMBER_OF_CLASSES_IN_JAR; ++j) {
				String className = "sample.C" + j;
				Class<?> clazz = classLoader.loadClass(className);
				clazz.newInstance();
			}
		} catch (Exception e) {
			throw new RuntimeException(e);
		}
	}
}

The jar with classes is the same as the one in the performance snippet I attached previously.

Run with arguments:

-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8001

Then attach with a debug launch of type "Remote Java Application" at port 8001. Disconnect the debugger and attach it again.

I don't know if this can be reproduced with jdb, I didn't try.

Comment 70 Andrey Loskutov 2020-01-22 14:11:45 UTC
@Roman: any update on the patch? Do you need something else from us?

Comment 71 Roman Kennke 2020-01-22 15:00:07 UTC
(In reply to Andrey Loskutov from comment #70)
> @Roman: any update on the patch? Do you need something else from us?

Just in this moment I was uploading an updated patch:

jdk11+:
http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
jdk8u:
http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev-jdk8u.05/

Let me know if that works better. Thanks, Roman

Comment 72 Andrey Loskutov 2020-01-22 15:56:34 UTC
(In reply to Roman Kennke from comment #71)
> Let me know if that works better. Thanks, Roman

Thanks Roman, build is running...

Comment 73 Simeon Andreev 2020-01-22 16:30:46 UTC
On first glance the build looks good, no performance drop when debugging and also no immediate crash with our product. Tomorrow I'll run the test suite which first showed the crash, I'll let you know what the result is.

Comment 74 Roman Kennke 2020-01-22 16:39:51 UTC
(In reply to Simeon Andreev from comment #73)
> On first glance the build looks good, no performance drop when debugging and
> also no immediate crash with our product. Tomorrow I'll run the test suite
> which first showed the crash, I'll let you know what the result is.

Nice, glad to hear that. Let me know how it goes!
Thanks,
Roman

Comment 75 Simeon Andreev 2020-02-14 12:38:10 UTC
OK, our R&D is done with performance validation. We would like the latest patch to be integrated.

Comment 76 Andrey Loskutov 2020-03-17 18:10:33 UTC
@Roman: any estimation for a possible JDK 8 / 11 release numbers with the fix? 
OpenJDK 8u252 and 11.0.7 would be just great because we will get another fix there (bug 1776915 :-).

Comment 77 Roman Kennke 2020-03-17 18:43:36 UTC
I have patches ready for JDK15, JDK11 (identical to 15) and 8:

jdk15 and jdk11:
http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
jdk8:
http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev-jdk8u.05/

It is currently under review for JDK15.

I am not sure if we want to include it in our own RPMs for the time being, until it will hit backports?

Roman

Comment 78 Andrey Loskutov 2020-03-17 20:18:46 UTC
(In reply to Roman Kennke from comment #77)
> I have patches ready for JDK15, JDK11 (identical to 15) and 8:
> 
> jdk15 and jdk11:
> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
> jdk8:
> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev-jdk8u.05/
> 
> It is currently under review for JDK15.

Thank you, very good news.

> I am not sure if we want to include it in our own RPMs for the time being,
> until it will hit backports?
 
Roman, I guess you aren't asking me? I'm all for that, if that's what the usual practice is.

Comment 79 Andrew John Hughes 2020-03-18 16:27:48 UTC
(In reply to Roman Kennke from comment #77)
> I have patches ready for JDK15, JDK11 (identical to 15) and 8:
> 
> jdk15 and jdk11:
> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
> jdk8:
> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev-jdk8u.05/
> 
> It is currently under review for JDK15.
> 
> I am not sure if we want to include it in our own RPMs for the time being,
> until it will hit backports?
> 
> Roman

We can use this bug to put it into RHEL 7.9, but I doubt that will beat it going into the upstream 8u & 11u releases.

Let me know once it's in OpenJDK 15, and I'll help make sure it makes it into 8u & 11u.

Comment 80 Roman Kennke 2020-03-18 16:57:22 UTC
I discovered a bug in my implementation and would discourage it being put into any release for now. See:

https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-March/030703.html

Sorry. Roman

Comment 81 Roman Kennke 2020-03-20 16:36:03 UTC
The last revision of the fix had a problem with its locking which could lead to deadlocks in the JVM. Also, due to another bug, it did not report *any* class-unloads back to an attached debugger.

I came up with (yet) another implementation that is much simpler and more efficient than the previous ones. It doesn't require any additional data-structures, and is truly O(1) and does not exhibit the locking problems that the previous implementation did. Find more details here:
https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-March/030753.html

Notice that, because it's now actually reporting class-unload events back to the attached debugger (if it asks for it, which is what jdb seems to do by default), performance of 'with agent *and* debugger' is now worse than it was in the previous implementation. This is because it needs to actually get all those class-unload-notifications over the wire, and they are a lot (at least in the provided testcase). Fixing this requires changes in jdb (and possibly other debuggers if they exhibit the same behaviour).

JDK11(+) patch is here:
http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/

JDK8 patch:
http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev-jdk8u.06/

I'd suggest only including the patches into our builds after upstream (jdk15) review is done, and the patch has been integrated and seen some rounds of testing there.

Roman

Comment 82 Simeon Andreev 2020-03-25 08:08:54 UTC
With the new patch, so far, performance results look OK. We are running more tests, I'll post an update when we have conclusive results.

Comment 83 Roman Kennke 2020-03-27 11:44:38 UTC
I finally pushed the change to jdk15, with some additional modifications that resolve a crash that we observed in testing:

http://hg.openjdk.java.net/jdk/jdk/rev/63a288f3f25a

The corresponding changeset vs jdk8 is:

http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev-jdk8u.09/

I'd suggest to give it some time, and then we can either include that in our RPMs or wait some more until it trickles down via backports.

Comment 84 Simeon Andreev 2020-04-22 12:46:54 UTC
Could you provide also a changeset for the OpenJDK 11 stream? Or can the same change be applied on the Java 11 stream (I assume yes)?

Comment 85 Andrew John Hughes 2020-04-22 18:27:27 UTC
We'll backport to 8u via 11u upstream, but we should clone this bug for java-11-openjdk so it's updated at the same as java-1.8.0-openjdk.

Comment 86 jiri vanek 2020-04-23 07:31:17 UTC
No special test for this planed.

Comment 87 Simeon Andreev 2020-04-29 10:07:57 UTC
We tried building OpenJDK 11.0.7 with the changeset on top, unfortunately we are running into crashes: https://bugzilla.redhat.com/show_bug.cgi?id=1828845

Either there is a regression between OpenJDK 11.0.7 or our build on RHEL 7.4 isn't working. We assume the latter.

We would like a RHEL build of OpenJDK 11.0.5 with the patch (we have OpenJDK RPM installed, with version 11.0.5.10-0.el7_7.x86_64), as we are in the process of updating to OpenJDK 11 (while still supporting our product with OpenJDK 8, as long as Java 8 is supported). The performance improvement is unfortunately a must-have before updating to 11. Can you provide a 11.0.5 build with the patch?

Comment 88 Roman Kennke 2020-04-29 21:07:19 UTC
The crash reported in https://bugzilla.redhat.com/show_bug.cgi?id=1828845 seems very very unlikely to be related to this change. Could be a regression in 11.0.7, we need to investigate.

Roman

Comment 89 Andrey Loskutov 2020-05-03 21:17:38 UTC
(In reply to Roman Kennke from comment #88)
> The crash reported in https://bugzilla.redhat.com/show_bug.cgi?id=1828845
> seems very very unlikely to be related to this change. Could be a regression
> in 11.0.7, we need to investigate.

Just an update: the crash above is *not* related to *this* patch, it is just a bug in 11.0.5 C2 compiler if Parallel GC is on. We've reproduced it with the default 11.0.5 JDK version too.

Comment 90 Andrew John Hughes 2020-05-23 11:02:28 UTC
Adding companion fix for x86-32 breakage.


Note You need to log in before you can comment on or make changes to this bug.