This bug has been migrated to another issue tracking site. It has been closed here and may no longer be being monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at Red Hat Issue Tracker .
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 2138897 - Significant performance degradation after step over during debugging
Summary: Significant performance degradation after step over during debugging
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: java-11-openjdk
Version: 9.0
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: rc
: ---
Assignee: Andrew John Hughes
QA Contact: OpenJDK QA
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2022-10-31 15:43 UTC by Simeon Andreev
Modified: 2023-09-12 23:31 UTC (History)
5 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-09-12 23:31:53 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker   RHEL-3498 0 None Migrated None 2023-09-12 23:31:45 UTC
Red Hat Issue Tracker RHELPLAN-137911 0 None None None 2023-02-06 16:38:24 UTC

Description Simeon Andreev 2022-10-31 15:43:25 UTC
Description of problem:

We observe significant performance degradation if step over is used while debugging. We found this problem with Eclipse and Java 11 / Java 17, but its reproducible also with jdb and Java 11 / Java 17.

Version-Release number of selected component (if applicable):

OpenJDK 11.0.10 (seen also with latest master compiled from https://github.com/openjdk/jdk11u)
OpenJDK 17.0.4

How reproducible:

Debug the following snippet:

  public class StepOver {
      public static void main(String[] args) throws Exception {
          System.out.println(); // set a breakpoint here, and step over
          System.out.println(); // press Resume when suspended here
          double[] ySeries = new double[100_000];
          for (int i = 0; i < ySeries.length; i++) {
              ySeries[i] = Math.sin(i / 20) * Math.pow(10, -21);
          }
          long startTime = System.currentTimeMillis();
          for (int i = 0; i < ySeries.length; i++) {
              String.valueOf(ySeries[i]);
          }
          System.out.println("elapsed time: " + (System.currentTimeMillis() - startTime));
      }
  }


Steps to Reproduce:
1. Compile
  /usr/lib/jvm/java-17/bin/javac StepOver.java

2. Debug the following snippet with jdb
  /usr/lib/jvm/java-17/bin/jdb StepOver

3. Set a breakpoint at StepOver:3
  stop at StepOver:3

4. Run the snippet
  run StepOver

5. Step over the breakpoint:
  step over

6. Continue execution:
  cont 

Actual results:

Observe the snippet takes 8+ seconds to finish (HP Z640 workstation).

Expected results:

The snippet doesn't take much longer to finish than without stepping over during debugging.

Additional info:

The bug was introduced with this commit: https://github.com/openjdk/jdk11u/commit/dbb9eb5b9c3ddcbcfaae8ac4ca21760c9c99a15c

Specifically this change "fixes" the performance downgrade:

diff --git a/src/hotspot/share/prims/jvmtiEventController.cpp b/src/hotspot/share/prims/jvmtiEventController.cpp
index 38de6383cc..22d0f40c97 100644
--- a/src/hotspot/share/prims/jvmtiEventController.cpp
+++ b/src/hotspot/share/prims/jvmtiEventController.cpp
@@ -523,7 +523,7 @@ JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *state) {
     JvmtiEnvThreadStateIterator it(state);
     for (JvmtiEnvThreadState* ets = it.first(); ets != NULL; ets = it.next(ets)) {
       any_env_enabled |= recompute_env_thread_enabled(ets, state);
-      has_frame_pops |= ets->has_frame_pops();
+      //has_frame_pops |= ets->has_frame_pops();
     }
   }


In particular, the interpreter only mode is entered after step over and is never left after this change. Whereas before the change, the interpreter only mode is left after the step over handling is done.

Note that execution directly after this change takes 27+ seconds. There is another change that seems to help with the performance (bringing it down to 8+ seconds):

https://github.com/openjdk/jdk11u/commit/e221522f106318175cfc6f9b38542670656a4cd8

Output with Java 17, using jdb with step over:

$ /usr/lib/jvm/java-11/bin/jdb StepOver
Initializing jdb ...
> stop at StepOver:3
Deferring breakpoint StepOver:3.
It will be set after the class is loaded.
> run StepOver
run  StepOver
Set uncaught java.lang.Throwable
Set deferred uncaught java.lang.Throwable
> 
VM Started: Set deferred breakpoint StepOver:3

Breakpoint hit: "thread=main", StepOver.main(), line=3 bci=0
3            System.out.println(); // set a breakpoint here, and step over

main[1] list
1    public class StepOver {
2        public static void main(String[] args) throws Exception {
3 =>         System.out.println(); // set a breakpoint here, and step over
4            System.out.println(); // press Resume when suspended here
5            double[] ySeries = new double[100_000];
6            for (int i = 0; i < ySeries.length; i++) {
7                ySeries[i] = Math.sin(i / 20) * Math.pow(10, -21);
8            }
9            long startTime = System.currentTimeMillis();                                                                                                                                                          
10            for (int i = 0; i < ySeries.length; i++) {                                                                                                                                                           
main[1] step over                                                                                                                                                                                                  
>                                                                                                                                                                                                                  
Step completed: "thread=main", jdk.internal.misc.Unsafe.arrayBaseOffset(), line=1,056 bci=1                                                                                                                        
                                                                                                                                                                                                                   
main[1] cont                                                                                                                                                                                                       
>                                                                                                                                                                                                                  
                                                                                                                                                                                                                   
elapsed time: 8241                                                                                                                                                                                                 

The application exited


Output with Java 17, without using step over:

$ /usr/lib/jvm/java-11/bin/jdb StepOver
Initializing jdb ...
> stop at StepOver:3
Deferring breakpoint StepOver:3.
It will be set after the class is loaded.
> run StepOver
run  StepOver
Set uncaught java.lang.Throwable
Set deferred uncaught java.lang.Throwable
> 
VM Started: Set deferred breakpoint StepOver:3

Breakpoint hit: "thread=main", StepOver.main(), line=3 bci=0
3            System.out.println(); // set a breakpoint here, and step over

main[1] list
1    public class StepOver {
2        public static void main(String[] args) throws Exception {
3 =>         System.out.println(); // set a breakpoint here, and step over
4            System.out.println(); // press Resume when suspended here
5            double[] ySeries = new double[100_000];
6            for (int i = 0; i < ySeries.length; i++) {
7                ySeries[i] = Math.sin(i / 20) * Math.pow(10, -21);
8            }
9            long startTime = System.currentTimeMillis();
10            for (int i = 0; i < ySeries.length; i++) {
main[1] cont
> 

elapsed time: 133

The application exited


The output and execution time are essentially the same with Java 17.

Comment 1 Simeon Andreev 2022-10-31 18:38:07 UTC
See also: https://bugs.openjdk.org/browse/JDK-8229012

Comment 2 Simeon Andreev 2022-11-01 09:07:32 UTC
From the serviceability-dev mailing list:

> Hi guys,
>
> I looked into into this one more time and agree with Chris.
> I believe JDK-8187289 is correct and it fixed NotifyFramePop bug (the
> fix includes regtest which shows the problem).
> In the scenario described after calling NotifyFramePop we have to remain
> in interp_only mode until the frame is popped from the stack to properly
> clear the request.
> Described performance issue is exactly what is JDK-8229012 about.
> I suppose it's possible to fix it in the debugger level by eliminating
> NotifyFramePop call.
> 
> regards,
> Alex

So I assume to fix this bug, https://bugs.openjdk.org/browse/JDK-8229012 must be fixed.

Comment 3 Andrew Dinn 2022-11-10 17:44:28 UTC
> So I assume to fix this bug, https://bugs.openjdk.org/browse/JDK-8229012 must be fixed.

Probably. I have been following this upstream and am sure Chris knows what he is doing. I will update here as Chris makes further progress.

Comment 5 Andrew Dinn 2023-02-06 16:17:33 UTC
Issue JDK-8229012 is not currently progressing upstream, nor is the related issue JDK-6960970 whose fix Chris Plummer expected would also resolve the former bug.

There is perhaps still some prospect for an independent resolution of this specific bug, as opposed to a more general one for JDK-6960970. Chris mentioned using a different mechanism than NotifyFramePop to track completion of the call. However, that would require non-trivial changes which i expect it would be hard to justify backporting to jdk17, especially if it involved adding compiler support and return barrier support as mentioned in JDK-6960970.

I'm glad to hear that applying the event controller patch you propose above which discounts some frame pop event enabled settings is allowing your test case to complete without causing any follow-on problems. However, I don't see how this is guaranteed to be a correct solution for all cases. Do you have any reason to believe it is correct to other than the heuristic that it fixes this specific test case?

Comment 6 Simeon Andreev 2023-02-07 07:32:52 UTC
(In reply to Andrew Dinn from comment #5)
> I'm glad to hear that applying the event controller patch you propose above
> which discounts some frame pop event enabled settings is allowing your test
> case to complete without causing any follow-on problems. However, I don't
> see how this is guaranteed to be a correct solution for all cases. Do you
> have any reason to believe it is correct to other than the heuristic that it
> fixes this specific test case?

I assume you mean this change:

diff --git a/src/hotspot/share/prims/jvmtiEventController.cpp b/src/hotspot/share/prims/jvmtiEventController.cpp
index 38de6383cc..22d0f40c97 100644
--- a/src/hotspot/share/prims/jvmtiEventController.cpp
+++ b/src/hotspot/share/prims/jvmtiEventController.cpp
@@ -523,7 +523,7 @@ JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *state) {
     JvmtiEnvThreadStateIterator it(state);
     for (JvmtiEnvThreadState* ets = it.first(); ets != NULL; ets = it.next(ets)) {
       any_env_enabled |= recompute_env_thread_enabled(ets, state);
-      has_frame_pops |= ets->has_frame_pops();
+      //has_frame_pops |= ets->has_frame_pops();
     }
   }

We have decided to live with whatever bugs are fixed in https://bugs.openjdk.org/browse/JDK-8187289 as Java 7 and Java 8 had them. The performance degradation from the bug fix is too big for our application.

Hopefully a fix is done in the JDK so we don't have to patch out this change.

Comment 7 Andrey Loskutov 2023-02-07 09:23:29 UTC
(In reply to Andrew Dinn from comment #5) 
> I'm glad to hear that applying the event controller patch you propose above
> which discounts some frame pop event enabled settings is allowing your test
> case to complete without causing any follow-on problems. However, I don't
> see how this is guaranteed to be a correct solution for all cases. Do you
> have any reason to believe it is correct to other than the heuristic that it
> fixes this specific test case?

To be clear, we do NOT have any patch, we've just reverted fix from https://bugs.openjdk.org/browse/JDK-8187289 for our JDK build.

Whatever was fixed there seem to be of lower impact to us as the fix/performance regression caused by the fix of JDK-8187289.

So we still have no real solution for the performance degradation and still wait for a proper fix of https://bugs.openjdk.org/browse/JDK-8229012.

Comment 8 Andrew Dinn 2023-02-09 10:49:21 UTC
(In reply to Andrey Loskutov from comment #7)
> To be clear, we do NOT have any patch, we've just reverted fix from
> https://bugs.openjdk.org/browse/JDK-8187289 for our JDK build.

To be more precise, you cherry-picked one line from that patch to revert. That's not the same as reverting the whole patch and, indeed, is arguably much more risky since it creates a situation that is neither the status quo before the related bug was patched nor the subsequent status determined by the developer of the patch to be an improvement on that original status quo.

I would not recommend taking such a step without being able to articulate with, at the least, a fair degree of clarity and certainty how this might

  i) limit the corrective measures the patch was intended to produce
  ii) reinstate the problem the patch was meant to fix and
  iii) risk the introduction of other problems.

> Whatever was fixed there seem to be of lower impact to us as the
> fix/performance regression caused by the fix of JDK-8187289.

How do you know that? The patch explicitly enables propagation of pop frame events through to the logic which decides whether to enter interpreter mode. Disabling propagation might well fit this case. What cases might it not fit? How can you be sure this cannot backfire or, if it can, that the outcome will be of lower impact?

> So we still have no real solution for the performance degradation and still
> wait for a proper fix of https://bugs.openjdk.org/browse/JDK-8229012.

I am afraid so. It does appear from the explanation provided by Chris Plummer in JDK-8229012 that root problem here, i.e. request for a pop frame notify in the initState() call, is almost always redundant and certainly is so in this case. That is very unfortunate. However, there is a case where it is not redundant (single stepping at a return) and I assume that is why it is still present.

Comment 10 Simeon Andreev 2023-02-09 11:20:42 UTC
> To be more precise, you cherry-picked one line from that patch to revert.

No, we have reverted the entire commit:

    Revert "8187289: NotifyFramePop request is not cleared if JVMTI_EVENT_FRAME_POP is disabled"
    
    This reverts commit dbb9eb5b9c3ddcbcfaae8ac4ca21760c9c99a15c.

> How do you know that?

The performance loss of entering interpreter only mode was deemed not acceptable by product owners of relevant areas in our product. Its great that a bug was fixed in the debugger. But the user cannot resort to debugging due to their code running for 10+ minutes instead of 1 minute. It was decided to live with debugging and maybe running into trouble, instead of not being able to debug.

Comment 11 Andrew Dinn 2023-02-09 11:30:47 UTC
(In reply to Simeon Andreev from comment #10)
> > To be more precise, you cherry-picked one line from that patch to revert.
> 
> No, we have reverted the entire commit:
> 
>     Revert "8187289: NotifyFramePop request is not cleared if
> JVMTI_EVENT_FRAME_POP is disabled"
>     
>     This reverts commit dbb9eb5b9c3ddcbcfaae8ac4ca21760c9c99a15c.

My apologies for misunderstanding the comment in the initial report which only mentioned this one line.

Comment 12 Andrey Loskutov 2023-02-28 12:40:32 UTC
(In reply to Andrew Dinn from comment #5)
> Issue JDK-8229012 is not currently progressing upstream, nor is the related
> issue JDK-6960970 whose fix Chris Plummer expected would also resolve the
> former bug.
> 
> There is perhaps still some prospect for an independent resolution of this
> specific bug, as opposed to a more general one for JDK-6960970. Chris
> mentioned using a different mechanism than NotifyFramePop to track
> completion of the call. 

You mean this comment? 
https://bugs.openjdk.org/browse/JDK-8229012?focusedCommentId=14474180&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14474180

> However, that would require non-trivial changes
> which i expect it would be hard to justify backporting to jdk17, especially
> if it involved adding compiler support and return barrier support as
> mentioned in JDK-6960970.

Let ask it differently: do we have a chance to get working patch for *any* future Java version, and what is the estimated time frame for that?

We are kind of forced to update to new Java 17 version to get a crash fix for https://bugzilla.redhat.com/show_bug.cgi?id=2162365, but that would mean that we will not be able to use "official" RH build, and patch the JDK manually again to get the offending commit dbb9eb5b9c3ddcbcfaae8ac4ca21760c9c99a15c out of "our" JDK 17 build.

Are there any chances to get *some* solution in a Java 17 (which is still latest LTS release as of today)?

Comment 13 Andrew Dinn 2023-03-02 10:53:23 UTC
(In reply to Andrey Loskutov from comment #12)
> Let ask it differently: do we have a chance to get working patch for *any*
> future Java version, and what is the estimated time frame for that?
> 
> We are kind of forced to update to new Java 17 version to get a crash fix
> for https://bugzilla.redhat.com/show_bug.cgi?id=2162365, but that would mean
> that we will not be able to use "official" RH build, and patch the JDK
> manually again to get the offending commit
> dbb9eb5b9c3ddcbcfaae8ac4ca21760c9c99a15c out of "our" JDK 17 build.
> 
> Are there any chances to get *some* solution in a Java 17 (which is still
> latest LTS release as of today)?

I am afraid there is no clear-cut answer to that:

1) There is a high chance that one of the upstream Oracle devs responsible for this part of OpenJDK will provide a fix for this problem in the dev branch. Red Hat does not have staff who are actively developing this part of the code base so we are unlikely to come up with a patch without the co-operation of Oracle devs. We cannot specify a time frame for arrival of any patch as those same devs are very busy with many other tasks and this has not been ranked as a high priority issue (P3 -- lowest is P4). So, it might arrive in jdk21 but it might arrive much later.

2) If a patch were to be provided in jdk21 (or even the non-LTS releases jdk22-24) then it might well be possible to backport it to jdk17u. There would certainly be a much stronger case for a backport to 17u than to 11u or 8u as the case for stabilising a backport branch grows significantly as further LTS releases intervene between it and head. There is really no way to quantify the likelihood in the absence of a specific patch. Any decision would require an assessment of the benefit the patch might provide to one or more users traded off against the actual possibility or potential risk of it negatively affecting existing deployments. Clearly, if a patch causes known changes in behaviour other than just to fix the reported problem than that could well count against a backport (n.b. there have been rare cases where the broad usefulness of a backport has dominated). On the other hand, even where there is no evidence of any behavioural change a patch might be rejected because it involves non-trivial modifications to the code or the code design, with a too high consequent risk of accidentally injecting new defects.

Comment 14 RHEL Program Management 2023-09-12 23:31:10 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 15 RHEL Program Management 2023-09-12 23:31:53 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated.  Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.


Note You need to log in before you can comment on or make changes to this bug.