Bug 1284948
Summary: | Using jdb triggers OOME on the debugged application | |
---|---|---|---
Product: | Red Hat Enterprise Linux 7 | Reporter: | Ingo Weiss <iweiss>
Component: | java-1.8.0-openjdk | Assignee: | Severin Gehwolf <sgehwolf>
Status: | CLOSED ERRATA | QA Contact: | Lukáš Zachar <lzachar>
Severity: | high | Docs Contact: |
Priority: | high | |
Version: | 7.0 | CC: | ahughes, cww, dbhole, iweiss, jvanek, leiyu, loskutov, salmy, sgehwolf
Target Milestone: | rc | Keywords: | Reopened, ZStream
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | java-1.8.0-openjdk-1.8.0.121-2.b13.el7 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | |
: | 1442162 | Environment: |
Last Closed: | 2017-08-01 08:46:49 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1298243, 1313485, 1390370, 1442162, 1463144 | |
Attachments: | | |

Description
Ingo Weiss
2015-11-24 14:07:06 UTC
Created attachment 1098224
Test case
Created attachment 1098225
Proposed patch
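The attached test case itself is not reproduced on this page. As a rough illustration only, the sketch below guesses at its shape from the jdb transcripts in the comments: a loop that prints heap usage, allocates a 3,000,000-element array, prints usage again and then drops the reference, plus a static dummy method that the debugger invokes with the array as its argument. The class name LeakReproSketch, the helper formatUsage, the number of rounds and the explicit System.gc() call are all assumptions; the transcripts show that the real class is called Main.

```java
import java.util.Locale;

// Hypothetical reconstruction of the reproducer; the real attachment is a class named Main.
class LeakReproSketch {
    // Static so a debugger can pass it to dummy() without needing local variable info.
    static LeakReproSketch[] arr;

    // Target for the debugger's "eval": it only needs to receive the array as an argument.
    static void dummy(LeakReproSketch[] a) {
        System.out.println("Called dummy");
    }

    // Formats heap figures in the same shape as the transcript lines.
    static String formatUsage(long total, long free) {
        return String.format(Locale.US, "Used: %,d, free: %,d, total: %,d",
                total - free, free, total);
    }

    static void printUsage() {
        Runtime rt = Runtime.getRuntime();
        System.gc(); // assumption: request a collection so unreachable arrays are not counted
        System.out.println(formatUsage(rt.totalMemory(), rt.freeMemory()));
    }

    public static void main(String[] args) {
        int rounds = 5; // assumption: the real iteration count is not shown in the report
        for (int i = 0; i < rounds; i++) {
            printUsage();                        // low usage expected
            arr = new LeakReproSketch[3000000];  // the allocation that eventually fails
            printUsage();                        // high usage while arr is reachable
            arr = null;                          // breakpoint here, then eval dummy(arr) and cont
        }
        printUsage();
    }
}
```

To mimic the transcripts, one would run this with a small maximum heap (the logs show a total of about 40 MB; the exact JVM flags are not shown), stop at the line that sets arr to null, evaluate LeakReproSketch.dummy(arr) in the debugger and continue. On an affected JVM the array passed to the evaluated method stays reachable, so a later allocation runs out of heap.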
Assigning to Andrew to review/integrate the patch.

This bug is not similar to https://bugs.openjdk.java.net/browse/JDK-4858370, it *is* this bug.

This is not related to jdb alone; jdb was used in the reproducer because the effect is easy to demonstrate with it. The same effect happens if one debugs a JVM from Eclipse (which uses the same API as jdb) and evaluates any methods there in the "Display" view. All objects and all arguments used for evaluation are leaked.

The same effect also happens if we use the JDWP API to get/set values in the debuggee JVM from our IDE code to reflect changes the user made to test specifications. This was the original trigger for this request, because using the JDWP API causes an OOM in the application while accessing "bigger" data objects or "root" objects holding references to many other objects. Since references to accessed objects are never released, the application dies sooner or later with an OOM.

While this could usually be treated as nasty but not that critical (restart the app and the problem is solved), the severity is critical for our application, where the main use case is to allow continuous debugging of semiconductor chip tests. Restarting the application only because we leak memory is highly undesirable because the test initialization costs a lot of time. Crashing with an OOM while the chips are connected can leave the system in an unpredictable electrical state, causing hardware failures.

What is the origin of this patch?

The issue has not been fixed in OpenJDK (invoker.c is unchanged all the way up to OpenJDK 9) so, for upstream OpenJDK, it will need to go to 9 first then all the way back to 8, 7 and 6.

I ran the reproducer on 6, 7 and 8 and all throw an OutOfMemoryError after the cont invocation:

JDK: /usr/lib/jvm/icedtea-7
java version "1.7.0_91"
OpenJDK Runtime Environment (IcedTea 2.6.3) (Gentoo icedtea-7.2.6.3)
OpenJDK 64-Bit Server VM (build 24.91-b01, mixed mode)

Default execution without jdb:
Used: 440,536, free: 39,929,640, total: 40,370,176
Used: 24,663,488, free: 15,706,688, total: 40,370,176
Used: 286,496, free: 40,083,680, total: 40,370,176
Used: 24,727,064, free: 15,643,112, total: 40,370,176
Used: 285,584, free: 40,084,592, total: 40,370,176
Used: 24,505,848, free: 15,864,328, total: 40,370,176
Used: 285,584, free: 40,084,592, total: 40,370,176
Used: 24,505,848, free: 15,864,328, total: 40,370,176
Used: 285,584, free: 40,084,592, total: 40,370,176
Used: 24,505,848, free: 15,864,328, total: 40,370,176
Used: 285,584, free: 40,084,592, total: 40,370,176
Used: 285,584, free: 40,084,592, total: 40,370,176

After jdb starts, execute the commands below:
eval Main.dummy(arr)
cont
... repeat two steps above until JVM terminates with OOM ...

Starting same program with jdb now:
Listening for transport dt_socket at address: 8000
Set uncaught java.lang.Throwable
Set deferred uncaught java.lang.Throwable
Initializing jdb ...
*** Reading commands from /home/andrew/projects/openjdk/tests/OOMonDebug/.jdbrc
> VM Started: No frames on the current call stack
main[1] Deferring breakpoint Main:12.
It will be set after the class is loaded.
main[1] > > Set deferred breakpoint Main:12
Used: 881,304, free: 39,488,872, total: 40,370,176
Used: 25,109,760, free: 15,260,416, total: 40,370,176
Breakpoint hit: "thread=main", Main.main(), line=12 bci=25
12 arr = null;
main[1] eval Main.dummy(arr)
Called dummy
Main.dummy(arr) = <void value>
main[1] cont
> Used: 12,289,696, free: 28,080,480, total: 40,370,176
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid27871.hprof ...
Heap dump file created [49067140 bytes in 0.141 secs]
Exception occurred: java.lang.OutOfMemoryError (uncaught)"thread=main", Main.main(), line=10 bci=18
10 arr = new Main[3000000];
main[1] eval Main.dummy(arr)
Called dummy
Main.dummy(arr) = <void value>
main[1] cont
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at Main.main(Main.java:10)
> The application exited

(In reply to Andrew John Hughes from comment #5)
> What is the origin of this patch?

It was hand crafted by my colleague while he was trying to nail down the memory leak in our application :-) The base for the patch was the hg forest from http://hg.openjdk.java.net/jdk7u/jdk7u. The patch code itself was inspired by the proposed patch from https://bugs.openjdk.java.net/browse/JDK-4858370.

> The issue has not been fixed in OpenJDK
> (invoker.c is unchanged all the way up to OpenJDK 9) so, for upstream
> OpenJDK, it will need to go to 9 first then all the way back to 8, 7 and 6.
> I ran the reproducer on 6, 7 and 8 and all throw an OutOfMemoryError after
> the cont invocation:

Nice to see the reproducer works reliably for each Java release :-)

Ok, are you signatories of the Oracle Contributor Agreement? [0] I can't submit this patch upstream unless this is the case. I'll need to rebase it on OpenJDK 9 and refactor it a little before sending it upstream.

[0] http://openjdk.java.net/contribute/

Neither I nor my colleague is a signatory of the Oracle Contributor Agreement, so it will take some time. I guess you need it for the patch only, not for the reproducer test case?

Well, it depends if we intend to include the reproducer. In its current form, it seems unlikely as it can be automated as part of the existing OpenJDK regression tests.

I am no signatory of the Oracle Contributor Agreement Andrew. Am I or Andrey required to be in this case?

(In reply to Ingo Weiss from comment #11)
> I am no signatory of the Oracle Contributor Agreement Andrew. Am I or Andrey
> required to be in this case?

The authors of the patch need to be signatories so it can be submitted to the OpenJDK project. Ingo, as a Red Hat employee, you're covered by our corporate OCA (http://www.oracle.com/technetwork/community/oca-486395.html#r) but anyone else who contributed to the patch will need to sign, if they haven't already.

Can this issue be investigated without the OCA concern if I remove the proposed patch from this bug report?

I've asked for a review of my proposed patch for JDK 9 on upstream's serviceability-dev list: http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-March/019155.html

This has been pushed to JDK 9 upstream[1]. I'll ask for an 8 backport shortly.

[1] http://hg.openjdk.java.net/jdk9/hs-rt/jdk/rev/277d7584fa03

Upstream JDK 8 backport request posted: http://mail.openjdk.java.net/pipermail/jdk8u-dev/2016-March/005216.html

This has been pushed to JDK 8 upstream[1]. I'll move on to a 7 backport soon.

[1] http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/rev/4548a4525cb9

(In reply to Severin Gehwolf from comment #26)
> This has been pushed to JDK 8 upstream[1].

Thanks! Can you please add a note here in which 1.8.x OpenJDK release the fix will be visible?

A best guess would be u102, but Oracle haven't yet updated their documentation, following the u77 update: http://openjdk.java.net/projects/jdk8u/

Upstream JDK 7 backport request posted: http://mail.openjdk.java.net/pipermail/jdk7u-dev/2016-April/010499.html

The JDK 9 fix got backed out due to test regressions. Same for JDK 8 and 7. The new upstream bug is JDK-8153711 and I've posted an updated patch for review here: http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019410.html

Removed from RPM.

Development Management has reviewed and declined this request. You may appeal this decision by reopening this request.

(In reply to RHEL Product and Program Management from comment #36)
> Development Management has reviewed and declined this request.
> You may appeal this decision by reopening this request.

I guess this is related to the Java 7 backport only?

(In reply to Andrey Loskutov from comment #37)
> (In reply to RHEL Product and Program Management from comment #36)
> > Development Management has reviewed and declined this request.
> > You may appeal this decision by reopening this request.
>
> I guess this is related to the Java 7 backport only?

Sorry, no we are deferring this to 7.4 given upstream slowness. I did not mean to have the bug closed. Re-opening. Severin is continuing to work with upstream as fast as possible.

(In reply to Deepak Bhole from comment #38)
> (In reply to Andrey Loskutov from comment #37)
> > (In reply to RHEL Product and Program Management from comment #36)
> > > Development Management has reviewed and declined this request.
> > > You may appeal this decision by reopening this request.
> >
> > I guess this is related to the Java 7 backport only?
>
> Sorry, no we are deferring this to 7.4 given upstream slowness. I did not
> mean to have the bug closed. Re-opening. Severin is continuing to work with
> upstream as fast as possible.

There shouldn't be an issue with the 7 backport. Indeed, we got as far as having it in and ready to go, but then regressions were found with the original version of the patch in OpenJDK 9. We're waiting on those to be resolved upstream before backporting the updated version to 7 & 8.

FYI, a fix for this (the redo) has been pushed to JDK 9 today: http://hg.openjdk.java.net/jdk9/hs/jdk/rev/4c843eb35b8a

(In reply to Severin Gehwolf from comment #41)
> FYI, a fix for this (the redo) has been pushed to JDK 9 today:
> http://hg.openjdk.java.net/jdk9/hs/jdk/rev/4c843eb35b8a

Thanks! But I hope there are still plans to backport this to JDK 8?

(In reply to Andrey Loskutov from comment #42)
> (In reply to Severin Gehwolf from comment #41)
> > FYI, a fix for this (the redo) has been pushed to JDK 9 today:
> > http://hg.openjdk.java.net/jdk9/hs/jdk/rev/4c843eb35b8a
>
> Thanks! But I hope there are still plans to backport this to JDK 8?

The plan is to get it backported to 8 and then 7. In general, it's preferred to wait some time before starting the backport process into a stable release. That is to see whether a patch causes regressions elsewhere. Since the initial attempt was problematic already, I'll wait for a couple of overnight test cycles in 9 before asking for backports.

(In reply to Severin Gehwolf from comment #43)
> The plan is to get it backported to 8 and then 7.

In case it is relevant and can save you time: while our initial request was to backport this patch to 7, we have meanwhile moved to 8 and so do not require a backport to 7 anymore.

> In general, it's preferred
> to wait some time before starting the backport process into a stable
> release. That is to see whether a patch causes regressions elsewhere.

Sure.

(In reply to Andrey Loskutov from comment #44)
> (In reply to Severin Gehwolf from comment #43)
> > The plan is to get it backported to 8 and then 7.
>
> In case it is relevant and can save your time, while our initial request was
> to backport this patch to 7, we've moved meanwhile to the 8 and so do not
> require backport to 7 anymore.

OK. Thanks for the heads-up.

Resetting component to JDK 8 as per comment 44.

So far no regressions showed up in JDK 9. JDK 8 backport requested upstream today: http://mail.openjdk.java.net/pipermail/jdk8u-dev/2016-October/006002.html

Patch backported to 8u upstream: http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/rev/3243d893b0b2

I'll add it to the RPM once we have the ACKs.

Are any more changes required for it to work with 7u?

(In reply to Andrew John Hughes from comment #53)
> Are any more changes required for it to work with 7u?

No.
The JDK 8 patch will work.

Seems to have been pushed back to 8u152 (January 2018???)
https://bugs.openjdk.java.net/browse/JDK-8153711

(In reply to Andrew John Hughes from comment #57)
> Seems to have been pushed back to 8u152 (January 2018???)
>
> https://bugs.openjdk.java.net/browse/JDK-8153711

> Fixed In Version: java-1.8.0-openjdk-1.8.0.121-2.b13.el7

@Andrew, Severin: is the issue now fixed in OpenJDK or not? The two comments above seem to contradict each other. Thanks!

(In reply to Andrey Loskutov from comment #65)
> (In reply to Andrew John Hughes from comment #57)
> > Seems to have been pushed back to 8u152 (January 2018???)
> >
> > https://bugs.openjdk.java.net/browse/JDK-8153711
>
> > Fixed In Version: java-1.8.0-openjdk-1.8.0.121-2.b13.el7
>
> @Andrew, Severin: is the issue now fixed in openJDK or not? The two comments
> above seem to contradict each other. Thanks!

It's been fixed in upstream JDK 8 for a while now. The fix is scheduled to be included in RHEL 7.4 for java-1.8.0-openjdk (no guarantees, though). Incidentally, JDK 7, specifically - java-1.7.0-openjdk-1.7.0.131-2.6.9.0.el7_3 - should have the patch too, since it's IcedTea 2.6.9 based[1]. That package should have been released already.

[1] http://blog.fuseyism.com/index.php/2017/02/14/security-icedtea-2-6-9-for-openjdk-7-released/

(In reply to Severin Gehwolf from comment #66)
> It's been fixed in upstream JDK 8 for a while now. The fix is scheduled to
> be included in RHEL 7.4 for java-1.8.0-openjdk (no guarantees, though).

Thanks for the fast response! Since we are on 7.2 and don't plan to move to 7.4 in the near future, we don't want to wait for RHEL 7.4. Can you please explain to me: is there a java-1.8.0-openjdk rpm package available (provided by RH) containing the fix? Which minimum version of java-1.8.0-openjdk should we try to get the fix?

To be more precise, I see today only
1) https://access.redhat.com/downloads/content/java-1.8.0-openjdk/1.8.0.121-0.b13.el7_3/x86_64/fd431d51/package available for RHEL 7.3 and
2) https://access.redhat.com/downloads/content/java-1.8.0-openjdk/1.8.0.121-1.b13.el6/x86_64/fd431d51/package available for RHEL 6.x.
Do those builds contain the fix for this bug, and which of them can be used on RHEL 7.2?

There is no generally available java-1.8.0-openjdk package with the fix yet. If there was, this bug would have a status of CLOSED ERRATA.

(In reply to Severin Gehwolf from comment #69)
> There is no generally available java-1.8.0-openjdk package with the fix yet.
> If there was, this bug would have a status of CLOSED ERRATA.

Severin, thanks again, but this was not the question. I *have* access to the RH network and *could* use the packages mentioned above, but the questions are: 1) do they contain the fix and 2) can they be used on RHEL 7.2?

(In reply to Andrey Loskutov from comment #71)
> (In reply to Severin Gehwolf from comment #69)
> > There is no generally available java-1.8.0-openjdk package with the fix yet.
> > If there was, this bug would have a status of CLOSED ERRATA.
>
> Severin, thanks again, but this was not the question. I *have* access to the
> RH network and *could* use packages mentioned above, but the questions are:
> 1) do they contain the fix

java-1.8.0-openjdk-1.8.0.121-2.b13.el7 and newer have the fix. If you are able to get those builds, then be aware those are unofficial builds. They'll still have to go through our testing machinery. I've verified that java-1.8.0-openjdk-1.8.0.121-2.b13.el7 and better pass the upstream OomDebugTest. That's about as good as it gets at this point.

> and 2) can they be used on RHEL 7.2?

Not sure what answer you are looking for. It's not recommended. All I can say is a) you'd be using non-released builds b) you'd run newer builds on an older version of RHEL. I.e. 7.4 build on 7.2 RHEL. You can use them, but you'd be sort of on your own. Neither a) nor b) is a recommended practice.

(In reply to Severin Gehwolf from comment #72)
> still have to go through our testing machinery. I've verified that
> java-1.8.0-openjdk-1.8.0.121-2.b13.el7 and better pass the upstream
> OomDebugTest. That's about as good as it gets at this point.

Thanks!

> > and 2) can they be used on RHEL 7.2?
>
> Not sure what answer you are looking for. It's not recommended. All I can
> say is a) you'd be using non-released builds b) you'd run newer builds on an
> older version of RHEL. I.e. 7.4 build on 7.2 RHEL. You can use them, but
> you'd be sort of on your own. Neither of a) or b) are recommended practices.

A-ha, good to know. Do you know if a build for RHEL 7.2 is planned, or we are deemed to use not supported rpm (if we can install it at all) in order to get the fix? In case you need an official case number, it is CASE 01543682.

(In reply to Andrey Loskutov from comment #73)
> > > and 2) can they be used on RHEL 7.2?
> >
> > Not sure what answer you are looking for. It's not recommended. All I can
> > say is a) you'd be using non-released builds b) you'd run newer builds on an
> > older version of RHEL. I.e. 7.4 build on 7.2 RHEL. You can use them, but
> > you'd be sort of on your own. Neither of a) or b) are recommended practices.
>
> A-ha, good to know. Do you know if a build for RHEL 7.2 is planned, or we
> are deemed to use not supported rpm (if we can install it at all) in order
> to get the fix? In case you need an official case number, it is CASE
> 01543682.

7.3 is current RHEL 7. 7.4 is an upcoming release which will have the fix. There are no builds planned for 7.3 (let alone 7.2). Perhaps there are options to get the fix through the support channel faster, but that's beyond me.

Hi Andrey, the 7.3 RPMs should work on 7.2 as well in theory and are worth a try.

(In reply to Deepak Bhole from comment #75)
> Hi Andrey, the 7.3 RPMs should work on 7.2 as well in theory and are worth a
> try.

Thanks, I will try it.

Great!

Requesting z-stream

(In reply to Severin Gehwolf from comment #72)
> (In reply to Andrey Loskutov from comment #71)
> > (In reply to Severin Gehwolf from comment #69)
> > > There is no generally available java-1.8.0-openjdk package with the fix yet.
> > > If there was, this bug would have a status of CLOSED ERRATA.
> >
> > Severin, thanks again, but this was not the question. I *have* access to the
> > RH network and *could* use packages mentioned above, but the questions are:
> > 1) do they contain the fix
>
> java-1.8.0-openjdk-1.8.0.121-2.b13.el7 and newer have the fix.

Unfortunately I still see the regression (just tried with the attached test).

rpm -qa | grep java-1.8.0-openjdk
java-1.8.0-openjdk-headless-1.8.0.121-0.b13.el7_3.x86_64
java-1.8.0-openjdk-debug-1.8.0.121-0.b13.el7_3.x86_64
java-1.8.0-openjdk-src-1.8.0.121-0.b13.el7_3.x86_64
java-1.8.0-openjdk-headless-debug-1.8.0.121-0.b13.el7_3.x86_64
java-1.8.0-openjdk-1.8.0.121-0.b13.el7_3.x86_64
java-1.8.0-openjdk-src-debug-1.8.0.121-0.b13.el7_3.x86_64
java-1.8.0-openjdk-devel-1.8.0.121-0.b13.el7_3.x86_64
java-1.8.0-openjdk-debuginfo-1.8.0.121-0.b13.el7_3.x86_64

and still see that the memory is not freed up:

$ ./runMainWithJdb.sh
openjdk version "1.8.0_121"
OpenJDK Runtime Environment (build 1.8.0_121-b13)
OpenJDK 64-Bit Server VM (build 25.121-b13, mixed mode)

Default execution without jdb:
Used: 435,840, free: 39,934,336, total: 40,370,176
Used: 24,645,608, free: 15,724,568, total: 40,370,176
Used: 269,936, free: 40,100,240, total: 40,370,176
Used: 24,479,712, free: 15,890,464, total: 40,370,176
Used: 269,336, free: 40,100,840, total: 40,370,176
Used: 24,479,112, free: 15,891,064, total: 40,370,176
Used: 269,336, free: 40,100,840, total: 40,370,176
Used: 24,479,112, free: 15,891,064, total: 40,370,176
Used: 269,336, free: 40,100,840, total: 40,370,176
Used: 24,479,112, free: 15,891,064, total: 40,370,176
Used: 269,336, free: 40,100,840, total: 40,370,176
Used: 269,336, free: 40,100,840, total: 40,370,176

After jdb starts, execute the commands below:
eval Main.dummy(arr)
cont
... repeat two steps above until JVM terminates with OOM ...

Starting same program with jdb now:
Listening for transport dt_socket at address: 8000
Set uncaught java.lang.Throwable
Set deferred uncaught java.lang.Throwable
Initializing jdb ...
*** Reading commands from /xxx/.jdbrc
VM Started: > No frames on the current call stack
main[1] Deferring breakpoint Main:12.
It will be set after the class is loaded.
main[1] > > Set deferred breakpoint Main:12
Used: 645,576, free: 39,724,600, total: 40,370,176
Used: 24,855,344, free: 15,514,832, total: 40,370,176
Breakpoint hit: "thread=main", Main.main(), line=12 bci=25
12 arr = null;
main[1] eval Main.dummy(arr)
Called dummy
Main.dummy(arr) = <void value>
main[1] cont
> Used: 12,274,088, free: 28,096,088, total: 40,370,176
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid11061.hprof ...
Heap dump file created [48897181 bytes in 0.052 secs]
Exception occurred: java.lang.OutOfMemoryError (uncaught)"thread=main", Main.main(), line=10 bci=18
10 arr = new Main[3000000];
main[1] exit
Exception in thread "main" Listening for transport dt_socket at address: 8000
java.lang.OutOfMemoryError: Java heap space
at Main.main(Main.java:10)

(In reply to Andrey Loskutov from comment #79)
> (In reply to Severin Gehwolf from comment #72)
> > (In reply to Andrey Loskutov from comment #71)
> > > (In reply to Severin Gehwolf from comment #69)
> > > > There is no generally available java-1.8.0-openjdk package with the fix yet.
> > > > If there was, this bug would have a status of CLOSED ERRATA.
> > >
> > > Severin, thanks again, but this was not the question. I *have* access to the
> > > RH network and *could* use packages mentioned above, but the questions are:
> > > 1) do they contain the fix
> >
> > java-1.8.0-openjdk-1.8.0.121-2.b13.el7 and newer have the fix.
>
> Unfortunately I still see the regression (just tried with the attached test).
>

Hi Andrey,

I am terribly sorry, I was unclear in my original comment (and misinformed about the status of this bug). The fix is not in 7.3 yet. Once we have approval on the request Andrew made this morning, we will have a clone of this bug for 7.3.z through which we will fix this as soon as we can. Once again, I apologize for the error on my part.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1831