| Summary: | systemtap jstack() support broken | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Mark Wielaard <mjw> |
| Component: | java-1.6.0-openjdk | Assignee: | jiri vanek <jvanek> |
| Status: | CLOSED ERRATA | QA Contact: | Lukáš Zachar <lzachar> |
| Severity: | high | | |
| Priority: | high | | |
| Version: | 6.3 | CC: | azelinka, dbhole, fche, psplicha |
| Target Milestone: | rc | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Fixed In Version: | java-1.6.0-openjdk-1.6.0.0-1.43.1.11.1 | Doc Type: | Bug Fix |
| Last Closed: | 2012-06-20 13:51:26 UTC | | |
| Bug Blocks: | 805204, 825824 | | |
Patch approved to be included in specfile

Sadly after a systemtap upgrade from systemtap-1.6-4.el6.x86_64 (RHEL 6.2) to systemtap-1.7-5.el6.x86_64 (RHEL 6.3 Beta) this has turned into a different bug:

Pass 1: parsed user script and 85 library script(s) using 198312virt/26832res/3028shr kb, in 110usr/10sys/132real ms.
Pass 2: analyzed script: 3 probe(s), 42 function(s), 3 embed(s), 16 global(s) using 426332virt/198740res/126360shr kb, in 960usr/100sys/1101real ms.
Pass 3: translated to C into "/tmp/stapi1CKcr/stap_4366_src.c" using 423224virt/77760res/6072shr kb, in 90usr/60sys/157real ms.
cc1: warnings being treated as errors
/tmp/stapi1CKcr/stap_4366_src.c: In function ‘function_print_jstack’:
/tmp/stapi1CKcr/stap_4366_src.c:5201: error: the frame size of 272 bytes is larger than 256 bytes
make[1]: *** [/tmp/stapi1CKcr/stap_4366_src.o] Error 1
make: *** [_module_/tmp/stapi1CKcr] Error 2
WARNING: make exited with status: 2
Pass 4: compiled C into "stap_4366.ko" in 6500usr/770sys/7448real ms.
Pass 4: compilation failed. Try again with another '--vp 0001' option.
Keeping temporary directory "/tmp/stapi1CKcr"

Just confirmed the bug from comment #9 is also in systemtap git trunk. Even slightly worse:

/tmp/stap4JLjXj/stap_10562_src.c: In function ‘function_print_jstack’:
/tmp/stap4JLjXj/stap_10562_src.c:5183: error: the frame size of 288 bytes is larger than 256 bytes

The generated frame size grew from 272 bytes to 288 bytes with systemtap git.

More confirming. Upstream systemtap 1.6 works fine against the new java-1.6.0-openjdk-devel-1.6.0.0-1.44.1.11.1.el6.x86_64 package jstack.stp, upstream systemtap 1.7 is broken (error: the frame size of 272 bytes is larger than 256 bytes).

Please try again with -DSTP_LEGACY_PRINT

(In reply to comment #12)
> Please try again with -DSTP_LEGACY_PRINT

That makes no difference (error: the frame size of 272 bytes is larger than 256 bytes)

I created a new bug against systemtap-1.7-5.el6.x86_64 for the issue mentioned in comment #9.
https://bugzilla.redhat.com/show_bug.cgi?id=825244

systemtap 1.7 trips over the added try-catch construct. This makes it "work" again:
--- jstack.stp.newpkg 2012-05-25 17:25:16.761481432 +0200
+++ /usr/share/systemtap/tapset/x86_64/jstack.stp 2012-05-25 17:31:41.809479955 +0200
@@ -311,7 +311,7 @@
// Some of this is "fuzzy" so catch any read error in case we
// "guessed" wrong.
- try
+ /* try */
{
// Do some sanity checking.
@@ -440,6 +440,7 @@
}
}
+/*
catch
{
// Some assumption above totally failed and we got an address
@@ -447,6 +448,7 @@
frame = sprintf("<unknown_frame@0x%x>", pc);
trust_fp = 0;
}
+*/
}
else
{
Sadly, that try-catch is precisely the part of the fix that is needed...
The catch part didn't come out well in the diff; here it is in full to show it isn't anything "fancy":
/*
catch
{
// Some assumption above totally failed and we got an address
// read error. Give up and mark frame pointer as suspect.
frame = sprintf("<unknown_frame@0x%x>", pc);
trust_fp = 0;
}
*/
Also confirmed the other way around. The old jstack.stp version "works" with systemtap 1.6, but adding just the try/catch (and none of the other tweaks between the old and new/fixed jstack.stp) makes it fail with error: the frame size of 272 bytes is larger than 256 bytes
--- jstack.stp 2012-05-25 17:21:22.375931710 +0200
+++ /usr/share/systemtap/tapset/x86_64/jstack.stp 2012-05-25 17:54:26.938477606 +0200
@@ -299,7 +299,7 @@
segments++;
}
block = CodeCache_low + (segment << CodeHeap_log2_segment_size);
-
+try {
// Do some sanity checking.
used = @cast(block, "HeapBlock",
"/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/server/libjvm.so")->_header->_used;
@@ -424,6 +424,13 @@
frame_size = @cast(blob, "CodeBlob",
"/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/server/libjvm.so")->_frame_size;
}
+} catch
+{
+ // Some assumption above totally failed and we got an address
+ frame = sprintf("<unknown_frame@0x%x>", pc);
+ trust_fp = 0;
+}
+
}
else
{
(In reply to comment #21)
> Also confirmed the other way around. The old jstack.stp version "works" with
> systemtap 1.6, but adding just the try/catch (and none of the other tweaks
> between the old and new/fixed jstack.stp) makes it fail with error: the frame
> size of 272 bytes is larger than 256 bytes

Sorry, that should say systemtap 1.7.

So to summarize, the additional try-catch is what makes it fail (produce the error) with systemtap 1.7, but the same addition of the try-catch is what fixes jstack.stp against systemtap 1.6 (where the new jstack.stp works just fine).

So then since RHEL-6 has 1.7, we need to remove the try/catch to fix this bug?

(In reply to comment #24)
> So then since RHEL-6 has 1.7, we need to remove the try/catch to fix this
> bug?

Except that would bring back the original bug that the try/catch is supposed to fix. But it seems tweaking the script a little, to make it a little less informative in case of other "issues", makes it work again with systemtap 1.7:

--- jstack.stp.newpkg 2012-05-25 17:25:16.761481432 +0200
+++ /usr/share/systemtap/tapset/x86_64/jstack.stp 2012-05-25 19:13:31.777479659 +0200
@@ -320,7 +320,7 @@
if (used != 1)
{
// Something very odd has happened.
- frame = sprintf("<unused_code_block@0x%x>", pc);
+ frame = "<unused_code_block>";
blob_name = "unused";
trust_fp = 0;
frame_size = 0;
@@ -444,7 +444,7 @@
{
// Some assumption above totally failed and we got an address
// read error. Give up and mark frame pointer as suspect.
- frame = sprintf("<unknown_frame@0x%x>", pc);
+ frame = "<unknown_frame>";
trust_fp = 0;
}
}

I can just confirm...

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHBA-2012-0836.html
Description of problem:
Using the jstack() systemtap support in java-1.6.0-openjdk sometimes breaks and aborts the stap script.

Version-Release number of selected component (if applicable):
java-1.6.0-openjdk-devel-1.6.0.0-1.43.1.10.6.el6_2.x86_64

How reproducible:
Almost always.

Steps to Reproduce:
1. Make sure to have systemtap, java-1.6.0-openjdk-devel and java-1.6.0-openjdk installed.
2. Have some simple java program around. The following Hello.java will do:

   public class Hello
   {
     public static void main(String args[])
     {
       System.out.println("Hello World!");
     }
   }

   javac Hello.java

3. Run some stap script that uses jstack(), e.g.:

   stap -v -e 'probe hotspot.jni.GetStringUTFChars { log(probestr); print_jstack_full(); log(" === "); }' -c 'java Hello' 2>&1 | c++filt

Actual results:
Some backtraces, but then...

ERROR: kernel read fault at 0x0000000000000018 (addr) near identifier '@cast' at /usr/share/systemtap/tapset/x86_64/jstack.stp:362:29
WARNING: Number of errors: 1, skipped probes: 0
Warning: /usr/bin/staprun exited with status: 1

Expected results:
No ERRORs. Lots more backtraces.

Additional info:
This is a bug in the jstack.stp tapset, already fixed upstream:
http://thread.gmane.org/gmane.comp.java.openjdk.distro-packaging.devel/17667