
Bug 804632

Summary: systemtap jstack() support broken
Product: Red Hat Enterprise Linux 6
Component: java-1.6.0-openjdk
Version: 6.3
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: Mark Wielaard <mjw>
Assignee: jiri vanek <jvanek>
QA Contact: Lukáš Zachar <lzachar>
CC: azelinka, dbhole, fche, psplicha
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Fixed In Version: java-1.6.0-openjdk-1.6.0.0-1.43.1.11.1
Doc Type: Bug Fix
Last Closed: 2012-06-20 13:51:26 UTC
Bug Blocks: 805204, 825824

Description Mark Wielaard 2012-03-19 13:35:28 UTC
Description of problem:

Using the jstack() systemtap support in java-1.6.0-openjdk sometimes
breaks and aborts the stap script.

Version-Release number of selected component (if applicable):

java-1.6.0-openjdk-devel-1.6.0.0-1.43.1.10.6.el6_2.x86_64

How reproducible:

Almost always.

Steps to Reproduce:
1. Make sure systemtap, java-1.6.0-openjdk-devel and java-1.6.0-openjdk are installed.

2. Have some simple java program around. The following Hello.java will do:

public class Hello
{
  public static void main(String args[])
  {
    System.out.println("Hello World!");
  }
}

javac Hello.java

3. Run some stap script that uses jstack(). e.g.

stap -v -e 'probe hotspot.jni.GetStringUTFChars { log(probestr); print_jstack_full(); log(" === "); }' -c 'java Hello' 2>&1 | c++filt
  
Actual results:

Some backtraces, but then...
ERROR: kernel read fault at 0x0000000000000018 (addr) near identifier '@cast' at /usr/share/systemtap/tapset/x86_64/jstack.stp:362:29
WARNING: Number of errors: 1, skipped probes: 0
Warning: /usr/bin/staprun exited with status: 1

Expected results:

No ERRORs. Lots more backtraces.

Additional info:

This is a bug in the jstack.stp tapset, already fixed upstream:
http://thread.gmane.org/gmane.comp.java.openjdk.distro-packaging.devel/17667
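
When the reproducer is run repeatedly, it can help to check a captured log for the failure shown under "Actual results" automatically. The following is a minimal sketch only: the function name has_read_fault and the log file name stap.log are hypothetical, and the pattern simply matches the ERROR line quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not part of systemtap or the tapset):
# exit 0 if the given log file contains the kernel read fault
# reported above, exit 1 otherwise.
has_read_fault() {
    grep -q 'ERROR: kernel read fault at 0x[0-9a-fA-F]* (addr)' "$1"
}

# Example use, assuming the stap output was saved to stap.log beforehand:
#   if has_read_fault stap.log; then echo "read fault reproduced"; fi
```

The check only looks for the error message itself, so it does not depend on how many backtraces were printed before the script aborted.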

Comment 7 jiri vanek 2012-03-26 13:40:23 UTC
Patch approved to be included in specfile

Comment 9 Mark Wielaard 2012-05-25 10:36:28 UTC
Sadly, after a systemtap upgrade from systemtap-1.6-4.el6.x86_64 (RHEL 6.2) to systemtap-1.7-5.el6.x86_64 (RHEL 6.3 Beta), this has turned into a different bug:

Pass 1: parsed user script and 85 library script(s) using 198312virt/26832res/3028shr kb, in 110usr/10sys/132real ms.
Pass 2: analyzed script: 3 probe(s), 42 function(s), 3 embed(s), 16 global(s) using 426332virt/198740res/126360shr kb, in 960usr/100sys/1101real ms.
Pass 3: translated to C into "/tmp/stapi1CKcr/stap_4366_src.c" using 423224virt/77760res/6072shr kb, in 90usr/60sys/157real ms.
cc1: warnings being treated as errors
/tmp/stapi1CKcr/stap_4366_src.c: In function ‘function_print_jstack’:
/tmp/stapi1CKcr/stap_4366_src.c:5201: error: the frame size of 272 bytes is larger than 256 bytes
make[1]: *** [/tmp/stapi1CKcr/stap_4366_src.o] Error 1
make: *** [_module_/tmp/stapi1CKcr] Error 2
WARNING: make exited with status: 2
Pass 4: compiled C into "stap_4366.ko" in 6500usr/770sys/7448real ms.
Pass 4: compilation failed.  Try again with another '--vp 0001' option.
Keeping temporary directory "/tmp/stapi1CKcr"

Comment 10 Mark Wielaard 2012-05-25 10:46:16 UTC
Just confirmed the bug from comment #9 is also in systemtap git trunk. Even slightly worse:

/tmp/stap4JLjXj/stap_10562_src.c: In function ‘function_print_jstack’:
/tmp/stap4JLjXj/stap_10562_src.c:5183: error: the frame size of 288 bytes is larger than 256 bytes

The generated frame size grew from 272 bytes to 288 bytes with systemtap git.
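
For comparing builds, the amount by which the frame exceeds the limit can be read straight out of the compiler message. This is a rough sketch, assuming the message has exactly the wording shown in the logs above; the function name frame_overrun is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: given one compiler message, print by how many
# bytes the reported frame size exceeds the reported limit.
# Returns 1 if the message is not a frame-size error.
frame_overrun() {
    # Extract "<size> <limit>" from a message such as
    # "error: the frame size of 272 bytes is larger than 256 bytes".
    set -- $(printf '%s\n' "$1" |
        sed -n 's/.*frame size of \([0-9][0-9]*\) bytes is larger than \([0-9][0-9]*\) bytes.*/\1 \2/p')
    [ $# -eq 2 ] || return 1
    echo $(( $1 - $2 ))
}

frame_overrun "stap_4366_src.c:5201: error: the frame size of 272 bytes is larger than 256 bytes"   # prints 16
frame_overrun "stap_10562_src.c:5183: error: the frame size of 288 bytes is larger than 256 bytes"  # prints 32
```

With the two messages quoted in this report, the overrun is 16 bytes for systemtap 1.7 and 32 bytes for git trunk.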

Comment 11 Mark Wielaard 2012-05-25 10:57:11 UTC
More confirmation: upstream systemtap 1.6 works fine against the jstack.stp from the new java-1.6.0-openjdk-devel-1.6.0.0-1.44.1.11.1.el6.x86_64 package, but upstream systemtap 1.7 is broken (error: the frame size of 272 bytes is larger than 256 bytes).

Comment 12 Frank Ch. Eigler 2012-05-25 11:26:41 UTC
Please try again with -DSTP_LEGACY_PRINT

Comment 13 Mark Wielaard 2012-05-25 11:43:46 UTC
(In reply to comment #12)
> Please try again with -DSTP_LEGACY_PRINT

That makes no difference (error: the frame size of 272 bytes is larger than 256 bytes)

Comment 14 Mark Wielaard 2012-05-25 12:46:25 UTC
I created a new bug against systemtap-1.7-5.el6.x86_64 for the issue mentioned in comment #9. https://bugzilla.redhat.com/show_bug.cgi?id=825244

Comment 19 Mark Wielaard 2012-05-25 17:34:33 UTC
systemtap 1.7 trips over the added try-catch construct. This makes it "work" again:

--- jstack.stp.newpkg	2012-05-25 17:25:16.761481432 +0200
+++ /usr/share/systemtap/tapset/x86_64/jstack.stp	2012-05-25 17:31:41.809479955 +0200
@@ -311,7 +311,7 @@
 
           // Some of this is "fuzzy" so catch any read error in case we
           // "guessed" wrong.
-          try
+          /* try */
             {
 
               // Do some sanity checking.
@@ -440,6 +440,7 @@
                 }
 
             }
+/*
           catch
             {
               // Some assumption above totally failed and we got an address
@@ -447,6 +448,7 @@
               frame = sprintf("<unknown_frame@0x%x>", pc);
               trust_fp = 0;
             }
+*/
         }
       else
         {

Sadly, that try-catch is precisely the part of the fix that is needed...

Comment 20 Mark Wielaard 2012-05-25 17:36:46 UTC
That catch part didn't come out well in the diff, so here it is in full to show it isn't anything "fancy":

/*
          catch
            {
              // Some assumption above totally failed and we got an address
              // read error. Give up and mark frame pointer as suspect.
              frame = sprintf("<unknown_frame@0x%x>", pc);
              trust_fp = 0;
            }
*/

Comment 21 Mark Wielaard 2012-05-25 17:57:21 UTC
Also confirmed the other way around. The old jstack.stp version "works" with systemtap 1.6, but adding just the try/catch (and none of the other tweaks between the old and new/fixed jstack.stp) makes it fail with error: the frame size of 272 bytes is larger than 256 bytes.

--- jstack.stp	2012-05-25 17:21:22.375931710 +0200
+++ /usr/share/systemtap/tapset/x86_64/jstack.stp	2012-05-25 17:54:26.938477606 +0200
@@ -299,7 +299,7 @@
               segments++;
             }
           block = CodeCache_low + (segment << CodeHeap_log2_segment_size);
-
+try {
           // Do some sanity checking.
           used = @cast(block, "HeapBlock",
                        "/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/server/libjvm.so")->_header->_used;
@@ -424,6 +424,13 @@
               frame_size = @cast(blob, "CodeBlob",
                                  "/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/server/libjvm.so")->_frame_size;
             }
+} catch
+{
+ // Some assumption above totally failed and we got an address
+ frame = sprintf("<unknown_frame@0x%x>", pc);
+ trust_fp = 0;
+}
+
         }
       else
         {

Comment 22 Mark Wielaard 2012-05-25 17:58:25 UTC
(In reply to comment #21)
> Also confirmed the other way around. The old jstack.stp version "works" with
> systemtap 1.6, but adding just the try/catch (and none of the other tweaks
> between the old and new/fixed jstack.stp) makes it fail with error: the frame
> size of 272 bytes is larger than 256 bytes

Sorry, that should say systemtap 1.7.

Comment 23 Mark Wielaard 2012-05-25 18:00:02 UTC
So to summarize: the additional try-catch is what makes it fail (produce the error) with systemtap 1.7, but the same try-catch is what fixes jstack.stp against systemtap 1.6 (where the new jstack.stp works just fine).

Comment 24 Deepak Bhole 2012-05-25 18:42:02 UTC
So then since RHEL-6 has 1.7, we need to remove the try/catch to fix this bug?

Comment 25 Mark Wielaard 2012-05-25 19:17:40 UTC
(In reply to comment #24)
> So then since RHEL-6 has 1.7, we need to remove the try/catch to fix this
> bug?

Except that would bring back the original bug that the try/catch is supposed to fix.

But it seems tweaking the script a little to make it a little less informative in case of other "issues" makes it work again with systemtap 1.7:

--- jstack.stp.newpkg	2012-05-25 17:25:16.761481432 +0200
+++ /usr/share/systemtap/tapset/x86_64/jstack.stp	2012-05-25 19:13:31.777479659 +0200
@@ -320,7 +320,7 @@
               if (used != 1)
                 {
                   // Something very odd has happened.
-                  frame = sprintf("<unused_code_block@0x%x>", pc);
+                  frame = "<unused_code_block>";
                   blob_name = "unused";
                   trust_fp = 0;
                   frame_size = 0;
@@ -444,7 +444,7 @@
             {
               // Some assumption above totally failed and we got an address
               // read error. Give up and mark frame pointer as suspect.
-              frame = sprintf("<unknown_frame@0x%x>", pc);
+              frame = "<unknown_frame>";
               trust_fp = 0;
             }
         }
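
The two substitutions in the diff above could also be tried on a copy of the tapset with sed instead of editing by hand. This is a sketch under assumptions: the function name strip_pc_from_frames and the file names in the usage comment are hypothetical, and the patterns match only the exact sprintf calls shown in the diff.

```shell
#!/bin/sh
# Hypothetical helper: read jstack.stp text on stdin and replace the two
# sprintf() calls from the diff above with plain string literals,
# writing the result to stdout. All other lines pass through unchanged.
strip_pc_from_frames() {
    sed -e 's/sprintf("<unused_code_block@0x%x>", pc)/"<unused_code_block>"/' \
        -e 's/sprintf("<unknown_frame@0x%x>", pc)/"<unknown_frame>"/'
}

# Example use on a copy of the tapset (file names are placeholders):
#   strip_pc_from_frames < jstack.stp.newpkg > jstack.stp.tweaked
```

Working on a copy keeps the packaged tapset intact, so the result can be compared against the original with diff before it is installed.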

Comment 29 jiri vanek 2012-05-28 07:53:10 UTC
rebuilding http://brewweb.devel.redhat.com/brew/taskinfo?taskID=4453614

Comment 36 jiri vanek 2012-05-28 14:45:40 UTC
I can just confirm...

Comment 40 errata-xmlrpc 2012-06-20 13:51:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0836.html