RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1602008 - java-1.8.0-openjdk: jstack: mixed mode stack trace issue for i686
Summary: java-1.8.0-openjdk: jstack: mixed mode stack trace issue for i686
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: java-1.8.0-openjdk
Version: 7.5
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: rc
: ---
Assignee: Severin Gehwolf
QA Contact: OpenJDK QA
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2018-07-17 15:14 UTC by zzambers
Modified: 2020-06-09 17:53 UTC (History)
3 users (show)

Fixed In Version: java-1.8.0-openjdk-1.8.0.212.b02-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-06-09 17:53:24 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
Proposed patch, fixing the FP validity check on i686 (942 bytes, patch)
2018-07-23 14:05 UTC, Severin Gehwolf
no flags Details | Diff


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 808293 1 None None None 2021-01-20 06:05:38 UTC
openjdk bug system JDK-8208091 0 None None None 2018-07-23 15:36:48 UTC

Internal Links: 808293

Description zzambers 2018-07-17 15:14:36 UTC
While doing some investigation (on broken beaker tests), I discovered, that printing mixed mode stack trace does not seem to work for i686 multilib packages.

How reproducible:
1. run java application using 32-bit JDK on x86_64 system
2. use jstack -m [pid], from the same JDK on that application

Actual results:
Jstack prints:

Attaching to process ID 22818, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.171-b10
Deadlock Detection:

No deadlocks found.

----------------- 22819 -----------------
0xf77ca430		????????
----------------- 22820 -----------------
0xf77ca430		????????
----------------- 22821 -----------------
0xf77ca430		????????
----------------- 22822 -----------------
0xf77ca430		????????
----------------- 22823 -----------------
0xf77ca430		????????
----------------- 22824 -----------------
0xf77ca430		????????
----------------- 22825 -----------------
0xf77ca430		????????
----------------- 22826 -----------------
0xf77ca430		????????
----------------- 22827 -----------------
0xf77ca430		????????
sun.jvm.hotspot.debugger.UnmappedAddressException: f0
	at sun.jvm.hotspot.debugger.PageCache.checkPage(PageCache.java:208)
	at sun.jvm.hotspot.debugger.PageCache.getData(PageCache.java:63)
	at sun.jvm.hotspot.debugger.DebuggerBase.readBytes(DebuggerBase.java:225)
	at sun.jvm.hotspot.debugger.linux.LinuxDebuggerLocal.readCInteger(LinuxDebuggerLocal.java:498)
	at sun.jvm.hotspot.debugger.DebuggerBase.readAddressValue(DebuggerBase.java:462)
	at sun.jvm.hotspot.debugger.linux.LinuxDebuggerLocal.readAddress(LinuxDebuggerLocal.java:433)
	at sun.jvm.hotspot.debugger.linux.LinuxAddress.getAddressAt(LinuxAddress.java:74)
	at sun.jvm.hotspot.debugger.linux.x86.LinuxX86CFrame.sender(LinuxX86CFrame.java:69)
	at sun.jvm.hotspot.tools.PStack.run(PStack.java:161)
	at sun.jvm.hotspot.tools.PStack.run(PStack.java:58)
	at sun.jvm.hotspot.tools.PStack.run(PStack.java:53)
	at sun.jvm.hotspot.tools.JStack.run(JStack.java:66)
	at sun.jvm.hotspot.tools.Tool.startInternal(Tool.java:260)
	at sun.jvm.hotspot.tools.Tool.start(Tool.java:223)
	at sun.jvm.hotspot.tools.Tool.execute(Tool.java:118)
	at sun.jvm.hotspot.tools.JStack.main(JStack.java:92)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at sun.tools.jstack.JStack.runJStackTool(JStack.java:140)
	at sun.tools.jstack.JStack.main(JStack.java:106)
----------------- 22828 -----------------
0xf77ca430		????????
----------------- 22829 -----------------
0xf77ca430		????????
----------------- 22830 -----------------
0xf77ca430		????????
----------------- 22831 -----------------
0xf77ca430		????????
----------------- 22832 -----------------
0xf77ca430		????????
----------------- 22818 -----------------
0xf77ca430		????????



OS (x86_64):
Red Hat Enterprise Linux Server release 7.5 (Maipo)

uname -a
Linux rhel7-x64-vagrant 3.10.0-862.6.3.el7.x86_64 #1 SMP Fri Jun 15 17:57:37 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux

installed openjdk packages:
java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i686
java-1.8.0-openjdk-debuginfo-1.8.0.171-8.b10.el7_5.i686
java-1.8.0-openjdk-devel-1.8.0.171-8.b10.el7_5.i686
java-1.8.0-openjdk-headless-1.8.0.171-8.b10.el7_5.i686

installed debuginfo packages:
gcc-base-debuginfo-4.8.5-28.el7_5.1.i686
gcc-debuginfo-4.8.5-28.el7_5.1.i686
glibc-debuginfo-2.17-222.el7.i686
glibc-debuginfo-common-2.17-222.el7.i686
java-1.8.0-openjdk-debuginfo-1.8.0.171-8.b10.el7_5.i686
libX11-debuginfo-1.6.5-1.el7.i686
zlib-debuginfo-1.2.7-17.el7.i686


Expected results:
No exceptions are thrown

Additional info:
No exception is thrown for x86_64 OJDK8 packages on rhel-7. Exception is also not thrown on rhel-6, neither i686 (not multilib) nor x86_64 OJDK8 packages.

Comment 4 Severin Gehwolf 2018-07-20 15:50:56 UTC
This is also reproducible in a plain i686 mock. Hence I've adjusted the synopsis of this bug.

This looks to be an issue with glibc not being compiled with -fno-omit-frame-pointer and the serviceability code isn't being smart enough to figure it out without the frame pointer being compiled in. In this case glibc. The serviceability code currently pretty much relies on a valid ebp register value. From LinuxCDebugger.topFrameForThread():

    String cpu = dbg.getCPU();
    if (cpu.equals("x86")) {
       X86ThreadContext context = (X86ThreadContext) thread.getContext();
       Address ebp = context.getRegisterAsAddress(X86ThreadContext.EBP);
       if (ebp == null) return null;
       Address pc  = context.getRegisterAsAddress(X86ThreadContext.EIP);
       if (pc == null) return null;
       return new LinuxX86CFrame(dbg, ebp, pc);

Then, when getting the next CFrame this trips over an implicit is-address-mapped-in-check for ebp (zero-offset) value 'f0' in LinuxX86CFrame.sender():

 Address nextEBP = ebp.getAddressAt( 0 * ADDRESS_SIZE);

Hence, the exception. Not sure what to do about this.

# jps
935 Jps
825 Foo

# jstack -m 825
[...]
----------------- 834 -----------------
0xf771f430              ????????
sun.jvm.hotspot.debugger.UnmappedAddressException: f0
        at sun.jvm.hotspot.debugger.PageCache.checkPage(PageCache.java:208)
        at sun.jvm.hotspot.debugger.PageCache.getData(PageCache.java:63)
        at sun.jvm.hotspot.debugger.DebuggerBase.readBytes(DebuggerBase.java:225)
        at sun.jvm.hotspot.debugger.linux.LinuxDebuggerLocal.readCInteger(LinuxDebuggerLocal.java:498)
        at sun.jvm.hotspot.debugger.DebuggerBase.readAddressValue(DebuggerBase.java:462)
        at sun.jvm.hotspot.debugger.linux.LinuxDebuggerLocal.readAddress(LinuxDebuggerLocal.java:433)
        at sun.jvm.hotspot.debugger.linux.LinuxAddress.getAddressAt(LinuxAddress.java:74)
        at sun.jvm.hotspot.debugger.linux.x86.LinuxX86CFrame.sender(LinuxX86CFrame.java:69)
        at sun.jvm.hotspot.tools.PStack.run(PStack.java:161)
        at sun.jvm.hotspot.tools.PStack.run(PStack.java:58)
        at sun.jvm.hotspot.tools.PStack.run(PStack.java:53)
        at sun.jvm.hotspot.tools.JStack.run(JStack.java:66)
        at sun.jvm.hotspot.tools.Tool.startInternal(Tool.java:260)
        at sun.jvm.hotspot.tools.Tool.start(Tool.java:223)
        at sun.jvm.hotspot.tools.Tool.execute(Tool.java:118)
        at sun.jvm.hotspot.tools.JStack.main(JStack.java:92)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at sun.tools.jstack.JStack.runJStackTool(JStack.java:140)
        at sun.tools.jstack.JStack.main(JStack.java:106)

Note that '834' above is the LWP id of the thread. Then, doing the same in gdb, attaching to 825 we see for LWP 834, Thread 7, this backtrace:

# gdb -ex 'set confirm off' -ex 'handle SIGSEGV pass nostop noprint' -ex 'set breakpoint pending on' -ex 'thread apply all bt' -ex 'quit' -p 825
[...]
Thread 7 (Thread 0xa3863b40 (LWP 834)):
#0  0xf771f430 in __kernel_vsyscall ()
#1  0xf7703acc in futex_abstimed_wait (cancel=true, private=<optimized out>, abstime=0x0, expected=1, futex=0xf770f000) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:43
#2  do_futex_wait (sem=0xf770f000, sem@entry=0xf70ea854 <sig_sem>, abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:226
#3  0xf7703bb7 in __new_sem_wait_slow (sem=0xf70ea854 <sig_sem>, abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:407
#4  0xf6cc18d4 in check_pending_signals (wait=true) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:2522
#5  0xf6cbc632 in signal_thread_entry (thread=0xa37a4800, __the_thread__=0xa37a4800) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/hotspot/src/share/vm/runtime/os.cpp:250
#6  0xf6e0f5f7 in JavaThread::thread_main_inner (this=0xa37a4800) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/hotspot/src/share/vm/runtime/thread.cpp:1699
#7  0xf6cc336a in java_start (thread=0xa37a4800) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:796
#8  0xf76fdb1c in start_thread (arg=0xa3863b40) at pthread_create.c:308
#9  0xf75fd45e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:131
[...]

So we are indeed in glibc code where jstack fails to provide a backtrace.

Looking at the registers for that thread in that moment we see that ebp holds value 'f0' which doesn't look like a reasonable frame pointer address. Yet, the serviceability agent code assumes that. When it tries to verify if the address at ebp is mapped in, it fails and the exception is being thrown.

# gdb -ex 'set confirm off' -ex 'handle SIGSEGV pass nostop noprint' -ex 'set breakpoint pending on' -ex 'thread apply 7 info registers' -ex 'quit' -p 825
Thread 7 (Thread 0xa3863b40 (LWP 834)):
eax            0xfffffe00       -512
ecx            0x80     128
edx            0x1      1
ebx            0xf70ea854       -150034348
esp            0xa3862fa0       0xa3862fa0
ebp            0xf0     0xf0
esi            0x0      0
edi            0xf770f000       -143593472
eip            0xf771f430       0xf771f430 <__kernel_vsyscall+16>
eflags         0x282    [ SF IF ]
cs             0x23     35
ss             0x2b     43
ds             0x2b     43
es             0x2b     43
fs             0x0      0
gs             0x63     99

Looking some more into the ebp register values via gdb we see that there are more of the dont-look-like-valid-frame-pointer-values kind:

# for i in $(seq 1 15); do gdb -ex 'set confirm off' -ex 'handle SIGSEGV pass nostop noprint' -ex 'set breakpoint pending on' -ex "thread apply $i info registers" -ex 'quit' -p 825; done > gdb_register_info.825.log
# grep ebp gdb_register_info.825.log
ebp            0xf642ab40       0xf642ab40
ebp            0xa2ec91b0       0xa2ec91b0
ebp            0x3      0x3
ebp            0xa2f9b0f0       0xa2f9b0f0
ebp            0xa301c0f0       0xa301c0f0
ebp            0xa309d0f0       0xa309d0f0
ebp            0xf0     0xf0
ebp            0x3      0x3
ebp            0x3      0x3
ebp            0xa38e40a0       0xa38e40a0
ebp            0x3      0x3
ebp            0x3      0x3
ebp            0x3      0x3
ebp            0x3      0x3
ebp            0xf6429e40       0xf6429e40

However, only 'f0' is throwing the exception because 0x3 isn't 4-bytes aligned and the serviceablity code checks for this. From LinuxX86CFrame.java:

      // Check alignment of ebp
      if ( dbg.getAddressValue(ebp) % ADDRESS_SIZE != 0) {
        return null;
      }

Comment 7 Severin Gehwolf 2018-07-23 09:42:52 UTC
What's curious is that there should be a check for "valid" frame pointers, and there is from LinuxX86CFrame.sender():

      X86ThreadContext context = (X86ThreadContext) thread.getContext();
      Address esp = context.getRegisterAsAddress(X86ThreadContext.ESP);

      if ( (ebp == null) || ebp.lessThan(esp) ) {
        return null;
      }

So given that esp is supposed to be "0xa3862fa0", it's strange that this check would pass. Debugging this reveals:

main[1] print thread.getContext().getRegisterAsAddress(sun.jvm.hotspot.debugger.x86.X86ThreadContext.ESP)   
 thread.getContext().getRegisterAsAddress(sun.jvm.hotspot.debugger.x86.X86ThreadContext.ESP) = null
main[1] print thread.getContext().getRegisterAsAddress(sun.jvm.hotspot.debugger.x86.X86ThreadContext.SP)
 thread.getContext().getRegisterAsAddress(sun.jvm.hotspot.debugger.x86.X86ThreadContext.SP) = "0xa3862fa0"

So we are getting a null and falsely pass this check.

Comment 8 Severin Gehwolf 2018-07-23 11:45:50 UTC
Digging some deeper I've traced it down to the native ptrace() system call for LWP thread 834. At that point we seem to be getting the correct values for ebp and esp:

(gdb) p user
$1 = (struct user_regs_struct *) 0xa34feee8
(gdb) next
142      return true;
(gdb) list
137     #ifdef PTRACE_GETREGS_REQ
138      if (ptrace_getregs(PTRACE_GETREGS_REQ, pid, user, NULL) < 0) {
139        print_debug("ptrace(PTRACE_GETREGS, ...) failed for lwp %d\n", pid);
140        return false;
141      }
142      return true;
143     #elif defined(PTRACE_GETREGSET)
144      struct iovec iov;
145      iov.iov_base = user;
146      iov.iov_len = sizeof(*user);
(gdb) p user->ebp
$2 = 240
(gdb) p user->esp
$3 = -1551487072
(gdb) p/x user->esp
$4 = 0xa3862fa0
(gdb) p pid
$5 = 834

Question is where things get wrong...

Comment 9 Severin Gehwolf 2018-07-23 12:27:49 UTC
Aha! The native function to fill in native register values uses SP as the index into the data array holding register values. From agent/src/os/linux/LinuxDebuggerLocal.c around line 378:

#ifdef i386
#define REG_INDEX(reg) sun_jvm_hotspot_debugger_x86_X86ThreadContext_##reg

  regs[REG_INDEX(GS)]  = (uintptr_t) gregs.xgs;
  regs[REG_INDEX(FS)]  = (uintptr_t) gregs.xfs;
  regs[REG_INDEX(ES)]  = (uintptr_t) gregs.xes;
  regs[REG_INDEX(DS)]  = (uintptr_t) gregs.xds;
  regs[REG_INDEX(EDI)] = (uintptr_t) gregs.edi;
  regs[REG_INDEX(ESI)] = (uintptr_t) gregs.esi;
  regs[REG_INDEX(FP)] = (uintptr_t) gregs.ebp;
  regs[REG_INDEX(SP)] = (uintptr_t) gregs.esp;
  regs[REG_INDEX(EBX)] = (uintptr_t) gregs.ebx;
  regs[REG_INDEX(EDX)] = (uintptr_t) gregs.edx;
  regs[REG_INDEX(ECX)] = (uintptr_t) gregs.ecx;
  regs[REG_INDEX(EAX)] = (uintptr_t) gregs.eax;
  regs[REG_INDEX(PC)] = (uintptr_t) gregs.eip;
  regs[REG_INDEX(CS)]  = (uintptr_t) gregs.xcs;
  regs[REG_INDEX(SS)]  = (uintptr_t) gregs.xss;

#endif /* i386 */

Note: SP is aliased to UESP in serviceability agent Java code. But neither are being used in the frame pointer validity check in LinuxX86CFrame.sender. ESP is!

(gdb) info frame
Stack level 0, frame at 0xa34fef40:
 eip = 0xe721b1f0 in Java_sun_jvm_hotspot_debugger_linux_LinuxDebuggerLocal_getThreadIntegerRegisterSet0
    (/usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/hotspot/agent/src/os/linux/LinuxDebuggerLocal.c:370); saved eip 0xe769339a
 called by frame at 0xa34fef80
 source language c.
 Arglist at 0xa34fef38, args: env=0xa3cf8d2c, this_obj=0xa34fef8c, lwp_id=834
 Locals at 0xa34fef38, Previous frame's sp is 0xa34fef40
 Saved registers:
  ebx at 0xa34fef30, ebp at 0xa34fef38, esi at 0xa34fef34, eip at 0xa34fef3c
(gdb) p/x gregs->esp
$12 = 0xa3862fa0
(gdb) p/x gregs->ebp
$13 = 0xf0

So the native bits fill in values correctly, but the Java code then uses the wrong value.

Comment 11 Severin Gehwolf 2018-07-23 14:05:09 UTC
Created attachment 1469966 [details]
Proposed patch, fixing the FP validity check on i686

Comment 12 Severin Gehwolf 2018-07-23 15:36:49 UTC
I've filed upstream bug JDK-8208091 for this, since it's reproducible on latest JDK sources too.

Comment 13 Severin Gehwolf 2018-08-24 13:03:46 UTC
The upstream bug has been fixed:
http://hg.openjdk.java.net/jdk/jdk/rev/2e98c7737d8f

What's pending is a backport to 8. I've asked for JDK 11 push approval but it wasn't deemed critical enough for that.

Comment 14 Severin Gehwolf 2018-10-10 16:34:09 UTC
The fix has been pushed to JDK 8u:
http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/rev/dbcf38c78dae

Comment 15 Andrew John Hughes 2020-06-09 17:53:24 UTC
This was fixed over a year ago in 8u211-b01 and first shipped in RHEL 7 in the 8u212 updates.

Closing.


Note You need to log in before you can comment on or make changes to this bug.