Bug 473938

Summary: glibc-debuginfo wrong on s390x
Product: Red Hat Enterprise Linux 5
Reporter: Petr Muller <pmuller>
Component: glibc
Assignee: Andreas Schwab <schwab>
Status: CLOSED NOTABUG
QA Contact: BaseOS QE <qe-baseos-auto>
Severity: medium
Priority: low
Version: 5.3
CC: fweimer, jakub, jan.kratochvil, ohudlick
Target Milestone: rc   
Target Release: ---   
Hardware: s390x   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2009-08-25 08:31:20 UTC

Description Petr Muller 2008-12-01 15:04:17 UTC
Description of problem:
While testing the gdb erratum for RHEL5.3, the following problem was found in the gdb.threads/threadcrash.exp testcase. It is believed to be a bug in glibc-debuginfo and thus probably a gcc problem:

gdb.threads/gdb.threadcrash.exp: old 39/12, new 36/15, with these differences:
+FAIL: gdb.threads/threadcrash.exp: core file: gdb output contains ??
+FAIL: gdb.threads/threadcrash.exp: gcore file: gdb output contains ??
+FAIL: gdb.threads/threadcrash.exp: live process: gdb output contains ??

gdb shows this:
Thread 7 (process 10885):
#0  0x00000200001a0fc4 in ?? () from /lib64/libc.so.6

where the old one contained this:
Thread 7 (process 10885):
#0  0x00000200001a0fc4 in __pause_nocancel ()
----------------------------------------------------------

The problem was investigated by Jan Kratochvil, who wrote this about the issue:

Fortunately not.  In fact it was a bug in RHEL-5.2 GDB that it falsely resolved
the address as `__pause_nocancel'.  It should be resolved as `pause', but the
s390x RHEL-5.3 glibc .symtab is wrong, so RHEL-5.3 GDB is right.  One can file
a bug against glibc for not having a proper .symtab (which is probably a gcc
problem, as sysdeps/posix/pause.c has no assembly variant for s390x).

The fix for this symbol resolution was tracked earlier in
https://bugzilla.redhat.com/show_bug.cgi?id=238352
and was committed upstream as
http://sources.redhat.com/ml/gdb-patches/2007-01/msg00274.html
so its testcase is currently the i386-specific `gdb.arch/i386-size-overlap.exp'.

s390x info:
------------------------------------------------------------------------------
00000000000aff50    70 FUNC    LOCAL  DEFAULT   11 __GI___libc_pause
00000000000aff50    70 FUNC    LOCAL  DEFAULT   11 __GI_pause
00000000000aff50    70 FUNC    LOCAL  DEFAULT   11 __libc_pause
00000000000aff50    70 FUNC    WEAK   DEFAULT   11 pause
00000000000aff50    70 FUNC    WEAK   DEFAULT   11 pause@@GLIBC_2.2
00000000000aff5e    16 FUNC    LOCAL  DEFAULT   11 __pause_nocancel
00000000000affe0    70 FUNC    LOCAL  DEFAULT   11 __GI___libc_nanosleep
00000000000affe0    70 FUNC    LOCAL  DEFAULT   11 __GI___nanosleep
00000000000affe0    70 FUNC    LOCAL  DEFAULT   11 __GI_nanosleep
00000000000affe0    70 FUNC    LOCAL  DEFAULT   11 __libc_nanosleep
00000000000affe0    70 FUNC    WEAK   DEFAULT   11 __nanosleep
00000000000affe0    70 FUNC    WEAK   DEFAULT   11 __nanosleep@@GLIBC_2.2.6
00000000000affe0    70 FUNC    WEAK   DEFAULT   11 nanosleep
00000000000affe0    70 FUNC    WEAK   DEFAULT   11 nanosleep@@GLIBC_2.2
00000000000affee    16 FUNC    LOCAL  DEFAULT   11 __nanosleep_nocancel

#0  0x00000200001a0fc4 in ?? () from /lib64/libc.so.6
#0  0x00000200001a0fc4 in __pause_nocancel () at ../nptl/sysdeps/unix/sysv/linux/s390/lowlevellock.h:230
(gdb) p/x 0x00000200001a0fc4-0x200000f1000
$2 = 0xaffc4
0xaff50+70==0xaff96
0xaffc4-0xaff50==116

   affc2:       0a a2                   svc     162
   affc4:       b9 04 00 d2             lgr     %r13,%r2

Syscall is from the pause() function but .symtab does not cover it.

x86_64 (F10):
------------------------------------------------------------------------------
0000003f616a7ea0    98 FUNC    LOCAL  DEFAULT   12 __GI___libc_pause
0000003f616a7ea0    98 FUNC    LOCAL  DEFAULT   12 __GI_pause
0000003f616a7ea0    98 FUNC    LOCAL  DEFAULT   12 __libc_pause
0000003f616a7ea0    98 FUNC    WEAK   DEFAULT   12 pause
0000003f616a7ea9    16 FUNC    LOCAL  DEFAULT   12 __pause_nocancel
0000003f616a7f10   118 FUNC    LOCAL  DEFAULT   12 __GI___libc_nanosleep
0000003f616a7f10   118 FUNC    LOCAL  DEFAULT   12 __GI___nanosleep
0000003f616a7f10   118 FUNC    LOCAL  DEFAULT   12 __GI_nanosleep
0000003f616a7f10   118 FUNC    LOCAL  DEFAULT   12 __libc_nanosleep
0000003f616a7f10   118 FUNC    WEAK   DEFAULT   12 __nanosleep
0000003f616a7f10   118 FUNC    WEAK   DEFAULT   12 nanosleep

  3f616a7ecb:   0f 05                   syscall
  3f616a7ecd:   48 8b 3c 24             mov    (%rsp),%rdi

(gdb) p/x 0x0000003f616a7ea0+98
$2 = 0x3f616a7f02

Syscall is covered by .symtab as `pause'.

--------------------------------------------------------
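
The coverage argument quoted above can be checked mechanically. The following is a minimal Python sketch, not part of the original report: the symbol values are copied from the two listings (one alias per address), the s390x load base is the one subtracted in the gdb session, and the helper name `covering_symbols` is hypothetical.

```python
# Symbol entries copied from the listings in the report:
# (start address, size in bytes, name). One alias per address is kept.
S390X_SYMS = [
    (0xAFF50, 70, "pause"),
    (0xAFF5E, 16, "__pause_nocancel"),
    (0xAFFE0, 70, "nanosleep"),
    (0xAFFEE, 16, "__nanosleep_nocancel"),
]
X86_64_SYMS = [
    (0x3F616A7EA0, 98, "pause"),
    (0x3F616A7EA9, 16, "__pause_nocancel"),
    (0x3F616A7F10, 118, "nanosleep"),
]


def covering_symbols(symbols, pc):
    """Return names of all symbols whose [start, start + size) range contains pc."""
    return [name for start, size, name in symbols if start <= pc < start + size]


# s390x: runtime PC minus the libc load base used in the gdb session.
s390x_pc = 0x00000200001A0FC4 - 0x200000F1000
# x86_64: address of the syscall instruction taken from the disassembly.
x86_64_pc = 0x3F616A7ECB

print(hex(s390x_pc), covering_symbols(S390X_SYMS, s390x_pc))
print(hex(x86_64_pc), covering_symbols(X86_64_SYMS, x86_64_pc))
```

With the values from the report, the s390x lookup returns an empty list while the x86_64 lookup returns `pause`, which matches the two conclusions drawn above.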

Reproduction information:
[1] install glibc-debuginfo and gdb-6.8-27.src.rpm, prebuild it, and run the testcase gdb.threads/threadcrash.exp, which compiles the test files
[2] run the 'threadcrash' binary; it segfaults and a core is generated
[3] gdb threadcrash core.XXXXX
[4] (gdb) t a a bt    (short for `thread apply all bt')
[5] see the wrong debuginfo (frames shown as ??)

Comment 1 Jakub Jelinek 2008-12-08 14:51:48 UTC
In what way is the .symtab wrong?  The pause and __pause_nocancel functions really do overlap in the code, and the .symtab correctly describes that.

Comment 2 Jan Kratochvil 2008-12-08 15:03:28 UTC
The instruction at 0xaffc2 (svc==syscall) is really used during execution, but according to .symtab on s390x 0xaffc2 does not belong to any function.
Any overlap would be OK, but for every function either
{function start} + {function size} < {syscall PC}
or
{syscall PC} < {function start}
holds on s390x, so no function covers the syscall.  x86_64 is correct.

Comment 3 Andreas Schwab 2009-06-25 13:49:50 UTC
On s390(x) the real body of cancellable syscalls is actually located _before_ the entry point.  Thus the pc is actually in the nanosleep syscall (162 == SYS_nanosleep).
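
As an illustration of the layout described in this comment, the sketch below looks up the first entry point above the reported pc instead of the enclosing symbol range. It is only a heuristic written for this report, not how gdb resolves symbols: the addresses come from the s390x listing in the description, and the helper name `next_entry_point` is hypothetical.

```python
import bisect

# Entry points from the s390x listing in the description, sorted by address.
ENTRY_POINTS = [
    (0xAFF50, "pause"),
    (0xAFF5E, "__pause_nocancel"),
    (0xAFFE0, "nanosleep"),
    (0xAFFEE, "__nanosleep_nocancel"),
]


def next_entry_point(entries, pc):
    """Return the (address, name) of the first entry point strictly above pc, or None."""
    addrs = [addr for addr, _ in entries]
    i = bisect.bisect_right(addrs, pc)
    return entries[i] if i < len(entries) else None


pc = 0xAFFC4  # file-relative pc computed in the description
addr, name = next_entry_point(ENTRY_POINTS, pc)
print(name, addr - pc)  # nanosleep 28
```

Under the assumption that code placed just before an entry point belongs to that function, the pc sits 28 bytes ahead of the nanosleep entry, which agrees with the svc 162 instruction seen in the disassembly.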

Comment 4 Andreas Schwab 2009-08-25 08:31:20 UTC
Not really a bug, since the pc is inside an anonymous function.