Bug 1204249

Summary: prologue search for compat_SyS_ipc fails
Product: Red Hat Enterprise Linux 7
Reporter: Martin Cermak <mcermak>
Component: systemtap
Assignee: Frank Ch. Eigler <fche>
Status: CLOSED ERRATA
QA Contact: Martin Cermak <mcermak>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.2
CC: dsmith, lberk, mbenitez, mcermak, mjw
Target Milestone: rc
Target Release: ---
Hardware: ppc64le
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-19 11:47:09 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Martin Cermak 2015-03-20 17:08:25 UTC
When testing upstream stap (version 2.8/0.160, commit release-2.7-54-g4811ecab39e6 + changes) on top of 3.10.0-229.ael7b.ppc64le, I see the following issue:



# stap -P -vvvv -e 'probe syscall.* {next}' -c /bin/true
=======

... stuff deleted ...


Resolution problem with probe kernel.function("compat_SyS_ipc@ipc/compat.c:332").call?
{
if ((($call) & (65535)) != (%{  MSGCTL  %})) next

next
}
semantic error: unhandled DW_OP_GNU_entry_value in DWARF expression [0] at 0 (0xf3: 1, 70366896194430): identifier '$call' at /root/mcermak-systemtap/systemtap-build/share/systemtap/tapset/linux/syscalls.stp:3970:7
   thrown from: dwflpp.cxx:2662
        source:         if (($call & 0xffff) != %{ MSGCTL %}) next;
                             ^
=======


# eu-readelf -N --debug-dump=info /usr/lib/debug/lib/modules/3.10.0-229.ael7b.ppc64le/vmlinux | less    # and (search for compat_SyS_ipc)
=======
 [1ced0f5]    subprogram
             external             (flag_present) Yes
             name                 (strp) "compat_SyS_ipc"
             decl_file            (data1) 1
             decl_line            (data2) 332
             prototyped           (flag_present) Yes
             type                 (ref4) [1cdda3d]
             low_pc               (addr) +0x00000000003b9480
             high_pc              (data8) 1096 (+0x00000000003b98c8)
             frame_base           (exprloc) 
              [   0] call_frame_cfa
             GNU_all_call_sites   (flag_present) Yes
             sibling              (ref4) [1ced6b6]
 [1ced117]      formal_parameter
               name                 (strp) "call"
               decl_file            (data1) 1
               decl_line            (data2) 332
               type                 (ref4) [1cdda3d]
               location             (sec_offset) location list [a8fb18]
 [1ced127]      formal_parameter
               name                 (strp) "first"
               decl_file            (data1) 1
               decl_line            (data2) 332
               type                 (ref4) [1cdda3d]
               location             (sec_offset) location list [a8fb51]
=======


# eu-readelf -N --debug-dump=loc /usr/lib/debug/lib/modules/3.10.0-229.ael7b.ppc64le/vmlinux | less    # and (search for a8fb18)
=======
 [a8fb18]  +0x00000000003b9488..+0x00000000003b9490 [   0] reg3
           +0x00000000003b9490..+0x00000000003b98c8 [   0] GNU_entry_value:
       [   0] reg3
                                                  [   3] stack_value
 [a8fb51]  +0x00000000003b9488..+0x00000000003b94d0 [   0] reg4
           +0x00000000003b94d0..+0x00000000003b98c8 [   0] GNU_entry_value:
       [   0] reg4
                                                  [   3] stack_value
 [a8fb8a]  +0x00000000003b9488..+0x00000000003b955c [   0] reg5
           +0x00000000003b955c..+0x00000000003b9598 [   0] GNU_entry_value:
       [   0] reg5
                                                  [   3] stack_value
           +0x00000000003b9598..+0x00000000003b95ac [   0] reg5
           +0x00000000003b95ac..+0x00000000003b95c0 [   0] GNU_entry_value:
       [   0] reg5
                                                  [   3] stack_value
           +0x00000000003b95c0..+0x00000000003b95cc [   0] reg5
           +0x00000000003b95cc..+0x00000000003b95dc [   0] GNU_entry_value:
       [   0] reg5
                                                  [   3] stack_value
           +0x00000000003b95dc..+0x00000000003b9630 [   0] reg5
           +0x00000000003b9630..+0x00000000003b963c [   0] GNU_entry_value:
=======

Comment 1 Mark Wielaard 2015-03-20 17:15:42 UTC
I am wondering why we don't pick the first range:
+0x00000000003b9488..+0x00000000003b9490
Then we would be able to resolve the value of $call (it is in reg3).
But we pick something a bit further in the function, where reg3 isn't really available (without DW_OP_GNU_entry_value support, which we don't have).

Does -P make a difference?
And could you make the probe more explicit?
This seems to be __syscall.compat_ipc.msgctl (or probe syscall.compat_sys_msgctl). If you use that, does the -vvv tell you why/where stap picks the PC to place the probe?
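
To make the range question concrete, here is a minimal Python sketch (not systemtap's code) that looks up a PC in the [a8fb18] location list from the description. The names KERNEL_BASE, CALL_LOCLIST and lookup are made up for illustration, and adding the 0xc000000000000000 base to the eu-readelf offsets is an assumption inferred from the kernel addresses stap reports.

```python
# Location list [a8fb18] for the parameter "call", copied from the
# eu-readelf --debug-dump=loc output in the description.
# ASSUMPTION: the "+0x..." offsets are relative to the kernel base below.
KERNEL_BASE = 0xC000000000000000

CALL_LOCLIST = [
    (0x3B9488, 0x3B9490, "reg3"),
    (0x3B9490, 0x3B98C8, "GNU_entry_value(reg3); stack_value"),
]


def lookup(loclist, pc):
    """Return the location expression covering pc, or None if none does."""
    offset = pc - KERNEL_BASE
    for start, end, expr in loclist:
        # DWARF location list ranges are half-open: [start, end)
        if start <= offset < end:
            return expr
    return None


if __name__ == "__main__":
    for pc in (0xC0000000003B9480, 0xC0000000003B9488, 0xC0000000003B9490):
        print(hex(pc), "->", lookup(CALL_LOCLIST, pc))
```

Under these assumptions, 0xc0000000003b9480 (the function's low_pc) is covered by no range, 0xc0000000003b9488 resolves to reg3, and 0xc0000000003b9490 falls into the GNU_entry_value range that stap cannot evaluate.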

Comment 2 Martin Cermak 2015-03-20 17:31:20 UTC
Placing the probe directly on the kernel.function doesn't cause any issues:

=======
# stap -P -e 'probe kernel.function("compat_SyS_ipc@ipc/compat.c:332").call {next}' -c /bin/true
#
=======

Probing for __syscall.compat_ipc.msgctl or syscall.compat_sys_msgctl gives a result similar to Comment 0. With -vv, the chosen PC can be seen:

=======
# stap -P -vv -e 'probe syscall.compat_sys_msgctl {next}' -c /bin/true

... stuff deleted ...

Pass 1: parsed user script and 107 library script(s) using 160704virt/43456res/7552shr/32512data kb, in 230usr/0sys/236real ms.
Attempting to extract kernel debuginfo build ID from /lib/modules/3.10.0-229.ael7b.ppc64le/build/vmlinux.id
focused on module 'kernel' = [0xc000000000000000-0xc0000000014c272c, bias 0 file /usr/lib/debug/lib/modules/3.10.0-229.ael7b.ppc64le/vmlinux ELF machine powerpc| (code 21)
probe compat_sys_msgctl@ipc/compat.c:501 kernel reloc=.dynamic pc=0xc0000000003b871c
focused on module 'kernel' = [0xc000000000000000-0xc0000000014c272c, bias 0 file /usr/lib/debug/lib/modules/3.10.0-229.ael7b.ppc64le/vmlinux ELF machine powerpc| (code 21)
probe compat_SyS_ipc@ipc/compat.c:332 kernel reloc=.dynamic pc=0xc0000000003b9490
semantic error: unhandled DW_OP_GNU_entry_value in DWARF expression [0] at 0 (0xf3: 1, 70366475518846): identifier '$call' at /root/mcermak-systemtap/systemtap-build/share/systemtap/tapset/linux/syscalls.stp:3970:7
   thrown from: dwflpp.cxx:2662
        source:         if (($call & 0xffff) != %{ MSGCTL %}) next;
                             ^

Pass 2: analyzed script: 3 probe(s), 0 function(s), 76 embed(s), 33 global(s) using 192320virt/77568res/9472shr/64128data kb, in 600usr/70sys/668real ms.
Pass 2: analysis failed.  [man error::pass2]
Running rm -rf /tmp/stapCqm7hz
Spawn waitpid result (0x0): 0
Removed temporary directory "/tmp/stapCqm7hz"
=======

Comment 3 Mark Wielaard 2015-03-20 21:22:02 UTC
I notice all your examples include -P. Does leaving it off make a difference?

Comment 4 Mark Wielaard 2015-03-20 21:30:00 UTC
I don't have the correct kernel source around, but it looks like compat_sys_ipc is defined as asmlinkage. I wonder if that is what confuses something.

Comment 5 Martin Cermak 2015-03-20 23:05:30 UTC
(In reply to Mark Wielaard from comment #3)
> I notice all your examples include -P. Does leaving it off make a difference?

Without -P we'd face debuginfo quality issues:

=======
 7.1 S ppc64le # stap -e 'probe syscall.compat_sys_msgctl {next}' -c /bin/true
semantic error: not accessible at this address (pc: 0xc0000000003b9480) [man error::dwarf]: identifier '$call' at /root/mcermak-systemtap/systemtap-build/share/systemtap/tapset/linux/syscalls.stp:3970:7
        dieoffset: 0x1ced117 from unknown debug file for kernel
        function: compat_SyS_ipc at ipc/compat.c:332
        alternative locations: [0xc0000000003b9488,0xc0000000003b9490], [0xc0000000003b9490,0xc0000000003b98c8]
        source:         if (($call & 0xffff) != %{ MSGCTL %}) next;
                             ^

Pass 2: analysis failed.  [man error::pass2]
 7.1 S ppc64le # 
=======

But I guess that without -P it is a completely different story...

(In reply to Mark Wielaard from comment #4)
> I don't have the correct kernel source around, but it looks like
> compat_sys_ipc is defined as asmlinkage. I wonder if that is what confuses
> something.

Right, it is asmlinkage:

=======
 7.1 S ppc64le # grep compat_sys_ipc include/linux/compat.h
asmlinkage long compat_sys_ipc(u32, int, int, u32, compat_uptr_t, u32);
 7.1 S ppc64le # 
=======

Comment 6 Martin Cermak 2015-03-20 23:26:45 UTC
I'm mainly surprised that placing the probe directly on the kernel function doesn't cause issues (Comment 2), but placing it on any of its existing tapset aliases does.

Comment 7 Frank Ch. Eigler 2015-03-25 17:36:57 UTC
(In reply to Martin Cermak from comment #6)
> I'm mainly surprised that placing the probe directly on the kernel
> function doesn't cause issues (Comment 2), but placing it on any of
> its existing tapset aliases does.

(That's because the tapset alias includes logic to demultiplex amongst alternative calls based on $parameter tests.)
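
As a rough illustration of that demultiplexing test, here is a minimal Python sketch of the check quoted in the error output, `if (($call & 0xffff) != MSGCTL) next`. The function name msgctl_alias_matches is hypothetical, and MSGCTL = 14 is taken from the kernel's include/uapi/linux/ipc.h rather than from this report. The point is only that the alias body has to read $call, which a bare `{next}` probe body never does.

```python
MSGCTL = 14  # ASSUMPTION: value from the kernel's include/uapi/linux/ipc.h


def msgctl_alias_matches(call):
    """True when the multiplexed ipc() call number selects msgctl.

    Mirrors the tapset test `if (($call & 0xffff) != MSGCTL) next`:
    the probe continues only when the low 16 bits equal MSGCTL.
    """
    return (call & 0xFFFF) == MSGCTL


if __name__ == "__main__":
    print(msgctl_alias_matches(14))              # low 16 bits are MSGCTL
    print(msgctl_alias_matches(13))              # a different ipc call number
    print(msgctl_alias_matches((1 << 16) | 14))  # upper bits are masked off
```

Evaluating this test requires the value of the `call` parameter at the probe address, which is why the alias needs a usable DWARF location for $call while the plain kernel.function probe in Comment 2 does not.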

Comment 8 Mark Wielaard 2015-04-13 12:37:07 UTC
Note that with the patches proposed upstream for https://sourceware.org/bugzilla/show_bug.cgi?id=17638 the original stap query works fine (with or without -P).
The patches aren't committed upstream yet; they are still being discussed on the list.

Comment 9 Mark Wielaard 2015-04-22 14:08:42 UTC
All patches are committed upstream, resolving upstream bugs 17638 and 18154. With current systemtap git master the query "stap -e 'probe syscall.* {next}' -c /bin/true" now runs fine (also when -P is given).

Comment 11 David Smith 2015-06-23 12:34:46 UTC
This seems to work fine on ppc64le (3.10.0-229.4.2.el7.ppc64le) with systemtap 2.8:

====
# stap -v -e 'probe syscall.* {next}' -c /bin/true
Pass 1: parsed user script and 113 library script(s) using 162880virt/45504res/7232shr/35008data kb, in 460usr/10sys/483real ms.
Pass 2: analyzed script: 487 probe(s), 24 function(s), 91 embed(s), 34 global(s) using 282880virt/168320res/9408shr/155008data kb, in 24770usr/290sys/25850real ms.
Pass 3: translated to C into "/tmp/stap42SH4y/stap_776a2eb31489cd11de00c1a7f45f6c7b_160135_src.c" using 282880virt/168512res/9600shr/155008data kb, in 50usr/70sys/118real ms.
Pass 4: compiled C into "stap_776a2eb31489cd11de00c1a7f45f6c7b_160135.ko" in 6710usr/1060sys/9048real ms.
Pass 5: starting run.
Pass 5: run completed in 0usr/20sys/783real ms.
====

Comment 14 errata-xmlrpc 2015-11-19 11:47:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2015-2124.html