Bug 643109

Summary: print_ubacktrace does not unwind
Product: [Fedora] Fedora
Reporter: Jan Kratochvil <jan.kratochvil>
Component: systemtap
Assignee: Frank Ch. Eigler <fche>
Status: CLOSED UPSTREAM
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 13
CC: dsmith, fche, jistone, mjw, mjw, roland, wcohen
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2010-10-17 21:32:47 UTC
Attachments:
doc patch to suggest -d/--ldd at print_ubacktrace(). (flags: none)

Description Jan Kratochvil 2010-10-14 16:57:30 UTC
Version-Release number of selected component (if applicable):
systemtap-1.3-2.fc13.x86_64
kernel-2.6.34.7-59.fc13.x86_64
kernel-debuginfo-2.6.34.7-59.fc13.x86_64
util-linux-ng-2.17.2-8.fc13.x86_64
glibc-2.12.1-2.x86_64

How reproducible:
Always.

Steps to Reproduce:
stap -e 'probe syscall.kill { print_ubacktrace (); }'
kill -0 $$

Actual results:
 0x38f6232c87
(a single address per `kill' syscall)

Expected results:
Some unwound backtrace.

Additional info:
Dump of assembler code for function kill:
   0x00000038f6232c80 <+0>:	mov    $0x3e,%eax
   0x00000038f6232c85 <+5>:	syscall 
## 0x00000038f6232c87 #################

OK, but I would also expect some unwinding.

Or is this feature only available in F14+?  Thanks.

Comment 1 Mark Wielaard 2010-10-14 20:17:53 UTC
stap needs to know which process you want the ubacktrace for.
Try something like the following to tell stap that it needs the unwind info for bash and all shared libraries.

$ stap -d /bin/bash --ldd -e 'probe syscall.kill { print_ubacktrace () }'
$ kill -0 $$

 0x39b416 : __kernel_vsyscall+0x2/0x0 [vdso]
 0xbbbfe6 : kill+0x16/0x40 [libc-2.12.1.so]
 0x8080f7f : kill_pid+0x2f/0x260 [bash]
 0x80b3788 : kill_builtin+0x188/0x420 [bash]
 0x8070bc1 : execute_builtin+0x81/0x310 [bash]
 0x8072dfb : execute_simple_command+0xc7b/0xfc0 [bash]
 0x8073a22 : execute_command_internal+0x8e2/0x15f0 [bash]
 0x8074794 : execute_command+0x64/0xd0 [bash]
 0x80607d7 : reader_loop+0x97/0x2e0 [bash]
 0x805fefc : main+0xe1c/0x13e0 [bash]

Comment 2 Jan Kratochvil 2010-10-14 20:44:23 UTC
Created attachment 453571
doc patch to suggest -d/--ldd at print_ubacktrace().

OK, thanks.  I am still getting:
# stap -d /bin/bash --ldd -e 'probe syscall.kill { print_ubacktrace () }'
WARNING: Couldn't register module '/lib64/ld-2.12.1.so' for pid 29040
WARNING: Couldn't register module '/lib64/libc-2.12.1.so' for pid 29040
WARNING: Couldn't register module '/lib64/libdl-2.12.1.so' for pid 29040
WARNING: Couldn't register module '/lib64/libtinfo.so.5.7' for pid 29040
WARNING: Couldn't register module '/lib64/ld-2.12.1.so' for pid 14435
WARNING: Couldn't register module '/lib64/libc-2.12.1.so' for pid 14435
WARNING: Couldn't register module '/lib64/libdl-2.12.1.so' for pid 14435
[...]
0x7fe8c7369c87
WARNING: Couldn't register module '/lib64/libdl-2.12.1.so' for pid 16967

Comment 3 Mark Wielaard 2010-10-14 20:59:38 UTC
I'll look into updating/clarifying the documentation. When hitting normal user space probe points (process("/bin/bash") for example) stap already knows it needs to collect the unwind data.

If one of the pids in those WARNINGs is equal to $$ then that would indeed be a problem. The WARNING doesn't say so, but I assume the cause is that the vma table has run out of space. Could you try with something like:

-DTASK_FINDER_VMA_ENTRY_ITEMS=4096
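
A minimal sketch of the pid check described above, assuming stap's warnings have been captured as text. The function name pid_in_warnings is hypothetical and not part of systemtap; the sample lines are copied from the output in Comment 2.

```shell
#!/bin/sh
# Sketch: succeed if the given pid appears in any
# "Couldn't register module" warning read from standard input.
# pid_in_warnings is an illustrative helper, not a systemtap tool.
pid_in_warnings() {
    # Anchor at end of line so that pid 2904 does not match pid 29040.
    grep -q "^WARNING: Couldn't register module .* for pid $1\$"
}

# Sample warnings copied from the report above.
sample="WARNING: Couldn't register module '/lib64/ld-2.12.1.so' for pid 29040
WARNING: Couldn't register module '/lib64/libc-2.12.1.so' for pid 14435"

if printf '%s\n' "$sample" | pid_in_warnings 14435; then
    echo "pid 14435 is missing unwind data"
fi
```

In practice one might save stap's stderr to a file (for example with 2>stap.log) and then run pid_in_warnings with the pid of the shell that issued the kill, reading from that file.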

Comment 4 Frank Ch. Eigler 2010-10-14 21:01:13 UTC
Jan, you may be hitting resource limits, as your last invocation of stap
says to probe the entire system (i.e., every process that links in any of
those basic libraries).

Either rerun stap with a -c/-x type process hierarchy filter, or consider
rerunning stap with bigger -DMAXUPROBES=NNNN and/or
-DTASK_FINDER_VMA_ENTRY_ITEMS=MMMM values.  Try MMMM=2000, and double it
after each failed retry.
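
The doubling suggestion can be sketched as a dry run that only prints the command lines one would try, without running stap. The function name, the cap of three attempts, and the literal PID placeholder after -x are assumptions for illustration; the starting value of 2000 and the flags themselves come from this thread.

```shell
#!/bin/sh
# Dry run only: prints candidate stap command lines, runs nothing.
# stap_retry_commands, the attempt cap, and the PID placeholder are
# illustrative assumptions; 2000 and the doubling are from the thread.
stap_retry_commands() {
    items=${1:-2000}      # starting TASK_FINDER_VMA_ENTRY_ITEMS value
    attempts=${2:-3}      # how many candidates to print (assumed cap)
    i=0
    while [ "$i" -lt "$attempts" ]; do
        echo "stap -x PID -d /bin/bash --ldd -DTASK_FINDER_VMA_ENTRY_ITEMS=$items -e 'probe syscall.kill { print_ubacktrace () }'"
        items=$((items * 2))   # double after each assumed failure
        i=$((i + 1))
    done
}

stap_retry_commands 2000 3
```

Replace PID with the process of interest and stop at the first value that no longer produces "Couldn't register module" warnings.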

(Mark, that latter parameter is not in the stap man page, but should be.)

Comment 5 Jan Kratochvil 2010-10-15 07:47:52 UTC
-DTASK_FINDER_VMA_ENTRY_ITEMS=2000 works but still gives WARNINGs; 4096 is OK.

My box is now running 449 tasks.  With 6GB of RAM I do not mind if Systemtap takes, for example, 1GB.  Could -DTASK_FINDER_VMA_ENTRY_ITEMS be autodetected more appropriately for the system size?  Or at least the WARNING message should suggest -DTASK_FINDER_VMA_ENTRY_ITEMS.

In KVM (F14.x86_64) no -DTASK_FINDER_VMA_ENTRY_ITEMS is needed (114 tasks).

But it works great then, thanks.

Comment 6 Mark Wielaard 2010-10-17 21:32:47 UTC
I added the documentation suggestions to upstream git, and the warning message now mentions -DTASK_FINDER_VMA_ENTRY_ITEMS if the registration failed because the limit was too low.

Making the TASK_FINDER_VMA_ENTRY_ITEMS table dynamically allocated is tracked as upstream bug http://sourceware.org/bugzilla/show_bug.cgi?id=11671