Bug 596933 - [RHEL6] stap segfaults on PPC
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: systemtap
Version: 6.0
Hardware: All
OS: Linux
Priority: high
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Frank Ch. Eigler
QA Contact: Petr Muller
URL: https://beaker.engineering.redhat.com...
Whiteboard:
Keywords:
Depends On: 602359 640321
Blocks:
 
Reported: 2010-05-27 19:11 UTC by Jeff Burke
Modified: 2018-10-27 10:48 UTC (History)

Fixed In Version: systemtap-1.2-9.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2010-11-10 21:44:29 UTC



Description Jeff Burke 2010-05-27 19:11:03 UTC
Description of problem:
 Running the tracepoint tests for RHEL6 on ppc fails: stap segfaults.

Version-Release number of selected component (if applicable):
 Kernel Package = kernel-2.6.32-30.el6.ppc64
 systemtap version = systemtap-1.2-3.el6.ppc64

How reproducible:
 Always

Steps to Reproduce:
1. Install the RHEL6.0-Snapshot-5 PPC Server variant onto a PPC host.
2. Run the following command:
/usr/bin/stap -L 'kernel.trace("*")' | grep -o "\".*\"" | xargs -I [] -n1 stap -c "sleep 1" -vvvve 'probe kernel.trace("[]") { exit() }'
  
Actual results:
Segmentation fault (core dumped) stap -c "sleep 1" -vvvve 'probe kernel.trace('$i') { exit() }' > $VERBOSETRACELOG 2>&1

Expected results:
 Should pass

Additional info:

Comment 2 Petr Muller 2010-06-02 14:15:51 UTC
We hit this too in our Tier Tests. Basically everything you run on ppc ends with a segfault (or abort).

With certain probes, stap seems to be a bit more verbose when going down, so I'm putting it here:

# stap -e 'probe begin{printf("PWN!\n")}' 
terminate called after throwing an instance of 'std::length_error'
  what():  basic_string::_S_create
Aborted (core dumped)

glibc even gave a backtrace in one case:

*** glibc detected *** stap: free(): invalid next size (fast): 0x000000004a121180 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x807a11f704)[0xfff9d83f704]
/usr/lib64/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv-0x5e5a4)[0xfff9dbce46c]
/usr/lib64/libstdc++.so.6(+0x807a70b3e4)[0xfff9dbcb3e4]
/usr/lib64/libstdc++.so.6(_ZSt9terminatev-0x611c0)[0xfff9dbcb430]
/usr/lib64/libstdc++.so.6(__cxa_throw-0x6104c)[0xfff9dbcb5e4]
/usr/lib64/libstdc++.so.6(_ZSt20__throw_length_errorPKc-0xde140)[0xfff9db45980]
/usr/lib64/libstdc++.so.6(_ZNSs4_Rep9_S_createEmmRKSaIcE-0x9412c)[0xfff9db94e14]
/usr/lib64/libstdc++.so.6(_ZNSs4_Rep8_M_cloneERKSaIcEm-0x92ab4)[0xfff9db9674c]
/usr/lib64/libstdc++.so.6(_ZNSs7reserveEm-0x92404)[0xfff9db96eac]
/usr/lib64/libstdc++.so.6(_ZNSs6appendERKSs-0x91da0)[0xfff9db97580]
stap(_ZStplIcSt11char_traitsIcESaIcEESbIT_T0_T1_ERKS6_S8_-0x196200)[0x233d3110]
stap(+0x12d234)[0x234ad234]
stap(+0x12b074)[0x234ab074]
stap(+0x51058)[0x233d1058]
/lib64/libc.so.6(+0x807a0bbc38)[0xfff9d7dbc38]
/lib64/libc.so.6(__libc_start_main-0x184ea0)[0xfff9d7dbe30]

Comment 3 Frank Ch. Eigler 2010-06-10 13:32:11 UTC
From http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44492,
it seems this was an unforeseen interaction between
our sys/sdt.h style of inline assembly and the gcc powerpc
optimizer.  stap will adopt a more restricted inline-asm
constraint for STAP_PROBE(...), which should make it work
even if the compiler issue is not settled.

Comment 4 Issue Tracker 2010-06-16 20:33:35 UTC
Event posted on 06-16-2010 02:34pm EDT by Glen Johnson

------- Comment From anibalca@linux.ibm.com 2010-06-16 14:24 EDT-------
I hit this bug on RHEL6 pre-beta2


This event sent from IssueTracker by jkachuck 
 issue 989583

Comment 5 Frank Ch. Eigler 2010-06-16 20:38:23 UTC
Upstream bug http://sourceware.org/bugzilla/show_bug.cgi?id=11708
is a corequisite, as the powerpc fix listed above unfortunately
breaks i686.  Work in progress.  scox, please append here the commit#
for the PR11708 fix, when that is ready.

Comment 6 Frank Ch. Eigler 2010-06-23 16:11:54 UTC
Let me backtrack a bit from comment #5.  rhel6's systemtap does not have
the SDT_V2 stuff yet, so scox's patch for sdt.h wouldn't apply there directly
anyhow.  And as it only has SDT_V1, the $XXX translator support from
http://sourceware.org/PR11708 is not actually needed.

So for RHEL6, it may be sufficient to change all the "g" constraints to "ron"
in sys/sdt.h and make no other change.
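
For readers unfamiliar with the constraint letters, here is a minimal sketch of the difference. It is not the real sys/sdt.h code: the DEMO_* macro and function names are hypothetical, and the asm template is left empty so that only the operand constraint matters. In GCC, "g" accepts any general register, any memory operand, or any immediate integer, while "ron" narrows that to a general register ("r"), an offsettable memory operand ("o"), or an immediate with a known numeric value ("n").

```c
/* Hypothetical stand-ins for a probe-argument macro.  These are NOT the
   real STAP_PROBE definitions from sys/sdt.h; they only show where the
   constraint string sits.  The empty template emits no instructions. */
#define DEMO_PROBE_ARG_G(val)   __asm__ __volatile__("" : : "g"(val))
#define DEMO_PROBE_ARG_RON(val) __asm__ __volatile__("" : : "ron"(val))

/* Hands x to the asm statement under the general "g" constraint:
   the compiler may pick any register, memory, or immediate form. */
long demo_probe_g(long x)
{
    DEMO_PROBE_ARG_G(x);
    return x;
}

/* Same operand under the narrower "ron" constraint: register,
   offsettable memory, or numeric immediate only. */
long demo_probe_ron(long x)
{
    DEMO_PROBE_ARG_RON(x);
    return x;
}
```

Both functions return their argument unchanged. The only difference is which operand forms the compiler is allowed to choose for the asm statement, which is the part the proposed sdt.h change would restrict.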

Comment 7 Frank Ch. Eigler 2010-06-29 02:45:10 UTC
See bug #608768 for ppc code negatively impacted by the "ron" constraint, however.

Comment 8 Frank Ch. Eigler 2010-06-29 14:42:52 UTC
Our current plan is to revert the current fix for this bug, and
instead prereq/buildreq the version of gcc (4.4.4-9) that includes
an alternate fix for the "g" assembly constraint.  That way, 
bug #608768 is fixed too.

Comment 11 releng-rhel@redhat.com 2010-11-10 21:44:29 UTC
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.

