Bug 499677

Summary: 'Unknown symbol in module' in Pass 5 when probing userspace
Product: Red Hat Enterprise Linux 5 Reporter: Petr Muller <pmuller>
Component: systemtap Assignee: Frank Ch. Eigler <fche>
Status: CLOSED ERRATA QA Contact: BaseOS QE <qe-baseos-auto>
Severity: medium Docs Contact:
Priority: medium    
Version: 5.4 CC: ddomingo, ebachalo, mjw, ohudlick, rlerch, syeghiay
Target Milestone: rc Keywords: Regression
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Running some user-space probe test cases provided by the systemtap-testsuite package fails with an 'Unknown symbol in module' error on some architectures. These test cases include (but are not limited to):

systemtap.base/uprobes.exp
systemtap.base/bz10078.exp
systemtap.base/bz6850.exp
systemtap.base/bz5274.exp

Because of a known bug in the latest SystemTap update, new SystemTap installations do not properly unload old versions of the uprobes.ko module. Some updated user-space probes use symbols available only in the latest uprobes.ko module (also provided by the latest SystemTap update). As such, running these user-space probe tests results in the error mentioned earlier.

If you encounter this error, simply run 'rmmod uprobes' to manually remove the older uprobes.ko module before running the user-space probe test again.
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-09-02 10:00:30 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 513501    
Attachments:
Description Flags
stap-report output none

Description Petr Muller 2009-05-07 15:55:11 UTC
Description of problem:
Some of the uprobes test cases in systemtap-testsuite that worked with the RHEL5.3 systemtap stopped working with the RHEL5.4 candidate systemtap. The problem seems architecture-independent. At minimum, the following test cases are affected (there may be more; I'll add them as I work through the logs):

systemtap.base/uprobes.exp
systemtap.base/bz10078.exp
systemtap.base/bz6850.exp

Version-Release number of selected component (if applicable):
# rpm -q systemtap
systemtap-0.9.7-1.el5

How reproducible:
always

Steps to Reproduce:
1. cd /usr/share/systemtap/testsuite/systemtap.base
2. gcc -g -o bz6850 bz6850.c 
3. stap bz6850.stp -v -c ./bz6850
  
Actual results:
# stap bz6850.stp -v -c ./bz6850
Pass 1: parsed user script and 52 library script(s) in 340usr/0sys/346real ms.
Pass 2: analyzed script: 10 probe(s), 1 function(s), 1 embed(s), 0 global(s) in 0usr/0sys/5real ms.
Pass 3: translated to C into "/tmp/stapIaP4ye/stap_43bc8a9e4cf003378d3da8dc7eb20d4f_4089.c" in 40usr/180sys/217real ms.
Pass 4, preamble: (re)building SystemTap's version of uprobes.
Pass 4: compiled C into "stap_43bc8a9e4cf003378d3da8dc7eb20d4f_4089.ko" in 1510usr/190sys/1663real ms.
Pass 5: starting run.
Error inserting module '/tmp/stapIaP4ye/stap_43bc8a9e4cf003378d3da8dc7eb20d4f_4089.ko': Unknown symbol in module
Retrying, after attempted removal of module stap_43bc8a9e4cf003378d3da8dc7eb20d4f_4089 (rc -1)
Error inserting module '/tmp/stapIaP4ye/stap_43bc8a9e4cf003378d3da8dc7eb20d4f_4089.ko': Unknown symbol in module
Pass 5: run completed in 0usr/20sys/24real ms.
Pass 5: run failed.  Try again with another '--vp 00001' option.


Expected results:
# stap bz6850.stp -v -c ./bz6850
Pass 1: parsed user script and 45 library script(s) in 320usr/0sys/330real ms.
Pass 2: analyzed script: 10 probe(s), 1 function(s), 0 embed(s), 0 global(s) in 0usr/0sys/4real ms.
Pass 3: translated to C into "/tmp/stapmY7fyM/stap_ca87291723d3deb3bc778f1dbb434e33_3683.c" in 50usr/160sys/204real ms.
Pass 4, preamble: (re)building SystemTap's version of uprobes.
Uprobes (re)build complete.
Pass 4: compiled C into "stap_ca87291723d3deb3bc778f1dbb434e33_3683.ko" in 2530usr/320sys/2919real ms.
Pass 5: starting run.
main called
fork_and_exec1 called
fork_and_exec2 called
fork1 called
fork2 called
fork2 returns
fork1 returns
fork_and_exec2 returns
fork_and_exec1 returns
Pass 5: run completed in 10usr/30sys/50real ms.

Additional info:
This seems architecture-independent: I'm seeing it on all five RHEL architectures. I'll attach the stap-report output.

Comment 1 Petr Muller 2009-05-07 15:59:44 UTC
systemtap.base/bz5274.exp is the next one affected by this

Comment 3 Petr Muller 2009-05-07 16:03:16 UTC
Created attachment 342867
stap-report output

... on the i386 box. The environment on the boxes for the other architectures should be identical.

Comment 4 Mark Wielaard 2009-05-07 16:06:47 UTC
This happens from time to time if an old uprobes module is still loaded in the kernel and the systemtap update brings in a new one.

Could you check whether uprobes is still loaded: lsmod | grep uprobes
And if so, rmmod uprobes and try the tests again?
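
The check suggested above can be sketched as a small shell function. This is only an illustration: the function name uprobes_loaded and its optional file argument are hypothetical, and it reads /proc/modules (the same data lsmod displays) rather than calling lsmod itself.

```shell
# Return 0 if a module named exactly "uprobes" is in the module list.
# The list defaults to /proc/modules; the optional argument (hypothetical,
# for illustration) lets the check run against a saved copy instead.
uprobes_loaded() {
    # Each /proc/modules line starts with the module name followed by a space.
    grep -q '^uprobes ' "${1:-/proc/modules}"
}

# Typical use, as root:
#   if uprobes_loaded; then /sbin/rmmod uprobes; fi
```

Matching on the leading field avoids false hits on other modules whose names merely contain "uprobes".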

Comment 5 Petr Muller 2009-05-07 16:29:10 UTC
The module was loaded, but even rmmod-ing it didn't solve the problem. I see the same output, at least for the 'stap bz6850.stp -v -c ./bz6850' and uprobes.exp testcases (I stopped trying after these).
I also tried erasing the cache to make systemtap rebuild everything from scratch, but it didn't help.

Comment 6 Frank Ch. Eigler 2009-05-07 18:38:17 UTC
What does dmesg say as to the missing symbol?

We haven't encountered anything like this, except when (as mjw says) an obsolete uprobes.ko was hanging around.

Comment 7 Petr Muller 2009-05-15 16:41:48 UTC
That's strange... when I try it on a totally fresh RHEL5.4 machine, the problem doesn't occur. Even when I downgrade, run the testcase alone, and upgrade, I don't see it. But I can reliably reproduce it by running the whole new testsuite with the old systemtap. After this happens, nothing I do makes it run.

# dmesg -c
# stap bz10078.stp
Error inserting module '/tmp/stapSl51E6/stap_6ae410f1dbc5c182a165216c8a418246_2053.ko': Unknown symbol in module
Retrying, after attempted removal of module stap_6ae410f1dbc5c182a165216c8a418246_2053 (rc -1)
Error inserting module '/tmp/stapSl51E6/stap_6ae410f1dbc5c182a165216c8a418246_2053.ko': Unknown symbol in module
Pass 5: run failed.  Try again with another '--vp 00001' option.
# dmesg -c
stap_6ae410f1dbc5c182a165216c8a418246_2053: Unknown symbol unmap_uretprobe
stap_6ae410f1dbc5c182a165216c8a418246_2053: Unknown symbol unmap_uprobe
stap_6ae410f1dbc5c182a165216c8a418246_2053: Unknown symbol unmap_uretprobe
stap_6ae410f1dbc5c182a165216c8a418246_2053: Unknown symbol unmap_uprobe
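
As a rough aid for reading output like the above, the distinct missing symbols can be pulled out of the kernel log with a short filter. The function name missing_symbols is hypothetical; the sketch assumes the "Unknown symbol NAME" message format shown in this comment.

```shell
# Print each distinct symbol named in "Unknown symbol" kernel messages.
# Reads the log text on stdin, so it works on live or saved dmesg output.
missing_symbols() {
    sed -n 's/.*Unknown symbol \([A-Za-z0-9_]*\).*/\1/p' | sort -u
}

# Typical use:
#   dmesg | missing_symbols
```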

Comment 8 Frank Ch. Eigler 2009-05-15 16:46:33 UTC
Then this is the situation mentioned in comment #4.
The manual mitigation is "rmmod uprobes" after installing
a new systemtap version.  (We could add that into
systemtap.spec, I suppose.)

Comment 9 Petr Muller 2009-05-18 09:25:12 UTC
Well, the funny thing is that rmmod uprobes doesn't help...

Comment 10 Frank Ch. Eigler 2009-05-20 17:14:59 UTC
I have a suspicion why this might be.

Maybe the uprobes.ko file built from the previous version of systemtap
is hanging around (it's not specifically cleaned out by the spec file).
And rpm timestamps might collude in such a way that when the new version
of systemtap rpm is installed, the old uprobes.ko still seems to be
newer than the (new) sources, thus not rebuilt.

Try a "make -C /usr/share/systemtap/runtime/uprobes clean" around the
time of the rpm upgrade.
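
The timestamp suspicion above can be checked by hand. The sketch below mimics, in a simplified way, the rule make applies: a target is considered up to date when it exists and no prerequisite is newer than it. The function name looks_up_to_date is hypothetical, and the file names in the usage comment are assumptions about what lives in the uprobes runtime directory.

```shell
# Return 0 if the target exists and none of the listed sources is newer,
# which is the condition under which make would skip rebuilding it.
looks_up_to_date() {
    target="$1"; shift
    [ -e "$target" ] || return 1
    for src in "$@"; do
        # find -newer prints $src only if it was modified after $target.
        if [ -n "$(find "$src" -newer "$target" 2>/dev/null)" ]; then
            return 1
        fi
    done
    return 0
}

# Typical use (file names assumed):
#   cd /usr/share/systemtap/runtime/uprobes
#   looks_up_to_date uprobes.ko *.c *.h && echo "make would not rebuild"
```

If this reports the old uprobes.ko as up to date right after an upgrade, that would be consistent with the module not being rebuilt from the new sources.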

Comment 11 Petr Muller 2009-05-21 12:10:35 UTC
Yeah. I've managed to reproduce the situation.

# rmmod uprobe;                                       doesn't help
# rm -rf ~/.systemtap/cache                           doesn't help
# rmmod uprobe; rm -rf ~/.systemtap/cache             doesn't help
# rm /usr/share/systemtap/runtime/uprobes/uprobes.ko  doesn't help
# everything above                                    doesn't help

but if I do the whole make clean as suggested in comment 10, it starts to work.

So we should probably either clean the module in %post, or at least document this.
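
Until the package does this itself, the manual recovery could be wrapped up roughly as follows. This is a sketch, not a tested procedure: the function name cleanup_old_uprobes and the RUN variable are hypothetical, and the only commands it uses are the ones already mentioned in this report. By default it only prints the steps.

```shell
# Print the cleanup steps, or execute them when RUN=1 (requires root).
cleanup_old_uprobes() {
    for cmd in \
        "make -C /usr/share/systemtap/runtime/uprobes clean" \
        "/sbin/rmmod uprobes"
    do
        if [ "${RUN:-0}" = "1" ]; then
            # A failing step is reported but does not stop the sequence,
            # since rmmod legitimately fails when uprobes is not loaded.
            sh -c "$cmd" || echo "step failed (continuing): $cmd" >&2
        else
            echo "$cmd"
        fi
    done
}

# Typical use:
#   cleanup_old_uprobes          # show what would be run
#   RUN=1 cleanup_old_uprobes    # actually run it, as root
```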

Comment 12 Frank Ch. Eigler 2009-05-21 12:48:16 UTC
Thank you for testing.  I can hack together a patch to that effect momentarily.

Comment 13 Frank Ch. Eigler 2009-05-21 14:33:54 UTC
See http://sources.redhat.com/PR10182.
Backporting commit #1208cc2.

Comment 17 Mark Wielaard 2009-07-21 19:34:38 UTC
One additional idea discussed on IRC was to also add the following to the spec file:

diff --git a/systemtap.spec b/systemtap.spec
index c3f6ea0..a8e0d9d 100644
--- a/systemtap.spec
+++ b/systemtap.spec
@@ -265,10 +265,12 @@ exit 0
 %post
 # Remove any previously-built uprobes.ko materials
 (make -C /usr/share/systemtap/runtime/uprobes clean) >/dev/null 2>&1 || true
+(/sbin/rmmod uprobes) >/dev/null 2>&1 || true
 
 %preun
 # Ditto
 (make -C /usr/share/systemtap/runtime/uprobes clean) >/dev/null 2>&1 || true
+(/sbin/rmmod uprobes) >/dev/null 2>&1 || true
 
 %files
 %defattr(-,root,root)

That won't work if there is a stap script running at the time, but if there isn't, we make sure all traces of the old uprobes module are gone.
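
Whether the rmmod has a chance of succeeding can be estimated beforehand from the module's reference count, which is the third field of its /proc/modules line. The sketch below is illustrative only; the function names uprobes_refcount and uprobes_removable are hypothetical, as is the optional file argument.

```shell
# Print the reference count of the uprobes module, or nothing if it is
# not loaded. Reads /proc/modules unless a saved copy is given.
uprobes_refcount() {
    awk '$1 == "uprobes" { print $3 }' "${1:-/proc/modules}"
}

# Return 0 only when uprobes is loaded and nothing holds a reference to it,
# for example no stap_* module from a running script.
uprobes_removable() {
    [ "$(uprobes_refcount "$@")" = "0" ]
}

# Typical use, as root:
#   uprobes_removable && /sbin/rmmod uprobes
```

A non-zero count usually means a running stap script still depends on uprobes, which is the case where the scriptlet's rmmod quietly does nothing.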

Comment 18 Don Domingo 2009-07-22 01:25:51 UTC
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
Running some user-space probe test cases provided by the systemtap-testsuite package fail with an 'Unknown symbol in module' error on some architectures. These
test cases include (but are not limited to):

systemtap.base/uprobes.exp
systemtap.base/bz10078.exp
systemtap.base/bz6850.exp
systemtap.base/bz5274.exp

Because of a known bug in the latest SystemTap update, new SystemTap installations do not unload old versions of the uprobes.ko module. Some updated user-space
probe provided by systemtap-testsuite package use symbols available only in the latest uprobes.ko module (also provided by the latest SystemTap update). As
such, running these user-space probe tests result in the error mentioned earlier. 

If you encounter this error, simply run 'rmmod uprobes' to manually remove the older uprobes.ko module before running the user-space probe test again.

Comment 19 Don Domingo 2009-07-22 01:27:01 UTC
Release note updated. If any revisions are required, please set the 
"requires_release_notes"  flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -6,7 +6,7 @@
 systemtap.base/bz6850.exp
 systemtap.base/bz5274.exp
 
-Because of a known bug in the latest SystemTap update, new SystemTap installations do not unload old versions of the uprobes.ko module. Some updated user-space
+Because of a known bug in the latest SystemTap update, new SystemTap installations do not properly unload old versions of the uprobes.ko module. Some updated user-space
 probe provided by systemtap-testsuite package use symbols available only in the latest uprobes.ko module (also provided by the latest SystemTap update). As
 such, running these user-space probe tests result in the error mentioned earlier.

Comment 20 Don Domingo 2009-07-22 01:36:20 UTC
Release note updated. If any revisions are required, please set the 
"requires_release_notes"  flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -7,7 +7,7 @@
 systemtap.base/bz5274.exp
 
 Because of a known bug in the latest SystemTap update, new SystemTap installations do not properly unload old versions of the uprobes.ko module. Some updated user-space
-probe provided by systemtap-testsuite package use symbols available only in the latest uprobes.ko module (also provided by the latest SystemTap update). As
+probes use symbols available only in the latest uprobes.ko module (also provided by the latest SystemTap update). As
 such, running these user-space probe tests result in the error mentioned earlier. 
 
 If you encounter this error, simply run 'rmmod uprobes' to manually remove the older uprobes.ko module before running the user-space probe test again.

Comment 26 errata-xmlrpc 2009-09-02 10:00:30 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1313.html