Description of problem:
Some of the uprobes testcases in systemtap-testsuite which were working in RHEL5.3 systemtap stopped working in RHEL5.4 candidate systemtap. It seems architecture independent. I'm speaking about these testcases, at minimum (maybe there are more, I'll add them as I work through the logs):

systemtap.base/uprobes.exp
systemtap.base/bz10078.exp
systemtap.base/bz6850.exp

Version-Release number of selected component (if applicable):
# rpm -q systemtap
systemtap-0.9.7-1.el5

How reproducible:
always

Steps to Reproduce:
1. cd /usr/share/systemtap/testsuite/systemtap.base
2. gcc -g -o bz6850 bz6850.c
3. stap bz6850.stp -v -c ./bz6850

Actual results:
# stap bz6850.stp -v -c ./bz6850
Pass 1: parsed user script and 52 library script(s) in 340usr/0sys/346real ms.
Pass 2: analyzed script: 10 probe(s), 1 function(s), 1 embed(s), 0 global(s) in 0usr/0sys/5real ms.
Pass 3: translated to C into "/tmp/stapIaP4ye/stap_43bc8a9e4cf003378d3da8dc7eb20d4f_4089.c" in 40usr/180sys/217real ms.
Pass 4, preamble: (re)building SystemTap's version of uprobes.
Pass 4: compiled C into "stap_43bc8a9e4cf003378d3da8dc7eb20d4f_4089.ko" in 1510usr/190sys/1663real ms.
Pass 5: starting run.
Error inserting module '/tmp/stapIaP4ye/stap_43bc8a9e4cf003378d3da8dc7eb20d4f_4089.ko': Unknown symbol in module
Retrying, after attempted removal of module stap_43bc8a9e4cf003378d3da8dc7eb20d4f_4089 (rc -1)
Error inserting module '/tmp/stapIaP4ye/stap_43bc8a9e4cf003378d3da8dc7eb20d4f_4089.ko': Unknown symbol in module
Pass 5: run completed in 0usr/20sys/24real ms.
Pass 5: run failed. Try again with another '--vp 00001' option.

Expected results:
# stap bz6850.stp -v -c ./bz6850
Pass 1: parsed user script and 45 library script(s) in 320usr/0sys/330real ms.
Pass 2: analyzed script: 10 probe(s), 1 function(s), 0 embed(s), 0 global(s) in 0usr/0sys/4real ms.
Pass 3: translated to C into "/tmp/stapmY7fyM/stap_ca87291723d3deb3bc778f1dbb434e33_3683.c" in 50usr/160sys/204real ms.
Pass 4, preamble: (re)building SystemTap's version of uprobes.
Uprobes (re)build complete.
Pass 4: compiled C into "stap_ca87291723d3deb3bc778f1dbb434e33_3683.ko" in 2530usr/320sys/2919real ms.
Pass 5: starting run.
main called
fork_and_exec1 called
fork_and_exec2 called
fork1 called
fork2 called
fork2 returns
fork1 returns
fork_and_exec2 returns
fork_and_exec1 returns
Pass 5: run completed in 10usr/30sys/50real ms.

Additional info:
Seems architecture independent, I'm seeing this on all five RHEL architectures. I'll attach the stap-report output
systemtap.base/bz5274.exp is the next one affected by this
Created attachment 342867
stap-report output

... on the i386 box. The environment on the other arch boxes should be identical.
This happens from time to time if an old uprobes is still loaded in the kernel and the systemtap build update brings in a new one. Could you check whether uprobes is still loaded:

lsmod | grep uprobes

And if so, rmmod uprobes and try the tests again?
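The check suggested in this comment can also be scripted. What follows is a minimal sketch, assuming /proc/modules lists loaded modules one per line with the module name in the first field. The helper name uprobes_loaded and its optional file argument are hypothetical; the argument exists only so the check can be tried against a sample file.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a module named "uprobes" is listed.
# The file argument defaults to /proc/modules.
uprobes_loaded() {
    modules_file="${1:-/proc/modules}"
    # Match on the whole first field so that names which merely
    # contain "uprobes" are not counted.
    awk '$1 == "uprobes" { found = 1 } END { exit !found }' "$modules_file"
}

# Usage (as root): unload the module only if it is present.
# if uprobes_loaded; then /sbin/rmmod uprobes; fi
```

Matching the exact module name avoids the false positives that a plain grep for "uprobes" could give.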
The module was loaded, but even rmmod-ing it hasn't solved it. I see the same output, at least for the 'stap bz6850.stp -v -c ./bz6850' and uprobes.exp testcases (I stopped trying after these). I also tried erasing the cache to make systemtap rebuild everything from scratch; it didn't help.
What does dmesg say as to the missing symbol? We haven't encountered anything like this, except when (as mjw says) an obsolete uprobes.ko was hanging around.
That's strange... when I try it on a totally fresh RHEL5.4 machine, the problem doesn't occur. Even when I downgrade, run the testcase alone, and upgrade, I'm not seeing it. But I can reliably reproduce it by running the whole new testsuite with the old systemtap. After this happens, I am not able to do anything to make it run.

# dmesg -c
# stap bz10078.stp
Error inserting module '/tmp/stapSl51E6/stap_6ae410f1dbc5c182a165216c8a418246_2053.ko': Unknown symbol in module
Retrying, after attempted removal of module stap_6ae410f1dbc5c182a165216c8a418246_2053 (rc -1)
Error inserting module '/tmp/stapSl51E6/stap_6ae410f1dbc5c182a165216c8a418246_2053.ko': Unknown symbol in module
Pass 5: run failed. Try again with another '--vp 00001' option.
# dmesg -c
stap_6ae410f1dbc5c182a165216c8a418246_2053: Unknown symbol unmap_uretprobe
stap_6ae410f1dbc5c182a165216c8a418246_2053: Unknown symbol unmap_uprobe
stap_6ae410f1dbc5c182a165216c8a418246_2053: Unknown symbol unmap_uretprobe
stap_6ae410f1dbc5c182a165216c8a418246_2053: Unknown symbol unmap_uprobe
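To pull just the missing symbol names out of output like this, something along the following lines might help. It is a sketch that assumes the "<module>: Unknown symbol <name>" message format seen in this comment; the helper name missing_symbols is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on standard input and print
# each distinct symbol reported as "<module>: Unknown symbol <name>".
# Lines such as stap's own "Unknown symbol in module" error are skipped
# because the pattern requires the symbol name to end the line.
missing_symbols() {
    sed -n 's/.*: Unknown symbol \([A-Za-z0-9_][A-Za-z0-9_]*\)$/\1/p' | sort -u
}

# Usage: dmesg | missing_symbols
# Each name can then be looked up among the symbols of the running
# kernel and its modules, for example:
#   grep -w unmap_uprobe /proc/kallsyms
```

If a name is absent from /proc/kallsyms while uprobes is loaded, the loaded uprobes module most likely predates the one the stap module was built against.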
Then this is the situation mentioned in comment #4. The manual mitigation is "rmmod uprobes" after installing a new systemtap version. (We could add that into systemtap.spec, I suppose.)
Well the funny thing is that rmmod uprobes doesn't help...
I have a suspicion why this might be. Maybe the uprobes.ko file built from the previous version of systemtap is hanging around (it's not specifically cleaned out by the spec file). And rpm timestamps might collude in such a way that when the new version of systemtap rpm is installed, the old uprobes.ko still seems to be newer than the (new) sources, thus not rebuilt. Try a "make -C /usr/share/systemtap/runtime/uprobes clean" around the time of the rpm upgrade.
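One way to test this suspicion before cleaning anything is to compare timestamps directly. The sketch below assumes the rebuild decision depends only on whether any .c or .h file in the uprobes directory is newer than uprobes.ko; the real Makefile may track other files. The helper name ko_looks_up_to_date is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if uprobes.ko exists in the given
# directory and no .c or .h file under it is newer than the module,
# which is the case in which a timestamp-based build would skip it.
ko_looks_up_to_date() {
    dir="${1:-/usr/share/systemtap/runtime/uprobes}"
    ko="$dir/uprobes.ko"
    [ -f "$ko" ] || return 1
    newer=$(find "$dir" -type f \( -name '*.c' -o -name '*.h' \) -newer "$ko")
    [ -z "$newer" ]
}

# Usage, right after upgrading the systemtap rpm:
# if ko_looks_up_to_date; then echo "uprobes.ko would not be rebuilt"; fi
```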
Yeah. I've managed to reproduce the situation.

# rmmod uprobe
doesn't help
# rm -rf ~/.systemtap/cache
doesn't help
# rmmod uprobe; rm -rf ~/.systemtap/cache
doesn't help
# rm /usr/share/systemtap/runtime/uprobes/uprobes.ko
doesn't help
# everything above
doesn't help

But if I do the whole make clean as suggested in comment 10, it starts to work.

So we should probably either clean the module in %post, or at least document this.
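For anyone hitting this before a fixed package is available, the combination that worked here can be written down as a small script. This is only a sketch: the function name uprobes_reset and the RUN switch are hypothetical, and by default it just prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical helper: unload uprobes and clean its build directory.
# Prints the commands unless RUN=1 is set in the environment.
uprobes_reset() {
    dir="${1:-/usr/share/systemtap/runtime/uprobes}"
    if [ "${RUN:-0}" = 1 ]; then
        # Failures are ignored: the module may not be loaded at all.
        /sbin/rmmod uprobes >/dev/null 2>&1 || true
        make -C "$dir" clean >/dev/null 2>&1 || true
    else
        echo "/sbin/rmmod uprobes"
        echo "make -C $dir clean"
    fi
}

# Usage: uprobes_reset          (dry run, prints the two commands)
#        RUN=1 uprobes_reset    (as root, actually runs them)
```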
Thank you for testing. I can hack together a patch to that effect momentarily.
See http://sources.redhat.com/PR10182. Backporting commit #1208cc2.
One additional idea discussed on irc was to also add the following to the spec file:

diff --git a/systemtap.spec b/systemtap.spec
index c3f6ea0..a8e0d9d 100644
--- a/systemtap.spec
+++ b/systemtap.spec
@@ -265,10 +265,12 @@ exit 0
 %post
 # Remove any previously-built uprobes.ko materials
 (make -C /usr/share/systemtap/runtime/uprobes clean) >/dev/null 2>&1 || true
+(/sbin/rmmod uprobes) >/dev/null 2>&1 || true
 
 %preun
 # Ditto
 (make -C /usr/share/systemtap/runtime/uprobes clean) >/dev/null 2>&1 || true
+(/sbin/rmmod uprobes) >/dev/null 2>&1 || true
 
 %files
 %defattr(-,root,root)

That won't work if there is a stap script running at the time, but if there isn't, then we make sure all traces of the old uprobes module are gone.
Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Running some user-space probe test cases provided by the systemtap-testsuite package fail with an 'Unknown symbol in module' error on some architectures. These test cases include (but are not limited to):

systemtap.base/uprobes.exp
systemtap.base/bz10078.exp
systemtap.base/bz6850.exp
systemtap.base/bz5274.exp

Because of a known bug in the latest SystemTap update, new SystemTap installations do not unload old versions of the uprobes.ko module. Some updated user-space probe provided by systemtap-testsuite package use symbols available only in the latest uprobes.ko module (also provided by the latest SystemTap update). As such, running these user-space probe tests result in the error mentioned earlier.

If you encounter this error, simply run 'rmmod uprobes' to manually remove the older uprobes.ko module before running the user-space probe test again.
Release note updated. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -6,7 +6,7 @@
 systemtap.base/bz6850.exp
 systemtap.base/bz5274.exp
 
-Because of a known bug in the latest SystemTap update, new SystemTap installations do not unload old versions of the uprobes.ko module. Some updated user-space
+Because of a known bug in the latest SystemTap update, new SystemTap installations do not properly unload old versions of the uprobes.ko module. Some updated user-space
 probe provided by systemtap-testsuite package use symbols available only in the latest uprobes.ko module (also provided by the latest SystemTap update). As
 such, running these user-space probe tests result in the error mentioned earlier.
Release note updated. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -7,7 +7,7 @@
 systemtap.base/bz5274.exp
 
 Because of a known bug in the latest SystemTap update, new SystemTap installations do not properly unload old versions of the uprobes.ko module. Some updated user-space
-probe provided by systemtap-testsuite package use symbols available only in the latest uprobes.ko module (also provided by the latest SystemTap update). As
+probes use symbols available only in the latest uprobes.ko module (also provided by the latest SystemTap update). As
 such, running these user-space probe tests result in the error mentioned earlier.
 
 If you encounter this error, simply run 'rmmod uprobes' to manually remove the older uprobes.ko module before running the user-space probe test again.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1313.html