Created attachment 334340 [details] tcl spec patch to enable sdt Please consider adding static marker support for Systemtap. This is part of the Fedora Systemtap Static Probes effort: https://fedoraproject.org/wiki/Features/SystemtapStaticProbes We are still creating a good use case for tcl static markers, but some examples of what is possible when enabling this for postgres can be seen in this blog post: http://gnu.wildebeest.org/diary/2009/02/24/systemtap-09-markers-everywhere/ Attached is a patch for the tcl.spec file.
Hello Stan, you've mentioned some patch on your project patch. Does it mean only the patch of spec? I tried to build this in rawhide but I wasn't able to compile it. There will be needed some patches of source code. I'll contact upstream. As you can see at http://wiki.tcl.tk/19923 it wasn't build on linux yet.
Upstream ticket: https://sourceforge.net/tracker2/index.php?func=detail&aid=2685152&group_id=10894&atid=110894 Are you sure you want this feature in F-10? There is not enough time for testing dependent application.
You will need a newer version of the systemtap sys/sdt.h file to get rid of those duplicate definitions. Unfortunately I see the fix isn't in the current rawhide systemtap-sdt-devel package. We will have to create a new version. I would suggest to first get this working perfectly in rawhide before backporting to f10.
I try it first in rawhide. Ping me when the fix will be in rawhides systemtap.
systemtap 0.9.5 was released with a host of sdt.h cleanups. It is already in rawhide and has a pending update for f10: http://sourceware.org/ml/systemtap/2009-q1/msg00951.html https://admin.fedoraproject.org/updates/systemtap-0.9.5-1.fc10 One item missing though, that would make things more efficient is real support for the probe_ENABLED() macro. I hadn't noticed tcl uses this facility before. It will still work (because the macro is defined), but there might be a small performance impact. If you have good tcl benchmark then we could try and measure if it really has an impact and possibly add that extension.
There are some tests in tcl source but I don't think any of them can measure performance impact well. You can check them. What's the _small_ impact? Could be huge for someone who depend on tcl. Hard to say if we should push it into F-11 without further testing.
The probes themselves are lightweight. The primary overhead is that in some cases gcc may require loading probe arguments, e.g. if the source does: TCL_CMD_ARGS(a0, a1, a2, a3, a4, a5, a6, a7, a8, a9) then the probe macro makes available: arg0=a0 arg1=a1 ... arg9=a9 and gcc may need to load those Supporting probe_ENABLED() would mean there would need to be a way for the application to determine whether stap is handling a particular probe. There is currently no method for determining that, and any method would need to be lightweight.
(In reply to comment #6) > What's the _small_ impact? Could be huge for someone who depend on tcl. Hard to > say if we should push it into F-11 without further testing. We don't know with respect to TCL. In theory there shouldn't be any overhead, except for setting up the arguments as Stan said. But none of us is a tcl expert. If you have a pointer to a good TCL benchmark or a program that is performance sensitive we can do some tests with and without probes being compiled in for you and provide some numbers. BTW. Upstream bug tracking addition of (working) probe_ENABLED() is http://sourceware.org/bugzilla/show_bug.cgi?id=10013
I didn't find any useful benchmarks yet, but I try tests which are included in tcl source. The obj.test is eating all cpu and fail with dst because is trying to run additional dst tests. So that's probably the first thing to solve. I'll be looking at this, but tcl with dst definitely can't make it to F-11.
Created attachment 338514 [details] tcl probes
I'm not quite sure how to run the tests, but I tried this and did not seem to have any particular problems. tcl.stp is my systemtap probe file (attached) LD_LIBRARY_PATH=LIBTCLPATH stap -c '../unix/tclsh all.tcl' ../../../../tcl.stp ../unix/libtcl8.5.so ... Tests ended at Tue Apr 07 10:49:22 EDT 2009 all.tcl: Total 24302 Passed 19615 Skipped 4653 Failed 34 Sourced 137 Test Files. Files with failing tests: clock.test dict.test execute.test expr.test fileName.test info.test regexp.test regexpComp.test
I have the latest kernel and kernel-devel. LD_LIBRARY_PATH=LIBTCLPATH stap -c '../unix/tclsh all.tcl' ~/tcl.stp ./unix/libtcl8.5.so SystemTap's version of uprobes is out of date. As root, run "make -C /usr/share/systemtap/runtime/uprobes". Pass 4: compilation failed. Try again with another '--vp 0001' option. A one test could be run as make test TESTFLAGS="-file ../tests/obj.test" For running test you should be in tcl8.5.6/unix directory.
I wonder if you are familiar with koji failure on ppc(64) with systemtap support. The build pass just fine on local ppc, but fails on koji. http://koji.fedoraproject.org/koji/getfile?taskID=1362745&name=build.log Maybe koji needs to set some other BR or environment variable...
Ha, so the build passed on local ppc with systemtap-0.9.5-1.fc11.ppc and the failure is there since systemtap-0.9.7-1.fc11.ppc. Could be something broken since 0.9.7?
Error: junk at end of line: `0' sounds familiar, that was most likely introduced in: commit 1ce4311fbb8e3191449f919b784cea355a0cfe00 Author: Stan Cox <scox> Date: Thu Apr 23 16:43:44 2009 -0400 Avoid a uprobe break setting problem by avoiding 'nop 0' on x86. * sdt.h (STAP_NOP): New. (STAP_PROBE): Use it. I don't have a powerpc machine around. But I suspect it might be fixed by something like: diff --git a/includes/sys/sdt.h b/includes/sys/sdt.h index 5899549..fd2c55f 100644 --- a/includes/sys/sdt.h +++ b/includes/sys/sdt.h @@ -59,7 +59,7 @@ #define STAP_UNINLINE_LABEL(label) \ __extension__ static volatile long labelval __attribute__ ((unused)) = (long -#if defined(__x86_64__) || defined(__i386__) +#if defined(__x86_64__) || defined(__i386__) || defined(__powerpc__) #define STAP_NOP "\tnop " #else #define STAP_NOP "\tnop 0 "
The "nop problem" has been reported upstream: http://sourceware.org/bugzilla/show_bug.cgi?id=10172
Upstream "nop problem" was tested and fixed in git. http://sourceware.org/git/?p=systemtap.git;a=commit;h=78c9d72e18da2d5c930fb39460c236ea24fee423
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle. Changing version to '11'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Regarding measuring the performance impact of probes, there is 'tclbench', described on the Tcl wiki here: http://wiki.tcl.tk/1611