Systemtap failed to run with error:

/usr/share/systemtap/runtime/sym.c:1159:5: error: conflicting types for ‘kallsyms_on_each_symbol’; have ‘int(int (*)(void *, const char *, struct module *, long unsigned int), void *)’
 1159 | int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
      |     ^~~~~~~~~~~~~~~~~~~~~~~

Reproducible: Always

Steps to Reproduce:
1. Get a F38 System.
2. Create a file with the following content:

cat > simple-test.stp
# Script to test basic System Tap functionality.
global tickCounter = 0;
global vmallocCounter = 0;

function sayHello() %{
    printk("systemtap script says hello\n");
%}

probe begin {
    sayHello();
    printf("hello\n");
}

probe timer.ms(100) {
    tickCounter++;
}

function sayGoodbye() %{
    printk("systemtap script says goodbye\n");
%}

// Force use of some basic debug info.
probe kernel.function("vmalloc") {
    vmallocCounter++;
}

probe end {
    sayGoodbye();
    printf("counter = %d\nvmalloc = %d\nbye!\n", tickCounter, vmallocCounter);
}

3. run the following commands:
# stap -v -F -o out.log -g --disable-cache ./simple-test.stp

Actual Results:
Pass 1: parsed user script and 486 library scripts using 135860virt/107056res/15488shr/90928data kb, in 170usr/40sys/444real ms.
Pass 2: analyzed script: 4 probes, 2 functions, 0 embeds, 2 globals using 299308virt/201360res/20784shr/188804data kb, in 6030usr/1270sys/160938real ms.
Pass 3: translated to C into "/tmp/stapA4qqSi/stap_723_src.c" using 299328virt/201552res/20976shr/188824data kb, in 120usr/60sys/191real ms.
In file included from /usr/share/systemtap/runtime/linux/runtime.h:288,
                 from /usr/share/systemtap/runtime/runtime.h:26,
                 from /tmp/stapA4qqSi/stap_723_src.c:21:
/usr/share/systemtap/runtime/sym.c:1159:5: error: conflicting types for ‘kallsyms_on_each_symbol’; have ‘int(int (*)(void *, const char *, struct module *, long unsigned int), void *)’
 1159 | int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
      |     ^~~~~~~~~~~~~~~~~~~~~~~
In file included from ./include/linux/ftrace.h:13,
                 from ./include/linux/kprobes.h:28,
                 from /usr/share/systemtap/runtime/linux/runtime.h:21:
./include/linux/kallsyms.h:70:5: note: previous declaration of ‘kallsyms_on_each_symbol’ with type ‘int(int (*)(void *, const char *, long unsigned int), void *)’
   70 | int kallsyms_on_each_symbol(int (*fn)(void *, const char *, unsigned long),
      |     ^~~~~~~~~~~~~~~~~~~~~~~
/usr/share/systemtap/runtime/sym.c: In function ‘kallsyms_on_each_symbol’:
/usr/share/systemtap/runtime/sym.c:1166:85: error: passing argument 1 of ‘(int (*)(int (*)(void *, const char *, long unsigned int), void *))_stp_kallsyms_on_each_symbol’ from incompatible pointer type [-Werror=incompatible-pointer-types]
 1166 | return (* (kallsyms_on_each_symbol_fn)_stp_kallsyms_on_each_symbol)(fn, data);
      |                                                                     ^~
      |                                                                     |
      |                                                                     int (*)(void *, const char *, struct module *, long unsigned int)
/usr/share/systemtap/runtime/sym.c:1166:85: note: expected ‘int (*)(void *, const char *, long unsigned int)’ but argument is of type ‘int (*)(void *, const char *, struct module *, long unsigned int)’
In file included from ./include/linux/kernel.h:30,
                 from ./arch/x86/include/asm/percpu.h:27,
                 from ./arch/x86/include/asm/preempt.h:6,
                 from ./include/linux/preempt.h:78,
                 from ./include/linux/spinlock.h:56,
                 from ./include/linux/mmzone.h:8,
                 from ./include/linux/gfp.h:7,
                 from /usr/share/systemtap/runtime/linux/runtime_defines.h:20,
                 from /usr/share/systemtap/runtime/runtime_defines.h:8,
                 from /tmp/stapA4qqSi/stap_723_src.c:12:
/usr/share/systemtap/runtime/linux/print.c: In function ‘_stp_print_kernel_info’:
/usr/share/systemtap/runtime/linux/print.c:365:43: error: ‘struct module’ has no member named ‘module_core’
  365 | (unsigned long) THIS_MODULE->module_core,
      |                            ^~
./include/linux/printk.h:427:33: note: in definition of macro ‘printk_index_wrap’
  427 | _p_func(_fmt, ##__VA_ARGS__); \
      |                 ^~~~~~~~~~~
/usr/share/systemtap/runtime/linux/print.c:348:9: note: in expansion of macro ‘printk’
  348 | printk(KERN_DEBUG
      | ^~~~~~
/usr/share/systemtap/runtime/linux/print.c:366:44: error: ‘struct module’ has no member named ‘core_size’
  366 | (unsigned long) (THIS_MODULE->core_size - THIS_MODULE->core_text_size)/1024,
      |                             ^~
./include/linux/printk.h:427:33: note: in definition of macro ‘printk_index_wrap’
  427 | _p_func(_fmt, ##__VA_ARGS__); \
      |                 ^~~~~~~~~~~
/usr/share/systemtap/runtime/linux/print.c:348:9: note: in expansion of macro ‘printk’
  348 | printk(KERN_DEBUG
      | ^~~~~~
/usr/share/systemtap/runtime/linux/print.c:366:71: error: ‘struct module’ has no member named ‘core_text_size’; did you mean ‘kprobes_text_size’?
  366 | (unsigned long) (THIS_MODULE->core_size - THIS_MODULE->core_text_size)/1024,
      |                                                        ^~~~~~~~~~~~~~
./include/linux/printk.h:427:33: note: in definition of macro ‘printk_index_wrap’
  427 | _p_func(_fmt, ##__VA_ARGS__); \
      |                 ^~~~~~~~~~~
/usr/share/systemtap/runtime/linux/print.c:348:9: note: in expansion of macro ‘printk’
  348 | printk(KERN_DEBUG
      | ^~~~~~
/usr/share/systemtap/runtime/linux/print.c:367:46: error: ‘struct module’ has no member named ‘core_text_size’; did you mean ‘kprobes_text_size’?
  367 | (unsigned long) (THIS_MODULE->core_text_size)/1024,
      |                               ^~~~~~~~~~~~~~
./include/linux/printk.h:427:33: note: in definition of macro ‘printk_index_wrap’
  427 | _p_func(_fmt, ##__VA_ARGS__); \
      |                 ^~~~~~~~~~~
/usr/share/systemtap/runtime/linux/print.c:348:9: note: in expansion of macro ‘printk’
  348 | printk(KERN_DEBUG
      | ^~~~~~
cc1: all warnings being treated as errors
make[1]: *** [scripts/Makefile.build:252: /tmp/stapA4qqSi/stap_723_src.o] Error 1

Expected Results:
No error.
Please share your version of stap, and run dnf update if you haven't already.
I can reproduce this with systemtap-4.9-1.fc38.x86_64 and kernel-6.4.7-200.fc38.x86_64. It looks like some of the fixes for the 6.4 kernels need to be backported to the Fedora 38 (and 37) systemtap packages. Looking through the git commits, the backport should need:

33fae2d0107fb6166b4eac3fdffd277829849ab0
5251b3060790faafa9f94c14801baaa76a2bf8ea
fc6519089d3f9366470ce442b648d69ed9b56f53
56054abb4efb3ef95808306b2f22339ab5c96352

It might also need the following if the build complains about "zero length arrays":

788c58ced532537b87f596355d3e9b6dec30e61a

As a quick test I did a local rpm build of the current systemtap git repo checkout, systemtap-5.0-0.1.202308081455.fc38.x86_64, installed it, and ran the reproducer. It ran without issue:

[wcohen@fedora38 bz2230079]$ rpm -q systemtap kernel
systemtap-5.0-0.1.202308081455.fc38.x86_64
kernel-6.3.11-200.fc38.x86_64
kernel-6.4.7-200.fc38.x86_64
[wcohen@fedora38 bz2230079]$ sudo stap -v -F -o out.log -g --disable-cache ./simple-test.stp
Pass 1: parsed user script and 535 library scripts using 532936virt/289256res/14848shr/273804data kb, in 610usr/80sys/692real ms.
Pass 2: analyzed script: 4 probes, 2 functions, 0 embeds, 2 globals using 632092virt/389468res/16476shr/372960data kb, in 1290usr/30sys/1340real ms.
Pass 3: translated to C into "/tmp/stapFxAjgw/stap_40645_src.c" using 632092virt/389596res/16604shr/372960data kb, in 10usr/10sys/12real ms.
Pass 4: compiled C into "stap_40645.ko" in 18950usr/2470sys/11794real ms.
Pass 5: starting run.
Pass 5: run completed in 10usr/20sys/29real ms.
41504
The systemtap version is systemtap.x86_64 4.9-1.fc38. I made sure the kernel and debug versions are the same: 6.4.7.

chuung
I have encountered the same problem with:

systemtap-4.9-1.fc38.x86_64
kernel-6.4.8-200.fc38.x86_64
kernel-devel-6.4.8-200.fc38.x86_64
kernel-debuginfo-6.4.8-200.fc38.x86_64

I think it will affect all scripts, since I see it even with this trivial one:

probe begin {
    printf("BEGIN\n");
}
probe end {
    printf("END\n");
}

Raising severity to "high", since AFAICT this makes systemtap unusable on the current kernel.
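The package lists in these comments are checked by eye. A minimal sketch of doing the same comparison in a script follows, assuming the rpm -q output format shown in this thread. The function name check_kernel_pkgs is made up for illustration, and it only compares strings: it says nothing about whether systemtap itself supports that kernel.

```shell
# check_kernel_pkgs RELEASE
# Reads `rpm -q` style package names on stdin and reports whether
# kernel-devel and kernel-debuginfo exist for exactly RELEASE
# (for example the string printed by `uname -r`).
# Hypothetical helper; returns non-zero if either package is missing.
check_kernel_pkgs() {
    release=$1
    pkgs=$(cat)
    status=0
    for name in kernel-devel kernel-debuginfo; do
        # -F: fixed string, -x: whole-line match, -q: no output
        if printf '%s\n' "$pkgs" | grep -Fqx -- "$name-$release"; then
            echo "ok: $name-$release"
        else
            echo "missing: $name-$release"
            status=1
        fi
    done
    return $status
}
```

A possible invocation on the affected machine would be: rpm -q kernel-devel kernel-debuginfo | check_kernel_pkgs "$(uname -r)"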
I am working on getting f37/f38 systemtap rpms built with the patches mentioned above for linux-6.4 support.
Bodhi updates filed for systemtap builds with the fixes:

f38 https://bodhi.fedoraproject.org/updates/FEDORA-2023-4617bf01d3
f37 https://bodhi.fedoraproject.org/updates/FEDORA-2023-1fb197c4ff
Alas, the version from comment 6 does not appear to work correctly either. Indeed, it's arguably worse, with problems now showing up on the kernel side. I no longer get a compile error, but running 'sudo stap trivial.stp' hangs before printing the "BEGIN" message. Weirdly, strace indicates that stap is in wait4() waiting on a specific PID, and ps shows that process (stapio) already in zombie state. It also triggers a kernel oops; I'll attach the trace.

This occurs with:

systemtap-runtime-4.9-2.fc38.x86_64
systemtap-client-4.9-2.fc38.x86_64
systemtap-devel-4.9-2.fc38.x86_64
systemtap-4.9-2.fc38.x86_64
kernel-modules-core-6.4.9-200.fc38.x86_64
kernel-core-6.4.9-200.fc38.x86_64
kernel-modules-6.4.9-200.fc38.x86_64
kernel-debuginfo-common-x86_64-6.4.9-200.fc38.x86_64
kernel-debuginfo-6.4.9-200.fc38.x86_64
kernel-6.4.9-200.fc38.x86_64
kernel-modules-extra-6.4.9-200.fc38.x86_64
kernel-devel-6.4.9-200.fc38.x86_64

$ uname -a
Linux zatzit 6.4.9-200.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Aug 8 21:21:11 UTC 2023 x86_64 GNU/Linux
Created attachment 1982902: kernel oops triggered by systemtap-4.9-2.fc38.x86_64

Oops log as described.
@dgibson The kernel oops indicates this is related to Intel CET support being enabled:

[ 684.201414] traps: Missing ENDBR: kallsyms_lookup_name+0x0/0xd0
[ 684.201461] RIP: 0010:kallsyms_lookup_name+0x0/0xd0
[ 684.201463] Code: 79 0a 48 f7 d0 48 03 05 56 41 5b 01 c3 cc cc cc cc 66 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 <66> 0f 1f 00 0f 1f 44 00 00 53 48 83 ec 10 65 48 8b 04 25 28 00 00
[ 684.201464] RSP: 0018:ffffa1d0853a7db8 EFLAGS: 00010282
[ 684.201465] RAX: ffffffff9e206980 RBX: 00007ffc71cedf54 RCX: 0000000000000000
[ 684.201465] RDX: 0000000080000000 RSI: ffff902a4040fb50 RDI: ffffffffc1b4c3b5
[ 684.201466] RBP: 0000000000000008 R08: ffff902a4040fb78 R09: ffffffffa005d6a0
[ 684.201467] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 684.201467] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 684.201468] ? __pfx_kallsyms_lookup_name+0x10/0x10
[ 684.201472] _stp_ctl_write_cmd+0x46b/0xbe0 [stap_6f8266ac9ff80bbafe256ed5ed9b11a_2890]
[ 684.201478] ? inode_security+0x22/0x60

Would it be possible for you to disable Intel CET on the machine and verify that?

Looking at the disassembled /usr/lib/debug/lib/modules/6.4.9-200.fc38.x86_64/vmlinux, there appears to be an endbr64 instruction there, so I am not sure why it would trip on that:

ffffffff811dda40 <module_kallsyms_lookup_name>:
ffffffff811dda40: f3 0f 1e fa       endbr64
ffffffff811dda44: e8 c7 6a ea ff    call ffffffff81084510 <__fentry__>
ffffffff811dda49: 55                push %rbp
ffffffff811dda4a: 48 89 fd          mov %rdi,%rbp
ffffffff811dda4d: 53                push %rbx
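To check whether another machine hits the same trap, the "Missing ENDBR" lines can be pulled out of a kernel log. This is only a sketch: the function name missing_endbr_symbols is made up, and the pattern assumes the message format quoted in the oops above.

```shell
# missing_endbr_symbols [FILE...]
# Prints the symbol named in every "traps: Missing ENDBR:" line of a
# kernel log, reading stdin when no FILE is given.
# Hypothetical helper; the "+offset/size" suffix is dropped.
missing_endbr_symbols() {
    sed -n 's/.*traps: Missing ENDBR: \([^+ ]*\).*/\1/p' "$@"
}
```

For example, dmesg | missing_endbr_symbols (reading dmesg may require root) should print kallsyms_lookup_name on a system showing the oops from this comment.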
Ah, it looks like systemtap is using __pfx_kallsyms_lookup_name rather than kallsyms_lookup_name. From /proc/kallsyms:

ffffffffa7206970 T __pfx_kallsyms_lookup_name
ffffffffa7206980 T kallsyms_lookup_name

we can see that it is calling __pfx_kallsyms_lookup_name rather than kallsyms_lookup_name:

ffffffff81206970 <__pfx_kallsyms_lookup_name>:
ffffffff81206970: 90                nop
ffffffff81206971: 90                nop
ffffffff81206972: 90                nop
ffffffff81206973: 90                nop
ffffffff81206974: 90                nop
ffffffff81206975: 90                nop
ffffffff81206976: 90                nop
ffffffff81206977: 90                nop
ffffffff81206978: 90                nop
ffffffff81206979: 90                nop
ffffffff8120697a: 90                nop
ffffffff8120697b: 90                nop
ffffffff8120697c: 90                nop
ffffffff8120697d: 90                nop
ffffffff8120697e: 90                nop
ffffffff8120697f: 90                nop

ffffffff81206980 <kallsyms_lookup_name>:
ffffffff81206980: f3 0f 1e fa       endbr64
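Because the two symbol names differ only by the __pfx_ prefix, a substring search of /proc/kallsyms returns both entries. The sketch below looks up an address by exact name instead. The function name kallsyms_addr is made up, and this is not a description of how systemtap itself resolves the symbol, which this thread does not show.

```shell
# kallsyms_addr NAME [FILE]
# Prints the address of the symbol whose name is exactly NAME, reading
# FILE in /proc/kallsyms format (defaults to /proc/kallsyms).
# Comparing the whole third field keeps __pfx_NAME from matching NAME.
# Hypothetical helper; prints nothing if the symbol is absent.
kallsyms_addr() {
    awk -v sym="$1" '$3 == sym { print $1; exit }' "${2:-/proc/kallsyms}"
}
```

On a live system this would be run as root, for example kallsyms_addr kallsyms_lookup_name, since unprivileged reads of /proc/kallsyms may show all-zero addresses.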
I have a machine with CET-IBT support, and I have verified that turning off X86_FEATURE_IBT by adding the following to the kernel boot parameters allows the systemtap instrumentation to run correctly:

clearcpuid=596

That doesn't address the basic problem that systemtap is using __pfx_kallsyms_lookup_name rather than kallsyms_lookup_name, but it does allow one to use systemtap on Intel systems with CET-IBT support (those that show "CET detected: Indirect Branch Tracking enabled" in the boot dmesg output and ibt in the /proc/cpuinfo flags).
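Whether a given machine is affected can be checked from the cpuinfo flags mentioned above. The following is a minimal sketch; the function name has_ibt_flag is made up, and it only tests for the flag, not for whether the kernel actually enabled IBT at boot.

```shell
# has_ibt_flag [FILE]
# Succeeds (exit 0) if a "flags" line in FILE lists "ibt" as a whole
# word; FILE defaults to /proc/cpuinfo.
# Hypothetical helper; flags such as "ibtx" would not match.
has_ibt_flag() {
    awk -F: '
        $1 ~ /^flags[ \t]*$/ {
            n = split($2, f, " ")
            for (i = 1; i <= n; i++)
                if (f[i] == "ibt") found = 1
        }
        END { exit !found }
    ' "${1:-/proc/cpuinfo}"
}
```

If has_ibt_flag succeeds, the clearcpuid=596 workaround from this comment is a candidate. On Fedora, one way to add the parameter is probably grubby --update-kernel=ALL --args="clearcpuid=596", though the thread does not say how it was applied here.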
I can confirm systemtap seems to work again (at least with the trivial script) when I add 'clearcpuid=596' to the kernel command line. I also updated to kernel-6.4.10-200.fc38.x86_64 and verified that it still fails with that kernel but without the change to the command line.
The issue with the Intel IBT support is not related to this particular bug of compiling code for the linux 6.4 kernels. There has been an upstream systemtap bug filed about systemtap not working with Intel IBT support, https://sourceware.org/bugzilla/show_bug.cgi?id=30777
The originally reported issue with kallsyms_on_each_symbol() has been addressed. The fix for the IBT issue (https://sourceware.org/bugzilla/show_bug.cgi?id=30777) is in the upstream systemtap git repo and should be in the next release of systemtap.