Bug 2094937 - kretprobe variable access broken on recent kernels (5.11+?)
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: systemtap
Version: 36
Hardware: Unspecified
OS: Unspecified
Assignee: Frank Ch. Eigler
QA Contact: Fedora Extras Quality Assurance
Reported: 2022-06-08 16:35 UTC by Bryn M. Reeves
Modified: 2023-05-25 16:14 UTC
Last Closed: 2023-05-25 16:14:37 UTC
Type: Bug

Description Bryn M. Reeves 2022-06-08 16:35:38 UTC
Description of problem:
Attempting to access context variables from a kernel function .return probe gives a compiler error on Fedora 36 / 5.17.5-300.fc36.x86_64:

  # stap -e 'probe kernel.function("dm_lock_for_deletion").return { printf("%s", kernel_string($md->name)) }'
  WARNING: confusing usage, value is captured as @entry($md->name) in .return probe [man stapprobes] RETURN PROBES: identifier '$md' at <input>:1:83
   source: probe kernel.function("dm_lock_for_deletion").return { printf("%s", kernel_string($md->name)) }
                                                                                             ^
  warning: the compiler differs from the one used to build the kernel
    The kernel was built by: gcc (GCC) 12.0.1 20220413 (Red Hat 12.0.1-0)
    You are using:           gcc (GCC) 12.1.1 20220507 (Red Hat 12.1.1-1)
  /tmp/stap3oyVT2/stap_f6ccae714cb9f51153235f45abadb532_2993_src.c: In function ‘_kretprobe_data’:
  /tmp/stap3oyVT2/stap_f6ccae714cb9f51153235f45abadb532_2993_src.c:34:46: error: ‘struct kretprobe_instance’ has no member named ‘rp’; did you mean ‘rph’?
     34 |         if (end > offset && pi && end <= pi->rp->data_size)
        |                                              ^~
        |                                              rph
  make[1]: *** [scripts/Makefile.build:288: /tmp/stap3oyVT2/stap_f6ccae714cb9f51153235f45abadb532_2993_src.o] Error 1
  make: *** [Makefile:1841: /tmp/stap3oyVT2] Error 2
  WARNING: kbuild exited with status: 2
  Pass 4: compilation failed.  [man error::pass4]

This is due to upstream commit d741bf4 (v5.11-rc1~176^2~6):

  commit d741bf41d7c7db4898bacfcb020353cddc032fd8
  Author: Peter Zijlstra <peterz>
  Date:   Sat Aug 29 22:03:24 2020 +0900

    kprobes: Remove kretprobe hash
    
    The kretprobe hash is mostly superfluous, replace it with a per-task
    variable.
    
    This gets rid of the task hash and it's related locking.
    
    Note that this may change the kprobes module-exported API for kretprobe
    handlers. If any out-of-tree kretprobe user uses ri->rp, use
    get_kretprobe(ri) instead.
    
    Signed-off-by: Peter Zijlstra (Intel) <peterz>
    Signed-off-by: Masami Hiramatsu <mhiramat>
    Signed-off-by: Ingo Molnar <mingo>
    Link: https://lore.kernel.org/r/159870620431.1229682.16325792502413731312.stgit@devnote2

This removes the "struct kretprobe *rp" member from struct kretprobe_instance and inserts a new "struct kretprobe_holder *rph" in its place: accessing the kretprobe now goes via ri->rph->rp rather than directly via ri->rp.

Applying the following diff to tapset/linux/kretprobe.stp fixes the problem for me on F36 and lets me get at the variables of interest:

--- /usr/share/systemtap/tapset/linux/kretprobe.stp.orig	2022-05-06 15:13:40.000000000 +0100
+++ /usr/share/systemtap/tapset/linux/kretprobe.stp	2022-06-08 17:28:59.324207123 +0100
@@ -20,7 +20,7 @@ _kretprobe_data(struct kretprobe_instanc
 {
 #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,25)
 	size_t end = offset + length;
-	if (end > offset && pi && end <= pi->rp->data_size)
+	if (end > offset && pi && end <= pi->rph->rp->data_size)
 		return &pi->data[offset];
 #endif
 	return NULL;

Version-Release number of selected component (if applicable):
systemtap-runtime-4.7-1.fc36.x86_64
systemtap-client-4.7-1.fc36.x86_64
systemtap-4.7-1.fc36.x86_64
systemtap-devel-4.7-1.fc36.x86_64


How reproducible:
100% on affected kernels

Steps to Reproduce:
1. Boot a kernel >= 5.11 (check with uname -r).
2. Run: stap -e 'probe kernel.function("dm_lock_for_deletion").return { printf("%s", kernel_string($md->name)) }'
   Any kernel.function().return probe should reproduce this, as long as the probe body accesses a context variable. Without one I do not see the compile error, because the _kretprobe_data() code does not get compiled.

Actual results:
  /tmp/stap3oyVT2/stap_f6ccae714cb9f51153235f45abadb532_2993_src.c: In function ‘_kretprobe_data’:
  /tmp/stap3oyVT2/stap_f6ccae714cb9f51153235f45abadb532_2993_src.c:34:46: error: ‘struct kretprobe_instance’ has no member named ‘rp’; did you mean ‘rph’?
     34 |         if (end > offset && pi && end <= pi->rp->data_size)
        |                                              ^~
        |                                              rph
  make[1]: *** [scripts/Makefile.build:288: /tmp/stap3oyVT2/stap_f6ccae714cb9f51153235f45abadb532_2993_src.o] Error 1

Expected results:
No error. Variable values available in probe.

Additional info:

I'm happy to fix up the diff to do a proper check for the different version ranges and submit the patch upstream if this seems correct (or switch it to using get_kretprobe() for those versions as suggested in the commit).
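
For illustration, the version check mentioned above could look roughly like the sketch below. This is not the upstream fix: it is a user-space mock so that it compiles outside a kernel tree. The struct definitions, the fixed-size data[] array, the default LINUX_VERSION_CODE, and the names mock_get_rp() and kretprobe_data_sketch() are all assumptions made for the example; in the real tapset the types come from <linux/kprobes.h>, and calling get_kretprobe(ri) as the commit message suggests may be preferable to open-coding the rph->rp dereference.

```c
#include <stddef.h>

/* Mock of the kernel's version macros so the sketch builds in user space. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#ifndef LINUX_VERSION_CODE
#define LINUX_VERSION_CODE KERNEL_VERSION(5, 17, 5) /* assumed: the F36 kernel */
#endif

/* Simplified stand-ins for the kernel structures (assumptions, not the real layout). */
struct kretprobe {
	size_t data_size;
};

struct kretprobe_holder {
	struct kretprobe *rp;
};

struct kretprobe_instance {
#if LINUX_VERSION_CODE >= KERNEL_VERSION(5, 11, 0)
	struct kretprobe_holder *rph;	/* 5.11+: indirect via the holder */
#else
	struct kretprobe *rp;		/* before 5.11: direct pointer */
#endif
	char data[16];			/* mock: fixed size for the example */
};

/* Hypothetical helper: fetch the kretprobe for either struct layout. */
struct kretprobe *mock_get_rp(struct kretprobe_instance *pi)
{
#if LINUX_VERSION_CODE >= KERNEL_VERSION(5, 11, 0)
	return pi->rph ? pi->rph->rp : NULL;
#else
	return pi->rp;
#endif
}

/* Same bounds check as _kretprobe_data(), with the lookup factored out. */
void *kretprobe_data_sketch(struct kretprobe_instance *pi,
			    size_t offset, size_t length)
{
	size_t end = offset + length;
	struct kretprobe *rp = pi ? mock_get_rp(pi) : NULL;

	/* end > offset rejects zero-length requests and size_t wraparound. */
	if (end > offset && rp && end <= rp->data_size)
		return &pi->data[offset];
	return NULL;
}
```

Keeping the layout difference in one helper means the bounds check itself stays identical for both kernel ranges.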

Comment 1 Bryn M. Reeves 2022-06-08 16:36:34 UTC
# stap -L 'kernel.function("dm_lock_for_deletion").return'
kernel.function("dm_lock_for_deletion@drivers/md/dm.c:358").return $return:int $md:struct mapped_device* $mark_deferred:bool $only_deferred:bool $r:int

Comment 2 Ben Cotton 2023-04-25 17:23:10 UTC
This message is a reminder that Fedora Linux 36 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 36 on 2023-05-16.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '36'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version. Note that the version field may be hidden.
Click the "Show advanced fields" button if you do not see it.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 36 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 3 Ludek Smid 2023-05-25 16:14:37 UTC
Fedora Linux 36 entered end-of-life (EOL) status on 2023-05-16.

Fedora Linux 36 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.

