Bug 2081102

Summary: The PR28634 patch efae8a3dc not playing well with 5.14.0-80.el9
Product: Red Hat Enterprise Linux 9 Reporter: Martin Cermak <mcermak>
Component: systemtap Assignee: Frank Ch. Eigler <fche>
systemtap sub component: system-version QA Contact: Martin Cermak <mcermak>
Status: CLOSED ERRATA Docs Contact:
Severity: unspecified    
Priority: unspecified CC: lberk, mcermak, mjw, wcohen
Version: 9.1 Keywords: Bugfix, Triaged
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: systemtap-4.7-2.el9 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-11-15 10:18:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Martin Cermak 2022-05-02 18:21:47 UTC
The PR28634 patch efae8a3dc, as propagated to systemtap-4.7-1.el9, does not work with kernel 5.14.0-80.el9.  Please fix.

Comment 1 William Cohen 2022-05-04 14:18:39 UTC
Is there a reproducer for this?  The only test I found that uses ioscheduler.* probes is testsuite/buildok/ioscheduler-detailed.stp.  Running it on both rhel8 (4.18.0-387.el8.x86_64) and rhel9 (5.14.0-83.el9.x86_64), I got the following, which complains about the probe point rather than the target variable access:

$ stap -p4 buildok/ioscheduler-detailed.stp 
semantic error: while resolving probe point: identifier 'ioscheduler' at buildok/ioscheduler-detailed.stp:9:7
        source: probe ioscheduler.elv_next_request
                      ^

semantic error: no match

semantic error: while resolving probe point: identifier 'kernel' at /usr/share/systemtap/tapset/linux/ioscheduler.stp:44:3
        source: 		kernel.function("elv_next_request").return
                		^

semantic error: no match (similar functions: blk_get_request, blk_put_request, elv_latter_request, elv_merge_requests, elv_merged_request)

Pass 2: analysis failed.  [man error::pass2]
Number of similar error messages suppressed: 4.
Rerun with -v to see them.

Comment 2 Frank Ch. Eigler 2022-05-04 14:23:08 UTC
Any script that invokes the linux/ioscheduler.stp tapset, with its 

    #include <linux/elevator.h>

embedded-c block will fail during pass 4 on rhel9ish kernels with
a backported linux commit:

commit 2e9bc3465ac54d282b855b073409c2c3a7d1ae00
Author: Christoph Hellwig <hch>
Date:   Mon Sep 20 14:33:23 2021 +0200

    block: move elevator.h to block/

Comment 3 Martin Cermak 2022-05-09 08:32:03 UTC
The following update seems to fix the problem for me:

$ git diff
diff --git a/tapset/linux/ioscheduler.stp b/tapset/linux/ioscheduler.stp
index 3096a73ea..5fc3a8d71 100644
--- a/tapset/linux/ioscheduler.stp
+++ b/tapset/linux/ioscheduler.stp
@@ -11,7 +11,7 @@
 // </tapsetdescription>
 %{
 #include <linux/blkdev.h>
-#if LINUX_VERSION_CODE < KERNEL_VERSION(5,16,0)
+#ifdef _LINUX_ELEVATOR_H
 #include <linux/elevator.h>
 #endif
 %}
$

Comment 4 Martin Cermak 2022-05-09 08:45:29 UTC
(In reply to William Cohen from comment #1)
> Is there a reproducer for this? 

stap -c sync --suppress-handler-errors -e 'probe ioscheduler.elv_add_request{println(elevator_name) exit()}'

Comment 6 William Cohen 2022-05-09 14:07:34 UTC
Thanks.  I was able to reproduce the issue on RHEL9.

$ sudo stap -v -c sync --suppress-handler-errors -e 'probe ioscheduler.elv_add_request{println(elevator_name) exit()}'
[sudo] password for wcohen: 
Pass 1: parsed user script and 493 library scripts using 124308virt/98812res/17028shr/81176data kb, in 150usr/120sys/385real ms.
Pass 2: analyzed script: 1 probe, 3 functions, 1 embed, 0 globals using 130188virt/106488res/18772shr/87056data kb, in 78000usr/13650sys/104355real ms.
Pass 3: translated to C into "/tmp/stapeRnIvK/stap_75905b72ec83770c7d7051139141ae82_2368_src.c" using 130188virt/106616res/18900shr/87056data kb, in 0usr/60sys/84real ms.
/tmp/stapeRnIvK/stap_75905b72ec83770c7d7051139141ae82_2368_src.c:31:10: fatal error: linux/elevator.h: No such file or directory
   31 | #include <linux/elevator.h>
      |          ^~~~~~~~~~~~~~~~~~
compilation terminated.
make[1]: *** [scripts/Makefile.build:271: /tmp/stapeRnIvK/stap_75905b72ec83770c7d7051139141ae82_2368_src.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [Makefile:1862: /tmp/stapeRnIvK] Error 2
WARNING: kbuild exited with status: 2
Pass 4: compiled C into "stap_75905b72ec83770c7d7051139141ae82_2368.ko" in 12130usr/2300sys/17172real ms.
Pass 4: compilation failed.  [man error::pass4]

However, on Fedora rawhide with 5.18.0-0.rc5.20220506gitfe27d189e3f42e3.44.fc37.x86_64 (and on f35) the reproducer didn't fail:

$ sudo stap -v -c sync --suppress-handler-errors -e 'probe ioscheduler.elv_add_request{println(elevator_name) exit()}'
[sudo] password for wcohen: 
Pass 1: parsed user script and 506 library scripts using 469432virt/226272res/15968shr/210160data kb, in 430usr/330sys/969real ms.
Pass 2: analyzed script: 1 probe, 3 functions, 1 embed, 0 globals using 482020virt/240940res/17576shr/222748data kb, in 102110usr/68720sys/179136real ms.
Pass 3: translated to C into "/tmp/staphYy5cJ/stap_563e652c79181592d71046724991d5ef_2607_src.c" using 482020virt/242868res/19500shr/222748data kb, in 40usr/790sys/953real ms.
Pass 4: compiled C into "stap_563e652c79181592d71046724991d5ef_2607.ko" in 31510usr/20390sys/55293real ms.
Pass 5: starting run.

Pass 5: run completed in 20usr/390sys/843real ms.

Comment 7 Martin Cermak 2022-05-09 15:54:56 UTC
My understanding is that Fedora kernels don't diverge from upstream, so the version-based gate holds there. That need not be the case in other environments, where a backport can change the headers without changing the reported kernel version, which breaks the version-based gate.
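
To make the difference between the two gates concrete, here is a minimal userspace sketch. It is not systemtap or kernel code: struct kernel_model and the three helper functions are hypothetical names invented for illustration, and KERNEL_VERSION is a simplified copy of the kernel macro. The sketch assumes that on kernels which still ship <linux/elevator.h>, <linux/blkdev.h> has already pulled it in, so the include guard _LINUX_ELEVATOR_H is defined.

```c
/* Simplified copy of the kernel's KERNEL_VERSION encoding. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Hypothetical model of a kernel build tree (illustration only). */
struct kernel_model {
    const char *name;
    int version_code;          /* what LINUX_VERSION_CODE would report */
    int has_linux_elevator_h;  /* 1 if <linux/elevator.h> still exists */
};

/* Version gate from the original patch: include on anything before 5.16. */
static int version_gate_wants_header(const struct kernel_model *k)
{
    return k->version_code < KERNEL_VERSION(5, 16, 0);
}

/*
 * Include-guard gate from comment 3. Assumption: when the header exists,
 * <linux/blkdev.h> has already included it, so _LINUX_ELEVATOR_H is defined.
 */
static int guard_gate_wants_header(const struct kernel_model *k)
{
    return k->has_linux_elevator_h;
}

/* Pass 4 fails when a gate asks for a header the tree no longer ships. */
static int build_breaks(int wants_header, const struct kernel_model *k)
{
    return wants_header && !k->has_linux_elevator_h;
}
```

For a model of 5.14.0-80.el9 (version code 5.14.0, header absent because of the backport), build_breaks() reports a failure for the version gate and none for the guard gate. For an unmodified 5.14 or a 5.18 kernel, neither gate breaks, which matches the Fedora result in comment 6.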

Comment 8 Martin Cermak 2022-05-09 19:24:12 UTC
Fixed in https://sourceware.org/bugzilla/show_bug.cgi?id=28634#c2

Comment 11 errata-xmlrpc 2022-11-15 10:18:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (systemtap bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8075