Bug 1596456

Summary: syscall probe aliases broken in 4.17+
Product: [Fedora] Fedora
Component: systemtap
Version: 28
Hardware: x86_64
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: urgent
Priority: unspecified
Reporter: JianHong Yin <jiyin>
Assignee: Frank Ch. Eigler <fche>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: brolley, dsmith, fche, green, herdingcat, jistone, lberk, mjw, scox, wcohen, xzhou
Last Closed: 2018-10-13 20:30:52 UTC
Type: Bug
Bug Blocks: 1596649

Description JianHong Yin 2018-06-29 03:37:05 UTC
Description of problem:
stap cannot resolve syscall probe aliases on Fedora 28:

semantic error: resolution failed in alias expansion builder

semantic error: while resolving probe point: identifier 'syscall' at ./stat.stp:9:7
        source: probe syscall.lstat.return {
                      ^
semantic error: no match

Version-Release number of selected component (if applicable):
kernel-debuginfo-4.17.2-200.fc28.x86_64
kernel-4.17.2-200.fc28.x86_64
systemtap-3.3-1.fc28.x86_64

How reproducible:
always

Steps to Reproduce:
Run the stat.stp script shown under Actual results below on a 4.17 kernel.

Actual results:

'''
[yjh@test kernel]$ rpm -q kernel-debuginfo-4.17.2-200.fc28 kernel-4.17.2-200.fc28 systemtap
kernel-debuginfo-4.17.2-200.fc28.x86_64
kernel-4.17.2-200.fc28.x86_64
systemtap-3.3-1.fc28.x86_64
[yjh@dhcp-12-154 kernel]$ uname -r
4.17.2-200.fc28.x86_64
[yjh@test kernel]$ LANG=C sudo ./stat.stp  $file  -c "ls -l $file"
Pass 1: parsed user script and 478 library scripts using 156844virt/52472res/8288shr/44048data kb, in 160usr/10sys/171real ms.
semantic error: resolution failed in alias expansion builder

semantic error: while resolving probe point: identifier 'syscall' at ./stat.stp:9:7
        source: probe syscall.lstat.return {
                      ^

semantic error: no match

semantic error: resolution failed in alias expansion builder

semantic error: while resolving probe point: identifier 'syscall' at :15:7
        source: probe syscall.fstatat.return {
                      ^

semantic error: no match

Pass 2: analyzed script: 0 probes, 0 functions, 0 embeds, 0 globals using 222228virt/119416res/9660shr/109432data kb, in 8350usr/40sys/8475real ms.
Pass 2: analysis failed.  [man error::pass2]
[yjh@test kernel]$ cat stat.stp
#!/usr/bin/stap -vg
/*
 * usage:
 * file=new
 * sudo stap -g ./stat.stp  $file  -c "ls -l $file"
 * sudo stap -g ./stat.stp  $file  -c "du -s $file"
 */

probe syscall.lstat.return {
    if (kernel_string(@entry($filename)) == @1) {
        printf("end: $$vars$ = %s\n", @entry($$vars->$))
        printf("%s\n", @cast(@entry($statbuf), "struct stat")$$)
    }
}
probe syscall.fstatat.return {
    if (kernel_string(@entry($filename)) == @1) {
        printf("end: $$vars$ = %s\n", @entry($$vars->$))
        printf("%s\n", @cast(@entry($statbuf), "struct stat")$$)
    }
}
'''
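
For longer scripts, the unresolved probe points can be pulled out of the pass 2 output mechanically. The following is a minimal sketch, assuming the error format shown in the transcript above (a "source: probe NAME" line under each semantic error). `unresolved_probes` is a hypothetical helper name, not part of systemtap, and it will also pick up probes that failed for reasons other than "no match".

```shell
#!/bin/sh
# Sketch only: unresolved_probes is an illustrative helper, not a
# systemtap command. It reads stap's diagnostic output on standard
# input and prints each probe point named on a "source: probe ..."
# line, one per line, sorted and de-duplicated.
unresolved_probes() {
    sed -n 's/^[[:space:]]*source: probe \([^ {]*\).*/\1/p' | sort -u
}
```

One way to use it is `stap -p2 ./stat.stp . 2>&1 | unresolved_probes`, which stops after pass 2. For the run above it should list syscall.fstatat.return and syscall.lstat.return.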

Expected results:
The script should resolve both probes and run to completion.

Additional info:

Comment 1 JianHong Yin 2018-06-29 05:27:46 UTC
JFYI:

The same failure occurs on the latest Fedora 27, but the script works fine on Fedora 26:

'''
[root@ibm-x3550m3-07 ~]# file=.
[root@ibm-x3550m3-07 ~]# ./stat.stp  $file  -c "ls -l $file"
Pass 1: parsed user script and 476 library scripts using 138432virt/47792res/7252shr/40644data kb, in 220usr/50sys/454real ms.
Pass 2: analyzed script: 10 probes, 30 functions, 5 embeds, 30 globals using 347824virt/258712res/8688shr/250036data kb, in 8450usr/3290sys/16550real ms.
Pass 3: translated to C into "/tmp/stapzwdMfM/stap_4d0466daad27d11f0a9d3ca9f9fbf783_26077_src.c" using 347824virt/258908res/8884shr/250036data kb, in 20usr/0sys/19real ms.
Pass 4: compiled C into "stap_4d0466daad27d11f0a9d3ca9f9fbf783_26077.ko" in 20760usr/3410sys/25335real ms.
Pass 5: starting run.
total 48
-rw-r--r--. 1 root root     4 Jun 29 00:56 NETBOOT_METHOD.TXT
-rw-r--r--. 1 root root     8 Jun 29 00:56 RECIPE.TXT
-rw-------. 1 root root 15703 Jun 29 00:58 anaconda-ks.cfg
-rw-r--r--. 1 root root   259 Jun 29 00:58 my-ks-post.log
-rw-------. 1 root root 15422 Jun 29 00:58 original-ks.cfg
-rwxr-xr-x. 1 root root   574 Jun 29 01:16 stat.stp
end: $$vars$ = filename=140723443955028 statbuf=94380229644544 ret=?
{.st_dev=64768, .st_ino=1179649, .st_nlink=5, .st_mode=16744, .st_uid=0, .st_gid=0, .__pad0=0, .st_rdev=0, .st_size=4096, .st_blksize=4096, .st_blocks=8, .st_atime=1530249868, .st_atime_nsec=326138910, .st_mtime=1530249877, .st_mtime_nsec=195204357, .st_ctime=1530249877, .st_ctime_nsec=195204357, .__unused=[0, ...]}
Pass 5: run completed in 10usr/380sys/758real ms.
[root@ibm-x3550m3-07 ~]# uname -r
4.16.11-100.fc26.x86_64
[root@ibm-x3550m3-07 ~]# rpm -q kernel-debuginfo
kernel-debuginfo-4.16.11-100.fc26.x86_64
'''

Comment 2 Frank Ch. Eigler 2018-08-10 14:51:00 UTC
Work is under way.

Comment 3 Oleg Drokin 2018-08-20 04:08:19 UTC
For those who need this to work right now, the workaround is to switch from syscall.* probes to probes on the actual kernel function names.

For example, to make this script work: https://sourceware.org/systemtap/examples/io/eatmydata.stp while concentrating only on fsync, the replacement would be:

#! /bin/sh

# note use of guru mode, to enable changing of syscall parameters
//bin/true && exec stap -g $0 ${1+"$@"}

# see also http://www.flamingspork.com/projects/libeatmydata/

global dummy_fd = -1 # invalid filehandle; to try stdout, run with -G dummy_fd=1
global guilt, agony, piety

probe kernel.function("do_fsync") {
  # We can't actually disable the syscall from here, but can try to
  # weaken it by redirecting the work toward a dummy file descriptor
  if (pid() == target())
     try { 
         $fd = dummy_fd
         guilt ++
     } catch {
         agony ++
     }
  else
     piety ++
}
probe kernel.function("do_fsync").return {
  # override result code, just in case kernel sent back -EINVAL or somesuch
  if (pid() == target())
      try { $return = 0 } catch { }
}


probe begin {
    printf("Redirecting f*sync by pid %d to fd %d\n",
        target(), dummy_fd)
}
probe error,end {
    printf("Redirected f*sync by pid %d to fd %d, success %d times, failed %d times.\n",
        target(), dummy_fd, guilt, agony)
    printf("Preserved f*sync by other processes %d times.\n", piety)
}
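
Before rewriting a script this way, it can help to check which probe points actually resolve on the running kernel. A minimal sketch, assuming `stap -l PATTERN` prints the matching probe points and prints nothing (or fails) when there is no match. `probe_resolves`, `first_resolving` and the `STAP` override are hypothetical names introduced here for illustration, and listing kernel.function probes generally needs the matching kernel-debuginfo installed.

```shell
#!/bin/sh
# Sketch only: probe_resolves, first_resolving and the STAP variable
# are illustrative names, not part of systemtap. STAP defaults to the
# real "stap" binary and can be overridden for testing.

# Succeed if the probe point in $1 resolves to at least one probe.
probe_resolves() {
    # "stap -l" lists the probe points matching the pattern; a failure
    # or an empty listing is treated as "does not resolve".
    out=$("${STAP:-stap}" -l "$1" 2>/dev/null) || return 1
    [ -n "$out" ]
}

# Print the first argument that resolves; return 1 if none does.
first_resolving() {
    for candidate in "$@"; do
        if probe_resolves "$candidate"; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

For the script above, `first_resolving 'syscall.fsync' 'kernel.function("do_fsync")'` would try the alias first and fall back to the function probe on an affected system.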

Comment 4 Hui Li 2018-08-29 03:00:20 UTC
Fedora 27 hits the same issue:

[ericlee@localhost systemtap]$ uname -r
4.17.17-100.fc27.x86_64
[ericlee@localhost systemtap]$ sudo stap -v -e 'probe syscall.open {}'
Pass 1: parsed user script and 484 library scripts using 296460virt/89352res/8440shr/80824data kb, in 240usr/20sys/265real ms.
semantic error: resolution failed in alias expansion builder

semantic error: while resolving probe point: identifier 'syscall' at <input>:1:7
        source: probe syscall.open {}
                      ^

semantic error: no match

Pass 2: analyzed script: 0 probes, 0 functions, 0 embeds, 0 globals using 365972virt/160076res/9620shr/150336data kb, in 1530usr/30sys/1555real ms.
Pass 2: analysis failed.  [man error::pass2]
[ericlee@localhost systemtap]$ rpm -qa | grep `uname -r`
kernel-4.17.17-100.fc27.x86_64
kernel-debuginfo-4.17.17-100.fc27.x86_64
kernel-modules-extra-4.17.17-100.fc27.x86_64
kernel-devel-4.17.17-100.fc27.x86_64
kernel-debug-4.17.17-100.fc27.x86_64
kernel-modules-4.17.17-100.fc27.x86_64
kernel-debug-devel-4.17.17-100.fc27.x86_64
kernel-debug-core-4.17.17-100.fc27.x86_64
kernel-debug-modules-4.17.17-100.fc27.x86_64
kernel-debuginfo-common-x86_64-4.17.17-100.fc27.x86_64
kernel-core-4.17.17-100.fc27.x86_64

Do we have any update on this issue?

Thanks.

Hui

Comment 5 David Smith 2018-09-05 16:12:26 UTC
Work is ongoing upstream. About two-thirds of the syscall probes have been updated to handle the changes in the 4.17 kernel.

Feel free to try HEAD systemtap if you'd like a preview of the fixes.

Comment 6 Frank Ch. Eigler 2018-10-13 20:30:52 UTC
Fixed in systemtap-4.0-1.
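
Going only by what this report states (failures on 4.17 kernels, success on 4.16.11, fix in systemtap 4.0-1), an affected combination can be detected from the two version strings. This is a rough sketch, assuming a sort(1) that supports -V as GNU coreutils does. `version_ge` and `is_affected` are hypothetical helper names, and the thresholds are taken from this bug rather than from any upstream compatibility table.

```shell
#!/bin/sh
# Sketch only: thresholds come from this bug report (kernel 4.17+,
# fixed in systemtap 4.0-1). version_ge and is_affected are
# illustrative names. Assumes a sort(1) that supports -V.

# Succeed if version $1 >= version $2, ignoring any "-release" suffix.
version_ge() {
    a=${1%%-*}
    b=${2%%-*}
    lowest=$(printf '%s\n%s\n' "$a" "$b" | sort -V | head -n 1)
    [ "$lowest" = "$b" ]
}

# Succeed if the kernel/systemtap pair matches the broken combination.
# $1: kernel version, e.g. from "uname -r"
# $2: systemtap version, e.g. from "rpm -q --qf '%{VERSION}' systemtap"
is_affected() {
    version_ge "$1" 4.17 && ! version_ge "$2" 4.0
}
```

A possible call is `is_affected "$(uname -r)" "$(rpm -q --qf '%{VERSION}' systemtap)" && echo "update systemtap to 4.0 or later"`.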

Comment 7 Fedora Update System 2018-10-13 20:32:05 UTC
systemtap-4.0-1.fc29 has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2018-9ca504df5d

Comment 8 Fedora Update System 2018-10-13 20:32:40 UTC
systemtap-4.0-1.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-2dcac093ff

Comment 9 Fedora Update System 2018-10-14 21:06:38 UTC
systemtap-4.0-1.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-9ca504df5d

Comment 10 Fedora Update System 2018-10-15 00:40:53 UTC
systemtap-4.0-1.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-2dcac093ff

Comment 11 Fedora Update System 2018-10-16 11:39:04 UTC
systemtap-4.0-1.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.

Comment 12 Fedora Update System 2018-10-30 17:26:27 UTC
systemtap-4.0-1.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.