Bug 713248 - kprobes instruction-boundary-checker buggy on x86-64
Summary: kprobes instruction-boundary-checker buggy on x86-64
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 15
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 726270
Reported: 2011-06-14 19:29 UTC by Frank Ch. Eigler
Modified: 2012-07-11 17:49 UTC (History)

Fixed In Version:
Clone Of:
: 726270 (view as bug list)
Environment:
Last Closed: 2012-07-11 17:49:18 UTC
Type: ---
Embargoed:



Description Frank Ch. Eigler 2011-06-14 19:29:03 UTC
Description of problem:
systemtap and 'perf probe' probes on various valid kernel/module locations
break because the kprobes instruction-boundary checker rejects them.

Version-Release number of selected component (if applicable):
Linux vm-f15-64 2.6.38.7-30.fc15.x86_64 #1 SMP Fri May 27 05:15:53 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:
always

Steps to Reproduce:
1. install kernel-debuginfo; modprobe ext2
2. stap -e 'probe module("ext2").statement("*@*:*") {}' -t;  ^C interrupt it
3. observe warnings of the form "WARNING: probe module ... registration error (rc -84)"

Alternatively, with perf:

2. find example instruction address, for example from stap -p2 ... run
3. perf probe -m ext2 -f -a ext2_rsv_window_add+0x29
4. observe "invalid argument" error and "[13325.075265] Probing 
   address(0xffffffffa00e356f) is not an instruction boundary." in dmesg

For this particular example:

0000000000000546 <ext2_rsv_window_add>:
     546:       55                      push   %rbp
     547:       48 89 e5                mov    %rsp,%rbp
     54a:       e8 00 00 00 00          callq  54f <ext2_rsv_window_add+0x9>
                        54b: R_X86_64_PC32      mcount-0x4
     54f:       48 8b bf 78 02 00 00    mov    0x278(%rdi),%rdi
     556:       4c 8b 46 20             mov    0x20(%rsi),%r8
     55a:       48 89 f0                mov    %rsi,%rax
     55d:       31 f6                   xor    %esi,%esi
     55f:       48 81 c7 48 01 00 00    add    $0x148,%rdi
     566:       48 89 f9                mov    %rdi,%rcx
     569:       eb 27                   jmp    592 <ext2_rsv_window_add+0x4c>
     56b:       4c 3b 42 20             cmp    0x20(%rdx),%r8
     56f:       48 8d 4a 10             lea    0x10(%rdx),%rcx
     573:       72 1a                   jb     58f <ext2_rsv_window_add+0x49>
     575:       4c 3b 42 28             cmp    0x28(%rdx),%r8

where one can see 0x56f is a lovely and talented 'lea' instruction.
A probe on the preceding 'cmp' is accepted.  Subsequent instructions
are not accepted due to the 'lea' rejection.

The problem appears to be with the kernel instruction-boundary decoder
widgetry in arch/x86/kernel/kprobes.c can_probe().
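
For readers following along, here is a minimal Python sketch of what an instruction-boundary check has to establish. It is not the kernel's can_probe() code; it only walks the instruction lengths copied from the objdump listing above, starting at the function entry, and asks whether the probed address is the start of an instruction. The names FUNC_START, INSN_LENGTHS, instruction_boundaries and is_boundary are made up for this illustration.

```python
# Function entry and per-instruction byte lengths, copied from the
# objdump listing of ext2_rsv_window_add in this report.
FUNC_START = 0x546
INSN_LENGTHS = [1, 3, 5, 7, 4, 3, 2, 7, 3, 2, 4, 4, 2, 4]


def instruction_boundaries(start, lengths):
    """Return the start address of every instruction, in order."""
    boundaries = []
    addr = start
    for length in lengths:
        boundaries.append(addr)
        addr += length
    return boundaries


def is_boundary(target, start, lengths):
    """True if target is the first byte of some instruction."""
    return target in instruction_boundaries(start, lengths)


# The perf example probes ext2_rsv_window_add+0x29.
probe_addr = FUNC_START + 0x29
print(hex(probe_addr), is_boundary(probe_addr, FUNC_START, INSN_LENGTHS))
print(hex(probe_addr + 1), is_boundary(probe_addr + 1, FUNC_START, INSN_LENGTHS))
```

With correct lengths, +0x29 lands on 0x56f, the first byte of the 'lea'. A checker that reports otherwise must have computed a wrong length for some earlier instruction.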

See also: http://sourceware.org/ml/systemtap/2011-q2/msg00286.html

Comment 1 Frank Ch. Eigler 2011-06-14 20:07:53 UTC
Note same thing happens on rawhide 2.6.39-0.rc7.git0.0.fc16.x86_64.

Comment 2 Masami Hiramatsu 2011-06-16 01:47:26 UTC
Thanks for reporting it!

It seems that the decoder failed to decode the REX prefix even on x86-64;
both 4c and 48 are REX prefixes on x86-64.
I tested probing a similar sequence on my 3.0.0-rc2-tip+ kernel on Fedora 15:


ffffffff810e3fa1 <zap_page_range>:
...
ffffffff810e3fb0:       e8 4b f9 31 00          callq  ffffffff81403900 <mcount>
ffffffff810e3fb5:       4c 8b 37                mov    (%rdi),%r14
ffffffff810e3fb8:       4c 8d 2c 32             lea    (%rdx,%rsi,1),%r13
ffffffff810e3fbc:       48 89 8d 68 ff ff ff    mov    %rcx,-0x98(%rbp)
ffffffff810e3fc3:       49 89 fc                mov    %rdi,%r12
ffffffff810e3fc6:       48 89 f3                mov    %rsi,%rbx
ffffffff810e3fc9:       48 c7 45 d8 00 00 00    movq   $0x0,-0x28(%rbp)

Here, you can see the lea and mov which start with 4c and 48, and I had
no problem with putting probes on those insns.

 # echo p zap_page_range+0x17  > kprobe_events 
 # echo p zap_page_range+0x1b >> kprobe_events
 # cat kprobe_events
p:kprobes/p_zap_page_range_23 zap_page_range+23
p:kprobes/p_zap_page_range_27 zap_page_range+27
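
As a quick cross-check of the offsets used above, the following Python sketch maps symbol+offset to an address in the zap_page_range listing. The start address is taken from that listing; the helper names are hypothetical, and the event-name pattern is only inferred from the kprobe_events output shown here, not from any documented rule.

```python
# Start address of zap_page_range, taken from the listing in this comment.
ZAP_PAGE_RANGE = 0xffffffff810e3fa1


def probe_target(base, offset):
    """Address that 'symbol+offset' refers to."""
    return base + offset


def default_event_name(symbol, offset):
    # Assumption: pattern inferred from the kprobe_events output above,
    # where +0x17 shows up as p_zap_page_range_23 (decimal offset).
    return f"p_{symbol}_{offset}"


for off in (0x17, 0x1b):
    print(hex(probe_target(ZAP_PAGE_RANGE, off)),
          default_event_name("zap_page_range", off))
```

The two offsets resolve to ffffffff810e3fb8 (the 'lea') and ffffffff810e3fbc (the following 'mov') in the listing.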

I've also checked that test_get_len can decode the instructions you
reported. test_get_len is a standalone decoder for testing insn.c; you
can find it under /lib/modules/<kver>/build/arch/x86/tools/ when you
build a kernel.

# cat test.dump
     56b:	4c 3b 42 20	cmp    0x20(%rdx),%r8
     56f:	48 8d 4a 10	lea    0x10(%rdx),%rcx
(Note that the items on each line (addr, raw insn, decoded insn) are separated by tabs.)

# ./test_get_len -vy < test.dump
Succeed: decoded and checked 2 instructions

(The -y option means the given insns are 64-bit.)
In fact, if I pass -n instead of -y, it fails to decode (note .x86_64 = 0 in the output):

# ./test_get_len -vn  < test.dump
Warning: ./test_get_len found difference at <unknown>
Warning:      56b:      4c 3b 42 20     cmp    0x20(%rdx),%r8
Warning: objdump says 4 bytes, but insn_get_length() says 1
Instruction = {
        .prefixes = {
                .value = 0, bytes[] = {0, 0, 0, 0},
                .got = 1, .nbytes = 0},
        .rex_prefix = {
                .value = 0, bytes[] = {0, 0, 0, 0},
                .got = 1, .nbytes = 0},
        .vex_prefix = {
                .value = 0, bytes[] = {0, 0, 0, 0},
                .got = 1, .nbytes = 0},
        .opcode = {
                .value = 76, bytes[] = {4c, 0, 0, 0},
                .got = 1, .nbytes = 1},
        .modrm = {
                .value = 0, bytes[] = {0, 0, 0, 0},
                .got = 1, .nbytes = 0},
        .sib = {
                .value = 0, bytes[] = {0, 0, 0, 0},
                .got = 1, .nbytes = 0},
        .displacement = {
                .value = 0, bytes[] = {0, 0, 0, 0},
                .got = 1, .nbytes = 0},
        .immediate1 = {
                .value = 0, bytes[] = {0, 0, 0, 0},
                .got = 1, .nbytes = 0},
        .immediate2 = {
                .value = 0, bytes[] = {0, 0, 0, 0},
                .got = 0, .nbytes = 0},
        .attr = c, .opnd_bytes = 4, .addr_bytes = 4,
        .length = 1, .x86_64 = 0, .kaddr = 0x7fff46c8d860}
Warning: ./test_get_len found difference at <unknown>
Warning:      56f:      48 8d 4a 10     lea    0x10(%rdx),%rcx
Warning: objdump says 4 bytes, but insn_get_length() says 1
Instruction = {
        .prefixes = {
                .value = 0, bytes[] = {0, 0, 0, 0},
                .got = 1, .nbytes = 0},
        .rex_prefix = {
                .value = 0, bytes[] = {0, 0, 0, 0},
                .got = 1, .nbytes = 0},
        .vex_prefix = {
                .value = 0, bytes[] = {0, 0, 0, 0},
                .got = 1, .nbytes = 0},
        .opcode = {
                .value = 72, bytes[] = {48, 0, 0, 0},
                .got = 1, .nbytes = 1},
        .modrm = {
                .value = 0, bytes[] = {0, 0, 0, 0},
                .got = 1, .nbytes = 0},
        .sib = {
                .value = 0, bytes[] = {0, 0, 0, 0},
                .got = 1, .nbytes = 0},
        .displacement = {
                .value = 0, bytes[] = {0, 0, 0, 0},
                .got = 1, .nbytes = 0},
        .immediate1 = {
                .value = 0, bytes[] = {0, 0, 0, 0},
                .got = 1, .nbytes = 0},
        .immediate2 = {
                .value = 0, bytes[] = {0, 0, 0, 0},
                .got = 0, .nbytes = 0},
        .attr = c, .opnd_bytes = 4, .addr_bytes = 4,
        .length = 1, .x86_64 = 0, .kaddr = 0x7fff46c8d860}
Warning: decoded and checked 2 instructions with 2 warnings

So I guess something similar happened on the Rawhide kernel.
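
To make the role of the first byte concrete, here is a small Python sketch, separate from insn.c and not a decoder. It only classifies a byte as a REX prefix and splits out its flag bits. In 64-bit mode the values 0x40 through 0x4f are REX prefixes; in 32-bit mode the same values encode one-byte inc/dec instructions, which is consistent with the ".length = 1" results above. The function names are hypothetical.

```python
def is_rex_prefix(byte, x86_64):
    """In 64-bit mode, 0x40..0x4f are REX prefixes; otherwise they are not."""
    return bool(x86_64) and 0x40 <= byte <= 0x4F


def rex_bits(byte):
    """Split the low nibble of a REX prefix into its W, R, X, B flags."""
    return {
        "W": (byte >> 3) & 1,  # 64-bit operand size
        "R": (byte >> 2) & 1,  # extends the ModRM reg field
        "X": (byte >> 1) & 1,  # extends the SIB index field
        "B": byte & 1,         # extends ModRM rm / SIB base
    }


# First bytes of the 'cmp' (4c) and 'lea' (48) from the reported listing.
for first in (0x4C, 0x48):
    print(hex(first),
          "64-bit:", is_rex_prefix(first, True),
          "32-bit:", is_rex_prefix(first, False),
          rex_bits(first))
```

For 4c the W and R bits are set, matching a 64-bit operand with %r8 in the reg field; for 48 only W is set, matching the 64-bit 'lea' into %rcx.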

Comment 3 Masami Hiramatsu 2011-06-16 03:36:50 UTC
Hmm... I couldn't reproduce the problem on my x86-64 Fedora 15:

[root@fedora15 tracing]# uname -a
Linux fedora15 2.6.38.7-30.fc15.x86_64 #1 SMP Fri May 27 05:15:53 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
[root@fedora15 tracing]# echo p ext2_rsv_window_add+0x29 > /tracing/kprobe_events
[root@fedora15 tracing]# cat /tracing/kprobe_events
p:kprobes/p_ext2_rsv_window_add_41 ext2_rsv_window_add+41

Comment 4 Dave Jones 2012-04-11 17:02:35 UTC
Frank, is this still a problem in 3.3?

Comment 5 Josh Boyer 2012-07-11 17:49:18 UTC
Fedora 15 has reached its end of life as of June 26, 2012.  As a result, we will not be fixing any remaining bugs found in Fedora 15.

In the event that you have upgraded to a newer release and the bug you reported is still present, please reopen the bug and set the version field to the newest release you have encountered the issue with.  Before doing so, please ensure you are testing the latest kernel update in that release and attach any new and relevant information you may have gathered.

Thank you for taking the time to file a report.  We hope newer versions of Fedora suit your needs.

