Bug 1599957 - 32-bit userspace segfaults after ioctl(KVM_RUN)
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-07-11 01:56 UTC by Dmitry V. Levin
Modified: 2019-12-13 22:57 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2019-12-13 22:57:03 UTC
Type: Bug
Embargoed:


Attachments
packages installed to the environment where the issue is reproduced (24.73 KB, text/plain)
2018-07-11 05:09 UTC, Masatake YAMATO

Description Dmitry V. Levin 2018-07-11 01:56:27 UTC
strace ioctl_kvm_run test started to fail on rawhide with the following symptoms:

strace/tests$ ./ioctl_kvm_run >/dev/null; echo \$?=$?
$?=0
strace/tests-m32$ ./ioctl_kvm_run >/dev/null; echo \$?=$?
Segmentation fault (core dumped)
$?=139
strace/tests-m32$ ../strace -enone ./ioctl_kvm_run >/dev/null
../strace: [ Process PID=12345 runs in 32 bit mode. ]
+++ exited with 0 +++
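
For reference, the `$?=139` above follows the usual shell convention of reporting death by signal as 128 plus the signal number. A minimal sketch of that arithmetic (the helper name `signal_from_shell_status` is made up for illustration):

```c
#include <signal.h>

/*
 * Hypothetical helper: map a shell exit status to the signal that
 * killed the process, following the common "128 + signo" convention.
 * Returns 0 when the status does not encode a signal.
 */
int signal_from_shell_status(int status)
{
    return status > 128 ? status - 128 : 0;
}
```

Calling `signal_from_shell_status(139)` yields 11, which is SIGSEGV on Linux, matching the "Segmentation fault" message.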

The segfault happens inside glibc right after the first ioctl(KVM_RUN)
invocation.

All this oddness is happening in the following system:
Linux rawhide-test.fedorainfracloud.org 4.18.0-0.rc2.git4.1.fc29.x86_64 #1 SMP Fri Jun 29 15:03:54 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

The same symptoms could be seen in the following system:
Linux f27-test.fedorainfracloud.org 4.17.3-100.fc27.x86_64 #1 SMP Tue Jun 26 14:19:03 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

There are no such problems in the following system:
Linux f28-test.fedorainfracloud.org 4.16.3-301.fc28.x86_64 #1 SMP Mon Apr 23 21:59:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

This essentially leaves the kernel as the only plausible cause of this bug.
I suspect it's a bug in nested KVM in recent Fedora kernels.

Comment 1 Masatake YAMATO 2018-07-11 02:13:32 UTC
(kernel-4.18.0-0.rc2.git4.1.fc29.x86_64 is available from
https://koji.fedoraproject.org/koji/buildinfo?buildID=1100807
.)

Comment 2 Masatake YAMATO 2018-07-11 02:23:15 UTC
I wonder whether we can reproduce this segmentation fault on the latest upstream kernel. If we can, we can use git bisect to find the change causing the segmentation fault.

I will try 4.18.0-0.rc2.git4.1.fc29.x86_64 first. Then I will build the upstream kernel with "make localyesconfig" and try to reproduce the issue on that kernel.

Comment 3 Masatake YAMATO 2018-07-11 05:09:16 UTC
Created attachment 1457990
packages installed to the environment where the issue is reproduced

Comment 4 Masatake YAMATO 2018-07-11 05:38:16 UTC
Reproduced with that kernel package on non-nested (bare-metal) KVM, so nesting is not required.

gdb says:
...
ioctl(5<anon_inode:kvm-vcpu:0>, KVM_SET_REGS, {rax=0x2, ..., rsp=0, rbp=0, ..., rip=0x1000, rflags=0x2}) = 0

Program received signal SIGSEGV, Segmentation fault.
0xf7e6a6ac in _IO_vfprintf_internal (s=0xf7fc0d80 <_IO_2_1_stdout_>, format=0x80496d8 "ioctl(%d<%s>, KVM_RUN, 0) = 0\n", 
    ap=0xffffcb24 "\005") at vfprintf.c:1244
1244    {

(gdb) disassemble
Dump of assembler code for function _IO_vfprintf_internal:
   0xf7e6a680 <+0>:     push   %ebp
   0xf7e6a681 <+1>:     mov    %esp,%ebp
   0xf7e6a683 <+3>:     push   %edi
   0xf7e6a684 <+4>:     call   0xf7f4425d <__x86.get_pc_thunk.di>
   0xf7e6a689 <+9>:     add    $0x155977,%edi
   0xf7e6a68f <+15>:    push   %esi
   0xf7e6a690 <+16>:    push   %ebx
   0xf7e6a691 <+17>:    sub    $0x4fc,%esp
   0xf7e6a697 <+23>:    mov    0xc(%ebp),%eax
   0xf7e6a69a <+26>:    mov    0x8(%ebp),%esi
   0xf7e6a69d <+29>:    mov    %eax,-0x470(%ebp)
   0xf7e6a6a3 <+35>:    mov    0x10(%ebp),%eax
   0xf7e6a6a6 <+38>:    mov    %eax,-0x480(%ebp)
=> 0xf7e6a6ac <+44>:    mov    %gs:0x14,%eax
(gdb) p $gs
$2 = 99
(gdb) p/x $gs
$7 = 0x63

... so is this something to do with segments? If I understand the kernel correctly, on 32-bit x86 %gs points to thread-local storage, and the faulting instruction reads through it.
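
To check whether the selector value itself looks sane, the 0x63 printed by gdb can be split into its architectural fields: bits 0-1 are the requested privilege level, bit 2 selects GDT (0) or LDT (1), and the remaining bits are the descriptor index. A small sketch of that decoding (the struct and function names are made up for illustration):

```c
#include <stdint.h>

/* Hypothetical container for the three fields of an x86 segment selector. */
struct selector_fields {
    unsigned index; /* descriptor table index (bits 3..15) */
    unsigned table; /* table indicator: 0 = GDT, 1 = LDT (bit 2) */
    unsigned rpl;   /* requested privilege level (bits 0..1) */
};

/* Split a 16-bit selector such as the 0x63 seen in %gs above. */
struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;

    f.rpl = sel & 0x3;
    f.table = (sel >> 2) & 0x1;
    f.index = sel >> 3;
    return f;
}
```

For 0x63 this gives index 12, GDT, RPL 3. As far as I know, index 12 is the first TLS slot in the GDT of an x86_64 kernel, so the selector itself looks like an ordinary userspace TLS selector; that would point at the segment base rather than the selector being wrong.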

Comment 5 Masatake YAMATO 2018-07-11 11:47:11 UTC
I found the commit that introduced the bug.

commit 42b933b59721f288e3ce23ca79a17a973808dab9
Author: Vitaly Kuznetsov <vkuznets>
Date:   Tue Mar 13 18:48:04 2018 +0100

    x86/kvm/vmx: read MSR_{FS,KERNEL_GS}_BASE from current->thread
    
    vmx_save_host_state() is only called from kvm_arch_vcpu_ioctl_run() so
    the context is pretty well defined. Read MSR_{FS,KERNEL_GS}_BASE from
    current->thread after calling save_fsgs() which takes care of
    X86_BUG_NULL_SEG case now and will do RD[FG,GS]BASE when FSGSBASE
    extensions are exposed to userspace (currently they are not).
    
    Acked-by: Andy Lutomirski <luto>
    Signed-off-by: Vitaly Kuznetsov <vkuznets>
    Signed-off-by: Paolo Bonzini <pbonzini>

Comment 6 Masatake YAMATO 2018-07-11 12:56:48 UTC
How to reproduce:

1. Install Fedora 28.
2. Install the packages listed in the file attached to this bz.
   i686 libraries are needed.
3. git clone https://github.com/strace/strace.git
4. cd strace; bash bootstrap; ./configure; make; cd tests; make check; cd ../tests-m32; make check
   This step builds the strace/tests-m32/ioctl_kvm_run binary.
5. cd; mkdir build
6. git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
7. cd linux; git checkout 42b933b59721f288e3ce23ca79a17a973808dab9
8. make O=$HOME/build localyesconfig
9. make O=$HOME/build -j8
10. sudo make O=$HOME/build -j8 modules_install install
11. Reboot into the newly built kernel (42b933).
12. Log in.
13. strace/tests-m32/ioctl_kvm_run
14. You will see the segmentation fault.
15. Repeat the same steps for 42b933^; you will not see the segmentation fault.

Comment 7 Vitaly Kuznetsov 2018-07-11 13:10:38 UTC
(In reply to Masatake YAMATO from comment #5)
> I found the commit the bug was introduced.
> 
> commit 42b933b59721f288e3ce23ca79a17a973808dab9
> Author: Vitaly Kuznetsov <vkuznets>
> Date:   Tue Mar 13 18:48:04 2018 +0100
> 
>     x86/kvm/vmx: read MSR_{FS,KERNEL_GS}_BASE from current->thread
>     

Thank you for doing the bisection, I'll look into the issue ASAP.

Comment 8 Vitaly Kuznetsov 2018-07-19 13:44:05 UTC
commit b062b794c7831a70bda4dfac202c1a9418e06ac0
Author: Vitaly Kuznetsov <vkuznets>
Date:   Wed Jul 11 19:37:18 2018 +0200

    x86/kvm/vmx: don't read current->thread.{fs,gs}base of legacy tasks

should land in v4.18-rc6
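
A simplified userspace model of what the subject line describes, not the kernel code itself. It assumes the cached per-task base is only kept in sync for 64-bit tasks, so for a legacy 32-bit task the value has to come from the live register instead; the struct and function names here are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for per-task state; not the kernel's struct. */
struct task_model {
    bool     is_64bit;      /* false for a legacy 32-bit process */
    uint64_t cached_gsbase; /* assumed valid only for 64-bit tasks */
};

/* Regressed behaviour: always trust the cached value. */
uint64_t gs_base_regressed(const struct task_model *t, uint64_t live_gs_base)
{
    (void)live_gs_base; /* ignored, which is the problem for legacy tasks */
    return t->cached_gsbase;
}

/* Behaviour after the fix, as modelled here: skip the cache for legacy tasks. */
uint64_t gs_base_fixed(const struct task_model *t, uint64_t live_gs_base)
{
    return t->is_64bit ? t->cached_gsbase : live_gs_base;
}
```

In this model a legacy task with a stale cached base of 0 gets 0 back from the regressed variant, which would be consistent with the fault on `mov %gs:0x14,%eax` seen in comment 4, while the fixed variant returns the live base.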

Comment 9 Dmitry V. Levin 2018-07-19 13:53:43 UTC
(In reply to Vitaly Kuznetsov from comment #8)
> commit b062b794c7831a70bda4dfac202c1a9418e06ac0
> Author: Vitaly Kuznetsov <vkuznets>
> Date:   Wed Jul 11 19:37:18 2018 +0200
> 
>     x86/kvm/vmx: don't read current->thread.{fs,gs}base of legacy tasks
> 
> should land in v4.18-rc6

Thanks!

