Bug 1261797 - contents of MSR_TSC_AUX are not migrated
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.2
Hardware: x86_64 Windows
Priority: medium  Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Amit Shah
QA Contact: Virtualization Bugs
Docs Contact: Jiri Herrmann
Depends On:
Blocks: 1305606 1313485 1265427 1265428 1287070 1288337
Reported: 2015-09-10 03:44 EDT by Xiaoqing Wei
Modified: 2016-11-07 15:37 EST (History)

See Also:
Fixed In Version: qemu-kvm-rhev-2.5.0-1.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1265427 1265428 (view as bug list)
Environment:
Last Closed: 2016-11-07 15:37:33 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
bsod minidump (271.53 KB, application/octet-stream)
2015-09-10 03:49 EDT, Xiaoqing Wei
windbg output (5.91 KB, text/plain)
2015-09-10 03:50 EDT, Xiaoqing Wei
Description Xiaoqing Wei 2015-09-10 03:44:38 EDT
Description of problem:

A Windows 10 x86_64 guest hits a BSOD during migration while it is formatting an emulated usb-storage device.

Version-Release number of selected component (if applicable):
kernel-3.10.0-314.el7.x86_64
qemu-kvm-rhev-2.3.0-22.el7.x86_64
spice-server-0.12.4-13.el7.x86_64


How reproducible:
1/5

Happened once; 4 further attempts to reproduce it failed.

Steps to Reproduce:
1. boot a vm with emulated usb-storage

/usr/libexec/qemu-kvm -monitor stdio \
    -S  \
    -name 'virt-tests-vm1'  \
    -sandbox off  \
    -machine pc  \
    -nodefaults  \
    -vga qxl  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/tmp/monitor-virt-tests-vm1-qmpmonitor1-20150908-154232-mKWvp7CD,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/tmp/monitor-virt-tests-vm1-catch_monitor-20150908-154232-mKWvp7CD,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idHrfoDK  \
    -chardev socket,id=serial_id_serial0,path=/tmp/serial-serial0-20150908-154232-mKWvp7CD,server,nowait \
    -device isa-serial,chardev=serial_id_serial0  \
    -chardev socket,id=seabioslog_id_20150908-154232-mKWvp7CD,path=/tmp/seabios-20150908-154232-mKWvp7CD,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20150908-154232-mKWvp7CD,iobase=0x402 \
    -device nec-usb-xhci,id=usb1,bus=pci.0,addr=03 \
    -drive id=drive_image1,if=none,cache=none,snapshot=off,aio=native,format=qcow2,file='/home/win10.qcow2' \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=1,bus=pci.0,addr=04,disable-legacy=off,disable-modern=on \
    -device virtio-net-pci,mac=9a:3a:3b:3c:3d:3e,id=idNIRZOu,vectors=4,netdev=idZpUNpM,bus=pci.0,addr=05,disable-legacy=off,disable-modern=on  \
    -netdev tap,id=idZpUNpM,vhost=on  \
    -m 2048  \
    -smp 2,maxcpus=2,cores=1,threads=1,sockets=2  \
    -cpu 'SandyBridge',+sep,+kvm_pv_unhalt,hv_spinlocks=0x1fff,hv_vapic,hv_time \
    -drive id=drive_cd1,if=none,snapshot=off,aio=native,media=cdrom,file='/home/en_windows_10_enterprise_x64_dvd_6851151.iso' \
    -device ide-cd,id=cd1,drive=drive_cd1,bootindex=10,bus=ide.0,unit=0 \
    -drive id=drive_winutils,if=none,snapshot=off,aio=native,media=cdrom,file=/home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/isos/windows/winutils.iso \
    -device ide-cd,id=winutils,drive=drive_winutils,bootindex=2,bus=ide.0,unit=1 \
    -drive id=drive_unattended,if=none,snapshot=off,aio=native,media=cdrom,file=/home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/images/win8.1-64/autounattend.iso \
    -device ide-cd,id=unattended,drive=drive_unattended,bootindex=3,bus=ide.1,unit=0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -spice port=5900,disable-ticketing,addr=0,image-compression=auto_glz,zlib-glz-wan-compression=auto,streaming-video=all,agent-mouse=on,playback-compression=on,ipv4  \
    -rtc base=localtime,clock=host,driftfix=slew  \
    -drive id=drive_image2,if=none,cache=none,snapshot=off,aio=native,format=qcow2,file='/home/1G.qcow2' \
    -enable-kvm \
    -device usb-storage,drive=drive_image2,id=virt0-0-1,bus=usb1.0,bootindex=20,serial=3ff69574-14e5-4139-8f5c-942ea8748ad7 \
    -boot order=cdn,once=d,menu=off,strict=off \
    -chardev spicevmc,id=charredir0,name=usbredir \
    -device usb-redir,chardev=charredir0,id=redir0



2. inside of guest:

diskpart
sel disk 1        -> this is the emulated usb
create part pri   -> create a primary partition
format fs=ntfs label=usb -> full format as NTFS, sector by sector, i.e. without the 'quick' option

3. migrate this vm

/usr/libexec/qemu-kvm -monitor stdio \
    -name 'virt-tests-vm1'  \
    -sandbox off  \
    -machine pc  \
    -nodefaults  \
    -vga qxl  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/tmp/monitor-virt-tests-vm1-qmpmonitor1-20150908-154232-mKWvp7CD,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/tmp/monitor-virt-tests-vm1-catch_monitor-20150908-154232-mKWvp7CD,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idHrfoDK  \
    -chardev socket,id=serial_id_serial0,path=/tmp/serial-serial0-20150908-154232-mKWvp7CD,server,nowait \
    -device isa-serial,chardev=serial_id_serial0  \
    -chardev socket,id=seabioslog_id_20150908-154232-mKWvp7CD,path=/tmp/seabios-20150908-154232-mKWvp7CD,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20150908-154232-mKWvp7CD,iobase=0x402 \
    -device nec-usb-xhci,id=usb1,bus=pci.0,addr=03 \
    -drive id=drive_image1,if=none,cache=none,snapshot=off,aio=native,format=qcow2,file='/home/win10.qcow2' \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=1,bus=pci.0,addr=04,disable-legacy=off,disable-modern=on \
    -device virtio-net-pci,mac=9a:3a:3b:3c:3d:3e,id=idNIRZOu,vectors=4,netdev=idZpUNpM,bus=pci.0,addr=05,disable-legacy=off,disable-modern=on  \
    -netdev tap,id=idZpUNpM,vhost=on  \
    -m 2048  \
    -smp 2,maxcpus=2,cores=1,threads=1,sockets=2  \
    -cpu 'SandyBridge',+sep,+kvm_pv_unhalt,hv_spinlocks=0x1fff,hv_vapic,hv_time \
    -drive id=drive_cd1,if=none,snapshot=off,aio=native,media=cdrom,file='/home/en_windows_10_enterprise_x64_dvd_6851151.iso' \
    -device ide-cd,id=cd1,drive=drive_cd1,bootindex=10,bus=ide.0,unit=0 \
    -drive id=drive_winutils,if=none,snapshot=off,aio=native,media=cdrom,file=/home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/isos/windows/winutils.iso \
    -device ide-cd,id=winutils,drive=drive_winutils,bootindex=2,bus=ide.0,unit=1 \
    -drive id=drive_unattended,if=none,snapshot=off,aio=native,media=cdrom,file=/home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/images/win8.1-64/autounattend.iso \
    -device ide-cd,id=unattended,drive=drive_unattended,bootindex=3,bus=ide.1,unit=0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -spice port=5910,disable-ticketing,addr=0,image-compression=auto_glz,zlib-glz-wan-compression=auto,streaming-video=all,agent-mouse=on,playback-compression=on,ipv4  \
    -rtc base=localtime,clock=host,driftfix=slew  \
    -drive id=drive_image2,if=none,cache=none,snapshot=off,aio=native,format=qcow2,file='/home/1G.qcow2' \
    -enable-kvm \
    -device usb-storage,drive=drive_image2,id=virt0-0-1,bus=usb1.0,bootindex=20,serial=3ff69574-14e5-4139-8f5c-942ea8748ad7 \
    -boot order=cdn,once=d,menu=off,strict=off \
    -chardev spicevmc,id=charredir0,name=usbredir \
    -device usb-redir,chardev=charredir0,id=redir0 \
    -incoming tcp:0:4000


on src qemu:
migrate -d tcp:xx:yy


Actual results:

The guest hits a BSOD within a few minutes.
The minidump should be available in C:\windows\minidump.

Expected results:
both host and guest work well

Additional info:
Comment 1 Xiaoqing Wei 2015-09-10 03:49:31 EDT
Created attachment 1072052 [details]
bsod minidump
Comment 2 Xiaoqing Wei 2015-09-10 03:50:33 EDT
Created attachment 1072053 [details]
windbg output
Comment 3 Xiaoqing Wei 2015-09-10 03:51:24 EDT
1: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

CRITICAL_STRUCTURE_CORRUPTION (109)
This bugcheck is generated when the kernel detects that critical kernel code or
data have been corrupted. There are generally three causes for a corruption:
1) A driver has inadvertently or deliberately modified critical kernel code
 or data. See http://www.microsoft.com/whdc/driver/kernel/64bitPatching.mspx
2) A developer attempted to set a normal kernel breakpoint using a kernel
 debugger that was not attached when the system was booted. Normal breakpoints,
 "bp", can only be set if the debugger is attached at boot time. Hardware
 breakpoints, "ba", can be set at any time.
3) A hardware corruption occurred, e.g. failing RAM holding kernel code or data.
Arguments:
Arg1: a3a01f58a88e3638, Reserved
Arg2: b3b72bdefb0f076f, Reserved
Arg3: 00000001c0000103, Failure type dependent information
Arg4: 0000000000000007, Type of corrupted region, can be
	0   : A generic data region
	1   : Modification of a function or .pdata
	2   : A processor IDT
	3   : A processor GDT
	4   : Type 1 process list corruption
	5   : Type 2 process list corruption
	6   : Debug routine modification
	7   : Critical MSR modification
	8   : Object type
	9   : A processor IVT
	a   : Modification of a system service function
	b   : A generic session data region
	c   : Modification of a session function or .pdata
	d   : Modification of an import table
	e   : Modification of a session import table
	f   : Ps Win32 callout modification
	10  : Debug switch routine modification
	11  : IRP allocator modification
	12  : Driver call dispatcher modification
	13  : IRP completion dispatcher modification
	14  : IRP deallocator modification
	15  : A processor control register
	16  : Critical floating point control register modification
	17  : Local APIC modification
	18  : Kernel notification callout modification
	19  : Loaded module list modification
	1a  : Type 3 process list corruption
	1b  : Type 4 process list corruption
	1c  : Driver object corruption
	1d  : Executive callback object modification
	1e  : Modification of module padding
	1f  : Modification of a protected process
	20  : A generic data region
	21  : A page hash mismatch
	22  : A session page hash mismatch
	23  : Load config directory modification
	24  : Inverted function table modification
	25  : Session configuration modification
	102 : Modification of win32k.sys
Comment 5 Gerd Hoffmann 2015-09-11 05:57:03 EDT
> CRITICAL_STRUCTURE_CORRUPTION (109)

> Arg4: 0000000000000007, Type of corrupted region, can be

> 	7   : Critical MSR modification

Hmm, that doesn't look usb-storage related at all.
Probably formatting usb-storage just creates some load
which increases the chance to hit this.

Cc'ing paolo.
Comment 7 Paolo Bonzini 2015-09-22 18:21:02 EDT
The first three arguments are "reserved", but they strongly look like old value, new value (or a hash of it) and MSR index.  In fact the third is definitely the MSR index and it is MSR_TSC_AUX.

Do you remember if the crash happened before migration finished, or afterwards?
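
The reading of Arg3 above can be checked with a few lines of arithmetic. This is only a sketch: it assumes, as the comment does, that the low 32 bits of Arg3 hold the MSR index. The meaning of the upper 32 bits is not documented and is not interpreted here. The helper names `low32` and `high32` are made up for illustration.

```python
# Arg3 as printed by "!analyze -v" in comment 3.
ARG3 = 0x00000001C0000103

# Architectural MSR number of IA32_TSC_AUX (the register RDTSCP returns in ECX).
MSR_TSC_AUX = 0xC0000103


def low32(value: int) -> int:
    """Return the low 32 bits of a 64-bit bugcheck argument."""
    return value & 0xFFFFFFFF


def high32(value: int) -> int:
    """Return the high 32 bits of a 64-bit bugcheck argument."""
    return (value >> 32) & 0xFFFFFFFF


msr_index = low32(ARG3)
print(hex(msr_index))            # 0xc0000103
print(msr_index == MSR_TSC_AUX)  # True
```

Combined with Arg4 = 7 ("Critical MSR modification"), this supports the view that the guest saw MSR_TSC_AUX change underneath it.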
Comment 8 Paolo Bonzini 2015-09-22 18:22:15 EDT
It looks like QEMU is not saving and restoring MSR_TSC_AUX.
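
A toy model may help show why a missing save/restore would produce exactly this bugcheck. This is not QEMU code: the `restore` and `changed_msrs` helpers, the set of "migrated" MSRs, and all register values are hypothetical, and the model simply assumes that an MSR left out of the migration stream comes up with a reset value of 0 on the destination.

```python
MSR_STAR = 0xC0000081     # an MSR that is part of the migrated state in this toy model
MSR_TSC_AUX = 0xC0000103  # the MSR suspected of being left out


def restore(src_msrs, migrated, reset_value=0):
    """Build destination MSR state: migrated MSRs keep their value, others reset."""
    return {
        index: (value if index in migrated else reset_value)
        for index, value in src_msrs.items()
    }


def changed_msrs(before, after):
    """List MSR indices whose value differs across the migration."""
    return sorted(index for index in before if before[index] != after[index])


# Illustrative values only; the real guest values are not known from this report.
source = {MSR_STAR: 0x1234, MSR_TSC_AUX: 1}
destination = restore(source, migrated={MSR_STAR})

for index in changed_msrs(source, destination):
    print(hex(index), source[index], "->", destination[index])
```

In this model only MSR_TSC_AUX changes, which is the kind of silent modification a guest integrity check would flag. It would also fit the low reproduction rate if the guest only inspects the register occasionally, though that is speculation.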
Comment 9 Xiaoqing Wei 2015-09-22 23:15:38 EDT
(In reply to Paolo Bonzini from comment #7)
> The first three arguments are "reserved", but they strongly look like old
> value, new value (or a hash of it) and MSR index.  In fact the third is
> definitely the MSR index and it is MSR_TSC_AUX.
> 
> Do you remember if the crash happened before migration finished, or
> afterwards?

not sure :(
I started the formatting in the guest, typed 'migrate -d ' in the qemu monitor and left for a while. When I came back, the cmd terminal in the guest was gone, so I checked whether there was a dump, and there was.

Then I tried to reproduce it as in C#1, but failed: 4 attempts with exactly identical steps, on the origin host, no luck.
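
For a future attempt, the timing question could be settled by logging the migration status from a QMP monitor alongside the time of the crash. The sketch below uses the QMP `query-migrate` command, whose reply carries a `status` field once a migration has been started. The function names and the socket path argument are hypothetical, and it assumes a QMP socket that no other client (such as the test harness) is already holding.

```python
import json
import socket


def qmp_command(name):
    """Encode a QMP command without arguments as one JSON line."""
    return (json.dumps({"execute": name}) + "\r\n").encode()


def read_reply(stream):
    """Read lines until a command reply arrives, skipping asynchronous events."""
    while True:
        line = stream.readline()
        if not line:
            raise ConnectionError("QMP socket closed")
        message = json.loads(line)
        if "return" in message or "error" in message:
            return message


def migration_status(reply):
    """Extract 'status' from a query-migrate reply; None if it is absent."""
    return reply.get("return", {}).get("status")


def query_migrate(sock_path):
    """Connect to a QMP UNIX socket (path is an assumption) and query migration."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        stream = sock.makefile("rwb")
        stream.readline()  # greeting banner sent by QEMU on connect
        stream.write(qmp_command("qmp_capabilities"))
        stream.flush()
        read_reply(stream)
        stream.write(qmp_command("query-migrate"))
        stream.flush()
        return migration_status(read_reply(stream))
```

Calling `query_migrate()` in a loop on the source and printing a timestamp with each result would show whether the status had already reached "completed" when the minidump was written.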
Comment 28 errata-xmlrpc 2016-11-07 15:37:33 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2673.html
