Bug 1158779

Summary: qemu crashes when migrating a guest in S3 state with the default migration speed and downtime.
Product: Red Hat Enterprise Linux 7 Reporter: Qian Guo <qiguo>
Component: qemu-kvm Assignee: Amit Shah <amit.shah>
Status: CLOSED WONTFIX QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: medium    
Version: 7.1 CC: amit.shah, hhuang, juzhang, michen, pbonzini, qzhang, rbalakri, shu, virt-maint
Target Milestone: rc Keywords: TestOnly
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1158780 Environment:
Last Closed: 2015-09-19 06:36:53 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1178655    
Bug Blocks: 923626, 1158780    

Description Qian Guo 2014-10-30 08:33:13 UTC
Description of problem:
qemu crashes as described in the summary when the migration takes some time to complete. The crash is related to the TSC:
qemu-kvm: /builddir/build/BUILD/qemu-1.5.3/hw/i386/kvm/clock.c:62: kvmclock_current_nsec: Assertion `time.tsc_timestamp <= migration_tsc' failed.
Aborted (core dumped)


If a large downtime and a high migration speed are set, the migration completes very quickly and the issue is not hit.
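
To make the failing check concrete, here is a minimal standalone C sketch of the kind of TSC-to-nanoseconds conversion that the assertion guards. It is not the qemu-kvm source: only `tsc_timestamp` and `migration_tsc` come from the assertion message above, the remaining fields follow the Linux pvclock layout, and the names `time_info`, `tsc_is_consistent` and `current_nsec` are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the per-vCPU time structure the guest shares with
 * the host. The type name is hypothetical; fields mirror the pvclock layout. */
struct time_info {
    uint64_t tsc_timestamp;      /* TSC value when the structure was last updated */
    uint64_t system_time;        /* guest time in ns at tsc_timestamp */
    uint32_t tsc_to_system_mul;  /* 32.32 fixed-point ns-per-tick multiplier */
    int8_t   tsc_shift;          /* shift applied to the TSC delta before scaling */
};

/* The precondition from the failing assertion: the saved timestamp must not
 * be ahead of the TSC value carried over by migration. */
static bool tsc_is_consistent(const struct time_info *t, uint64_t migration_tsc)
{
    return t->tsc_timestamp <= migration_tsc;
}

/* Convert the migrated TSC value to guest nanoseconds. */
static uint64_t current_nsec(const struct time_info *t, uint64_t migration_tsc)
{
    uint64_t delta, hi, lo;

    /* Without this check the unsigned subtraction below would wrap around. */
    assert(tsc_is_consistent(t, migration_tsc));
    delta = migration_tsc - t->tsc_timestamp;

    if (t->tsc_shift < 0) {
        delta >>= -t->tsc_shift;
    } else {
        delta <<= t->tsc_shift;
    }

    /* (delta * mul) >> 32, split into halves to stay within 64 bits. */
    hi = (delta >> 32) * t->tsc_to_system_mul;
    lo = ((delta & 0xffffffffULL) * t->tsc_to_system_mul) >> 32;
    return t->system_time + hi + lo;
}
```

With `tsc_timestamp` = 1000, `system_time` = 500, a multiplier of 0x80000000 (0.5 ns per tick) and no shift, `current_nsec(&t, 3000)` returns 1500. If instead `tsc_timestamp` were 3000 and `migration_tsc` 1000, the check would fail, which matches the abort seen here. Why the timestamp ends up ahead of the migrated TSC for a guest in S3 is not established in this report.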

Version-Release number of selected component (if applicable):
qemu-kvm-1.5.3-77.el7.x86_64
kernel-3.10.0-196.el7.x86_64
How reproducible:
100%

Steps to Reproduce:
1. Boot the guest on the source host:
/usr/libexec/qemu-kvm -cpu Penryn -enable-kvm -m 4096 -smp 4,sockets=1,cores=4,threads=1 -name rhel7base  -drive file=/mnt/rhel7u1/rhel7u1cp1.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0 -boot menu=on -monitor stdio -netdev tap,id=hostnet0,ifname=guest1,script=/etc/qemu-ifup,vhost=on,queues=4 -device virtio-net,netdev=hostnet0,mac=54:52:1b:35:3c:16,id=test,mq=on,vectors=9 -nodefaults -nodefconfig -spice disable-ticketing,port=5930,seamless-migration=on -vga qxl -global qxl-vga.vram_size=67108864   -device virtio-balloon-pci,id=balloon1 -qmp tcp:0:4446,server,nowait -device intel-hda,id=hda1 -device hda-duplex -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -serial unix:/tmp/qiguo,server,nowait

2. Suspend the guest to S3 from inside the guest:
# pm-suspend

3. Migrate the guest to the destination host without setting the migration downtime or speed.


Actual results:
On the destination host:

qemu-kvm: /builddir/build/BUILD/qemu-1.5.3/hw/i386/kvm/clock.c:62: kvmclock_current_nsec: Assertion `time.tsc_timestamp <= migration_tsc' failed.

Program received signal SIGABRT, Aborted.
0x00007ffff2c875e9 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff2c875e9 in raise () from /lib64/libc.so.6
#1  0x00007ffff2c88cf8 in abort () from /lib64/libc.so.6
#2  0x00007ffff2c80556 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007ffff2c80602 in __assert_fail () from /lib64/libc.so.6
#4  0x0000555555772e3d in kvmclock_vm_state_change ()
#5  0x000055555574559b in vm_state_notify ()
#6  0x00005555557455db in vm_start ()
#7  0x000055555564d22a in coroutine_trampoline ()
#8  0x00007ffff2c991d0 in ?? () from /lib64/libc.so.6
#9  0x00007fffffffb030 in ?? ()
#10 0x0000000000000000 in ?? ()
(gdb) bt ful
#0  0x00007ffff2c875e9 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007ffff2c88cf8 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x00007ffff2c80556 in __assert_fail_base () from /lib64/libc.so.6
No symbol table info available.
#3  0x00007ffff2c80602 in __assert_fail () from /lib64/libc.so.6
No symbol table info available.
#4  0x0000555555772e3d in kvmclock_vm_state_change ()
No symbol table info available.
#5  0x000055555574559b in vm_state_notify ()
No symbol table info available.
#6  0x00005555557455db in vm_start ()
No symbol table info available.
#7  0x000055555564d22a in coroutine_trampoline ()
No symbol table info available.
#8  0x00007ffff2c991d0 in ?? () from /lib64/libc.so.6
No symbol table info available.
#9  0x00007fffffffb030 in ?? ()
No symbol table info available.
#10 0x0000000000000000 in ?? ()
No symbol table info available.


Expected results:
No crash occurs.

Additional info:
1. The time was synchronized on all systems before the test.
2. If the migration finishes in a very short time, the issue is not hit.

Comment 3 Qian Guo 2015-01-05 10:12:34 UTC
Hi Amit,

Could you check whether bug 1178655 and bug 1158039 have the same trigger as this bug?

Thanks