Bug 1575541

Summary: qemu core dump while installing win10 guest
Product: Red Hat Enterprise Linux 7
Reporter: jingzhao <jinzhao>
Component: qemu-kvm-rhev
Assignee: Gerd Hoffmann <kraxel>
Status: CLOSED ERRATA
QA Contact: jingzhao <jinzhao>
Severity: high
Docs Contact:
Priority: high
Version: 7.6
CC: ailan, chayang, jinzhao, juzhang, knoel, kraxel, lists, virt-maint, xiaohli, yfu
Target Milestone: rc
Keywords: Regression
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.12.0-3.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-11-01 11:07:33 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description jingzhao 2018-05-07 09:02:30 UTC
Description of problem:
qemu core dump while installing win10 guest

Version-Release number of selected component (if applicable):
[root@dell-per730-29 home]# uname -r
3.10.0-883.el7.x86_64
[root@dell-per730-29 home]# rpm -qa |grep qemu-kvm-rhev
qemu-kvm-rhev-debuginfo-2.12.0-1.el7.x86_64
qemu-kvm-rhev-2.12.0-1.el7.x86_64
[root@dell-per730-29 home]# rpm -qa |grep ipxe
ipxe-roms-qemu-20170123-1.git4e85b27.el7_4.1.noarch


How reproducible:
2/2

Steps to Reproduce:
1. Install a win10 guest with the qemu command line in [1].
2. qemu dumps core during the guest installation.


Actual results:
qemu core dump

(gdb) bt full
#0  0x00007f043c5bf207 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007f043c5c08f8 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x00007f043c5b8026 in __assert_fail_base () from /lib64/libc.so.6
No symbol table info available.
#3  0x00007f043c5b80d2 in __assert_fail () from /lib64/libc.so.6
No symbol table info available.
#4  0x000055a9efe0cc81 in cpu_physical_memory_snapshot_get_dirty (snap=<optimized out>, start=<optimized out>, length=<optimized out>)
    at /usr/src/debug/qemu-2.12.0/exec.c:1252
        page = <optimized out>
        end = <optimized out>
        __PRETTY_FUNCTION__ = "cpu_physical_memory_snapshot_get_dirty"
#5  0x000055a9efe83005 in vga_update_display (opaque=<optimized out>) at /usr/src/debug/qemu-2.12.0/hw/display/vga.c:1671
        surface = <optimized out>
#6  0x000055a9f006641f in qemu_spice_display_refresh (ssd=0x55a9f36a6920) at ui/spice-display.c:478
No locals.
#7  0x000055a9f005cac2 in dpy_refresh (s=0x55a9f36981b0) at ui/console.c:1654
        dcl = 0x55a9f36a6928
#8  gui_update (opaque=0x55a9f36981b0) at ui/console.c:203
        interval = 3000
        dcl_interval = <optimized out>
        ds = 0x55a9f36981b0
        dcl = <optimized out>
        i = <optimized out>
#9  0x000055a9f0159001 in timerlist_run_timers (timer_list=0x55a9f22c7650) at util/qemu-timer.c:536
        ts = <optimized out>
        current_time = 20401409019310
        progress = <optimized out>
        cb = 0x55a9f005ca90 <gui_update>
        opaque = <optimized out>
#10 0x000055a9f01592e6 in qemu_clock_run_timers (type=<optimized out>) at util/qemu-timer.c:547
No locals.
#11 qemu_clock_run_all_timers () at util/qemu-timer.c:674
        progress = false
        type = <optimized out>
#12 0x000055a9f0159819 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:528
        ret = 0
        timeout = 4294967295
        timeout_ns = <optimized out>
#13 0x000055a9efe04577 in main_loop () at vl.c:1963
No locals.
#14 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4768
        i = <optimized out>
        snapshot = <optimized out>
        linux_boot = <optimized out>
        initrd_filename = <optimized out>
        kernel_filename = <optimized out>
        kernel_cmdline = <optimized out>
        boot_order = <optimized out>
        boot_once = 0x0
        ds = <optimized out>
        opts = <optimized out>
        machine_opts = <optimized out>
        icount_opts = <optimized out>
        accel_opts = <optimized out>
        olist = <optimized out>
        optind = 67
        optarg = 0x7ffe441695fe ":1"
        loadvm = <optimized out>
        machine_class = 0x0
        cpu_model = <optimized out>
        vga_model = 0x7ffe44169208 "qxl"
        qtest_chrdev = <optimized out>
        qtest_log = <optimized out>
        pid_file = <optimized out>
        incoming = <optimized out>
        userconfig = <optimized out>
        nographic = <optimized out>
        display_remote = <optimized out>
        log_mask = <optimized out>
        log_file = <optimized out>
        trace_file = <optimized out>
        maxram_size = <optimized out>
        ram_slots = <optimized out>
        vmstate_dump_file = <optimized out>
        main_loop_err = 0x0
        err = 0x0
        list_data_dirs = <optimized out>
        dir = <optimized out>
        dirs = 0x0
        bdo_queue = {sqh_first = 0x0, sqh_last = 0x7ffe44167c90}
        __func__ = "main"
        __FUNCTION__ = "main"



Expected results:
No core dump; the guest installs successfully.

Additional info:
[1]
/usr/libexec/qemu-kvm \
-M q35,accel=kvm,kernel-irqchip=split \
-device intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on \
-cpu Haswell-noTSX \
-nodefaults -rtc base=utc \
-m 4G \
-smp 4,sockets=4,cores=1,threads=1 \
-enable-kvm \
-uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
-k en-us \
-nodefaults \
-chardev file,path=/home/seabios.log,id=seabios -device isa-debugcon,chardev=seabios,iobase=0x402 \
-boot menu=on \
-qmp tcp:0:6667,server,nowait \
-usb \
-device usb-tablet \
-vga qxl \
-device pcie-root-port,bus=pcie.0,id=root0,multifunction=on,chassis=1,addr=0xa.0 \
-drive file=/home/win10.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0,bus=root0 \
-device pcie-root-port,bus=pcie.0,id=root1,chassis=11,addr=0xa.1 \
-device virtio-net-pci,netdev=tap10,mac=00:52:68:26:31:03,bus=root1,id=net0 -netdev tap,id=tap10 \
-device pcie-root-port,bus=pcie.0,id=root2,chassis=12,addr=0xa.2 \
-device virtio-net-pci,netdev=tap11,mac=00:52:68:26:31:00,bus=root2,id=net1 -netdev tap,id=tap11 \
-device pcie-root-port,bus=pcie.0,id=root6,chassis=15,addr=0xa.5 \
-device pcie-root-port,bus=pcie.0,id=root7,chassis=4 \
-cdrom en_windows_10_business_editions_version_1803_updated_march_2018_x64_dvd_12063333.iso \
-device ahci,id=ahci1 \
-drive file=/usr/share/virtio-win/virtio-win-1.9.4.iso,if=none,id=drive-virtio-disk1,format=raw \
-device ide-cd,unit=0,drive=drive-virtio-disk1,id=virtio-disk1,bus=ahci1.0 \
-monitor stdio \
-vnc :1

Comment 6 Gerd Hoffmann 2018-05-09 18:14:44 UTC
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=16142723
please test

Comment 8 Gerd Hoffmann 2018-05-14 10:42:38 UTC
https://patchwork.ozlabs.org/patch/912844/

Comment 9 Yanan Fu 2018-05-21 09:06:26 UTC
This is not a q35-only issue.
I hit this on the pc machine type as well, when installing a Win2008.i386.sp2 guest.

error:
qemu-kvm: /builddir/build/BUILD/qemu-2.12.0/exec.c:1252: cpu_physical_memory_snapshot_get_dirty: Assertion `start + length <= snap->end' failed.
/tmp/aexpect_O7d7kNfg/aexpect-6Teq1z.sh: line 1: 27553 Aborted


qemu: qemu-kvm-rhev-2.12.0-2.el7.x86_64
kernel: kernel-3.10.0-886.el7.x86_64


qemu command line:
MALLOC_PERTURB_=1  /usr/libexec/qemu-kvm \
    -S  \
    -name 'avocado-vt-vm1'  \
    -sandbox off  \
    -machine pc  \
    -nodefaults  \
    -vga qxl \
    -device pci-bridge,id=pci_bridge,bus=pci.0,addr=0x3,chassis_nr=1 \
    -device intel-hda,bus=pci.0,addr=0x4 \
    -device hda-duplex  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/avocado_grYFg2/monitor-qmpmonitor1-20180519-122305-qn3U5YMl,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/avocado_grYFg2/monitor-catch_monitor-20180519-122305-qn3U5YMl,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idYYrF14  \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/avocado_grYFg2/serial-serial0-20180519-122305-qn3U5YMl,server,nowait \
    -device isa-serial,chardev=serial_id_serial0 \
    -device virtio-serial-pci,id=virtio_serial_pci0,bus=pci.0,addr=0x5 \
    -chardev socket,path=/var/tmp/avocado_grYFg2/virtio_port-vs-20180519-122305-qn3U5YMl,nowait,id=idmDURkY,server \
    -device virtserialport,id=idv3qE85,name=vs,bus=virtio_serial_pci0.0,chardev=idmDURkY \
    -object rng-random,filename=/dev/random,id=passthrough-zSNfuif1 \
    -device virtio-rng-pci,id=virtio-rng-pci-eOn4YKGY,rng=passthrough-zSNfuif1,bus=pci.0,addr=0x6  \
    -chardev socket,id=seabioslog_id_20180519-122305-qn3U5YMl,path=/var/tmp/avocado_grYFg2/seabios-20180519-122305-qn3U5YMl,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20180519-122305-qn3U5YMl,iobase=0x402 \
    -device ich9-usb-ehci1,id=usb1,addr=0x1d.7,multifunction=on,bus=pci.0 \
    -device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=0x1d.0,firstport=0,bus=pci.0 \
    -device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=0x1d.2,firstport=2,bus=pci.0 \
    -device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=0x1d.4,firstport=4,bus=pci.0 \
    -device nec-usb-xhci,id=usb2,bus=pci.0,addr=0x7 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x8 \
    -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=unsafe,format=qcow2,file=/home/kvm_autotest_root/images/win2008-sp2-32-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -device virtio-net-pci,mac=9a:1e:1f:20:21:22,id=idrFnTtI,vectors=4,netdev=idYVmG0A,bus=pci.0,addr=0x9  \
    -netdev tap,id=idYVmG0A,vhost=on,vhostfd=21,fd=20 \
    -m 8192  \
    -smp 8,cores=4,threads=1,sockets=2  \
    -cpu 'Haswell-noTSX',hv_relaxed,+kvm_pv_unhalt,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time \
    -drive id=drive_cd1,if=none,snapshot=off,aio=threads,cache=unsafe,media=cdrom,file=/home/kvm_autotest_root/iso/ISO/Win2008/32/en_windows_server_2008_datacenter_enterprise_standard_sp2_x86_dvd_342333.iso \
    -device ide-cd,id=cd1,drive=drive_cd1,bus=ide.0,unit=0 \
    -drive id=drive_winutils,if=none,snapshot=off,aio=threads,cache=unsafe,media=cdrom,file=/home/kvm_autotest_root/iso/windows/winutils.iso \
    -device ide-cd,id=winutils,drive=drive_winutils,bus=ide.0,unit=1 \
    -drive id=drive_unattended,if=none,snapshot=off,aio=threads,cache=unsafe,media=cdrom,file=/home/kvm_autotest_root/images/win2008-sp2-32/autounattend.iso \
    -device ide-cd,id=unattended,drive=drive_unattended,bus=ide.1,unit=0 \
    -device usb-tablet,id=usb-tablet1,bus=usb2.0,port=1  \
    -spice port=3000,password=123456,addr=0,tls-port=3200,x509-dir=/tmp/spice_x509d,tls-channel=main,tls-channel=inputs,image-compression=auto_glz,zlib-glz-wan-compression=auto,streaming-video=all,agent-mouse=on,playback-compression=on,ipv4  \
    -rtc base=localtime,clock=host,driftfix=slew  \
    -boot order=cdn,once=d,menu=off,strict=off  \
    -no-hpet \
    -enable-kvm  \
    -watchdog i6300esb \
    -watchdog-action reset \
    -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xa

Comment 10 Yanan Fu 2018-05-21 11:47:29 UTC
Hit another scenario in automation that makes qemu abort with the same output.
case name: qemu_disk_img.rebase.snB.to_base

Create the snapshot chain base --> snA --> snB, rebase snB onto base, then boot the VM with snB; qemu aborts.


Test steps:
1. Boot with the base image "win2016-64-virtio-scsi.qcow2".
2. Write a file in the VM (I/O, creates new data).
3. Shut down the VM.
4. Create a new image based on "win2016-64-virtio-scsi.qcow2":
# qemu-img create -f qcow2 -b /home/kvm_autotest_root/images/win2016-64-virtio-scsi.qcow2 -F qcow2 /home/kvm_autotest_root/images/snA.qcow2 30G

5. Boot with snA.qcow2.
6. Write a file in the VM (I/O, creates new data).
7. Shut down the VM.
8. Create a new image based on "snA.qcow2":
# qemu-img create -f qcow2 -b /home/kvm_autotest_root/images/snA.qcow2 -F qcow2 /home/kvm_autotest_root/images/snB.qcow2 30G

9. Boot with snB.qcow2.
10. Write a file in the VM (I/O, creates new data).
11. Shut down the VM.
12. Rebase snB.qcow2 onto the base image:
# qemu-img rebase -f qcow2 -b /home/kvm_autotest_root/images/win2016-64-virtio-scsi.qcow2 -F qcow2 /home/kvm_autotest_root/images/snB.qcow2

13. Boot with snB.qcow2.
qemu aborts with:
[qemu output] qemu-kvm: /builddir/build/BUILD/qemu-2.12.0/exec.c:1252: cpu_physical_memory_snapshot_get_dirty: Assertion `start + length <= snap->end' failed.



Test version:
qemu: qemu-kvm-rhev-2.12.0-2.el7.x86_64
kernel: kernel-3.10.0-886.el7.x86_64
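
The image-chain part of the steps above (steps 4, 8 and 12) can be sketched as a small shell script. This is only a sketch: the helper names (run, make_overlay, rebase_overlay, build_chain) and the DRY_RUN, IMG_DIR, BASE and SIZE variables are hypothetical names introduced here, the qemu-img invocations are copied from the steps above, and the guest boot/write/shutdown steps in between are not automated. With DRY_RUN=1 (the default in this sketch) the commands are only printed.

```shell
#!/bin/sh
# Sketch of the image-chain steps (4, 8, 12) from this comment.
# All helper and variable names below are assumptions made for illustration.

IMG_DIR=${IMG_DIR:-/home/kvm_autotest_root/images}
BASE=${BASE:-$IMG_DIR/win2016-64-virtio-scsi.qcow2}
SIZE=${SIZE:-30G}
DRY_RUN=${DRY_RUN:-1}

# Print the command; execute it only when DRY_RUN is 0.
run() {
    echo "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# make_overlay BACKING OVERLAY: create a qcow2 overlay on top of BACKING.
make_overlay() {
    run qemu-img create -f qcow2 -b "$1" -F qcow2 "$2" "$SIZE"
}

# rebase_overlay NEW_BACKING OVERLAY: point OVERLAY at NEW_BACKING.
rebase_overlay() {
    run qemu-img rebase -f qcow2 -b "$1" -F qcow2 "$2"
}

# The guest has to be booted, written to, and shut down between these
# calls (steps 5-7 and 9-11); that part is intentionally left out.
build_chain() {
    make_overlay "$BASE" "$IMG_DIR/snA.qcow2"                # step 4
    make_overlay "$IMG_DIR/snA.qcow2" "$IMG_DIR/snB.qcow2"   # step 8
    rebase_overlay "$BASE" "$IMG_DIR/snB.qcow2"              # step 12
}
```

Calling build_chain prints the three qemu-img commands in order; setting DRY_RUN=0 before sourcing the script would also execute them.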

Comment 11 Gerd Hoffmann 2018-05-29 05:19:53 UTC
upstream commit a89fe6c329799e47aaa1663650f076b28808e186

Comment 12 Gerd Hoffmann 2018-05-29 05:20:50 UTC
*** Bug 1580355 has been marked as a duplicate of this bug. ***

Comment 14 Gerd Hoffmann 2018-05-29 10:57:34 UTC
posted to rhvirt-patches.

Comment 17 lists 2018-05-30 08:56:17 UTC
I face the same (?) error on gentoo and look for a solution.
Will your patch trickle into upstream?

//

guest: Windows Server 2012 R2
host: Gentoo amd64 server, stable

The guest shuts down for no visible reason.
Today I found this in the log:

# cat windows-server.log
qemu-system-x86_64: /var/tmp/portage/app-emulation/qemu-2.11.1-r2/work/qemu-2.11.1/exec.c:1212: cpu_physical_memory_snapshot_get_dirty: Assertion `start + length <= snap->end' failed.
2018-05-30 04:20:13.979+0000: shutting down, reason=crashed
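
To check whether a crash recorded in some other log matches this bug, the assertion text can be searched for directly. A minimal sketch, assuming a plain-text log such as the windows-server.log above; SNAP_ASSERT and has_snapshot_assert are names made up for this example, and a match only shows that the same assertion fired, not that the root cause is the same.

```shell
# The assertion message quoted in this bug, without the build-specific
# source path and line number (those differ between 2.11.1 and 2.12.0).
SNAP_ASSERT="cpu_physical_memory_snapshot_get_dirty: Assertion \`start + length <= snap->end' failed."

# has_snapshot_assert LOGFILE
# Succeeds (exit 0) when LOGFILE contains the assertion message.
# grep -F matches the text literally, so "+" and "." need no escaping.
has_snapshot_assert() {
    grep -qF -- "$SNAP_ASSERT" "$1"
}
```

For example, `has_snapshot_assert windows-server.log && echo "same assertion"` would report a match for the log shown above.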

Comment 18 Miroslav Rezanina 2018-06-01 09:05:17 UTC
Fix included in qemu-kvm-rhev-2.12.0-3.el7
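
A quick way to tell whether a host already carries the fix is to compare the installed package version with the fixed one. A rough sketch, assuming GNU "sort -V" ordering is close enough to RPM's own version comparison for these strings (the two are not guaranteed to agree in every case); FIXED and version_ge are names introduced for this example.

```shell
# First build that contains the fix, per this comment.
FIXED="2.12.0-3.el7"

# version_ge A B: succeed when A >= B under GNU "sort -V" ordering.
# The smaller of the two strings sorts first; A >= B exactly when that
# smallest string is B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# On a host, the installed version-release could be obtained with:
#   rpm -q --qf '%{VERSION}-%{RELEASE}\n' qemu-kvm-rhev
```

For example, `version_ge "2.12.0-1.el7" "$FIXED" || echo "affected"` flags the version from the original description as not yet fixed.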

Comment 20 Ademar Reis 2018-06-01 16:41:29 UTC
(In reply to lists from comment #17)
> I face the same (?) error on gentoo and look for a solution.
> Will your patch trickle into upstream?
> 

Yes, see comment #11. BTW, all of our patches get merged upstream first.

Comment 22 errata-xmlrpc 2018-11-01 11:07:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3443