Bug 1052856
Summary: boot vm -M rhel6.0.0 with qxl would cause qemu crash
Product: Red Hat Enterprise Linux 7
Reporter: xhan
Component: spice
Assignee: Default Assignee for SPICE Bugs <rh-spice-bugs>
Status: CLOSED ERRATA
QA Contact: Desktop QE <desktop-qa-list>
Severity: medium
Priority: unspecified
Version: 7.0
CC: cfergeau, dblechte, hhuang, juzhang, marcandre.lureau, mazhang, michen, qzhang, rbalakri, shuang, tpelka, virt-maint, xhan, xuhan
Target Milestone: rc
Keywords: OtherQA, Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: spice-0.12.4-8.el7
Doc Type: Bug Fix
Doc Text:
Previously, invalid drawing commands from guests using older machine types could cause QEMU to terminate unexpectedly.
To fix this bug, drawing commands with an invalid bounding box are now detected and rejected.
As a result, QEMU no longer terminates in this situation.
Last Closed: 2015-03-05 07:56:06 UTC
Type: Bug
Bug Depends On: 1135372
Description
xhan
2014-01-14 08:55:33 UTC
Created attachment 849819
bt_full

core file:
http://fileshare.englab.nay.redhat.com/pub/section2/coredump/var/crash/bz1052856.tar

trace:

qxl_io_write: qid=0x0 mode=0x7fdace4e61a6 addr=0x6 val=0x19 size=0x1 async=0x0
qxl_set_mode: qid=0x0 modenr=0x19 x_res=0x400 y_res=0x300 bits=0x20 devmem=0xf8000000

guest sets mode (old spice 0.4 way, which is what rhel6.0.0 aka qxl rev-1 supports). 1024x768

[ ... events snipped ... ]

qxl_spice_update_area: qid=0x0 surface_id=0x0 left=0x0 right=0x400 top=0x0 bottom=0x300
qxl_spice_update_area_rest: qid=0x0 num_dirty_rects=0x0 clear_dirty_region=0x1
qxl_interface_update_area_complete: qid=0x0 surface_id=0x0 dirty_left=0x135 dirty_right=0x2ca dirty_top=0x123 dirty_bottom=0x1dc
qxl_interface_update_area_complete_rest: qid=0x0 num_updated_rects=0x1
qxl_interface_update_area_complete_schedule_bh: qid=0x0 num_dirty=0x1
qxl_render_update_area_done: cookie=0x7fdacfce8800
qxl_render_blit: stride=0xfffffffffffff000 left=0x135 right=0x2ca top=0x123 bottom=0x1dc

one screen update cycle (probably requested by vnc server via update_hw)

qxl_spice_update_area: qid=0x0 surface_id=0x0 left=0x0 right=0x400 top=0x0 bottom=0x300
qxl_spice_update_area_rest: qid=0x0 num_dirty_rects=0x0 clear_dirty_region=0x1
qxl_interface_update_area_complete: qid=0x0 surface_id=0x0 dirty_left=0x135 dirty_right=0x2ca dirty_top=0x123 dirty_bottom=0x1dc
qxl_interface_update_area_complete_rest: qid=0x0 num_updated_rects=0x5
qxl_interface_update_area_complete_schedule_bh: qid=0x0 num_dirty=0x5
qxl_render_update_area_done: cookie=0x7fdacfce8800
qxl_render_blit: stride=0xfffffffffffff000 left=0x135 right=0x2ca top=0x123 bottom=0x1dc
qxl_render_blit: stride=0xfffffffffffff000 left=0x0 right=0x400 top=0x2fe bottom=0x300
qxl_render_blit: stride=0xfffffffffffff000 left=0x31e right=0x354 top=0x300 bottom=0x320

Next screen update cycle. The third dirty rectangle returned by spice-server is out of bounds (bottom=0x320 > y_res=0x300).
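The arithmetic behind that last remark can be checked with a minimal Python sketch. It is not QEMU or spice-server code: the helper names (`as_signed64`, `rect_in_bounds`, `out_of_bounds`) are hypothetical, and the numbers are copied from the trace above. It reinterprets the printed stride as a signed 64-bit value and tests each rectangle from the second update cycle against the 1024x768 mode.

```python
# Values copied from the trace; all helper names are hypothetical.
X_RES, Y_RES = 0x400, 0x300        # qxl_set_mode: 1024x768
STRIDE_RAW = 0xfffffffffffff000    # stride as printed by qxl_render_blit

# (left, top, right, bottom) of the three qxl_render_blit calls
# in the second screen update cycle.
DIRTY_RECTS = [
    (0x135, 0x123, 0x2ca, 0x1dc),
    (0x000, 0x2fe, 0x400, 0x300),
    (0x31e, 0x300, 0x354, 0x320),
]


def as_signed64(value):
    """Reinterpret an unsigned 64-bit value as a signed one."""
    return value - (1 << 64) if value & (1 << 63) else value


def rect_in_bounds(left, top, right, bottom, width, height):
    """True when the rectangle is non-empty and lies inside width x height."""
    return 0 <= left < right <= width and 0 <= top < bottom <= height


def out_of_bounds(rects, width, height):
    """Return the rectangles that do not fit inside the surface."""
    return [r for r in rects if not rect_in_bounds(*r, width, height)]


if __name__ == "__main__":
    print("stride as signed 64-bit:", as_signed64(STRIDE_RAW))
    for left, top, right, bottom in out_of_bounds(DIRTY_RECTS, X_RES, Y_RES):
        print(f"out of bounds: left={left} top={top} right={right} "
              f"bottom={bottom} (surface is {X_RES}x{Y_RES})")
```

Under these assumptions the stride reads as -4096 bytes per row (1024 pixels at 32 bits each, in magnitude), and only the third rectangle is flagged: its bottom edge of 800 lies past the 768-row surface.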
(In reply to Gerd Hoffmann from comment #3)
> qxl_spice_update_area: qid=0x0 surface_id=0x0 left=0x0 right=0x400 top=0x0 bottom=0x300
> qxl_spice_update_area_rest: qid=0x0 num_dirty_rects=0x0 clear_dirty_region=0x1
> qxl_interface_update_area_complete: qid=0x0 surface_id=0x0 dirty_left=0x135 dirty_right=0x2ca dirty_top=0x123 dirty_bottom=0x1dc
> qxl_interface_update_area_complete_rest: qid=0x0 num_updated_rects=0x5
> qxl_interface_update_area_complete_schedule_bh: qid=0x0 num_dirty=0x5
> qxl_render_update_area_done: cookie=0x7fdacfce8800
> qxl_render_blit: stride=0xfffffffffffff000 left=0x135 right=0x2ca top=0x123 bottom=0x1dc
> qxl_render_blit: stride=0xfffffffffffff000 left=0x0 right=0x400 top=0x2fe bottom=0x300
> qxl_render_blit: stride=0xfffffffffffff000 left=0x31e right=0x354 top=0x300 bottom=0x320
>
> Next screen update cycle. Third dirty rectangle returned by spice-server
> has out-of-bounds rectangle (bottom=0x320 > y_res=0x300).

where did you get the y_res from? did you manage to reproduce it?

I am using a slightly modified command line, and I can't reproduce:

qemu-kvm-1.5.3-47.el7.x86_64
spice-server-0.12.4-5.el7.x86_64

Can you reproduce with the following command line?
thanks

/usr/libexec/qemu-kvm \
    -snapshot \
    -name 'virt-tests-vm1' \
    -sandbox off \
    -M rhel6.0.0 \
    -nodefaults \
    -vga qxl \
    -global qxl-vga.vram_size=33554432 \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/tmp/monitor-qmpmonitor1-20140112-142311-ZB032Q8J,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -chardev socket,id=serial_id_serial0,path=/tmp/serial-serial0-20140112-142311-ZB032Q8J,server,nowait \
    -device isa-serial,chardev=serial_id_serial0 \
    -chardev socket,id=seabioslog_id_20140112-142311-ZB032Q8J,path=/tmp/seabios-20140112-142311-ZB032Q8J,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20140112-142311-ZB032Q8J,iobase=0x402 \
    -device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=03 \
    -device ahci,id=ahci0,bus=pci.0,addr=04 \
    -drive id=drive_image1,if=none,cache=unsafe,snapshot=off,aio=native,file=/var/lib/libvirt/images/rhel6 \
    -device ide-hd,id=image1,drive=drive_image1,bus=ahci0.0,unit=0 \
    -m 2048 \
    -smp 1,maxcpus=1,cores=1,threads=1,sockets=2 \
    -cpu 'Opteron_G2',+kvm_pv_unhalt \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -vnc :0 \
    -rtc base=utc,clock=host,driftfix=slew \
    -boot order=cdn,once=c,menu=off \
    -no-kvm-pit-reinjection \
    -bios /usr/share/seabios/bios.bin \
    -enable-kvm

use the command in #c5, also hit this crash on host:
qemu-kvm-rhev-1.5.3-52.el7.x86_64
kernel-3.10.0-107.el7.x86_64

(gdb) bt
#0  0x00007f2fe5f28ef9 in __memcpy_ssse3_back () from /lib64/libc.so.6
#1  0x00007f2feb2b6c3e in qxl_render_update_area_unlocked ()
#2  0x00007f2feb2b6f30 in qxl_render_update_area_bh ()
#3  0x00007f2feb22fae7 in aio_bh_poll ()
#4  0x00007f2feb22f738 in aio_poll ()
#5  0x00007f2feb22f9f0 in aio_ctx_dispatch ()
#6  0x00007f2fea66bac6 in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#7  0x00007f2feb307d1a in main_loop_wait ()
#8  0x00007f2feb22b460 in main ()

To reproduce this problem, you need to wait around 10 minutes after launching the guest with the qemu-kvm command line, then enter a monitor command such as "info status" to check whether it is still running. I suggest using -S and -monitor stdio in the command line to monitor the vm status.

I haven't been able to reproduce this either, though X would fail to start with -M rhel-6.0.0 on the f20 livecd I tried.
Couple of questions:
- which guest OS are you testing with?
- do you connect a client to the VM after starting it, or does it happen even without a client connection?
- in comment #6, you mention using -monitor stdio and typing 'info status' in order to reproduce, is it required to type 'info status' in the QEMU monitor to trigger the crash?
- in comment #6 you suggest using -S, I'm not sure what is the next step that should be followed to reproduce the bug after starting qemu with -S?

Could you also try reducing the command line size?

/usr/libexec/qemu-kvm \
    -snapshot \
    -name 'virt-tests-vm1' \
    -sandbox off \
    -M rhel6.0.0 \
    -nodefaults \
    -vga qxl \
    -global qxl-vga.vram_size=33554432 \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/tmp/monitor-qmpmonitor1-20140112-142311-ZB032Q8J,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -chardev socket,id=serial_id_serial0,path=/tmp/serial-serial0-20140112-142311-ZB032Q8J,server,nowait \
    -device isa-serial,chardev=serial_id_serial0 \
    -chardev socket,id=seabioslog_id_20140112-142311-ZB032Q8J,path=/tmp/seabios-20140112-142311-ZB032Q8J,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20140112-142311-ZB032Q8J,iobase=0x402 \
    -device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=03 \
    -device ahci,id=ahci0,bus=pci.0,addr=04 \
    -drive id=drive_image1,if=none,cache=unsafe,snapshot=off,aio=native,file=/var/lib/libvirt/images/rhel6 \
    -device ide-hd,id=image1,drive=drive_image1,bus=ahci0.0,unit=0 \
    -m 2048 \
    -smp 1,maxcpus=1,cores=1,threads=1,sockets=2 \
    -cpu 'Opteron_G2',+kvm_pv_unhalt \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -vnc :0 \
    -rtc base=utc,clock=host,driftfix=slew \
    -boot order=cdn,once=c,menu=off \
    -no-kvm-pit-reinjection \
    -bios /usr/share/seabios/bios.bin \
    -enable-kvm

eg, is the bug still happening if you remove -sandbox off? if you only keep something like

/usr/libexec/qemu-kvm \
    -snapshot \
    -sandbox off \
    -M rhel6.0.0 \
    -nodefaults \
    -vga qxl \
    -global qxl-vga.vram_size=33554432 \
    -device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=03 \
    -device ahci,id=ahci0,bus=pci.0,addr=04 \
    -drive id=drive_image1,if=none,cache=unsafe,snapshot=off,aio=native,file=/var/lib/libvirt/images/rhel6 \
    -device ide-hd,id=image1,drive=drive_image1,bus=ahci0.0,unit=0 \
    -m 2048 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -vnc :0 \
    -boot order=cdn,once=c,menu=off \
    -enable-kvm

(assuming qemu starts at all)? If it no longer happens with this command line, can you find out which option is required? If the bug is still reproducible this way, can you try removing more things? (-global qxl-vga.vram_size=33554432 -device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=03 -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 , ...)

You could also try to replace

    -device ahci,id=ahci0,bus=pci.0,addr=04 \
    -drive id=drive_image1,if=none,cache=unsafe,snapshot=off,aio=native,file=/var/lib/libvirt/images/rhel6 \
    -device ide-hd,id=image1,drive=drive_image1,bus=ahci0.0,unit=0 \

with

    -drive file=/var/lib/libvirt/images/rhel6

In short, the shorter you can make the qemu line needed to reproduce, the better ;)

Fwiw, the shortest qemu command line I could use to reproduce is

/usr/libexec/qemu-kvm \
    -M rhel6.0.0 \
    -nodefaults \
    -vga qxl \
    -drive file=rhel6 \
    -vnc :0 \
    -m 2048 \
    -enable-kvm

(haven't checked if I could get rid of -nodefaults or -enable-kvm fwiw). The -m 2048 seems required to get the crash as I could not reproduce without it or with -m 512.
and it doesn't crash with -M rhel6.1.0

The -M is the crucial argument for the crash. So the question is why -M rhel6.0.0 causes the crash.

This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to propose this request, if still desired, for consideration in the next release of Red Hat Enterprise Linux.

Although I have not identified the root cause yet, this is due to qxl device being set to revision 1, and thus usage of COMPAT flag, although xorg-qxl guest driver does not support it since 0.1.1-9 in rhel6. I am not sure this regression was intentional (bug 1078390)

What is the version of qxl driver in your guest?

It could be that guest compat structure is different (layout), since xorg driver doesn't use the definition from spice-protocol.

It could also be that the qemu ring isn't flushed, and some qemu QXLDrawable are wrongly cast to QXLCompatDrawable.

Qemu should probably do surface bound checking before doing qxl_blit()

still investigating this

(In reply to Marc-Andre Lureau from comment #15)
> Although I have not identified the root cause yet, this is due to qxl device
> being set to revision 1, and thus usage of COMPAT flag, although xorg-qxl
> guest driver does not support it since 0.1.1-9 in rhel6. I am not sure this
> regression was intentional (bug 1078390)
>
> What is the version of qxl driver in your guest?

I checked this issue with two guests (rhel 6.5 and rhel 7.0).

Booting the rhel 6.5 guest with the following command line produced the segfault described in the comments above:

/usr/libexec/qemu-kvm \
    -M rhel6.0.0 \
    -nodefaults \
    -vga qxl \
    -device ahci,id=ahci0,bus=pci.0,addr=04 \
    -drive id=drive_image1,if=none,cache=unsafe,snapshot=off,aio=native,file=/home/RHEL-Server-6.5-64-virtio.qcow2 \
    -device ide-hd,id=image1,drive=drive_image1,bus=ahci0.0,unit=0 \
    -vnc :0 \
    -m 2048 \
    -enable-kvm \
    -monitor stdio

However, it seems to work fine when using the spice protocol. And the rhel 7 guest will hit Bug 1043851, no matter which protocol is being used.
xorg-qxl version in each guest:
rhel 6.5 - xorg-x11-drv-qxl-0.1.0-7.el6.x86_64
rhel 7.0 - xorg-x11-drv-qxl-0.1.1-9.el7.x86_64

(In reply to Xu Han from comment #16)
> (In reply to Marc-Andre Lureau from comment #15)
> > Although I have not identified the root cause yet, this is due to qxl device
> > being set to revision 1, and thus usage of COMPAT flag, although xorg-qxl
> > guest driver does not support it since 0.1.1-9 in rhel6. I am not sure this
> > regression was intentional (bug 1078390)
> >
> > What is the version of qxl driver in your guest?

> rhel 6.5 - xorg-x11-drv-qxl-0.1.0-7.el6.x86_64

that version should support qxlpci version 1

> rhel 7.0 - xorg-x11-drv-qxl-0.1.1-9.el7.x86_64

that version no longer supports qxlpci version 1. I.e. we have the same bug with rhel 6.6 due to the rebase in bug 1078390

I can't reproduce the crash with 6.6 and xorg-x11-drv-qxl-0.1.0-7.el6.x86_64, but the display doesn't work either (it resizes 3 times with a gray gdm rectangle)

* With xorg-x11-drv-qxl-0.1.0-7

the gray gdm screen comes from X crashing with gdm (see /var/log/gdm/:0.log, it is missing fbCopyRegion). Even though starting Xorg manually works, the symbol no longer exists, as can be seen in compilation warnings too.

also interestingly, I haven't been able to reproduce the crash that easily lately, only <5%...

I think the compat code has been long unmaintained and untested, we should declare it officially deprecated.

* With xorg-x11-drv-qxl-0.1.1-12 (from rebase 1078390)

Black screen, no crash or X exit.

Is there really any interest in maintaining the rhel6.0 machine type with spice?

closing as WONTFIX, as no customers have reported using rhel 6.0 guest.
reopening, as there is a similar bug 1135372 in rhel6 and this is potentially a security issue

Upstream commit is
http://cgit.freedesktop.org/spice/spice/commit/?id=e270edcbfd958d764e84cdbca6d403ff24fef610

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0335.html