Bug 1095628
| Summary: | Machine type rhel6.0.0 & -vga qxl & vnc cause qemu-kvm core dump | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | FuXiangChun <xfu> | ||||
| Component: | qemu-kvm | Assignee: | Ronen Hod <rhod> | ||||
| Status: | CLOSED WONTFIX | QA Contact: | Virtualization Bugs <virt-bugs> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | 7.0 | CC: | bcao, dblechte, hhuang, huding, juzhang, knoel, kraxel, linchen, mazhang, michen, rbalakri, rhod, sluo, virt-maint, xfu | ||||
| Target Milestone: | rc | ||||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2014-10-05 07:38:17 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
(In reply to FuXiangChun from comment #0)
> Expected results:
>
>
> Additional info:
> 1.Machine type rhel6.1.0 ~ rhel6.5.0 ->works
>
> 2.runlevel 3 ->works
>
> 3.rhel6.0.0 & -vga qxl & spice->works
>
> Question:
> For windows guest. Fail to load qxl driver when booting guest with machine
> type rhel6.0.0. Are they the same issue? If not, Do QE need to file another
> bug to track it?

Please provide the Windows operating system platform and the qxl version you are using.

Regarding the Windows guest and qxl version: QE tested a win7 64-bit guest with the qxl-win-0.1-21 driver. I will add a device manager screenshot of the win7 64-bit guest.

Created attachment 893587
snapshot qxl driver
BTW, tested a rhel7.0 guest as in comment 0. Result:
1. The guest shows a black screen when booting the rhel7.0 guest, and no message can be obtained via the console.
2. Executing system_reset via the monitor makes qemu-kvm dump core:

(qemu) system_reset
(qemu) id 0, group 0, virt start 0, virt end ffffffffffffffff, generation 0, delta 0
(/usr/libexec/qemu-kvm:53843): Spice-CRITICAL **: red_memslots.c:94:validate_virt: virtual address out of range
virt=0x1000398+0xbf slot_id=1 group_id=1
slot=0x0-0x0 delta=0x0

Thread 6 (Thread 0x7f9643d0d700 (LWP 53847)):
#0 0x00007f9652c63705 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f9654f07f69 in qemu_cond_wait (cond=<optimized out>, mutex=mutex@entry=0x7f965571e720 <qemu_global_mutex>) at util/qemu-thread-posix.c:116
#2 0x00007f9654e04fab in qemu_kvm_wait_io_event (env=0x7f9657cb89d0) at /usr/src/debug/qemu-1.5.3/cpus.c:761
#3 qemu_kvm_cpu_thread_fn (arg=0x7f9657cb89d0) at /usr/src/debug/qemu-1.5.3/cpus.c:798
#4 0x00007f9652c5fdf3 in start_thread () from /lib64/libpthread.so.0
#5 0x00007f964f96b3dd in clone () from /lib64/libc.so.6

Thread 5 (Thread 0x7f9641635700 (LWP 53848)):
#0 0x00007f9652c6625d in read () from /lib64/libpthread.so.0
#1 0x00007f965065f421 in spice_backtrace_gstack () from /lib64/libspice-server.so.1
#2 0x00007f9650666d67 in spice_logv () from /lib64/libspice-server.so.1
#3 0x00007f9650666ec5 in spice_log () from /lib64/libspice-server.so.1
#4 0x00007f9650625461 in validate_virt () from /lib64/libspice-server.so.1
#5 0x00007f965062557b in get_virt () from /lib64/libspice-server.so.1
#6 0x00007f96506271cf in red_get_drawable () from /lib64/libspice-server.so.1
#7 0x00007f965063b782 in red_process_commands.constprop.139 () from /lib64/libspice-server.so.1
#8 0x00007f965063e6ab in flush_display_commands () from /lib64/libspice-server.so.1
#9 0x00007f965063ee27 in handle_dev_destroy_surfaces () from /lib64/libspice-server.so.1
#10 0x00007f9650622463 in dispatcher_handle_recv_read () from /lib64/libspice-server.so.1
#11 0x00007f9650645ff5 in red_worker_main () from /lib64/libspice-server.so.1
#12 0x00007f9652c5fdf3 in start_thread () from /lib64/libpthread.so.0
#13 0x00007f964f96b3dd in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x7f9640bff700 (LWP 53850)):
#0 0x00007f9652c63705 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f9654f07f69 in qemu_cond_wait (cond=cond@entry=0x7f9657d53f60, mutex=mutex@entry=0x7f9657d53f90) at util/qemu-thread-posix.c:116
#2 0x00007f9654df6bf3 in vnc_worker_thread_loop (queue=queue@entry=0x7f9657d53f60) at ui/vnc-jobs.c:222
#3 0x00007f9654df7068 in vnc_worker_thread (arg=0x7f9657d53f60) at ui/vnc-jobs.c:318
#4 0x00007f9652c5fdf3 in start_thread () from /lib64/libpthread.so.0
#5 0x00007f964f96b3dd in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7f95377fe700 (LWP 53957)):
#0 0x00007f9652c658a0 in sem_timedwait () from /lib64/libpthread.so.0
#1 0x00007f9654f080b7 in qemu_sem_timedwait (sem=sem@entry=0x7f9657b4aac8, ms=ms@entry=10000) at util/qemu-thread-posix.c:238
#2 0x00007f9654de072c in worker_thread (opaque=0x7f9657b4aa30) at thread-pool.c:96
#3 0x00007f9652c5fdf3 in start_thread () from /lib64/libpthread.so.0
#4 0x00007f964f96b3dd in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7f9537fff700 (LWP 53958)):
#0 0x00007f9652c658a0 in sem_timedwait () from /lib64/libpthread.so.0
#1 0x00007f9654f080b7 in qemu_sem_timedwait (sem=sem@entry=0x7f9657b4aac8, ms=ms@entry=10000) at util/qemu-thread-posix.c:238
#2 0x00007f9654de072c in worker_thread (opaque=0x7f9657b4aa30) at thread-pool.c:96
#3 0x00007f9652c5fdf3 in start_thread () from /lib64/libpthread.so.0
#4 0x00007f964f96b3dd in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7f9654bdaa40 (LWP 53843)):
#0 0x00007f9652c6625d in read () from /lib64/libpthread.so.0
#1 0x00007f96506221a4 in read_safe () from /lib64/libspice-server.so.1
#2 0x00007f9650622657 in dispatcher_send_message () from /lib64/libspice-server.so.1
#3 0x00007f9650622aca in qxl_worker_destroy_surfaces () from /lib64/libspice-server.so.1
#4 0x00007f9654e22f76 in qxl_spice_destroy_surfaces (qxl=0x7f9657d05230, async=<optimized out>) at /usr/src/debug/qemu-1.5.3/hw/display/qxl.c:247
#5 0x00007f9654e24d21 in qxl_reset_surfaces (d=0x7f9657d05230) at /usr/src/debug/qemu-1.5.3/hw/display/qxl.c:1303
#6 qxl_hard_reset (d=0x7f9657d05230, loadvm=0) at /usr/src/debug/qemu-1.5.3/hw/display/qxl.c:1153
#7 0x00007f9654d2ed69 in qdev_reset_one (dev=dev@entry=0x7f9657d05230, opaque=opaque@entry=0x0) at hw/core/qdev.c:227
#8 0x00007f9654d2e660 in qdev_walk_children (dev=dev@entry=0x7f9657d05230, devfn=devfn@entry=0x7f9654d2ed60 <qdev_reset_one>, busfn=busfn@entry=0x7f9654d2d580 <qbus_reset_one>, opaque=opaque@entry=0x0) at hw/core/qdev.c:370
#9 0x00007f9654d2e6c5 in qdev_reset_all (dev=dev@entry=0x7f9657d05230) at hw/core/qdev.c:243
#10 0x00007f9654d64479 in pci_device_reset (dev=0x7f9657d05230) at hw/pci/pci.c:180
#11 0x00007f9654d64602 in pci_bus_reset (bus=0x7f9657cd3420) at hw/pci/pci.c:226
#12 0x00007f9654d64629 in pcibus_reset (qbus=<optimized out>) at hw/pci/pci.c:233
#13 0x00007f9654d2e6f0 in qbus_walk_children (bus=bus@entry=0x7f9657cd3420, devfn=devfn@entry=0x7f9654d2ed60 <qdev_reset_one>, busfn=busfn@entry=0x7f9654d2d580 <qbus_reset_one>, opaque=opaque@entry=0x0) at hw/core/qdev.c:347
#14 0x00007f9654d2e68a in qdev_walk_children (dev=<optimized out>, devfn=devfn@entry=0x7f9654d2ed60 <qdev_reset_one>, busfn=busfn@entry=0x7f9654d2d580 <qbus_reset_one>, opaque=opaque@entry=0x0) at hw/core/qdev.c:377
#15 0x00007f9654d2e71a in qbus_walk_children (bus=<optimized out>, devfn=0x7f9654d2ed60 <qdev_reset_one>, busfn=0x7f9654d2d580 <qbus_reset_one>, opaque=0x0) at hw/core/qdev.c:354
#16 0x00007f9654e0114d in qemu_devices_reset () at vl.c:1811
#17 qemu_system_reset (report=report@entry=true) at vl.c:1820
#18 0x00007f9654cbf184 in main_loop_should_exit () at vl.c:1954
#19 main_loop () at vl.c:1992
#20 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4360
Aborted

This isn't a regression. QE tested four qemu-kvm versions, and all of them hit this issue:
1. qemu-kvm-1.5.3-19.el7.x86_64
2. qemu-kvm-1.5.3-42.el7.x86_64
3. qemu-kvm-rhev-1.5.3-60.el7ev.x86_64
4. qemu-kvm-1.5.3-60.el7_0.2.x86_64

It sounds like this is vnc related, not spice! According to the description:
1. rhel6.0.0 & -vga qxl & vnc -> FAIL
2. rhel6.0.0 & -vga qxl & spice -> works

Some background info:
The qxl device has a number of hardware revisions. lspci shows it in the guest:
[root@rhel6 ~]# lspci -s2
00:02.0 VGA compatible controller: Red Hat, Inc. Device 0100 (rev 04)
                                                              ^^^^^^

    rev 01: RHEL-6.0, RHEL-5
    rev 02: RHEL-6.1+2
    rev 03: RHEL-6.3
    rev 04: RHEL-6.4+, RHEL-7.0

The jump from rev 01 to rev 02 is a pretty big one: a lot of new functionality was added (including the concept of surfaces). Revs 03 and 04 are minor refinements.
Recent guest drivers simply don't run on rev 01 hardware any more, as they depend on surface support being present. That is certainly the case for the qxl kms driver (i.e. rhel-7 guests). I think it is also true for recent versions of the userspace xorg qxl driver. Not sure about the Windows drivers.
Rule of thumb: If the guest runs fine on RHEL-5 qxl, it should also run fine with the rhel6.0.0 machine type.
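
For triage, the revision table above can be turned into a small lookup. This is only a sketch: the helper names `parse_lspci_revision` and `has_surface_support` are hypothetical, the release strings are copied verbatim from the table, and the "surfaces need rev 02 or later" rule is taken from the comment above rather than from the qxl source. It also assumes lspci prints the revision as a hex number.

```python
import re

# Revision-to-release table, copied verbatim from the comment above.
QXL_REVISIONS = {
    1: "RHEL-6.0, RHEL-5",
    2: "RHEL-6.1+2",
    3: "RHEL-6.3",
    4: "RHEL-6.4+, RHEL-7.0",
}

# Matches the trailing "(rev NN)" field of an lspci line.
_REV_RE = re.compile(r"\(rev ([0-9a-fA-F]+)\)")


def parse_lspci_revision(line):
    """Return the PCI revision from one lspci output line, or None if absent."""
    match = _REV_RE.search(line)
    if match is None:
        return None
    # Assumption: lspci prints the revision in hexadecimal.
    return int(match.group(1), 16)


def has_surface_support(revision):
    """Per the comment above, surfaces were introduced with rev 02."""
    return revision >= 2


if __name__ == "__main__":
    line = "00:02.0 VGA compatible controller: Red Hat, Inc. Device 0100 (rev 04)"
    rev = parse_lspci_revision(line)
    print(rev, QXL_REVISIONS.get(rev), has_surface_support(rev))
```

A guest driver that requires surfaces would be expected to fail on any device for which `has_surface_support` returns False, which matches the rhel6.0.0 machine type (rev 01).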
Of course the guest should not be able to crash qemu, even in case it doesn't support rev 01 qxl devices. Initial report looks serious. Comment #5 crash looks like the guest doing something invalid and triggering a spice assert. Not exactly nice, but less critical than a segfault and the rhel7 guest will not run anyway ...
(In reply to David Blechter from comment #7)
> It sounds like vnc related, not spice! according to the description:
> 1.rhel6.0.0 & -vga qxl & vnc -FAIL
> 2 rhel6.0.0 & -vga qxl & spice-> works

Stack trace points to qxl. Most likely something in the local renderer. Probably you can also crash qemu by asking for a screen dump via monitor (when using spice instead of vnc).

(gdb) print *rect
$2 = {top = 768, left = 809, bottom = 801, right = 863}
That pretty much looks like an invalid dirty rectangle.
xorg uses 1024x768 mode by default, so top = 768 already sits at the lower edge of the screen and bottom = 801 is off limits.
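
The bounds reasoning above can be written out as a short check. This is an illustrative sketch only, not the check qemu or spice-server performs: the `Rect` class and `rect_within_surface` are hypothetical names, and it assumes `bottom` and `right` are exclusive edges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Field names mirror the gdb output above."""
    top: int
    left: int
    bottom: int
    right: int


def rect_within_surface(rect, width, height):
    """True if rect is non-empty and lies fully inside a width x height surface.

    Assumption: bottom and right are exclusive, so bottom == height is still valid.
    """
    horizontal_ok = 0 <= rect.left < rect.right <= width
    vertical_ok = 0 <= rect.top < rect.bottom <= height
    return horizontal_ok and vertical_ok


if __name__ == "__main__":
    # Values from the gdb session above, checked against the 1024x768 default mode.
    dirty = Rect(top=768, left=809, bottom=801, right=863)
    print(rect_within_surface(dirty, width=1024, height=768))
```

With the values from gdb, the horizontal range passes but the vertical range fails, because `bottom` (801) is larger than the 768-line height.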
IIRC we had that recently. Bug doesn't reproduce on latest upstream.
Can't spot a patch on a quick "git log" scan though.
Alon, do you remember?
I don't remember a patch. I do remember seeing this before, not that it helps.

> Question:
> For windows guest. Fail to load qxl driver when booting guest with machine
> type rhel6.0.0. Are they the same issue? If not, Do QE need to file another
> bug to track it?

I also hit this: the qxl driver fails to load correctly in a Windows guest when using the rhel6.0.0 machine type, on a windows_7_ultimate_with_sp1_x86 guest with virtio-win-1.7.0-1.el7.noarch (qxl driver). Do we need to file a different bug to track it?

Best Regards,
sluo
Re-tested this bug with a rhel7.0 guest.
1. Boot rhel7.0 guest with -M rhel6.0.0 & qxl & spice & runlevel5 -> guest black screen (but guest boots successfully via console)
2. Boot rhel7.0 guest with -M rhel6.0.0 & vnc & qxl & runlevel5 -> guest black screen (but guest boots successfully via console)
3. If the guest is switched to runlevel 3, rhel7.0 works well.

Fu Xiang Chun,

While you're at it can you test with various machine types in upstream qemu and report if you get the same results?

Alon

(In reply to Sibiao Luo from comment #12)
> > Question:
> > For windows guest. Fail to load qxl driver when booting guest with machine
> > type rhel6.0.0. Are they the same issue? If not, Do QE need to file another
> > bug to track it?
>
> I also hit it that the qxl driver fail to load correctly in windows guest if
> use rhel6.0.0 machine type on the windows_7_ultimate_with_sp1_x86 guest with
> virtio-win-1.7.0-1.el7.noarch (qxl driver). Do we need to file different bug
> to trace it ?

No need, since the component is either qemu or spice (probably - if we find this is not the case we can create new bugs).

>
> Best Regards,
> sluo

(In reply to Alon Levy from comment #14)
> Fu Xiang Chun,
>
> While you're at it can you test with various machine types in upstream qemu
> and report if you get the same results?
>
> Alon

According to comment13, re-tested with qemu-kvm-rhev-2.0.0-1.el7ev.x86_64 and a rhel7.0 guest.

Result:
1. Boot rhel7.0 guest with -M rhel6.0.0 & qxl & spice & runlevel5 -> guest black screen (but guest boots successfully via console)
2. Boot rhel7.0 guest with -M rhel6.0.0 & vnc & qxl & runlevel5 -> guest black screen (but guest boots successfully via console)
3. For machine types rhel6.1.0~rhel6.5.0, pc-i440fx-rhel7.0.0 and q35, the guest boots successfully (no black screen)

Alon,
For the rhel7.0 guest, is this a new issue? Does QE need to file a new bug?

(In reply to FuXiangChun from comment #16)
> (In reply to Alon Levy from comment #14)
> > Fu Xiang Chun,
> >
> > While you're at it can you test with various machine types in upstream qemu
> > and report if you get the same results?
> >
> > Alon
>
> According to comment13, Re-tested it with qemu-kvm-rhev-2.0.0-1.el7ev.x86_64
> for rhel7.0 guest.
>
> result:
> 1. Boot rhel7.0 guest with -M rhel6.0.0 & qxl & spice & runlevel5 ->guest
> black screen(but guest boot successfully via console)
>
> 2. Boot rhel7.0 guest with -M rhel6.0.0 & vnc & qxl & runlevel5 ->guest
> black screen(but guest boot successfully via console)
>
> 3. For Machine type rhel6.1.0~rhel6.5.0 & pc-i440fx-rhel7.0.0 & q35
> guest boot successfully(no black screen)
>
> Alon,
> For rhel7.0 guest, Is this a new issue? Do QE need to file a new bug?

I meant upstream, i.e. non RHEL, so no "-M rhel6.0.0", but there are many other machine configurations:

qemu-system-x86_64 -M ?
Supported machines are:
none                 empty machine
pc                   Standard PC (i440FX + PIIX, 1996) (alias of pc-i440fx-1.6)
pc-i440fx-1.6        Standard PC (i440FX + PIIX, 1996) (default)
pc-i440fx-1.5        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-1.4        Standard PC (i440FX + PIIX, 1996)
pc-1.3               Standard PC
pc-1.2               Standard PC
pc-1.1               Standard PC
pc-1.0               Standard PC
pc-0.15              Standard PC
pc-0.14              Standard PC
pc-0.13              Standard PC
pc-0.12              Standard PC
pc-0.11              Standard PC, qemu 0.11
pc-0.10              Standard PC, qemu 0.10
isapc                ISA-only PC
q35                  Standard PC (Q35 + ICH9, 2009) (alias of pc-q35-1.6)
pc-q35-1.6           Standard PC (Q35 + ICH9, 2009)
pc-q35-1.5           Standard PC (Q35 + ICH9, 2009)
pc-q35-1.4           Standard PC (Q35 + ICH9, 2009)

But since this is probably just qxl related, it should be simpler than testing all of the above.

wrt the rhel 7 guest, again I think it's the same issue, but we will see once it is tested (unless the bug is closed).

Description of problem:
Boot a RHEL6.5 guest with machine type rhel6.0.0 & -vga qxl & vnc. qemu-kvm dumps core when the guest loads the GUI (runlevel 5).

Notes:
1. The bug can only be reproduced in the following scenario: boot a RHEL6.5 guest with machine type rhel6.0.0 & -vga qxl & vnc & guest runlevel 5.
2. I tested three qemu-kvm versions and all hit this issue, so it isn't a regression: qemu-kvm-1.5.3-42.el7.x86_64, qemu-kvm-rhev-1.5.3-60.el7ev.x86_64 and qemu-kvm-1.5.3-60.el7_0.2.x86_64.

Version-Release number of selected component (if applicable):
3.10.0-123.el7.x86_64
qemu-kvm-1.5.3-60.el7_0.2.x86_64

How reproducible:
100%

Steps to Reproduce:
1. /usr/libexec/qemu-kvm -M rhel6.0.0 -cpu SandyBridge -enable-kvm -m 4096 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1,maxcpus=160 -monitor stdio -name test-all-qemu-kvm -drive file=/home/linux-guest/RHEL-Server-6.5-64-virtio.qcow2,if=none,id=drive-scsi-disk,format=qcow2,cache=none,werror=stop,rerror=stop -device virtio-scsi-pci,id=scsi0,addr=0x13 -device scsi-hd,drive=drive-scsi-disk,bus=scsi0.0,scsi-id=0,lun=0,id=data-disk2 -vga qxl -vnc :2
2.
3.

Actual results:
(gdb) bt
#0 0x00007ffff2daca5b in __memcpy_ssse3_back () from /lib64/libc.so.6
#1 0x000055555569064e in memcpy (__len=16, __src=0x7ffed8312794, __dest=<optimized out>) at /usr/include/bits/string3.h:51
#2 qxl_blit (rect=0x5555566bd8b0, qxl=0x5555566abf50) at hw/display/qxl-render.c:51
#3 qxl_render_update_area_unlocked (qxl=qxl@entry=0x5555566abf50) at hw/display/qxl-render.c:140
#4 0x0000555555690940 in qxl_render_update_area_bh (opaque=0x5555566abf50) at hw/display/qxl-render.c:182
#5 0x00005555556074b7 in aio_bh_poll (ctx=ctx@entry=0x5555564e4e00) at async.c:81
#6 0x0000555555607108 in aio_poll (ctx=0x5555564e4e00, blocking=blocking@entry=false) at aio-posix.c:185
#7 0x00005555556073c0 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at async.c:194
#8 0x00007ffff74edac6 in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#9 0x00005555556e173a in glib_pollfds_poll () at main-loop.c:187
#10 os_host_main_loop_wait (timeout=<optimized out>) at main-loop.c:232
#11 main_loop_wait (nonblocking=<optimized out>) at main-loop.c:464
#12 0x0000555555602e30 in main_loop () at vl.c:1988
#13 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4360
(gdb)

Expected results:

Additional info:
1. Machine type rhel6.1.0 ~ rhel6.5.0 -> works
2. runlevel 3 -> works
3. rhel6.0.0 & -vga qxl & spice -> works

Question:
For a Windows guest, the qxl driver fails to load when booting the guest with machine type rhel6.0.0. Is this the same issue? If not, does QE need to file another bug to track it?
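
The machine type and display combinations listed under "Additional info" lend themselves to a generated test matrix. The sketch below only builds command lines and does not launch anything. Several parts are assumptions rather than content of this report: the helper names `build_command` and `build_matrix`, the intermediate machine types rhel6.2.0 to rhel6.4.0 (the report only gives the range rhel6.1.0 ~ rhel6.5.0), the `-spice port=5930,disable-ticketing` arguments (the report does not show the spice command line used), and the trimmed-down drive options.

```python
from itertools import product

QEMU_BINARY = "/usr/libexec/qemu-kvm"  # path taken from the reproducer

# rhel6.0.0 is the failing type; the intermediate names are inferred from the stated range.
MACHINE_TYPES = [
    "rhel6.0.0", "rhel6.1.0", "rhel6.2.0",
    "rhel6.3.0", "rhel6.4.0", "rhel6.5.0",
]

DISPLAYS = {
    "vnc": ["-vnc", ":2"],  # from the reproducer
    # Assumption: the spice options are not given in the report.
    "spice": ["-spice", "port=5930,disable-ticketing"],
}


def build_command(machine, display, image):
    """Return a simplified argv list for one machine type and display backend."""
    return [
        QEMU_BINARY,
        "-M", machine,
        "-enable-kvm",
        "-m", "4096",
        "-monitor", "stdio",
        "-drive", f"file={image},if=none,id=drive-scsi-disk,format=qcow2",
        "-device", "virtio-scsi-pci,id=scsi0",
        "-device", "scsi-hd,drive=drive-scsi-disk,bus=scsi0.0",
        "-vga", "qxl",
        *DISPLAYS[display],
    ]


def build_matrix(image):
    """Map every (machine type, display) pair to its command line."""
    return {
        (machine, display): build_command(machine, display, image)
        for machine, display in product(MACHINE_TYPES, DISPLAYS)
    }


if __name__ == "__main__":
    for (machine, display), argv in build_matrix("/path/to/guest.qcow2").items():
        print(f"{machine:10} {display:6} {' '.join(argv)}")
```

Keeping the commands as argument lists rather than shell strings avoids quoting problems if they are later passed to `subprocess.run`.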