Red Hat Bugzilla – Full Text Bug Listing
|Summary:||[qemu-kvm] - qxl runs i/o requests synchronously|
|Product:||Red Hat Enterprise Linux 6||Reporter:||Ortal <ogvura>|
|Component:||qemu-kvm||Assignee:||Gerd Hoffmann <kraxel>|
|Status:||CLOSED ERRATA||QA Contact:||Virtualization Bugs <virt-bugs>|
|Version:||6.1||CC:||alevy, bcao, chayang, dallan, desktop-qa-list, djasa, juzhang, kraxel, marcandre.lureau, mkenneth, mkrcmari, tburke, virt-maint|
|Fixed In Version:||qemu-kvm-0.12.1.2-2.179.el6||Doc Type:||Bug Fix|
Cause: qxl submits requests to the spice server thread and waits synchronously for the result. Consequence: qemu is blocked for a potentially long time. Fix: Use completion notification instead of waiting. Result: The spice server thread processes the request asynchronously without stalling qemu.
|Last Closed:||2011-12-06 10:41:03 EST||Type:||---|
|oVirt Team:||---||RHEL 7.3 requirements from Atomic Host:|
|Bug Depends On:||565939|
Description Ortal 2011-04-27 10:50:35 EDT
Created attachment 495262 [details] libvirt.log

Description of problem:
LIBVIRT [0.8.7-18.el6.x86_64] - VMs become Not Responding when opening/closing a few Spice-client consoles to different VMs. [Can see that the DomainGetXMLDesc command got stuck.]

Version-Release number of selected component (if applicable):
Libvirt version: 0.8.7-18.el6.x86_64
VDSM Version:

How reproducible:
Always

Steps to Reproduce:
1. Start a few VMs
2. Open a Spice console for each VM
3. Attempt to close the VMs' Spice console monitors

Actual results:
1. VM states change to Not Responding:
[root@iron-vdsb ~]# vdsClient -s 0 list table
001a6b25-8dfc-4dd2-90ab-20d35a0f0495  5911  VM1_Win2k3  Up*  10.35.70.6
7c89899e-8df1-4b04-8af5-d92611f366bb  6489  Win2k3-4    Up*
62c0edd4-691f-44c6-8906-382ec36650ba  6249  Win2k3-3    Up*
d5305078-0df6-4400-a137-fa8c39460514  6060  Win-XP      Up*  10.35.70.7

2. libvirt.log:
17:31:14.761: 5935: debug : virDomainGetBlockInfo:4895 : dom=0x7f4dfc006900, (VM: name=VM1_Win2k3, uuid=001a6b25-8dfc-4dd2-90ab-20d35a0f0495), info=0x7f4de61f9fb0 flags=0
17:31:14.761: 5935: debug : remoteIO:10405 : Do proc=194 serial=1082 length=244 wait=0x7f4dfc423260
17:31:14.761: 5935: debug : remoteIO:10434 : Going to sleep 194 0x7f4dfc423260 0x7f4dfc4a33e0
17:31:14.761: 6093: debug : remoteIOEventLoop:10266 : Woken up from poll by other thread
17:31:14.821: 6514: debug : virDomainBlockStats:4506 : dom=0x7f4dfc267db0, (VM: name=Win2k3-4, uuid=7c89899e-8df1-4b04-8af5-d92611f366bb), path=hda, stats=0x7f4df4df7ae0, size=40
17:31:20.517: 7634: debug : virDomainGetXMLDesc:3256 : dom=0x7f4de8007d10, (VM: name=Win-XP, uuid=d5305078-0df6-4400-a137-fa8c39460514), flags=0
17:31:24.691: 7702: debug : virDomainGetXMLDesc:3256 : dom=0x7f4df0001040, (VM: name=Win2k3-3, uuid=62c0edd4-691f-44c6-8906-382ec36650ba), flags=0
17:34:21.082: 10417: debug : virDomainGetXMLDesc:3256 : dom=0x7f4dfc267db0, (VM: name=Win2k3-4, uuid=7c89899e-8df1-4b04-8af5-d92611f366bb), flags=0
17:34:25.193: 10482: debug : virDomainGetXMLDesc:3256 : dom=0x7f4de8007d10, (VM: name=Win-XP, uuid=d5305078-0df6-4400-a137-fa8c39460514), flags=0
(END)

Expected results:
Should be able to open/close the Spice console many times with no problem

Additional info:
** Please see the attached libvirt.log
Comment 4 Michal Privoznik 2011-05-03 10:45:32 EDT
Could you please attach the libvirt client logs? The libvirtd logs seem fine.
Comment 6 Ortal 2011-05-29 07:35:59 EDT
It's been a while since this bug; sorry, but I can't find the relevant logs on that host (it no longer exists).
Comment 7 Marc-Andre Lureau 2011-06-05 14:23:25 EDT
Ortal, I suppose the console didn't have a spice-gtk display, or did it? I can easily reproduce it with spice-gtk (integrated in virt-manager for f15). It seems to happen more often with less optimized QXL drivers in the guest, such as the one in RHEL 6.1. All I have to do is start a rhel61 VM, open the console, and do a lot of display operations (move a terminal window around quickly while running a "find /", for example). It always hangs at virDomainGetXMLDesc(). The end of the client libvirt log looks like:
19:56:23.082: 12918: debug : virDomainGetXMLDesc:3040 : dom=0x34ff050, (VM: name=rhel6.1, uuid=e60f906e-ac52-d3a8-dfd4-597fbd0847a2), flags=1
19:56:23.082: 12918: debug : remoteIO:10413 : Do proc=14 serial=616 length=64 wait=(nil)
19:56:23.082: 12918: debug : remoteIO:10485 : We have the buck 14 0x5362750 0x5362750
Comment 8 Marc-Andre Lureau 2011-06-16 13:19:37 EDT
I tracked this down a bit and found that the bug is in the spice server code. We discussed it with Alon and found that flush_display_commands() is buggy: it waits for the client to read all commands; however, spice-gtk, being asynchronous and sharing the same loop, didn't read them all, and virt-manager blocks on a query, resulting in a deadlock. We don't know how best to solve it, whether by cleaning up the function, releasing the lock, adding more locks, etc. Gerd's help would be welcome.

Thread 25 (Thread 0x7f8a62e4b700 (LWP 12200)):
#0  0x0000003b3280e57d in read () from /lib64/libpthread.so.0
#1  0x0000003b34c1271a in read (__nbytes=4, __buf=0x7f8a62e4a9ec, __fd=14) at /usr/include/bits/unistd.h:45
#2  receive_data (n=4, in_buf=0x7f8a62e4a9ec, fd=14) at red_worker.h:131
#3  read_message (message=0x7f8a62e4a9ec, fd=14) at red_worker.h:144
#4  qxl_worker_update_area (qxl_worker=<optimized out>, surface_id=938, qxl_area=<optimized out>, qxl_dirty_rects=0x0, num_dirty_rects=0, clear_dirty_region=0) at red_dispatcher.c:227
#5  0x00000000005db1a4 in ioport_write (opaque=0x33b4a10, addr=<optimized out>, val=0) at /usr/src/debug/qemu-kvm-0.14.0/hw/qxl.c:991
#6  0x000000000042f4df in kvm_handle_io (count=1, size=1, direction=1, data=<optimized out>, port=49218) at /usr/src/debug/qemu-kvm-0.14.0/kvm-all.c:818
#7  kvm_run (env=0x2f88940) at /usr/src/debug/qemu-kvm-0.14.0/qemu-kvm.c:617
#8  0x0000000000430a39 in kvm_cpu_exec (env=<optimized out>) at /usr/src/debug/qemu-kvm-0.14.0/qemu-kvm.c:1233
#9  0x0000000000430cc1 in kvm_main_loop_cpu (env=0x2f88940) at /usr/src/debug/qemu-kvm-0.14.0/qemu-kvm.c:1419
#10 ap_main_loop (_env=0x2f88940) at /usr/src/debug/qemu-kvm-0.14.0/qemu-kvm.c:1466
#11 0x0000003b32807cd1 in start_thread () from /lib64/libpthread.so.0
#12 0x0000003b324e19bd in clone () from /lib64/libc.so.6

Thread 24 (Thread 0x7f8a613e8700 (LWP 12201)):
#0  0x0000003b324adf5d in nanosleep () from /lib64/libc.so.6
#1  0x0000003b324db5b4 in usleep () from /lib64/libc.so.6
#2  0x0000003b34c2d68a in flush_display_commands (worker=0x7f8a61213890) at red_worker.c:8787
#3  0x0000003b34c2f232 in handle_dev_update (worker=0x7f8a61213890) at red_worker.c:9723
#4  handle_dev_input (listener=0x7f8a61213890, events=<optimized out>) at red_worker.c:9936
#5  0x0000003b34c30385 in red_worker_main (arg=<optimized out>) at red_worker.c:10286
#6  0x0000003b32807cd1 in start_thread () from /lib64/libpthread.so.0
#7  0x0000003b324e19bd in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7f8a630599c0 (LWP 12195)):
#0  0x0000003b3280e29d in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x0000003b32809ec7 in _L_lock_862 () from /lib64/libpthread.so.0
#2  0x0000003b32809d1b in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000431758 in kvm_mutex_lock () at /usr/src/debug/qemu-kvm-0.14.0/qemu-kvm.c:1730
#4  qemu_mutex_lock_iothread () at /usr/src/debug/qemu-kvm-0.14.0/qemu-kvm.c:1744
#5  0x000000000041e569 in main_loop_wait (nonblocking=<optimized out>) at /usr/src/debug/qemu-kvm-0.14.0/vl.c:1412
#6  0x0000000000431547 in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.14.0/qemu-kvm.c:1589
#7  0x000000000040b0a5 in main_loop () at /usr/src/debug/qemu-kvm-0.14.0/vl.c:1464
#8  main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at /usr/src/debug/qemu-kvm-0.14.0/vl.c:3236
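The backtrace above shows the problematic shape: the vcpu thread (Thread 25) writes a request to the spice worker thread and then blocks in read() waiting for the reply, while still effectively holding the global qemu mutex that Thread 1 needs. The following is a minimal pthreads sketch of that synchronous dispatch pattern, not the actual qemu/spice code; all names (sync_dispatch, worker_main, and the pipe layout) are illustrative stand-ins for qxl_worker_update_area()/read_message().

```c
/* Sketch of synchronous request dispatch over pipes, as described in
 * this comment: the caller writes a request, then blocks in read()
 * until the worker thread replies.  All names are hypothetical. */
#include <pthread.h>
#include <unistd.h>

static int req_fd[2];   /* caller -> worker */
static int rep_fd[2];   /* worker -> caller */

static void *worker_main(void *arg)
{
    int msg;
    (void)arg;
    /* worker loop: read one request, "process" it, send the reply */
    while (read(req_fd[0], &msg, sizeof(msg)) == sizeof(msg)) {
        int reply = msg + 1;                     /* stand-in for real work */
        write(rep_fd[1], &reply, sizeof(reply));
    }
    return NULL;
}

/* Synchronous call: blocks the calling thread until the worker
 * answers.  If the worker itself waits on progress that only the
 * blocked thread's event loop can drive (as flush_display_commands()
 * does with a slow client), this turns into a deadlock. */
int sync_dispatch(int msg)
{
    int reply;
    write(req_fd[1], &msg, sizeof(msg));
    read(rep_fd[0], &reply, sizeof(reply));      /* <-- blocking wait */
    return reply;
}

int start_worker(void)
{
    pthread_t tid;
    if (pipe(req_fd) != 0 || pipe(rep_fd) != 0)
        return -1;
    return pthread_create(&tid, NULL, worker_main, NULL);
}
```

The design flaw is not the pipe itself but the unconditional blocking read while other work (here, the whole VM) is stalled behind the caller.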
Comment 9 Gerd Hoffmann 2011-06-17 03:47:11 EDT
Reassigning to qemu-kvm. The fundamental issue is that the qxl device runs certain long-running I/O operations synchronously. qemu-kvm needs a fix, and maybe spice-server as well (depending on how we fix it on the host side). The guest drivers will need fixes too, as we'll have to extend the device interface.
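The eventual fix (see the technical note in comment 30) replaces the blocking wait with completion notification. As a rough illustration of that pattern, here is a hedged pthreads sketch, not the real qemu/spice API; async_dispatch, completion_cb, and run_async_demo are invented names. The key property is that submitting the request returns immediately, so the caller is never stalled while the worker runs.

```c
/* Sketch of the completion-notification pattern: the caller hands the
 * request plus a callback to the worker thread and returns at once;
 * the worker invokes the callback when it finishes.  All names here
 * are illustrative, not the actual qemu/spice interface. */
#include <pthread.h>
#include <stdbool.h>

typedef void (*completion_cb)(int result, void *opaque);

typedef struct {
    int           msg;
    completion_cb cb;
    void         *opaque;
} async_req;

static void *async_worker(void *arg)
{
    async_req *req = arg;
    int result = req->msg + 1;        /* stand-in for long-running work */
    req->cb(result, req->opaque);     /* notify completion */
    return NULL;
}

/* Non-blocking submit: returns immediately, so the calling thread can
 * keep servicing its main loop instead of stalling. */
int async_dispatch(async_req *req)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, async_worker, req);
    if (rc == 0)
        pthread_detach(tid);
    return rc;
}

/* Demo driver: submit a request, then collect the result via the
 * completion callback (a condition variable stands in for whatever
 * event mechanism the real code would use). */
static int demo_result;
static bool demo_done = false;
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  demo_cond = PTHREAD_COND_INITIALIZER;

static void demo_cb(int result, void *opaque)
{
    (void)opaque;
    pthread_mutex_lock(&demo_lock);
    demo_result = result;
    demo_done = true;
    pthread_cond_signal(&demo_cond);
    pthread_mutex_unlock(&demo_lock);
}

int run_async_demo(int msg)
{
    async_req req = { msg, demo_cb, NULL };
    if (async_dispatch(&req) != 0)
        return -1;
    /* ... the caller is free to keep running its event loop here ... */
    pthread_mutex_lock(&demo_lock);
    while (!demo_done)
        pthread_cond_wait(&demo_cond, &demo_lock);
    pthread_mutex_unlock(&demo_lock);
    return demo_result;
}
```

In the real fix the "callback" side ultimately has to reach back into the guest as well, which is why comment 9 notes that the guest/host device interface needs extending.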
Comment 10 Gerd Hoffmann 2011-07-14 10:49:48 EDT
Extending the conditional NAK to upstream. Fixing this needs a new guest/host interface; we can't commit to that without the bits being upstream first.
Comment 19 Alon Levy 2011-08-24 10:21:10 EDT
Comment 23 Mike Cao 2011-10-19 01:51:46 EDT
Tried on qemu-kvm-0.12.1.2-2.172.el6; can *not* reproduce.

Package versions:
RHEV Manager for Servers and Desktops: 3.0.0_0001-40.el6
qemu-kvm-0.12.1.2-2.172.el6.x86_64
libvirt-0.9.4-7.2.el6.x86_64
vdsm-4.9-104.el6.x86_64
spice-server-0.8.2-2.el6.x86_64

Steps:
1. Use RHEVM to install a RHEL6u1 guest with spice
2. Make it into a template and create 5 new guests from the template
3. Start the 5 guests with spice
4. Find a win08R2 host to operate RHEVM; open the guest spice consoles
5. Right-click the task bar, click "Close All Windows", then close them

Actual Results:
After step 4, the status of the guests shows "up". Can not reproduce this issue.

Hi, Ortal. Have I missed something in the steps above? Could you tell me how to reproduce this issue? Thanks, Mike
Comment 24 Ortal 2011-10-23 04:42:47 EDT
Hi Mike, those are the exact steps. Try to open more than 5 spice consoles; maybe in that case it will reproduce for you. Thanks, Ortal.
Comment 25 Mike Cao 2011-10-24 23:31:36 EDT
Retried this with 12 VMs according to comment #23 and comment #24; still can not reproduce this issue. When trying to reproduce this one, I always hit bug 697441. bcao --> Gerd: Hi, Gerd. As the reporter said, closing the spice client causes the guest to go into the *unknown* status that RHEVM shows. Under which circumstances will qemu-kvm cause this? Could you provide me a way to reproduce it? TIA, Mike
Comment 26 Mike Cao 2011-10-28 04:27:53 EDT
Tried on qemu-kvm-0.12.1.2-2.200.el6.x86_64. Repeated the steps in comment #0 more than 30 times (both windows and linux guests). Actual Results: on qemu-kvm-0.12.1.2-2.200.el6 everything works fine; no error in libvirt.log, and the guest still shows "UP" both in the RHEVM GUI and in vdsClient -s 0 list table.
Comment 27 juzhang 2011-10-28 06:55:37 EDT
Hi, Gerd. According to comment 25, we failed to reproduce this issue but did reproduce bz697441; it seems bz697441 blocked us from reproducing this one. According to comment 26, with qemu-kvm-0.12.1.2-2.200.el6.x86_64 both bz697441 and this issue work well. Would you please double-check our steps? If they are right, can we change this issue's status to VERIFIED?
Comment 28 Gerd Hoffmann 2011-10-28 07:18:05 EDT
I never reproduced it myself, as I have no local RHEV-M setup. So if Ortal says in comment 24 that the steps are OK and you can't reproduce it this way, it should really be fixed -> setting it to VERIFIED is OK.
Comment 30 Gerd Hoffmann 2011-11-18 06:50:14 EST
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: Cause: qxl submits requests to the spice server thread and waits synchronously for the result. Consequence: qemu is blocked for a potentially long time. Fix: Use completion notification instead of waiting. Result: The spice server thread processes the request asynchronously without stalling qemu.
Comment 31 errata-xmlrpc 2011-12-06 10:41:03 EST
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2011-1531.html