Description of problem:
libvirtd hangs when an attempt is made to set the number of virtual CPUs for a shut-off KVM guest.

Version-Release number of selected component (if applicable):
libvirt-0.6.0-3.fc10.x86_64

How reproducible:
Always

Steps to Reproduce:
1. testguest is a previously installed KVM guest domain. It is not currently running.
2. virsh setvcpus testguest 1

Actual results:
virsh hangs. I can kill it with CTRL+C. If I then run "virsh" again, it hangs too. "/etc/init.d/libvirtd restart" brings it back to life.

Expected results:
The number of VCPUs for testguest should be set to the desired value, and the operation should take no longer than a fraction of a second.

Additional info:
The virsh setmem and setmaxmem operations work fine; they do not make libvirtd hang.
Confirmed, libvirtd stack trace looks like:

#0  0x00774416 in __kernel_vsyscall ()
#1  0x002dad99 in __lll_lock_wait () from /lib/libpthread.so.0
#2  0x002d6149 in _L_lock_89 () from /lib/libpthread.so.0
#3  0x002d5a52 in pthread_mutex_lock () from /lib/libpthread.so.0
#4  0x04a12bfd in virMutexLock (m=0x85087a8) at threads-pthread.c:51
#5  0x04a28fdd in virDomainObjLock (obj=0x85087a8) at domain_conf.c:3731
#6  0x04a2b065 in virDomainFindByUUID (doms=0x85042f8, uuid=0x8506a64 "�\005�\024�E��B�Z>\234�h��") at domain_conf.c:192
#7  0x08070b12 in qemudDomainGetMaxVcpus (dom=0x8506a50) at qemu_driver.c:2697
#8  0x08070cd6 in qemudDomainSetVcpus (dom=0x8506a50, nvcpus=2) at qemu_driver.c:2527
#9  0x04a22155 in virDomainSetVcpus (domain=0x8506a50, nvcpus=2) at libvirt.c:3981
#10 0x0805ce62 in remoteDispatchDomainSetVcpus (server=0x84fb5a0, client=0x8519980, conn=0x8506d08, rerr=0xb6b70218, args=0xb6b702d0, ret=0xb6b70290) at remote.c:1984
#11 0x08060664 in remoteDispatchClientRequest (server=0x84fb5a0, client=0x8519980, msg=0x855abe8) at remote.c:322
#12 0x0805537f in qemudWorker (data=0x8506a90) at qemud.c:1406
#13 0x002d451f in start_thread () from /lib/libpthread.so.0
#14 0x0020a04e in clone () from /lib/libc.so.6
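Yep, stupid recursive call - one public API entry point (qemudDomainSetVcpus) calls into another (qemudDomainGetMaxVcpus) while still holding the domain object lock, so the second call tries to re-acquire the lock its caller already holds and blocks forever. Some refactoring needed here.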
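For anyone unfamiliar with the pattern, here is a minimal sketch of the bug and of the usual refactoring. It is not the libvirt code or the attached patch: every demo_* name is hypothetical, and the mutex is deliberately created as PTHREAD_MUTEX_ERRORCHECK so that the second lock attempt returns an error instead of hanging the way the daemon does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a domain object; not libvirt's real struct. */
typedef struct {
    pthread_mutex_t lock;
    int max_vcpus;
    int vcpus;
} demo_domain;

/* Assumption for the demo only: an error-checking mutex, so a re-lock by
 * the owning thread fails with an error instead of blocking forever. */
int demo_domain_init(demo_domain *d, int max_vcpus)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(d != NULL);
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    rc = pthread_mutex_init(&d->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    d->max_vcpus = max_vcpus;
    d->vcpus = 1;
    return rc;
}

/* Internal helper: the caller must already hold d->lock. */
static int demo_max_vcpus_locked(const demo_domain *d)
{
    return d->max_vcpus;
}

/* Public entry point: takes the lock itself. */
int demo_get_max_vcpus(demo_domain *d)
{
    int max;

    if (pthread_mutex_lock(&d->lock) != 0)
        return -1;              /* lock already held by this thread */
    max = demo_max_vcpus_locked(d);
    pthread_mutex_unlock(&d->lock);
    return max;
}

/* Buggy shape: a public entry point calls another public entry point
 * while holding the lock, so the inner call cannot acquire it. */
int demo_set_vcpus_buggy(demo_domain *d, int n)
{
    int max, rc = -1;

    if (pthread_mutex_lock(&d->lock) != 0)
        return -1;
    max = demo_get_max_vcpus(d);    /* tries to lock d->lock again */
    if (max >= 0 && n <= max) {
        d->vcpus = n;
        rc = 0;
    }
    pthread_mutex_unlock(&d->lock);
    return rc;
}

/* Refactored shape: lock once, then use the lock-free internal helper. */
int demo_set_vcpus_fixed(demo_domain *d, int n)
{
    int rc = -1;

    if (pthread_mutex_lock(&d->lock) != 0)
        return -1;
    if (n <= demo_max_vcpus_locked(d)) {
        d->vcpus = n;
        rc = 0;
    }
    pthread_mutex_unlock(&d->lock);
    return rc;
}
```

With a default (non-error-checking) mutex, the inner lock attempt in demo_set_vcpus_buggy would typically block forever, which matches the hang in the trace above. The fixed variant only ever takes the lock at the public entry point.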
Created attachment 334628
Fix recursive locking
Posted upstream http://www.redhat.com/archives/libvir-list/2009-March/msg00195.html
*** Bug 489779 has been marked as a duplicate of this bug. ***
Fixed in rawhide with:

* Tue Mar 17 2009 Daniel P. Berrange <berrange> - 0.6.1-4.fc11
- Avoid deadlock in setting vCPU count