Bug 487013

Summary: virsh setvcpus hangs with libvirt-0.6.0
Product: [Fedora] Fedora
Reporter: Michal Schmidt <mschmidt>
Component: libvirt
Assignee: Daniel Berrange <berrange>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: low
Version: 10
CC: berrange, clalance, crobinso, markmc, mmilgram, veillard, virt-maint
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-03-25 08:04:54 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Bug Depends On:    
Bug Blocks: 480594    
Attachments:
Description: Fix recursive locking
Flags: none

Description Michal Schmidt 2009-02-23 12:15:35 EST
Description of problem:
libvirtd hangs when I try to set the number of virtual CPUs for a shut-off KVM guest.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. testguest is a previously installed KVM guest domain. It is not currently running.
2. virsh setvcpus testguest 1

Actual results:
virsh hangs. I can kill it with Ctrl+C, but if I then run "virsh" again, it hangs too. Running "/etc/init.d/libvirtd restart" brings it back to life.

Expected results:
The number of VCPUs for testguest should be set to the desired value and it should not take longer than a fraction of a second.

Additional info:
The virsh setmem and setmaxmem operations work fine; they do not make libvirtd hang.
Comment 1 Mark McLoughlin 2009-02-24 03:08:33 EST
Confirmed, libvirtd stack trace looks like:

#0  0x00774416 in __kernel_vsyscall ()
#1  0x002dad99 in __lll_lock_wait () from /lib/libpthread.so.0
#2  0x002d6149 in _L_lock_89 () from /lib/libpthread.so.0
#3  0x002d5a52 in pthread_mutex_lock () from /lib/libpthread.so.0
#4  0x04a12bfd in virMutexLock (m=0x85087a8) at threads-pthread.c:51
#5  0x04a28fdd in virDomainObjLock (obj=0x85087a8) at domain_conf.c:3731
#6  0x04a2b065 in virDomainFindByUUID (doms=0x85042f8, uuid=0x8506a64 "<binary UUID bytes>") at domain_conf.c:192
#7  0x08070b12 in qemudDomainGetMaxVcpus (dom=0x8506a50) at qemu_driver.c:2697
#8  0x08070cd6 in qemudDomainSetVcpus (dom=0x8506a50, nvcpus=2) at qemu_driver.c:2527
#9  0x04a22155 in virDomainSetVcpus (domain=0x8506a50, nvcpus=2) at libvirt.c:3981
#10 0x0805ce62 in remoteDispatchDomainSetVcpus (server=0x84fb5a0, client=0x8519980, conn=0x8506d08, rerr=0xb6b70218, 
    args=0xb6b702d0, ret=0xb6b70290) at remote.c:1984
#11 0x08060664 in remoteDispatchClientRequest (server=0x84fb5a0, client=0x8519980, msg=0x855abe8) at remote.c:322
#12 0x0805537f in qemudWorker (data=0x8506a90) at qemud.c:1406
#13 0x002d451f in start_thread () from /lib/libpthread.so.0
#14 0x0020a04e in clone () from /lib/libc.so.6
Comment 2 Daniel Berrange 2009-02-24 05:37:46 EST
Yep, stupid recursive call: one public API (qemudDomainSetVcpus) is calling into another public API (qemudDomainGetMaxVcpus), causing it to try to re-acquire the domain lock it already holds. Some refactoring is needed here.
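
For readers unfamiliar with this failure mode, here is a minimal sketch of the pattern and one common way to refactor it. This is not the libvirt code or the attached patch: struct domain and every domain_* function are hypothetical names made up for illustration. It also assumes an error-checking POSIX mutex (PTHREAD_MUTEX_ERRORCHECK), so that the second lock attempt returns EDEADLK instead of hanging the way the default mutex in the trace above does.

```c
#define _XOPEN_SOURCE 700   /* for PTHREAD_MUTEX_ERRORCHECK under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a domain object guarded by a single mutex. */
struct domain {
    pthread_mutex_t lock;
    int max_vcpus;
    int vcpus;
};

/* Use an error-checking mutex so that relocking from the owning thread
 * reports EDEADLK rather than blocking forever. */
int domain_init(struct domain *d, int max_vcpus)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(d != NULL);
    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(&d->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    d->max_vcpus = max_vcpus;
    d->vcpus = 1;
    return rc;
}

/* "Public" entry point: takes the domain lock itself. */
int domain_get_max_vcpus(struct domain *d, int *out)
{
    int rc = pthread_mutex_lock(&d->lock);
    if (rc != 0)
        return rc;
    *out = d->max_vcpus;
    pthread_mutex_unlock(&d->lock);
    return 0;
}

/* Buggy shape: a public entry point calls another public entry point
 * while still holding the lock, so the inner call tries to lock again. */
int domain_set_vcpus_buggy(struct domain *d, int n)
{
    int max = 0;
    int rc = pthread_mutex_lock(&d->lock);
    if (rc != 0)
        return rc;
    rc = domain_get_max_vcpus(d, &max);   /* second lock attempt on the same mutex */
    if (rc == 0) {
        if (n < 1 || n > max)
            rc = EINVAL;
        else
            d->vcpus = n;
    }
    pthread_mutex_unlock(&d->lock);
    return rc;
}

/* Internal helper: the caller must already hold d->lock. */
static int domain_max_vcpus_locked(const struct domain *d)
{
    return d->max_vcpus;
}

/* Refactored shape: lock once, then use only lock-free internal helpers. */
int domain_set_vcpus_fixed(struct domain *d, int n)
{
    int rc = pthread_mutex_lock(&d->lock);
    if (rc != 0)
        return rc;
    if (n < 1 || n > domain_max_vcpus_locked(d))
        rc = EINVAL;
    else
        d->vcpus = n;
    pthread_mutex_unlock(&d->lock);
    return rc;
}
```

After domain_init(&d, 4), calling domain_set_vcpus_buggy(&d, 2) fails on the inner lock attempt, while domain_set_vcpus_fixed(&d, 2) succeeds. The design choice in the fixed variant is to keep locking at the public entry points and have them share internal helpers that assume the lock is already held, rather than calling each other.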
Comment 3 Daniel Berrange 2009-03-10 08:15:45 EDT
Created attachment 334628 [details]
Fix recursive locking
Comment 4 Daniel Berrange 2009-03-10 08:16:06 EDT
Posted upstream

Comment 5 Marc Milgram 2009-03-12 09:58:00 EDT
*** Bug 489779 has been marked as a duplicate of this bug. ***
Comment 6 Mark McLoughlin 2009-03-25 08:04:54 EDT
Fixed in rawhide with:

* Tue Mar 17 2009 Daniel P. Berrange <berrange@redhat.com> - 0.6.1-4.fc11
- Avoid deadlock in setting vCPU count