Bug 693203

Summary: Avoid virsh list hang when qemu becomes unresponsive
Product: Red Hat Enterprise Linux 6 Reporter: Mark Wu <dwu>
Component: libvirt Assignee: Jiri Denemark <jdenemar>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.0 CC: cww, dallan, dyuan, eblake, jdenemar, jyang, rwu, whuang
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: libvirt-0.9.2-1.el6 Doc Type: Bug Fix
Doc Text:
Running the virsh list command could block indefinitely when any QEMU process tracked by libvirtd was not responding to monitor commands. With this update, virsh list does not require any interaction with running QEMU processes and thus can always list all domains.
Story Points: ---
Clone Of:
: 750147 Environment:
Last Closed: 2011-12-06 11:04:18 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 750147    

Comment 2 Dave Allan 2011-04-05 02:25:25 UTC
Hi Mark,  Can you propose your patch on the upstream libvirt list?

Comment 3 Jiri Denemark 2011-04-05 10:01:18 UTC
This is not the right approach. Introducing a timeout into all monitor commands sent to qemu is a bad thing. I think the right approach is to have a simple API which would just return a domain's state without talking to its monitor or doing other complicated stuff.

Comment 4 Dave Allan 2011-04-05 20:35:26 UTC
Regardless of the merits of the approach, that's a discussion that should really happen on the upstream list.

Comment 6 Dave Allan 2011-04-06 16:18:57 UTC
Great, thanks for doing that. I saw that there was a fair amount of feedback; if you're willing to keep at it, we'll keep reviewing.  :)  I think it's an important problem to solve.

Comment 7 Osier Yang 2011-04-07 07:04:18 UTC
This bug's fix will also fix this problem: 

http://www.redhat.com/archives/libvir-list/2011-March/msg01422.html

Comment 8 Jiri Denemark 2011-05-05 07:03:46 UTC
BTW, I've just sent upstream the patches for the new API which prevents virsh list from hanging: https://www.redhat.com/archives/libvir-list/2011-May/msg00125.html

Comment 9 Jiri Denemark 2011-06-13 12:12:10 UTC
Fixed upstream by v0.9.1-137-g0eaf4d9 and v0.9.1-138-g26d9401:

commit 0eaf4d93bef43c5e0dd0f6a4610281da7d87522e
Author: Jiri Denemark <jdenemar>
Date:   Fri Apr 29 10:20:49 2011 +0200

    virsh: Prefer virDomainGetState over virDomainGetInfo

commit 26d94012f6f69ecf75dc7e04003dfd4ece1e84fd
Author: Jiri Denemark <jdenemar>
Date:   Mon May 2 11:35:29 2011 +0200

    Implement basic virDomainGetState in all drivers
    
    Reason is currently always set to 0 (i.e., *_UNKNOWN).

Comment 10 Daniel Veillard 2011-06-23 02:57:02 UTC
This should be fixed by the libvirt-0.9.2-1.el6 rebase

Comment 11 Huang Wenlong 2011-06-24 05:52:00 UTC
Tested this bug with:

libvirt-0.9.2-1.el6.x86_64
qemu-kvm-0.12.1.2-2.165.el6.x86_64
kernel-2.6.32-156.el6.x86_64
virt-manager-0.8.6-4.el6



Test steps: 

1) Run two guests with virt-manager, then get the PID of the fedora14 guest
# ps -aux | grep qemu | grep fedora14
qemu 9266 4.1 8.9 796864 343604 ? Sl 10:32 2:21 /usr/libexec/qemu-kvm -S -M rhel6.1.0 -enable-kvm -m 512 -smp 1,sockets=1,cores=1,threads=1 -name fedora14 ...

2) Stop the qemu process with kill -STOP $PID
# kill -STOP 9266

3) Check the status of the process
# ps -aux | grep qemu | grep fedora14
qemu 9266 3.9 8.9 796864 343604 ? Tl 10:32 2:22 /usr/libexec/qemu-kvm -S -M rhel6.1.0 -enable-kvm -m 512 -smp 1,sockets=1,cores=1,threads=1 -name fedora14

4) Run virsh list
# virsh list
 Id Name                 State
----------------------------------
 21 fedora14             running
 22 rhel5.7              running


Now virsh list no longer hangs, but the guest's status is "running" even though the qemu process is stopped. Is that guest status correct, or should it change to "stop"?
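
The test steps above could be scripted roughly as follows. This is only a sketch, not part of the original verification: the helper names proc_state and run_with_deadline are hypothetical, proc_state reads the state letter from /proc (the same letter ps shows first in its STAT column, "T" for a stopped process), and the deadline relies on the coreutils timeout command, which exits with status 124 when the time limit is hit.

```shell
#!/bin/sh
# Sketch of the reproduction steps; helper names are hypothetical.

# Print the one-letter state of a process (S = sleeping, T = stopped, ...).
# Linux-specific: parses the "State:" line of /proc/<pid>/status.
proc_state() {
    sed -n 's/^State:[[:space:]]*\([A-Za-z]\).*/\1/p' "/proc/$1/status"
}

# Run a command under a time limit given in seconds.
# With coreutils timeout, exit status 124 means the command was still
# running when the limit expired, i.e. it hung.
run_with_deadline() {
    secs=$1
    shift
    timeout "$secs" "$@"
}

# Intended use on a libvirt host, with PID set to the qemu PID from step 1:
#   kill -STOP "$PID"
#   proc_state "$PID"                 # a stopped process reports T
#   run_with_deadline 10 virsh list   # status 124 would indicate a hang
#   kill -CONT "$PID"                 # resume the guest afterwards
```

The 10-second limit in the usage comment is an arbitrary choice; any value comfortably above the normal virsh list run time would serve the same purpose.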

Comment 12 Jiri Denemark 2011-06-24 10:40:12 UTC
The guest status is correct and it should still remain "running".

Comment 13 Huang Wenlong 2011-06-27 01:51:51 UTC
(In reply to comment #12)
> The guest status is correct and it should still remain "running".

Got it, thanks.

Comment 15 Rita Wu 2011-07-06 10:32:22 UTC
Set to VERIFIED per comments 11 and 12.

Comment 16 Jiri Denemark 2011-11-14 14:30:55 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Running the virsh list command could block indefinitely when any QEMU process tracked by libvirtd was not responding to monitor commands. With this update, virsh list does not require any interaction with running QEMU processes and thus can always list all domains.

Comment 17 errata-xmlrpc 2011-12-06 11:04:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1513.html