Bug 530594

Summary: restart of libvirtd causes condor_vm-gahp to hang.
Product: Red Hat Enterprise MRG Reporter: Timothy St. Clair <tstclair>
Component: condor Assignee: Timothy St. Clair <tstclair>
Status: CLOSED ERRATA QA Contact: Luigi Toscano <ltoscano>
Severity: medium Docs Contact:
Priority: medium    
Version: 1.2 CC: ltoscano, matt, tstclair
Target Milestone: 1.3   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: 7.4.2-0.6 Doc Type: Bug Fix
Doc Text:
When 'libvirtd' was restarted on an execute machine running KVM or Xen jobs, 'vm-gahp' would continue running indefinitely even though it had lost communication with the VM and 'libvirtd' had terminated the VM. With this update, Condor recognizes that 'libvirtd' no longer sees the VM and terminates the job.
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-10-14 16:15:28 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Timothy St. Clair 2009-10-23 16:27:04 UTC
When you restart libvirtd on an execute machine that is running KVM or Xen jobs, the vm-gahp continues running forever even though it has lost communication with the VM and libvirtd has terminated the VM.

Reference Ticket upstream is: http://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=883

Comment 1 Timothy St. Clair 2010-01-19 18:01:31 UTC
(In reply to comment #0)
More specifically, the upstream tracking ticket is:
http://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=1119

Ticket 883 is a parent ticket.

Comment 2 Timothy St. Clair 2010-01-25 17:49:28 UTC
Changes have been pushed upstream to the 7.4.2 branch.

The fix will be in 7.4.2-0.6.

Comment 3 Luigi Toscano 2010-05-31 15:08:47 UTC
This bug is linked to issues in libvirtd, which does not restart properly while images are running, at least on RHEL 5.x with KVM (Xen seems to work). When libvirtd is restarted, the VM is leaked.

The new code promptly recognizes that libvirtd no longer sees the VM and terminates the job. It does not change the Xen behaviour, which still works after the restart.

Verified on RHEL 5.5, KVM/x86_64, Xen/i386, Xen/x86_64.
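
For illustration only, the behaviour verified above (notice that the hypervisor daemon no longer knows the VM, then terminate the job) could be sketched as follows. This is not the Condor patch: `watch_vm`, `DomainGone`, `lookup` and `terminate_job` are hypothetical names, and the lookup is injected as a plain callable so the sketch runs without libvirt installed.

```python
import time
from typing import Callable


class DomainGone(Exception):
    """Hypothetical error: the hypervisor daemon no longer knows the VM."""


def watch_vm(
    lookup: Callable[[str], object],
    vm_name: str,
    terminate_job: Callable[[str], None],
    poll_interval: float = 0.0,
    max_polls: int = 10,
) -> bool:
    """Poll for the VM; terminate the job if the VM has disappeared.

    Returns True if the job was terminated, False if the VM was still
    visible after max_polls checks.
    """
    for _ in range(max_polls):
        try:
            # Assumption: lookup raises DomainGone when the daemon
            # cannot find a VM with this name.
            lookup(vm_name)
        except DomainGone:
            # The VM is no longer visible, so stop waiting on it.
            terminate_job(vm_name)
            return True
        time.sleep(poll_interval)
    return False
```

In a real deployment the injected `lookup` would wrap whatever query the hypervisor API provides; the point of the sketch is only that a failed lookup ends the job instead of being retried forever.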

Comment 4 Martin Prpič 2010-10-07 15:52:07 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
When 'libvirtd' was restarted on an execute machine running KVM or Xen jobs, 'vm-gahp' would continue running indefinitely even though it had lost communication with the VM and 'libvirtd' had terminated the VM. With this update, Condor recognizes that 'libvirtd' no longer sees the VM and terminates the job.

Comment 6 errata-xmlrpc 2010-10-14 16:15:28 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html