Created attachment 366589
Proposed patch

Description of problem:
When querying Xen remotely (xen+ssh://) on an el5.4 host from an el5.3 host, virsh raises an "Unknown failure" error.

Version-Release number of selected component (if applicable):
libvirt-0.6.3

How reproducible:
100% reproducible

Steps to Reproduce:
1. Make sure guest_dom is shut off on the rhel54 host.
2. Run "virsh --connect xen+ssh://rhel54 dominfo guest_dom" from rhel53:
   rhel53# virsh --connect xen+ssh://rhel54 dominfo guest_dom
3. Run "virsh --connect xen+ssh://rhel54 vcpuinfo guest_dom" from rhel53:
   rhel53# virsh --connect xen+ssh://rhel54 vcpuinfo guest_dom

Actual results:
[root@rhel53 ~]# virsh --connect xen+ssh://rhel54 dominfo guest_dom
root@rhel54's password:
Id:             -
Name:           guest_dom
UUID:           111111dd-1111-1111-1111-111111111111
libvir: error : Unknown failure    <---- error message
State:          shut off
CPU(s):         1
Max memory:     524288 kB
Used memory:    524288 kB

[root@rhel53 ~]# virsh --connect xen+ssh://rhel54 vcpuinfo guest_dom3
root@rhel54's password:
libvir: error : Unknown failure    <---- error message
error: Domain shut off, virtual CPUs not present.

Expected results:
[root@rhel53 ~]# virsh --connect xen+ssh://rhel54 dominfo guest_dom
root@rhel54's password:
Id:             -
Name:           guest_dom
UUID:           111111dd-1111-1111-1111-111111111111
OS Type:        Unknown, cannot get domain details
State:          shut off
CPU(s):         1
Max memory:     524288 kB
Used memory:    524288 kB

[root@rhel53 ~]# virsh --connect xen+ssh://rhel54 vcpuinfo guest_dom3
root@rhel54's password:
libvir: Xen error : internal error xenHypervisorGetVcpus: domain shut off or invalid
error: Domain shut off, virtual CPUs not present.

Additional info:
Libvirt 0.6.x now sets a generic error code in case of error: "Unknown failure", which is set in virErrorGenericFailure() in src/virterror.c.
However, when the domain is shut off, xenHypervisorDomainGetOSType() and xenHypervisorGetVcpus() in src/xen_internal.c fail without setting an error, so the failure ends up reported as the generic "Unknown failure".

The attached patch does two things:
1) If the domain is shut off, xenHypervisorDomainGetOSType() still returns a string, "Unknown, cannot get domain details", instead of raising an error that we cannot avoid.
2) xenHypervisorGetVcpus() now properly sets the error message, so "domain shut off or invalid" is raised in place of the generic "Unknown failure" error.

PS: there may be other places in the code where the error needs to be set too.
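The second point of the patch can be illustrated with a small stand-alone sketch. This is not libvirt code: last_error, set_error(), report_error(), reset_error() and the two get_vcpus_*() functions are hypothetical names invented for the illustration, and a negative domain ID is assumed to stand for a shut-off domain. It only shows why a driver function that fails silently surfaces as "Unknown failure", while one that records its own message does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the per-connection "last error" slot. */
static char last_error[128];

/* Hypothetical helper: forget any previously recorded error. */
void reset_error(void)
{
    last_error[0] = '\0';
}

/* Hypothetical helper: record a specific error before failing. */
void set_error(const char *func, const char *msg)
{
    assert(func != NULL && msg != NULL);
    snprintf(last_error, sizeof(last_error), "internal error %s: %s", func, msg);
}

/* Hypothetical caller-side fallback: if the failing call left no
 * error behind, only the generic message can be reported. */
const char *report_error(void)
{
    return last_error[0] != '\0' ? last_error : "Unknown failure";
}

/* Before the patch (sketch): fails for a shut-off domain, sets nothing.
 * Assumption: a negative ID stands for a shut-off domain. */
int get_vcpus_unpatched(int domain_id)
{
    if (domain_id < 0)
        return -1;
    return 1;
}

/* After the patch (sketch): same failure, but the error is recorded first. */
int get_vcpus_patched(int domain_id)
{
    if (domain_id < 0) {
        set_error("xenHypervisorGetVcpus", "domain shut off or invalid");
        return -1;
    }
    return 1;
}
```

Calling get_vcpus_unpatched(-1) and then report_error() yields the generic message, whereas the patched variant yields the "domain shut off or invalid" text seen in the expected results.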
A slightly modified patch has been posted upstream:
https://www.redhat.com/archives/libvir-list/2009-December/msg00635.html

However, we can't change the semantics of virDomainGetOSType(), which returns NULL in case of error. I changed the submitted patch accordingly, and as a result the first error is not corrected. That's annoying, but preserving the semantics is more important. I don't think it is much of a problem in practice, and at least the second problem is covered.
> 1) If the domain is shut off, xenHypervisorDomainGetOSType() will still return
> a string "Unknown, cannot get domain details" to avoid generating an error that
> we cannot avoid

I don't think that is correct. The virDomainGetOSType() method should work whether the guest is running or not. If it is not working for inactive domains, that sounds like a bug in the 'xm_internal.c' driver.
Well, there is no support for the OS type in the xm_internal.c driver.

The Xen front-end function xenUnifiedDomainGetOSType calls in sequence:
- xenHypervisorDomainGetOSType, which fails because the domain isn't running and just returns NULL
- xenDaemonDomainGetOSType, which also fails because xendConfigVersion is < 3 and just returns NULL
- xenStoreDomainGetOSType, which also fails to look the domain up in Xenstore

When xenUnifiedDomainGetOSType finds that there is no result, it gives back NULL to the virDomainGetOSType() entry point, which calls virSetConnError(). Not finding any error, that call creates the mysterious "Unknown failure" in virErrorGenericFailure().

I don't think it's worth investigating further at this point. The best we could do there is maybe drop the virSetConnError(domain->conn) call if there was no obvious error.

Daniel

The problem is somehow in
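The call sequence described in this comment can be sketched as follows. This is a simplified model, not the libvirt implementation: hypervisor_os_type(), daemon_os_type(), store_os_type(), unified_os_type() and get_os_type() are hypothetical names, a negative domain ID is assumed to stand for a shut-off domain, and the "linux" result is illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef const char *(*os_type_fn)(int domain_id);

/* Hypothetical backends. Each one returns NULL on failure and,
 * as in the case described above, records no error of its own. */
const char *hypervisor_os_type(int domain_id)
{
    /* Assumption: a negative ID stands for a shut-off domain. */
    return domain_id >= 0 ? "linux" : NULL;
}

const char *daemon_os_type(int domain_id)
{
    (void)domain_id;
    return NULL; /* models "config version too old" */
}

const char *store_os_type(int domain_id)
{
    (void)domain_id;
    return NULL; /* models "domain not found in the store" */
}

/* Front end: try each backend in order, return the first answer. */
const char *unified_os_type(int domain_id)
{
    const os_type_fn backends[] = {
        hypervisor_os_type, daemon_os_type, store_os_type
    };
    size_t i;

    for (i = 0; i < sizeof(backends) / sizeof(backends[0]); i++) {
        const char *result = backends[i](domain_id);
        if (result != NULL)
            return result;
    }
    return NULL;
}

/* Entry point: on failure nobody has set an error, so only the
 * generic message is left to report through *error_out. */
const char *get_os_type(int domain_id, const char **error_out)
{
    const char *result;

    assert(error_out != NULL);
    result = unified_os_type(domain_id);
    *error_out = (result == NULL) ? "Unknown failure" : NULL;
    return result;
}
```

With a shut-off domain all three backends return NULL, so the entry point has nothing more specific than the generic message to hand back.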
libvirt-0.6.3-26.el5 has been built into dist-5E-qu-candidate with the fix.

Daniel
The bug has been fixed in libvirt-0.6.3-28.el5.

First, reproduce the bug on libvirt-0.6.3-25.el5 as follows:

# virsh --connect xen+ssh://RHEL5u4server dominfo rhel5u4
Id:             -
Name:           rhel5u4
UUID:           14ecf112-2994-b54b-fff3-3b651c1845af
libvir: error : Unknown failure
State:          shut off
CPU(s):         1
Max memory:     1048576 kB
Used memory:    1048576 kB

# virsh --connect xen+ssh://RHEL5u4server vcpuinfo rhel5u4
libvir: error : Unknown failure
error: Domain shut off, virtual CPUs not present.

Then update libvirt to libvirt-0.6.3-28.el5. The output now looks like this:

Id:             -
Name:           rhel5u4
UUID:           14ecf112-2994-b54b-fff3-3b651c1845af
libvir: error : Unknown failure
State:          shut off
CPU(s):         1
Max memory:     1048576 kB
Used memory:    1048576 kB

[root@dhcp-66-70-176 ~]# virsh --connect xen+ssh://10.66.70.162 vcpuinfo rhel5u4
root@10.66.70.162's password:
libvir: Xen error : internal error xenHypervisorGetVcpus: domain shut off or invalid
error: Domain shut off, virtual CPUs not present.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0205.html