Red Hat Bugzilla – Bug 220036
PV guest install console broken with "KeyError" python traceback
Last modified: 2007-11-30 17:07:39 EST
Description of problem:
PV installs of either FC6 or RHEL5 guests on RHEL5 hosts are broken from
virt-manager. The install itself appears to work OK, but virt-manager doesn't realise it.
Version-Release number of selected component (if applicable):
How reproducible:
Seems to be about 75% failure rate
Steps to Reproduce:
1. Install a PV guest from virt-manager. It doesn't seem to matter which: I've tried
both FC6 and RHEL5 20061218 (i.e. both old and new PV FB code).
It attempts to bring up the console when the guest starts, but fails, leaving
the install wizard stuck on the last page, with a Python traceback on stdout:
Traceback (most recent call last):
File "/usr/share/virt-manager/virtManager/create.py", line 425, in finish
vm = self.connection.get_vm(guest.uuid)
File "/usr/share/virt-manager/virtManager/connection.py", line 74, in get_vm
However, the domain is created, and its console can be activated by hand once it
is running.
Can you provide the virt-manager and xend logs, dating from a time immediately
after the install failed? I want to try and correlate the sequence of events
from the two files based on timestamps to figure out where the race condition
is occurring...
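A rough sketch of that kind of timestamp correlation (the leading `YYYY-MM-DD HH:MM:SS` stamp format and file paths are assumptions for illustration; the real virt-manager and xend log formats may differ):

```python
import heapq
from datetime import datetime

def parse_entries(path, fmt="%Y-%m-%d %H:%M:%S"):
    """Yield (timestamp, source, line) tuples from a log file.

    Assumes each line starts with a timestamp in the given format;
    lines without one (e.g. traceback continuations) are skipped.
    """
    with open(path) as f:
        for line in f:
            stamp = line[:19]  # leading "YYYY-MM-DD HH:MM:SS"
            try:
                ts = datetime.strptime(stamp, fmt)
            except ValueError:
                continue
            yield ts, path, line.rstrip("\n")

def merge_logs(path_a, path_b):
    """Interleave two log files in timestamp order."""
    return heapq.merge(parse_entries(path_a), parse_entries(path_b))
```

Feeding the two logs through `merge_logs()` gives a single timeline, which makes it easier to spot which daemon acted first around the failure.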
This happens if you try to create a new domain after a domain with the same name
has been stopped. This is relatively frequent: say you gave the installer wrong
information, want to restart the install, stop the domain using the "Shutdown"
button, and then, once it has disappeared, restart the creation process with the
same name. You will see that the UUID in the KeyError is the UUID that the
previous domain with the same name had.
One plausible explanation could be that:
- virt-manager uses libvirt to see the current list of domain ids
- libvirt uses a hypervisor call to get the id list
- the new id is seen by the hypervisor, but the data are not fully set up
in xend (and for some reason it keeps the old UUID somewhere)
Okay, I was on slightly older versions, so the fact that I had the problem was
'normal'. Uploading my logs anyway.
Created attachment 143915 [details]
virt-manager log for that sequence, I created 3 guests one after the other
Created attachment 143916 [details]
associated xend log for the 3 creations
Also, I seem to be able to reproduce this reliably now, ONLY if there is already
a domU running when we try to install the new domain. If only the dom0 is
running, install seems to proceed normally.
OK, this is dependent on config settings. You need to have 'Automatically
open consoles' set to 'For new domains' - if it is set to 'Never' or 'For all
domains', then the bug won't be visible.
Second, you have to create a domain with the same name twice during the lifetime
of the virt-manager process. The second time it's created, virt-manager will see
the old UUID. So this is definitely a bug somewhere, with the virDomainPtr object
being cached for too long (forever?).
Creating it twice within the virt-manager session is not necessary: it only has
to be seen once by virt-manager to cause the problem. If the named domain is
already running when virt-manager starts, and you then kill it and try to
reinstall, you'll see the same effect.
This seems consistent with what I've observed so far. All instances of the bug
have occurred for me while testing installs of different guests, which meant I
was running repeated installs into the same domU name.
So, after much investigation I've discovered that the problem is basically that
the Python wrapper around the underlying virDomainPtr object never gets released
in the Python layer. Since libvirt maintains a cache of virDomainPtr objects
indexed on domain name, this means that the next time you create a guest with the
same name, libvirt gives back the old cached virDomainPtr object. This has the
original domain's UUID fixed in it, rather than the new one. Thus we end up
seeing the KeyError mentioned in comment #1.
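A minimal sketch of that failure mode (the classes and method names here are illustrative stand-ins, not libvirt's or virt-manager's actual internals):

```python
class Domain:
    """Stand-in for a virDomainPtr wrapper: named, with a fixed UUID."""
    def __init__(self, name, uuid):
        self.name, self.uuid = name, uuid

class Connection:
    """Stand-in for the libvirt layer with a name-keyed domain cache."""
    def __init__(self):
        self._cache = {}   # name -> Domain; entries are never evicted
        self.vms = {}      # uuid -> Domain; what get_vm() consults

    def create_domain(self, name, uuid):
        # Bug: if a wrapper for this name is still cached, it is handed
        # back with the *old* UUID instead of a fresh object being built.
        dom = self._cache.setdefault(name, Domain(name, uuid))
        self.vms[dom.uuid] = dom
        return dom

    def get_vm(self, uuid):
        return self.vms[uuid]  # KeyError when asked for the new UUID

conn = Connection()
conn.create_domain("fc6-guest", "uuid-1")        # first install
dom = conn.create_domain("fc6-guest", "uuid-2")  # reinstall, same name
assert dom.uuid == "uuid-1"                      # stale UUID leaks through
# conn.get_vm("uuid-2") now raises KeyError, as in the traceback above
```

The wizard's `finish()` asks for the new guest's UUID, but only the old UUID is registered, hence the KeyError.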
The question is why the Python layer is not releasing its wrapper around the
virDomainPtr object. Well, this wrapper is in turn held in virt-manager's
vmmDomain class. Best debugging efforts thus far indicate there is some circular
reference which is preventing Python's garbage collector from releasing the
vmmDomain instance, which in turn means the virDomainPtr instance is never freed.
Since I have been unable to find the cause of this circular reference, I've come
up with a workaround: explicitly set the 'vm' property of the vmmDomain object
to None. This causes the underlying virDomainPtr object to be released, even
though our higher-level vmmDomain object is stuck with a circular reference.
Testing with this workaround shows the KeyError problem goes away.
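The principle behind the workaround can be sketched like this (again with illustrative stand-in classes; on CPython, dropping the last strong reference frees the object immediately via refcounting, regardless of whether the enclosing cycle is ever collected):

```python
import weakref

class DomainHandle:
    """Stand-in for the virDomainPtr wrapper; freed when its refcount
    drops to zero."""
    pass

class VmmDomain:
    """Stand-in for virt-manager's vmmDomain: holds the handle and ends
    up in an accidental reference cycle."""
    def __init__(self, handle):
        self.vm = handle
        self.cycle = self  # stand-in for the real (unidentified) cycle

dom = VmmDomain(DomainHandle())
probe = weakref.ref(dom.vm)   # lets us observe when the handle dies

# The vmmDomain itself is stuck in a cycle, but explicitly dropping the
# 'vm' attribute releases the underlying handle right away:
dom.vm = None
assert probe() is None        # handle freed despite the cycle
```

That is essentially what the attached patch does: null out the one reference that matters instead of untangling the whole cycle.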
Created attachment 143951 [details]
Explicitly drop reference to virDomainPtr object
Nasty hack / workaround to ensure virDomainPtr object is released.
*** Bug 211624 has been marked as a duplicate of this bug. ***
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release. Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux major release. This request is not yet committed for
inclusion.
*** Bug 215638 has been marked as a duplicate of this bug. ***
Built with brew:
$ brew latest-pkg dist-5E virt-manager
Build Tag Built by
---------------------------------------- -------------------- ----------------
virt-manager-0.2.6-7.el5 dist-5E berrange
* Tue Jan 9 2007 Daniel P. Berrange <firstname.lastname@example.org> - 0.2.6-7.el5
- Explicitly drop the libvirt virDomainPtr object when a guest shuts down
to avoid hanging onto objects in libvirt cache (bz 220036)
A package has been built which should help the problem described in
this bug report. This report is therefore being closed with a resolution
of CURRENTRELEASE. You may reopen this bug report if the solution does
not work for you.