Description of problem:
I tried to run a VM; it did not seem to start, so I pressed stop several times. I suspect this caused a race: while RHEVM thought the VM was down, it was in fact still up and running in VDSM.

Possible cause:

Thread-984::ERROR::2013-02-18 17:03:14,129::vm::680::vm.Vm::(_startUnderlyingVm) vmId=`2723115e-744d-40ea-8b7d-57258f2c9d37`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 642, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/libvirtvm.py", line 1480, in _run
    self._domDependentInit()
  File "/usr/share/vdsm/libvirtvm.py", line 1354, in _domDependentInit
    raise Exception('destroy() called before Vm started')
Exception: destroy() called before Vm started

Version-Release number of selected component (if applicable):
vdsm-4.10.2-1.4.el6.x86_64

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
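For illustration only, a minimal Python sketch of the kind of race behind this traceback (this is not the actual vdsm code; class and attribute names are hypothetical): the creation thread only notices a concurrent destroy() at the domain-dependent init stage, so a stop request that wins the race surfaces as this exception instead of a quiet shutdown.

    import threading

    class VmSketch(object):
        """Hypothetical reduction of the create/destroy race."""

        def __init__(self):
            self._destroy_requested = threading.Event()

        def destroy(self):
            # "Stop" pressed in the UI: record the request, even if
            # the VM is still being created.
            self._destroy_requested.set()

        def _dom_dependent_init(self):
            # The creation thread checks the flag only here, so a
            # destroy() that arrived earlier is detected late and
            # raised as an error rather than handled cleanly.
            if self._destroy_requested.is_set():
                raise Exception('destroy() called before Vm started')

        def start(self):
            self._dom_dependent_init()
            # ... the rest of the start-up flow would follow here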
Created attachment 698922 [details] vdsm log
This is harmless, so the exception could be downgraded to a normal log message. You've killed the VM while it was still being created, and we have code to (almost) correctly handle that... well, looking at the code, it's far from bulletproof... actually, ugly! I'd say it's too late for the current release, but let's make it better in 3.3.
Refactoring of that part of the code didn't make it into the 3.3 timeframe; pushing to 3.4...
Let's see if we can refactor this in the 3.5 timeframe...
*** Bug 1028045 has been marked as a duplicate of this bug. ***
Additional thoughts from bug 1028045: I don't care much about the vdsm logs, but in the UI we shouldn't report anything if we find the VM down. To differentiate, though, we may need an extra check at the engine level to see whether that VM is running somewhere else (I'm thinking of a race at the end of migration: before the state is updated in the UI, one could send poweroff to the source host, where the VM no longer runs). A rough sketch of that check follows below.
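A sketch of such an engine-level check, written in Python purely for illustration (the real engine is Java, and all names here are hypothetical): before treating a powered-off VM as down, ask the other candidate hosts whether they still report it as running.

    def confirm_vm_is_down(vm_id, candidate_hosts, get_vm_status):
        """Hypothetical helper: get_vm_status(host, vm_id) is assumed to
        return a status string such as 'Up' or 'Down' for that host."""
        for host in candidate_hosts:
            if get_vm_status(host, vm_id) == 'Up':
                # The VM is still running elsewhere (e.g. a migration source
                # not yet updated), so don't report it as down in the UI.
                return False
        return True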
cleanup started -> POST
The current patch is the right path, but there are more pieces missing. We won't be able to fit it into 3.6; proposing to postpone.
This bug is flagged for 3.6, yet the milestone is for the 4.0 version; therefore the milestone has been reset. Please set the correct milestone or add the flag.
This bug was accidentally moved from POST to MODIFIED via an error in automation, please see mmccune with any questions
After some thought, I don't think we can do much better than https://gerrit.ovirt.org/#/c/55151/5 on Vdsm side; otherwise, we'll need a complete overhaul of *both* the creation and destroy flow, which could be done only for 4.1.

Not sure if patches are needed for Engine side.
(In reply to Francesco Romani from comment #14)
> After some thought, I don't think we can do much better than
> https://gerrit.ovirt.org/#/c/55151/5
> on Vdsm side; otherwise, we'll need a complete overhaul of *both* the
> creation and destroy flow, which could be done only for 4.1.
> 
> Not sure if patches are needed for Engine side.

Arik, do you think we need patches on Engine side? Could Vdsm be improved to make Engine's life easier here?
(In reply to Francesco Romani from comment #15)
> (In reply to Francesco Romani from comment #14)
> > After some thought, I don't think we can do much better than
> > https://gerrit.ovirt.org/#/c/55151/5
> > on Vdsm side; otherwise, we'll need a complete overhaul of *both* the
> > creation and destroy flow, which could be done only for 4.1.
> > 
> > Not sure if patches are needed for Engine side.
> 
> Arik, do you think we need patches on Engine side? Could Vdsm be improved to
> make Engine's life easier here?

Considering the monitoring changes in Engine since this bug was reported, we believe no further patches are required.
Verified on vdsm-4.18.10-1.el7ev.x86_64

1) Added a sleep to /usr/share/vdsm/virt/vm.py on the host:

    self._vmCreationEvent.set()
    try:
        from time import sleep
        sleep(120)
        self._run()

2) Started the VM in the engine on the specific host.

3) Checked that the host still does not have the VM:

    # virsh -r list
     Id    Name                           State
    ----------------------------------------------------

4) Powered off the VM.

Checked the VDSM log; no traceback or ERROR related to this bug.
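As a convenience when repeating the log check in step 4, a small Python snippet that scans the vdsm log for the original error (the log path is an assumption; adjust for the environment under test):

    import re

    # Assumed default vdsm log location on the host.
    LOG_PATH = '/var/log/vdsm/vdsm.log'
    PATTERN = re.compile(r'destroy\(\) called before Vm started')

    with open(LOG_PATH) as log:
        hits = [line.rstrip() for line in log if PATTERN.search(line)]

    if hits:
        print('Found %d occurrence(s) of the old traceback:' % len(hits))
        for line in hits:
            print(line)
    else:
        print('No "destroy() called before Vm started" errors found.')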