VDSM: 'Running'/'Paused' VMs failed to recover after vdsmd service restart when OS_Name='UNKNOWN'.

Description:
*************
- VDSM restarts itself when the SPM loses its connection to the master storage domain.
- If there is a problem with redhat-release and OSNAME is neither RHEL nor RHEV, running VMs that are displayed in 'virsh -r list' fail to recover on the vdsm side after a vdsmd service restart, and there is no way to re-initiate them.

Note:
******
I had a problem with two instances of redhat-release-server, which caused OSNAME to return 'UNKNOWN'.

[root@red-vds3 /]# virsh -r list
 Id Name                 State
----------------------------------
  1 basic_xp             paused
  2 igor_test            running

[root@red-vds3 /]# vdsClient -s 0 list
{RETURNS EMPTY ... }

VDSM.log:
When attempting to re-initiate the VMs from within the RHEVM GUI, the operation failed:
*****************************************************************************
</features>
<cpu match="exact">
<model>Conroe</model>
<topology cores="1" sockets="1" threads="1"/>
</cpu>
</domain>
Thread-198::DEBUG::2011-06-26 16:07:10,959::vm::359::vm.Vm::(_startUnderlyingVm) vmId=`6028cc60-341a-4dd9-910c-85e804bc9d35`::_ongoingCreations released
Thread-198::INFO::2011-06-26 16:07:10,959::vm::383::vm.Vm::(_startUnderlyingVm) vmId=`6028cc60-341a-4dd9-910c-85e804bc9d35`:: The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 349, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/libvirtvm.py", line 939, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/share/vdsm/libvirtconnection.py", line 59, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1353, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: Requested operation is not valid: domain is already active as 'igor_test'
Thread-198::DEBUG::2011-06-26 16:07:10,964::vm::777::vm.Vm::(setDownStatus) vmId=`6028cc60-341a-4dd9-910c-85e804bc9d35`::Changed state to Down: Requested operation is not valid: domain is already active as 'igor_test'
Thread-201::DEBUG::2011-06-26 16:07:11,148::clientIF::55::vds::(wrapper) [10.35.64.12]::call getVmStats with ('6028cc60-341a-4dd9-910c-85e804bc9d35',) {}
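The symptom above (domains that libvirt still reports while vdsm lists nothing) can be spotted by comparing the two listings. The following is only a minimal sketch in Python, assuming the 'virsh -r list' table layout shown in this report and domain names without whitespace; `parse_virsh_list` and `unrecovered` are hypothetical helper names and are not part of vdsm or libvirt.

```python
def parse_virsh_list(output):
    """Parse the text printed by `virsh -r list` into {name: state}.

    Assumes the three-column layout (Id, Name, State) shown in this
    report and domain names that contain no whitespace.
    """
    domains = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows start with a numeric domain id; this skips the
        # header row, the dashed separator and blank lines.
        if len(fields) >= 3 and fields[0].isdigit():
            # The state may be more than one word (e.g. "shut off").
            domains[fields[1]] = " ".join(fields[2:])
    return domains


def unrecovered(libvirt_domains, vdsm_vm_names):
    """Return the names libvirt reports that vdsm does not know about."""
    return sorted(set(libvirt_domains) - set(vdsm_vm_names))


# Sample input copied from the report above.
SAMPLE = """\
 Id Name                 State
----------------------------------
  1 basic_xp             paused
  2 igor_test            running
"""

if __name__ == "__main__":
    domains = parse_virsh_list(SAMPLE)
    # vdsClient returned an empty list in the report, hence [].
    print(unrecovered(domains, []))
```

With the listing from this report and an empty vdsm list, both domains are flagged as unrecovered.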
Created attachment 509965 [details] full VDSM.log
An UNKNOWN operating system is a misconfiguration. Let's block starting VMs when that is the case. http://git.fedorahosted.org/git/?p=vdsm.git
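The kind of check proposed here could look roughly like the sketch below. It is not the actual patch: `HostMisconfiguredError`, `ensure_known_os` and `start_vm` are hypothetical names invented for illustration, and the only value taken from this report is the 'UNKNOWN' OS name.

```python
# Value reported for OSNAME in this bug when redhat-release is broken.
UNKNOWN = "UNKNOWN"


class HostMisconfiguredError(Exception):
    """Hypothetical error: the host OS could not be identified."""


def ensure_known_os(os_name):
    """Refuse to continue when the host OS name is UNKNOWN."""
    if os_name == UNKNOWN:
        raise HostMisconfiguredError(
            "host OS is %s; refusing to start VMs" % os_name)


def start_vm(vm_name, os_name, create):
    """Start a VM only on a host whose OS was identified.

    `create` is a caller-supplied callable that performs the real
    start (an assumption made to keep this sketch self-contained).
    """
    ensure_known_os(os_name)
    return create(vm_name)


if __name__ == "__main__":
    try:
        start_vm("igor_test", UNKNOWN, lambda name: name)
    except HostMisconfiguredError as err:
        print("blocked: %s" % err)
```

Failing early with an explicit error makes the misconfiguration visible at VM start rather than at recovery time after a vdsmd restart.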
Can rhev-m pass this information, or only host-guest level info?
Oops, pasted the wrong link to the patch. http://gerrit.usersys.redhat.com/623 Itamar, yeah, rhev-m could send vdsm a bit of information about the host vdsm is running on. But I do not see how this helps to resolve possible confusion here; it only adds another point of failure. I'd like to keep this on the host-guest level only.
Verified - vdsm-4.9-81.el6 - Omri's scenario no longer reproduces. VMs successfully recover after a vdsmd restart.
A patch relating to this bug was mistakenly included in build 96 and will be reverted in the next build. Sorry for the noise.
Cannot be verified; currently blocked by 735816.
Could not reproduce on 4.9-104. Closing as verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2011-1782.html