Bug 806938 - [ovirt] [vdsm] call destroy returns True although vm is not destroyed which creates split brain
Status: CLOSED CURRENTRELEASE
Product: oVirt
Classification: Community
Component: vdsm
Severity: urgent
Assigned To: Saveliev Peter
virt
Reported: 2012-03-26 10:36 EDT by Haim
Modified: 2014-01-12 19:51 EST

Doc Type: Bug Fix
Last Closed: 2013-04-17 06:57:15 EDT
Description Haim 2012-03-26 10:36:59 EDT
Description of problem:

Potential split brain when running the following scenario:

- run a VM
- cause the VM to enter the paused state
- using web-admin, stop the VM (or, using the API, call destroy)
- the destroy call returns with code: done, but the qemu process keeps on living on the source.

Analysis: there is a good chance this is a libvirt bug.

The following code was taken from vdsm/libvirtvm.py. In the logs, I see that we enter the exception handler, so here we assume (as libvirt reports) that the domain is not running; we log the exception and set the VM down, but qemu keeps on running.

1970     def releaseVm(self):
1971         """
1972         Stop VM and release all resources
1973         """
1974         with self._releaseLock:
1975             if self._released:
1976                 return {'status': doneCode}
1977 
1978             self.log.info('Release VM resources')
1979             self.lastStatus = 'Powering down'
1980             try:
1981                 if self._vmStats:
1982                     self._vmStats.stop()
1983                 if self.guestAgent:
1984                     self.guestAgent.stop()
1985                 if self._dom:
1986                     self._dom.destroy()
1987             except libvirt.libvirtError, e:
1988                 if e.get_error_code() == libvirt.VIR_ERR_NO_DOMAIN:
1989                     self.log.warning("libvirt domain not found", exc_info=True)
1990                 else:
1991                     self.log.warn("VM %s is not running", self.conf['vmId'])
1992 
1993             self.cif.ksmMonitor.adjust()
1994             self._cleanup()
1995 
1996             self.cif.irs.inappropriateDevices(self.id)
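For illustration only, here is a minimal sketch of how a release path could avoid reporting success while qemu is still alive. It is not the vdsm fix. `release_vm`, `FakeLibvirtError` and the `qemu_alive` callback are hypothetical stand-ins so the example runs without libvirt installed, and the constant merely plays the role of `libvirt.VIR_ERR_NO_DOMAIN`. The idea is to treat "domain not found" as a hint rather than proof, and to return done only after an independent check confirms the process is gone.

```python
# Stand-ins so the sketch runs without libvirt (assumptions, not vdsm code).
VIR_ERR_NO_DOMAIN = 42  # plays the role of libvirt.VIR_ERR_NO_DOMAIN


class FakeLibvirtError(Exception):
    """Hypothetical substitute for libvirt.libvirtError."""

    def __init__(self, code):
        super().__init__("libvirt error %d" % code)
        self._code = code

    def get_error_code(self):
        return self._code


def release_vm(dom, qemu_alive, error_cls=FakeLibvirtError):
    """Return 'done' only when the qemu process is confirmed gone.

    dom        -- object with a destroy() method (hypothetical domain handle)
    qemu_alive -- callable returning True while the qemu process still exists
    """
    try:
        dom.destroy()
    except error_cls as e:
        if e.get_error_code() != VIR_ERR_NO_DOMAIN:
            # Any other failure means destroy did not succeed: report it.
            return "error: destroy failed"
        # "Domain not found" is not trusted blindly; fall through to the check.
    if qemu_alive():
        return "error: qemu still running"
    return "done"
```

With this shape, the scenario in this report (libvirt raising "Domain not found" while qemu keeps running) would surface as an error to the caller instead of 'Machine destroyed'.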


Logs:


Thread-5161::DEBUG::2012-03-26 14:23:34,417::BindingXMLRPC::869::vds::(wrapper) client [10.35.97.30]::call vmDestroy with ('9e669b36-cf3b-4cef-81c7-5cd5a522bfcc',) {} flowID [38746fd3]
Thread-5161::INFO::2012-03-26 14:23:34,417::API::300::vds::(destroy) vmContainerLock acquired by vm 9e669b36-cf3b-4cef-81c7-5cd5a522bfcc
Thread-5161::DEBUG::2012-03-26 14:23:34,417::libvirtvm::2016::vm.Vm::(destroy) vmId=`9e669b36-cf3b-4cef-81c7-5cd5a522bfcc`::destroy Called
Thread-5161::INFO::2012-03-26 14:23:34,418::libvirtvm::1978::vm.Vm::(releaseVm) vmId=`9e669b36-cf3b-4cef-81c7-5cd5a522bfcc`::Release VM resources
Thread-5161::WARNING::2012-03-26 14:23:34,418::vm::327::vm.Vm::(_set_lastStatus) vmId=`9e669b36-cf3b-4cef-81c7-5cd5a522bfcc`::trying to set state to Powering down when already Down
Thread-5161::DEBUG::2012-03-26 14:23:34,418::utils::336::vm.Vm::(stop) vmId=`9e669b36-cf3b-4cef-81c7-5cd5a522bfcc`::Stop statistics collection
Thread-5161::DEBUG::2012-03-26 14:23:34,419::vmChannels::152::vds::(unregister) Delete fileno 22 from listener.
Thread-5161::WARNING::2012-03-26 14:23:34,421::libvirtvm::1989::vm.Vm::(releaseVm) vmId=`9e669b36-cf3b-4cef-81c7-5cd5a522bfcc`::libvirt domain not found
Traceback (most recent call last):
  File "/usr/share/vdsm/libvirtvm.py", line 1986, in releaseVm
    self._dom.destroy()
  File "/usr/share/vdsm/libvirtvm.py", line 490, in f
    ret = attr(*args, **kwargs)
  File "/usr/share/vdsm/libvirtconnection.py", line 82, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 658, in destroy
    if ret == -1: raise libvirtError ('virDomainDestroy() failed', dom=self)
libvirtError: Domain not found: no domain with matching uuid '9e669b36-cf3b-4cef-81c7-5cd5a522bfcc'
Thread-5161::DEBUG::2012-03-26 14:23:34,421::utils::602::Storage.Misc.excCmd::(execCmd) '/usr/bin/sudo -n /sbin/service ksmtuned retune' (cwd None)
Thread-5161::DEBUG::2012-03-26 14:23:34,456::utils::602::Storage.Misc.excCmd::(execCmd) FAILED: <err> = 'Unknown operation retune\n'; <rc> = 1
Thread-5161::DEBUG::2012-03-26 14:23:34,457::vmChannels::152::vds::(unregister) Delete fileno 22 from listener.
Thread-5161::DEBUG::2012-03-26 14:23:34,458::task::588::TaskManager.Task::(_updateState) Task=`b2c09668-9f5d-4660-b2e9-a41c0619fdf4`::moving from state init -> state preparing
Thread-5161::INFO::2012-03-26 14:23:34,458::logUtils::37::dispatcher::(wrapper) Run and protect: inappropriateDevices(thiefId='9e669b36-cf3b-4cef-81c7-5cd5a522bfcc')
Thread-5161::INFO::2012-03-26 14:23:34,461::logUtils::39::dispatcher::(wrapper) Run and protect: inappropriateDevices, Return response: None
Thread-5161::DEBUG::2012-03-26 14:23:34,461::task::1172::TaskManager.Task::(prepare) Task=`b2c09668-9f5d-4660-b2e9-a41c0619fdf4`::finished: None
Thread-5161::DEBUG::2012-03-26 14:23:34,462::task::588::TaskManager.Task::(_updateState) Task=`b2c09668-9f5d-4660-b2e9-a41c0619fdf4`::moving from state preparing -> state finished
Thread-5161::DEBUG::2012-03-26 14:23:34,462::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-5161::DEBUG::2012-03-26 14:23:34,462::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-5161::DEBUG::2012-03-26 14:23:34,463::task::978::TaskManager.Task::(_decref) Task=`b2c09668-9f5d-4660-b2e9-a41c0619fdf4`::ref 0 aborting False
Thread-5161::DEBUG::2012-03-26 14:23:34,463::libvirtvm::2011::vm.Vm::(deleteVm) vmId=`9e669b36-cf3b-4cef-81c7-5cd5a522bfcc`::Total desktops after destroy of 9e669b36-cf3b-4cef-81c7-5cd5a522bfcc is 0
Thread-5161::DEBUG::2012-03-26 14:23:34,464::BindingXMLRPC::875::vds::(wrapper) return vmDestroy with {'status': {'message': 'Machine destroyed', 'code': 0}}
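The log ends with 'Machine destroyed' although qemu was observed to be still running. One way to confirm that on the host is to look for a process whose command line carries the VM UUID. The sketch below is a hypothetical helper, not part of vdsm: the name `find_pids_by_uuid` and the `proc_root` parameter are made up for illustration, and it assumes a Linux /proc layout and that the UUID appears as its own argument on the qemu command line (libvirt normally passes it via -uuid).

```python
import os


def find_pids_by_uuid(vm_uuid, proc_root="/proc"):
    """Return sorted PIDs whose command line has vm_uuid as an argument."""
    needle = vm_uuid.encode()
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                # cmdline holds the arguments separated by NUL bytes
                args = f.read().split(b"\0")
        except OSError:
            continue  # process exited meanwhile or is not readable
        if needle in args:
            pids.append(int(entry))
    return sorted(pids)
```

A non-empty result for '9e669b36-cf3b-4cef-81c7-5cd5a522bfcc' after vmDestroy returned would match the behaviour described in this report.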
Comment 1 Saveliev Peter 2013-04-15 07:07:40 EDT
I cannot reproduce it on libvirt build 0.10.2-4.3, so it may be worth installing updates. Or maybe not: it depends on which version you are using now.

Can you please specify the versions of

vdsm
libvirt
qemu

that you used when you hit this issue?
