Bug 870928 - [rhevm] Failed to create live snapshot while powering off a VM with a Windows OS
Summary: [rhevm] Failed to create live snapshot while powering off a VM with a Windows OS
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.1.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.2.0
Assignee: Maor
QA Contact: vvyazmin@redhat.com
URL:
Whiteboard: storage
Duplicates: 1064419
Depends On:
Blocks: 881060 915537
 
Reported: 2012-10-29 08:27 UTC by vvyazmin@redhat.com
Modified: 2016-02-10 20:44 UTC
CC List: 13 users

Fixed In Version: sf3
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 881060
Environment:
Last Closed: 2013-06-11 09:16:49 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments
## Logs vdsm, rhevm (1.03 MB, application/x-gzip)
2012-10-29 08:27 UTC, vvyazmin@redhat.com

Description vvyazmin@redhat.com 2012-10-29 08:27:26 UTC
Created attachment 634920 [details]
## Logs vdsm, rhevm

Description of problem:
Live snapshot creation fails when the VM (running a Windows guest) is powered off during the snapshot operation.

Version-Release number of selected component (if applicable):
RHEVM 3.1 - SI22

RHEVM: rhevm-3.1.0-22.el6ev.noarch
VDSM: vdsm-4.9.6-39.0.el6_3.x86_64
LIBVIRT: libvirt-0.9.10-21.el6_3.5.x86_64
QEMU & KVM: qemu-kvm-rhev-0.12.1.2-2.295.el6_3.2.x86_64
SANLOCK: sanlock-2.3-4.el6_3.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a VM with a Windows OS installed.
2. Run the VM.
3. Log in to the VM via an RDP connection.
4. Start creating a live snapshot.
5. While the live snapshot is in progress, power off the VM from within the guest (via the RDP session).
  
Actual results:
Live snapshot creation fails.

Expected results:
The live snapshot is created successfully.

Additional info:

/var/log/vdsm/vdsm.log 

Thread-14713::DEBUG::2012-10-28 19:17:12,029::libvirtvm::1851::vm.Vm::(snapshot) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::Snapshot failed using the quiesce flag, trying again without it (internal error Guest agent not available for now)
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,029::libvirtvm::2498::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::event Saved detail 0 opaque None
Thread-9631::ERROR::2012-10-28 19:17:12,029::utils::398::vm.Vm::(collect) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::Stats function failed: <AdvancedStatsFunction _sampleNet at 0x2d0e8b8>
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 394, in collect
    statsFunction()
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 271, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/libvirtvm.py", line 182, in _sampleNet
    netSamples[nic.name] = self._vm._dom.interfaceStats(nic.name)
  File "/usr/share/vdsm/libvirtvm.py", line 509, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 83, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1734, in interfaceStats
    if ret is None: raise libvirtError ('virDomainInterfaceStats() failed', dom=self)
libvirtError: internal error /proc/net/dev: Interface not found
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,496::libvirtvm::2498::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::event Stopped detail 0 opaque None
Thread-14713::ERROR::2012-10-28 19:17:12,496::libvirtvm::1854::vm.Vm::(snapshot) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::Unable to take snapshot
Traceback (most recent call last):
  File "/usr/share/vdsm/libvirtvm.py", line 1836, in snapshot
    self._dom.snapshotCreateXML(snapxml, snapFlags)
  File "/usr/share/vdsm/libvirtvm.py", line 509, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 83, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1513, in snapshotCreateXML
    if ret is None:raise libvirtError('virDomainSnapshotCreateXML() failed', dom=self)
libvirtError: Domain not found: no domain with matching uuid '67ca9c19-fd90-4b2f-8389-f1f86e969c98'
libvirtEventLoop::INFO::2012-10-28 19:17:12,497::vm::637::vm.Vm::(_onQemuDeath) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::underlying process disconnected
Thread-14713::DEBUG::2012-10-28 19:17:12,497::BindingXMLRPC::880::vds::(wrapper) return vmSnapshot with {'status': {'message': 'Snapshot failed', 'code': 48}}
libvirtEventLoop::INFO::2012-10-28 19:17:12,498::libvirtvm::2122::vm.Vm::(releaseVm) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::Release VM resources
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,505::utils::335::vm.Vm::(stop) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::Stop statistics collection
Thread-9631::DEBUG::2012-10-28 19:17:12,505::utils::366::vm.Vm::(run) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::Stats thread finished
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,506::vmChannels::152::vds::(unregister) Delete fileno 45 from listener.
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,508::__init__::1164::Storage.Misc.excCmd::(_log) '/usr/bin/sudo -n /sbin/service ksmtuned retune' (cwd None)
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,551::__init__::1164::Storage.Misc.excCmd::(_log) SUCCESS: <err> = ''; <rc> = 0
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,552::vmChannels::152::vds::(unregister) Delete fileno 45 from listener.
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,554::task::588::TaskManager.Task::(_updateState) Task=`1cd39d60-dde6-4646-bb62-6e1a7e4f7332`::moving from state init -> state preparing
libvirtEventLoop::INFO::2012-10-28 19:17:12,554::logUtils::37::dispatcher::(wrapper) Run and protect: inappropriateDevices(thiefId='67ca9c19-fd90-4b2f-8389-f1f86e969c98')
libvirtEventLoop::INFO::2012-10-28 19:17:12,557::logUtils::39::dispatcher::(wrapper) Run and protect: inappropriateDevices, Return response: None
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,557::task::1172::TaskManager.Task::(prepare) Task=`1cd39d60-dde6-4646-bb62-6e1a7e4f7332`::finished: None
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,558::task::588::TaskManager.Task::(_updateState) Task=`1cd39d60-dde6-4646-bb62-6e1a7e4f7332`::moving from state preparing -> state finished
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,558::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,558::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,559::task::978::TaskManager.Task::(_decref) Task=`1cd39d60-dde6-4646-bb62-6e1a7e4f7332`::ref 0 aborting False
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,565::vm::929::vm.Vm::(setDownStatus) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::Changed state to Down: User shut down
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,566::vmChannels::152::vds::(unregister) Delete fileno 45 from listener.
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,566::utils::335::vm.Vm::(stop) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::Stop statistics collection
Thread-14721::DEBUG::2012-10-28 19:17:13,975::BindingXMLRPC::874::vds::(wrapper) client [10.35.97.56]::call vmGetStats with ('67ca9c19-fd90-4b2f-8389-f1f86e969c98',) {} flowID [5d2d0516]
Thread-14721::DEBUG::2012-10-28 19:17:13,975::BindingXMLRPC::880::vds::(wrapper) return vmGetStats with {'status': {'message': 'Done', 'code': 0}, 'statsList': [{'status': 'Down', 'hash': '4503720284452207628', 'exitMessage': 'User shut down', 'vmId': '67ca9c19-fd90-4b2f-8389-f1f86e969c98', 'timeOffset': '-2', 'exitCode': 0}]}
Thread-14722::DEBUG::2012-10-28 19:17:14,003::BindingXMLRPC::874::vds::(wrapper) client [10.35.97.56]::call vmDestroy with ('67ca9c19-fd90-4b2f-8389-f1f86e969c98',) {} flowID [5d2d0516]
Thread-14722::INFO::2012-10-28 19:17:14,003::API::301::vds::(destroy) vmContainerLock acquired by vm 67ca9c19-fd90-4b2f-8389-f1f86e969c98
Thread-14722::DEBUG::2012-10-28 19:17:14,003::libvirtvm::2167::vm.Vm::(destroy) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::destroy Called
Thread-14722::DEBUG::2012-10-28 19:17:14,004::libvirtvm::2162::vm.Vm::(deleteVm) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::Total desktops after destroy of 67ca9c19-fd90-4b2f-8389-f1f86e969c98 is 0
Thread-14722::DEBUG::2012-10-28 19:17:14,004::BindingXMLRPC::880::vds::(wrapper) return vmDestroy with {'status': {'message': 'Machine destroyed', 'code': 0}}


/var/log/ovirt-engine/engine.log

2012-10-28 18:54:33,760 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand] (pool-4-thread-41) START, SnapshotVDSCommand(HostName = cougar05.scl.lab.tlv.redhat.com, HostId = 196636e2-1db6-11e2-8fdb-001a4a169738, vmId=67ca9c19-fd90-4b2f-8389-f1f86e969c98), log id: 5483ddca
2012-10-28 18:54:44,911 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-41) Failed in SnapshotVDS method
2012-10-28 18:54:44,912 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-41) Error code SNAPSHOT_FAILED and error message VDSGenericException: VDSErrorException: Failed to SnapshotVDS, error = Snapshot failed
2012-10-28 18:54:44,915 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-41) Command org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand return value
 Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusOnlyReturnForXmlRpc
mStatus                       Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc
mCode                         48
mMessage                      Snapshot failed


2012-10-28 18:54:44,915 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-41) HostName = cougar05.scl.lab.tlv.redhat.com
2012-10-28 18:54:44,915 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-4-thread-41) Command SnapshotVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to SnapshotVDS, error = Snapshot failed
2012-10-28 18:54:44,915 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand] (pool-4-thread-41) FINISH, SnapshotVDSCommand, log id: 5483ddca
2012-10-28 18:54:44,917 ERROR [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand] (pool-4-thread-41) Wasnt able to live snpashot due to error: VdcBLLException: VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to SnapshotVDS, error = Snapshot failed, rolling back.
2012-10-28 18:54:44,920 ERROR [org.ovirt.engine.core.bll.CreateSnapshotCommand] (pool-4-thread-41) [2f41efed] Ending command with failure: org.ovirt.engine.core.bll.CreateSnapshotCommand

Comment 2 Allon Mureinik 2012-10-29 12:08:37 UTC
Engine flow is as follows:
foreach disk in VM
 create a new volume
wait for all the tasks to update

once done:
database updates
if (vm is up)
 take a live snapshot


Here, the live snapshot is redundant.
However, this flow is by definition racy - the VM can always be killed between the check and the snapshot.

What needs to change is that if the live snapshot fails and the VM is actually down, the command should just end with a warning instead of undoing the entire operation.
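
To make the race concrete, here is a minimal, self-contained Java sketch of the check-then-snapshot window described above. All names (isUp, takeLiveSnapshot, and so on) are invented for illustration and are not the engine's API; the guest shutdown is simulated with a thread that flips the VM state between the check and the snapshot call.

public class LiveSnapshotRaceSketch {

    // Hypothetical stand-in for the VM's runtime state; not engine code.
    static volatile boolean vmRunning = true;

    static boolean isUp() {
        return vmRunning;
    }

    static void takeLiveSnapshot() {
        if (!vmRunning) {
            // Mirrors the VDSM reply seen in the logs: 'Snapshot failed', code 48.
            throw new IllegalStateException("Snapshot failed: domain is gone");
        }
        System.out.println("Live snapshot taken.");
    }

    public static void main(String[] args) throws InterruptedException {
        // The guest user powers the VM off concurrently; here the shutdown is
        // forced to land exactly inside the check-then-act window.
        Thread guestShutdown = new Thread(() -> vmRunning = false);

        if (isUp()) {                 // engine checks that the VM is up...
            guestShutdown.start();    // ...the user shuts it down in the meantime...
            guestShutdown.join();
            try {
                takeLiveSnapshot();   // ...and the live snapshot call fails.
            } catch (IllegalStateException e) {
                System.out.println("Race hit: " + e.getMessage());
            }
        }
    }
}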

Comment 4 Maor 2012-11-21 15:07:24 UTC
Pushed a proposed solution upstream at: http://gerrit.ovirt.org/9390
The proposed solution adds an audit log warning and an engine log warning stating that the snapshot could not be configured for the VM, but the operation now ends successfully without rollback.
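
For illustration only, a minimal Java sketch of the behaviour this change aims for; the names (auditLogWarning, endSnapshotCommand, the Outcome enum) are invented here and are not the actual engine API, nor the code in the gerrit change. The point is the decision: when the live-snapshot step fails but the VM is already down, record a warning and end successfully instead of rolling back the volumes that were already created.

public class LiveSnapshotWarningSketch {

    enum Outcome { SUCCEEDED, SUCCEEDED_WITH_WARNING, FAILED_ROLLED_BACK }

    // Stand-in for the engine's audit log; here it simply prints the record.
    static void auditLogWarning(String message) {
        System.out.println("AUDIT[WARN]: " + message);
    }

    static Outcome endSnapshotCommand(boolean liveSnapshotOk, boolean vmIsDown) {
        if (liveSnapshotOk) {
            return Outcome.SUCCEEDED;
        }
        if (vmIsDown) {
            // The per-disk volumes were already created; only the live part is missing,
            // so warn and keep the snapshot rather than undoing the whole command.
            auditLogWarning("Snapshot could not be configured for the VM "
                    + "(VM was powered off during the operation); no rollback performed.");
            return Outcome.SUCCEEDED_WITH_WARNING;
        }
        // Live snapshot failed while the VM is still up: genuine error, roll back.
        return Outcome.FAILED_ROLLED_BACK;
    }

    public static void main(String[] args) {
        // The scenario from this bug: the live snapshot failed because the guest shut down.
        System.out.println(endSnapshotCommand(false, true));
    }
}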

Comment 6 vvyazmin@redhat.com 2013-01-09 11:25:25 UTC
No issues found.

Verified on RHEVM 3.2 - SF03 environment:

RHEVM: rhevm-3.2.0-4.el6ev.noarch
VDSM: vdsm-4.10.2-3.0.el6ev.x86_64
LIBVIRT: libvirt-0.10.2-13.el6.x86_64
QEMU & KVM: qemu-kvm-rhev-0.12.1.2-2.348.el6.x86_64
SANLOCK: sanlock-2.6-2.el6.x86_64

Comment 7 Itamar Heim 2013-06-11 09:16:49 UTC
3.2 has been released

Comment 9 Daniel Erez 2014-02-19 08:32:19 UTC
*** Bug 1064419 has been marked as a duplicate of this bug. ***

Comment 10 Daniel Erez 2014-07-15 07:01:02 UTC
*** Bug 1119203 has been marked as a duplicate of this bug. ***

