Bug 870928
Summary: [rhevm] Failed create Live-Snapshot during power off VM with Windows OS
Product: Red Hat Enterprise Virtualization Manager
Component: ovirt-engine
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: unspecified
Version: 3.1.0
Target Milestone: ---
Target Release: 3.2.0
Hardware: x86_64
OS: Linux
Whiteboard: storage
Fixed In Version: sf3
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 881060 (view as bug list)
Environment:
Last Closed: 2013-06-11 09:16:49 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Reporter: vvyazmin <vvyazmin>
Assignee: Maor <mlipchuk>
QA Contact: vvyazmin <vvyazmin>
Docs Contact:
CC: abaron, amureini, cpelland, dyasny, ebenahar, hateya, iheim, lpeer, mkenneth, ratamir, Rhev-m-bugs, yeylon, ykaul
Keywords: ZStream
Story Points: ---
Bug Depends On:
Bug Blocks: 881060, 915537
Attachments:
Engine flow is as follows:

    foreach disk in VM:
        create a new volume
    wait for all the tasks to update
    once done:
        database updates
        if (vm is up):
            take a live snapshot

Here, the live snapshot is redundant. However, this flow is by definition racy - the VM can always be killed between the check and the snapshot. What needs to be checked is that if the snapshot fails and the VM is actually down, the command just ends with a warning and doesn't undo the entire operation.

Pushed proposed solution to upstream at: http://gerrit.ovirt.org/9390

The proposed solution adds an audit log warning and a warning log entry that the snapshot could not be configured for the VM, but the operation ends successfully without rollback.

No issues found.

Verified on RHEVM 3.2 - SF03 environment:
RHEVM: rhevm-3.2.0-4.el6ev.noarch
VDSM: vdsm-4.10.2-3.0.el6ev.x86_64
LIBVIRT: libvirt-0.10.2-13.el6.x86_64
QEMU & KVM: qemu-kvm-rhev-0.12.1.2-2.348.el6.x86_64
SANLOCK: sanlock-2.6-2.el6.x86_64

3.2 has been released

*** Bug 1064419 has been marked as a duplicate of this bug. ***

*** Bug 1119203 has been marked as a duplicate of this bug. ***
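The race-tolerant flow can be sketched as follows. This is a minimal, self-contained Python sketch with hypothetical names (SnapshotFailed, create_all_snapshots, RacyVDSM, WindowsVM are all made up for illustration); the actual fix is Java code in ovirt-engine's CreateAllSnapshotsFromVmCommand, per gerrit change 9390:

```python
class SnapshotFailed(Exception):
    """Stands in for VDSM's 'Snapshot failed' error (code 48)."""


def create_all_snapshots(vm, vdsm, audit_log):
    # Phase 1: create a new volume for each disk; in the real engine these
    # are async tasks that are waited on, then the database is updated.
    for disk in vm.disks:
        vdsm.create_volume(vm, disk)

    # Phase 2: a running VM also gets a live snapshot. The up-check and
    # the snapshot call are not atomic, so the VM can be powered off
    # in between -- this is the race described above.
    if vm.is_up():
        try:
            vdsm.live_snapshot(vm)
        except SnapshotFailed:
            if vm.is_up():
                raise  # genuine failure while the VM is still up: roll back
            # The VM went down mid-flight: the new volumes already form a
            # valid offline snapshot, so warn and end successfully
            # instead of rolling back the entire command.
            audit_log.append("live snapshot not configured for %s" % vm.name)
    return "success"


# Fakes that simulate the race: the VM dies during the snapshot call.
class RacyVDSM:
    def create_volume(self, vm, disk):
        pass

    def live_snapshot(self, vm):
        vm.up = False  # the power-off wins the race
        raise SnapshotFailed()


class WindowsVM:
    def __init__(self):
        self.name, self.disks, self.up = "win-vm", ["disk1", "disk2"], True

    def is_up(self):
        return self.up
```

With the fakes above, the command now finishes with a warning instead of failing and rolling back, which matches the behavior described for the fix.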
Created attachment 634920 [details]
Logs vdsm, rhevm

Description of problem:
Failed to create a Live-Snapshot during power off of a VM with a Windows OS

Version-Release number of selected component (if applicable):
RHEVM 3.1 - SI22
RHEVM: rhevm-3.1.0-22.el6ev.noarch
VDSM: vdsm-4.9.6-39.0.el6_3.x86_64
LIBVIRT: libvirt-0.9.10-21.el6_3.5.x86_64
QEMU & KVM: qemu-kvm-rhev-0.12.1.2-2.295.el6_3.2.x86_64
SANLOCK: sanlock-2.3-4.el6_3.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a VM with a Windows OS installed
2. Run the VM
3. Log in to the VM via an RDP connection
4. Create a Live-Snapshot
5. During the Live-Snapshot, power off the VM via the RDP connection

Actual results:
Live-Snapshot creation fails

Expected results:
Live-Snapshot creation succeeds

Additional info:

/var/log/vdsm/vdsm.log

Thread-14713::DEBUG::2012-10-28 19:17:12,029::libvirtvm::1851::vm.Vm::(snapshot) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::Snapshot failed using the quiesce flag, trying again without it (internal error Guest agent not available for now)
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,029::libvirtvm::2498::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::event Saved detail 0 opaque None
Thread-9631::ERROR::2012-10-28 19:17:12,029::utils::398::vm.Vm::(collect) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::Stats function failed: <AdvancedStatsFunction _sampleNet at 0x2d0e8b8>
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 394, in collect
    statsFunction()
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 271, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/libvirtvm.py", line 182, in _sampleNet
    netSamples[nic.name] = self._vm._dom.interfaceStats(nic.name)
  File "/usr/share/vdsm/libvirtvm.py", line 509, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 83, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1734, in interfaceStats
    if ret is None: raise libvirtError ('virDomainInterfaceStats() failed', dom=self)
libvirtError: internal error /proc/net/dev: Interface not found
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,496::libvirtvm::2498::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::event Stopped detail 0 opaque None
Thread-14713::ERROR::2012-10-28 19:17:12,496::libvirtvm::1854::vm.Vm::(snapshot) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::Unable to take snapshot
Traceback (most recent call last):
  File "/usr/share/vdsm/libvirtvm.py", line 1836, in snapshot
    self._dom.snapshotCreateXML(snapxml, snapFlags)
  File "/usr/share/vdsm/libvirtvm.py", line 509, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 83, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1513, in snapshotCreateXML
    if ret is None:raise libvirtError('virDomainSnapshotCreateXML() failed', dom=self)
libvirtError: Domain not found: no domain with matching uuid '67ca9c19-fd90-4b2f-8389-f1f86e969c98'
libvirtEventLoop::INFO::2012-10-28 19:17:12,497::vm::637::vm.Vm::(_onQemuDeath) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::underlying process disconnected
Thread-14713::DEBUG::2012-10-28 19:17:12,497::BindingXMLRPC::880::vds::(wrapper) return vmSnapshot with {'status': {'message': 'Snapshot failed', 'code': 48}}
libvirtEventLoop::INFO::2012-10-28 19:17:12,498::libvirtvm::2122::vm.Vm::(releaseVm) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::Release VM resources
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,505::utils::335::vm.Vm::(stop) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::Stop statistics collection
Thread-9631::DEBUG::2012-10-28 19:17:12,505::utils::366::vm.Vm::(run) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::Stats thread finished
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,506::vmChannels::152::vds::(unregister) Delete fileno 45 from listener.
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,508::__init__::1164::Storage.Misc.excCmd::(_log) '/usr/bin/sudo -n /sbin/service ksmtuned retune' (cwd None)
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,551::__init__::1164::Storage.Misc.excCmd::(_log) SUCCESS: <err> = ''; <rc> = 0
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,552::vmChannels::152::vds::(unregister) Delete fileno 45 from listener.
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,554::task::588::TaskManager.Task::(_updateState) Task=`1cd39d60-dde6-4646-bb62-6e1a7e4f7332`::moving from state init -> state preparing
libvirtEventLoop::INFO::2012-10-28 19:17:12,554::logUtils::37::dispatcher::(wrapper) Run and protect: inappropriateDevices(thiefId='67ca9c19-fd90-4b2f-8389-f1f86e969c98')
libvirtEventLoop::INFO::2012-10-28 19:17:12,557::logUtils::39::dispatcher::(wrapper) Run and protect: inappropriateDevices, Return response: None
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,557::task::1172::TaskManager.Task::(prepare) Task=`1cd39d60-dde6-4646-bb62-6e1a7e4f7332`::finished: None
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,558::task::588::TaskManager.Task::(_updateState) Task=`1cd39d60-dde6-4646-bb62-6e1a7e4f7332`::moving from state preparing -> state finished
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,558::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,558::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,559::task::978::TaskManager.Task::(_decref) Task=`1cd39d60-dde6-4646-bb62-6e1a7e4f7332`::ref 0 aborting False
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,565::vm::929::vm.Vm::(setDownStatus) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::Changed state to Down: User shut down
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,566::vmChannels::152::vds::(unregister) Delete fileno 45 from listener.
libvirtEventLoop::DEBUG::2012-10-28 19:17:12,566::utils::335::vm.Vm::(stop) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::Stop statistics collection
Thread-14721::DEBUG::2012-10-28 19:17:13,975::BindingXMLRPC::874::vds::(wrapper) client [10.35.97.56]::call vmGetStats with ('67ca9c19-fd90-4b2f-8389-f1f86e969c98',) {} flowID [5d2d0516]
Thread-14721::DEBUG::2012-10-28 19:17:13,975::BindingXMLRPC::880::vds::(wrapper) return vmGetStats with {'status': {'message': 'Done', 'code': 0}, 'statsList': [{'status': 'Down', 'hash': '4503720284452207628', 'exitMessage': 'User shut down', 'vmId': '67ca9c19-fd90-4b2f-8389-f1f86e969c98', 'timeOffset': '-2', 'exitCode': 0}]}
Thread-14722::DEBUG::2012-10-28 19:17:14,003::BindingXMLRPC::874::vds::(wrapper) client [10.35.97.56]::call vmDestroy with ('67ca9c19-fd90-4b2f-8389-f1f86e969c98',) {} flowID [5d2d0516]
Thread-14722::INFO::2012-10-28 19:17:14,003::API::301::vds::(destroy) vmContainerLock acquired by vm 67ca9c19-fd90-4b2f-8389-f1f86e969c98
Thread-14722::DEBUG::2012-10-28 19:17:14,003::libvirtvm::2167::vm.Vm::(destroy) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::destroy Called
Thread-14722::DEBUG::2012-10-28 19:17:14,004::libvirtvm::2162::vm.Vm::(deleteVm) vmId=`67ca9c19-fd90-4b2f-8389-f1f86e969c98`::Total desktops after destroy of 67ca9c19-fd90-4b2f-8389-f1f86e969c98 is 0
Thread-14722::DEBUG::2012-10-28 19:17:14,004::BindingXMLRPC::880::vds::(wrapper) return vmDestroy with {'status': {'message': 'Machine destroyed', 'code': 0}}

/var/log/ovirt-engine/engine.log

2012-10-28 18:54:33,760 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand] (pool-4-thread-41) START, SnapshotVDSCommand(HostName = cougar05.scl.lab.tlv.redhat.com, HostId = 196636e2-1db6-11e2-8fdb-001a4a169738, vmId=67ca9c19-fd90-4b2f-8389-f1f86e969c98), log id: 5483ddca
2012-10-28 18:54:44,911 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-41) Failed in SnapshotVDS method
2012-10-28 18:54:44,912 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-41) Error code SNAPSHOT_FAILED and error message VDSGenericException: VDSErrorException: Failed to SnapshotVDS, error = Snapshot failed
2012-10-28 18:54:44,915 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-41) Command org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand return value
Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusOnlyReturnForXmlRpc
mStatus Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc
mCode 48
mMessage Snapshot failed
2012-10-28 18:54:44,915 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-41) HostName = cougar05.scl.lab.tlv.redhat.com
2012-10-28 18:54:44,915 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-4-thread-41) Command SnapshotVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to SnapshotVDS, error = Snapshot failed
2012-10-28 18:54:44,915 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand] (pool-4-thread-41) FINISH, SnapshotVDSCommand, log id: 5483ddca
2012-10-28 18:54:44,917 ERROR [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand] (pool-4-thread-41) Wasnt able to live snpashot due to error: VdcBLLException: VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to SnapshotVDS, error = Snapshot failed, rolling back.
2012-10-28 18:54:44,920 ERROR [org.ovirt.engine.core.bll.CreateSnapshotCommand] (pool-4-thread-41) [2f41efed] Ending command with failure: org.ovirt.engine.core.bll.CreateSnapshotCommand