
Bug 1205642

Summary: Live Delete / Merge of Base snapshot on VM with BLOCK disks fail
Product: Red Hat Enterprise Virtualization Manager
Reporter: Kevin Alon Goldblatt <kgoldbla>
Component: ovirt-engine
Assignee: Adam Litke <alitke>
Status: CLOSED DUPLICATE
QA Contact: Kevin Alon Goldblatt <kgoldbla>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 3.5.1
CC: acanan, amureini, ecohen, gklein, lpeer, lsurette, rbalakri, Rhev-m-bugs, yeylon
Target Milestone: ---
Target Release: 3.5.1
Hardware: x86_64
OS: Unspecified
Whiteboard: storage
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-03-26 20:49:26 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1196199
Attachments:
server, engine and vdsm logs

Description Kevin Alon Goldblatt 2015-03-25 11:32:28 UTC
Description of problem:
Live Delete / Merge of the Base snapshot (the first one created) fails on a VM with BLOCK disks


Version-Release number of selected component (if applicable):
v3.5.1 vt14.1 with specially compiled libvirt rpms:
http://download.devel.redhat.com/brewroot/packages/libvirt/1.2.8/16.el7_1.2/x86_64/
rhevm-3.5.1-0.2.el6ev.noarch
vdsm-4.16.12.1-3.el7ev.x86_64


How reproducible:
Tried once

Steps to Reproduce:
1. Create a VM with 3 BLOCK disks (1 thin and 2 preallocated)
2. Start the VM
3. Create file systems on all disks, write file1 to all disks, create snapshot1
4. Write file2 to all disks, create snapshot2
5. Write file3 to all disks, create snapshot3
6. Power off the VM
7. Custom preview (with memory) each of the 3 snapshots consecutively, then undo the preview
8. Start the VM again
9. Delete snapshot2 - successful
10. Power off the VM, custom preview (with memory) both remaining snapshots consecutively, then undo the preview
11. Start the VM again
12. Delete snapshot3 - successful
13. Power off the VM, custom preview (with memory) the last remaining snapshot, then undo the preview
14. Start the VM again
15. Delete snapshot1 - FAILS (see the libvirt call sketch below)
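
For orientation, here is a minimal libvirt-python sketch of roughly what the live merge triggered in step 15 boils down to: an active layer commit followed by a pivot. The connection URI, domain name and disk alias ('vda') are placeholders, and the real flow goes through engine and vdsm rather than direct libvirt calls.

import time
import libvirt

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('vm_test1')   # placeholder domain name

# Commit the active layer down into its backing volume (the base snapshot's volume).
dom.blockCommit('vda', None, None, 0,
                libvirt.VIR_DOMAIN_BLOCK_COMMIT_ACTIVE)

# Wait until the commit job reports it is ready to pivot (cur == end).
while True:
    info = dom.blockJobInfo('vda', 0)
    if not info or info.get('cur') == info.get('end'):
        break
    time.sleep(1)

# Pivot onto the merged volume.  After this call returns, the domain XML is
# expected to point at the merged volume's disk path.
dom.blockJobAbort('vda', libvirt.VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)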


Actual results:
Live deletion of the base snapshot fails

Expected results:
Should be able to delete the snapshot successfully


Additional info:
Engine log:
2015-03-24 20:11:19,631 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-8) Correlation ID: 5e7b6c9f, Job ID: d79ab6e5-0e4c-467d-899a-986df506c486, Call Stack: null, Custom Event ID: -1, Message: Snapshot 'snap111' deletion for VM 'vm_test1' was initiated by admin@internal.
.
.
.
2015-03-24 20:11:19,631 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-8) Correlation ID: 5e7b6c9f, Job ID: d79ab6e5-0e4c-467d-899a-986df506c486, Call Stack: null, Custom Event ID: -1, Message: Snapshot 'snap111' deletion for VM 'vm_test1' was initiated by admin@internal.

Comment 1 Kevin Alon Goldblatt 2015-03-25 12:50:38 UTC
Created attachment 1006309 [details]
server, engine and vdsm logs

Adding Logs

Comment 2 Adam Litke 2015-03-26 20:49:26 UTC
After looking at this bug and Bug 1155583, I have determined that they both have the same root cause.

In the vdsm log I see messages like:

Thread-50::ERROR::2015-03-26 10:53:09,314::sampling::488::vm.Vm::(collect) vmId=`b3fbc637-9cc9-4b15-ba94-5a2ee1607785`::Stats function failed: <AdvancedStatsFunction _highWrite at 0x3133870>
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/sampling.py", line 484, in collect
    statsFunction()
  File "/usr/share/vdsm/virt/sampling.py", line 359, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/virt/vm.py", line 292, in _highWrite
    self._vm.extendDrivesIfNeeded()
  File "/usr/share/vdsm/virt/vm.py", line 2537, in extendDrivesIfNeeded
    extend = [x for x in self._getExtendCandidates()
  File "/usr/share/vdsm/virt/vm.py", line 2489, in _getExtendCandidates
    capacity, alloc, physical = self._dom.blockInfo(drive.path, 0)
  File "/usr/share/vdsm/virt/vm.py", line 689, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 646, in blockInfo
    if ret is None: raise libvirtError ('virDomainGetBlockInfo() failed', dom=self)
libvirtError: invalid argument: invalid path /rhev/data-center/dee92f86-2fef-4edf-ac51-135d12646bde/01f563fa-1aef-41ab-92d3-0d6561d4a731/images/31cfd267-e627-4cfe-8f33-f66b0db627c3/1863e178-c40a-42eb-803a-3d1f1ac6658a not assigned to domain
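
For context, the _highWrite stats function in the traceback is vdsm's periodic watermark check that decides when a thinly provisioned block volume needs to be extended. A simplified, illustrative sketch of that kind of check (placeholder domain name, path and threshold; not vdsm's actual code):

import libvirt

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('vm_test1')                                  # placeholder domain name
drive_path = '/rhev/data-center/<pool>/<domain>/images/<img>/<vol>'  # placeholder path

# blockInfo() returns (capacity, allocation, physical) for the given disk path.
capacity, alloc, physical = dom.blockInfo(drive_path, 0)

# When the qcow2 write watermark (alloc) gets close to the size of the
# underlying logical volume (physical), vdsm requests an LV extension.
THRESHOLD = 512 * 1024 * 1024   # placeholder free-space threshold, in bytes
if physical - alloc < THRESHOLD:
    print('drive %s needs extension' % drive_path)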

After talking to Eric Blake on IRC, it looks like this is another libvirt problem where the domain XML is not updated before the virDomainBlockJobAbort API (used for pivoting an active layer merge) returns. As a result, our volume chain sync code gets a stale value for the disk path, which causes _highWrite to fail in this manner.
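
A rough sketch of the sync that trips over this (placeholder names and paths; not the actual vdsm code): after the pivot returns, the current disk path has to be re-learned from the live domain XML, and calling blockInfo() with the pre-pivot path is exactly what produces the 'not assigned to domain' error above.

import xml.etree.ElementTree as ET
import libvirt

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('vm_test1')                                          # placeholder domain name
old_path = '/rhev/data-center/<pool>/<domain>/images/<img>/<old-vol>'        # path cached before the pivot

# Re-read the live domain XML after blockJobAbort(..., PIVOT) has returned.
root = ET.fromstring(dom.XMLDesc(0))
current_paths = [src.get('dev') or src.get('file')
                 for src in root.findall('./devices/disk/source')]

if old_path not in current_paths:
    # The cached path is no longer (or not yet) assigned to the domain;
    # dom.blockInfo(old_path, 0) at this point raises the libvirtError seen above.
    print('volume chain is stale, resync from:', current_paths)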

*** This bug has been marked as a duplicate of bug 1155583 ***