
Bug 1205642

Summary: Live Delete / Merge of Base snapshot on VM with BLOCK disks fail
Product: Red Hat Enterprise Virtualization Manager
Reporter: Kevin Alon Goldblatt <kgoldbla>
Component: ovirt-engine
Assignee: Adam Litke <alitke>
Status: CLOSED DUPLICATE
QA Contact: Kevin Alon Goldblatt <kgoldbla>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 3.5.1
CC: acanan, amureini, ecohen, gklein, lpeer, lsurette, rbalakri, Rhev-m-bugs, yeylon
Target Milestone: ---
Target Release: 3.5.1
Hardware: x86_64
OS: Unspecified
Whiteboard: storage
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-03-26 20:49:26 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1196199
Attachments:
server, engine and vdsm logs

Description Kevin Alon Goldblatt 2015-03-25 11:32:28 UTC
Description of problem:
Live Delete / Merge of the Base snapshot (the first one created) fails on a VM with BLOCK disks


Version-Release number of selected component (if applicable):
v3.5.1 vt14.1 with specially compiled libvirt rpms:
http://download.devel.redhat.com/brewroot/packages/libvirt/1.2.8/16.el7_1.2/x86_64/
rhevm-3.5.1-0.2.el6ev.noarch
vdsm-4.16.12.1-3.el7ev.x86_64


How reproducible:
Tried once

Steps to Reproduce:
1. Create a VM with 3 BLOCK disks (1 thin and 2 preallocated)
2. Start the VM
3. Create file systems on all disks, write file1 to all disks, create snapshot1
4. Write file2 to all disks, create snapshot2
5. Write file3 to all disks, create snapshot3
6. Power off the VM
7. Custom preview (with memory) each of the 3 snapshots consecutively, then undo the preview
8. Start the VM again
9. Delete snapshot2 - successful
10. Power off the VM, custom preview (with memory) both remaining snapshots consecutively, then undo the preview
11. Start the VM again
12. Delete snapshot3 - successful
13. Power off the VM, custom preview (with memory) the last remaining snapshot, then undo the preview
14. Start the VM again
15. Delete snapshot1 - FAILS (see the libvirt call sketch below)
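
For orientation, here is a minimal libvirt-python sketch of roughly what the live merge triggered in step 15 boils down to: an active layer commit followed by a pivot. The connection URI, domain name and disk alias ('vda') are placeholders, and the real flow goes through engine and vdsm rather than direct libvirt calls.

import time
import libvirt

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('vm_test1')   # placeholder domain name

# Commit the active layer down into its backing volume (the base snapshot's volume).
dom.blockCommit('vda', None, None, 0,
                libvirt.VIR_DOMAIN_BLOCK_COMMIT_ACTIVE)

# Wait until the commit job reports it is ready to pivot (cur == end).
while True:
    info = dom.blockJobInfo('vda', 0)
    if not info or info.get('cur') == info.get('end'):
        break
    time.sleep(1)

# Pivot onto the merged volume.  After this call returns, the domain XML is
# expected to point at the merged volume's disk path.
dom.blockJobAbort('vda', libvirt.VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)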


Actual results:
Live deletion of the base snapshot fails

Expected results:
Should be able to delete the snapshot successfully


Additional info:
Engine log:
2015-03-24 20:11:19,631 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-8) Correlation ID: 5e7b6c9f, Job ID: d79ab6e5-0e4c-467d-899a-986df506c486, Call Stack: null, Custom Event ID: -1, Message: Snapshot 'snap111' deletion for VM 'vm_test1' was initiated by admin@internal.
.
.
.
2015-03-24 20:11:19,631 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-8) Correlation ID: 5e7b6c9f, Job ID: d79ab6e5-0e4c-467d-899a-986df506c486, Call Stack: null, Custom Event ID: -1, Message: Snapshot 'snap111' deletion for VM 'vm_test1' was initiated by admin@internal.

Comment 1 Kevin Alon Goldblatt 2015-03-25 12:50:38 UTC
Created attachment 1006309 [details]
server, engine and vdsm logs

Adding Logs

Comment 2 Adam Litke 2015-03-26 20:49:26 UTC
After looking at this bug and Bug 1155583, I have determined that they both have the same root cause.

In the vdsm log I see messages like:

Thread-50::ERROR::2015-03-26 10:53:09,314::sampling::488::vm.Vm::(collect) vmId=`b3fbc637-9cc9-4b15-ba94-5a2ee1607785`::Stats function failed: <AdvancedStatsFunction _highWrite at 0x3133870>
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/sampling.py", line 484, in collect
    statsFunction()
  File "/usr/share/vdsm/virt/sampling.py", line 359, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/virt/vm.py", line 292, in _highWrite
    self._vm.extendDrivesIfNeeded()
  File "/usr/share/vdsm/virt/vm.py", line 2537, in extendDrivesIfNeeded
    extend = [x for x in self._getExtendCandidates()
  File "/usr/share/vdsm/virt/vm.py", line 2489, in _getExtendCandidates
    capacity, alloc, physical = self._dom.blockInfo(drive.path, 0)
  File "/usr/share/vdsm/virt/vm.py", line 689, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 646, in blockInfo
    if ret is None: raise libvirtError ('virDomainGetBlockInfo() failed', dom=self)
libvirtError: invalid argument: invalid path /rhev/data-center/dee92f86-2fef-4edf-ac51-135d12646bde/01f563fa-1aef-41ab-92d3-0d6561d4a731/images/31cfd267-e627-4cfe-8f33-f66b0db627c3/1863e178-c40a-42eb-803a-3d1f1ac6658a not assigned to domain
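
For context, the _highWrite stats function in the traceback is vdsm's periodic watermark check that decides when a thinly provisioned block volume needs to be extended. A simplified, illustrative sketch of that kind of check (placeholder domain name, path and threshold; not vdsm's actual code):

import libvirt

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('vm_test1')                                  # placeholder domain name
drive_path = '/rhev/data-center/<pool>/<domain>/images/<img>/<vol>'  # placeholder path

# blockInfo() returns (capacity, allocation, physical) for the given disk path.
capacity, alloc, physical = dom.blockInfo(drive_path, 0)

# When the qcow2 write watermark (alloc) gets close to the size of the
# underlying logical volume (physical), vdsm requests an LV extension.
THRESHOLD = 512 * 1024 * 1024   # placeholder free-space threshold, in bytes
if physical - alloc < THRESHOLD:
    print('drive %s needs extension' % drive_path)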

After talking to Eric Blake on IRC, it looks like this is another libvirt problem where the domain XML is not updated before the virDomainBlockJobAbort API (used for pivoting an active layer merge) returns. As a result, our volume chain sync code gets a stale value for the disk path, which causes _highWrite to fail in this manner.
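
A rough sketch of the sync that trips over this (placeholder names and paths; not the actual vdsm code): after the pivot returns, the current disk path has to be re-learned from the live domain XML, and calling blockInfo() with the pre-pivot path is exactly what produces the 'not assigned to domain' error above.

import xml.etree.ElementTree as ET
import libvirt

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('vm_test1')                                          # placeholder domain name
old_path = '/rhev/data-center/<pool>/<domain>/images/<img>/<old-vol>'        # path cached before the pivot

# Re-read the live domain XML after blockJobAbort(..., PIVOT) has returned.
root = ET.fromstring(dom.XMLDesc(0))
current_paths = [src.get('dev') or src.get('file')
                 for src in root.findall('./devices/disk/source')]

if old_path not in current_paths:
    # The cached path is no longer (or not yet) assigned to the domain;
    # dom.blockInfo(old_path, 0) at this point raises the libvirtError seen above.
    print('volume chain is stale, resync from:', current_paths)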

*** This bug has been marked as a duplicate of bug 1155583 ***