Bug 2141371 - Incorrect image chain when deleting an intermediate snapshot
Summary: Incorrect image chain when deleting an intermediate snapshot
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 4.5.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.5.3-async
Target Release: ---
Assignee: Albert Esteve
QA Contact: sshmulev
URL:
Whiteboard:
Depends On:
Blocks: 1541529
 
Reported: 2022-11-09 16:04 UTC by Juan Orti
Modified: 2023-01-11 11:25 UTC
CC: 13 users

Fixed In Version: vdsm-4.50.3.6
Doc Type: Bug Fix
Doc Text:
Previously, stale bitmaps in the base image caused a cold or live internal merge operation to fail. In this release, the merge operation succeeds.
Clone Of:
Environment:
Last Closed: 2023-01-11 11:25:38 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:




Links:
- GitHub oVirt vdsm issue 352 (open): Live merge sometimes fails, "No space left on device" appear in the log (last updated 2022-11-16 08:09:49 UTC)
- GitHub oVirt vdsm pull 354 (open): merge: measure base volume bitmaps (last updated 2022-11-18 14:55:00 UTC)
- GitHub oVirt vdsm pull 357 (open): livemerge: prune bitmaps before measuring (last updated 2022-11-30 10:36:26 UTC)
- GitHub oVirt vdsm pull 360 (open): backport: prune bitmaps fix (last updated 2022-12-13 08:44:36 UTC)
- Red Hat Issue Tracker RHV-48018 (2022-11-09 16:08:09 UTC)
- Red Hat Knowledge Base (Solution) 6984452 (2022-11-10 07:52:02 UTC)
- Red Hat Product Errata RHSA-2023:0074 (2023-01-11 11:25:55 UTC)

Description Juan Orti 2022-11-09 16:04:28 UTC
Description of problem:
When deleting an intermediate snapshot, the engine tries to synchronize an image chain in which the deleted volume still exists.

Version-Release number of selected component (if applicable):
NOTE: This environment has a hotfix for bug 2123141
ovirt-engine-4.5.2.5-0.2.el8ev.noarch

How reproducible:
Happened once in a customer environment. I have not been able to reproduce it locally.

Steps to Reproduce:
1. A VM on block-based storage has two snapshots, snap1 and snap2. The image chain looks like this (a way to inspect it is sketched after these steps):

1111-1111-1111-1111 [snap1] <- 2222-2222-2222-2222 [snap2] <- 3333-3333-3333-3333 [Active VM]

2. Delete the oldest snapshot snap1
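
One way to confirm the chain before and after the delete is to read it back from qemu-img. A minimal sketch, assuming the LVs are active; the VG name and device paths are placeholders, not taken from this report:

# Walk the backing chain from the active layer down to the base;
# each volume should report its parent as "backing file".
qemu-img info --backing-chain /dev/<vg_name>/3333-3333-3333-3333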

Actual results:
- Volume 2222-2222-2222-2222 has been merged with the base volume 1111-1111-1111-1111. That's OK.
- The qcow2 volume 3333-3333-3333-3333 has 1111-1111-1111-1111 as backing file. That's OK.
- vdsm calls imageSyncVolumeChain with the volume 2222-2222-2222-2222 still present in the image chain; it is unclear why.

As a result, the volume 2222-2222-2222-2222 remains in the chain in the LV tags, the SD metadata, and the database, even though its contents have already been merged and the qcow2 rebased onto the base volume.

In this state, all future snapshot operations fail.
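
The mismatch can be seen by comparing what qemu records in the qcow2 header with what vdsm keeps in the LV tags. A sketch under the same placeholder naming; the PU_ tag (parent-volume UUID) is, to my understanding, how vdsm stores the parent on block storage:

# What qemu sees: the rebased top volume now points at the base.
qemu-img info /dev/<vg_name>/3333-3333-3333-3333 | grep 'backing file'

# What vdsm recorded: after the bug, 2222-2222-2222-2222 still shows up
# in the tags even though its data has been merged away.
lvs -o lv_name,lv_tags <vg_name>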

Expected results:
Correct image chain after the snapshot deletion.

Additional info:

Comment 26 Arik 2022-12-13 08:37:00 UTC
missing backports

Comment 34 sshmulev 2022-12-22 08:33:26 UTC
Verified.

Verification steps:
1. Create a VM from a template and add a disk to the running VM using qcow2 format:
size: 10g
allocation: thin
storage domain: FC or iSCSI

2. Create a snapshot (snap1)
3. Fill the disk with random data: dd if=/dev/urandom bs=1M count=2555 of=/dev/sda oflag=direct conv=fsync
4. Create another snapshot (snap2)
5. Add stale bitmaps to the base volume (snap1); a check that they landed follows these steps:
for i in $(seq 70); do
    qemu-img bitmap --add /dev/<vg_name>/<lv_name> stale-bitmap-$i
done

6. In engine UI, delete snapshot snap1
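
Before deleting snap1, it is worth confirming that the bitmaps from step 5 actually landed on the base volume, since the linked fixes measure and prune exactly these during the merge. A minimal check with the same placeholder path as above:

# Each bitmap is listed by name in the format-specific information
# of qemu-img info, so this should count the 70 entries added above.
qemu-img info /dev/<vg_name>/<lv_name> | grep -c 'stale-bitmap'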

Expected results:
Merge operation succeeds, without errors.

Actual results: as expected.
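
As a follow-up, the base volume can be inspected again after the merge; if the bitmaps were pruned as the linked pull requests describe, the same count drops to zero:

# Re-check the merged base volume; no stale-bitmap entries should remain.
qemu-img info /dev/<vg_name>/<lv_name> | grep -c 'stale-bitmap'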


Versions:
Engine-4.5.3.6-0.zstream.20221207085812.gitdecf5699b99.el8
vdsm-4.50.3.6-1.el8ev

Comment 36 errata-xmlrpc 2023-01-11 11:25:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: RHV 4.4 SP1 [ovirt-4.5.3-3] security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:0074

