+++ This bug was initially created as a clone of Bug #1053733 +++
+++ This bug was initially created as a clone of Bug #960934 +++

When merging a snapshot disk, we have a transient situation where the data of the merged snapshot exists twice - in the source and in the target. We must make sure we have enough space for a successful merge:

     | File Domain                          | Block Domain
-----|--------------------------------------|--------------------------
qcow | preallocated: 1.1 * disk capacity    | 1.1 * min(used, capacity)
     | sparse: 1.1 * min(used, capacity)    |
-----|--------------------------------------|--------------------------
raw  | preallocated: disk capacity          | disk capacity
     | sparse: min(used, capacity)          |

--- Additional comment from Vered Volansky on 2014-10-01 09:14:31 IST ---

The commands related to the above scenario are RemoveSnapshotCommand and RemoveDiskSnapshotsCommand.

Verify in two ways:
1. Remove a snapshot from the VM tab, snapshots subtab.
2. Remove a disk snapshot from the storage tab, snapshots subtab.

Say we have B + S1 + S2, all 10G. The SD should have an extra 20G of available space (the maximum size of merging the two snapshots). A domain with less than 20G free space should fail, and one with 20G+ should succeed.

-------------------------------------------------------------------------------

This bug is a RHEV tracker for the QE team to verify against RHEVM 3.5.0
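The table above can be sketched as a small helper. This is an illustrative sketch only, not the engine's actual validation code; the function name and parameters are hypothetical, and all sizes are assumed to be in the same unit (e.g. GiB):

```python
def required_merge_space(fmt, alloc, domain, capacity, used):
    """Transient space needed to merge a snapshot, per the table above.

    fmt: 'qcow' or 'raw'; alloc: 'preallocated' or 'sparse';
    domain: 'file' or 'block'. Hypothetical helper for illustration.
    """
    if fmt == 'qcow':
        # Block domains and sparse file volumes: 1.1 * min(used, capacity)
        if domain == 'block' or alloc == 'sparse':
            return 1.1 * min(used, capacity)
        # Preallocated qcow on a file domain: 1.1 * disk capacity
        return 1.1 * capacity
    # raw: preallocated (and any block volume) needs the full capacity
    if alloc == 'preallocated' or domain == 'block':
        return capacity
    # Sparse raw on a file domain: min(used, capacity)
    return min(used, capacity)
```

For example, a preallocated 10G qcow disk on a file domain would need 11G of transient free space, which matches the failing scenario reported below.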
Working on verification, I encountered the following:
Created a VM with a 10G preallocated disk on an FC domain and created a snapshot. The storage domain had 10G free space. Initiated a snapshot merge (via VM tab -> snapshots subtab). The operation wasn't blocked, and right after, the domain was reported to have 0G free space:

2015-01-01 15:28:53,002 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-85) Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Critical, Low disk space. fc1 domain has 0 GB of free space

Vered, Allon, please advise.
Attaching logs from engine and vdsm
Created attachment 974961 [details] logs 1.1.15
Also, the validation should consider the value of FreeSpaceCriticalLowInGB, which is 5G by default (it can be changed). In my setup, the value is the default (5G).
Elad, looking into this.
A. Please report the actual operations you have executed; I guess deleting one snapshot out of ???. Please state exactly which snapshot that was.
B. Regarding the threshold, the validation does take care of this; you may take it into consideration as you wish, with different numbers. The threshold issue is a clear one as to how to verify. We gave more details for the allocations, since there were questions raised by QE in the past.
Vered, I think what Elad probably means is that during a snapshot merge operation, oVirt-engine doesn't take FreeSpaceCriticalLowInGB into account, and reaches 0 free space, which is not allowed by definition.
I have managed to reproduce this after deleting one snapshot, which was the only snapshot. The rest of the steps were per Elad's comment #1.

2015-01-04 17:49:22,922 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-41) Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Critical, Low disk space. FC_1 domain has 0 GB of free space
Ori,
Regarding the threshold, please state the whole situation. It's not true that you can never go below the threshold. If there was 10G free space, as stated by Elad, the threshold validation would not and should not stop this action.
The threshold stops operations involving allocation only if the threshold has currently been met, with no regard to the specific storage-allocation-related operation we would like to execute. If we're low on space, we won't do it. If we're not low on space, we'll do it, even if we *will* be low on space afterwards (or even if there's not enough space for the operation).
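The threshold semantics described above can be sketched as follows. This is a hypothetical helper for illustration only; it is not the engine's code, and the function name and signature are assumptions. The point is that the check looks only at the *current* free space, not at what the merge itself will consume:

```python
def merge_allowed(free_gb, threshold_gb=5):
    """Per the comment above: the FreeSpaceCriticalLowInGB check blocks
    the operation only if the domain is ALREADY below the threshold.
    It deliberately ignores the space the merge itself will consume,
    so a domain at 10G free with a 5G threshold passes the check even
    if the merge then drives it to 0G.
    """
    return free_gb >= threshold_gb
```

With the default 5G threshold, a domain with 10G free passes (and the merge may then exhaust it), while a domain with 4G free is blocked up front.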
Verified with 13.6
Did the following:
- Had a VM with a 10G disk attached.
- Created 1 snapshot.
- After snapshot creation, the domain had 10G free space.
- Initiated a snapshot merge.

Vered, I think that the threshold validation must stop the merge. The user has no way to know that a simple merge operation would disable the storage domain.
Sorry, moving back to VERIFIED based on comment #7 Vered, setting need-info? for comment #8 for further discussion
Elad -
Initiated snapshot merge - how? What button did you press on the webadmin?
IIUC, you had one snapshot, which you deleted. That snapshot should be 1G, plus the preallocated 10G disk of the VM makes 11G.
The space allocation validation should have failed, since 11G > 10G (free space), with no regard to the threshold validation, which should have passed and did pass (10G >= 5G).
I don't understand what the bug's status is; Kevin marked it as verified in comment #7.
Please clarify.
(In reply to Vered Volansky from comment #10)
> Elad -
> Initiated snapshot merge - how? What button did you press on the webadmin?
> IIUC, you had one snapshot, which you deleted.

Virtual Machines tab -> Snapshots subtab -> delete.
Had 1 snapshot.

> That snapshot should be 1G, plus the preallocated 10G disk of the VM makes
> 11G.
> The space allocation validation should have failed, since 11G > 10G
> (free space), with no regard to the threshold validation, which should
> have passed and did pass (10G >= 5G).

The snapshot merge operation wasn't blocked.

> I don't understand what the bug's status is; Kevin marked it as verified
> in comment #7.
> Please clarify.

Leaving it as VERIFIED based on Kevin's verification; if it becomes necessary, we'll change the status.
Managed to reproduce only after the following (not yet merged) patches:
http://gerrit.ovirt.org/#/c/36892/
http://gerrit.ovirt.org/#/c/36889/

Make sure verification of this bug is done on a version which contains these patches.
Verification clarification: The needed space for a snapshot merge should be min(disk virtual size, deleted snapshot size + the snapshot child's size).
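The verification rule above can be written as a one-line sketch. The function name and parameters are hypothetical and for illustration only; all sizes are assumed to be in the same unit:

```python
def needed_merge_space(disk_virtual_size, deleted_snapshot_size, child_size):
    """Verification rule from the comment above: the space needed for a
    merge is capped by the disk's virtual size, since the merged volume
    can never grow beyond it.
    """
    return min(disk_virtual_size, deleted_snapshot_size + child_size)
```

For a 10G disk, merging a 6G snapshot into a 7G child needs only 10G (the cap applies), while merging a 2G snapshot into a 3G child needs 5G.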
*** Bug 1182222 has been marked as a duplicate of this bug. ***
RHEV-M 3.5.0 has been released, closing this bug.