Description of problem:
When creating a new live snapshot there's no validation that the SD indeed has enough space for the memory volumes.
Note that this validation should use StorageDomainValidator.hasSpaceForClonedDisks(), and should also complement the other space validation for the actual snapshot volume. For this an addition to StorageDomainValidator should be implemented.

Version-Release number of selected component (if applicable):

How reproducible:
100%

Steps to Reproduce:
1. Create a live snapshot on a running VM with not enough memory in the VM's storage domain (more memory used by the VM than available on the SD).

Actual results:
This is not checked and should fail during operation.

Expected results:
CDA failure

Additional info:
Verify CDA failure with and without sufficient space on the SD - for both empty AI AND memory volumes (there are several scenarios for full coverage of all potential bugs).
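To illustrate the kind of check being requested, here is a minimal sketch. It is not the engine's StorageDomainValidator; the class name, method name and parameters are hypothetical, and it only assumes that the free space and the required volume sizes are known in bytes. The point is that the memory volumes are counted together with the snapshot volumes, so the two validations complement each other.

```java
import java.util.List;

// Hypothetical stand-in for illustration only, NOT the real StorageDomainValidator.
final class SnapshotSpaceCheckSketch {

    /**
     * Returns true when the domain's free space covers both the new snapshot
     * volumes and the memory volumes of a live snapshot.
     */
    static boolean hasSpaceForLiveSnapshot(long freeBytes,
                                           List<Long> snapshotVolumeBytes,
                                           List<Long> memoryVolumeBytes) {
        long required = Math.addExact(sum(snapshotVolumeBytes), sum(memoryVolumeBytes));
        return required <= freeBytes;
    }

    // Sums the sizes, failing loudly on overflow instead of wrapping around.
    private static long sum(List<Long> sizes) {
        long total = 0;
        for (long size : sizes) {
            total = Math.addExact(total, size);
        }
        return total;
    }
}
```

With such a check in CDA, a `false` result would map to a CDA failure instead of a failure during the operation.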
Doesn't the following CDA validation in CreateAllSnapshotsFromVmCommand cover it?

    if (getParameters().isSaveMemory() && Guid.Empty.equals(getStorageDomainIdForVmMemory())) {
        return failCanDoAction(VdcBllMessages.ACTION_TYPE_FAILED_NO_SUITABLE_DOMAIN_FOUND);
    }
No, this checks the domainId is correct. The needed (and now posted) verification should be that there actually is enough space on the storage domain for the new volumes. Virt should review though.
Misunderstood comment #1. The pre-existing code is not enough, but the fix should be integrated with the above method. A full fix will be submitted.
Following my discussion with Vered, it fails at runtime instead of with CDA.
Aharon, did you mean the opposite? It should fail with CDA. If it doesn't please reopen the bug.
Just to make things clear: before the fix it failed at runtime, after the fix it fails with CDA (I didn't check, this is based on our discussion). Following that, and as this is low severity, I closed it without verification downstream.
this ovirt bug was fixed during 3.5.1 cycle and is included in the build, and therefore should be verified.
Discussed with ogofen, who'll post a clear message. In CreateAllSnapshotsFromVmCommand, the domain for the memory volumes is picked, and it might not be saved to the memory volumes. Need to look further, since the behaviour is not the same for his two scenarios.
The space validation when creating a live snapshot is not supported when having more than one storage domain in a datacenter. The reason is that right now memory volumes are created on a 'random' domain instead of a meaningful domain (see bz #1186230).

I have encountered a pattern in memory volume creation: the volume always gets created on the first domain if that domain is a block domain.

Steps to reproduce this bug:
* have one block domain named 'block'
* have one file domain named 'file'
* make sure the block domain has insufficient space
* make sure the file domain has sufficient free space

1. Create a VM with one disk on 'file' (RAM_SIZE > 1G).
2. Run the VM and create a live snapshot.

the operation is successful, but no memory volume is created because of insufficient space on Block domain.
UI shows a memory snapshot has been created via 'snapshot overview'.
(In reply to Ori Gofen from comment #9)
> the operation is successful, but no memory volume is created because of
> insufficient space on Block domain.
> UI shows a memory snapshot has been created via 'snapshot overview'.

It should get created on the file domain. Do you have logs that show a failure?
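For reference, the expected behaviour in the block/file scenario can be sketched as follows. This is not the engine's selection code; the Domain class and pickDomainForMemory are hypothetical names, and the sketch only assumes each candidate domain exposes its free space in bytes. A domain that is too full is skipped even if it comes first in the list.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical illustration of picking a domain for the memory volumes.
final class MemoryDomainPickSketch {

    // Minimal stand-in for a storage domain: a name and its free space in bytes.
    static final class Domain {
        final String name;
        final long freeBytes;

        Domain(String name, long freeBytes) {
            this.name = name;
            this.freeBytes = freeBytes;
        }
    }

    /**
     * Returns the first candidate with enough free space for the memory
     * volumes, or an empty Optional when no domain is suitable.
     */
    static Optional<Domain> pickDomainForMemory(List<Domain> candidates, long requiredBytes) {
        return candidates.stream()
                .filter(domain -> domain.freeBytes >= requiredBytes)
                .findFirst();
    }
}
```

An empty result here corresponds to the "no suitable domain found" case, which should surface as a CDA failure.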
(In reply to Vered Volansky from comment #8)
> Discussed with ogofen, who'll post a clear message.
> In CreateAllSnapshotsFromVmCommand, the domain for the memory volumes is
> picked, and it might not be saved to the memory volumes.

Took a closer look. This is saved just fine: if there's not enough space it should fail with CDA, and if there is, the memory volumes should be saved.
Only due to a race, it may happen that there's no space left for the memory volumes (which are created after the disk snapshots). If that's the case, a VolumeCreationError should be thrown, while the operation still succeeds. The logs should verify whether that is what happened. Ori, please add logs.
Is it possible the flow included some other operation on the file domain that used space, leaving too little for the memory volumes? Or that the block domain had enough space for the memory volumes at CDA time, but that space was used up afterwards, before the memory volumes were created?
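The race described above is a plain check-then-act gap. A minimal, single-threaded simulation of it, with hypothetical names and an in-memory "domain" standing in for real storage (none of this is engine or vdsm code):

```java
// Hypothetical simulation of the gap between the CDA space check and the
// later creation of the memory volumes.
final class CheckThenActRaceSketch {

    // In-memory stand-in for a storage domain's free space.
    static final class Domain {
        long freeBytes;

        Domain(long freeBytes) {
            this.freeBytes = freeBytes;
        }

        boolean hasSpaceFor(long bytes) {
            return bytes <= freeBytes;
        }

        void allocate(long bytes) {
            if (bytes > freeBytes) {
                throw new IllegalStateException("not enough space for volume");
            }
            freeBytes -= bytes;
        }
    }

    /**
     * Validates up front, creates the disk snapshots, lets a concurrent
     * consumer take space, then tries to create the memory volume.
     * Returns true if the memory volume was created, false if the space
     * was gone by then. Throws IllegalArgumentException when the up-front
     * validation itself fails (the CDA-style failure).
     */
    static boolean snapshotWithMemory(Domain domain, long diskSnapshotBytes,
                                      long memoryBytes, long concurrentBytes) {
        if (!domain.hasSpaceFor(Math.addExact(diskSnapshotBytes, memoryBytes))) {
            throw new IllegalArgumentException("validation failed: not enough space");
        }
        domain.allocate(diskSnapshotBytes);
        // Simulated unrelated operation consuming space after the validation passed.
        domain.allocate(Math.min(concurrentBytes, domain.freeBytes));
        try {
            domain.allocate(memoryBytes);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

In the sketch the validation can pass and the memory volume can still be missing afterwards, which is the outcome the logs would need to confirm or rule out.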
Created attachment 1005040 [details]
logs

Allon, no failure was encountered. As I said when I spoke to Vered, I don't see any evidence of a successful RAM volume creation via vdsm's getVolumesInfo. Personally I think it's a virt issue, but here's the flow.

I executed the scenario described in comment #9 twice.

The first time, with sufficient space on the FC domain, the RAM volume was created. Please be aware that the size of this volume corresponds to the VM's memory (4.2 G):

status = OK
domain = 783133ad-3c52-4556-8382-5cecfb9c1bf6
capacity = 4563402752
voltype = LEAF
description =
parent = 00000000-0000-0000-0000-000000000000
format = RAW
image = 2d5c1973-864c-4485-a627-74d80301e92e
uuid = 0aa474c6-d6c2-4343-9e41-bb4c2e46607e
disktype = 2
legality = LEGAL
mtime = 0
apparentsize = 4563402752
truesize = 4563402752 <-- 4.2 G
type = PREALLOCATED
children = []
pool =
ctime = 1427018665

The second time, with insufficient space on the block domain, I couldn't find a volume on the file domain that meets the size requirement of a RAM snapshot. As I said, the operation is successful, and in my opinion it is a virt issue.
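For anyone comparing reported volume sizes against VM memory, the byte counts above convert as in this small sketch (class and method names are hypothetical; binary units are assumed, i.e. 1 GiB = 1024^3 bytes):

```java
// Hypothetical helper for converting reported byte counts into binary units.
final class VolumeSizeSketch {

    private static final long MIB = 1024L * 1024L;
    private static final long GIB = 1024L * MIB;

    // Size in GiB as a floating-point value.
    static double toGiB(long bytes) {
        return (double) bytes / GIB;
    }

    // Size in whole MiB, rounded down.
    static long toMiB(long bytes) {
        return bytes / MIB;
    }

    public static void main(String[] args) {
        long truesize = 4563402752L; // value taken from the volume info above
        System.out.println(toGiB(truesize) + " GiB, " + toMiB(truesize) + " MiB");
    }
}
```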
Discussion with Ori - the memory/configuration volumes are indeed saved to the file domain, but the size does not fit the VM memory as it was set for it (lower size).

Tried to reproduce and failed:
1. Had a clogged block storage domain.
2. Had another file domain with enough space.
3. Created a new VM with 1GB disk on the file domain, with ~4M memory.
4. Ran it and created a snapshot with memory volumes.
5. Checked the mem disks id in the DB (snapshots), found them in the host under /rhev/data-center/mnt/my_file_domain/domain_id/images.
6. du -h * gave me the desired results: 4.5M for the memory domain as detected in the db.

Ori, if there's something wrong with my reproduction please let me know.
4M memory? couldn't be... Anyway the behavior of ram snapshot on File domain is not related to this bug, this bug is verified on my part, all other ram related issues should be and will be opened separately. I will update here.
ovirt 3.5.2 was GA'd. closing current release.