Previously, restarting VDSM during VM snapshot creation sometimes corrupted the VM and made it unusable. This issue has been resolved: the VM is now correctly rolled back to its previous state if snapshot creation is interrupted for any reason.
Description Carlos Mestre González
2015-06-26 12:55:49 UTC
Description of problem:
I've been testing this scenario by manually restarting vdsm at different points during snapshot creation, expecting a rollback. *Sometimes* the rollback doesn't work and leaves the vm unusable (corrupted?). Trying to start the vm returns an NPE.
Version-Release number of selected component (if applicable):
rhevm-3.5.3.1-1.4.el6ev.noarch
How reproducible:
25%
Steps to Reproduce:
1. create a vm with multiple disks of different types
2. add a snapshot to the vm (all disks)
3. when the engine.log shows the CreateAllSnapshotsFromVmCommand, restart the vdsm in the spm host
Actual results:
- Creation of the snapshots fails as expected, but the vm becomes unusable (it cannot be started and new snapshots cannot be created), example:
2015-06-26 15:25:13,585 ERROR [org.ovirt.engine.core.bll.RunVmCommand] (ajp-/127.0.0.1:8702-1) [vms_syncAction_5a9b28e3-c382-4271] Command org.ovirt.engine.core.bll.RunVmCommand throw exception: java.lang.NullPointerException
at org.ovirt.engine.core.bll.RunVmCommand.getMemoryFromSnapshot(RunVmCommand.java:154) [bll.jar:]
Expected results:
- Creation of the snapshots fails and the vm is working.
Additional info:
- Tested using NFS storage domains
- hypervisors RHEL 7.1 with:
vdsm-4.16.20-1.el7ev.x86_64
libvirt-1.2.8-16.el7_1.3.x86_64
qemu-img-rhev-2.1.2-23.el7_1.3.x86_64
Comment 1 Carlos Mestre González
2015-06-26 12:56:33 UTC
Seems that the NPE is in this line:
cachedMemoryVolumeFromSnapshot = archSupportSnapshot && FeatureSupported.memorySnapshot(getVm().getVdsGroupCompatibilityVersion()) ?
getActiveSnapshot().getMemoryVolume() : StringUtils.EMPTY;
Thus I reckon that it's more of a virt-ish issue, Michal, can one of your guys have a look?
Tal, We'll take a look, but it seems to me the actual snapshot is not aborted/reverted correctly. We can surely fix NPE, but it looks like the state of the VM is not correct, and that's more in your area
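To illustrate the narrower NPE fix mentioned above, here is a minimal sketch of a null-safe version of that lookup. It assumes the exception comes from getActiveSnapshot() returning null after the interrupted snapshot creation, which this report does not confirm. The Snapshot class and the memoryVolumeOrEmpty helper are hypothetical stand-ins, not engine code, and the two boolean parameters stand in for archSupportSnapshot and the FeatureSupported.memorySnapshot(...) check. A guard like this only avoids the crash; it does not repair the inconsistent VM state.

```java
import java.util.Optional;

public class MemoryVolumeLookup {

    /** Hypothetical stand-in for the engine's snapshot entity. */
    static final class Snapshot {
        private final String memoryVolume;

        Snapshot(String memoryVolume) {
            this.memoryVolume = memoryVolume;
        }

        String getMemoryVolume() {
            return memoryVolume;
        }
    }

    /**
     * Returns the memory volume of the active snapshot, or an empty string
     * when memory snapshots are unsupported, the active snapshot is missing,
     * or the snapshot has no memory volume.
     */
    static String memoryVolumeOrEmpty(Snapshot activeSnapshot,
                                      boolean archSupportSnapshot,
                                      boolean memorySnapshotSupported) {
        if (!archSupportSnapshot || !memorySnapshotSupported) {
            return "";
        }
        // Optional.map yields an empty Optional if the getter returns null,
        // so both a null snapshot and a null volume fall through to "".
        return Optional.ofNullable(activeSnapshot)
                .map(Snapshot::getMemoryVolume)
                .orElse("");
    }

    public static void main(String[] args) {
        // Assumed failure case: no active snapshot after the interrupted creation.
        System.out.println("[" + memoryVolumeOrEmpty(null, true, true) + "]");
        // Normal case: an active snapshot with a memory volume.
        System.out.println("[" + memoryVolumeOrEmpty(new Snapshot("vol-1"), true, true) + "]");
    }
}
```

With a guard of this kind, starting the VM would proceed without a memory volume instead of throwing, but the snapshot rollback itself would still need to be fixed.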
Verified: rhevm-3.6.0-0.13.master.el6
vdsm-4.17.6-1.el7ev.noarch
qemu-kvm-rhev-2.3.0-22.el7.x86_64
sanlock-3.2.4-1.el7.x86_64
libvirt-client-1.2.17-5.el7.x86_64
Scenario:
1. create a vm with multiple disks of different types
2. add a snapshot to the vm (all disks)
3. when the engine.log shows the CreateAllSnapshotsFromVmCommand, restart the vdsm in the spm host
Actual result:
VM remains locked until VDSM is running again
4. Wait until the VM is available again and preview the created snapshot.
5. Commit the snapshot and verify the VM is running properly.