Description of problem:
I have been testing a scenario in which vdsm is manually restarted at different points during snapshot creation, expecting a rollback. Sometimes the rollback does not work and leaves the VM unusable (corrupted?). Trying to start the VM returns an NPE.

Version-Release number of selected component (if applicable):
rhevm-3.5.3.1-1.4.el6ev.noarch

How reproducible:
25%

Steps to Reproduce:
1. Create a VM with multiple disks of different types
2. Add a snapshot to the VM (all disks)
3. When engine.log shows the CreateAllSnapshotsFromVmCommand, restart vdsm on the SPM host

Actual results:
- Creation of the snapshots fails as expected, but the VM becomes unusable (it cannot be started and new snapshots cannot be created). Example:

2015-06-26 15:25:13,585 ERROR [org.ovirt.engine.core.bll.RunVmCommand] (ajp-/127.0.0.1:8702-1) [vms_syncAction_5a9b28e3-c382-4271] Command org.ovirt.engine.core.bll.RunVmCommand throw exception: java.lang.NullPointerException
	at org.ovirt.engine.core.bll.RunVmCommand.getMemoryFromSnapshot(RunVmCommand.java:154) [bll.jar:]

Expected results:
- Creation of the snapshots fails and the VM keeps working.

Additional info:
- Tested using NFS storage domains
- Hypervisors: RHEL 7.1 with:
  vdsm-4.16.20-1.el7ev.x86_64
  libvirt-1.2.8-16.el7_1.3.x86_64
  qemu-img-rhev-2.1.2-23.el7_1.3.x86_64
Created attachment 1043502 [details] engine.log
Created attachment 1043503 [details] vdsm.log
Seems that the NPE is in this line:

    cachedMemoryVolumeFromSnapshot = archSupportSnapshot
            && FeatureSupported.memorySnapshot(getVm().getVdsGroupCompatibilityVersion())
            ? getActiveSnapshot().getMemoryVolume()
            : StringUtils.EMPTY;

Thus I reckon that it's more of a virt-ish issue. Michal, can one of your guys have a look?
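To illustrate where the expression quoted above can fail: if getActiveSnapshot() returns null (an assumption; the trace only gives the line, not which reference was null), the chained getMemoryVolume() call throws. Below is a minimal sketch of a null-safe variant. The Snapshot type and the memoryVolumeOrEmpty helper are hypothetical stand-ins, not the engine's classes, and guarding the lookup would only hide the NPE, not repair the VM's snapshot state.

```java
import java.util.Optional;

// Sketch only: MemoryVolumeGuard, Snapshot and memoryVolumeOrEmpty are
// hypothetical names for illustration, not part of the oVirt engine code.
final class MemoryVolumeGuard {

    // Hypothetical stand-in for the engine's snapshot entity.
    static final class Snapshot {
        private final String memoryVolume;

        Snapshot(String memoryVolume) {
            this.memoryVolume = memoryVolume;
        }

        String getMemoryVolume() {
            return memoryVolume;
        }
    }

    // Returns the memory volume of the active snapshot, or an empty string
    // when memory snapshots are unsupported, when there is no active
    // snapshot, or when the snapshot has no memory volume.
    static String memoryVolumeOrEmpty(Snapshot activeSnapshot, boolean memorySnapshotSupported) {
        if (!memorySnapshotSupported) {
            return "";
        }
        return Optional.ofNullable(activeSnapshot)
                .map(Snapshot::getMemoryVolume) // a null result yields an empty Optional
                .orElse("");
    }

    public static void main(String[] args) {
        // Missing active snapshot: falls back to "" instead of throwing.
        System.out.println("[" + memoryVolumeOrEmpty(null, true) + "]");
        // Snapshot with a memory volume: the value is returned unchanged.
        System.out.println("[" + memoryVolumeOrEmpty(new Snapshot("vol-1"), true) + "]");
    }
}
```

A guard like this would let RunVm proceed without a memory volume, but a missing active snapshot would still indicate that the failed snapshot creation was not rolled back cleanly.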
Tal, we'll take a look, but it seems to me the snapshot itself is not aborted/reverted correctly. We can surely fix the NPE, but it looks like the state of the VM is not correct, and that's more in your area.
Any insights Michal?
Verified:
rhevm-3.6.0-0.13.master.el6
vdsm-4.17.6-1.el7ev.noarch
qemu-kvm-rhev-2.3.0-22.el7.x86_64
sanlock-3.2.4-1.el7.x86_64
libvirt-client-1.2.17-5.el7.x86_64

Scenario:
1. Create a VM with multiple disks of different types
2. Add a snapshot to the VM (all disks)
3. When engine.log shows the CreateAllSnapshotsFromVmCommand, restart vdsm on the SPM host
Actual result: VM remains locked until VDSM is running again
4. Wait until the VM is available again and preview the created snapshot.
5. Commit the snapshot and verify the VM is running properly.