Created attachment 1151799 [details]
logs from engine and hypervisor

Description of problem:
While trying to verify bug 1270220, I did the following: created an NFS domain residing on a server that simulates slow file deletion using [1], created a VM with an attached disk residing on the slowfs domain, exported the VM to an export domain, changed the deletion delay on the storage server to 10 sec (unlink = 10), removed the VM with the attached disk, and immediately tried to import the VM with the disk (the same image ID).

I got the following exception in engine.log:

2016-04-28 12:44:01,823 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (default task-14) [338bff11] ERROR, GetImagesListVDSCommand( GetImagesListVDSCommandParameters:{runAsync='true', storagePoolId='7de10d80-b113-4f60-8f7f-e70f6476432b', ignoreFailoverLimit='false', sdUUID='fb97cea4-5bf2-48fa-9ceb-b8b2e109acd4'}), exception: For input string: "_remove_me_a9187951", log id: 6238e26f
2016-04-28 12:44:01,823 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (default task-14) [338bff11] Exception: java.lang.NumberFormatException: For input string: "_remove_me_a9187951"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) [rt.jar:1.8.0_71]
        at java.lang.Long.parseLong(Long.java:589) [rt.jar:1.8.0_71]
        at java.lang.Long.valueOf(Long.java:776) [rt.jar:1.8.0_71]
        at java.lang.Long.decode(Long.java:928) [rt.jar:1.8.0_71]
        at java.util.UUID.fromString(UUID.java:198) [rt.jar:1.8.0_71]
        at org.ovirt.engine.core.compat.Guid.<init>(Guid.java:73) [compat.jar:]
        at org.ovirt.engine.core.vdsbroker.irsbroker.GetImagesListVDSCommand.executeIrsBrokerCommand(GetImagesListVDSCommand.java:23) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand.executeVDSCommand(IrsBrokerCommand.java:159) [vdsbroker.jar:]

============================
Webadmin:

Operation Canceled
Error while executing action:
slow: General command validation failure.
============================

Version-Release number of selected component (if applicable):
ovirt-engine-4.0.0-0.0.master.20160406161747.gita4ecba2.el7.centos.noarch
vdsm-4.17.999-724.gitb8cb30a.el7.centos.noarch

How reproducible:
For the mentioned scenario (I think it depends on timing)

Steps to Reproduce:
1. Create an NFS domain residing on a server that simulates slow file deletion using [1] (this can also be achieved by modifying the vdsm code so that file deletion is slower)
2. Create a VM with an attached disk residing on the slowfs domain
3. Export the VM to an export domain
4. Change the deletion delay on the storage server to 10 sec (unlink = 10)
5. Remove the VM with the attached disk and immediately try to import the VM with the disk (the same image ID) to the same data domain

Actual results:
Import fails with the mentioned exception and error message.

Expected results:
Import should succeed

Additional info:
[1] https://github.com/nirs/slowfs/blob/master/README.md
This issue should be fixed in VDSM: getImagesList should not return images that are going to be deleted.

Setting target to 4.1, as this issue is a corner case in a slow storage environment, and the operation fails as it should, just not gracefully.
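
A minimal sketch of the filtering idea, not the actual VDSM change. It assumes that images pending deletion are marked with a "_remove_me_" name prefix, which is inferred only from the token in the engine log above. The names REMOVED_IMAGE_PREFIX, is_valid_image_id and filter_images are hypothetical.

```python
import uuid

# Assumption: entries pending deletion carry this prefix, as suggested by the
# "_remove_me_a9187951" token in the engine log. Not taken from VDSM code.
REMOVED_IMAGE_PREFIX = "_remove_me_"


def is_valid_image_id(name):
    """Return True if name parses as a UUID, False otherwise."""
    try:
        uuid.UUID(name)
    except ValueError:
        return False
    return True


def filter_images(names):
    """Hypothetical helper: keep only entries that are not marked for
    deletion and that parse as UUIDs, so the caller never receives a
    name it cannot convert to an image ID."""
    return [
        name
        for name in names
        if not name.startswith(REMOVED_IMAGE_PREFIX) and is_valid_image_id(name)
    ]
```

With such a filter applied on the listing side, a caller that converts every returned name to a UUID would no longer see an entry for an image whose deletion is still in progress.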
Tested according to the steps in the description. VM import succeeded.

Used:
vdsm-4.19.6-1.el7ev.x86_64
rhevm-4.1.1.2-0.1.el7.noarch
Slowfs: https://github.com/nirs/slowfs/blob/master/README.md