Created attachment 1622194 [details] TC11141_logs Description of problem: seen in 2 TC's so far that fail to create VM from template and leaves the VM locked, this is why IMHO this bug is urgent. In both cases after VM from template fails we see the following exception: 2019-10-01 21:09:57,201+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-83) [] EVENT_ID: USER_ADD_VM_FINISHED_FAILURE(60), Failed to complete VM vm_TestCase18982_0121045989 creation. 2019-10-01 21:09:57,229+03 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] (EE-ManagedThreadFactory-engineScheduled-Thread-83) [vms_create_8df9a84a-3778-40fb] Failed invoking callback end method 'onFailed' for command '6767e13a-d40e-4408-a11d-c6febe7e031b' with exception 'null', the callback is marked for end method retries but max number of retries have been attempted. The command will be marked as Failed. 2019-10-01 21:09:57,229+03 INFO [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] (EE-ManagedThreadFactory-engineScheduled-Thread-83) [vms_create_8df9a84a-3778-40fb] Exception in invoking callback of command AddVmFromTemplate (6767e13a-d40e-4408-a11d-c6febe7e031b): NullPointerException: 2019-10-01 21:09:57,229+03 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] (EE-ManagedThreadFactory-engineScheduled-Thread-83) [vms_create_8df9a84a-3778-40fb] Error invoking callback method 'onFailed' for 'FAILED' command '6767e13a-d40e-4408-a11d-c6febe7e031b' 2019-10-01 21:09:57,229+03 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] (EE-ManagedThreadFactory-engineScheduled-Thread-83) [vms_create_8df9a84a-3778-40fb] Exception: javax.ejb.EJBTransactionRolledbackException at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInCallerTx(CMTTxInterceptor.java:203) [wildfly-ejb3-7.2.4.GA-redhat-00002.jar:7.2.4.GA-redhat-00002] at org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:420) [wildfly-ejb3-7.2.4.GA-redhat-00002.jar:7.2.4.GA-redhat-00002] at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:148) [wildfly-ejb3-7.2.4.GA-redhat-00002.jar:7.2.4.GA-redhat-00002] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422) at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509) at org.jboss.weld.module.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:81) [weld-ejb.jar:3.0.6.Final-redhat-00002] at org.jboss.as.weld.ejb.EjbRequestScopeActivationInterceptor.processInvocation(EjbRequestScopeActivationInterceptor.java:89) at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422) at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41) [wildfly-ejb3-7.2.4.GA-redhat-00002.jar:7.2.4.GA-redhat-00002] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422) at org.jboss.as.ejb3.component.invocationmetrics.WaitTimeInterceptor.processInvocation(WaitTimeInterceptor.java:47) [wildfly-ejb3-7.2.4.GA-redhat-00002.jar:7.2.4.GA-redhat-00002] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422) at org.jboss.as.ejb3.security.SecurityContextInterceptor.processInvocation(SecurityContextInterceptor.java:100) [wildfly-ejb3-7.2.4.GA-redhat-00002.jar:7.2.4.GA-redhat-00002] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422) at org.jboss.as.ejb3.deployment.processors.StartupAwaitInterceptor.processInvocation(StartupAwaitInterceptor.java:22) [wildfly-ejb3-7.2.4.GA-redhat-00002.jar:7.2.4.GA-redhat-00002] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422) at org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64) [wildfly-ejb3-7.2.4.GA-redhat-00002.jar:7.2.4.GA-redhat-00002] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422) at org.jboss.as.ejb3.component.interceptors.LoggingInterceptor.processInvocation(LoggingInterceptor.java:67) [wildfly-ejb3-7.2.4.GA-redhat-00002.jar:7.2.4.GA-redhat-00002] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422) at org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50) at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422) : ............................................................................. at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_222] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [rt.jar:1.8.0_222] at org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383) [javax.enterprise.concurrent.jar:1.0 .0.redhat-1] at org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534) [javax.enterprise.concurrent.jar:1.0.0.redhat-1] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [rt.jar:1.8.0_222] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_222] at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_222] at org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250) [javax.enterprise.concurrent.jar:1.0.0.redhat-1] Caused by: java.lang.NullPointerException Version-Release number of selected component (if applicable): How reproducible: Hard to reproduce, seen only in 2 TC's so far. Trying to reproduce by running the TC's ~12 times so far did not reproduce the issue, Steps to Reproduce: 1.Create VM from template - and make sure operation fails(like restart SPM VDSM) Actual results: VM creation fails and VM and it's images remain locked. NPE seen in the engine log. Expected results: VM creation fails rollback(VM removed) smoothly without issues. No NPE should be seen. Additional info: So far issue seen in two cases in Tier3 storage suit: 1)TC11141 Copy disks from a VM that was created from a template 2)TC18982: Create VM from Template and kill 'vdsmd' on SPM with only SPM and no HSM available
Version-Release number of selected component (if applicable): ovirt-engine-4.3.6.6-0.1.el7.noarch vdsm-4.30.33-1.el7ev.x86_64 libvirt-4.5.0-23.el7_7.1.x86_64 qemu-img-rhev-2.12.0-33.el7.x86_64
I managed to reproduce this issue easily on another enviroment running TestCase18982. Create VM from Template and kill 'vdsmd' on SPM with only SPM and no HSM available
Hard to reproduce without a specific test case means it's not urgent, setting to high
(In reply to Tal Nisan from comment #3) > Hard to reproduce without a specific test case means it's not urgent, > setting to high See comment 2, this is easy to reproduce. Create VM from Template and kill 'vdsmd' on SPM , setting it back to urgent. This is urgent also because it effects: 1) VM which is locked with all its disks lock. 2) Locked VM can not be worked on or removed. 3) Deactivating a storage domain that has this VM will fail as this VM is not in the "down" state.
This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.
To clarify we had 2 TC's here each running create VM from template: 1)TestCase11141 - create VM from template as thin + restart VDSM cause VM to be locked. Should be fixed by https://gerrit.ovirt.org/#/c/103811/ 2)TestCase18982 - create VM from template as clone + restart VDSM cause VM to be locked - NOT fixed yet , moving bug back to assigned
Verified on 4.3.6.7-0.1.el7. Tested both manual cases restarting VDSM while creating a VM from templated as thin/clone - passed, VM is not locked. Also ran failed automation TestCase18982 and TestCase11141 - passed.
This bugzilla is included in oVirt 4.3.6 async release, published on October 8th 2019. Since the problem described in this bug report should be resolved in oVirt 4.3.6 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.