Description of problem: Engine fails to recreate session after Create Template failure Version-Release number of selected component (if applicable): ovirt-engine-4.1.0.3-0.1.el7.noarch rhevm-4.1.0.3-0.1.el7.noarch vdsm-4.19.4-1.el7ev.x86_64 How reproducible: Always Steps to Reproduce: 1. Create VM with 5 disks including block, nfs, ide, thin and preallocated 2. Create Template 3. Create VM from template and restart the engine during the CopyDataCommand on the HSM - engine restarts and fails with restarting the Create Template session Actual results: engine restarts and fails with restarting the Create Template session Expected results: engine restarts should succeed in restarting the Create Template session after restarting during the CopyDataCommand on the HSM Additional info: 2017-02-02 12:49:49,086+02 INFO [org.ovirt.engine.core.bll.Backend] (ServerService Thread Pool -- 62) [] Running compensation on startup for Command 'org.ovirt.engine.core.bll.AddVmFromTemplateCommand', Command Id '23eb19a2-0e99-492c-8eb4-245c8e8e6019' 2017-02-02 12:49:49,095+02 ERROR [org.ovirt.engine.core.bll.CommandsFactory] (ServerService Thread Pool -- 62) [] CommandsFactory : Failed to get type information using reflection for Class 'org.ovirt.engine.core.bll.CreateCloneOfTemplateCommand', Command Id 'd2e9864b-5895-4a52-8cac-9a6fc0aa85c8': org.ovirt.engine.core.bll.CreateCloneOfTemplateCommand.<init>(org.ovirt.engine.core.compat.Guid) 2017-02-02 12:49:49,095+02 ERROR [org.ovirt.engine.core.bll.CommandsFactory] (ServerService Thread Pool -- 62) [] Exception: java.lang.NoSuchMethodException: org.ovirt.engine.core.bll.CreateCloneOfTemplateCommand.<init>(org.ovirt.engine.core.compat.Guid) at java.lang.Class.getConstructor0(Class.java:3082) [rt.jar:1.8.0_121] at java.lang.Class.getDeclaredConstructor(Class.java:2178) [rt.jar:1.8.0_121]
Created attachment 1247109 [details] server, engine and vdsm logs Adding logs
Kevin, is it really a must to have all those 5 disks? Can it be reproduced with one disk only?
(In reply to Tal Nisan from comment #2) > Kevin, is it really a must to have all those 5 disks? Can it be reproduced > with one disk only? You don't need 5 disks as long as you can ensure that the restart occurs during the copydata phase. So use a larger disk of say 10g with at least 5g of data
Liron - any updates?
Hi Yaniv, That's issue exists since that flow was introduced - unrelated to the SPDM work (it's caused by creation of default compensation context for transactive commands - bz 1399540) I believe that we should fix it as part of BZ 1399191 https://bugzilla.redhat.com/show_bug.cgi?id=1399191#c10) which will tackle that entire flow - right now the only actual effect of the error is having exception in the logs (Kevin, please correct me if i'm wrong). Tal, how do we want to proceed with it?
Postponing for 4.1.2 in the meanwhile since it's a corner case as well, note that bug 1399191 is targeted to 4.2, do we want to move it to 4.1.2 as well?
Tal, I believe that we should target both to the same version (possibly merge them) to test the flow one time after the changes (instead if multiple times) - it can be either 4.1.2 or 4.2, whatever seems better like a better fit to you.
It seems that this issue was fixed as part of another fix, I cannot reproduce this. Can you check if this still happens?
Ran the scenario again >>>>> The create vm from template failed Ran the scenario with the following code: ---------------------------------------------------- ovirt-engine-4.2.0-0.0.master.20170813134654.gitaee967b.el7.centos.noarch vdsm-4.20.2-77.gite43f776.el7.centos.x86_64 Engine-log: ----------------------------------- 2017-08-30 17:41:23,980+03 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-2) [77e29e1b-fec3-4868-a27e-f6fdd3d84a32] EVENT_ID: USER_ADD_VM_STARTED(37), VM vm2_from_tp_v mstar1 creation was initiated by admin@internal-authz. 2017-08-30 17:41:24,006+03 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler9) [77e29e1b-fec3-4868-a27e-f6fdd3d84a32] Command 'CreateCloneOfTemplate' (id: '0a7bcd a2-66b2-4ea3-b749-b7f54e202a3b') waiting on child command id: 'd81bb9ca-e4e8-4ee8-8487-3d36dedcce9d' type:'CopyImageGroupWithData' to complete 2017-08-30 17:41:24,021+03 INFO [org.ovirt.engine.core.bll.AddVmFromTemplateCommand] (default task-2) [77e29e1b-fec3-4868-a27e-f6fdd3d84a32] Lock freed to object 'EngineLock:{exclusiveLocks='[vm2_from_tp_vmstar 1=VM_NAME]', sharedLocks='[9207331b-2e8d-4f0a-9640-17f9d7c5d19e=DISK, 41414829-a55e-4083-95bc-97ffbb3035b6=DISK, b99bb2ff-72a0-4373-a576-732e7e315c71=DISK, 1c0da4dd-3bcc-413a-9de5-56a6ea8c52c4=DISK, abe22005-dde 6-4be8-9709-95aaad8b3816=DISK, be51e72f-8f51-4958-9386-2a6aa095fdad=DISK, 9077f689-a27f-43a0-8dd2-54058d83b096=DISK, 171a09a7-c70f-4600-9e95-29366e9ae2c5=TEMPLATE]'}' 2017-08-30 17:41:24,280+03 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] (DefaultQuartzScheduler9) [77e29e1b-fec3-4868-a27e-f6fdd3d84a32] Error invoking callback method 'doPolling' for 'ACTIVE' command '1caf1d41-9462-489a-992b-0d8250e3a493' 2017-08-30 17:41:24,280+03 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] (DefaultQuartzScheduler9) [77e29e1b-fec3-4868-a27e-f6fdd3d84a32] Exception: org.springframework.jdbc.UncategorizedSQLExce ption: PreparedStatementCallback; uncategorized SQLException for SQL [select * from getvdsbyvdsid(?, ?, ?)]; SQL state [null]; error code [0]; IJ031013: Interrupted attempting lock: org.jboss.jca.adapters.jdbc. local.LocalManagedConnection@52b73e3c; nested exception is java.sql.SQLException: IJ031013: Interrupted attempting lock: org.jboss.jca.adapters.jdbc.local.LocalManagedConnection@52b73e3c at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:84) [spring-jdbc.jar:4.3.9.RELEASE] at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:81) [spring-jdbc.jar:4.3.9.RELEASE] at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:81) [spring-jdbc.jar:4.3.9.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:649) [spring-jdbc.jar:4.3.9.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:684) [spring-jdbc.jar:4.3.9.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:716) [spring-jdbc.jar:4.3.9.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:766) [spring-jdbc.jar:4.3.9.RELEASE] at org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall.executeCallInternal(PostgresDbEngineDialect.java:152) [dal.jar:] at org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall.doExecute(PostgresDbEngineDialect.java:118) [dal.jar:] at org.springframework.jdbc.core.simple.SimpleJdbcCall.execute(SimpleJdbcCall.java:198) [spring-jdbc.jar:4.3.9.RELEASE] at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:135) [dal.jar:] at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeReadList(SimpleJdbcCallsHandler.java:105) [dal.jar:] at org.ovirt.engine.core.dao.VdsDaoImpl.get(VdsDaoImpl.java:56) [dal.jar:] at org.ovirt.engine.core.dao.VdsDaoImpl.get(VdsDaoImpl.java:50) [dal.jar:] at org.ovirt.engine.core.bll.HostJobCallback.childCommandsExecutionEnded(HostJobCallback.java:46) [bll.jar:] at org.ovirt.engine.core.bll.ChildCommandsCallbackBase.doPolling(ChildCommandsCallbackBase.java:63) [bll.jar:] at org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:114) [bll.jar:] at sun.reflect.GeneratedMethodAccessor65.invoke(Unknown Source) [:1.8.0_131] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_131] at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_131] at org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:83) [scheduler.jar:] at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:55) [scheduler.jar:] at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_131] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_131] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131] at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_131] Caused by: java.sql.SQLException: IJ031013: Interrupted attempting lock: org.jboss.jca.adapters.jdbc.local.LocalManagedConnection@52b73e3c at org.jboss.jca.adapters.jdbc.BaseWrapperManagedConnection.tryLock(BaseWrapperManagedConnection.java:402) at org.jboss.jca.adapters.jdbc.WrappedConnection.lock(WrappedConnection.java:156) at org.jboss.jca.adapters.jdbc.WrappedStatement.lock(WrappedStatement.java:138) at org.jboss.jca.adapters.jdbc.WrappedResultSet.lock(WrappedResultSet.java:6022) at org.jboss.jca.adapters.jdbc.WrappedResultSet.next(WrappedResultSet.java:2679) at org.springframework.jdbc.core.RowMapperResultSetExtractor.extractData(RowMapperResultSetExtractor.java:92) [spring-jdbc.jar:4.3.9.RELEASE] at org.springframework.jdbc.core.RowMapperResultSetExtractor.extractData(RowMapperResultSetExtractor.java:60) [spring-jdbc.jar:4.3.9.RELEASE] at org.springframework.jdbc.core.JdbcTemplate$1.doInPreparedStatement(JdbcTemplate.java:697) [spring-jdbc.jar:4.3.9.RELEASE] 2017-08-30 17:41:23,889+03 INFO [org.ovirt.engine.core.bll.AddVmFromTemplateCommand] (default task-2) [77e29e1b-fec3-4868-a27e-f6fdd3d84a32] Command [id=852d3b0b-5c3b-4d56-b184-77b75964abcd]: Compensating TRANS IENT_ENTITY of org.ovirt.engine.core.common.businessentities.ReleaseMacsTransientCompensation; snapshot: org.ovirt.engine.core.common.businessentities.ReleaseMacsTransientCompensation@5c25d1d5. 2017-08-30 17:41:23,891+03 INFO [org.ovirt.engine.core.utils.transaction.TransactionSupport] (default task-2) [77e29e1b-fec3-4868-a27e-f6fdd3d84a32] transaction rolled back 2017-08-30 17:41:23,905+03 ERROR [org.ovirt.engine.core.bll.CreateCloneOfTemplateCommand] (default task-2) [77e29e1b-fec3-4868-a27e-f6fdd3d84a32] Transaction rolled-back for command 'org.ovirt.engine.core.bll.Cr eateCloneOfTemplateCommand'. 2017-08-30 17:41:23,905+03 ERROR [org.ovirt.engine.core.bll.CreateCloneOfTemplateCommand] (default task-2) [77e29e1b-fec3-4868-a27e-f6fdd3d84a32] Transaction rolled-back for command 'org.ovirt.engine.core.bll.Cr eateCloneOfTemplateCommand'. 2017-08-30 17:41:23,905+03 ERROR [org.ovirt.engine.core.bll.CreateCloneOfTemplateCommand] (default task-2) [77e29e1b-fec3-4868-a27e-f6fdd3d84a32] Transaction rolled-back for command 'org.ovirt.engine.core.bll.Cr eateCloneOfTemplateCommand'. 2017-08-30 17:41:23,905+03 ERROR [org.ovirt.engine.core.bll.CreateCloneOfTemplateCommand] (default task-2) [77e29e1b-fec3-4868-a27e-f6fdd3d84a32] Transaction rolled-back for command 'org.ovirt.engine.core.bll.Cr eateCloneOfTemplateCommand'. 2017-08-30 17:41:23,905+03 ERROR [org.ovirt.engine.core.bll.CreateCloneOfTemplateCommand] (default task-2) [77e29e1b-fec3-4868-a27e-f6fdd3d84a32] Transaction rolled-back for command 'org.ovirt.engine.core.bll.Cr eateCloneOfTemplateCommand'. 2017-08-30 17:41:23,905+03 ERROR [org.ovirt.engine.core.bll.CreateCloneOfTemplateCommand] (default task-2) [77e29e1b-fec3-4868-a27e-f6fdd3d84a32] Transaction rolled-back for command 'org.ovirt.engine.core.bll.Cr eateCloneOfTemplateCommand'. 2017-08-30 17:41:23,905+03 ERROR [org.ovirt.engine.core.bll.CreateCloneOfTemplateCommand] (default task-2) [77e29e1b-fec3-4868-a27e-f6fdd3d84a32] Transaction rolled-back for command 'org.ovirt.engine.core.bll.Cr eateCloneOfTemplateCommand'. 2017-08-30 17:41:23,905+03 ERROR [org.ovirt.engine.core.bll.AddVmFromTemplateCommand] (default task-2) [77e29e1b-fec3-4868-a27e-f6fdd3d84a32] Transaction rolled-back for command 'org.ovirt.engine.core.bll.AddVmF romTemplateCommand'.
Created attachment 1320181 [details] vdsm, server, engine logs New logs added
Comment 8 is from Aug 20th. Comment 9 states that it was tested with a build from Aug 13th: > ovirt-engine-4.2.0-0.0.master.20170813134654.gitaee967b.el7.centos.noarch Can we please retry this with a minimal scenario and a recent build? Thanks!
(In reply to Allon Mureinik from comment #11) > Comment 8 is from Aug 20th. > > Comment 9 states that it was tested with a build from Aug 13th: > > ovirt-engine-4.2.0-0.0.master.20170813134654.gitaee967b.el7.centos.noarch > > Can we please retry this with a minimal scenario and a recent build? > Thanks! No reliable reproducer, closing. Feel free to reopen if you can answer the above needinfo.