Bug 1105699 - RHEV hypervisor deactivation led to RHEV-M NPEs + 10 minute delay + DB tx abort
Summary: RHEV hypervisor deactivation led to RHEV-M NPEs + 10 minute delay + DB tx abort
Keywords:
Status: CLOSED DUPLICATE of bug 1097256
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: ---
: ---
Assignee: Roy Golan
QA Contact: meital avital
URL:
Whiteboard: virt
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2014-06-06 18:49 UTC by Dave Sullivan
Modified: 2019-04-28 10:23 UTC (History)
10 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-06-16 07:22:31 UTC
oVirt Team: ---
Target Upstream Version:


Attachments (Terms of Use)

Description Dave Sullivan 2014-06-06 18:49:53 UTC
Description of problem:

Attempted to set a hypervisor into maintenance (deactivation) on the a RHEV-M 3.3 instance.  This led to a RHEV-M NPE after it reported failing to get an EngineLock, then a 10 minute delay with nothing happening until a DB transaction rollback message was logged. The guest migrations continue after this with additional RHEV-M errors logged.





Version-Release number of selected component (if applicable):

rhevm-3.3.2-0.50
vdsm-4.13.2-0.13

How reproducible:

The deactivation of uxcevrh0004d is requested and subsequent inactivity period:
{noformat}
2014-06-02 09:27:29,631 INFO  [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (ajp-/127.0.0.1:8702-3) [29329a80] Running command: MaintenanceNumberOfVdssCommand internal: false. Entities affected :  ID: 82b3ed8c-90d9-4956-8137-087d27651635 Type: VDS
2014-06-02 09:27:29,643 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (ajp-/127.0.0.1:8702-3) [29329a80] START, SetVdsStatusVDSCommand(HostName = uxcevrh0003d, HostId = 82b3ed8c-90d9-4956-8137-087d27651635, status=PreparingForMaintenance, nonOperationalReason=NONE, stopSpmFailureLogged=true), log id: 690df229
2014-06-02 09:27:29,645 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (ajp-/127.0.0.1:8702-3) [29329a80] FINISH, SetVdsStatusVDSCommand, log id: 690df229
2014-06-02 09:27:29,716 INFO  [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (ajp-/127.0.0.1:8702-3) [29329a80] Running command: MaintenanceVdsCommand internal: true. Entities affected :  ID: 82b3ed8c-90d9-4956-8137-087d27651635 Type: VDS
2014-06-02 09:27:29,772 INFO  [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (ajp-/127.0.0.1:8702-3) [604c9e29] Lock Acquired to object EngineLock [exclusiveLocks= key: 6b87b1fe-664c-45c9-ac48-a6dc8adc80d6 value: VM
, sharedLocks= ]
2014-06-02 09:27:29,864 INFO  [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (ajp-/127.0.0.1:8702-3) [604c9e29] Candidate host uxcevrh0011d (b61e0767-19e3-4068-9be5-fc05337141aa) was filtered out by VAR__FILTERTYPE__INTERNAL filter Memory
2014-06-02 09:27:29,898 INFO  [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (ajp-/127.0.0.1:8702-3) [604c9e29] Running command: InternalMigrateVmCommand internal: true. Entities affected :  ID: 6b87b1fe-664c-45c9-ac48-a6dc8adc80d6 Type: VM
2014-06-02 09:27:29,966 INFO  [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (ajp-/127.0.0.1:8702-3) [604c9e29] Candidate host uxcevrh0011d (b61e0767-19e3-4068-9be5-fc05337141aa) was filtered out by VAR__FILTERTYPE__INTERNAL filter Memory (correlation id: 604c9e29)
2014-06-02 09:27:30,253 INFO  [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (DefaultQuartzScheduler_Worker-13) [71576b03] Running command: MaintenanceNumberOfVdssCommand internal: true. Entities affected :  ID: 82b3ed8c-90d9-4956-8137-087d27651635 Type: VDS
2014-06-02 09:27:30,307 INFO  [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (DefaultQuartzScheduler_Worker-13) [71576b03] Running command: MaintenanceVdsCommand internal: true. Entities affected :  ID: 82b3ed8c-90d9-4956-8137-087d27651635 Type: VDS
2014-06-02 09:27:30,318 INFO  [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (DefaultQuartzScheduler_Worker-13) [5edc01ff] Failed to Acquire Lock to object EngineLock [exclusiveLocks= key: 6b87b1fe-664c-45c9-ac48-a6dc8adc80d6 value: VM
, sharedLocks= ]
2014-06-02 09:27:30,320 WARN  [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (DefaultQuartzScheduler_Worker-13) [5edc01ff] CanDoAction of action InternalMigrateVm failed. Reasons:VAR__ACTION__MIGRATE,VAR__TYPE__VM,ACTION_TYPE_FAILED_VM_IS_BEING_MIGRATED,$VmName ffnrsa0001d
2014-06-02 09:27:30,322 ERROR [org.ovirt.engine.core.bll.job.ExecutionHandler] (DefaultQuartzScheduler_Worker-13) [5edc01ff] java.lang.NullPointerException
2014-06-02 09:27:30,322 ERROR [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (DefaultQuartzScheduler_Worker-13) [5edc01ff] ResourceManager::vdsMaintenance - Failed migrating desktop ffnrsa0001d
2014-06-02 09:27:30,341 INFO  [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (DefaultQuartzScheduler_Worker-13) [1a05ba2a] Lock Acquired to object EngineLock [exclusiveLocks= key: 86c81840-adbe-42ae-bab6-6f94697a5041 value: VM
, sharedLocks= ]
2014-06-02 09:27:30,448 INFO  [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (DefaultQuartzScheduler_Worker-13) [1a05ba2a] Candidate host uxcevrh0011d (b61e0767-19e3-4068-9be5-fc05337141aa) was filtered out by VAR__FILTERTYPE__INTERNAL filter Memory
2014-06-02 09:27:30,475 INFO  [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (DefaultQuartzScheduler_Worker-13) [1a05ba2a] Running command: InternalMigrateVmCommand internal: true. Entities affected :  ID: 86c81840-adbe-42ae-bab6-6f94697a5041 Type: VM
2014-06-02 09:27:30,564 INFO  [org.ovirt.engine.core.bll.LoginUserCommand] (ajp-/127.0.0.1:8702-15) Running command: LoginUserCommand internal: false.
2014-06-02 09:27:30,568 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-15) Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: User esp-ce-rhev-sched logged in.
2014-06-02 09:27:31,553 INFO  [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (ajp-/127.0.0.1:8702-3) [604c9e29] START, MigrateVDSCommand(HostName = uxcevrh0003d, HostId = 82b3ed8c-90d9-4956-8137-087d27651635, vmId=6b87b1fe-664c-45c9-ac48-a6dc8adc80d6, srcHost=uxcevrh0003d-ctl.chicago.cme.com, dstVdsId=92988fce-943c-4029-9930-d9a3b07ad576, dstHost=uxcevrh0009d-ctl.chicago.cme.com:54321, migrationMethod=ONLINE, tunnelMigration=false), log id: 3a8b3a04
2014-06-02 09:27:31,584 INFO  [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (DefaultQuartzScheduler_Worker-13) [1a05ba2a] Candidate host uxcevrh0011d (b61e0767-19e3-4068-9be5-fc05337141aa) was filtered out by VAR__FILTERTYPE__INTERNAL filter Memory (correlation id: 1a05ba2a)
2014-06-02 09:27:32,046 INFO  [org.ovirt.engine.core.bll.LoginUserCommand] (ajp-/127.0.0.1:8702-13) Running command: LoginUserCommand internal: false.
2014-06-02 09:27:32,049 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-13) Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: User esp-ce-rhev-sched logged in.
2014-06-02 09:28:30,723 INFO  [org.ovirt.engine.core.bll.LoginUserCommand] (ajp-/127.0.0.1:8702-7) Running command: LoginUserCommand internal: false.
2014-06-02 09:28:30,728 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-7) Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: User e17081 logged in.
2014-06-02 09:31:05,963 INFO  [org.ovirt.engine.core.bll.LoginUserCommand] (ajp-/127.0.0.1:8702-14) Running command: LoginUserCommand internal: false.
2014-06-02 09:31:05,967 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-14) Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: User e17081 logged in.
2014-06-02 09:37:29,903 ERROR [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (Transaction Reaper Worker 0) Transaction rolled-back for command: org.ovirt.engine.core.bll.InternalMigrateVmCommand.
{noformat}

Migrations resume and then there's some anger:
{noformat}
2014-06-02 09:37:29,903 ERROR [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (Transaction Reaper Worker 0) Transaction rolled-back for command: org.ovirt.engine.core.bll.InternalMigrateVmCommand.
2014-06-02 09:37:29,957 INFO  [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (DefaultQuartzScheduler_Worker-13) [1a05ba2a] START, MigrateVDSCommand(HostName = uxcevrh0003d, HostId = 82b3ed8c-90d9-4956-8137-087d27651635, vmId=86c81840-adbe-42ae-bab6-6f94697a5041, srcHost=uxcevrh0003d-ctl.chicago.cme.com, dstVdsId=92988fce-943c-4029-9930-d9a3b07ad576, dstHost=uxcevrh0009d-ctl.chicago.cme.com:54321, migrationMethod=ONLINE, tunnelMigration=false), log id: 330b927d
2014-06-02 09:37:29,957 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (DefaultQuartzScheduler_Worker-13) [1a05ba2a] VdsBroker::migrate::Entered (vm_guid=86c81840-adbe-42ae-bab6-6f94697a5041, srcHost=uxcevrh0003d-ctl.chicago.cme.com, dstHost=uxcevrh0009d-ctl.chicago.cme.com:54321,  method=online
2014-06-02 09:37:29,968 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (DefaultQuartzScheduler_Worker-13) [1a05ba2a] START, MigrateBrokerVDSCommand(HostName = uxcevrh0003d, HostId = 82b3ed8c-90d9-4956-8137-087d27651635, vmId=86c81840-adbe-42ae-bab6-6f94697a5041, srcHost=uxcevrh0003d-ctl.chicago.cme.com, dstVdsId=92988fce-943c-4029-9930-d9a3b07ad576, dstHost=uxcevrh0009d-ctl.chicago.cme.com:54321, migrationMethod=ONLINE, tunnelMigration=false), log id: 668e5619
2014-06-02 09:37:29,988 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (DefaultQuartzScheduler_Worker-13) [1a05ba2a] FINISH, MigrateBrokerVDSCommand, log id: 668e5619
2014-06-02 09:37:30,002 INFO  [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (DefaultQuartzScheduler_Worker-13) [1a05ba2a] FINISH, MigrateVDSCommand, return: MigratingFrom, log id: 330b927d
2014-06-02 09:37:30,024 INFO  [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (DefaultQuartzScheduler_Worker-13) [24457ca5] Lock Acquired to object EngineLock [exclusiveLocks= key: 1fea9d17-c27a-4d14-8d57-37915b5fdffe value: VM
, sharedLocks= ]

<snip migrations happening>

2014-06-02 09:38:16,736 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-13) [4a1b7001] Correlation ID: 71576b03, Job ID: 37a0193d-6d97-4994-826a-e2a50155d88b, Call Stack: null, Custom Event ID: -1, Message: Failed to switch Host uxcevrh0003d to Maintenance mode.
<snip>
2014-06-02 09:38:20,849 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (ajp-/127.0.0.1:8702-3) [604c9e29] VdsBroker::migrate::Entered (vm_guid=6b87b1fe-664c-45c9-ac48-a6dc8adc80d6, srcHost=uxcevrh0003d-ctl.chicago.cme.com, dstHost=uxcevrh0009d-ctl.chicago.cme.com:54321,  method=online
2014-06-02 09:38:20,864 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (ajp-/127.0.0.1:8702-3) [604c9e29] Cannot perform logging: org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLException: javax.resource.ResourceException: IJ000460: Error checking for a transaction
        at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:80) [spring-jdbc.jar:3.1.1.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:573) [spring-jdbc.jar:3.1.1.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:637) [spring-jdbc.jar:3.1.1.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:666) [spring-jdbc.jar:3.1.1.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:706) [spring-jdbc.jar:3.1.1.RELEASE]
        at org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall.executeCallInternal(PostgresDbEngineDialect.java:154) [dal.jar:]
        at org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall.doExecute(PostgresDbEngineDialect.java:120) [dal.jar:]
        at org.springframework.jdbc.core.simple.SimpleJdbcCall.execute(SimpleJdbcCall.java:181) [spring-jdbc.jar:3.1.1.RELEASE]
        at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:147) [dal.jar:]
        at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeReadList(SimpleJdbcCallsHandler.java:103) [dal.jar:]
        at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeRead(SimpleJdbcCallsHandler.java:95) [dal.jar:]
        at org.ovirt.engine.core.dao.VdsStaticDAODbFacadeImpl.get(VdsStaticDAODbFacadeImpl.java:24) [dal.jar:]
        at org.ovirt.engine.core.dao.VdsStaticDAODbFacadeImpl.get(VdsStaticDAODbFacadeImpl.java:20) [dal.jar:]
        at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.getAndSetVdsStatic(VdsBrokerCommand.java:75) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.getAdditionalInformation(VdsBrokerCommand.java:66) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VDSCommandBase.toString(VDSCommandBase.java:43) [vdsbroker.jar:]
        at java.text.MessageFormat.subformat(MessageFormat.java:1271) [rt.jar:1.7.0_55]
        at java.text.MessageFormat.format(MessageFormat.java:860) [rt.jar:1.7.0_55]
        at java.text.Format.format(Format.java:157) [rt.jar:1.7.0_55]
        at java.text.MessageFormat.format(MessageFormat.java:836) [rt.jar:1.7.0_55]
        at org.ovirt.engine.core.utils.log.Log.transform(Log.java:167) [utils.jar:]
        at org.ovirt.engine.core.utils.log.Log.infoFormat(Log.java:128) [utils.jar:]
        at org.ovirt.engine.core.utils.log.LoggedUtils.log(LoggedUtils.java:162) [utils.jar:]
        at org.ovirt.engine.core.utils.log.LoggedUtils.logEntry(LoggedUtils.java:80) [utils.jar:]
        at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:25) [dal.jar:]
        at org.ovirt.engine.core.vdsbroker.MigrateVDSCommand.executeVdsIdCommand(MigrateVDSCommand.java:26) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsIdVDSCommandBase.executeVDSCommand(VdsIdVDSCommandBase.java:42) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:56) [vdsbroker.jar:]
        at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:28) [dal.jar:]
        at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:397) [vdsbroker.jar:]
        at org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.RunVdsCommand(VDSBrokerFrontendImpl.java:33) [bll.jar:]
        at org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.RunAsyncVdsCommand(VDSBrokerFrontendImpl.java:48) [bll.jar:]
        at org.ovirt.engine.core.bll.MigrateVmCommand.perform(MigrateVmCommand.java:144) [bll.jar:]
        at org.ovirt.engine.core.bll.MigrateVmCommand.executeVmCommand(MigrateVmCommand.java:132) [bll.jar:]
        at org.ovirt.engine.core.bll.VmCommand.executeCommand(VmCommand.java:87) [bll.jar:]
        at org.ovirt.engine.core.bll.InternalMigrateVmCommand.executeCommand(InternalMigrateVmCommand.java:19) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1134) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1219) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1895) [bll.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInNewTransaction(TransactionSupport.java:210) [utils.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:114) [utils.jar:]
        at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1239) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:362) [bll.jar:]
        at org.ovirt.engine.core.bll.Backend.runAction(Backend.java:416) [bll.jar:]
        at org.ovirt.engine.core.bll.Backend.runActionImpl(Backend.java:395) [bll.jar:]
        at org.ovirt.engine.core.bll.Backend.runInternalAction(Backend.java:636) [bll.jar:]
        at sun.reflect.GeneratedMethodAccessor316.invoke(Unknown Source) [:1.7.0_55]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_55]
        at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_55]
        at org.jboss.as.ee.component.ManagedReferenceMethodInterceptorFactory$ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptorFactory.java:72) [jboss-as-ee.jar:7.3.2.Final-redhat-2]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
        at org.jboss.invocation.WeavedInterceptor.processInvocation(WeavedInterceptor.java:53) [jboss-invocation.jar:1.1.2.Final-redhat-1]
        at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:58) [jboss-as-ee.jar:7.3.2.Final-redhat-2]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
        at org.jboss.invocation.WeavedInterceptor.processInvocation(WeavedInterceptor.java:53) [jboss-invocation.jar:1.1.2.Final-redhat-1]
        at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:58) [jboss-as-ee.jar:7.3.2.Final-redhat-2]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
        at

<snip>

        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:336)
        at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:490)
        at org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(AjpProtocol.java:420)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:926)
        at java.lang.Thread.run(Thread.java:744) [rt.jar:1.7.0_55]
Caused by: java.sql.SQLException: javax.resource.ResourceException: IJ000460: Error checking for a transaction
        at org.jboss.jca.adapters.jdbc.WrapperDataSource.getConnection(WrapperDataSource.java:147)
        at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:111) [spring-jdbc.jar:3.1.1.RELEASE]
        at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:77) [spring-jdbc.jar:3.1.1.RELEASE]
        ... 235 more
Caused by: javax.resource.ResourceException: IJ000460: Error checking for a transaction
        at org.jboss.jca.core.connectionmanager.tx.TxConnectionManagerImpl.getManagedConnection(TxConnectionManagerImpl.java:362)
        at org.jboss.jca.core.connectionmanager.AbstractConnectionManager.allocateConnection(AbstractConnectionManager.java:464)
        at org.jboss.jca.adapters.jdbc.WrapperDataSource.getConnection(WrapperDataSource.java:139)
        ... 237 more
Caused by: javax.resource.ResourceException: IJ000459: Transaction is not active: tx=TransactionImple < ac, BasicAction: 0:ffff0adc591f:79300199:5366498f:41579f status: ActionStatus.ABORTED >
        at org.jboss.jca.core.connectionmanager.tx.TxConnectionManagerImpl.getManagedConnection(TxConnectionManagerImpl.java:352)
        ... 239 more

2014-06-02 09:38:20,878 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (ajp-/127.0.0.1:8702-3) [604c9e29] FINISH, MigrateBrokerVDSCommand, log id: 7fd20554
2014-06-02 09:38:20,878 ERROR [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (ajp-/127.0.0.1:8702-3) [604c9e29] Command MigrateVDS execution failed. Exception: CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLException: javax.resource.ResourceException: IJ000460: Error checking for a transaction
2014-06-02 09:38:20,878 INFO  [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (ajp-/127.0.0.1:8702-3) [604c9e29] FINISH, MigrateVDSCommand, log id: 3a8b3a04
2014-06-02 09:38:20,880 INFO  [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (ajp-/127.0.0.1:8702-3) [604c9e29] Lock freed to object EngineLock [exclusiveLocks= key: 6b87b1fe-664c-45c9-ac48-a6dc8adc80d6 value: VM
, sharedLocks= ]
2014-06-02 09:38:20,881 ERROR [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (ajp-/127.0.0.1:8702-3) [604c9e29] ResourceManager::vdsMaintenance - Failed migrating desktop ffnrsa0001d
{noformat}


Then more unhappiness:
{noformat}
2014-06-02 09:38:37,901 INFO  [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (ajp-/127.0.0.1:8702-3) [30481654] Failed to Acquire Lock to object EngineLock [exclusiveLocks= key: dc93fb6c-77cf-4e81-80fb-d59a66f76d6c value: VM
, sharedLocks= ]
2014-06-02 09:38:37,902 WARN  [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (ajp-/127.0.0.1:8702-3) [30481654] CanDoAction of action InternalMigrateVm failed. Reasons:VAR__ACTION__MIGRATE,VAR__TYPE__VM,ACTION_TYPE_FAILED_VM_IS_BEING_MIGRATED,$VmName fanrcuo0001d
2014-06-02 09:38:37,902 ERROR [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (ajp-/127.0.0.1:8702-3) [30481654] ResourceManager::vdsMaintenance - Failed migrating desktop fanrcuo0001d
2014-06-02 09:38:37,906 INFO  [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (ajp-/127.0.0.1:8702-3) [2cf8ea02] Failed to Acquire Lock to object EngineLock [exclusiveLocks= key: 40202816-fd02-4549-8df8-a18b08b5d144 value: VM
, sharedLocks= ]
2014-06-02 09:38:37,906 WARN  [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (ajp-/127.0.0.1:8702-3) [2cf8ea02] CanDoAction of action InternalMigrateVm failed. Reasons:VAR__ACTION__MIGRATE,VAR__TYPE__VM,ACTION_TYPE_FAILED_VM_IS_BEING_MIGRATED,$VmName fanrnmf0005d
2014-06-02 09:38:37,906 ERROR [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (ajp-/127.0.0.1:8702-3) [2cf8ea02] ResourceManager::vdsMaintenance - Failed migrating desktop fanrnmf0005d
2014-06-02 09:38:37,910 INFO  [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (ajp-/127.0.0.1:8702-3) [6bd77496] Failed to Acquire Lock to object EngineLock [exclusiveLocks= key: b489997a-b261-49cd-9bfc-c0b325045b5d value: VM
, sharedLocks= ]
2014-06-02 09:38:37,910 WARN  [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (ajp-/127.0.0.1:8702-3) [6bd77496] CanDoAction of action InternalMigrateVm failed. Reasons:VAR__ACTION__MIGRATE,VAR__TYPE__VM,ACTION_TYPE_FAILED_VM_IS_BEING_MIGRATED,$VmName ffcead0103d
2014-06-02 09:38:37,910 ERROR [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (ajp-/127.0.0.1:8702-3) [6bd77496] ResourceManager::vdsMaintenance - Failed migrating desktop ffcead0103d
2014-06-02 09:38:37,914 INFO  [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (ajp-/127.0.0.1:8702-3) [1c84e72f] Failed to Acquire Lock to object EngineLock [exclusiveLocks= key: a73685ab-d211-4096-8678-82af04f5732c value: VM
, sharedLocks= ]

<snip more migration activity>

2014-06-02 09:40:58,574 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager] (DefaultQuartzScheduler_Worker-41) Failed to refresh VDS , vds = 82b3ed8c-90d9-4956-8137-087d27651635 : uxcevrh0003d, error = java.lang.NullPointerException, continuing.: java.lang.NullPointerException
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.handOverVM(VdsUpdateRunTimeInfo.java:1782) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.removeVmsFromCache(VdsUpdateRunTimeInfo.java:1749) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.updateRepository(VdsUpdateRunTimeInfo.java:1721) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.refreshVmStats(VdsUpdateRunTimeInfo.java:923) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.refreshVdsRunTimeInfo(VdsUpdateRunTimeInfo.java:496) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.Refresh(VdsUpdateRunTimeInfo.java:334) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsManager.OnTimer(VdsManager.java:236) [vdsbroker.jar:]
        at sun.reflect.GeneratedMethodAccessor303.invoke(Unknown Source) [:1.7.0_55]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_55]
        at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_55]
        at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:60) [scheduler.jar:]
        at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557) [quartz.jar:]

2014-06-02 09:41:00,381 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FullListVdsCommand] (DefaultQuartzScheduler_Worker-8) [5d3b3f3] START, FullListVdsCommand(HostName = uxcevrh0009d, HostId = 92988fce-943c-4029-9930-d9a3b07ad576, vds=Host[uxcevrh0009d], vmIds=[6b87b1fe-664c-45c9-ac48-a6dc8adc80d6]), log id: 71cd5a1b
2014-06-02 09:41:00,388 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FullListVdsCommand] (DefaultQuartzScheduler_Worker-8) [5d3b3f3] FINISH, FullListVdsCommand, return: [Ljava.util.HashMap;@60d6dcfb, log id: 71cd5a1b
2014-06-02 09:41:01,612 INFO  [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-43) [521dd5d9] Updated vds status from Preparing for Maintenance to Maintenance in database,  vds = 82b3ed8c-90d9-4956-8137-087d27651635 : uxcevrh0003d
2014-06-02 09:41:01,620 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (pool-4-thread-41) Clearing cache of pool: 46e674e2-ae9b-4cd0-8480-9885a74821f4 for problematic entities of VDS: uxcevrh0003d.
2014-06-02 09:41:01,633 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStoragePoolVDSCommand] (DefaultQuartzScheduler_Worker-43) [521dd5d9] START, DisconnectStoragePoolVDSCommand(HostName = uxcevrh0003d, HostId = 82b3ed8c-90d9-4956-8137-087d27651635, storagePoolId = 46e674e2-ae9b-4cd0-8480-9885a74821f4, vds_spm_id = 3), log id: 3c62b587
2014-06-02 09:41:05,652 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStoragePoolVDSCommand] (DefaultQuartzScheduler_Worker-43) [521dd5d9] FINISH, DisconnectStoragePoolVDSCommand, log id: 3c62b587
{noformat}

At the end the hypervisor has made it to the maintenance state:
{noformat}
[RHEVM shell (connected)]# show host uxcevrh0003d | grep status
status-state                          : maintenance
{noformat}
All of the guests that were on uxcevrh0003d appear to have survived migration:
{noformat}
[.../trunk/bin]$ ./rhev-info.py --rhevapi https://rhev-ce-dc3-01.chicago.cme.com/api --rhevcert /etc/pki/cme/ca-bundle.crt --rhevuser e17081@prod.ad.merc.chicago.cme.com --guest ffnrsa0001d --guest fanreuo0001d --guest fanrcuo0001d --guest fanrnmf0005d --guest ffcead0103d --guest mmnrhh0001d --guest clnrap0001ld --guest facefas0001d --guest clnrotc0005ld --guest clcepos0001ld --guest clcecps0001d --guest mmnril0001d --guest clnrdlv0001ld --guest epscesb0001d --guest clnrotc0001ld --guest gcnrest0001d --guest rgnrlt0001d --guest tsnrweb0001d --guest ffcemg0001d --guest fanrago0001d --guest ffnrad0103d --guest facenmf0201d > /tmp/guestdata
[.../trunk/bin]$ egrep '^[^ +]|state: |hyper: ' /tmp/guestdata
ffnrsa0001d (6b87b1fe-664c-45c9-ac48-a6dc8adc80d6)
  state: up [2014-05-21T08:32:20.998000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
fanreuo0001d (d7233803-52ea-4663-8cf0-94651741d1b6)
  state: up [2014-05-21T08:29:11.158000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
fanrcuo0001d (dc93fb6c-77cf-4e81-80fb-d59a66f76d6c)
  state: up [2014-05-21T09:03:38.100000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
fanrnmf0005d (40202816-fd02-4549-8df8-a18b08b5d144)
  state: up [2014-05-28T14:57:59.006000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
ffcead0103d (b489997a-b261-49cd-9bfc-c0b325045b5d)
  state: up [2014-05-21T08:30:32.074000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
mmnrhh0001d (a73685ab-d211-4096-8678-82af04f5732c)
  state: up [2014-05-21T09:04:04.665000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
clnrap0001ld (075abf30-a85e-4318-951a-0767cd4d3b5b)
  state: up [2014-05-20T16:50:38.580000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
facefas0001d (c974c270-a7c1-443d-9f2b-d76f20ee9c06)
  state: up [2014-05-28T14:58:36.959000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
clnrotc0005ld (fd4a9132-1244-4acb-8eb2-66eeb84796d1)
  state: up [2014-05-21T08:24:51.321000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
clcepos0001ld (3fb5d843-59e6-46de-aa57-338002f8ca2c)
  state: up [2014-05-20T16:49:58.722000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
clcecps0001d (4f95e6b7-0444-4b0f-bcb4-91415160e34a)
  state: up [2014-05-20T16:49:21.970000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
mmnril0001d (26b6bee5-2dbe-4484-8e36-02b0015f53d7)
  state: up [2014-05-21T08:34:42.423000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
clnrdlv0001ld (b0830452-14bd-4934-907b-04ee4d943903)
  state: up [2014-05-21T08:23:54.877000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
epscesb0001d (e6e3c11b-6b24-4d47-bd44-5575753df629)
  state: up [2014-05-21T08:26:34.285000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
clnrotc0001ld (f3039595-ed8d-4906-9089-2ddc8ad308b8)
  state: up [2014-05-28T14:58:35.328000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
gcnrest0001d (98fa2a70-e085-4f70-9fc1-261baa06b801)
  state: up [2014-05-21T08:32:52.899000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
rgnrlt0001d (c8a023c1-90f6-4934-9321-0786a216c36c)
  state: up [2014-05-21T08:35:25.907000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
tsnrweb0001d (703323b8-e3b5-48e3-b4f5-e88f05152728)
  state: up [2014-05-21T08:36:25.881000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
ffcemg0001d (b47c677e-7cf5-4f4a-be98-986c1c8e2b4d)
  state: up [2014-05-21T08:31:02.961000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
fanrago0001d (822ec100-b652-4ade-adea-16884e352831)
  state: up [2014-05-21T08:28:27.011000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
ffnrad0103d (844692fd-52f0-4e12-bfe1-a6389fc6c955)
  state: up [2014-05-21T08:31:23.147000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
facenmf0201d (8d01f41f-2eb3-4459-b760-1300a1b472e0)
  state: up [2014-05-21T08:27:53.737000-05:00]
  hyper: uxcevrh0009d (92988fce-943c-4029-9930-d9a3b07ad576)
{noformat}

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 2 Allon Mureinik 2014-06-07 07:58:56 UTC
The NPE seems to originate from the flow that resumes a VM:

<snip>
2014-06-02 09:40:58,574 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager] (DefaultQuartzScheduler_Worker-41) Failed to refresh VDS , vds = 82b3ed8c-90d9-4956-8137-087d27651635 : uxcevrh0003d, error = java.lang.NullPointerException, continuing.: java.lang.NullPointerException
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.handOverVM(VdsUpdateRunTimeInfo.java:1782) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.removeVmsFromCache(VdsUpdateRunTimeInfo.java:1749) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.updateRepository(VdsUpdateRunTimeInfo.java:1721) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.refreshVmStats(VdsUpdateRunTimeInfo.java:923) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.refreshVdsRunTimeInfo(VdsUpdateRunTimeInfo.java:496) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.Refresh(VdsUpdateRunTimeInfo.java:334) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsManager.OnTimer(VdsManager.java:236) [vdsbroker.jar:]
</snip>

Michal, can one of your guys take a look please?

Comment 5 Omer Frenkel 2014-06-16 07:22:31 UTC
This is the same case as in bug 1097256
multiple maintenance (one from user and one from automated mechanism) creates a race, and since migrations are using transactions it can create deadlocks, mis-configuration in the db and eventually the above NPE.

closing as duplicate as the other one has wider analysis and information,
see bug 1097256#c12
and reference for the fix in 3.3

*** This bug has been marked as a duplicate of bug 1097256 ***


Note You need to log in before you can comment on or make changes to this bug.