Created attachment 555515 [details]
logs

Description of problem:
When I tried to move a host to maintenance, it got stuck in "preparing for maintenance" status. The following exception appeared a few minutes later.

2012-01-16 15:22:34,795 ERROR [org.ovirt.engine.core.bll.ActivateVdsCommand] (pool-5-thread-50) Command org.ovirt.engine.core.bll.ActivateVdsCommand throw exception: org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLException: javax.resource.ResourceException: Error checking for a transaction
    at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:82) [spring-jdbc-2.5.6.SEC02.jar:]
    at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:577) [spring-jdbc-2.5.6.SEC02.jar:]
    at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:641) [spring-jdbc-2.5.6.SEC02.jar:]
    at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:670) [spring-jdbc-2.5.6.SEC02.jar:]
    at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:702) [spring-jdbc-2.5.6.SEC02.jar:]
    at org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall.executeCallInternal(PostgresDbEngineDialect.java:155) [engine-dal.jar:]
    at org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall.doExecute(PostgresDbEngineDialect.java:121) [engine-dal.jar:]
    at org.springframework.jdbc.core.simple.SimpleJdbcCall.execute(SimpleJdbcCall.java:164) [spring-jdbc-2.5.6.SEC02.jar:]
    at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:112) [engine-dal.jar:]
    at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeReadAndReturnMap(SimpleJdbcCallsHandler.java:63) [engine-dal.jar:]
    at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeReadList(SimpleJdbcCallsHandler.java:54) [engine-dal.jar:]
    at org.ovirt.engine.core.dao.NetworkDAODbFacadeImpl.getAllForCluster(NetworkDAODbFacadeImpl.java:154) [engine-dal.jar:]
    at org.ovirt.engine.core.bll.ActivateVdsCommand.executeCommand(ActivateVdsCommand.java:61) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.ExecuteWithoutTransaction(CommandBase.java:617) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:709) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:978) [engine-bll.jar:]
    at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInNewTransaction(TransactionSupport.java:204) [utils-3.0.0-0001.jar:]
    at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInRequired(TransactionSupport.java:142) [utils-3.0.0-0001.jar:]
    at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:109) [utils-3.0.0-0001.jar:]
    at org.ovirt.engine.core.bll.CommandBase.Execute(CommandBase.java:722) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.ExecuteAction(CommandBase.java:209) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.MultipleActionsRunner.RunCommands(MultipleActionsRunner.java:140) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.MultipleActionsRunner$1.run(MultipleActionsRunner.java:61) [engine-bll.jar:]
    at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:57) [utils-3.0.0-0001.jar:]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [:1.6.0_22]

Version-Release number of selected component (if applicable):
ovirt-engine-backend-3.0.0_0001-1.1.fc16.x86_64

How reproducible:

Steps to Reproduce:
1. Try to move a host to maintenance

Actual results:
Exception when trying to move the host to maintenance

Expected results:
Host should move to maintenance

Additional info:
When a host is moved to maintenance, it first moves to preparing-for-maintenance. In this status the host is polled. In this case there is no answer from the host, so it takes 9 minutes for the host to move to non-responsive (from that status it can easily move to maintenance, as happens at the end of the attached log).

In this flow there are attempts to activate the host during those 9 minutes, but since the transaction timeout is (for some reason) 5 minutes, you see this failure. The default timeout should be 10 minutes AFAIK; if it is not, consider opening a bug, as it doesn't work well.

Please check whether this reproduces with a 10-minute transaction timeout.
It seems the default value is 5 minutes. In the file /usr/share/jboss-as-7.1.0.Beta1b/standalone/configuration/standalone.xml:

    <subsystem xmlns="urn:jboss:domain:transactions:1.1">
        <core-environment>
            <process-id>
                <uuid/>
            </process-id>
        </core-environment>
        <recovery-environment socket-binding="txn-recovery-environment" status-socket-binding="txn-status-manager"/>
        <coordinator-environment default-timeout="300"/>
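
For anyone who wants to check their own installation, here is a minimal sketch that reads the timeout from the transactions subsystem and compares it with the 9-minute window mentioned above. It assumes `default-timeout` is expressed in seconds (consistent with 300 being 5 minutes) and that the subsystem element is closed as shown; the names `default_timeout_seconds`, `outlasts` and `SAMPLE_SUBSYSTEM` are made up for illustration and are not part of oVirt or JBoss.

```python
import xml.etree.ElementTree as ET

# Namespace of the transactions subsystem as it appears in the excerpt above.
TX_NS = "urn:jboss:domain:transactions:1.1"

# Copy of the excerpt; the closing </subsystem> tag is an assumption added
# only so that the sample parses as well-formed XML.
SAMPLE_SUBSYSTEM = """
<subsystem xmlns="urn:jboss:domain:transactions:1.1">
    <core-environment>
        <process-id>
            <uuid/>
        </process-id>
    </core-environment>
    <recovery-environment socket-binding="txn-recovery-environment"
                          status-socket-binding="txn-status-manager"/>
    <coordinator-environment default-timeout="300"/>
</subsystem>
"""


def default_timeout_seconds(subsystem_xml: str) -> int:
    """Return the coordinator default-timeout from a transactions subsystem element."""
    root = ET.fromstring(subsystem_xml)
    node = root.find(f"{{{TX_NS}}}coordinator-environment")
    if node is None or node.get("default-timeout") is None:
        raise ValueError("no coordinator-environment default-timeout found")
    return int(node.get("default-timeout"))


def outlasts(timeout_s: int, window_s: int) -> bool:
    """True if the transaction timeout is longer than the given window."""
    return timeout_s > window_s


if __name__ == "__main__":
    timeout = default_timeout_seconds(SAMPLE_SUBSYSTEM)
    window = 9 * 60  # time until the host is marked non-responsive, per the comment above
    print(f"default-timeout = {timeout}s, outlasts {window}s window: {outlasts(timeout, window)}")
```

Under these assumptions a value of 600 (10 minutes) would outlast the 9-minute window, while the current 300 does not.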
Are we sure the default T/O should be 10 minutes? Sounds a bit long.
Closing old bugs. If this issue is still relevant/important in current version, please re-open the bug.