Created attachment 665382
log and db dump
Description of problem:
In a 3-host cluster with NFS storage, I blocked connectivity to the storage from all the hosts.
During VM migration we get an SQL error when trying to save the step for the re-run of the VM migration.
Version-Release number of selected component (if applicable):
si25.1
How reproducible:
100%
Steps to Reproduce:
1. In a 3-host cluster with NFS storage, create ~40 VMs and run them on all hosts.
2. Block connectivity to the storage domain from all the hosts.
Actual results:
When we try to re-run a VM, we get an SQL error ("Failed to save step").
Expected results:
We should be able to update the table on re-run.
Additional info: logs and db dump
2012-12-18 10:21:44,518 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-50) Command org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand return value
Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusOnlyReturnForXmlRpc
mStatus Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc
mCode 12
mMessage Fatal error during migration
2012-12-18 10:21:44,518 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-50) HostName = gold-vdsd
2012-12-18 10:21:44,518 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-4-thread-50) Command MigrateStatusVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to MigrateStatusVDS, error = Fatal error during migration
2012-12-18 10:21:44,518 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (pool-4-thread-50) FINISH, MigrateStatusVDSCommand, log id: 25f29f8d
2012-12-18 10:21:44,565 ERROR [org.ovirt.engine.core.bll.job.JobRepositoryImpl] (pool-4-thread-50) Failed to save step ae3965a6-da21-4a16-898f-ff68e685ec5e, VALIDATING.: org.springframework.dao.DataIntegrityViolationException: CallableStatementCallback; SQL [{call insertstep(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)}]; ERROR: insert or update on table "step" violates foreign key constraint "fk_step_job"
Detail: Key (job_id)=(699b46e5-8c91-454b-a6de-aba726e02ba6) is not present in table "job".
Where: SQL statement "INSERT INTO step( step_id, parent_step_id, job_id, step_type, description, step_number, status, start_time, end_time, correlation_id, external_id, external_system_type) VALUES ( $1 , $2 , $3 , $4 , $5 , $6 , $7 , $8 , $9 , $10 , $11 , $12 )"
PL/pgSQL function "insertstep" line 2 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: insert or update on table "step" violates foreign key constraint "fk_step_job"
Detail: Key (job_id)=(699b46e5-8c91-454b-a6de-aba726e02ba6) is not present in table "job".
Where: SQL statement "INSERT INTO step( step_id, parent_step_id, job_id, step_type, description, step_number, status, start_time, end_time, correlation_id, external_id, external_system_type) VALUES ( $1 , $2 , $3 , $4 , $5 , $6 , $7 , $8 , $9 , $10 , $11 , $12 )"
PL/pgSQL function "insertstep" line 2 at SQL statement
at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:245) [spring-jdbc-3.1.1.RELEASE.jar:3.1.1.RELEASE]
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:72) [spring-jdbc-3.1.1.RELEASE.jar:3.1.1.RELEASE]
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:1030) [spring-jdbc-3.1.1.RELEASE.jar:3.1.1.RELEASE]
at org.springframework.jdbc.core.JdbcTemplate.call(JdbcTemplate.java:1064) [spring-jdbc-3.1.1.RELEASE.jar:3.1.1.RELEASE]
at org.springframework.jdbc.core.simple.AbstractJdbcCall.executeCallInternal(AbstractJdbcCall.java:388) [spring-jdbc-3.1.1.RELEASE.jar:3.1.1.RELEASE]
at org.springframework.jdbc.core.simple.AbstractJdbcCall.doExecute(AbstractJdbcCall.java:351) [spring-jdbc-3.1.1.RELEASE.jar:3.1.1.RELEASE]
at org.springframework.jdbc.core.simple.SimpleJdbcCall.execute(SimpleJdbcCall.java:181) [spring-jdbc-3.1.1.RELEASE.jar:3.1.1.RELEASE]
at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:124) [engine-dal.jar:]
at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeModification(SimpleJdbcCallsHandler.java:37) [engine-dal.jar:]
at org.ovirt.engine.core.dao.DefaultGenericDaoDbFacade.save(DefaultGenericDaoDbFacade.java:93) [engine-dal.jar:]
at org.ovirt.engine.core.bll.job.JobRepositoryImpl$1.runInTransaction(JobRepositoryImpl.java:55) [engine-bll.jar:]
at org.ovirt.engine.core.bll.job.JobRepositoryImpl$1.runInTransaction(JobRepositoryImpl.java:49) [engine-bll.jar:]
at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInNewTransaction(TransactionSupport.java:204) [engine-utils.jar:]
at org.ovirt.engine.core.bll.job.JobRepositoryImpl.saveStep(JobRepositoryImpl.java:49) [engine-bll.jar:]
at org.ovirt.engine.core.bll.job.ExecutionHandler.addSubStep(ExecutionHandler.java:318) [engine-bll.jar:]
at org.ovirt.engine.core.bll.job.ExecutionHandler.addStep(ExecutionHandler.java:269) [engine-bll.jar:]
at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:284) [engine-bll.jar:]
at org.ovirt.engine.core.bll.RunVmCommandBase.rerunInternal(RunVmCommandBase.java:241) [engine-bll.jar:]
at org.ovirt.engine.core.bll.MigrateVmCommand.rerunInternal(MigrateVmCommand.java:222) [engine-bll.jar:]
at org.ovirt.engine.core.bll.RunVmCommandBase$1.run(RunVmCommandBase.java:212) [engine-bll.jar:]
at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:64) [engine-utils.jar:]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_09-icedtea]
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) [rt.jar:1.7.0_09-icedtea]
at java.util.concurrent.FutureTask.run(FutureTask.java:166) [rt.jar:1.7.0_09-icedtea]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) [rt.jar:1.7.0_09-icedtea]
I tried a couple of times to reproduce it with the latest build on the rhev-3.1 branch d/s, in a 3-host environment with a pool of 30 VMs: I ran all the VMs and blocked the connection to the storage from all hosts.
The result was:
1. The SPM became non-operational.
2. There were 8 reruns because of failed migrations. However, the cause of the failure was "Error creating the requested virtual machine", not "Fatal error during migration" as in the log above.
3. No DataIntegrityViolationException and no "Failed to save step" error.
Dafna, I'll need more help to reproduce it.
In reply to comment #4:
Arik, I would also try to set up si25.1 (i.e., use its engine code) and try to reproduce.
Maybe the bug was fixed by work on other bugs/features.