Created attachment 843118
engine sos report and vdsm.log

Description of problem:
I tried to create multiple VMs (15) from a template (cloned VMs from a template). The engine failed to complete the creation of some of the images, with a postgres deadlock error for AddVmFromTemplate. After the failure, the VMs' disks remained 'locked'.

Version-Release number of selected component (if applicable):
is29
rhevm-3.3.0-0.42.el6ev.noarch
postgresql-8.4.13-1.el6_3.x86_64

How reproducible:
Not sure

Steps to Reproduce:
1. Create a template from a VM with 2 disks.
2. Create 15 VMs from the template (manually from the UI, or using a script on REST or the SDK) with cloned provisioning.

Actual results:
The engine failed in AddVmFromTemplate because of a deadlock in the database. Apparently, two processes were trying to perform the same operation on the same table in the DB.

The error as presented in engine.log:

2013-12-29 17:28:36,489 ERROR [org.ovirt.engine.core.bll.CommandAsyncTask] (pool-5-thread-46) [within thread]: EndAction for action type AddVmFromTemplate threw an exception.: javax.ejb.EJBTransactionRolledbackException: CallableStatementCallback; SQL [{call updateimagestatus(?, ?)}]; ERROR: deadlock detected
  Detail: Process 8282 waits for ShareLock on transaction 1767434; blocked by process 7678.
Process 7678 waits for ShareLock on transaction 1767433; blocked by process 8282.
  Hint: See server log for query details.
  Where: SQL statement "UPDATE images SET imageStatus = $1 WHERE image_guid = $2 "
PL/pgSQL function "updateimagestatus" line 2 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: deadlock detected
  Detail: Process 8282 waits for ShareLock on transaction 1767434; blocked by process 7678.
Process 7678 waits for ShareLock on transaction 1767433; blocked by process 8282.
  Hint: See server log for query details.
  Where: SQL statement "UPDATE images SET imageStatus = $1 WHERE image_guid = $2 "
PL/pgSQL function "updateimagestatus" line 2 at SQL statement

After that, the disks of those VMs remain in 'locked' status.

Additional info:
engine sos report and vdsm.log
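For readers unfamiliar with the "Detail" lines above: each process is waiting for the other's transaction to finish, which is a lock-wait cycle. Such a cycle typically arises when two transactions lock the same rows in a different order. The following is only a minimal Python sketch of that pattern and of the usual remedy (always locking in one global order). It is not engine code and does not claim to show how the engine resolves the problem; `row_locks`, `status`, and `update_image_statuses` are hypothetical names, and in-process locks stand in for database row locks.

```python
import threading

# Hypothetical stand-ins for two rows of the images table; each lock plays
# the role of the row lock the database takes for an UPDATE.
row_locks = {"image-a": threading.Lock(), "image-b": threading.Lock()}
status = {"image-a": "LOCKED", "image-b": "LOCKED"}


def update_image_statuses(image_ids, new_status):
    """Update several rows in one 'transaction', locking in sorted order."""
    ordered = sorted(image_ids)  # the same global order for every caller
    for image_id in ordered:
        row_locks[image_id].acquire()
    try:
        for image_id in ordered:
            status[image_id] = new_status
    finally:
        for image_id in reversed(ordered):
            row_locks[image_id].release()


def run_concurrently():
    """Run two updaters that pass the ids in opposite order."""
    # Opposite caller order is the pattern that can deadlock when rows are
    # locked in caller order; sorting inside the function removes the cycle.
    t1 = threading.Thread(
        target=update_image_statuses, args=(["image-a", "image-b"], "OK"))
    t2 = threading.Thread(
        target=update_image_statuses, args=(["image-b", "image-a"], "OK"))
    t1.start()
    t2.start()
    t1.join(timeout=5)
    t2.join(timeout=5)
    # True when both threads finished, i.e. no deadlock occurred.
    return not t1.is_alive() and not t2.is_alive()
```

Removing the `sorted()` call makes the two threads take the locks in opposite order, which can hang in the same way the two backend processes did here.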
This bug can block this PRD:
https://bugzilla.redhat.com/show_bug.cgi?id=815642

Therefore, I added the ? flag for rhevm-3.3.0.
Also, the VMs remain 'locked'.
is this a regression from 3.2 for same use case/test?
(In reply to Itamar Heim from comment #4)
> is this a regression from 3.2 for same use case/test?

Creating multiple VMs from a template at the same time wasn't possible before 3.3 because of the read-lock for the template images. This is a new feature that was introduced in 3.3:
https://bugzilla.redhat.com/show_bug.cgi?id=815642
ok, so how was 815642 verified? does it work with 2 concurrent VMs? 6? 10?
(In reply to Itamar Heim from comment #6)
> ok, so how was 815642 verified? does it work with 2 concurrent VMs? 6? 10?

My test was with 15 VMs with 2 disks each. It failed after a few VMs. I'm not sure exactly how many; it's hard to trace that in the log.
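Since the failures are hard to trace by eye, a small script can count them. The following is a rough Python sketch, assuming that the "EndAction for action type ... threw an exception" text and "ERROR: deadlock detected" appear on the same line of engine.log, as they do in the excerpt quoted in this report. The function name `count_deadlocked_actions` is hypothetical.

```python
import re
from collections import Counter

# Matches the engine.log pattern quoted in this report, e.g.
# "... EndAction for action type AddVmFromTemplate threw an exception ..."
END_ACTION_RE = re.compile(
    r"EndAction for action type (\w+) threw an exception")


def count_deadlocked_actions(lines):
    """Count EndAction failures whose log line mentions a detected deadlock."""
    counts = Counter()
    for line in lines:
        match = END_ACTION_RE.search(line)
        # Assumption: the deadlock message sits on the same line as the
        # EndAction error header.
        if match and "deadlock detected" in line:
            counts[match.group(1)] += 1
    return counts
```

To use it, open engine.log and pass the file object as `lines`; the result maps each action type (for example AddVmFromTemplate) to the number of deadlock failures logged for it.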
I couldn't reproduce this bug on master. It turned out that it was fixed by http://gerrit.ovirt.org/#/c/21100. I'll backport this simple patch to 3.3.
http://gerrit.ovirt.org/gitweb?p=ovirt-engine.git;a=commit;h=205d8bbbadd374cb0a620822e3717320641741ed
Reproduced using is30.

From engine.log:
--------------------
2014-01-09 17:17:06,260 ERROR [org.ovirt.engine.core.bll.CommandAsyncTask] (pool-5-thread-42) [within thread]: EndAction for action type AddVmFromTemplate threw an exception.: javax.ejb.EJBTransactionRolledbackException: CallableStatementCallback; SQL [{call updateimagestatus(?, ?)}]; ERROR: deadlock detected
PL/pgSQL function "updateimagestatus" line 2 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: deadlock detected
Caused by: org.springframework.dao.DeadlockLoserDataAccessException: CallableStatementCallback; SQL [{call updateimagestatus(?, ?)}]; ERROR: deadlock detected
PL/pgSQL function "updateimagestatus" line 2 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: deadlock detected
Caused by: org.postgresql.util.PSQLException: ERROR: deadlock detected

Verified using is31:
No deadlock in the logs.
VMs are created much faster than with is30.
Closing - RHEV 3.3 Released