Bug 1047163
Summary: | [engine-backend] deadlock in postgres during multiple AddVmFromTemplate threads | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Elad <ebenahar> | ||||
Component: | ovirt-engine | Assignee: | Arik <ahadas> | ||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | meital avital <mavital> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | urgent | ||||||
Version: | 3.3.0 | CC: | acanan, acathrow, amureini, ebenahar, eedri, iheim, lpeer, michal.skrivanek, ofrenkel, pstehlik, Rhev-m-bugs, sherold, yeylon | ||||
Target Milestone: | --- | ||||||
Target Release: | 3.3.0 | ||||||
Hardware: | x86_64 | ||||||
OS: | Unspecified | ||||||
Whiteboard: | virt | ||||||
Fixed In Version: | is31 | Doc Type: | Bug Fix | ||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | Type: | Bug | |||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | |||||||
Bug Blocks: | 1056111 | ||||||
Attachments: |
This bug can block this PRD: https://bugzilla.redhat.com/show_bug.cgi?id=815642
Therefore, added the ? flag of rhevm-3.3.0.

Also, the VMs remain 'locked'.

Is this a regression from 3.2 for the same use case/test?

(In reply to Itamar Heim from comment #4)
> is this a regression from 3.2 for same use case/test?

Multiple creation of VMs from a template wasn't possible before 3.3 because of the read-lock for the template images. This is a new feature that was introduced in 3.3: https://bugzilla.redhat.com/show_bug.cgi?id=815642

OK, so how was 815642 verified? Does it work with 2 concurrent VMs? 6? 10?

(In reply to Itamar Heim from comment #6)
> ok, so how was 815642 verified? does it work with 2 concurrent VMs? 6? 10?

My test was with 15 VMs with 2 disks each. It failed after a few VMs; I'm not sure exactly how many, because it's hard to trace that in the log.

I couldn't reproduce this bug on master. It turned out that it was fixed by http://gerrit.ovirt.org/#/c/21100. I'll backport this simple patch to 3.3.

Reproduced using is30.

From engine.log:
--------------------
2014-01-09 17:17:06,260 ERROR [org.ovirt.engine.core.bll.CommandAsyncTask] (pool-5-thread-42) [within thread]: EndAction for action type AddVmFromTemplate threw an exception.: javax.ejb.EJBTransactionRolledbackException: CallableStatementCallback; SQL [{call updateimagestatus(?, ?)}]; ERROR: deadlock detected
  PL/pgSQL function "updateimagestatus" line 2 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: deadlock detected
Caused by: org.springframework.dao.DeadlockLoserDataAccessException: CallableStatementCallback; SQL [{call updateimagestatus(?, ?)}]; ERROR: deadlock detected
  PL/pgSQL function "updateimagestatus" line 2 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: deadlock detected
Caused by: org.postgresql.util.PSQLException: ERROR: deadlock detected

Verified using is31. No deadlock in the logs. VMs are created much faster than with is30.

Closing - RHEV 3.3 Released
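The "no deadlock in logs" check used for verification above can be sketched as a small log scan. This is only an illustration, not the procedure QA actually ran: the function name `deadlocks_by_action` is hypothetical, and the sketch assumes each failing EndAction entry sits on a single log line that carries both the action type and the text "deadlock detected", as in the engine.log excerpt above.

```python
import re

# Matches the action type in an EndAction failure entry, as seen in engine.log.
ACTION_RE = re.compile(r"EndAction for action type (\w+) threw an exception")


def deadlocks_by_action(lines):
    """Count EndAction failures caused by a database deadlock, per action type.

    Hypothetical helper: only lines that contain both the EndAction message
    and "deadlock detected" are counted, so "Caused by:" continuation lines
    that repeat the error are ignored.
    """
    counts = {}
    for line in lines:
        if "deadlock detected" not in line:
            continue
        match = ACTION_RE.search(line)
        if match is None:
            continue
        action = match.group(1)
        counts[action] = counts.get(action, 0) + 1
    return counts


# Sample lines modelled on the excerpt above (shortened).
SAMPLE_LINES = [
    "2014-01-09 17:17:06,260 ERROR [org.ovirt.engine.core.bll.CommandAsyncTask] "
    "(pool-5-thread-42) [within thread]: EndAction for action type "
    "AddVmFromTemplate threw an exception.: "
    "javax.ejb.EJBTransactionRolledbackException: CallableStatementCallback; "
    "SQL [{call updateimagestatus(?, ?)}]; ERROR: deadlock detected",
    "Caused by: org.postgresql.util.PSQLException: ERROR: deadlock detected",
]

print(deadlocks_by_action(SAMPLE_LINES))
```

An empty result for a run of the reproduction steps would correspond to the "no deadlock in logs" observation; reading the lines from an actual engine.log file is left out because the log path depends on the installation.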
Created attachment 843118 [details]
engine sos report and vdsm.log

Description of problem:
I tried to create multiple VMs (15) from a template (cloned VMs from a template). The engine failed to complete the creation of some of the images, with a postgres deadlock error for AddVmFromTemplate. After the failure, the VMs' disks remained 'locked'.

Version-Release number of selected component (if applicable):
is29
rhevm-3.3.0-0.42.el6ev.noarch
postgresql-8.4.13-1.el6_3.x86_64

How reproducible:
Not sure

Steps to Reproduce:
1. Create a template from a VM with 2 disks
2. Create 15 VMs from the template (manually from the UI or using a script on REST or SDK) as cloned provisioning

Actual results:
The engine failed in AddVmFromTemplate because of a deadlock in the database. Apparently, 2 processes are trying to perform the same operation on the same table in the DB. The error as presented in engine.log:

2013-12-29 17:28:36,489 ERROR [org.ovirt.engine.core.bll.CommandAsyncTask] (pool-5-thread-46) [within thread]: EndAction for action type AddVmFromTemplate threw an exception.: javax.ejb.EJBTransactionRolledbackException: CallableStatementCallback; SQL [{call updateimagestatus(?, ?)}]; ERROR: deadlock detected
  Detail: Process 8282 waits for ShareLock on transaction 1767434; blocked by process 7678.
  Process 7678 waits for ShareLock on transaction 1767433; blocked by process 8282.
  Hint: See server log for query details.
  Where: SQL statement "UPDATE images SET imageStatus = $1 WHERE image_guid = $2 "
  PL/pgSQL function "updateimagestatus" line 2 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: deadlock detected
  Detail: Process 8282 waits for ShareLock on transaction 1767434; blocked by process 7678.
  Process 7678 waits for ShareLock on transaction 1767433; blocked by process 8282.
  Hint: See server log for query details.
  Where: SQL statement "UPDATE images SET imageStatus = $1 WHERE image_guid = $2 "
  PL/pgSQL function "updateimagestatus" line 2 at SQL statement

After that, the disks of those VMs remain in 'locked' status.

Additional info:
engine sos report and vdsm.log
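The Detail lines in the error above describe a wait-for cycle between two backend processes. As a rough aid for reading such entries, the sketch below extracts the "waits for ... blocked by" pairs and follows them to confirm the cycle. The helper names `wait_edges` and `find_cycle` are hypothetical, and the pattern is written only against the message wording shown in this report; other PostgreSQL lock messages (for example, waits on relations or tuples) are not handled.

```python
import re

# One edge of the wait-for graph, in the wording of the Detail lines above.
WAIT_RE = re.compile(
    r"Process (\d+) waits for (\w+) on transaction (\d+); "
    r"blocked by process (\d+)\."
)


def wait_edges(log_text):
    """Return unique (waiter_pid, blocker_pid, lock_mode, transaction_id)
    tuples in order of first appearance."""
    edges = []
    for m in WAIT_RE.finditer(log_text):
        edge = (int(m.group(1)), int(m.group(4)), m.group(2), int(m.group(3)))
        if edge not in edges:  # the nested exception repeats the same Detail
            edges.append(edge)
    return edges


def find_cycle(edges):
    """Follow waiter -> blocker links from the first waiter.

    Returns the list of pids forming a cycle, or None if the chain ends
    at a process that is not itself waiting.
    """
    blocked_by = {waiter: blocker for waiter, blocker, _, _ in edges}
    if not blocked_by:
        return None
    start = next(iter(blocked_by))
    path = [start]
    current = blocked_by[start]
    while current in blocked_by and current not in path:
        path.append(current)
        current = blocked_by[current]
    if current in path:
        return path[path.index(current):]
    return None


# Detail text copied from the engine.log entry in this report.
SAMPLE = (
    "ERROR: deadlock detected Detail: Process 8282 waits for ShareLock on "
    "transaction 1767434; blocked by process 7678. Process 7678 waits for "
    "ShareLock on transaction 1767433; blocked by process 8282."
)

edges = wait_edges(SAMPLE)
print(edges)
print(find_cycle(edges))  # [8282, 7678]
```

For the entry in this report, the two processes each wait on the other's transaction, which is the two-party cycle PostgreSQL reports as "deadlock detected". The sketch says nothing about why the two transactions touched the rows in conflicting order; that would need the server log mentioned in the Hint line.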