Bug 891316 - ovirt-engine-backend [scalability]: Deadlock occurred during mass startup of VMs.
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
3.2.0
x86_64 Linux
Priority: high, Severity: urgent
: ---
: 3.4.0
Assigned To: Omer Frenkel
Yuri Obshansky
virt
: Regression, ZStream
Depends On:
Blocks: 1060692 rhev3.4beta 1142926
Reported: 2013-01-02 10:09 EST by Omri Hochman
Modified: 2015-09-22 09 EDT (History)

See Also:
Fixed In Version: is1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1060692
Environment:
Last Closed: 2014-05-13 04:58:00 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments
engine.log (261.83 KB, application/octet-stream)
2013-01-02 10:10 EST, Omri Hochman
console_log (179.62 KB, application/octet-stream)
2013-01-02 10:11 EST, Omri Hochman


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 13656 None None None Never
oVirt gerrit 13682 None None None Never

Description Omri Hochman 2013-01-02 10:09:25 EST
ovirt-engine-backend [scalability]: Deadlock occurred during mass startup of VMs. 

This issue was found during the investigation of Bug 891270.

Description:
************ 
I started 10000 VMs with 256MB memory each (in batches of 100) on 13 hosts.

Environment:
*************
rhevm3.2 (build sf2.1)
rhevm-3.2.0-2.el6ev.noarch 
rhevm-backend-3.2.0-2.el6ev.noarch
vdsm-cli-4.10.2-2.0.el6.noarch  (on hosts)
vdsm-4.10.2-2.0.el6.x86_64 (on hosts) 

Results:
********
Found one Java-level deadlock: <See Console.log>
=============================
"pool-3-thread-33":
  waiting to lock monitor 0x00007ff678002ee8 (object 0x00000000c398be80, a java.lang.Object),
  which is held by "QuartzScheduler_Worker-99"
"QuartzScheduler_Worker-99":
  waiting for ownable synchronizer 0x00000000c398c800, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
  which is held by "pool-3-thread-30"
"pool-3-thread-30":
  waiting to lock monitor 0x00007ff678002ee8 (object 0x00000000c398be80, a java.lang.Object),
  which is held by "QuartzScheduler_Worker-99"
Comment 1 Omri Hochman 2013-01-02 10:10:54 EST
Created attachment 671471 [details]
engine.log
Comment 2 Omri Hochman 2013-01-02 10:11:51 EST
Created attachment 671472 [details]
console_log
Comment 3 Omri Hochman 2013-01-02 14:24:30 EST
Corrected description:
*****************
I started *1000* VMs with 256MB memory each (in batches of 100) on 13 hosts.
Comment 4 Roy Golan 2013-01-02 16:01:19 EST
The deadlock is caused by two threads acquiring the same pair of locks in opposite order. The refresh thread holds the VdsManager lock and waits for the decreasedPending lock, while the RunVm thread, performing rerun(), holds the decreasedPending lock and waits to perform UpdateVdsDynamicData (a VDS command which acquires the VdsManager lock).
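
To make the lock-order inversion concrete, here is a minimal, self-contained Java sketch. It is not the ovirt-engine code: the class, thread, and lock names are hypothetical stand-ins, with a plain monitor playing the VdsManager lock and a ReentrantLock playing the decreasedPending lock, as the thread dump suggests. The two daemon threads deadlock on purpose, and the JVM's ThreadMXBean is used to confirm it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; names do not come from ovirt-engine.
class LockInversionDemo {

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Starts two daemon threads that take the two locks in opposite order,
    // then returns how many threads the JVM reports as deadlocked.
    static int detectDeadlockedThreads() {
        final Object vdsManagerMonitor = new Object();          // stand-in for the VdsManager lock
        final ReentrantLock pendingLock = new ReentrantLock();  // stand-in for the decreasedPending lock
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        Thread refresh = new Thread(() -> {
            synchronized (vdsManagerMonitor) {      // first: monitor
                bothHoldFirstLock.countDown();
                awaitQuietly(bothHoldFirstLock);
                pendingLock.lock();                 // second: ReentrantLock (never granted)
                pendingLock.unlock();
            }
        }, "refresh");

        Thread rerun = new Thread(() -> {
            pendingLock.lock();                     // first: ReentrantLock
            try {
                bothHoldFirstLock.countDown();
                awaitQuietly(bothHoldFirstLock);
                synchronized (vdsManagerMonitor) {  // second: monitor (never granted)
                    // would update the host's dynamic data here
                }
            } finally {
                pendingLock.unlock();
            }
        }, "rerun");

        // Daemon threads, so the deadlocked pair does not keep the JVM alive.
        refresh.setDaemon(true);
        rerun.setDaemon(true);
        refresh.start();
        rerun.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) {
            // Covers both monitors and ownable synchronizers such as ReentrantLock.
            long[] ids = mx.findDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + detectDeadlockedThreads());
    }
}
```

The latch only forces the unlucky interleaving every time; in the engine the same interleaving would happen by chance under load, which fits a failure seen only during mass VM startup.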

I see 2 main ways to solve this:
1. Get rid of the decreasedPending lock and use an AtomicInteger instead, to ensure atomicity and visibility without blocking.
2. Fix the order of lock acquisition in the decreasedPending() method: first get the VdsManager lock, and then perform decreasedPending and call
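
As a rough sketch of option 1, the pending counter could be kept in an AtomicInteger so that no second lock exists to be taken in the wrong order. The class and method names below are hypothetical and are not taken from ovirt-engine; clamping at zero is also an assumption about the desired behaviour.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the per-host pending-VM bookkeeping.
class PendingVmCounter {
    private final AtomicInteger pending = new AtomicInteger();

    int increase() {
        return pending.incrementAndGet();
    }

    // Lock-free decrement; assumed never to drop below zero.
    int decrease() {
        return pending.updateAndGet(v -> v > 0 ? v - 1 : 0);
    }

    int get() {
        return pending.get();
    }
}

class PendingVmCounterDemo {
    // Each worker increments perThread times and then decrements perThread times.
    // Returns the final counter value.
    static int run(int threads, int perThread) {
        final PendingVmCounter counter = new PendingVmCounter();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increase();
                }
                for (int i = 0; i < perThread; i++) {
                    counter.decrease();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("pending after run: " + run(8, 10000));
    }
}
```

Because the counter update never blocks, a thread that already holds the VdsManager lock can adjust it without waiting on anything else. Option 2 keeps both locks and instead relies on every code path taking them in the same order.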
Comment 7 mkublin 2013-04-07 12:21:31 EDT
While working on the phantom VDS status, the deadlock will also be solved; a patch has been added to the bug.
Comment 14 Shai Revivo 2013-12-30 04:11:40 EST
Currently we do not have the resources (lab) to test it.
We will have to push it forward to 3.4.
Comment 15 Shai Revivo 2014-01-15 10:11:22 EST
QE cannot verify this bug in 3.3; it will be verified in 3.4.
Comment 19 Yuri Obshansky 2014-05-13 04:58:00 EDT
This bug is identical to Bug 1060692 <https://bugzilla.redhat.com/show_bug.cgi?id=1060692> - ovirt-engine-backend [scalability]: Deadlock occurred during mass startup of VMs,
which was fixed and verified in 3.3.2.
So, closing.
