Bug 1401896 - engine server threads keep running after tasks are gone
Summary: engine server threads keep running after tasks are gone
Keywords:
Status: CLOSED DUPLICATE of bug 1401585
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Infra
Version: 4.0.6.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Oved Ourfali
QA Contact: Pavel Stehlik
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-12-06 11:19 UTC by sefi litmanovich
Modified: 2016-12-06 11:47 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2016-12-06 11:47:48 UTC
oVirt Team: Infra
Embargoed:


Attachments
logs.tar - engine.log from the past 2 weeks, server logs, thread dump (out.log) (16.90 MB, application/x-gzip)
2016-12-06 11:19 UTC, sefi litmanovich

Description sefi litmanovich 2016-12-06 11:19:18 UTC
Created attachment 1228416
logs.tar - engine.log from the past 2 weeks, server logs, thread dump (out.log)

Description of problem:

I am running engine rhevm-4.0.6.1-0.1.el7ev.noarch.
In its current state, my env cannot start any new task at all. After one such attempt, the engine log shows:

2016-12-05 17:19:51,932 WARN  [org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil] (default task-11) [] The thread pool failed to execute list of tasks: Task java.util.concurrent.FutureTask@6f8619c0 rejected from org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalThreadExecutor@31835b5f[Running, pool size = 500, active threads = 500, queued tasks = 100, completed tasks = 24428]
2016-12-05 17:19:51,933 ERROR [org.ovirt.engine.core.bll.PrevalidatingMultipleActionsRunner] (default task-11) [] Failed to execute multiple actions of type 'StopVm': java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@6f8619c0 rejected from org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalThreadExecutor@31835b5f[Running, pool size = 500, active threads = 500, queued tasks = 100, completed tasks = 24428]
2016-12-05 17:19:51,933 ERROR [org.ovirt.engine.core.bll.PrevalidatingMultipleActionsRunner] (default task-11) [] Exception: java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@6f8619c0 rejected from org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalThreadExecutor@31835b5f[Running, pool size = 500, active threads = 500, queued tasks = 100, completed tasks = 24428]
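
For illustration, the rejection above is what a saturated java.util.concurrent.ThreadPoolExecutor produces when every worker is busy and its bounded queue is full. The following is a minimal sketch, not the engine's code: it assumes a fixed-size pool with a bounded queue and the default AbortPolicy, and the class name SaturationDemo, the method acceptedBeforeRejection and the small sizes (5 threads, queue of 1, instead of 500 and 100) are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class SaturationDemo {

    /**
     * Submits tasks that never finish until the pool rejects one.
     * Returns the number of tasks accepted before the rejection.
     * Assumption: fixed-size pool, bounded queue, default AbortPolicy.
     */
    static int acceptedBeforeRejection(int poolSize, int queueCapacity)
            throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity));
        int accepted = 0;
        try {
            while (true) {
                pool.execute(() -> {
                    try {
                        release.await(); // stands in for a worker that is stuck
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                accepted++;
            }
        } catch (RejectedExecutionException e) {
            // All workers are busy and the queue is full.
            System.out.println("rejected after " + accepted + " tasks: " + e.getMessage());
        } finally {
            release.countDown(); // let the blocked workers finish
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
        return accepted;
    }

    public static void main(String[] args) throws InterruptedException {
        acceptedBeforeRejection(5, 1);
    }
}
```

In this sketch the pool accepts poolSize + queueCapacity tasks and rejects the next one, which matches the "active threads = 500, queued tasks = 100" state in the log if none of the 500 workers ever returns.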

I then checked how many threads the ovirt-engine server process had open and got:

[root@~]# ps huH p 28300 | wc -l
587

Note that the env has not been updated with yum for some time, and many eap7 packages have pending updates.

As for Java, I did not see a pending update; the currently running version is:
[root@~]# java -version
openjdk version "1.8.0_111"
OpenJDK Runtime Environment (build 1.8.0_111-b15)
OpenJDK 64-Bit Server VM (build 25.111-b15, mixed mode)

What I have been doing with this env over the past few weeks:

Other than managing a few VMs and running some manual test cases, I ran a script to check for a possible vdsm memory leak. The script started 5-8 VMs and stopped them, with 1:30 minutes between each action. It used python-ovirt-engine-sdk4-4.0.2-1.el7ev.x86_64 and ran for around 10 days (on and off, but mostly on).
This might be a lead to what caused the problem.

Attached are the engine log, the server logs and a thread dump (out.log, the output of jstack -J-d64 <pid>).
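
As a starting point for reading a thread dump like out.log, one option is to count threads per state. The sketch below is only an illustration: it assumes the dump uses the standard HotSpot jstack layout, where each thread has a "java.lang.Thread.State: ..." line, and the class name ThreadDumpSummary and the method countStates are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ThreadDumpSummary {

    // Matches lines such as "   java.lang.Thread.State: WAITING (parking)".
    private static final Pattern STATE =
            Pattern.compile("^\\s*java\\.lang\\.Thread\\.State: (\\w+)");

    /** Counts how many threads in the dump are in each thread state. */
    static Map<String, Integer> countStates(Iterable<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher m = STATE.matcher(line);
            if (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ThreadDumpSummary <thread-dump-file>");
            return;
        }
        countStates(Files.readAllLines(Paths.get(args[0])))
                .forEach((state, n) -> System.out.println(state + ": " + n));
    }
}
```

Run as `java ThreadDumpSummary out.log`. A large number of threads in the same WAITING or BLOCKED state would be the first place to look for the stuck workers.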

Version-Release number of selected component (if applicable):
rhevm-4.0.6.1-0.1.el7ev.noarch

How reproducible:
Once

Steps to Reproduce:
Not sure

Actual results:
Engine has 587 running threads and no new task can be started.

Expected results:
Threads should not be stuck.

Comment 1 sefi litmanovich 2016-12-06 11:47:48 UTC

*** This bug has been marked as a duplicate of bug 1401585 ***

