Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 891270

Summary: Ovirt-engine-backend [Scalability]: Mass VMs Startup caused Jdbc to run out of connection. 'nested exception Unable to get managed connection for java:/ENGINEDataSource.
Product: Red Hat Enterprise Virtualization Manager Reporter: Omri Hochman <ohochman>
Component: ovirt-engineAssignee: Liran Zelkha <lzelkha>
Status: CLOSED DUPLICATE QA Contact: Omri Hochman <ohochman>
Severity: high Docs Contact:
Priority: high    
Version: 3.2.0CC: acanan, acathrow, audgiri, bazulay, dossow, emesika, hateya, hchiramm, iheim, jcoscia, jkt, lpeer, lzelkha, ohochman, rgolan, Rhev-m-bugs, yeylon, yzaslavs
Target Milestone: ---Flags: ohochman: needinfo+
Target Release: 3.4.0   
Hardware: x86_64   
OS: Linux   
Whiteboard: infra
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-12-18 18:56:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
engine.log
none
console.log
none
server.log
none
new_server_log none

Description Omri Hochman 2013-01-02 12:03:50 UTC
Ovirt-engine-backend [Scalability]: Mass VMs Startup caused Jdbc to run out of connection. 'nested exception Unable to get managed connection for java:/ENGINEDataSource.

Scenario: 
**********
Attempt to start 1000 VMs ('100 by 100', according the GUI paging search results.) 

Results: 
********
The following error display :  "Error while executing action: Action failed due to database connection failure. Please check connectivity to your Database server."


Note: 
*****
from Some reason, the number of open connection to the DB remain high '98, 
even after the operation is done. 

[root@puma03 ~]# ./conns.sh
Password for user postgres: 
 datname | count 
---------+-------
 engine  |    98
(1 row)


engine.log:
************
2013-01-02 10:32:32,918 INFO  [org.ovirt.engine.core.bll.RunVmCommand] (pool-3-thread-49) [6532727b] Running command: RunVmCommand internal: false. Entities affected :  ID: 80ea882d-5901-46b0-b73f-8b2bff09de41 Type: VM
2013-01-02 10:32:32,937 INFO  [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (pool-3-thread-30) START, IsVmDuringInitiatingVDSCommand( vmId = a70dadcd-859e-429d-be81-73cbbf1a6c7e), log id: 33c77cfa
2013-01-02 10:32:32,937 INFO  [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (pool-3-thread-30) FINISH, IsVmDuringInitiatingVDSCommand, return: false, log id: 33c77cfa
2013-01-02 10:32:33,013 INFO  [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (pool-3-thread-33) START, IsVmDuringInitiatingVDSCommand( vmId = ea90773f-b4a6-4f5c-a78b-9065c6764dd0), log id: 42de428a
2013-01-02 10:32:33,013 INFO  [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (pool-3-thread-33) FINISH, IsVmDuringInitiatingVDSCommand, return: false, log id: 42de428a
2013-01-02 10:32:33,019 ERROR [org.ovirt.engine.core.bll.RunVmCommand] (pool-3-thread-46) Data access error during CanDoActionFailure.: org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLException: javax.resource.ResourceException: IJ000453: Unable to get managed connection for java:/ENGINEDataSource
        at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:80) [spring-jdbc-3.1.1.RELEASE.jar:3.1.1.RELEASE]
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:573) [spring-jdbc-3.1.1.RELEASE.jar:3.1.1.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:637) [spring-jdbc-3.1.1.RELEASE.jar:3.1.1.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:666) [spring-jdbc-3.1.1.RELEASE.jar:3.1.1.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:706) [spring-jdbc-3.1.1.RELEASE.jar:3.1.1.RELEASE]
        at org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall.executeCallInternal(PostgresDbEngineDialect.java:155) [engine-dal.jar:]
        at org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall.doExecute(PostgresDbEngineDialect.java:121) [engine-dal.jar:]
        at org.springframework.jdbc.core.simple.SimpleJdbcCall.execute(SimpleJdbcCall.java:181) [spring-jdbc-3.1.1.RELEASE.jar:3.1.1.RELEASE]
        at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:124) [engine-dal.jar:]
        at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeReadAndReturnMap(SimpleJdbcCallsHandler.java:75) [engine-dal.jar:]
        at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeReadList(SimpleJdbcCallsHandler.java:66) [engine-dal.jar:]
        at org.ovirt.engine.core.dao.VmDeviceDAODbFacadeImpl.getVmDeviceByVmIdTypeAndDevice(VmDeviceDAODbFacadeImpl.java:97) [engine-dal.jar:]
        at org.ovirt.engine.core.dao.VmDeviceDAODbFacadeImpl.getVmDeviceByVmIdTypeAndDevice(VmDeviceDAODbFacadeImpl.java:84) [engine-dal.jar:]
        at org.ovirt.engine.core.bll.VmRunHandler.getPluggedDisks(VmRunHandler.java:236) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.VmRunHandler.canRunVm(VmRunHandler.java:77) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.RunVmCommand.canRunVm(RunVmCommand.java:681) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.RunVmCommand.canDoAction(RunVmCommand.java:659) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.internalCanDoAction(CommandBase.java:611) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.canDoActionOnly(CommandBase.java:274) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.MultipleActionsRunner$2.run(MultipleActionsRunner.java:150) [engine-bll.jar:]
        at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:64) [engine-utils.jar:]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_09-icedtea]
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) [rt.jar:1.7.0_09-icedtea]
        at java.util.concurrent.FutureTask.run(FutureTask.java:166) [rt.jar:1.7.0_09-icedtea]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) [rt.jar:1.7.0_09-icedtea]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) [rt.jar:1.7.0_09-icedtea]
        at java.lang.Thread.run(Thread.java:722) [rt.jar:1.7.0_09-icedtea]
Caused by: java.sql.SQLException: javax.resource.ResourceException: IJ000453: Unable to get managed connection for java:/ENGINEDataSource

Comment 1 Omri Hochman 2013-01-02 12:06:31 UTC
Created attachment 671403 [details]
engine.log

Comment 2 Omri Hochman 2013-01-02 12:14:37 UTC
<Attached console.log>

Found 1 deadlock. 

Found one Java-level deadlock:
=============================
"pool-3-thread-33":
  waiting to lock monitor 0x00007ff678002ee8 (object 0x00000000c398be80, a java.lang.Object),
  which is held by "QuartzScheduler_Worker-99"
"QuartzScheduler_Worker-99":
  waiting for ownable synchronizer 0x00000000c398c800, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
  which is held by "pool-3-thread-30"
"pool-3-thread-30":
  waiting to lock monitor 0x00007ff678002ee8 (object 0x00000000c398be80, a java.lang.Object),
  which is held by "QuartzScheduler_Worker-99"

Comment 3 Omri Hochman 2013-01-02 12:16:03 UTC
Created attachment 671416 [details]
console.log

Comment 4 Omri Hochman 2013-01-02 12:19:37 UTC
Attached server.log :
********************
2013-01-02 12:02:14,363 ERROR [org.apache.catalina.core.ContainerBase.[jboss.web].[default-host].[/webadmin]] (ajp-/127.0.0.1:8702-11) Exception while dispatching incoming RPC call: com.google.gwt.user.server.rpc.UnexpectedException: Service method 'public abstract org.ovirt.engine.core.common.queries.VdcQueryReturnValue org.ovirt.engine.ui.frontend.gwtservices.GenericApiGWTService.RunQuery(org.ovirt.engine.core.common.queries.VdcQueryType,org.ovirt.engine.core.common.queries.VdcQueryParametersBase)' threw an unexpected exception: java.lang.IllegalStateException
        at com.google.gwt.rpc.server.RPC.invokeAndStreamResponse(RPC.java:219) [gwt-servlet-2.3.0.jar:]
        at com.google.gwt.rpc.server.RpcServlet.processCall(RpcServlet.java:161) [gwt-servlet-2.3.0.jar:]
        at com.google.gwt.rpc.server.RpcServlet.processPost(RpcServlet.java:222) [gwt-servlet-2.3.0.jar:]
        at com.google.gwt.user.server.rpc.AbstractRemoteServiceServlet.doPost(AbstractRemoteServiceServlet.java:62) [gwt-servlet-2.3.0.jar:]
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:754) [jboss-servlet-api_3.0_spec.jar:1.0.1.Final-redhat-2]
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:847) [jboss-servlet-api_3.0_spec.jar:1.0.1.Final-redhat-2]
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:329) [jbossweb.jar:]
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248) [jbossweb.jar:]
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:275) [jbossweb.jar:]
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:161) [jbossweb.jar:]
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:489) [jbossweb.jar:]
        at org.jboss.as.web.security.SecurityContextAssociationValve.invoke(SecurityContextAssociationValve.java:169) [jboss-as-web.jar:7.1.3.Final-redhat-4]
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:155) [jbossweb.jar:]
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) [jbossweb.jar:]
        at org.jboss.web.rewrite.RewriteValve.invoke(RewriteValve.java:466) [jbossweb.jar:]
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) [jbossweb.jar:]
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:372) [jbossweb.jar:]
        at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:505) [jbossweb.jar:]
        at org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(AjpProtocol.java:453) [jbossweb.jar:]
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:931) [jbossweb.jar:]
        at java.lang.Thread.run(Thread.java:722) [rt.jar:1.7.0_09-icedtea]
Caused by: java.lang.IllegalStateException
        at org.ovirt.engine.core.bll.QueriesCommandBase.validatePermissions(QueriesCommandBase.java:129) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.QueriesCommandBase.ExecuteCommand(QueriesCommandBase.java:67) [engine-bll.jar:]
        at org.ovirt.engine.core.dal.VdcCommandBase.Execute(VdcCommandBase.java:28) [engine-dal.jar:]
        at org.ovirt.engine.core.bll.Backend.runQueryImpl(Backend.java:404) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.Backend.RunQuery(Backend.java:381) [engine-bll.jar:]
        at sun.reflect.GeneratedMethodAccessor81.invoke(Unknown Source) [:1.7.0_09-icedtea]

Comment 5 Omri Hochman 2013-01-02 12:20:52 UTC
Created attachment 671427 [details]
server.log

Comment 8 Eli Mesika 2013-01-02 14:36:02 UTC
(In reply to comment #0)
> Ovirt-engine-backend [Scalability]: Mass VMs Startup caused Jdbc to run out
> of connection. 'nested exception Unable to get managed connection for
> java:/ENGINEDataSource.
> 
> Scenario: 
> **********
> Attempt to start 1000 VMs ('100 by 100', according the GUI paging search
> results.) 
> 
> Results: 
> ********
> The following error display :  "Error while executing action: Action failed
> due to database connection failure. Please check connectivity to your
> Database server."
> 
> 
> Note: 
> *****
> from Some reason, the number of open connection to the DB remain high '98, 
> even after the operation is done. 
> 
> [root@puma03 ~]# ./conns.sh
> Password for user postgres: 
>  datname | count 
> ---------+-------
>  engine  |    98
> (1 row)
> 

How long did you waited after operation completed?
The pool is shrinking only after a timeout in which the (max-connection - min-connections) are unused

Comment 9 Omri Hochman 2013-01-02 14:45:43 UTC
(In reply to comment #8)
> (In reply to comment #0)

> How long did you waited after operation completed?
> The pool is shrinking only after a timeout in which the (max-connection -
> min-connections) are unused

I'm now 4 hours after the transaction and still there are 98 connections.. 
(Although already I saw it before dropping to ~68 )
 
[root@puma03 ~]# date
Wed Jan  2 14:43:45 GMT 2013
[root@puma03 ~]# ./conns.sh 
Password for user postgres: 
 datname | count 
---------+-------
 engine  |    98
(1 row)

[root@puma03 ~]#

Comment 10 Roy Golan 2013-01-02 15:09:46 UTC
If you have lots of hosts updater thread (VdsRefreshRunTimeInfo) can run concurrent per host and could peek the numbers you're seeing. it queries the DB for networks, VDS static and more...

but without query logging this is just a hard guesswork.

you can increase the log level of jdbc calls by jmxconsole (if available) 
or adding a logger to standalone.xml.in

<logger category="org.springframework.jdbc.core.simple.SimpleJdbcCall">
   <level name="DEBUG"/>
</logger>

Comment 11 Omri Hochman 2013-01-02 15:16:56 UTC
After farther investigation and discussion the issue we decided to split the bug into two bugs: 

In this bug we'll address the Jdbc that ran out of connection. 
The deadlock will be handled in BZ#891316

Comment 12 Roy Golan 2013-01-02 15:22:15 UTC
instead of standalone.xml its 
/usr/share/ovirt-engine/service/engine-service.xml.in

Comment 13 Omri Hochman 2013-01-02 15:27:59 UTC
(In reply to comment #10)
> If you have lots of hosts updater thread (VdsRefreshRunTimeInfo) can run
> concurrent per host and could peek the numbers you're seeing. it queries the
> DB for networks, VDS static and more...
> 
> but without query logging this is just a hard guesswork.
> 
> you can increase the log level of jdbc calls by jmxconsole (if available) 
> or adding a logger to standalone.xml.in
> 
> <logger category="org.springframework.jdbc.core.simple.SimpleJdbcCall">
>    <level name="DEBUG"/>
> </logger>

Note: 
-----
After restart the engine service number of open connections remained 98.

New_server.log (attached): 
--------------------------
2013-01-02 15:23:34,551 DEBUG [org.springframework.jdbc.core.simple.SimpleJdbcCall] (QuartzScheduler_Worker-68) The following parameters are used for call {call updatevm_
interface_statistics(?, ?, ?, ?, ?, ?, ?)} with: {v_vm_id=cda7415f-1a56-4323-b0cf-e55dae8f2bc3, v_id=204f7f28-0d9c-472b-922c-b538c99d438a, v_rx_drop=0.0, v_tx_drop=0.0, v
_iface_status=0, v_tx_rate=0.0, v_rx_rate=0.0}
2013-01-02 15:23:34,551 DEBUG [org.springframework.jdbc.core.simple.SimpleJdbcCall] (QuartzScheduler_Worker-68) 1: v_id SQL Type 1111 Type Name null org.springframework.j
dbc.core.SqlParameter
2013-01-02 15:23:34,551 DEBUG [org.springframework.jdbc.core.simple.SimpleJdbcCall] (QuartzScheduler_Worker-68) 2: v_rx_drop SQL Type 2 Type Name null org.springframework
.jdbc.core.SqlParameter
2013-01-02 15:23:34,551 DEBUG [org.springframework.jdbc.core.simple.SimpleJdbcCall] (QuartzScheduler_Worker-68) 3: v_rx_rate SQL Type 2 Type Name null org.springframework
.jdbc.core.SqlParameter
2013-01-02 15:23:34,551 DEBUG [org.springframework.jdbc.core.simple.SimpleJdbcCall] (QuartzScheduler_Worker-68) 4: v_tx_drop SQL Type 2 Type Name null org.springframework
.jdbc.core.SqlParameter
2013-01-02 15:23:34,551 DEBUG [org.springframework.jdbc.core.simple.SimpleJdbcCall] (QuartzScheduler_Worker-68) 5: v_tx_rate SQL Type 2 Type Name null org.springframework
.jdbc.core.SqlParameter
2013-01-02 15:23:34,551 DEBUG [org.springframework.jdbc.core.simple.SimpleJdbcCall] (QuartzScheduler_Worker-68) 6: v_iface_status SQL Type 4 Type Name null org.springfram
ework.jdbc.core.SqlParameter
2013-01-02 15:23:34,551 DEBUG [org.springframework.jdbc.core.simple.SimpleJdbcCall] (QuartzScheduler_Worker-68) 7: v_vm_id SQL Type 1111 Type Name null org.springframewor
k.jdbc.core.SqlParameter
2013-01-02 15:23:34,551 DEBUG [org.springframework.jdbc.core.simple.SimpleJdbcCall] (QuartzScheduler_Worker-68) The following parameters are used for call {call updatevm_
interface_statistics(?, ?, ?, ?, ?, ?, ?)} with: {v_vm_id=5a8666cc-84d5-429d-a350-e25e7fc1a8d4, v_id=2795f2b2-2de9-4d07-bde5-28efd041490f, v_rx_drop=0.0,

Comment 14 Omri Hochman 2013-01-02 15:31:06 UTC
Created attachment 671484 [details]
new_server_log

Comment 18 Liran Zelkha 2013-06-25 07:42:34 UTC
Can you please add stack trace so we can see what the JVM does? You can use jstack utility for that.

Comment 19 Eli Mesika 2013-06-25 07:53:54 UTC
(In reply to Liran Zelkha from comment #18)
> Can you please add stack trace so we can see what the JVM does? You can use
> jstack utility for that.

Please note comment 16 , this may be related to the history service (ovirt-engine-dwh) and a simple check may be redoing the test Omri does with after stopping the dwh service

Comment 23 Aharon Canan 2013-10-16 09:24:31 UTC
Vlad has nothing to do with it as he is no longer handling scale.

Comment 24 Liran Zelkha 2013-12-18 18:56:40 UTC

*** This bug has been marked as a duplicate of bug 995084 ***