Bug 1640797

Summary: Cannot create VMs
Product: [oVirt] ovirt-engine Reporter: oliver.albl
Component: GeneralAssignee: Martin Perina <mperina>
Status: CLOSED DEFERRED QA Contact: Avihai <aefrat>
Severity: urgent Docs Contact:
Priority: low    
Version: 4.3.6.7CC: bugs, christian.grundmann, dfediuck, dholler, michal.skrivanek, mperina, nsoffer, oliver.albl, rbarry
Target Milestone: ---Flags: pm-rhel: ovirt-4.5?
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-03-02 14:49:42 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
engine.log
none
VDSM logs from hosts and SPM
none
qemu logs
none
engine.log for vm fsc-gkjay1zx-clientwindows1-c7
none
engine.log from today, 00:00 to 03.00am none

Description oliver.albl 2018-10-18 19:01:06 UTC
Created attachment 1495423 [details]
engine.log

Description of problem:
I run a cluster with 10 hosts for automated testing purposes. I automatically create VMs, run some workload and delete them again. Today I ran into a situation where I could not create VMs anymore. Messages were:

Failed to create VM <vmname> (User: …).
Failed to complete VM <vmname> creation.

Version-Release number of selected component (if applicable):
4.2.6.4-1.el7

How reproducible:
Occurred today for the first time, generated load on hosts by automatically creating vms, running vm workloads, deleting vms

Steps to Reproduce:
1.
2.
3.

Actual results:
VM creation failed.

Expected results:


Additional info:

Comment 1 Nir Soffer 2018-10-18 20:49:09 UTC
Looks like virt/infra issue.

Oliver, any errors on the hosts in vdsm.log?

Comment 2 oliver.albl 2018-10-18 21:22:49 UTC
Created attachment 1495445 [details]
VDSM logs from hosts and SPM

Comment 3 oliver.albl 2018-10-18 21:23:13 UTC
Nir,

Just those:

vdsm.log.3.06:2018-10-18 19:50:41,244+0200 ERROR (vmchannels) [virt.vm] (vmId='7d907188-7500-41eb-994b-ca7dd04af4e1') No JSON object could be decoded: 'e": ""}' (guestagent:566)
vdsm.log.3.09:2018-10-18 19:49:20,655+0200 ERROR (vmchannels) [virt.vm] (vmId='5e89450b-7f21-42c2-8db1-358201176ee2') No JSON object could be decoded: 'in 10"}' (guestagent:566)
vdsm.log.3.13:2018-10-18 19:48:45,017+0200 ERROR (vmchannels) [virt.vm] (vmId='5af0ec86-58e5-4eee-93fb-1555665cfab7') No JSON object could be decoded: 'WSVAGRANT"}' (guestagent:566)
vdsm.log.3.15:2018-10-18 19:11:22,660+0200 ERROR (vmchannels) [virt.vm] (vmId='fcc2c898-a289-4a0a-a7e6-32a6bb2d5e01') No JSON object could be decoded: 'in 10"}' (guestagent:566)
vdsm.log.3.15:2018-10-18 19:34:10,053+0200 ERROR (vmchannels) [virt.vm] (vmId='cb9b0e6d-8cf2-4fa2-a8ba-b4fea6b870ce') No JSON object could be decoded: 'WSVAGRANT"}' (guestagent:566)
vdsm.log.3.15:2018-10-18 19:34:16,925+0200 ERROR (vmchannels) [virt.vm] (vmId='462d5314-2e8d-4438-b379-8acda9818242') No JSON object could be decoded: 'e": ""}' (guestagent:566)
vdsm.log.3.17:2018-10-18 19:41:45,433+0200 ERROR (vmchannels) [virt.vm] (vmId='33d55c0b-1235-4145-85d3-6e8faf186cb3') No JSON object could be decoded: 'in 10"}' (guestagent:566)
vdsm.log.3.27:2018-10-18 19:42:52,795+0200 ERROR (vmchannels) [virt.vm] (vmId='8f37bf67-c4bf-4269-984e-cdd859955a87') No JSON object could be decoded: 'WSVAGRANT"}' (guestagent:566)
vdsm.log.3.27:2018-10-18 19:42:52,795+0200 ERROR (vmchannels) [virt.vm] (vmId='8f37bf67-c4bf-4269-984e-cdd859955a87') Extra data: line 1 column 15 - line 1 column 28 (char 14 - 27): ' "active-user", "name": ""}' (guestagent:566)
vdsm.log.3.30:2018-10-18 19:35:10,885+0200 ERROR (vmchannels) [virt.vm] (vmId='c27f1a31-8cf9-4c80-a341-1543d5262e1e') No JSON object could be decoded: 'in 2016"}' (guestagent:566)
vdsm.log.3.31:2018-10-18 19:38:03,316+0200 ERROR (vmchannels) [virt.vm] (vmId='5a641993-e67f-4b44-aadc-1e8ac4028ef5') No JSON object could be decoded: 'in 10"}' (guestagent:566)
vdsm.log.3.31:2018-10-18 19:51:11,817+0200 ERROR (vmchannels) [virt.vm] (vmId='9c4ac41c-46d8-4a87-a53b-790ab663a640') No JSON object could be decoded: ': "WINDOWSVAGRANT"}' (guestagent:566)

I attached VDSM logs from hosts and SPM.

Comment 4 Ryan Barry 2018-10-19 10:52:01 UTC
Please attach engine.log

Comment 5 oliver.albl 2018-10-19 10:57:02 UTC
Engine log is already attached. Do you need more entries before the occurrence?

Comment 6 Ryan Barry 2018-10-19 11:04:50 UTC
I didn't see it in the attached archive. Let me check again

Comment 7 Ryan Barry 2018-10-19 13:52:21 UTC
What actually happens to the VMs?

I see that the engine schedules it and hands it off to vdsm, and vdsm starts the VM. The guest agent may not be installed, but there's not much else in the logs.

Can you attach a qemu log for one of the VMs which may be having problems? foliocissprint-c29, for example?

Comment 8 oliver.albl 2018-10-19 14:33:59 UTC
Created attachment 1495668 [details]
qemu logs

qemu logs for vm foliocicsprint-c17 from all hosts in the cluster. Related messages in engine:

          log_time          | severity |                                                       message
----------------------------+----------+---------------------------------------------------------------------------------------------------------------------
 2018-10-18 10:31:44.443+02 |        0 | VM foliocicsprint-c17 creation was initiated by username.com.com-authz.
 2018-10-18 10:32:00.899+02 |        0 | VM foliocicsprint-c17 creation has been completed.
 2018-10-18 10:32:02.91+02  |        0 | VM foliocicsprint-c17 was started by username.com.com-authz (Host: testVMHOST029).
 2018-10-18 10:32:52.191+02 |        0 | VM foliocicsprint-c17 started on Host testVMHOST029
 2018-10-18 11:23:47.243+02 |        0 | VM foliocicsprint-c17 powered off by username.com.com-authz (Host: testVMHOST029).
 2018-10-18 11:23:53.491+02 |        0 | VM foliocicsprint-c17 is down.
 2018-10-18 11:24:04.234+02 |        0 | VM foliocicsprint-c17 was successfully removed.
 2018-10-18 14:05:20.526+02 |        0 | VM foliocicsprint-c17 creation was initiated by username.com.com-authz.
 2018-10-18 14:07:40.559+02 |        0 | VM foliocicsprint-c17 creation has been completed.
 2018-10-18 14:07:44.792+02 |        0 | VM foliocicsprint-c17 was started by username.com.com-authz (Host: testVMHOST015).
 2018-10-18 14:08:16.516+02 |        0 | VM foliocicsprint-c17 started on Host testVMHOST015
 2018-10-18 15:04:10.021+02 |        0 | VM foliocicsprint-c17 powered off by username.com.com-authz (Host: testVMHOST015).
 2018-10-18 15:04:21.358+02 |        0 | VM foliocicsprint-c17 is down.
 2018-10-18 15:05:01.665+02 |        0 | VM foliocicsprint-c17 was successfully removed.

 2018-10-18 19:50:10.159+02 |        2 | Failed to create VM foliocicsprint-c17 (User: username.com.com-authz).
 2018-10-18 19:50:17.576+02 |        2 | Failed to complete VM foliocicsprint-c17 creation.

 2018-10-18 19:52:18.235+02 |        0 | VM foliocicsprint-c17 creation was initiated by username.com.com-authz.
 2018-10-18 19:52:30.049+02 |        0 | VM foliocicsprint-c17 creation has been completed.
 2018-10-18 19:52:39.031+02 |        0 | VM foliocicsprint-c17 was started by username.com.com-authz (Host: testVMHOST031).
 2018-10-18 19:53:21.57+02  |        0 | VM foliocicsprint-c17 started on Host testVMHOST031
 2018-10-18 20:58:54.516+02 |        0 | VM foliocicsprint-c17 powered off by username.com.com-authz (Host: testVMHOST031).
 2018-10-18 20:59:02.844+02 |        0 | VM foliocicsprint-c17 is down.
 2018-10-18 20:59:42.686+02 |        0 | VM foliocicsprint-c17 was successfully removed.
 2018-10-19 05:06:32.552+02 |        0 | VM foliocicsprint-c17 creation was initiated by username.com.com-authz.
 2018-10-19 05:06:46.185+02 |        0 | VM foliocicsprint-c17 creation has been completed.
 2018-10-19 05:06:51.024+02 |        0 | VM foliocicsprint-c17 was started by username.com.com-authz (Host: testVMHOST031).
 2018-10-19 05:07:21.55+02  |        0 | VM foliocicsprint-c17 started on Host testVMHOST031
 2018-10-19 06:04:04.024+02 |        0 | VM foliocicsprint-c17 powered off by username.com.com-authz (Host: testVMHOST031).
 2018-10-19 06:04:42.193+02 |        0 | VM foliocicsprint-c17 was successfully removed.

Comment 9 oliver.albl 2018-10-19 14:39:53 UTC
The problem is, that vms sometimes (e.g. foliocicsprint-c17 at 2018-10-18 19:50:10.159+02) cannot be created, I don't see vdsm trying to start one of those.

Comment 10 Ryan Barry 2018-10-26 12:26:52 UTC
This looks like a storage problem. Is the template locked or otherwise unreachable?

Comment 11 oliver.albl 2018-10-26 12:33:21 UTC
I did not see any storage problem. The template is used by multiple VMs in parallel. Is there a limit?

Comment 12 Michal Skrivanek 2018-10-29 10:39:02 UTC
there's no limit, but your system doesnt' seem to keep up. It timed out after 10 minutes (the example in comment #9) and in the meantime there were a lot of VM starts. How many do you try to run in parallel?

Comment 13 christian.grundmann 2018-10-29 10:42:55 UTC
(In reply to Michal Skrivanek from comment #12)
> there's no limit, but your system doesnt' seem to keep up. It timed out
> after 10 minutes (the example in comment #9) and in the meantime there were
> a lot of VM starts. How many do you try to run in parallel?

Hi,
each job creates around 20-40 thin clones from the same template. (target should be 100-200 if stable)
but there can be multiple jobs at the same time creating clones of the template.

Comment 14 Ryan Barry 2018-11-06 22:25:59 UTC
Can you throttle this job? It seems as if your storage isn't able to keep up.

Why not time it to create 20 at a time (or more, you'd need to test how long until it times out), then continue to the next batch?

Comment 15 christian.grundmann 2018-11-07 13:06:13 UTC
(In reply to Ryan Barry from comment #14)
> Can you throttle this job? It seems as if your storage isn't able to keep up.
> 
> Why not time it to create 20 at a time (or more, you'd need to test how long
> until it times out), then continue to the next batch?

I don't think this can be done client side.
I can throttle the creation per job, but don't now how many jobs are running somewhere else. Its not the only job running against the same engine and storage.

It whould be ok for me if the VMs need longer time for creating if there is some load (engine, storage ...), but i am not ok with getting "Internal Engine Error" from the API. The queing has to be done on oVirt side.

Comment 16 Doron Fediuck 2018-11-08 10:44:09 UTC
Lowering priority as it's hardware related.

Comment 17 christian.grundmann 2018-11-08 10:46:59 UTC
(In reply to Doron Fediuck from comment #16)
> Lowering priority as it's hardware related.

I don't think you could fix this with hardware, we already have an all flash storage attached with fibrechannel and it happens with 20 clones at the same time.

Comment 18 Michal Skrivanek 2018-11-08 11:09:47 UTC
(In reply to christian.grundmann from comment #15)

> 
> It whould be ok for me if the VMs need longer time for creating if there is
> some load (engine, storage ...), but i am not ok with getting "Internal
> Engine Error" from the API. The queing has to be done on oVirt side.

yes, that's the transaction timeout for sync jobs. For normal creations diska are created separately and asynchronously internally, here it seems the clone operation is part of the synchronous response and that's wrong.
What iexactly are you calling? can you post a sample rest api request?

Comment 19 christian.grundmann 2018-11-08 11:24:24 UTC
(In reply to Michal Skrivanek from comment #18)
> (In reply to christian.grundmann from comment #15)
> 
> > 
> > It whould be ok for me if the VMs need longer time for creating if there is
> > some load (engine, storage ...), but i am not ok with getting "Internal
> > Engine Error" from the API. The queing has to be done on oVirt side.
> 
> yes, that's the transaction timeout for sync jobs. For normal creations
> diska are created separately and asynchronously internally, here it seems
> the clone operation is part of the synchronous response and that's wrong.
> What iexactly are you calling? can you post a sample rest api request?

I am using vagrant with ovirt4 plugin
https://www.ovirt.org/blog/2017/02/using-oVirt-vagrant/
which uses your ruby SDK
https://github.com/myoung34/vagrant-ovirt4/blob/master/lib/vagrant-ovirt4/action/create_vm.rb

hope this helps

Comment 20 Ryan Barry 2018-11-08 11:38:30 UTC
Is this as simple as calling 'vagrant up' 20 times in succession?

Comment 21 christian.grundmann 2018-11-08 12:22:43 UTC
(In reply to Ryan Barry from comment #20)
> Is this as simple as calling 'vagrant up' 20 times in succession?

we are using https://www.vagrantup.com/docs/multi-machine/ but yes it should also be reproduceable if you call vagrant up in parallel

Comment 22 Ryan Barry 2018-11-13 03:22:50 UTC
Let's move this to an asynchronous call

Comment 23 Ryan Barry 2019-01-21 14:54:24 UTC
Re-targeting to 4.3.1 since it is missing a patch, an acked blocker flag, or both

Comment 24 christian.grundmann 2019-07-09 14:17:56 UTC
Created attachment 1588743 [details]
engine.log for vm fsc-gkjay1zx-clientwindows1-c7

Comment 25 christian.grundmann 2019-07-09 14:19:40 UTC
In some cases when we hit this bug we can see the Job for creating the vm running a long time (10-45 minutes) before it fails.
If this happens all other jobs which operate on the same template (delete vm, create vm) are blocked until the long running create fails.

I added a log file with multiple retries creating the VM "fsc-gkjay1zx-clientwindows1-c7" from a template
A database error can be found in the log:


2019-07-08 12:53:42,572+02 ERROR [org.ovirt.engine.core.bll.AddVmCommand] (default task-49982) [] Command 'org.ovirt.engine.core.bll.AddVmCommand' failed: Failed to obtain JDBC Connection; nested exception is java.sql.SQLException: javax.resource.ResourceException: IJ000457: Unchecked throwable in managedConnectionReconnected() cl=org.jboss.jca.core.connectionmanager.listener.TxConnectionListener@6cc2006f[state=NORMAL managed connection=org.jboss.jca.adapters.jdbc.local.LocalManagedConnection@258f4e0 connection handles=1 lastReturned=1562583222519 lastValidated=1562582322918 lastCheckedOut=1562583222519 trackByTx=false pool=org.jboss.jca.core.connectionmanager.pool.strategy.OnePool@4c466379 mcp=SemaphoreConcurrentLinkedQueueManagedConnectionPool@387e9380[pool=ENGINEDataSource] xaResource=LocalXAResourceImpl@3def3afd[connectionListener=6cc2006f connectionManager=7388ab63 warned=false currentXid=null productName=PostgreSQL productVersion=10.6 jndiName=java:/ENGINEDataSource] txSync=null]
2019-07-08 12:53:42,573+02 INFO  [org.ovirt.engine.core.vdsbroker.SetVmStatusVDSCommand] (default task-50082) [] FINISH, SetVmStatusVDSCommand, return: , log id: f87a880
2019-07-08 12:53:42,572+02 ERROR [org.ovirt.engine.core.bll.AddVmCommand] (default task-49982) [] Exception: org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is java.sql.SQLException: javax.resource.ResourceException: IJ000457: Unchecked throwable in managedConnectionReconnected() cl=org.jboss.jca.core.connectionmanager.listener.TxConnectionListener@6cc2006f[state=NORMAL managed connection=org.jboss.jca.adapters.jdbc.local.LocalManagedConnection@258f4e0 connection handles=1 lastReturned=1562583222519 lastValidated=1562582322918 lastCheckedOut=1562583222519 trackByTx=false pool=org.jboss.jca.core.connectionmanager.pool.strategy.OnePool@4c466379 mcp=SemaphoreConcurrentLinkedQueueManagedConnectionPool@387e9380[pool=ENGINEDataSource] xaResource=LocalXAResourceImpl@3def3afd[connectionListener=6cc2006f connectionManager=7388ab63 warned=false currentXid=null productName=PostgreSQL productVersion=10.6 jndiName=java:/ENGINEDataSource] txSync=null]
	at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:81) [spring-jdbc.jar:5.0.4.RELEASE]
	at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:1045) [spring-jdbc.jar:5.0.4.RELEASE]
	at org.springframework.jdbc.core.JdbcTemplate.call(JdbcTemplate.java:1104) [spring-jdbc.jar:5.0.4.RELEASE]
	at org.springframework.jdbc.core.simple.AbstractJdbcCall.executeCallInternal(AbstractJdbcCall.java:414) [spring-jdbc.jar:5.0.4.RELEASE]
	at org.springframework.jdbc.core.simple.AbstractJdbcCall.doExecute(AbstractJdbcCall.java:374) [spring-jdbc.jar:5.0.4.RELEASE]
	at org.springframework.jdbc.core.simple.SimpleJdbcCall.execute(SimpleJdbcCall.java:198) [spring-jdbc.jar:5.0.4.RELEASE]
	at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:135) [dal.jar:]
	at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:130) [dal.jar:]
	at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeModification(SimpleJdbcCallsHandler.java:76) [dal.jar:]
	at org.ovirt.engine.core.dao.DefaultGenericDao.save(DefaultGenericDao.java:93) [dal.jar:]
	at org.ovirt.engine.core.dao.VmDynamicDaoImpl.save(VmDynamicDaoImpl.java:35) [dal.jar:]
	at org.ovirt.engine.core.bll.AddVmCommand.addVmDynamic(AddVmCommand.java:1272) [bll.jar:]
	at org.ovirt.engine.core.bll.AddVmCommand.lambda$executeVmCommand$3(AddVmCommand.java:1003) [bll.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInNewTransaction(TransactionSupport.java:202) [utils.jar:]
	at org.ovirt.engine.core.bll.AddVmCommand.executeVmCommand(AddVmCommand.java:1001) [bll.jar:]
	at org.ovirt.engine.core.bll.VmCommand.executeCommand(VmCommand.java:172) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1157) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1315) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1964) [bll.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:164) [utils.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:103) [utils.jar:]
	at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1375) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:419) [bll.jar:]
	at org.ovirt.engine.core.bll.executor.DefaultBackendActionExecutor.execute(DefaultBackendActionExecutor.java:13) [bll.jar:]
	at org.ovirt.engine.core.bll.Backend.runAction(Backend.java:451) [bll.jar:]
	at org.ovirt.engine.core.bll.Backend.runActionImpl(Backend.java:433) [bll.jar:]
	at org.ovirt.engine.core.bll.Backend.runAction(Backend.java:388) [bll.jar:]
	at sun.reflect.GeneratedMethodAccessor482.invoke(Unknown Source) [:1.8.0_212]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_212]
	at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_212]
	at org.jboss.as.ee.component.ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptor.java:52)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509)
	at org.jboss.as.weld.ejb.DelegatingInterceptorInvocationContext.proceed(DelegatingInterceptorInvocationContext.java:92) [wildfly-weld-ejb-15.0.1.Final.jar:15.0.1.Final]
	at org.jboss.weld.interceptor.proxy.WeldInvocationContextImpl.interceptorChainCompleted(WeldInvocationContextImpl.java:107) [weld-core-impl-3.0.5.Final.jar:3.0.5.Final]
	at org.jboss.weld.interceptor.proxy.WeldInvocationContextImpl.proceed(WeldInvocationContextImpl.java:126) [weld-core-impl-3.0.5.Final.jar:3.0.5.Final]
	at org.ovirt.engine.core.common.di.interceptor.LoggingInterceptor.apply(LoggingInterceptor.java:12) [common.jar:]
	at sun.reflect.GeneratedMethodAccessor74.invoke(Unknown Source) [:1.8.0_212]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_212]
	at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_212]
	at org.jboss.weld.interceptor.reader.SimpleInterceptorInvocation$SimpleMethodInvocation.invoke(SimpleInterceptorInvocation.java:73) [weld-core-impl-3.0.5.Final.jar:3.0.5.Final]
	at org.jboss.weld.interceptor.proxy.WeldInvocationContextImpl.invokeNext(WeldInvocationContextImpl.java:92) [weld-core-impl-3.0.5.Final.jar:3.0.5.Final]
	at org.jboss.weld.interceptor.proxy.WeldInvocationContextImpl.proceed(WeldInvocationContextImpl.java:124) [weld-core-impl-3.0.5.Final.jar:3.0.5.Final]
	at org.jboss.weld.bean.InterceptorImpl.intercept(InterceptorImpl.java:105) [weld-core-impl-3.0.5.Final.jar:3.0.5.Final]
	at org.jboss.as.weld.ejb.DelegatingInterceptorInvocationContext.proceed(DelegatingInterceptorInvocationContext.java:82) [wildfly-weld-ejb-15.0.1.Final.jar:15.0.1.Final]
	at org.jboss.as.weld.interceptors.EjbComponentInterceptorSupport.delegateInterception(EjbComponentInterceptorSupport.java:60)
	at org.jboss.as.weld.interceptors.Jsr299BindingsInterceptor.delegateInterception(Jsr299BindingsInterceptor.java:77)
	at org.jboss.as.weld.interceptors.Jsr299BindingsInterceptor.doMethodInterception(Jsr299BindingsInterceptor.java:89)
	at org.jboss.as.weld.interceptors.Jsr299BindingsInterceptor.processInvocation(Jsr299BindingsInterceptor.java:102)
	at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:63)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509)
	at org.ovirt.engine.core.bll.interceptors.CorrelationIdTrackerInterceptor.aroundInvoke(CorrelationIdTrackerInterceptor.java:13) [bll.jar:]
	at sun.reflect.GeneratedMethodAccessor88.invoke(Unknown Source) [:1.8.0_212]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_212]
	at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_212]
	at org.jboss.as.ee.component.ManagedReferenceLifecycleMethodInterceptor.processInvocation(ManagedReferenceLifecycleMethodInterceptor.java:89)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.component.invocationmetrics.ExecutionTimeInterceptor.processInvocation(ExecutionTimeInterceptor.java:43) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ee.concurrent.ConcurrentContextInterceptor.processInvocation(ConcurrentContextInterceptor.java:45) [wildfly-ee-15.0.1.Final.jar:15.0.1.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.InitialInterceptor.processInvocation(InitialInterceptor.java:40)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:53)
	at org.jboss.as.ee.component.interceptors.ComponentDispatcherInterceptor.processInvocation(ComponentDispatcherInterceptor.java:52)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.component.singleton.SingletonComponentInstanceAssociationInterceptor.processInvocation(SingletonComponentInstanceAssociationInterceptor.java:53) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInterceptor.java:216) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
	at org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:418) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
	at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:148) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509)
	at org.jboss.weld.module.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:81) [weld-ejb-3.0.5.Final.jar:3.0.5.Final]
	at org.jboss.as.weld.ejb.EjbRequestScopeActivationInterceptor.processInvocation(EjbRequestScopeActivationInterceptor.java:89)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.component.invocationmetrics.WaitTimeInterceptor.processInvocation(WaitTimeInterceptor.java:47) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.security.SecurityContextInterceptor.processInvocation(SecurityContextInterceptor.java:100) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.deployment.processors.StartupAwaitInterceptor.processInvocation(StartupAwaitInterceptor.java:22) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.component.interceptors.LoggingInterceptor.processInvocation(LoggingInterceptor.java:67) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.ContextClassLoaderInterceptor.processInvocation(ContextClassLoaderInterceptor.java:60)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.InterceptorContext.run(InterceptorContext.java:438)
	at org.wildfly.security.manager.WildFlySecurityManager.doChecked(WildFlySecurityManager.java:618)
	at org.jboss.invocation.AccessCheckingInterceptor.processInvocation(AccessCheckingInterceptor.java:57)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:53)
	at org.jboss.as.ee.component.ViewService$View.invoke(ViewService.java:198)
	at org.jboss.as.ee.component.ViewDescription$1.processInvocation(ViewDescription.java:185)
	at org.jboss.as.ee.component.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:81)
	at org.ovirt.engine.core.common.interfaces.BackendLocal$$$view3.runAction(Unknown Source) [common.jar:]
	at org.ovirt.engine.api.restapi.resource.BackendResource.doAction(BackendResource.java:250)
	at org.ovirt.engine.api.restapi.resource.AbstractBackendCollectionResource.doCreateEntity(AbstractBackendCollectionResource.java:233)
	at org.ovirt.engine.api.restapi.resource.AbstractBackendCollectionResource.performCreate(AbstractBackendCollectionResource.java:144)
	at org.ovirt.engine.api.restapi.resource.AbstractBackendCollectionResource.performCreate(AbstractBackendCollectionResource.java:134)
	at org.ovirt.engine.api.restapi.resource.AbstractBackendCollectionResource.performCreate(AbstractBackendCollectionResource.java:153)
	at org.ovirt.engine.api.restapi.resource.BackendVmsResource.addVm(BackendVmsResource.java:568)
	at org.ovirt.engine.api.restapi.resource.BackendVmsResource.add(BackendVmsResource.java:239)
	at sun.reflect.GeneratedMethodAccessor1687.invoke(Unknown Source) [:1.8.0_212]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_212]
	at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_212]
	at org.jboss.resteasy.core.MethodInjectorImpl.invoke(MethodInjectorImpl.java:139) [resteasy-jaxrs-3.6.2.Final.jar:3.6.2.Final]
	at org.jboss.resteasy.core.ResourceMethodInvoker.internalInvokeOnTarget(ResourceMethodInvoker.java:509) [resteasy-jaxrs-3.6.2.Final.jar:3.6.2.Final]
	at org.jboss.resteasy.core.ResourceMethodInvoker.invokeOnTargetAfterFilter(ResourceMethodInvoker.java:399) [resteasy-jaxrs-3.6.2.Final.jar:3.6.2.Final]
	at org.jboss.resteasy.core.ResourceMethodInvoker.lambda$invokeOnTarget$0(ResourceMethodInvoker.java:363) [resteasy-jaxrs-3.6.2.Final.jar:3.6.2.Final]
	at org.jboss.resteasy.core.interception.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:355) [resteasy-jaxrs-3.6.2.Final.jar:3.6.2.Final]
	at org.jboss.resteasy.core.ResourceMethodInvoker.invokeOnTarget(ResourceMethodInvoker.java:365) [resteasy-jaxrs-3.6.2.Final.jar:3.6.2.Final]
	at org.jboss.resteasy.core.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:337) [resteasy-jaxrs-3.6.2.Final.jar:3.6.2.Final]
	at org.jboss.resteasy.core.ResourceLocatorInvoker.invokeOnTargetObject(ResourceLocatorInvoker.java:137) [resteasy-jaxrs-3.6.2.Final.jar:3.6.2.Final]
	at org.jboss.resteasy.core.ResourceLocatorInvoker.invoke(ResourceLocatorInvoker.java:100) [resteasy-jaxrs-3.6.2.Final.jar:3.6.2.Final]
	at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:439) [resteasy-jaxrs-3.6.2.Final.jar:3.6.2.Final]
	at org.jboss.resteasy.core.SynchronousDispatcher.lambda$invoke$4(SynchronousDispatcher.java:229) [resteasy-jaxrs-3.6.2.Final.jar:3.6.2.Final]
	at org.jboss.resteasy.core.SynchronousDispatcher.lambda$preprocess$0(SynchronousDispatcher.java:135) [resteasy-jaxrs-3.6.2.Final.jar:3.6.2.Final]
	at org.jboss.resteasy.core.interception.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:355) [resteasy-jaxrs-3.6.2.Final.jar:3.6.2.Final]
	at org.jboss.resteasy.core.SynchronousDispatcher.preprocess(SynchronousDispatcher.java:138) [resteasy-jaxrs-3.6.2.Final.jar:3.6.2.Final]
	at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:215) [resteasy-jaxrs-3.6.2.Final.jar:3.6.2.Final]
	at org.jboss.resteasy.plugins.server.servlet.ServletContainerDispatcher.service(ServletContainerDispatcher.java:227) [resteasy-jaxrs-3.6.2.Final.jar:3.6.2.Final]
	at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:56) [resteasy-jaxrs-3.6.2.Final.jar:3.6.2.Final]
	at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:51) [resteasy-jaxrs-3.6.2.Final.jar:3.6.2.Final]
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:791) [jboss-servlet-api_4.0_spec-1.0.0.Final.jar:1.0.0.Final]
	at io.undertow.servlet.handlers.ServletHandler.handleRequest(ServletHandler.java:74)
	at io.undertow.servlet.handlers.FilterHandler.handleRequest(FilterHandler.java:81)
	at io.undertow.servlet.handlers.security.ServletSecurityRoleHandler.handleRequest(ServletSecurityRoleHandler.java:62)
	at io.undertow.servlet.handlers.ServletChain$1.handleRequest(ServletChain.java:68)
	at io.undertow.servlet.handlers.ServletDispatchingHandler.handleRequest(ServletDispatchingHandler.java:36)
	at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
	at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
	at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
	at io.undertow.servlet.handlers.ServletInitialHandler.dispatchRequest(ServletInitialHandler.java:274)
	at io.undertow.servlet.handlers.ServletInitialHandler.dispatchToPath(ServletInitialHandler.java:209)
	at io.undertow.servlet.spec.RequestDispatcherImpl.forwardImpl(RequestDispatcherImpl.java:223)
	at io.undertow.servlet.spec.RequestDispatcherImpl.forwardImplSetup(RequestDispatcherImpl.java:149)
	at io.undertow.servlet.spec.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:111)
	at org.ovirt.engine.api.restapi.invocation.VersionFilter.doFilter(VersionFilter.java:178)
	at org.ovirt.engine.api.restapi.invocation.VersionFilter.doFilter(VersionFilter.java:98)
	at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
	at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
	at org.ovirt.engine.api.restapi.invocation.CurrentFilter.doFilter(CurrentFilter.java:117)
	at org.ovirt.engine.api.restapi.invocation.CurrentFilter.doFilter(CurrentFilter.java:72)
	at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
	at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
	at org.ovirt.engine.core.aaa.filters.RestApiSessionMgmtFilter.doFilter(RestApiSessionMgmtFilter.java:78) [aaa.jar:]
	at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
	at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
	at org.ovirt.engine.core.aaa.filters.EnforceAuthFilter.doFilter(EnforceAuthFilter.java:42) [aaa.jar:]
	at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
	at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
	at org.ovirt.engine.core.aaa.filters.SsoRestApiNegotiationFilter.doFilter(SsoRestApiNegotiationFilter.java:84) [aaa.jar:]
	at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
	at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
	at org.ovirt.engine.core.aaa.filters.SsoRestApiAuthFilter.doFilter(SsoRestApiAuthFilter.java:47) [aaa.jar:]
	at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
	at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
	at org.ovirt.engine.core.aaa.filters.SessionValidationFilter.doFilter(SessionValidationFilter.java:59) [aaa.jar:]
	at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
	at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
	at org.ovirt.engine.core.aaa.filters.RestApiSessionValidationFilter.doFilter(RestApiSessionValidationFilter.java:35) [aaa.jar:]
	at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
	at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
	at org.ovirt.engine.api.restapi.security.CSRFProtectionFilter.doFilter(CSRFProtectionFilter.java:111)
	at org.ovirt.engine.api.restapi.security.CSRFProtectionFilter.doFilter(CSRFProtectionFilter.java:102)
	at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
	at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
	at org.ovirt.engine.core.utils.servlet.CORSSupportFilter.doFilter(CORSSupportFilter.java:283) [utils.jar:]
	at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
	at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
	at io.undertow.servlet.handlers.FilterHandler.handleRequest(FilterHandler.java:84)
	at io.undertow.servlet.handlers.security.ServletSecurityRoleHandler.handleRequest(ServletSecurityRoleHandler.java:62)
	at io.undertow.servlet.handlers.ServletChain$1.handleRequest(ServletChain.java:68)
	at io.undertow.servlet.handlers.ServletDispatchingHandler.handleRequest(ServletDispatchingHandler.java:36)
	at org.wildfly.extension.undertow.security.SecurityContextAssociationHandler.handleRequest(SecurityContextAssociationHandler.java:78)
	at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
	at io.undertow.servlet.handlers.security.SSLInformationAssociationHandler.handleRequest(SSLInformationAssociationHandler.java:132)
	at io.undertow.servlet.handlers.security.ServletAuthenticationCallHandler.handleRequest(ServletAuthenticationCallHandler.java:57)
	at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
	at io.undertow.security.handlers.AuthenticationConstraintHandler.handleRequest(AuthenticationConstraintHandler.java:53)
	at io.undertow.security.handlers.AbstractConfidentialityHandler.handleRequest(AbstractConfidentialityHandler.java:46)
	at io.undertow.servlet.handlers.security.ServletConfidentialityConstraintHandler.handleRequest(ServletConfidentialityConstraintHandler.java:64)
	at io.undertow.servlet.handlers.security.ServletSecurityConstraintHandler.handleRequest(ServletSecurityConstraintHandler.java:59)
	at io.undertow.security.handlers.AuthenticationMechanismsHandler.handleRequest(AuthenticationMechanismsHandler.java:60)
	at io.undertow.servlet.handlers.security.CachedAuthenticatedSessionHandler.handleRequest(CachedAuthenticatedSessionHandler.java:77)
	at io.undertow.security.handlers.NotificationReceiverHandler.handleRequest(NotificationReceiverHandler.java:50)
	at io.undertow.security.handlers.AbstractSecurityContextAssociationHandler.handleRequest(AbstractSecurityContextAssociationHandler.java:43)
	at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
	at org.wildfly.extension.undertow.security.jacc.JACCContextIdHandler.handleRequest(JACCContextIdHandler.java:61)
	at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
	at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
	at io.undertow.servlet.handlers.ServletInitialHandler.handleFirstRequest(ServletInitialHandler.java:292)
	at io.undertow.servlet.handlers.ServletInitialHandler.access$100(ServletInitialHandler.java:81)
	at io.undertow.servlet.handlers.ServletInitialHandler$2.call(ServletInitialHandler.java:138)
	at io.undertow.servlet.handlers.ServletInitialHandler$2.call(ServletInitialHandler.java:135)
	at io.undertow.servlet.core.ServletRequestContextThreadSetupAction$1.call(ServletRequestContextThreadSetupAction.java:48)
	at io.undertow.servlet.core.ContextClassLoaderSetupAction$1.call(ContextClassLoaderSetupAction.java:43)
	at org.wildfly.extension.undertow.security.SecurityContextThreadSetupAction.lambda$create$0(SecurityContextThreadSetupAction.java:105)
	at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
	at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
	at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
	at io.undertow.servlet.handlers.ServletInitialHandler.dispatchRequest(ServletInitialHandler.java:272)
	at io.undertow.servlet.handlers.ServletInitialHandler.access$000(ServletInitialHandler.java:81)
	at io.undertow.servlet.handlers.ServletInitialHandler$1.handleRequest(ServletInitialHandler.java:104)
	at io.undertow.server.Connectors.executeRootHandler(Connectors.java:360)
	at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:830)
	at org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
	at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1985)
	at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1487)
	at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1378)
	at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_212]
Caused by: java.sql.SQLException: javax.resource.ResourceException: IJ000457: Unchecked throwable in managedConnectionReconnected() cl=org.jboss.jca.core.connectionmanager.listener.TxConnectionListener@6cc2006f[state=NORMAL managed connection=org.jboss.jca.adapters.jdbc.local.LocalManagedConnection@258f4e0 connection handles=1 lastReturned=1562583222519 lastValidated=1562582322918 lastCheckedOut=1562583222519 trackByTx=false pool=org.jboss.jca.core.connectionmanager.pool.strategy.OnePool@4c466379 mcp=SemaphoreConcurrentLinkedQueueManagedConnectionPool@387e9380[pool=ENGINEDataSource] xaResource=LocalXAResourceImpl@3def3afd[connectionListener=6cc2006f connectionManager=7388ab63 warned=false currentXid=null productName=PostgreSQL productVersion=10.6 jndiName=java:/ENGINEDataSource] txSync=null]
	at org.jboss.jca.adapters.jdbc.WrapperDataSource.getConnection(WrapperDataSource.java:146)
	at org.jboss.as.connector.subsystems.datasources.WildFlyDataSource.getConnection(WildFlyDataSource.java:64)
	at org.springframework.jdbc.datasource.DataSourceUtils.fetchConnection(DataSourceUtils.java:151) [spring-jdbc.jar:5.0.4.RELEASE]
	at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:115) [spring-jdbc.jar:5.0.4.RELEASE]
	at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:78) [spring-jdbc.jar:5.0.4.RELEASE]
	... 217 more
Caused by: javax.resource.ResourceException: IJ000457: Unchecked throwable in managedConnectionReconnected() cl=org.jboss.jca.core.connectionmanager.listener.TxConnectionListener@6cc2006f[state=NORMAL managed connection=org.jboss.jca.adapters.jdbc.local.LocalManagedConnection@258f4e0 connection handles=1 lastReturned=1562583222519 lastValidated=1562582322918 lastCheckedOut=1562583222519 trackByTx=false pool=org.jboss.jca.core.connectionmanager.pool.strategy.OnePool@4c466379 mcp=SemaphoreConcurrentLinkedQueueManagedConnectionPool@387e9380[pool=ENGINEDataSource] xaResource=LocalXAResourceImpl@3def3afd[connectionListener=6cc2006f connectionManager=7388ab63 warned=false currentXid=null productName=PostgreSQL productVersion=10.6 jndiName=java:/ENGINEDataSource] txSync=null]
	at org.jboss.jca.core.connectionmanager.AbstractConnectionManager.reconnectManagedConnection(AbstractConnectionManager.java:975)
	at org.jboss.jca.core.connectionmanager.AbstractConnectionManager.allocateConnection(AbstractConnectionManager.java:792)
	at org.jboss.jca.adapters.jdbc.WrapperDataSource.getConnection(WrapperDataSource.java:138)
	... 221 more
Caused by: javax.resource.ResourceException: IJ000461: Could not enlist in transaction on entering meta-aware object
	at org.jboss.jca.core.connectionmanager.tx.TxConnectionManagerImpl.managedConnectionReconnected(TxConnectionManagerImpl.java:571)
	at org.jboss.jca.core.connectionmanager.AbstractConnectionManager.reconnectManagedConnection(AbstractConnectionManager.java:970)
	... 223 more
Caused by: java.lang.IllegalStateException: Transaction Local transaction (delegate=TransactionImple < ac, BasicAction: 0:ffff0a0a600a:-6fccc7b3:5d05ffb0:3bc987e status: ActionStatus.ABORTED >, owner=Local transaction context for provider JBoss JTA transaction provider) is not active STATUS_ROLLEDBACK
	at org.jboss.jca.core.connectionmanager.listener.TxConnectionListener.enlist(TxConnectionListener.java:296)
	at org.jboss.jca.core.connectionmanager.tx.TxConnectionManagerImpl.managedConnectionReconnected(TxConnectionManagerImpl.java:564)
	... 224 more

Comment 26 Ryan Barry 2019-07-09 15:47:53 UTC
Checking the logs. Failing to connect to the database should basically never happen, and a brief look shows other operations mostly ok.

Martin, is there a known bug around this?

Comment 27 christian.grundmann 2019-09-19 09:33:31 UTC
Any Updates on this Bug? We hit this problem hundreds of times a day.

Comment 28 Ryan Barry 2019-09-19 14:14:25 UTC
Same error in the logs?

Comment 29 oliver.albl 2019-09-20 06:40:27 UTC
Created attachment 1617034 [details]
engine.log from today, 00:00 to 03.00am

Attached engine.log with current messages from today 00:00am to 03:00am.

re:Database connections: we use ENGINE_DB_MAX_CONNECTIONS=300 in /etc/ovirt-engine/engine.conf.d//99-custom.conf and 400 PostgreSQL connections:

engine=# SHOW max_connections;
 max_connections
-----------------
 400

Comment 30 Ryan Barry 2019-09-20 13:27:21 UTC
Martin, still jdbc connection errors. Any ideas from the log?

Comment 31 oliver.albl 2019-10-15 14:18:40 UTC
I checked database queries and locks - I seems there is a bottleneck in get_entity_permissions:

List of active PostgreSQL queries:

  pid  |          query_start          |     duration     |  wait_event   | wait_event_type | backend_xid |                                                                               query
-------+-------------------------------+------------------+---------------+-----------------+-------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------
  9354 | 2019-10-15 14:18:28.288848+02 | 01:10:13.275115  |               |                 |             | autovacuum: VACUUM ANALYZE public.vm_interface_statistics
  8221 | 2019-10-15 14:35:38.510619+02 | 00:53:03.053344  |               |                 |             | autovacuum: VACUUM pg_toast.pg_toast_16876
  1200 | 2019-10-15 14:44:51.529132+02 | 00:43:50.034831  | tuple         | Lock            |  1851321666 | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
 32740 | 2019-10-15 15:06:07.899729+02 | 00:22:33.664234  | tuple         | Lock            |  1851384698 | select * from deletevmstatic($1, $2) as result
   309 | 2019-10-15 15:07:10.595158+02 | 00:21:30.968805  | tuple         | Lock            |  1851387682 | select * from deletevmstatic($1, $2) as result
  5613 | 2019-10-15 15:07:13.238539+02 | 00:21:28.325424  | tuple         | Lock            |  1851387839 | select * from deletevmstatic($1, $2) as result
   932 | 2019-10-15 15:07:23.372251+02 | 00:21:18.191712  | tuple         | Lock            |  1851388326 | select * from deletevmstatic($1, $2) as result
  1188 | 2019-10-15 15:08:06.837363+02 | 00:20:34.7266    | tuple         | Lock            |  1851391569 | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
  5628 | 2019-10-15 15:12:53.903368+02 | 00:15:47.660595  | tuple         | Lock            |  1851406867 | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
   734 | 2019-10-15 15:15:44.639126+02 | 00:12:56.924837  | tuple         | Lock            |  1851415621 | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
 27058 | 2019-10-15 15:18:42.728431+02 | 00:09:58.835532  | tuple         | Lock            |  1851425007 | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
  5766 | 2019-10-15 15:19:25.293926+02 | 00:09:16.270037  | tuple         | Lock            |  1851425962 | select * from deletevmstatic($1, $2) as result
   627 | 2019-10-15 15:19:35.298312+02 | 00:09:06.265651  | tuple         | Lock            |  1851426409 | select * from deletevmstatic($1, $2) as result
 32742 | 2019-10-15 15:19:45.552827+02 | 00:08:56.011136  | tuple         | Lock            |  1851426803 | select * from deletevmstatic($1, $2) as result
   386 | 2019-10-15 15:19:47.744071+02 | 00:08:53.819892  | tuple         | Lock            |  1851426898 | select * from deletevmstatic($1, $2) as result
   388 | 2019-10-15 15:19:57.296754+02 | 00:08:44.267209  | tuple         | Lock            |  1851427299 | select * from deletevmstatic($1, $2) as result
   382 | 2019-10-15 15:20:01.688527+02 | 00:08:39.875436  | tuple         | Lock            |  1851427472 | select * from deletevmstatic($1, $2) as result
  5620 | 2019-10-15 15:20:28.162028+02 | 00:08:13.401935  | tuple         | Lock            |  1851428781 | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
   381 | 2019-10-15 15:20:35.125342+02 | 00:08:06.438621  | tuple         | Lock            |  1851429498 | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
  5776 | 2019-10-15 15:20:39.908977+02 | 00:08:01.654986  | tuple         | Lock            |  1851429508 | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
   308 | 2019-10-15 15:20:51.269025+02 | 00:07:50.294938  | tuple         | Lock            |  1851430190 | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
 24290 | 2019-10-15 15:21:27.759719+02 | 00:07:13.804244  | tuple         | Lock            |  1851431680 | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
 32660 | 2019-10-15 15:21:48.720561+02 | 00:06:52.843402  |               |                 |             | autovacuum: VACUUM ANALYZE public.vm_statistics
 11809 | 2019-10-15 15:22:28.697261+02 | 00:06:12.866702  | tuple         | Lock            |  1851434923 | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
  5621 | 2019-10-15 15:22:31.712517+02 | 00:06:09.851446  | tuple         | Lock            |  1851435592 | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
 27072 | 2019-10-15 15:22:58.671025+02 | 00:05:42.892938  | tuple         | Lock            |  1851436269 | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
 24170 | 2019-10-15 15:26:23.339741+02 | 00:02:18.224222  | transactionid | Lock            |  1851447148 | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
  4072 | 2019-10-15 15:27:30.742165+02 | 00:01:10.821798  |               |                 |             | autovacuum: VACUUM ANALYZE public.disk_image_dynamic
  5149 | 2019-10-15 15:28:29.687431+02 | 00:00:11.876532  |               |                 |             | autovacuum: VACUUM ANALYZE public.vds_interface_statistics
   921 | 2019-10-15 15:28:31.586535+02 | 00:00:09.977428  |               |                 |  1851441692 | select * from  get_entity_permissions($1, $2, $3, $4)
  4655 | 2019-10-15 15:28:37.609028+02 | 00:00:03.954935  |               |                 |  1851452535 | autovacuum: VACUUM public.ad_groups
 26927 | 2019-10-15 15:28:40.95742+02  | 00:00:00.606543  |               |                 |             | SELECT * FROM (SELECT * FROM vms WHERE ( vm_guid IN (SELECT distinct vms_with_tags.vm_guid FROM  vms_with_tags   WHERE  (  vms_with_tags.cluster_compatibility_v
  5528 | 2019-10-15 15:28:41.457081+02 | 00:00:00.106882  |               |                 |             | SELECT * FROM ((SELECT  audit_log.* FROM  audit_log   WHERE  audit_log.severity = '10'  AND not deleted)  ORDER BY audit_log_id DESC ) as T1 OFFSET (1 -1) LIMIT
 27059 | 2019-10-15 15:28:41.457081+02 | 00:00:00.106882  |               |                 |             | SELECT * FROM ((SELECT  audit_log.* FROM  audit_log   WHERE  audit_log.severity = '10'  AND not deleted)  ORDER BY audit_log_id DESC ) as T1 OFFSET (1 -1) LIMIT
  5527 | 2019-10-15 15:28:41.457081+02 | 00:00:00.106882  |               |                 |             | SELECT * FROM ((SELECT  audit_log.* FROM  audit_log   WHERE  audit_log.severity = '10'  AND not deleted)  ORDER BY audit_log_id DESC ) as T1 OFFSET (1 -1) LIMIT
 19771 | 2019-10-15 15:28:41.468637+02 | 00:00:00.095326  |               |                 |             | SELECT * FROM ((SELECT  distinct  audit_log.* FROM  audit_log   LEFT OUTER JOIN vms_with_tags ON audit_log.vm_id=vms_with_tags.vm_guid    WHERE  vms_with_tags.v
   395 | 2019-10-15 15:28:41.49595+02  | 00:00:00.068013  |               |                 |             | select * from  getdisksvmguid($1, $2, $3, $4)
  5727 | 2019-10-15 15:28:41.504785+02 | 00:00:00.059178  |               |                 |             | select * from  getdisksvmguid($1, $2, $3, $4)
   925 | 2019-10-15 15:28:41.526192+02 | 00:00:00.037771  |               |                 |             | select * from  getdisksvmguid($1, $2, $3, $4)
 30404 | 2019-10-15 15:28:41.534436+02 | 00:00:00.029527  |               |                 |             | select * from  getstorage_domains_by_storagepoolid($1, $2, $3)
  5537 | 2019-10-15 15:28:41.563963+02 | 00:00:00         |               |                 |             | select pid,query_start,now()-query_start,wait_event,wait_event_type,backend_xid,left(query,160) from pg_stat_activity where state = 'active' order by query_star
  1189 | 2019-10-15 15:28:41.565652+02 | -00:00:00.001689 |               |                 |  1851452555 | select * from updategroup($1, $2, $3, $4, $5, $6) as result
 32739 | 2019-10-15 15:28:41.567017+02 | -00:00:00.003054 |               |                 |  1851452554 | select * from public.updatevmstatistics(CAST ($1 AS numeric),CAST ($2 AS numeric),CAST ($3 AS numeric),CAST ($4 AS int4),CAST ($5 AS int4),CAST ($6 AS int4),CAS

All long running queries are directly or indirectly waiting for transactions running "select * from  get_entity_permissions($1, $2, $3, $4)"

Example:

  pid  |          query_start          |     duration     |  wait_event   | wait_event_type | backend_xid |                                                                               query
-------+-------------------------------+------------------+---------------+-----------------+-------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------
  1200 | 2019-10-15 14:44:51.529132+02 | 00:43:50.034831  | tuple         | Lock            |  1851321666 | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,

Pid 1200 is waiting for tuple: 

   locktype    | database |  relation  | page | tuple | virtualxid  | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |           mode           | granted | fastpath
---------------+----------+------------+------+-------+-------------+---------------+---------+-------+----------+--------------------+-------+--------------------------+---------+----------
 relation      |    16400 |      34042 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | t
 relation      |    16400 |      34041 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | t
 relation      |    16400 |      34037 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | t
 relation      |    16400 |  893909676 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | t
 relation      |    16400 |  893909668 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | t
 relation      |    16400 |      34117 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | t
 relation      |    16400 |      34110 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | t
 relation      |    16400 |      34231 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | t
 relation      |    16400 |      34184 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | t
 relation      |    16400 |      34183 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | t
 relation      |    16400 |      34175 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | t
 relation      |    16400 |      16698 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | t
 relation      |    16400 |      16689 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | t
 relation      |    16400 |      16821 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | t
 relation      |    16400 |      16827 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | t
 relation      |    16400 | 3915845230 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | t
 virtualxid    |          |            |      |       | 296/20574   |               |         |       |          | 296/20574          |  1200 | ExclusiveLock            | t       | t
 relation      |    16400 |      16737 |      |       |             |               |         |       |          | 296/20574          |  1200 | RowShareLock             | t       | f
 relation      |    16400 |      34044 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | f
 transactionid |          |            |      |       |             |    1851321666 |         |       |          | 296/20574          |  1200 | ExclusiveLock            | t       | f
 relation      |    16400 |      34731 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | f
 relation      |    16400 |      34723 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | f
 relation      |    16400 |      34733 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | f
 relation      |    16400 |      34717 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | f
 relation      |    16400 |      34732 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | f
 relation      |    16400 |      34729 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | f
 relation      |    16400 |      34735 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | f
 relation      |    16400 |      34730 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | f
 relation      |    16400 |      34710 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 296/20574          |  1200 | AccessExclusiveLock      | f       | f ========> WAITING FOR THIS LOCK
 relation      |    16400 | 3412388157 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | f
 relation      |    16400 |      35238 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | f
 relation      |    16400 |      34043 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | f
 relation      |    16400 |      34734 |      |       |             |               |         |       |          | 296/20574          |  1200 | AccessShareLock          | t       | f

This lock is held by pid 24170:

   locktype    | database |  relation  | page | tuple | virtualxid  | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |           mode           | granted | fastpath
---------------+----------+------------+------+-------+-------------+---------------+---------+-------+----------+--------------------+-------+--------------------------+---------+----------
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 200/105144         |  5776 | AccessExclusiveLock      | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 113/3389146        | 32742 | ExclusiveLock            | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 41/4872932         |  5613 | ExclusiveLock            | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 22/1438873         | 11809 | AccessExclusiveLock      | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 70/4136129         |   381 | AccessExclusiveLock      | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 240/38216          |   627 | ExclusiveLock            | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 280/17506          |  1188 | AccessExclusiveLock      | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 127/676286         | 27072 | AccessExclusiveLock      | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 245/54549          |   382 | ExclusiveLock            | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 104/2122318        | 32740 | ExclusiveLock            | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 164/93028          |  5628 | AccessExclusiveLock      | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 196/68074          |  5766 | ExclusiveLock            | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 157/464766         |  5620 | AccessExclusiveLock      | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 309/24703          | 24170 | AccessExclusiveLock      | t       | f ========> HOLDING FOR THIS LOCK
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 159/690083         |  5621 | AccessExclusiveLock      | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 296/20574          |  1200 | AccessExclusiveLock      | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 287/30226          |   932 | ExclusiveLock            | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 106/2174716        | 27058 | AccessExclusiveLock      | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 264/43344          |   734 | AccessExclusiveLock      | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 247/109781         |   388 | ExclusiveLock            | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 252/45259          |   386 | ExclusiveLock            | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 187/73408          |   308 | AccessExclusiveLock      | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 189/85051          |   309 | ExclusiveLock            | f       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 325/37193          | 24290 | AccessExclusiveLock      | f       | f

Pid 24170 is waiting for transaction

   locktype    | database |  relation  | page | tuple | virtualxid  | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |           mode           | granted | fastpath
---------------+----------+------------+------+-------+-------------+---------------+---------+-------+----------+--------------------+-------+--------------------------+---------+----------
 relation      |    16400 |      34042 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | t
 relation      |    16400 |      34041 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | t
 relation      |    16400 |      34037 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | t
 relation      |    16400 |  893909676 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | t
 relation      |    16400 |  893909668 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | t
 relation      |    16400 |      34117 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | t
 relation      |    16400 |      34110 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | t
 relation      |    16400 |      34231 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | t
 relation      |    16400 |      34184 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | t
 relation      |    16400 |      34183 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | t
 relation      |    16400 |      34175 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | t
 relation      |    16400 |      16698 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | t
 relation      |    16400 |      16689 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | t
 relation      |    16400 |      16821 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | t
 relation      |    16400 |      16827 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | t
 relation      |    16400 | 3915845230 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | t
 virtualxid    |          |            |      |       | 309/24703   |               |         |       |          | 309/24703          | 24170 | ExclusiveLock            | t       | t
 transactionid |          |            |      |       |             |    1851447148 |         |       |          | 309/24703          | 24170 | ExclusiveLock            | t       | f
 relation      |    16400 |      16737 |      |       |             |               |         |       |          | 309/24703          | 24170 | RowShareLock             | t       | f
 transactionid |          |            |      |       |             |    1851441692 |         |       |          | 309/24703          | 24170 | ShareLock                | f       | f ========> WAITING FOR THIS LOCK
 relation      |    16400 |      34731 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | f
 relation      |    16400 |      34044 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | f
 relation      |    16400 |      34723 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | f
 relation      |    16400 |      34733 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | f
 relation      |    16400 |      34717 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | f
 relation      |    16400 |      34732 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | f
 relation      |    16400 |      34729 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | f
 relation      |    16400 |      34735 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | f
 relation      |    16400 |      34730 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | f
 relation      |    16400 |      34710 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | f
 tuple         |    16400 |      16737 |  323 |    15 |             |               |         |       |          | 309/24703          | 24170 | AccessExclusiveLock      | t       | f
 relation      |    16400 | 3412388157 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | f
 relation      |    16400 |      35238 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | f
 relation      |    16400 |      34043 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | f
 relation      |    16400 |      34734 |      |       |             |               |         |       |          | 309/24703          | 24170 | AccessShareLock          | t       | f

Transaction 1851441692 is running get_entity_permissions:

  pid  |          query_start          |     duration     |  wait_event   | wait_event_type | backend_xid |                                                                               query
-------+-------------------------------+------------------+---------------+-----------------+-------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------
   921 | 2019-10-15 15:28:31.586535+02 | 00:00:09.977428  |               |                 |  1851441692 | select * from  get_entity_permissions($1, $2, $3, $4)

Please note that the long running query is running for 45 minutes while get_entity_permissions is running for only 10 seconds. From my analysis, all long running querys end waiting for get_entity_permissions queries.

For the long running query I saw different waits for tuples, so the query proceeds but as soon as it got a tuple lock it is blocked while trying to access the next page or tuple.

Comment 32 oliver.albl 2019-10-15 14:20:48 UTC
PostgreSQL locking and query analysis was after upgrading to 4.3.6.7-1.el7

Comment 33 oliver.albl 2019-10-15 17:47:36 UTC
I checked into get_entity_permissions and identified engine_sessions as a potential candidate for this problem:
- get_entity_permissions calls getUserAndGroupsById for the user_id and queries engine_sessions.
- engine_sessions has more than 40.000 rows matching this query and getUserAndGroupsById takes more than 15 seconds to complete.

How can we cleanup engine_sessions?

Comment 34 christian.grundmann 2019-10-17 14:37:19 UTC
We managed to track the open sessions down to the used vagrant ovirt4 plugin.
The plugin doesn't close the ovirt connection, so the sessions stay open after each API call until the timeout (UserSessionTimeOutInterval)
We created a pull request https://github.com/myoung34/vagrant-ovirt4/pull/106 to close the connection after the used actions.
Now we are down to < 300 sessions per user and no longer have errors or timeouts creating vms.

In my opinion there are 2 issues left:

 - a high number of sessions from the same user must not block other operations.
 - add APISessionTimeoutInterval option to allow different settings for API and user sessions

Comment 35 Martin Perina 2020-03-02 14:49:42 UTC
(In reply to christian.grundmann from comment #34)
> We managed to track the open sessions down to the used vagrant ovirt4 plugin.
> The plugin doesn't close the ovirt connection, so the sessions stay open
> after each API call until the timeout (UserSessionTimeOutInterval)
> We created a pull request
> https://github.com/myoung34/vagrant-ovirt4/pull/106 to close the connection
> after the used actions.
> Now we are down to < 300 sessions per user and no longer have errors or
> timeouts creating vms.
> 
> In my opinion there are 2 issues left:
> 
>  - a high number of sessions from the same user must not block other
> operations.

It doesn't block other users, it just overloads your current system, so if you had more powerful server, engine would be responding better

>  - add APISessionTimeoutInterval option to allow different settings for API
> and user sessions

It's clearly defined, that clients needs to close their connections (sessions) when they don't need it, otherwise connections (sessions) will be discarded automatically after specified timeout. Introducing different session timeout for RESTAPI and UI would be quite problematic, not sure how it would comply with our OAUTH/OIDC integration. But anyway due to limited resources, closing as deferred.