Description of problem:

While trying to diagnose a RHV environment that is not able to add storage domains, we noticed several problems. Among them, one may be a bug: when the engine is starting there are lots of "Interrupted attempting lock" SQL exceptions. There are also more DB exceptions later, but they seem to concentrate during initialization times, see:

2019-01-10 15:00:28,289Z INFO [org.ovirt.engine.core.bll.Backend] (ServerService Thread Pool -- 41) [] Running ovirt-engine 4.2.7.5-0.1.el7ev

2019-01-10 15:03:30,976Z INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogableBase] (EE-ManagedThreadFactory-engineScheduled-Thread-16) [4575cbf0] Failed to get vds '88d5ae8c-fe39-4f2b-bd29-27b5238d8fd1', error: PreparedStatementCallback; uncategorized SQLException for SQL [select * from getvdsstaticbyvdsid(?)]; SQL state [null]; error code [0]; IJ031013: Interrupted attempting lock: org.jboss.jca.adapters.jdbc.local.LocalManagedConnection@72c6f1fc; nested exception is java.sql.SQLException: IJ031013: Interrupted attempting lock: org.jboss.jca.adapters.jdbc.local.LocalManagedConnection@72c6f1fc

Several queries are impacted, among them:
getvdsbyvdsid, getvdsstaticbyvdsid, getiscsiifacesbyhostidandstoragetargetid

Version-Release number of selected component (if applicable):
ovirt-engine 4.2.7.5-0.1.el7ev

How reproducible:
Unknown, but large scale environment

engine=> select count(*) from vds;
 count
-------
   127

engine=> select count(*) from storage_domains;
 count
-------
   354
(1 row)

Additional info:

1. There are several other apparent problems, we are working on possible network and unreachable storage issues. Still, it may be a scalability problem and these exceptions do not look right.

2. We already raised max_connections. Seems to be using ~110 during sosreport.
/var/opt/rh/rh-postgresql95/lib/pgsql/data/postgresql.conf:max_connections = 250
Value of property 'ENGINE_DB_MAX_CONNECTIONS' is '200'.
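For anyone triaging a similar engine.log, here is a minimal sketch of how the impacted queries could be tallied. It is not part of the original report. It assumes each exception sits on a single log line and that the stored procedure name appears as "for SQL [select * from <name>(", as in the excerpt above. The function name count_interrupted_locks and the SAMPLE constant are hypothetical.

```python
import re
from collections import Counter

# Marker text taken from the exception quoted in the report.
MARKER = "IJ031013: Interrupted attempting lock"

# Assumption: the procedure name follows "for SQL [select * from ".
SQL_CALL = re.compile(r"for SQL \[select \* from (\w+)\(")


def count_interrupted_locks(lines):
    """Count 'Interrupted attempting lock' errors per stored procedure."""
    counts = Counter()
    for line in lines:
        if MARKER not in line:
            continue  # unrelated log line
        match = SQL_CALL.search(line)
        # Lines with the marker but no recognizable SQL go under "<unknown>".
        counts[match.group(1) if match else "<unknown>"] += 1
    return counts


# Shortened copy of the log line from the report, used as sample input.
SAMPLE = (
    "2019-01-10 15:03:30,976Z INFO [org.ovirt.engine.core.dal.dbbroker."
    "auditloghandling.AuditLogableBase] Failed to get vds "
    "'88d5ae8c-fe39-4f2b-bd29-27b5238d8fd1', error: PreparedStatementCallback; "
    "uncategorized SQLException for SQL [select * from getvdsstaticbyvdsid(?)]; "
    "SQL state [null]; error code [0]; IJ031013: Interrupted attempting lock"
)

if __name__ == "__main__":
    for name, hits in count_interrupted_locks([SAMPLE]).most_common():
        print(name, hits)
```

To run it against a real log, pass an open file handle (for example engine.log from the sosreport) instead of the sample list. Grouping the hits by timestamp would be the natural next step to confirm that they cluster around engine startup.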
The sev 1 issue seen was fixed by a hardware upgrade. Closing this bug; please reopen if applicable.