Red Hat Bugzilla – Bug 999165
Frequent statement timeouts while updating availabilities
Last modified: 2015-12-21 08:59:34 EST
Description of problem:
In a testing environment, I am seeing very regular, frequent statement timeouts. The exceptions in the server log look like:
15:05:28,558 WARN [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (http-/0.0.0.0:7080-3) SQL Error: 0, SQLState: 57014
15:05:28,558 ERROR [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (http-/0.0.0.0:7080-3) ERROR: canceling statement due to statement timeout
15:05:28,590 ERROR [org.jboss.as.ejb3.invocation] (http-/0.0.0.0:7080-3) JBAS014134: EJB Invocation failed on component AvailabilityManagerBean for method public abstract void org.rhq.enterprise.server.measurement.AvailabilityManagerLocal.updateLastAvailabilityReportInNewTransaction(int): javax.ejb.EJBException: javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: could not execute statement
Caused by: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout
And in the postgres log I am seeing, over and over:
ERROR: canceling statement due to statement timeout
STATEMENT: update RHQ_AGENT set LAST_AVAILABILITY_REPORT=$1, BACKFILLED=false where ID=$2
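For reference, PostgreSQL reports statements canceled by `statement_timeout` with SQLState 57014 (query_canceled), which is exactly what the JDBC driver surfaces as the `PSQLException` above. A minimal sketch of recognizing that condition from a plain `SQLException` (the class and method names here are hypothetical, not RHQ code):

```java
import java.sql.SQLException;

public class TimeoutCheck {
    // PostgreSQL uses SQLState 57014 (query_canceled) for statements
    // canceled by statement_timeout or an explicit cancel request.
    static boolean isStatementTimeout(SQLException e) {
        return "57014".equals(e.getSQLState());
    }

    public static void main(String[] args) {
        SQLException e = new SQLException(
            "ERROR: canceling statement due to statement timeout", "57014");
        System.out.println(isStatementTimeout(e)); // prints true
    }
}
```

Checking the SQLState rather than the message text is the robust way to detect this, since the message wording is locale-dependent.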
This issue is directly related to bug 998945. The build number is 3c8110a.
Created attachment 788652: the server log
Created attachment 788653
Is this still an issue, or was it just temporary?
Could this be related to the JDBC driver version we use, as 9.2-1000 seems to have introduced query timeouts?
I personally have not seen those timeouts yet.
I think this is related to bug 1000584. I have not seen the timeouts in my local environment; I only encountered them in test environments with fewer resources (CPU, memory, etc.).
This was filed before our recent mergeAvail changes.
The GWT times BZ may have solved this.
But as the original description talks about backfilling, it may also be something else.
Let's see if the recent changes helped to solve this issue.
Author: Jay Shaughnessy <firstname.lastname@example.org>
Date: Mon Dec 9 14:13:52 2013 -0500
I did a visual inspection of code looking for things that may
contribute to the slow updates and/or timeouts reported in these BZs. In
general the updates for availReports (infrequent but can spike in
certain scenarios, 1 per avail report) and availPing
(frequent, 1 per minute per agent) should be very fast and not in
themselves be an issue. Slowness in these queries is likely caused by the
Agent table being locked by some other longer running transaction.
Everything looked pretty clean, with one exception. When backfilling an agent
due to an agent shutdown or suspect-job detection, it looked like we
may have locked the agent table for the duration of the backfilling, which
can be a fairly large operation for a big agent inventory. I'm not sure
this is the problem, but this commit should alleviate the Agent table
locking while the backfilling is performed.
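The fix described above, shrinking the critical section so the long backfill runs outside the lock, can be sketched in plain Java, modeling the database row lock with a `ReentrantLock`. All names here are hypothetical illustrations, not the actual RHQ code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class BackfillSketch {
    // Stands in for the lock taken on the agent's RHQ_AGENT row.
    private final ReentrantLock agentRowLock = new ReentrantLock();
    private boolean backfilled = false;
    private final List<String> backfilledResources = new ArrayList<>();

    // Problematic shape: the lock is held for the whole (potentially
    // large) per-resource backfill, blocking the frequent agent-ping
    // updates for that entire duration.
    void backfillHoldingLock(List<String> inventory) {
        agentRowLock.lock();
        try {
            backfilled = true;
            for (String res : inventory) {      // long-running work inside the lock
                backfilledResources.add(res);
            }
        } finally {
            agentRowLock.unlock();
        }
    }

    // Fixed shape: only the quick agent-flag update runs under the lock;
    // the per-resource backfill happens after it is released
    // (in the real system, in a separate transaction).
    void backfillWithShortLock(List<String> inventory) {
        agentRowLock.lock();
        try {
            backfilled = true;                  // quick update, lock released at once
        } finally {
            agentRowLock.unlock();
        }
        for (String res : inventory) {          // long-running work, lock not held
            backfilledResources.add(res);
        }
    }

    boolean isBackfilled() { return backfilled; }
    List<String> getBackfilledResources() { return backfilledResources; }
}
```

The same reasoning applies to database transactions: the 1-per-minute-per-agent ping updates only stall when some other transaction holds the agent row for a long time, so keeping that window short is the fix.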
- move some inline JPA updates on the Agent to be NamedQuery in the Agent
entity. No need to build those every time we update an agent ping time.
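The bullet above refers to the standard JPA pattern of declaring the update as a `@NamedQuery` on the entity, so the JPQL is parsed once at deployment rather than rebuilt on every agent ping. A hedged sketch of what such a mapping looks like; the field names and query name here are illustrative, not RHQ's actual definitions:

```java
import javax.persistence.*;

@Entity
@Table(name = "RHQ_AGENT")
@NamedQuery(
    name = "Agent.updateLastAvailabilityReport",
    query = "UPDATE Agent a SET a.lastAvailabilityReport = :reportTime, a.backfilled = false "
          + "WHERE a.id = :agentId")
public class Agent {
    @Id
    private int id;

    @Column(name = "LAST_AVAILABILITY_REPORT")
    private Long lastAvailabilityReport;

    @Column(name = "BACKFILLED")
    private boolean backfilled;
}
```

At the call site this would be executed via `em.createNamedQuery("Agent.updateLastAvailabilityReport")` with the two parameters bound, instead of constructing the query string inline each time. (Mapping fragment only; running it requires a JPA provider and datasource.)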
Closing; no follow-up after the fix two years ago.