Red Hat Bugzilla – Bug 999165
Frequent statement timeouts while updating availabilities
Last modified: 2015-12-21 08:59:34 EST
Description of problem:
In a testing environment, I am seeing very regular, frequent statement timeouts. The exceptions in the server log look like:
15:05:28,558 WARN [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (http-/0.0.0.0:7080-3) SQL Error: 0, SQLState: 57014
15:05:28,558 ERROR [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (http-/0.0.0.0:7080-3) ERROR: canceling statement due to statement timeout
15:05:28,590 ERROR [org.jboss.as.ejb3.invocation] (http-/0.0.0.0:7080-3) JBAS014134: EJB Invocation failed on component AvailabilityManagerBean for method public abstract void org.rhq.enterprise.server.measurement.AvailabilityManagerLocal.updateLastAvailabilityReportInNewTransaction(int): javax.ejb.EJBException: javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: could not execute statement
Caused by: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout
And in the postgres log I am seeing, over and over:
ERROR: canceling statement due to statement timeout
STATEMENT: update RHQ_AGENT set LAST_AVAILABILITY_REPORT=$1, BACKFILLED=false where ID=$2
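For reference, PostgreSQL reports statements canceled by `statement_timeout` with SQLState 57014 (query_canceled), which is exactly what the JDBC driver surfaces as the `PSQLException` above. A minimal sketch of recognizing that condition from a plain `SQLException` (the class and method names here are hypothetical, not RHQ code):

```java
import java.sql.SQLException;

public class TimeoutCheck {
    // PostgreSQL uses SQLState 57014 (query_canceled) for statements
    // canceled by statement_timeout or an explicit cancel request.
    static boolean isStatementTimeout(SQLException e) {
        return "57014".equals(e.getSQLState());
    }

    public static void main(String[] args) {
        SQLException e = new SQLException(
            "ERROR: canceling statement due to statement timeout", "57014");
        System.out.println(isStatementTimeout(e)); // prints true
    }
}
```

Checking the SQLState rather than the message text is the robust way to detect this, since the message wording is locale-dependent.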
This issue is directly related to bug 998945. The build number is 3c8110a.
Created attachment 788652: the server log
Created attachment 788653
Is this still an issue, or was it just temporary?
Could this be related to the JDBC driver version we use, as 9.2-1000 seems to have introduced query timeouts?
I personally have not seen those timeouts yet.
I think this is related to bug 1000584. I have not seen the timeouts in my local environment; I only encountered them in test environments with fewer resources (CPU, memory, etc.).
This was filed before our recent mergeAvail changes.
The GWT times BZ may have solved this.
But as the original description talks about backfilling, it may also be something else.
Let's see if the recent changes helped to solve this issue.
Author: Jay Shaughnessy <firstname.lastname@example.org>
Date: Mon Dec 9 14:13:52 2013 -0500
I did a visual inspection of code looking for things that may
contribute to the slow updates and/or timeouts reported in these BZs. In
general the updates for availReports (infrequent but can spike in
certain scenarios, 1 per avail report) and availPing
(frequent, 1 per minute per agent) should be very fast and not in
themselves be an issue. Slowness in these queries is likely caused by the
Agent table being locked by some other longer running transaction.
Everything looked pretty clean, with one exception. When backfilling an agent
due to an agent shutdown or suspect-job detection, it looked like we
may have locked the agent table for the duration of the backfilling, which
can be a fairly large operation for a big agent inventory. I'm not sure
this is the problem, but this commit should alleviate the Agent table
locking while the backfilling is performed.
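The fix described above, shrinking the critical section so the long backfill runs outside the lock, can be sketched in plain Java, modeling the database row lock with a `ReentrantLock`. All names here are hypothetical illustrations, not the actual RHQ code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class BackfillSketch {
    // Stands in for the lock taken on the agent's RHQ_AGENT row.
    private final ReentrantLock agentRowLock = new ReentrantLock();
    private boolean backfilled = false;
    private final List<String> backfilledResources = new ArrayList<>();

    // Problematic shape: the lock is held for the whole (potentially
    // large) per-resource backfill, blocking the frequent agent-ping
    // updates for that entire duration.
    void backfillHoldingLock(List<String> inventory) {
        agentRowLock.lock();
        try {
            backfilled = true;
            for (String res : inventory) {      // long-running work inside the lock
                backfilledResources.add(res);
            }
        } finally {
            agentRowLock.unlock();
        }
    }

    // Fixed shape: only the quick agent-flag update runs under the lock;
    // the per-resource backfill happens after it is released
    // (in the real system, in a separate transaction).
    void backfillWithShortLock(List<String> inventory) {
        agentRowLock.lock();
        try {
            backfilled = true;                  // quick update, lock released at once
        } finally {
            agentRowLock.unlock();
        }
        for (String res : inventory) {          // long-running work, lock not held
            backfilledResources.add(res);
        }
    }

    boolean isBackfilled() { return backfilled; }
    List<String> getBackfilledResources() { return backfilledResources; }
}
```

The same reasoning applies to database transactions: the 1-per-minute-per-agent ping updates only stall when some other transaction holds the agent row for a long time, so keeping that window short is the fix.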
- move some inline JPA updates on the Agent to be NamedQuery in the Agent
entity. No need to build those every time we update an agent ping time.
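The bullet above refers to the standard JPA pattern of declaring the update as a `@NamedQuery` on the entity, so the JPQL is parsed once at deployment rather than rebuilt on every agent ping. A hedged sketch of what such a mapping looks like; the field names and query name here are illustrative, not RHQ's actual definitions:

```java
import javax.persistence.*;

@Entity
@Table(name = "RHQ_AGENT")
@NamedQuery(
    name = "Agent.updateLastAvailabilityReport",
    query = "UPDATE Agent a SET a.lastAvailabilityReport = :reportTime, a.backfilled = false "
          + "WHERE a.id = :agentId")
public class Agent {
    @Id
    private int id;

    @Column(name = "LAST_AVAILABILITY_REPORT")
    private Long lastAvailabilityReport;

    @Column(name = "BACKFILLED")
    private boolean backfilled;
}
```

At the call site this would be executed via `em.createNamedQuery("Agent.updateLastAvailabilityReport")` with the two parameters bound, instead of constructing the query string inline each time. (Mapping fragment only; running it requires a JPA provider and datasource.)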
Closing; no follow-up after the fix two years ago.