Bug 999165 - Frequent statement timeouts while updating availabilities
Frequent statement timeouts while updating availabilities
Status: CLOSED CURRENTRELEASE
Product: RHQ Project
Classification: Other
Component: Core Server (Show other bugs)
4.9
Unspecified Unspecified
urgent Severity urgent (vote)
: ---
: ---
Assigned To: Jay Shaughnessy
Mike Foley
:
Depends On:
Blocks: 998945
  Show dependency treegraph
 
Reported: 2013-08-20 16:38 EDT by John Sanda
Modified: 2015-12-21 08:59 EST (History)
2 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-12-21 08:59:34 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
The server log (14.65 MB, text/plain)
2013-08-20 16:43 EDT, John Sanda
no flags Details
postgres log (58.43 KB, text/plain)
2013-08-20 16:44 EDT, John Sanda
no flags Details

  None (edit)
Description John Sanda 2013-08-20 16:38:33 EDT
Description of problem:
In a testing environment, I am seeing very regular, frequent statement timeouts. The exceptions in the server log look like,

5:05:28,558 WARN  [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (http-/0.0.0.0:7080-3) SQL Error: 0, SQLState: 57014
15:05:28,558 ERROR [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (http-/0.0.0.0:7080-3) ERROR: canceling statement due to statement timeout
15:05:28,590 ERROR [org.jboss.as.ejb3.invocation] (http-/0.0.0.0:7080-3) JBAS014134: EJB Invocation failed on component AvailabilityManagerBean for method public abstract void org.rhq.enterprise.server.measurement.AvailabilityManagerLocal.updateLastAvailabilityReportInNewTransaction(int): javax.ejb.EJBException: javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: could not execute statement
...
Caused by: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout

And in the postgres log I am seeing over and over,

ERROR:  canceling statement due to statement timeout
STATEMENT:  update RHQ_AGENT set LAST_AVAILABILITY_REPORT=$1, BACKFILLED=false where ID=$2

This issue is directly related to bug 998945. The build number is 3c8110a. 


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:
Comment 1 John Sanda 2013-08-20 16:43:45 EDT
Created attachment 788652 [details]
The server log
Comment 2 John Sanda 2013-08-20 16:44:20 EDT
Created attachment 788653 [details]
postgres log
Comment 3 Heiko W. Rupp 2013-09-04 08:00:05 EDT
Is that still an issue or just temporary?

Could this be related to the jdbc driver version that we
use as 9.2-1000 seems to have introduced query timeouts:

http://jdbc.postgresql.org/changes.html#version_9.2-1000

I personally did not see those timeouts yet.
Comment 4 John Sanda 2013-09-04 10:28:24 EDT
I think this is related to bug 1000584. I have not see the timeouts in my local environment. I only encountered them in test environments with less resources (CPU, memory, etc.).
Comment 5 Heiko W. Rupp 2013-12-09 04:16:31 EST
This was filed before our recent mergeAvail changes.
The GWT times BZ may have solved this.

But as the original description talks about backfilling, it may also be something else. 

Let's see if the recent changes helped to solve this issue.
Comment 6 Jay Shaughnessy 2013-12-09 15:09:01 EST
commit d01bb14a36a1121d377dc2445d1456c611c37007
Author: Jay Shaughnessy <jshaughn@redhat.com>
Date:   Mon Dec 9 14:13:52 2013 -0500

 I did a visual inspection of code looking for things that may
 contribute to the slow updates and/or timeouts reported in these BZs. In
 general the updates for availReports (infrequent but can spike in
 certain scenarios, 1 per avail report) and availPing
 (frequent, 1 per minute per agent) should be very fast and not in
 themselves be an issue. Slowness in these queries is likely caused by the
 Agent table being locked by some other longer running transaction.
 Everything looked pretty clean with one exception. When backfilling an agent
 due to an agent shutdown or suspect job detection, it looked like we
 may have locked th agent table for the duration of the backfilling, which
 can be a fairly large operation for a big agent inventory.  I'm not sure
 this is the problem, but this commit should allevaite the Agent table
 locking while the backfilling is performed.

 Also:
 - move some inline JPA updates on the Agent to be NamedQuery in the Agent
   entity.  No need to build those every time we update an agent ping time.
Comment 7 Jay Shaughnessy 2015-12-21 08:59:34 EST
Closing, no follow up after fix 2 years ago.

Note You need to log in before you can comment on or make changes to this bug.