Bug 826663 - Connections hang daily in hosted Candlepin
Product: Subscription Asset Manager
Classification: Red Hat
Component: candlepin
Hardware: Unspecified OS: Unspecified
Priority: unspecified Severity: unspecified
: rc
: 1.X
Assigned To: Bryan Kearney
Keywords: Triaged
Depends On: 826602
Blocks: sam12-tracker
Reported: 2012-05-30 13:34 EDT by Chris Duryee
Modified: 2012-10-24 14:39 EDT

Doc Type: Bug Fix
Clone Of: 826602
Last Closed: 2012-10-24 14:39:29 EDT
Type: Bug

Description Chris Duryee 2012-05-30 13:34:48 EDT
This was reported to happen when Candlepin 0.5.x runs in Tomcat. I don't know if it affects SAM, but the stacks are similar enough that it might. IT does not have a repro case, aside from waiting for the issue to occur.

If the fix works for 826602 (against 0.5.x), I'd recommend upgrading c3p0 in candlepin for 0.6.x as well.

more info:
> http://sourceforge.net/tracker/?func=detail&aid=1383783&group_id=25357&atid=383690

+++ This bug was initially created as a clone of Bug #826602 +++

Description of problem: The Tomcat instances of Candlepin seem to intermittently hang and cause all requests to fail. This seems likely to be caused by c3p0.

Version-Release number of selected component (if applicable): c3p0-0.9.0

How reproducible: Happens in QA and Stage daily.

Steps to Reproduce:
Actual results:

Expected results:

The Ruby tier triggers a nagios alert and the following is found in the logs:

"Exception PartialOutageException: Could not find shard for XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"

In the Candlepin logs, among other errors, we see:

"Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: The last packet successfully received from the server was 34,864,978 milliseconds ago.  The last packet sent successfully to the server was 34,864,979 milliseconds ago. is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem."
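For scale, the idle time in that exception can be converted to hours and compared against a timeout. This is only a rough check: the report does not state the server's configured wait_timeout, so the sketch below assumes MySQL's default of 28800 seconds (8 hours).

```python
# Figure copied from the CommunicationsException quoted above.
last_received_ms = 34_864_978

# Assumption: the server uses MySQL's default wait_timeout (28800 s).
# The value actually configured on these hosts is not given in this report.
assumed_wait_timeout_s = 28_800

idle_seconds = last_received_ms / 1000
idle_hours = idle_seconds / 3600

# True when the pooled connection sat idle longer than the assumed timeout,
# meaning the server would already have closed its end of the connection.
exceeded = idle_seconds > assumed_wait_timeout_s

print(f"idle for {idle_hours:.2f} h, "
      f"assumed timeout {assumed_wait_timeout_s / 3600:.0f} h, "
      f"exceeded: {exceeded}")
```

Under that assumption the connection had been idle for roughly 9.7 hours, which would fit a pool handing out a connection that had sat unused overnight.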

Additional info: Our current version of c3p0 is from 2005

--- Additional comment from cduryee@redhat.com on 2012-05-30 11:23:03 EDT ---


Sam,
Is ok? That seems to be the latest version.

Also, how difficult is it to reproduce this issue?

--- Additional comment from smunilla@redhat.com on 2012-05-30 11:36:02 EDT ---

works.

The issue crops up daily in QA and Stage. I don't think we can reproduce it on demand.

--- Additional comment from smunilla@redhat.com on 2012-05-30 11:38:14 EDT ---

(In reply to comment #1)
> Sam,
> Is ok? That seems to be the latest version.
> Also, how difficult is it to reproduce this issue?
Comment 2 RHEL Product and Program Management 2012-05-30 13:57:48 EDT
Thank you for your bug report. This issue was evaluated for inclusion
in the current release of Subscription Asset Manager (SAM). Unfortunately,
we are unable to address this request. Because we are in the final stages
of development in the current release, only significant, release-blocking
issues involving serious regressions and data corruption can be considered.

If you believe this issue meets the release blocking criteria as defined and
communicated to you by your Red Hat Support representative, please ask
your representative to file this issue as a blocker for the current release.
Otherwise, ask that it be evaluated for inclusion in the next release of SAM.
Comment 3 Bryan Kearney 2012-10-24 14:39:29 EDT
This has been resolved in hosted Candlepin with a new keepalive directive in the JDBC adapter.
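The comment does not say which directive was added, so the following is only an illustration of the general idea, not the actual fix. It is a toy pool, written in Python with sqlite3 standing in for MySQL, that tests connections before handing them out and can ping idle ones periodically. The class name ValidatingPool and its methods are hypothetical. c3p0 documents comparable settings (for example idleConnectionTestPeriod and testConnectionOnCheckout), but whether either was used here is not stated.

```python
import sqlite3


class ValidatingPool:
    """Toy pool (hypothetical) that validates connections before reuse."""

    def __init__(self, connect, test_query="SELECT 1"):
        self._connect = connect          # zero-argument factory for new connections
        self._test_query = test_query    # cheap query used as the liveness probe
        self._idle = []
        self.replaced = 0                # number of dead connections discarded

    def _is_alive(self, conn):
        try:
            conn.execute(self._test_query).fetchone()
            return True
        except sqlite3.Error:
            return False

    def checkout(self):
        """Return a live connection, discarding any idle ones that fail the probe."""
        while self._idle:
            conn = self._idle.pop()
            if self._is_alive(conn):
                return conn
            self.replaced += 1
        return self._connect()

    def checkin(self, conn):
        self._idle.append(conn)

    def keepalive(self):
        """Probe every idle connection and drop the ones that fail.

        Against a real server, running this on a timer shorter than the
        server's idle timeout also keeps healthy connections from expiring.
        """
        alive = [c for c in self._idle if self._is_alive(c)]
        self.replaced += len(self._idle) - len(alive)
        self._idle = alive


pool = ValidatingPool(lambda: sqlite3.connect(":memory:"))
conn = pool.checkout()
pool.checkin(conn)
conn.close()                 # simulate the server dropping the idle connection
fresh = pool.checkout()      # the stale connection is detected and replaced
print(pool.replaced, fresh.execute("SELECT 1").fetchone())
```

Testing on checkout costs one round trip per request, while a periodic keepalive moves that cost off the request path; which trade-off suits hosted Candlepin is not something this report settles.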
