Red Hat Bugzilla – Bug 826663
Connections hang daily in hosted Candlepin
Last modified: 2012-10-24 14:39:29 EDT
This was reported to happen when Candlepin 0.5.x runs in Tomcat. I don't know if it affects SAM, but the stacks are similar enough that it might. IT does not have a repro case, aside from waiting for the issue to recur.
If the fix works for 826602 (against 0.5.x), I'd recommend upgrading c3p0 in candlepin for 0.6.x as well.
+++ This bug was initially created as a clone of Bug #826602 +++
Description of problem: The Tomcat instances of Candlepin intermittently hang and cause all requests to fail. This is likely caused by c3p0.
Version-Release number of selected component (if applicable): c3p0-0.9.0
How reproducible: Happens in QA and Stage daily.
Steps to Reproduce:
The Ruby tier triggers a nagios alert and the following is found in the logs:
"Exception PartialOutageException: Could not find shard for XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"
In the Candlepin logs, among other errors, we see:
"Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: The last packet successfully received from the server was 34,864,978 milliseconds ago. The last packet sent successfully to the server was 34,864,979 milliseconds ago. is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem."
Additional info: Our current version of c3p0 is from 2005
--- Additional comment from email@example.com on 2012-05-30 11:23:03 EDT ---
Is 0.9.1.2 ok? That seems to be the latest version.
Also, how difficult is it to reproduce this issue?
--- Additional comment from firstname.lastname@example.org on 2012-05-30 11:36:02 EDT ---
The issue crops up daily in QA and Stage. I don't think we can reproduce it on demand.
--- Additional comment from email@example.com on 2012-05-30 11:38:14 EDT ---
(In reply to comment #1)
> Is 0.9.1.2 ok? That seems to be the latest version.
> Also, how difficult is it to reproduce this issue?
Thank you for your bug report. This issue was evaluated for inclusion
in the current release of Subscription Asset Manager (SAM). Unfortunately,
we are unable to address this request. Because we are in the final stages
of development in the current release, only significant, release-blocking
issues involving serious regressions and data corruption can be considered.
If you believe this issue meets the release blocking criteria as defined and
communicated to you by your Red Hat Support representative, please ask
your representative to file this issue as a blocker for the current release.
Otherwise, ask that it be evaluated for inclusion in the next release of SAM.
This has been resolved in the hosted environment with a new keepalive directive in the JDBC adapter.
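The keepalive directive itself is not recorded in this report. As a rough sketch of what the MySQL error message recommends (testing or expiring idle connections before 'wait_timeout' closes them), the Java snippet below builds a set of c3p0 settings. The class name KeepaliveSketch, the method keepaliveProperties, the one-quarter and one-half ratios, and the 28800-second value (MySQL's default wait_timeout, not a value confirmed for this deployment) are all assumptions for illustration. The property keys are standard c3p0 configuration names, but this is not necessarily the change that was made in hosted.

```java
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical helper: derives c3p0 keepalive settings from the MySQL wait_timeout.
class KeepaliveSketch {

    // waitTimeoutSeconds is the server's wait_timeout; c3p0 expects both
    // idleConnectionTestPeriod and maxIdleTime in seconds as well.
    static Properties keepaliveProperties(int waitTimeoutSeconds) {
        if (waitTimeoutSeconds < 4) {
            throw new IllegalArgumentException("wait_timeout too small: " + waitTimeoutSeconds);
        }
        Properties p = new Properties();
        // Test idle connections well before the server would drop them
        // (one quarter of wait_timeout is an assumed, conservative ratio).
        p.setProperty("c3p0.idleConnectionTestPeriod", String.valueOf(waitTimeoutSeconds / 4));
        // Cheap query used for the test instead of the slower default metadata check.
        p.setProperty("c3p0.preferredTestQuery", "SELECT 1");
        // Retire connections that have sat idle for half of wait_timeout (assumed ratio).
        p.setProperty("c3p0.maxIdleTime", String.valueOf(waitTimeoutSeconds / 2));
        return p;
    }

    public static void main(String[] args) {
        // 28800 seconds is MySQL's default wait_timeout; the real server value may differ.
        Properties p = keepaliveProperties(28800);
        for (String key : new TreeSet<>(p.stringPropertyNames())) {
            System.out.println(key + "=" + p.getProperty(key));
        }
    }
}
```

Settings of this kind could be placed in a c3p0.properties file on the classpath or passed through whatever layer configures the pool. Whether they match the hosted fix would need to be confirmed against the actual configuration.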