Description of problem:
When candlepin thinks it has lost its connection to HornetQ (for whatever reason), it makes no attempt to recover. As a result, candlepin cannot send events to qpidd, so katello/foreman does not receive updates about subscription status.
Version-Release number of selected component (if applicable):
candlepin-0.9.54.26-1.el7.noarch
How reproducible:
???
Steps to Reproduce:
1. ??? (no idea how to trigger this) get candlepin into the state where it thinks it has lost its connection to HornetQ, logging:
2018-04-04 13:33:38,528 [thread=Thread-5137 (HornetQ-client-global-threads-246930626)] [=, org=] WARN org.hornetq.core.client - HQ212037: Connection failure has been detected: HQ119015: The connection was disconnected because of server shutdown [code=DISCONNECTED]
2018-04-04 13:35:06,837 [thread=IoReceiver - bsul0081.fs01.vwf.vwfs-ad/10.43.225.233:5671] [=, org=] WARN org.apache.qpid.transport.network.security.ssl.SSLUtil - Exception received while trying to verify hostname
2018-04-04 13:35:08,999 [thread=localhost-startStop-1] [=, org=] WARN org.hibernate.id.UUIDHexGenerator - HHH000409: Using org.hibernate.id.UUIDHexGenerator which does not generate IETF RFC 4122 compliant UUID values; consider using org.hibernate.id.UUIDGenerator instead
2018-04-04 13:36:14,106 [thread=hornetq-failure-check-thread] [=, org=] WARN org.hornetq.core.client - HQ212037: Connection failure has been detected: HQ119014: Did not receive data from invm:0. It is likely the client has exited or crashed without closing its connection, or the network between the server and client has failed. You also might have configured connection-ttl and client-failure-check-period incorrectly. Please check user manual for more information. The connection will now be closed. [code=CONNECTION_TIMEDOUT]
2. have an unentitled system and attach subscriptions to it
3. check WebUI / hammer for the status of the system
4. check /var/log/candlepin/error.log
Actual results:
3. shows unentitled
4. error.log contains:
2018-04-01 03:06:03,681 [thread=http-bio-8443-exec-10] [req=14931ec4-de5e-438f-abb3-6b39193a87b5, org=Default_Organization] ERROR org.candlepin.audit.HornetqEventDispatcher - Error while trying to send event: Event [id=null, target=COMPLIANCE, type=CREATED, time=Sun Apr 01 03:06:03 CEST 2018, entity=8aa538916211fe4001621ab99e1201d9]
java.lang.NullPointerException: null
at org.hornetq.core.client.impl.ClientSessionFactoryImpl.createSessionInternal(ClientSessionFactoryImpl.java:940) ~[hornetq-core-client-2.3.5.Final.jar:na]
at org.hornetq.core.client.impl.ClientSessionFactoryImpl.createSession(ClientSessionFactoryImpl.java:363) ~[hornetq-core-client-2.3.5.Final.jar:na]
at org.candlepin.audit.HornetqEventDispatcher.getClientSession(HornetqEventDispatcher.java:82) ~[HornetqEventDispatcher.class:na]
at org.candlepin.audit.HornetqEventDispatcher.sendEvent(HornetqEventDispatcher.java:111) ~[HornetqEventDispatcher.class:na]
at org.candlepin.audit.EventSinkImpl.sendEvents(EventSinkImpl.java:79) [EventSinkImpl.class:na]
..
Expected results:
3. should show entitled (assuming proper/sufficient subscriptions were provided)
4. no such error logs
Additional info:
(it is worth trying to understand why the connection fails..)
An attempt to disable TTL and connection checks:
diff -rup candlepin-0.9.54.26/src/main/java/org/candlepin/audit/HornetqEventDispatcher.java candlepin-0.9.54.26.2/src/main/java/org/candlepin/audit/HornetqEventDispatcher.java
--- candlepin-0.9.54.26/src/main/java/org/candlepin/audit/HornetqEventDispatcher.java 2017-12-07 17:36:23.000000000 +0100
+++ candlepin-0.9.54.26.2/src/main/java/org/candlepin/audit/HornetqEventDispatcher.java 2018-04-03 21:03:54.000000000 +0200
@@ -72,6 +72,8 @@ public class HornetqEventDispatcher {
ServerLocator locator = HornetQClient.createServerLocatorWithoutHA(
new TransportConfiguration(InVMConnectorFactory.class.getName()));
locator.setMinLargeMessageSize(largeMsgSize);
+ locator.setConnectionTTL(-1);
+ locator.setClientFailureCheckPeriod(-1);
return locator.createSessionFactory();
}
does not help. As a complete newcomer to HornetQ (but knowing JMS and having read the HornetQ docs/API), I think what may help instead is configuring reconnection, by calling in the same place:
locator.setReconnectAttempts(-1); // default is 0, i.e. no reconnect
locator.setInitialConnectAttempts(-1); // default number of initial connection attempts is 1, so no retry
(at least that is my deduction from the javadocs of Artemis, the successor of HornetQ:
https://activemq.apache.org/artemis/docs/javadocs/javadoc-1.4.0/org/apache/activemq/artemis/api/core/client/ServerLocator.html#setReconnectAttempts-int-
https://activemq.apache.org/artemis/docs/javadocs/javadoc-1.4.0/constant-values.html#org.apache.activemq.artemis.api.core.client.ActiveMQClient.DEFAULT_RECONNECT_ATTEMPTS
and around)
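If the locator settings alone turn out not to be enough, another possible direction is for the dispatcher itself to rebuild its session factory when sending fails, instead of reusing a dead one forever. Below is a minimal sketch of that idea only. Everything in it is hypothetical: RecoveringSender and its Factory interface are stand-ins invented for illustration and are not part of HornetQ or candlepin; the real code would wrap ClientSessionFactory and the locator.createSessionFactory() call shown in the diff above.

```java
import java.util.function.Supplier;

/**
 * Hypothetical sketch: rebuild a broken factory once and retry the send.
 * None of these types exist in HornetQ or candlepin.
 */
class RecoveringSender {

    /** Stand-in for a client session factory that can deliver one event. */
    public interface Factory {
        void send(String event) throws Exception;
    }

    private final Supplier<Factory> factoryBuilder;
    private Factory factory;

    RecoveringSender(Supplier<Factory> factoryBuilder) {
        this.factoryBuilder = factoryBuilder;
        this.factory = factoryBuilder.get();
    }

    /** Sends the event; on any failure, replaces the factory and retries once. */
    public synchronized void send(String event) throws Exception {
        try {
            factory.send(event);
        } catch (Exception first) {
            // Assumption: a failure here means the factory is unusable
            // (for example the NullPointerException from createSession),
            // so discard it and build a fresh one.
            factory = factoryBuilder.get();
            // A second failure is not caught and propagates to the caller.
            factory.send(event);
        }
    }
}
```

A single retry is a deliberate simplification; whether one retry, a backoff loop, or the built-in reconnect settings is the right choice depends on why the connection is being dropped in the first place.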
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2018:2927