Bug 1117944 - activemq appeared hung and wrapper failed to restart
Summary: activemq appeared hung and wrapper failed to restart
Keywords:
Status: CLOSED EOL
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 2.2.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Luke Meyer
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-07-09 16:49 UTC by Luke Meyer
Modified: 2017-01-13 22:40 UTC
CC List: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-01-13 22:40:39 UTC
Target Upstream Version:
Embargoed:


Attachments
/var/log/activemq/wrapper.log for the failed activemq (29.22 KB, text/x-log)
2014-07-09 16:49 UTC, Luke Meyer
no flags

Description Luke Meyer 2014-07-09 16:49:54 UTC
Created attachment 916892 [details]
/var/log/activemq/wrapper.log for the failed activemq

Description of problem:
I don't know if this is worth investigating; I'm just going to note what happened. I have an OSE 2.1 installation on our internal OpenStack instance. I left it idle for a few days, and when I returned, the activemq instance was no longer running.

How reproducible:
Probably not very

Steps to Reproduce:
1. Install OSE 2.1 (I have separate broker + node)
2. Keep it very busy for a few days by creating and removing gears
3. Leave it idle for a few days

Actual results:
/var/log/activemq/wrapper.log:
INFO   | jvm 2    | 2014/06/30 03:11:17 | Java Runtime: Sun Microsystems Inc. 1.6.0_30 /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre
INFO   | jvm 2    | 2014/06/30 03:11:17 |   Heap sizes: current=58816k  free=54792k  max=932096k
INFO   | jvm 2    | 2014/06/30 03:11:17 |     JVM args: -Dactivemq.home=/usr/share/activemq [...]
INFO   | jvm 2    | 2014/06/30 03:11:17 | Extensions classpath:
INFO   | jvm 2    | 2014/06/30 03:11:17 |   [/usr/share/activemq/lib,/usr/share/activemq/lib/camel,[...]
INFO   | jvm 2    | 2014/06/30 03:11:17 | ACTIVEMQ_HOME: /usr/share/activemq
INFO   | jvm 2    | 2014/06/30 03:11:17 | ACTIVEMQ_BASE: /usr/share/activemq
INFO   | jvm 2    | 2014/06/30 03:11:17 | ACTIVEMQ_CONF: /usr/share/activemq/conf
INFO   | jvm 2    | 2014/06/30 03:11:17 | ACTIVEMQ_DATA: /usr/share/activemq/data
INFO   | jvm 2    | 2014/06/30 03:11:17 | Loading message broker from: xbean:activemq.xml
ERROR  | wrapper  | 2014/07/07 11:11:22 | JVM appears hung: Timed out waiting for signal from JVM.
ERROR  | wrapper  | 2014/07/07 11:11:22 | JVM did not exit on request, terminated
STATUS | wrapper  | 2014/07/07 11:11:25 | JVM exited in response to signal SIGKILL (9).
ERROR  | wrapper  | 2014/07/07 11:11:25 | Unable to start a JVM
STATUS | wrapper  | 2014/07/07 11:11:25 | <-- Wrapper Stopped

/var/log/activemq/activemq.log:
2014-07-03 14:58:38,927 | INFO  | mcollective.reply.broker.hosts.ose211.example.com_3128 Inactive for longer than 300000 ms - removing ... | org.apache.activemq.broker.region.Queue | ActiveMQ Broker[broker.hosts.ose211.example.com] Scheduler
2014-07-07 11:08:35,233 | WARN  | Transport Connection to: tcp://[...]:20147 failed: org.apache.activemq.transport.InactivityIOException: Channel was inactive for too (>30500) long: tcp://172.16.4.139:20147 | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ InactivityMonitor Worker
2014-07-07 11:09:34,675 | WARN  | Transport Connection to: tcp://[...]:24464 failed: org.apache.activemq.transport.InactivityIOException: Channel was inactive for too (>30500) long: tcp://172.16.4.137:24464 | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ InactivityMonitor Worker
(There's nothing more of interest; this just shows that the last activity was on 07/03, then an exception on 07/07, and that's the end of the log. The excerpt is representative of the rest of the log.)
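
For context on the wrapper messages above: hang detection and automatic restart are handled by the Tanuki Java Service Wrapper and tuned in its wrapper.conf (under ACTIVEMQ_CONF, /usr/share/activemq/conf on this install). A minimal sketch of the properties involved; the values are roughly the wrapper's defaults, not settings confirmed from this machine:

# Sketch only - not the actual configuration on this host.
# Seconds between wrapper->JVM pings.
wrapper.ping.interval=5
# Seconds without a ping response before the wrapper declares "JVM appears hung".
wrapper.ping.timeout=30
# Seconds to wait before relaunching a JVM that exited or was killed.
wrapper.restart.delay=5
# Consecutive failed launches before the wrapper gives up and stops.
wrapper.max_failed_invocations=5
# A JVM must stay up this long for the failed-launch counter to reset.
wrapper.successful_invocation_time=300

Why the wrapper gave up immediately is exactly what's unclear here; the log goes straight from the SIGKILL to "Unable to start a JVM" and "<-- Wrapper Stopped".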

Expected results:
ActiveMQ should not go down inexplicably. Even if it does, the wrapper should be able to revive it.

Additional information:
I'll attach the wrapper.log, which has some other interesting error messages in it. It seems this is actually not uncommon (perhaps our OpenStack deployment does not provide consistent access to resources, or perhaps the mostly idle instances are getting swapped out somehow, resulting in unexpectedly long delays in responses to monitoring signals?); what's unusual here is that the wrapper wasn't able to recover.
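
If the swapping theory is worth chasing, a few standard RHEL commands would show whether the guest was paging or hit the OOM killer around the time of the hang (sketch only; the sar history assumes the sysstat collector was running):

free -m                                          # current memory and swap usage
vmstat 5 5                                       # watch the si/so columns for active swapping
sar -W -f /var/log/sa/sa07                       # swap-in/swap-out history for the 7th (needs sysstat)
grep -iE 'oom|out of memory' /var/log/messages   # any OOM-killer activity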

In particular, note these lines from the wrapper.log:
INFO   | jvm 1    | 2014/06/16 13:21:02 | WARNING - Unable to load the Wrapper's native library [...]
INFO   | jvm 1    | 2014/06/16 13:21:02 |           System signals will not be handled correctly.
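
That warning means the JVM-side WrapperManager ran without libwrapper.so, so (as it says) system signals are not handled correctly; whether that contributed to the failed recovery is speculation on my part. The native library is normally found via java.library.path, which the wrapper builds from wrapper.java.library.path.* entries in wrapper.conf. A sketch; the directory is an assumption and would need to match wherever the activemq package actually installs libwrapper.so:

find / -name 'libwrapper*.so*' 2>/dev/null   # locate the native library on this host
# wrapper.conf entry pointing at that directory (path below is illustrative)
wrapper.java.library.path.1=/usr/share/activemq/bin/linux-x86-64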

Starting ActiveMQ manually after discovering this worked fine, naturally.
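
For the record, "manually" just means the usual sysvinit commands; these are illustrative and assume whatever init script the activemq RPM provides on this host:

service activemq status                    # confirm the wrapper/broker are down
service activemq start                     # relaunch through the wrapper
tail -n 50 /var/log/activemq/wrapper.log   # confirm the wrapper brings the JVM back up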

Comment 2 Rory Thrasher 2017-01-13 22:40:39 UTC
OpenShift Enterprise v2 has officially reached EoL.  This product is no longer supported and bugs will be closed.

Please look into the replacement enterprise-grade container option, OpenShift Container Platform v3.  https://www.openshift.com/container-platform/

More information can be found here: https://access.redhat.com/support/policy/updates/openshift/

