Red Hat Bugzilla – Bug 622468
RFE: be able to restart the agent VM on OOM or core dump
Last modified: 2014-06-18 17:07:05 EDT
Description of problem:
Sometimes the agent may core dump (e.g. due to an error in a third-party native library that causes a segfault) or may run out of memory (e.g. OutOfMemoryError).
It would be nice to have the agent VM restart whenever it encounters either of these. The Sun JVM has these options to easily enable this:
-XX:OnError="<cmd args>;<cmd args>" Run user-defined commands on fatal error. (Introduced in 1.4.2 update 9.)
-XX:OnOutOfMemoryError="<cmd args>;<cmd args>" Run user-defined commands when an OutOfMemoryError is first thrown. (Introduced in 1.4.2 update 12, 6)
All we'd need to do is add these arguments to the VM startup command in rhq-agent.bat and rhq-agent.sh.
The only difficulty would be in determining what command to invoke: do we need to pass in the full VM command line again (from "java" through to all the VM opts and command-line args)? To make it easy, I say we just start the VM using the service wrapper script, rhq-agent-wrapper.bat/sh.
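As a rough sketch of what the rhq-agent.sh change might look like (not tested against a real agent install): the snippet below builds the two -XX options so that each restart command stays inside a single quoted argument. The RHQ_AGENT_HOME default, the bin/ location of rhq-agent-wrapper.sh and its "restart" argument are assumptions made for illustration, and the script only prints the resulting arguments rather than launching a VM.

```shell
#!/bin/sh
# Sketch only: the install path, the wrapper location and the "restart"
# argument are assumptions, not taken from the real rhq-agent.sh.
RHQ_AGENT_HOME="${RHQ_AGENT_HOME:-/opt/rhq-agent}"
RESTART_CMD="${RHQ_AGENT_HOME}/bin/rhq-agent-wrapper.sh restart"

# -XX:OnError runs the command on a fatal VM error (e.g. a native segfault).
ON_ERROR_OPT="-XX:OnError=${RESTART_CMD}"
# -XX:OnOutOfMemoryError runs it when an OutOfMemoryError is first thrown.
ON_OOM_OPT="-XX:OnOutOfMemoryError=${RESTART_CMD}"

# Quote each option when passing it on, so the space inside the command
# does not split it into two arguments. Printed here, one per line,
# instead of actually starting java.
printf '%s\n' "${ON_ERROR_OPT}" "${ON_OOM_OPT}"
```

In the real script these would presumably be appended to the java invocation as "${ON_ERROR_OPT}" "${ON_OOM_OPT}"; rhq-agent.bat would need the equivalent quoting on Windows.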
Note that on Windows, we may already have configured the Java Service Wrapper to restart the agent when OutOfMemoryError messages are logged. If these VM options work as advertised, we can unconfigure JSW and rely on the VM to restart itself.
We have the VM check thread, but this is different: this has the VM itself trigger a restart on OOM errors and crashes.