Bug 961437 - [as7] Start operation returns failure when start script returns exit code 0 (success)
Product: JBoss Operations Network
Classification: JBoss
Component: Operations, Plugin -- JBoss EAP 6
JON 3.1.1
All All
Priority: urgent, Severity: high
: ER01
: JON 3.1.3
Assigned To: Larry O'Leary
Mike Foley
Depends On: 894493
Blocks: 961438
Reported: 2013-05-09 11:26 EDT by Larry O'Leary
Modified: 2013-09-05 22:24 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 894493
: 961438
Last Closed: 2013-09-05 22:24:49 EDT
Type: Bug

External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 292683 None None None Never

Description Larry O'Leary 2013-05-09 11:26:15 EDT
Back-port to the 3.1.x branch.

+++ This bug was initially created as a clone of JBoss ON 3.2 Bug #894493 +++

Description of problem:
After invoking the start operation on an AS7/EAP6 resource, the operation status is reported as Failure even though the resource is started, the script output indicates that it started successfully, and the script exited cleanly with return code 0.

The only time the start operation reports success is when the start script blocks (i.e. does not return).

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.  Install and configure EAP 6 standalone server
2.  Create custom standalone.sh start script wrapper which returns an exit code:

cat > "${JBOSS_HOME}/bin/standalone-wrapper.sh" << EOF
#!/bin/sh

DIRNAME=\$(dirname "\$0")

eval \"\${DIRNAME}/standalone.sh\" "\$@" \&
exit \$?
EOF
chmod +x "${JBOSS_HOME}/bin/standalone-wrapper.sh"

3.  Using the newly created start script wrapper, start the EAP 6 standalone server
4.  Start JBoss ON system
5.  Import EAP 6 standalone server into inventory
6.  Configure the EAP 6 resource's connection settings to use the custom start script wrapper standalone-wrapper.sh
7.  After EAP 6 resource shows availability of UP, invoke its shutdown operation
8.  Wait until EAP 6 resource is reported as DOWN
9.  Invoke the EAP 6 resource's start operation

Actual results:
EAP server is started but operation status in UI shows Failure with the following error message available from the UI:

    java.lang.Exception: Start failed with error code 0:

	    at org.rhq.core.pc.operation.OperationInvocation.run(OperationInvocation.java:278)
	    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
	    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
	    at java.lang.Thread.run(Thread.java:636)

Expected results:
EAP server is started and the operation status in the UI shows Success.

Additional info:
This is a direct result of incorrectly handling the process exit code returned by ProcessExecutionResults.getExitCode() in BaseServerComponent.startServer(). The code treats only <null> as success, when in fact 0 is success and <null> simply means that the process has not yet returned, i.e. it is still blocking or has timed out.
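
For illustration only, the exit code semantics described above can be sketched as a three-way check. This is not the plugin code: StartExitCodeCheck, Outcome and classify are hypothetical names, and the only assumption taken from this report is that the exit code is a nullable Integer.

```java
/** Illustrative sketch; all names here are hypothetical, not part of the RHQ plugin. */
final class StartExitCodeCheck {

    enum Outcome { STILL_RUNNING, EXITED_OK, FAILED }

    /**
     * Interprets a nullable exit code as described in this report:
     * null     = no exit code yet (script is blocking or timed out),
     * 0        = script returned successfully,
     * non-zero = script reported a failure.
     */
    static Outcome classify(Integer exitCode) {
        if (exitCode == null) {
            return Outcome.STILL_RUNNING;
        }
        return exitCode == 0 ? Outcome.EXITED_OK : Outcome.FAILED;
    }

    public static void main(String[] args) {
        System.out.println(classify(null));
        System.out.println(classify(0));
        System.out.println(classify(1));
    }
}
```

With this reading, only the FAILED case should set an error message; both other cases should fall through to the check that the server is actually reachable.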

To fix this:

diff --git a/modules/plugins/jboss-as-7/src/main/java/org/rhq/modules/plugins/jbossas7/BaseServerComponent.java b/modules/plugins/jboss-as-7/src/main/java/org/rhq/modules/plugins/jbossas7/Ba
index 98a46d2..3dc12de 100644
--- a/modules/plugins/jboss-as-7/src/main/java/org/rhq/modules/plugins/jbossas7/BaseServerComponent.java
+++ b/modules/plugins/jboss-as-7/src/main/java/org/rhq/modules/plugins/jbossas7/BaseServerComponent.java
@@ -330,7 +330,7 @@ public abstract class BaseServerComponent<T extends ResourceComponent<?>> extend
         if (results.getError() != null) {
-        } else if (results.getExitCode() != null) {
+        } else if (results.getExitCode() != null && results.getExitCode() != 0) {
             operationResult.setErrorMessage("Start failed with error code " + results.getExitCode() + ":\n" + results.getCapturedOutput());
         } else {
             // Try to connect to the server - ping once per second, timing out after 20s.
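
The else branch above is only described by its comment (ping once per second, time out after 20s). A minimal sketch of such a polling loop follows, under stated assumptions: StartupPoller and waitUntilUp are hypothetical names, and the probe stands in for whatever connection check the plugin actually performs.

```java
import java.util.function.BooleanSupplier;

/** Illustrative sketch; names are hypothetical and not taken from the RHQ plugin. */
final class StartupPoller {

    /**
     * Calls the probe until it returns true or the timeout elapses,
     * sleeping intervalMillis between attempts.
     * Returns true if the probe succeeded, false on timeout or interruption.
     */
    static boolean waitUntilUp(BooleanSupplier probe, long intervalMillis, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (probe.getAsBoolean()) {
                return true;
            }
            if (deadline - System.nanoTime() <= 0) {
                return false; // timed out without a successful probe
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Stand-in probe that succeeds on the third attempt.
        boolean up = waitUntilUp(() -> ++calls[0] >= 3, 10, 1000);
        System.out.println(up + " after " + calls[0] + " probes");
    }
}
```

For the behaviour named in the comment, the call would be waitUntilUp(probe, 1000, 20000).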

--- Additional comment from Jirka Kremser on 2013-02-04 05:52:10 EST ---

I did something similar for as6 (feb01f468)

--- Additional comment from Jirka Kremser on 2013-03-06 06:41:08 EST ---


time:    Wed Mar 6 12:24:27 2013 +0100
commit:  c0fe931b74ac8eccfc5be8af2275bea5d8c0c917
author:  Jirka Kremser - jkremser@redhat.com
message: [BZ 894493] - [as7] Start operation returns failure when start script returns exit code 0 (success). Added the test for the exit code.
Comment 1 Larry O'Leary 2013-05-09 19:33:59 EDT
Committed to release/jon3.1.x as https://git.fedorahosted.org/cgit/rhq/rhq.git/commit/?id=ca4fdfafa7546c03d6d809d2c01a50f8cedc4432:

commit ca4fdfafa7546c03d6d809d2c01a50f8cedc4432
Author: Jirka Kremser <jkremser@redhat.com>
Date:   Wed Mar 6 12:24:27 2013 +0100

    [BZ 894493] - [as7] Start operation returns failure when start script returns exit code 0 (success). Added the test for the exit code.
    (cherry-pick from c0fe931b74ac8eccfc5be8af2275bea5d8c0c917)
Comment 2 Larry O'Leary 2013-09-05 22:24:49 EDT
Closing, as there will not be a 3.1.3 release. The fix is being tracked for 3.2 in bug 894493 (see the Depends On field).
