Bug 894493 - [as7] Start operation returns failure when start script returns exit code 0 (success)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Operations Network
Classification: JBoss
Component: Operations, Plugin -- JBoss EAP 6
Version: JON 3.1.1
Hardware: All
OS: All
Priority: urgent
Severity: high
Target Milestone: ER01
Target Release: JON 3.2.0
Assignee: RHQ Project Maintainer
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On:
Blocks: 961437
 
Reported: 2013-01-11 21:25 UTC by Larry O'Leary
Modified: 2018-12-02 16:49 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 961437 (view as bug list)
Environment:
Last Closed:
Type: Bug
Embargoed:


Links
System                             ID      Private  Priority  Status  Summary  Last Updated
Red Hat Knowledge Base (Solution)  292683  0        None      None    None     Never

Description Larry O'Leary 2013-01-11 21:25:09 UTC
Description of problem:
After invoking the start operation on an AS7/EAP6 resource, the operation status is reported as failure even though the resource is started, the script output indicates that the resource was started successfully, and the script exited cleanly with return code 0.

The only time the start operation reports success is when the start script blocks (i.e. does not return).

Version-Release number of selected component (if applicable):
4.4.0.JON311GA

How reproducible:
Always

Steps to Reproduce:
1.  Install and configure EAP 6 standalone server
2.  Create custom standalone.sh start script wrapper which returns an exit code:

cat > "${JBOSS_HOME}/bin/standalone-wrapper.sh" << EOF
#!/bin/sh

DIRNAME=\$(dirname "\$0")

eval \"\${DIRNAME}/standalone.sh\" "\$@" \&
exit \$?
EOF
chmod +x "${JBOSS_HOME}/bin/standalone-wrapper.sh"

3.  Using the newly created start script wrapper, start the EAP 6 standalone server
4.  Start JBoss ON system
5.  Import EAP 6 standalone server into inventory
6.  Configure the EAP 6 resource's connection settings to use the custom start script wrapper standalone-wrapper.sh
7.  After EAP 6 resource shows availability of UP, invoke its shutdown operation
8.  Wait until EAP 6 resource is reported as DOWN
9.  Invoke the EAP 6 resource's start operation

Actual results:
EAP server is started but operation status in UI shows Failure with the following error message available from the UI:

    java.lang.Exception: Start failed with error code 0:

	    at org.rhq.core.pc.operation.OperationInvocation.run(OperationInvocation.java:278)
	    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
	    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
	    at java.lang.Thread.run(Thread.java:636)

Expected results:
EAP server is started and the operation status in the UI shows Success.

Additional info:
This is a direct result of incorrectly handling the process exit code returned by ProcessExecutionResults.getExitCode() in BaseServerComponent.startServer(). Any non-null exit code is treated as a failure and only <null> is treated as success. In fact, 0 means success, and <null> simply means that the process has not yet returned because it is still running, blocking, or timed out.

To fix this:

diff --git a/modules/plugins/jboss-as-7/src/main/java/org/rhq/modules/plugins/jbossas7/BaseServerComponent.java b/modules/plugins/jboss-as-7/src/main/java/org/rhq/modules/plugins/jbossas7/Ba
index 98a46d2..3dc12de 100644
--- a/modules/plugins/jboss-as-7/src/main/java/org/rhq/modules/plugins/jbossas7/BaseServerComponent.java
+++ b/modules/plugins/jboss-as-7/src/main/java/org/rhq/modules/plugins/jbossas7/BaseServerComponent.java
@@ -330,7 +330,7 @@ public abstract class BaseServerComponent<T extends ResourceComponent<?>> extend
         logExecutionResults(results);
         if (results.getError() != null) {
             operationResult.setErrorMessage(results.getError().getMessage());
-        } else if (results.getExitCode() != null) {
+        } else if (results.getExitCode() != null && results.getExitCode() != 0) {
             operationResult.setErrorMessage("Start failed with error code " + results.getExitCode() + ":\n" + results.getCapturedOutput());
         } else {
             // Try to connect to the server - ping once per second, timing out after 20s.
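
The condition introduced by the patch above can be exercised in isolation. The following is only a minimal sketch: the class StartExitCodeCheck and the method isStartFailure are hypothetical names made up for illustration and are not part of RHQ, and the sketch assumes the exit code is a nullable Integer, as the null check in the patch suggests.

```java
public class StartExitCodeCheck {

    /**
     * Hypothetical helper mirroring the patched condition: a start attempt is
     * reported as failed only if an error was captured or the script returned
     * a non-zero exit code. A null exit code (process still running, blocking,
     * or timed out) and an exit code of 0 are both not failures here; in the
     * patched method they fall through to the server connection check.
     */
    static boolean isStartFailure(Throwable error, Integer exitCode) {
        if (error != null) {
            return true;
        }
        return exitCode != null && exitCode != 0;
    }

    public static void main(String[] args) {
        // Script exited cleanly: not a failure (the case from this report).
        System.out.println("exit 0    -> failure? " + isStartFailure(null, 0));
        // Script has not returned: not a failure either.
        System.out.println("exit null -> failure? " + isStartFailure(null, null));
        // Script returned a non-zero code: failure.
        System.out.println("exit 1    -> failure? " + isStartFailure(null, 1));
    }
}
```

With the original condition (exit code != null), the first case would have been reported as a failure, which matches the "Start failed with error code 0" message shown under Actual results.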

Comment 1 Jirka Kremser 2013-02-04 10:52:10 UTC
I did something similar for as6 (feb01f468)

Comment 2 Jirka Kremser 2013-03-06 11:41:08 UTC
master
http://git.fedorahosted.org/cgit/rhq/rhq.git/commit/?id=c0fe931b7

time:    Wed Mar 6 12:24:27 2013 +0100
commit:  c0fe931b74ac8eccfc5be8af2275bea5d8c0c917
author:  Jirka Kremser - jkremser
message: [BZ 894493] - [as7] Start operation returns failure when start script returns exit code 0 (success). Added the test for the exit code.

Comment 3 Larry O'Leary 2013-09-06 14:31:30 UTC
As this is MODIFIED or ON_QA, setting milestone to ER1.

Comment 4 Jan Stefl 2013-10-11 14:39:12 UTC
Verified with JON 3.2.ER3 + EAP 6.1.1 - PASSED

