Bug 1044989

Summary: RHQ Control - rhqctl script hangs when starting the JBoss ON agent in DEBUG mode
Product: [Other] RHQ Project Reporter: bkramer <bkramer>
Component: Core ServerAssignee: John Mazzitelli <mazz>
Status: CLOSED CURRENTRELEASE QA Contact: Mike Foley <mfoley>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 4.9CC: hrupp, jshaughn
Target Milestone: ---   
Target Release: RHQ 4.10   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-04-23 12:32:08 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1044986    

Description bkramer 2013-12-19 11:41:45 UTC
Description of problem:

Starting RHQ components using rhqctl script when JBoss ON Agent is in DEBUG mode will properly start all components (storage, server and agent) but it will not return the prompt.

Version-Release number of selected component (if applicable):
RHQ 4.9

How reproducible:
Always

Steps to Reproduce:
1. Set the agent into DEBUG mode by editing $RHQ-AGENT/bin/rhq-agent-env.sh file and uncommenting out the line: #RHQ_AGENT_DEBUG=true
2. Save the change;
3. Start all three components by running: ./rhqctl start

Actual results:
All three components are properly started but the script is hanging and is not returning the prompt.

Expected results:
All three components are properly started and the script properly exits;

Additional info:

Adding "set -x" to the rhqctl and rhq-agent-wrapper.sh scripts showed that all commands in the script were executed, including the "exit 0" for a successful agent startup. It seems that the actual rhqctl script hangs in:

"$RHQ_SERVER_JAVA_EXE_FILE_PATH" ${RHQ_CONTROL_JAVA_OPTS} ${RHQ_CONTROL_ADDITIONAL_JAVA_OPTS} -jar "${RHQ_SERVER_JBOSS_HOME}/jboss-modules.jar" -mp "$_JBOSS_MODULEPATH" org.rhq.rhq-server-control "$@"


When running "kill -3" the threadump showed:


Full thread dump Java HotSpot(TM) 64-Bit Server VM (14.1-b02 mixed mode):

"Thread-7" daemon prio=10 tid=0x00007fb1a852e800 nid=0xfd3 runnable [0x00007fb1ad3c5000]
   java.lang.Thread.State: RUNNABLE
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:177)
    at org.apache.commons.exec.StreamPumper.run(StreamPumper.java:105)
    at java.lang.Thread.run(Thread.java:619)

"Thread-6" daemon prio=10 tid=0x00007fb1a852e000 nid=0xfc8 runnable [0x00007fb1ad4c6000]
   java.lang.Thread.State: RUNNABLE
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:199)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    - locked <0x00007fb19a6e56c8> (a java.io.BufferedInputStream)
    at java.io.FilterInputStream.read(FilterInputStream.java:90)
    at org.apache.commons.exec.StreamPumper.run(StreamPumper.java:105)
    at java.lang.Thread.run(Thread.java:619)

"Reference Reaper" daemon prio=10 tid=0x00007fb1a81eb000 nid=0xe17 in Object.wait() [0x00007fb1ad840000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00007fb19ab70958> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
    - locked <0x00007fb19ab70958> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
    at org.jboss.modules.ref.References$ReaperThread.run(References.java:68)

"Low Memory Detector" daemon prio=10 tid=0x00007fb1a8157800 nid=0xe15 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread1" daemon prio=10 tid=0x00007fb1a8154800 nid=0xe14 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x00007fb1a8152800 nid=0xe13 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00007fb1a8150800 nid=0xe12 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x00007fb1a8133000 nid=0xe11 in Object.wait() [0x00007fb1ade4c000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00007fb185086ae8> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
    - locked <0x00007fb185086ae8> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=10 tid=0x00007fb1a8131000 nid=0xe10 in Object.wait() [0x00007fb1adf4d000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00007fb185083c38> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:485)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
    - locked <0x00007fb185083c38> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x00007fb1a8007800 nid=0xe0a in Object.wait() [0x00007fb1aef93000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00007fb19a615530> (a java.lang.Thread)
    at java.lang.Thread.join(Thread.java:1143)
    - locked <0x00007fb19a615530> (a java.lang.Thread)
    at java.lang.Thread.join(Thread.java:1196)
    at org.apache.commons.exec.PumpStreamHandler.stop(PumpStreamHandler.java:171)
    at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:363)
    at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:160)
    at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:147)
    at org.rhq.server.control.command.Start.startAgent(Start.java:192)
    at org.rhq.server.control.command.Start.exec(Start.java:86)
    at org.rhq.server.control.ControlCommand.exec(ControlCommand.java:126)
    at org.rhq.server.control.RHQControl.exec(RHQControl.java:83)
    at org.rhq.server.control.RHQControl.main(RHQControl.java:242)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.jboss.modules.Module.run(Module.java:270)
    at org.jboss.modules.Main.main(Main.java:411)


That also means that we were in the startAgent method ( at org.rhq.server.control.command.Start.startAgent(Start.java:192)) trying to execute line 192:

   private void startAgent() throws Exception {
        log.debug("Starting RHQ agent");

        File agentBinDir = new File(getAgentBasedir(), "bin");
        Executor executor = new DefaultExecutor();
        executor.setWorkingDirectory(agentBinDir);
        executor.setStreamHandler(new PumpStreamHandler());
        org.apache.commons.exec.CommandLine commandLine = getCommandLine("rhq-agent-wrapper", "start");

        if (isWindows()) {
            try {
                executor.execute(commandLine);
            } catch (Exception e) {
                // Ignore, service may not exist or be running, , script returns 1
                log.debug("Failed to start agent service", e);
            }
        } else {
            executor.execute(commandLine);  < line 192
        }
    }


However, the StreamPump (at org.apache.commons.exec.StreamPumper.run(StreamPumper.java:105)) is waiting on the line 105 for some input:

104     try {
105     while ((length = is.read(buf)) > 0) {
106     os.write(buf, 0, length);
107     }

So, when debug is enabled, somehow the input from the rhq-agent-wrapper is still open although we can see the script getting to "exit 0".

Comment 1 Jay Shaughnessy 2014-01-08 22:11:22 UTC
Setting this to Linux, I could not reproduce on Windows.

Comment 2 Heiko W. Rupp 2014-02-28 11:39:56 UTC
git commit to master b3681f2

Comment 3 Heiko W. Rupp 2014-04-23 12:32:08 UTC
Bulk closing of 4.10 issues.

If an issue is not solved for you, please open a new BZ (or clone the existing one) with a version designator of 4.10.