Bug 738584

Summary: rhq-ant-bundle tool output is not the same as the JON servers behaviour
Product: [Other] RHQ Project
Reporter: Nabeel Saad <nsaad>
Component: Provisioning
Assignee: RHQ Project Maintainer <rhq-maint>
Status: CLOSED WORKSFORME
QA Contact: Mike Foley <mfoley>
Severity: high
Priority: medium
Version: unspecified
CC: gcooper, hrupp, mazz
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2011-11-04 00:53:12 UTC
Bug Blocks: 625146, 734807
Attachments (description: flags):
The bundle deploy.xml that fails starting up: none
jboss init script: none

Description Nabeel Saad 2011-09-15 09:12:54 UTC
Description of problem:
I have been using the rhq-ant-bundle tool to test a few different bundles before uploading them into JON, to make sure they work correctly.  Two bundles work fine via the tool, but when I upload them into JON and run the deployment, the behaviour of the ant script is not exactly the same.

I have attached a sample bundle, eap-5.1.1-vanilla (sorry, 90 MB).  It's a cut-down version of the JBoss application server with only the default profile.  The bundle does a whole bunch of things (sets the bind address and port set, fixes permissions, moves the profile, etc.), but the one thing that works in the bundle tool and not in the actual JON deployment is starting the JBoss server with the following ant code in the deploy.xml:

	<echo>Starting up the server... with: ${jboss-as}/bin/jboss_init_redhat.sh start node${port-set}</echo>
	<exec executable="/bin/bash" failonerror="true">
		<arg value="-c" />
		<arg value="${jboss-as}/bin/jboss_init_redhat.sh start node${port-set}" />
	</exec>

Version-Release number of selected component (if applicable):
JON 2.4.1 with all recent patches

How reproducible:
Every time

Steps to Reproduce:
1. Log into a JON server with a group with at least one platform in it
2. Create a New bundle, using the "Upload" bundle option with the provided file.
3. Click into the bundle, click Deploy.
4. Create a new destination
   a. Name = test-destination
   b. Root deployment directory = /opt/example1
   c. Group = {test group that you had}
5. Input properties
   a. instance = default
   b. port-set = 03
   c. jmx-admin-password (leave default)
   d. java-home = {location to your jdk}
6. Click Next through to the end of the wizard.
7. Click into the Deployment [1] of Version [2.0] of [test-destination]
8. Once the deployment completes, you should see "Success" status
9. Check for the running process with ps -elf | grep example1; you won't find one.  Alternatively, look in /opt/example1/jboss-eap-5.1/jboss-as/server/node03/ (the recipe names the profile node${port-set}, so node03 for the port-set entered above) and you'll see that no logs have been created, because the server hasn't started up.
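
The check in step 9 can be sketched as a small shell function.  This is only a sketch under assumptions: the function name check_started is hypothetical, and the log location (log/server.log under the profile directory) is taken from the path the init script prints when it starts the server.

```shell
#!/bin/bash
# check_started DEPLOY_DIR NODE   (hypothetical helper, not part of RHQ or JON)
# Reports whether any process mentions DEPLOY_DIR and whether the profile's
# server.log exists.  Returns 0 only when the log file exists.
check_started() {
    local deploy_dir="$1"
    local node="$2"
    local log="$deploy_dir/jboss-eap-5.1/jboss-as/server/$node/log/server.log"

    # Same idea as `ps -elf | grep example1`, with the grep itself filtered out.
    if ps -ef 2>/dev/null | grep -v grep | grep -q -- "$deploy_dir"; then
        echo "process: found"
    else
        echo "process: not found"
    fi

    # The init script redirects run.sh output to this file when it starts.
    if [ -f "$log" ]; then
        echo "log: $log exists"
        return 0
    fi
    echo "log: $log missing"
    return 1
}
```

With port-set = 03 the recipe starts node03, so for the destination above the call would be: check_started /opt/example1 node03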
    	
Actual results:
Deployed Server from bundle does not start up as expected

Expected results:
Deployed server from bundle should start up

Additional info:
Extract the rhq-ant-bundle tool to a local folder, extract the contents of the uploaded bundle into that folder, and run the following command:

	rhq-ant -Drhq.deploy.dir=/opt/jon-demo/example1 -Dport-set=01 -Dinstance=default -Djava-home=/opt/jdk1.6.0_26

A lot of output will stream by, but towards the end, in the postinstall phase, you'll see:

postinstall:
     [echo] Updating jboss_init_redhat.sh with appropriate values...
     [echo] Updating JMX users admin password to jd-admin...
     [echo] Setting port attribute for default to ports-01.
     [echo] Moving /opt/jon-demo/example1/jboss-eap-5.1/jboss-as/server/default to /opt/jon-demo/example1/jboss-eap-5.1/jboss-as/server/node01.
     [move] Moving 460 files to /opt/jon-demo/example1/jboss-eap-5.1/jboss-as/server/node01
     [echo] Done deploying EAP 5.1.1 vanilla bundle to /opt/jon-demo/example1.
     [echo] Changing permissions, owner and group to jboss.
     [echo] Starting up the server... with: /opt/jon-demo/example1/jboss-eap-5.1/jboss-as/bin/jboss_init_redhat.sh start node01
     [exec] Starting with command: cd /opt/jon-demo/example1/jboss-eap-5.1/jboss-as/bin; /opt/jon-demo/example1/jboss-eap-5.1/jboss-as/bin/run.sh -c node01 -b 127.0.0.1 >/opt/jon-demo/example1/jboss-eap-5.1/jboss-as/server/node01/log/server.log 2>&1 &
     [exec] The server is starting up, check the logs to confirm when start up is complete:
     [exec] tail -f /opt/jon-demo/example1/jboss-eap-5.1/jboss-as/server/node01/log/server.log

And if you check /opt/jon-demo/example1/jboss-eap-5.1/jboss-as/server/node01/, you'll find that log/server.log exists, and tailing it will show the server coming up.

Also, there is another instance where the rhq-ant-bundle works but the JON deployment doesn't.  In the bug (TODO), if you were to deploy the deploy_failure.xml (as described in the "Steps to Reproduce") but via the rhq-ant-bundle, no errors would be thrown.

Comment 1 Nabeel Saad 2011-09-15 15:04:34 UTC
Created attachment 523391 [details]
The bundle deploy.xml that fails starting up

You will need to generate a bundle with an eap-5.1.1.zip attachment as well.

Comment 2 Nabeel Saad 2011-09-15 15:06:55 UTC
I referenced another bug that succeeds with the ant deploy tool but fails with JON, you can use the deploy_fail.xml from there:
  https://bugzilla.redhat.com/show_bug.cgi?id=738088

Comment 3 Charles Crouch 2011-09-26 14:00:23 UTC
It may not be possible in all cases to get the bundle test tooling to function 
identically to when it runs inside of the agent, but we will investigate this 
and determine if there is an issue to be fixed.

Comment 4 Charles Crouch 2011-09-26 14:00:32 UTC
Nabeel, is the following suitable content to reproduce the issue with: 
http://download.devel.redhat.com/released/JBEAP-5/5.1.1/zip/jboss-eap-5.1.1.zip

Comment 5 Nabeel Saad 2011-09-27 22:55:43 UTC
Hello Charles, yes, that EAP zip you provided should work fine for testing this issue.  For simplicity's sake, I would recommend deleting all the profiles other than default (just to decrease the size of the zip).

Also, you would need to put the above zip and the provided deploy.xml into another zip, say bundle.zip and then use that as your bundle.

Let me know if you have any questions.

(Same comment as 738088)

Comment 6 Heiko W. Rupp 2011-09-28 15:54:19 UTC
Mazz, can you please check this?

Comment 7 John Mazzitelli 2011-09-29 14:46:14 UTC
can you attach your jboss_init_redhat.sh file? I got the app server .zip file but it doesn't have it in there (which is why I guess you have it listed as a separate file in the bundle recipe). can't test this myself without your .sh file.

Comment 8 John Mazzitelli 2011-09-29 14:57:11 UTC
For the record, I grabbed the jboss_init_redhat.sh that you attached over in JIRA ( https://issues.jboss.org/browse/JBPAPP-3194?focusedCommentId=12625818&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12625818 ) and tried it; it failed with this:

[postinstall] <exec> cat: /home/rsoares/opt/jboss-eap-5.1.1-JBM/jboss-as/server/production/conf/props/jmx-console-users.properties: No such file or directory
[postinstall] <exec> cat: /home/rsoares/opt/jboss-eap-5.1.1-JBM/jboss-as/server/production/conf/props/jmx-console-users.properties: No such file or directory
[postinstall] <exec> usage: /tmp/deleteme/1/jboss-eap-5.1/jboss-as/bin/jboss_init_redhat.sh start|stop|restart|kill|status 
[postinstall] <exec> 
/home/mazz/source/rhq/modules/enterprise/agent/target/rhq-agent/data/tmp/bundle-versions/10021/ant-bundle-recipe5069565723731706642.xml:73: exec returned: 1

First, it looks like your ant recipe hardcodes your own /home/rsoares paths, which explains the first two messages. Second, it seems the jboss_init_redhat.sh I have wants a parameter that I guess your recipe wasn't passing. So it looks like I need your jboss_init_redhat.sh specifically to test with, and I'm not sure whether the hardcoded /home/rsoares paths will cause problems with my testing.

Comment 9 Nabeel Saad 2011-09-29 15:22:48 UTC
Hello John, I've attached the jboss_init_redhat.sh - the one that you picked up is not anything I've previously uploaded.  My version is customized and not necessarily used by customers, etc...

This does not/should not hard code any values in it regarding home directories... and the deploy.xml file should modify the JBOSS_HOME and JAVA_HOME variables in the deployed bundle.

As for the parameters, yes, my script will eventually require the name of the instance you want to operate (default, all, production, etc...) and the command (start, stop, status).  Those are passed in the <exec> command that I put in my description: 

 <exec executable="/bin/bash"  failonerror="true"> 
         <arg value="-c" /> 
         <arg value="${jboss-as}/bin/jboss_init_redhat.sh start node${port-set}"/>
 </exec>

Although, I do see that if that were to be used, it would need to be more like this:

 <exec executable="/bin/bash"  failonerror="true"> 
         <arg value="-c" /> 
         <arg value="${jboss-as}/bin/jboss_init_redhat.sh" />
         <arg value="start" />
         <arg value="node${port-set}"/>
 </exec>

I haven't tried this one out, but I had also tried the following and that didn't work either:

 <exec dir="/bin/" executable="bash" spawn="yes">
	<arg line="${jboss-as}/bin/jboss_init_redhat.sh start node${port-set}"/>
 </exec>
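
A caveat on the split-argument form above (separate "start" and "node${port-set}" args after "-c"), offered as a minimal sketch rather than a tested recipe: when bash is given -c, only the first word after -c is the command string, and any further words become $0, $1, ... of that string.  The script named in the command string would therefore run with no arguments.  The probe script below is a hypothetical stand-in for jboss_init_redhat.sh; it is run through bash so the sketch does not depend on the execute bit.

```shell
#!/bin/bash
# Hypothetical stand-in for jboss_init_redhat.sh: it only prints its arguments.
probe="$(mktemp)"
printf '%s\n' 'echo "args:[$*]"' > "$probe"

# One string after -c (the form used in the description):
# the script sees both arguments.
single="$(bash -c "bash $probe start node03")"

# Separate words after the -c string (the split <arg value> form):
# bash assigns them to $0 and $1 of the command string, so the script
# itself is started without any arguments.
split="$(bash -c "bash $probe" start node03)"

echo "single string -> $single"
echo "split args    -> $split"
rm -f "$probe"
```

If this holds for the real init script as well, the single-string form from the description is the one that passes "start node${port-set}" through.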

Also, just as a note, so that I can use that debugging as well, the [postinstall] output that you received with the failure errors, did you get that from an agent running with debug -e?  If so, I'll make sure to run with that in the future to have more info with my bugs.

Thanks,
Nabeel

Comment 10 John Mazzitelli 2011-10-24 11:46:33 UTC
(In reply to comment #9)
> Hello John, I've attached the jboss_init_redhat.sh

I don't see any attachment like that on this issue. I just see one attachment named "The bundle deploy.xml that fails starting up" which was uploaded on 2011-09-15.

Comment 11 Nabeel Saad 2011-10-24 11:53:06 UTC
Created attachment 529842 [details]
jboss init script

Comment 12 Nabeel Saad 2011-10-24 11:53:40 UTC
Sorry John, somehow had missed doing that.  It's attached now.  Cheers.

Comment 13 John Mazzitelli 2011-10-24 14:36:55 UTC
looking closer at the stack trace, I see this is where it technically failed:

Caused by: javax.persistence.PersistenceException: org.hibernate.exception.DataException: could not execute query
	at org.hibernate.ejb.AbstractEntityManagerImpl.throwPersistenceException(AbstractEntityManagerImpl.java:629)
	at org.hibernate.ejb.QueryImpl.getSingleResult(QueryImpl.java:99)
	at org.rhq.enterprise.server.util.CriteriaQueryRunner.getCount(CriteriaQueryRunner.java:124)
	at org.rhq.enterprise.server.util.CriteriaQueryRunner.execute(CriteriaQueryRunner.java:76)
	at org.rhq.enterprise.server.resource.ResourceTypeManagerBean.findResourceTypesByCriteria(ResourceTypeManagerBean.java:474)
	...
	at $Proxy214.findResourceTypesByCriteria(Unknown Source)
	at org.rhq.enterprise.server.resource.metadata.PluginManagerBean.isReadyForPurge(PluginManagerBean.java:108)

Here's the code building the criteria that gets the types:

    public boolean isReadyForPurge(Plugin plugin) {
        ResourceTypeCriteria criteria = new ResourceTypeCriteria();
        criteria.addFilterPluginName(plugin.getName());
        criteria.setRestriction(Criteria.Restriction.COUNT_ONLY);
        PageList results = resourceTypeMgr.findResourceTypesByCriteria(subjectMgr.getOverlord(), criteria);

        return results.getTotalSize() == 0;
    }

Is it possible for you to attach the plugin jar to this BZ, or at least attach the plugin descriptor .xml?

Comment 14 John Mazzitelli 2011-10-24 14:38:28 UTC
ignore that last comment - wrong BZ

Comment 15 Nabeel Saad 2011-10-24 16:58:03 UTC
So is this appropriately marked as awaiting info from me?  If so, what info would that be?  Let me know if you need anything from me.

Cheers,
Nabeel

Comment 16 John Mazzitelli 2011-10-24 18:38:12 UTC
my current testing on the current code base reveals the following.

I had a simple test bundle with this as the recipe:

----
<?xml version="1.0"?>
<project name="mazz" default="main" xmlns:rhq="antlib:org.rhq.bundle">
   <rhq:bundle name="mazz" version="2.0" description="testing ant exec">
      <rhq:deployment-unit name="appserver" postinstallTarget="postinstall">
         <rhq:file name="mazz.sh" destinationFile="mazz.sh" />
      </rhq:deployment-unit>
   </rhq:bundle>

   <target name="main" />
   <target name="postinstall">
      <rhq:audit>Chmod'ing ${rhq.deploy.dir}/mazz.sh</rhq:audit>
      <chmod dir="${rhq.deploy.dir}" perm="755" includes="**/*.sh"/>
      <rhq:audit>START: ${rhq.deploy.dir}/mazz.sh</rhq:audit>
      <exec executable="/bin/bash"  failonerror="true">
         <arg value="-c" />
         <arg value="${rhq.deploy.dir}/mazz.sh start"/>
      </exec>
      <rhq:audit>DONE: ${rhq.deploy.dir}/mazz.sh</rhq:audit>
   </target>
</project>
----

My "mazz.sh" script was very simple - it just echo'ed out the arguments to a file:

----
echo $* >> /tmp/deleteme/MAZZ_OUT.txt
----

I put that mazz.sh and deploy.xml into a bundle.zip and deployed it. It all worked. My /tmp/deleteme/MAZZ_OUT.txt showed "start" as the argument passed to it. So that tells me at least <exec> works from the agent's ant subsystem used to run bundle ant recipes.

There must be something specifically done differently in the scripts used to replicate this issue (compared to the scripts I just successfully used).

Comment 17 John Mazzitelli 2011-10-24 19:06:43 UTC
I now just added some additional debug output to the attached recipe and jboss init script to see if they are being run properly.

Everything from the RHQ point of view is being exec'ed properly. Specifically, in the jboss init script, I added this as the first line it executes:

   echo starting: $* >> /tmp/deleteme/MAZZ_OUT.txt

Notice it will echo the cmd line args passed to it into this .txt file.

In addition, right after CMD_START is set, I echoed it out as well:

   #define the start/stop commands to use in the terminal
   CMD_START="cd $JBOSS_HOME/bin; $JBOSSSH" 
   echo $CMD_START >> /tmp/deleteme/MAZZ_OUT.txt

Once I deployed this to /tmp/deleteme/testeap (that is my bundle target directory), I see this in my MAZZ_OUT.txt file:

starting: start node03
cd /tmp/deleteme/testeap/jboss-eap-5.1/jboss-as/bin; /tmp/deleteme/testeap/jboss-eap-5.1/jboss-as/bin/run.sh -c node03 -b 127.0.0.1

So here you can see that the Ant recipe was invoked by RHQ, its <exec> was successfully called and you see the arguments passed to jboss_init_redhat.sh are correct ("start node03"). In addition, the CMD_START command line looks good.

Why it doesn't actually start I do not know. However, at this point, we are outside of RHQ's control and inside the JBoss launch script.

Comment 18 John Mazzitelli 2011-10-24 19:56:38 UTC
Nabeel - can you do the same kind of debugging on your end and see what your jboss_init_redhat.sh script is doing? Add some "echo ABC >> /tmp/NABEEL.out" lines in your jboss_init_redhat.sh script where ABC is some message to help debug your launcher script.

We need to find out why your launcher script is not behaving properly. It could be due to the user the script is running as (maybe it doesn't have permission to run jboss?) or to file permissions (maybe the user doesn't have execute permission on some jboss script, or read permission on some jboss config file?).
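
The user and permission checks suggested above could be sketched as a few diagnostic lines.  This is an illustration under assumptions, not something taken from the attached script: the function name log_launch_context and its three parameters are hypothetical, and the files to check are whatever the init script actually invokes and reads.

```shell
#!/bin/bash
# log_launch_context OUT_FILE SCRIPT CONFIG   (hypothetical helper)
# Appends to OUT_FILE the user running the launcher, whether SCRIPT is
# executable by that user, and whether CONFIG is readable by that user.
log_launch_context() {
    local out="$1"
    local script="$2"
    local config="$3"
    {
        echo "user: $(id -un)"
        if [ -x "$script" ]; then
            echo "executable: $script"
        else
            echo "NOT executable: $script"
        fi
        if [ -r "$config" ]; then
            echo "readable: $config"
        else
            echo "NOT readable: $config"
        fi
    } >> "$out"
}
```

Called near the top of jboss_init_redhat.sh with /tmp/NABEEL.out as the output file, it would record the same kind of trace as the echo lines, plus who the agent runs the script as.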

Comment 19 Charles Crouch 2011-11-02 14:22:26 UTC
Dropping priority as everything appears to be working from the RHQ side. 
Waiting for more feedback and can raise priority again based on that.

Comment 20 Nabeel Saad 2011-11-04 00:35:57 UTC
Hello gents,

I do now have a functional recipe that starts up my server with the exec, so I think that scenario is no longer valid.

My other scenario, the rhq-input type boolean that fails in 2.4.1, can no longer be tested in JON 3 because that bug has been fixed.

So, for the moment, I would suggest closing this bug.  If I do run into any other scenarios where the ant test tool and the agent (in the JON 3 world) do different things, I will provide that as an example on this bug and re-open it.

Many thanks for looking into this.
Nabeel