Bug 1006352 - rhqctl install should only install the server and not start it unless explicitly instructed
Summary: rhqctl install should only install the server and not start it unless explici...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Operations Network
Classification: JBoss
Component: Launch Scripts
Version: JON 3.2
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ER03
Target Release: JON 3.2.0
Assignee: Jay Shaughnessy
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On:
Blocks: jon32-Beta-Blockers-1006862
 
Reported: 2013-09-10 13:29 UTC by Larry O'Leary
Modified: 2014-01-02 20:34 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-01-02 20:34:06 UTC
Type: Bug
Embargoed:


Attachments
rhq-installer.log (27.53 KB, text/x-log)
2013-10-11 14:16 UTC, Armine Hovsepyan
rhqctl-install-start (2.18 KB, text/x-log)
2013-10-11 14:25 UTC, Armine Hovsepyan
rhq-upgrade.log (51.53 KB, text/x-log)
2013-10-14 10:38 UTC, Armine Hovsepyan
needinfo (47.76 KB, text/x-log)
2013-10-14 11:41 UTC, Armine Hovsepyan

Description Larry O'Leary 2013-09-10 13:29:28 UTC
Description of problem:
When using an install command, one would expect the result to be an install only. Currently, rhqctl install not only performs the install but also starts the components and leaves them in a running state.

The rhqctl script should be updated to only perform install when the install command-line argument is used and to perform an install and start when the start command-line argument is used.

Comment 1 John Mazzitelli 2013-09-11 13:59:51 UTC
The server needs to be started in order to install. The installer runs remote CLI commands to configure the running AS server. Without starting the server, the installer will fail.

Comment 2 Larry O'Leary 2013-09-11 14:14:52 UTC
The point of this BZ is that INSTALL = INSTALL and START = START/RUNNING. It is understood that the server may require starting while it is being set up. But one would imagine that I could do one of the following:

rhqctl start

 --- and if the server is not yet configured/installed, it will be installed and then left in a running state.


rhqctl install
 --- the server will be installed or re-installed and will not continue to run afterwards.

Comment 3 John Sanda 2013-09-12 19:43:07 UTC
I do not see a problem as long as things are clearly documented. The behavior is consistent with the older, GUI installer. If you run rhqctl start and nothing is installed, it logs a message to that effect.

Comment 4 Larry O'Leary 2013-09-12 20:36:38 UTC
Hm. That isn't really the same. In previous versions I run rhq-server.sh start and if it isn't installed, it is installed. If it is installed, it is started. In the end, I used the start command so it was started. 

By separating start and install, the install parameter seems like it should install, not start.

Comment 5 Jay Shaughnessy 2013-10-01 19:01:28 UTC
OK, after much discussion the chosen approach is:

* Keep the 'rhqctl install' command
* By default it will stop the services after install completes
* We'll offer a new 'rhqctl install --start' option to leave services running
* We'll remove the now unnecessary --agent-auto-start option
* 'rhqctl start' will exit if service is not installed, and provide a good message
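For anyone following along, the points above can be modeled roughly as below. This is only an illustrative Python sketch, not rhqctl's implementation: the `Services` class, the function names, and the message text are hypothetical, and all services are collapsed into a single installed/running state.

```python
class Services:
    """Hypothetical stand-in for the services managed on one host."""

    def __init__(self):
        self.installed = False
        self.running = False


def install(services, start=False):
    """Model of 'rhqctl install' and 'rhqctl install --start'."""
    # The installer needs the server up while it configures it (comment 1).
    services.running = True
    services.installed = True
    if not start:
        # Default: stop the services once the install completes.
        services.running = False


def start(services):
    """Model of 'rhqctl start': refuses to do anything if nothing is installed."""
    if not services.installed:
        # Hypothetical wording; the real message is not quoted in this BZ.
        return "not installed: run 'rhqctl install' first"
    services.running = True
    return "started"
```

With this model, a plain install leaves the services installed but stopped, `install(..., start=True)` leaves them running, and `start` on a fresh host returns the "not installed" message without starting anything.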

Comment 6 John Mazzitelli 2013-10-01 19:10:36 UTC
(In reply to Jay Shaughnessy from comment #5)
> OK, after much discussion the chosen approach is:
> 
> * Keep the 'rhqctl install' command
> * By default it will stop the services after install completes
> * We'll offer a new 'rhqctl install --start' option to leave services running
> * We'll remove the now unnecessary --agent-auto-start option
> * 'rhqctl start' will exit if service is not installed, and provide a good
> message


I think a better, less intrusive, less confusing solution is the following:

rename "install" to "install-and-start"

Now, our code doesn't have to change, it works like we want (that is, services all start up after being installed), and the confusion is gone because users don't have to read the docs to understand that "install-and-start" will both install and start everything.

Even if they try to "rhqctl install" they will get an error with --help output, so they can see immediately what command they have to use.

Problem solved, with hardly any code changes or process changes - everything works like it always has. Just the command name has been altered to make it clear what it's doing.

Comment 7 John Sanda 2013-10-01 19:17:17 UTC
I am -1 on these changes. In comment 4 Larry said that the behavior is different from previous versions. I disagree. The behavior is the same. The difference is in the scripts and command used to install/start things. The user no longer runs rhq-server.sh to install or start the server. He runs rhqctl. By simply reading the documentation for rhqctl it will be clear what the install command does. And I know that an argument is made that users do not read documentation. I don't think that is a valid argument in this case because how else will users know to use rhqctl? Sure, the user might just run rhq-server.sh and see the message that says to use rhqctl. Then what? How does the user figure out what to do with it? At some point, he is going to have to look at some form of documentation. I much prefer adding a --nostart option to rhqctl if we do anything at all.

Comment 8 Jay Shaughnessy 2013-10-01 19:31:33 UTC
I guess the jury is still out...

Comment 9 Larry O'Leary 2013-10-01 20:13:58 UTC
(In reply to John Sanda from comment #7)
> I am -1 on these changes. In comment 4 Larry said that the behavior is
> different from previous versions. I disagree. The behavior is the same.

The statement in comment #4 was based on fact. In JON 3.1, start means "install + start". As of right now, in JON 3.2, start means "start only". This is what is confusing. 

The opinion part is that the issue is made worse by having a new command called install that means "install + start".

With most services, "install" means install only. It does not leave the service or its processes in a started or running state. It only means that it has been configured so that the service can be controlled and started using the service start command.

If we attempt to apply common practice to rhqctl we are now in a confusing situation if install = "install + start". This is because to start a service, one must use "service start". However, that will fail if install was used and left the process running even though the service start command had not yet been executed. Take for example, to install and start PostgreSQL:

    service postgresql initdb
    service postgresql start

As you can see, two distinctive commands. In this case, initdb even tells you once it is done that you can now start the service in the normal way. This ensures that the process is running using the configured service method for the desired OS. In Windows, after the data store has been defined and initialized:

    net start postgresql

Again, the initialization or installation doesn't leave the process running. The same applies to our own agent service:

    rhq-agent config
    rhq-agent start

Even when installing those services:

    chkconfig postgresql on

The service isn't running. It is only "installed" and "configured" so that it can be started using the appropriate start command.

Now that does not mean that we can't go against the grain and change the definition of install to mean more than install. It just doesn't seem like a wise decision for the long run.

Perhaps "install" should go away completely and we should be left only with "start"? This would further simplify the life-cycle and not introduce an extra set of commands and parameters. Users who only want to install would do what they have been doing in JON 3.1 using rhq-server:

    rhqctl start

    ... provide all the info if not specified in the properties file...

    rhqctl stop

Comment 10 John Mazzitelli 2013-10-01 20:55:14 UTC
rhqctl install-and-start

solves all problems IMO and is very, very easy to implement. If this BZ gets assigned to me, that's how I'm implementing the fix.

Comment 11 John Sanda 2013-10-01 21:46:20 UTC
The idea of altogether doing away with the install command is interesting. As pointed out it simplifies rhqctl and is more consistent with the behavior in JON 3.1.

Suppose the user unzips the JON distro and runs,

$ rhqctl start --server

The server installation would fail since the storage node has not even been installed which I think is ok. We cannot add a simple check to make sure the storage node is installed because a storage node could be running on a different machine.

Comment 12 Larry O'Leary 2013-10-01 22:36:35 UTC
(In reply to John Mazzitelli from comment #10)
> rhqctl install-and-start
> 
> solves all problems IMO and is very, very easy to implement. If this BZ gets
> assigned to me, that's how I'm implementing the fix.

Although I agree that this makes the usage very clear, the fact that we are using such long arguments seems a little clunky. Which is why it originally made sense to add an optional argument to the install parameter named --start. 

(In reply to John Sanda from comment #11)
> The idea of altogether doing away with the install command is interesting.
> As pointed out it simplifies rhqctl and is more consistent with the behavior
> in JON 3.1.

Right. I suppose this isn't as much of an issue with the "install" command as it is that install and start just don't make sense when we have "start" as well. It is the combination of these commands that seems confusing.

This is why dropping install altogether seems like a good way forward, assuming everyone is okay with that and there is no drawback to losing an install command. This BZ was only proposing that, to give value to the install command, we make it "install only" so that there is no confusion between the two (install vs. start), and that the "start" command perform the install if install has not yet been done.

I suppose another way to accomplish this is to simply ignore the start command and its output when used in combination with install:

$ rhqctl install
$ rhqctl start

Has the same effect as:

$ rhqctl install

And therefore, we don't need an rhqctl start command. But logic tells me that would be even more confusing. Users would then say "How do I start the service once I install it?" Which brings us back to the original point that install is not the same as start and start is not the same as install. However, start could take care of initial configuration or bootstrapping necessary for the first time startup.

> Suppose the user unzips the JON distro and runs,
> 
> $ rhqctl start --server
> 
> The server installation would fail since the storage node has not even been
> installed which I think is ok. We cannot add a simple check to make sure the
> storage node is installed because a storage node could be running on a
> different machine.

Well, I am assuming you are explicitly calling out that a user would specify that the "server" be started explicitly? In that case I think it would be fine for the server to start assuming the database said it was configured. However, if the database was not available or said it wasn't installed, I would expect that this would fail.

I would also expect

$ rhqctl start

To perform the installation if it determined that the system has not yet been installed. 

I think my primary difference is that the explicit server start request should perhaps fail when the system is not yet installed/ready, whereas the simple start would perform the install in the same situation.

So, with no knowledge of the rhqctl at the moment here is what I would expect as a user:

$ rhqctl start

-- if not yet installed, run install and then start all services
-- if already installed, start all services

$ rhqctl start --server
$ rhqctl start --storage-node 

-- if not yet installed, display an error indicating that the respective service can not start due to reason 'x'
-- if already installed, start the respective service

$ rhqctl start --agent

-- if not yet installed, run agent install and then start agent service
-- if already installed, start agent service


If we feel install provides some added functionality or extension point, the usage should be similar to:

$ rhqctl install

-- if not yet installed, run install and be sure to shutdown services in the event they were started during install
-- if already installed, not sure if a re-install should be supported here or what the expectation should be...

$ rhqctl start

-- same as above


Or alternatively:
$ rhqctl install --start

-- if not yet installed, run install and then start all services
-- if already installed, not sure if a re-install should be supported here or what the expectation should be...
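The expectations listed in this comment can be summarized as a small decision table. The sketch below is a hypothetical Python model of that proposal only; it is not how rhqctl behaves or was eventually implemented, and the function names and action labels are invented for illustration.

```python
def plan_start(component=None, installed=False):
    """Actions expected for 'rhqctl start' with an optional component.

    component: None (no option given), "server", "storage-node" or "agent".
    """
    if installed:
        return ["start"]
    if component in ("server", "storage-node"):
        # Explicit server or storage node start on an uninstalled system fails.
        return ["error"]
    # Plain start, or start --agent: install first, then start.
    return ["install", "start"]


def plan_install(start=False, installed=False):
    """Actions expected for 'rhqctl install' with or without --start."""
    if installed:
        # The comment leaves the re-install case open.
        return None
    return ["install", "start"] if start else ["install", "stop"]
```

For example, `plan_start("server")` yields `["error"]` on an uninstalled system, while `plan_start()` and `plan_start("agent")` both yield `["install", "start"]`.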

Comment 15 Jay Shaughnessy 2013-10-03 15:47:07 UTC
After more discussion and consideration, the approach in Comment 5 has been applied.

master commit a632102141b729f6a99c330d6e3d52efae94219b
Author: Jay Shaughnessy <jshaughn>
Date:   Thu Oct 3 11:42:44 2013 -0400

New behavior in place.  The rhqctl install and upgrade commands no longer
leave the installed or upgraded services running, by default. In other
words, they perform only an install or upgrade.

Each now supports a --start option which will ensure the installed or upgraded
services are running when the install or upgrade completes.

Note that for the install command this applies only to services actually
installed by that command invocation (i.e. install can not substitute for
start).

Note that for upgrade the storage service will always be left running after
upgrade if the migrate-data option is specified, to support the migration.

Note that the --agent-auto-start option has been removed from both commands
since it is superseded by the --start option.

also:
- fixed a recent regression in upgrade, now again aborts if it detects
  previously upgraded services.
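As a rough reading of the notes above, the set of services a single install or upgrade invocation leaves running could be modeled as follows. This is a hypothetical Python sketch based only on the commit message, not on the rhqctl source: the function name, the `handled` argument, and the service labels are assumptions made for illustration.

```python
def left_running(command, handled, start=False, migrate_data=False):
    """Services that one invocation leaves running, per the commit notes.

    command: "install" or "upgrade"
    handled: services actually installed or upgraded by this invocation
    """
    # --start applies only to what this invocation installed or upgraded.
    running = set(handled) if start else set()
    if command == "upgrade" and migrate_data:
        # Storage stays up after upgrade to support the data migration.
        running.add("storage")
    return running
```

Under this reading, `left_running("install", {"server"})` is empty, and an upgrade with the migrate-data option keeps `"storage"` running even without `--start`.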

Comment 16 Simeon Pinder 2013-10-04 20:53:35 UTC
Followup commit to master for this fix: 5335a680124e

Comment 17 Simeon Pinder 2013-10-08 07:41:48 UTC
Moving to ON_QA for testing.

Comment 18 Armine Hovsepyan 2013-10-11 14:16:46 UTC
Created attachment 811151
rhq-installer.log

Comment 19 Armine Hovsepyan 2013-10-11 14:25:03 UTC
Created attachment 811154
rhqctl-install-start

Comment 20 Armine Hovsepyan 2013-10-14 10:38:49 UTC
Created attachment 811934
rhq-upgrade.log

Comment 21 Armine Hovsepyan 2013-10-14 11:41:41 UTC
Created attachment 811949
needinfo

Comment 22 Armine Hovsepyan 2013-10-14 11:44:37 UTC
Verified.
Please see the attached logs.



Need info: 

Scenario:
1. install storage only (with agent) and start it - rhqctl install --storage --start
2. install server without start operation

Actual result: Storage is stopped with server

Expected result: I would expect storage to stay started

Comment 23 Larry O'Leary 2013-10-14 15:38:34 UTC
@Armine, this scenario needs to be captured in a new BZ. As it is a little different than what was described in the original discussion around this BZ, I would still call this VERIFIED.

We can triage the new one but I would say that it is a low severity. It is a real issue. The use-case for your scenario seems to be:

 - Originally installed storage cluster on its own host (separate from JON server)
 - Later, decide to add a new JON server to an HA system

In that situation it is important to note that the act of running rhqctl install --server by itself will result in the storage node being shut down? Or was your scenario using rhqctl install without specifying the component?

Comment 24 Armine Hovsepyan 2013-10-14 16:30:38 UTC
The scenario was:
1. rhqctl install --storage --start
2. rhqctl install --server

and after step 2 storage was stopped (while agent was not), all 3 were installed on the same host. 

Would this be an expected behaviour?

Comment 25 Larry O'Leary 2013-10-14 16:52:02 UTC
No. This doesn't seem like expected behavior. Get it captured as a new BZ against 3.2. We can leave this on VERIFIED with a See Also link to the new BZ.
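The two-step scenario from comment 24 can be written down as a small check for the new BZ. The sketch below uses a hypothetical in-memory Python model rather than rhqctl itself; it encodes the expectation that an install only stops (or leaves running) the services it installed in that invocation, so services started earlier are untouched. The function name and service labels are assumptions.

```python
def install(installed, running, targets, start=False):
    """Return (installed, running) after installing the given targets.

    Only services newly installed by this call are stopped or left running.
    """
    new = set(targets) - set(installed)
    installed = set(installed) | new
    if start:
        running = set(running) | new
    else:
        running = set(running) - new
    return installed, running


# Step 1: rhqctl install --storage --start (storage comes with the agent)
installed, running = install(set(), set(), {"storage", "agent"}, start=True)
# Step 2: rhqctl install --server
installed, running = install(installed, running, {"server"})
```

After step 2, this model has all three services installed, with storage and agent still running and the server stopped, which is the expected result described in the scenario.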

Comment 26 Armine Hovsepyan 2013-10-14 17:00:53 UTC
Thanks Larry.

bz #1018917 filed

Original (1006352) bug is verified.

