Bug 1020066 - Script server execution arguments do not interpret quoted arguments including spaces as a single parameter
Script server execution arguments do not interpret quoted arguments including...
Status: CLOSED CURRENTRELEASE
Product: JBoss Operations Network
Classification: JBoss
Component: Operations (Show other bugs)
JON 3.1.2
All All
unspecified Severity high
: DR01
: JON 3.2.1
Assigned To: Lukas Krejci
Mike Foley
:
Depends On: 1049608
Blocks: jon321-blockers 1062782 1062789
  Show dependency treegraph
 
Reported: 2013-10-16 19:43 EDT by Marc Shirley
Modified: 2014-05-08 13:44 EDT (History)
4 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1049608 (view as bug list)
Environment:
Last Closed: 2014-05-08 13:44:17 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 514753 None None None Never

  None (edit)
Description Marc Shirley 2013-10-16 19:43:58 EDT
Description of problem:
Executing a script server resource with an argument string including a quoted argument that contains spaces results in the quoted argument being broken out into multiple arguments.

For example, "a b" is treated as two arguments.  '"a' is the first argument, and 'b"' is the second.

Version-Release number of selected component (if applicable):
JBoss ON 3.1.2

How reproducible:
very

Steps to Reproduce:
1. Import script server resource (script example below)
2. Perform "Execute" operation on script server resource
3. Input arguments "a b"
4. Verify output

testScript.sh contents:
#!/bin/sh

echo $# number of arguments > output.txt
echo $* >> output.txt
echo $@ >> output.txt


Actual results:
Script reports that 2 arguments have been passed into it.

Expected results:
Script should report that 1 argument was passed in.

Additional info:
Comment 1 Larry O'Leary 2013-10-22 18:06:45 EDT
This issue is a direct result of how the operation's arguments property value is passed along to executeExecutable and then split into multiple arguments using the space:

    https://git.fedorahosted.org/cgit/rhq/rhq.git/tree/modules/plugins/script/src/main/java/org/rhq/plugins/script/ScriptServerComponent.java#n405

The end result is that:
    Arg1 "Arg 2" Arg3

Becomes:
    Arg1
    "Arg
    2"
    Arg3

And:
    Arg1 'Arg 2' Arg3

Becomes:
    Arg1
    'Arg
    2'
    Arg3

And:
    Arg1 Arg\ 2 Arg3

Becomes:
    Arg1
    Arg\
    2
    Arg3


To fix this, a more suitable regular expression and matcher should be applied to handle quoted (double or single) parameters and/or escapes. Possibly a simple command-line parser?

The most basic form is to support:

Arg1 "Arg 2" Arg3
Arg1 'Arg 2' Arg3

However, from a command-line perspective, the quote may need to be escaped:

Arg1 'Arg\'s 2 Value' Arg3

Additionally, the following format seems to be common within the Linux realm:

Arg1 Arg\ 2 Arg3
Comment 2 Lukas Krejci 2014-01-07 15:53:15 EST
BZ 1049608 fixes this in master and is trivially portable to JON 3.2.x stream.

However JON 3.1.x doesn't support updating the plugin configuration during resource upgrade. 

This is used to define the needed escape character (which defaults to ^ on Windows and \ on other platforms). Note that this parameter is there purely for user convenience (so that args can be copied from other places without changes for the specific platform) but the plugin will actually work with any escape character (not just ^ or \).

To overcome this, we can for example hard-code the escape character in JON 3.1.x stream to be \.

As of now, the fix is only in master (by means of above mentioned BZ) but has not been ported to any JON stream.

What do you say, Larry?
Comment 3 Larry O'Leary 2014-01-07 16:34:52 EST
(In reply to Lukas Krejci from comment #2)
> However JON 3.1.x doesn't support updating the plugin configuration during
> resource upgrade. 

Why would we need to upgrade plug-in configuration? I have not looked at your commit but even if you were to introduce a new configuration property in rhq-plugin.xml, the code that uses that property could use hard-coded defaults if null is returned. Or perhaps if it is undefined (null), do doing any parsing/escaping and continue to work as we did.

> As of now, the fix is only in master (by means of above mentioned BZ) but
> has not been ported to any JON stream.

The ultimate goal (at the moment) is to get this in 3.2.x. But I think my above statement applies to 3.2.x as well. We shouldn't need to upgrade/update any existing plug-in configuration. The lack of any value should indicate that the old/original behavior is expected.
Comment 4 Larry O'Leary 2014-01-07 19:32:50 EST
It has come to my attention that I may have rushed the 3.1.x request on this. 

Our intention is to get this fix into 3.2.x. At this time, let's disregard the 3.1.x request as this can be handled via a different BZ if we actually require it. 

Sorry for any confusion.
Comment 5 Lukas Krejci 2014-01-08 03:38:59 EST
The need to update the plugin configuration programatically stems from the fact that the default should be different based on the operating system. So while we can provide "\" as the default escape character, it will be slightly confusing, and misleading even, for the users using the script server for running windows programs, where '\' is a path separator and '^' is used for escaping in the command line.
Comment 6 Lukas Krejci 2014-01-08 03:42:03 EST
release/jon3.2.x 7181e17fdbf05fe9e8de563626c4dae35cd5e093
Author: Lukas Krejci <lkrejci@redhat.com>
Date:   Tue Jan 7 21:43:42 2014 +0100

    [BZ 1049608] Support quotes and escapes in script args in script plugin
Comment 7 Larry O'Leary 2014-01-08 13:27:03 EST
(In reply to Lukas Krejci from comment #5)
> The need to update the plugin configuration programatically stems from the
> fact that the default should be different based on the operating system. So
> while we can provide "\" as the default escape character, it will be
> slightly confusing, and misleading even, for the users using the script
> server for running windows programs, where '\' is a path separator and '^'
> is used for escaping in the command line.

I still don't understand this. From my point of view, we have two options:

- use reasonable default if not defined

  The plug-in itself takes care of this. If escape character is null, then depending on what platform we are on, use \ or ^.

- legacy mode

  If escape character is null, then use the command-line arguments as we were prior to this fix. No special parsing or escaping or quoting. 


What am I missing? Why do we need to perform plug-in configuration update for this fix? Seems like existing inventory can be left alone and one of the two mentioned options could be used. For any newly imported resources, a reasonable default could be set during discovery.
Comment 8 Lukas Krejci 2014-01-08 15:27:47 EST
I was considering the following scenarios:

Consider a group of script servers on different platforms to which you need to pass escaped params. There is no way of doing it for the whole group en masse unless you define a custom escape param that is common for the whole group.

On the other hand people will intuitively expect that they can use the same escape chars as on the platform the script server is defined (until they need to support the above scenario ;) ).

IMHO, we need to cater for both possible use cases if possible. Your first option actually accomplishes it, too, but I took a different approach. Instead of considering null value as "the same as on platform" (which is your approach), I require the escape char to be not null and default it to a platform dependent value (i.e. the only difference between your and my approach is that the user always explicitly sees what the escape char is in my case, whereas in your case they would have to infer it from docs and the platform the script is on).

Considering that the fix is currently only meant for 3.2.x which supports both approaches, I don't think we need to change it. But I am easily convinced otherwise and if you would rather have it the way you outlined I can easily change it.

I would not go for your second option though - the legacy behavior of "blindly" splitting by a space is just wrong and having to define an escape character to be able to handle quotes is counter-intuitive.
Comment 9 Larry O'Leary 2014-01-08 16:13:54 EST
Okay. Makes sense.

However, you do bring up an interesting point that now has me concerned.

If I have a group of script servers -- some running on Windows and others running on Linux -- group configuration updates are broken if using the escape character when I have not explicitly set it.

In other words, the user must remember that they need to set the escape character to the same thing for all members before updating any script argument value which uses escaping.

Perhaps we should just define our own escape character and make the user use it? That way it is always the same regardless of platform? 

I don't see why we can't tell a user that backslash (\) is the escape character and for a literal \ it must be escaped as \\.

i.e. 
arg1 arg\ 2 arg_three = {arg1, arg 2, arg_three}
arg1 "arg 2" arg_three = {arg1, arg 2, arg_three}
arg1 \"arg\ 2\" arg_three = {arg1, "arg 2", arg_three}
arg1 \"arg\\ 2\" arg_three = {arg1, "arg\,  2", arg_three}

Regardless of which platform I am on, we parse them the same way. If we think we should still make the escape character configurable then just set it to backslash by default -- again, regardless of what platform.

In regards to the legacy option, the more I think about it the better it sounds. What if user's are already using ^ or \ in their script server's arguments? This new behavior would break existing configuration. Unless we treat null as legacy behavior with no escaping performed.
Comment 10 Lukas Krejci 2014-01-09 02:12:02 EST
You make a good point about supporting the legacy mode.

I don't like \ as the default escape character because of the fact that it is separator in paths on Windows. Paths are quite likely to be part of the arguments (or at least they are as probable as anything else) and thus we would be breaking all existing script servers on windows with paths in their arguments upon upgrade.

So I guess we can change the impl to do the following:
1) new boolean plugin config property "Enable support for quotes and escapes in arguments", defaulting to false to maintain the legacy behavior
2) The default escape character should be neutral and different from \ or ^. Perhaps '$' (literal $ escaped as $$)?

I like this better than falling back to legacy behavior when escape is undefined because of its explicitness. Having the escape character set to something non-standard by default IMHO makes the user think about how it is going to affect the arguments, which the default might not (I'd rather have them have a wth moment which will make them think than blindly accepting something they know and assume, not thinking about the consequences).
Comment 11 Larry O'Leary 2014-01-09 13:15:02 EST
Perhaps this requires more discussion on rhq-devel.

I don't like the idea of introducing a new boolean that enables/disables escaping. It should be available by default. For legacy support, the escape character being null should be enough to handle existing resource configuration.

For new resources, it seems that escaping should just be supported and a consistent and common default be available. But your concern with file system paths is a very big one.

Perhaps we don't do full escaping? In other words, \ only acts as an escape if used outside of a quoted string and followed by a ", ', or space or if used inside a quoted string followed by a matching quote?

This should handle my examples from comment 9 plus the following:

/a C:\TEMP\MyConfig.ini arg_three = {/a, C:\TEMP\MyConfig.ini, arg_three}
/a C:\TEMP\My\ Config.ini arg_three = {/a, C:\TEMP\My Config.ini, arg_three}
/a C:\TEMP\My\\ Config.ini arg_three = {/a, C:\TEMP\My\,  Config.ini, arg_three}
/a "C:\TEMP\My\\ Config.ini" arg_three = {/a, C:\TEMP\My\\ Config.ini, arg_three}


Of course this still doesn't address the issue that a Windows path with non-quoted escaping could be:
    /a C:\TEMP\My^ Config.ini arg_three 

which one would expect to become:
   {/a, C:\TEMP\My Config.ini, arg_three}

but instead, due to a single escape character used by the script server plug-in would become:
   {/a, C:\TEMP\My^,  Config.ini, arg_three}

... However, perhaps that is still valid once the command is executed? Most likely not though. I suspect Windows would treat it similar to:
    myscript /a "C:\TEMP\My^" " Config.ini" arg_three


The only alternative I can see is perhaps we make a single consistent escape character part of a future major release and temporarily for 3.x we we use the escaping boolean approach with it set to disabled by default. This allow the user to enable it with the understanding that all backslash characters will be interpreted as escape unless enclosed in double-quotes (using POSIX standard). Therefore if they want to enable the use of escaping to support spaces in a single argument they will need to ensure they enclose all Windows file system paths in double-quotes or change the backslash in the path to a forward slash.
Comment 12 Larry O'Leary 2014-01-17 06:07:45 EST
@Lukas, any update on this?
Comment 13 Lukas Krejci 2014-01-17 09:49:18 EST
Ok, this is getting a little bit complex.

Let's first recap what our goals for the escaping are and then discuss how to achieve them:

1) For back-compat reasons, we should just split by " " by default
2) Ideally, escaping is platform-sensitive, so that users don't need to "re-invent" their command lines
3) directory separators should not confuse the default escaping
4) The introduction of platform-sensitive escape mechanism should not confuse existing resource groups with resources on different platforms.

IMHO, the easiest solution to satisfy all of above constraints is to do the following:
1) Introduce 'escape char' plugin config prop, defaulting to empty string meaning dummy escape by space
2) Update the description of args config props to explain the situation with escaping

This achieves the goals in the following manner:
1) default escape char is empty, which translates to dummy splitting by " "
2) The prop descriptions will detail what the escape chars should be for Win and *nix so users can provide proper escaping chars without "thinking".
3) Again by an informed choice, users can avoid the clash of escape char and dir separator
4) We keep the default behavior after upgrade, so pre-existing groups will function as before. If users need escaping of args on a group that contains scripts for both Windows and *nix (how frequent is this going to be?) then they will need to choose an escape char that will fit them in this advanced usecase.

The only thing I am a little bit unhappy about is the keeping the default of "dummy" behavior. But in my eyes keeping backwards compatibility is unachievable without that.

If you agree with that I will commit the necessary changes to both branches...
Comment 14 Larry O'Leary 2014-01-17 11:54:35 EST
Are we going to handle quoting though? That is the real issue here. Escaping only comes into play when you need to use quotes inside the quote. In both Windows and Linux, quoting is used to separate arguments.

"argument one" "argument two"
'argument one' 'argument two'

Without the need for escaping. If the quote is to be used within the quoted argument, it would to be escaped:

'O\'Leary'

And finally, quoting can be avoided if escaping the space:

argument\ one argument\ two

In which case, space is still used as the separator. This is true for both Windows and POSIX.

As for the escape character, the description should not recommend one over the other but instead provide a common example for Windows and Linux.

... A character that represents an escape that will force the character following it to be used as a literal. For Linux this is normally the backslash (\) and for Windows the caret (^).
Comment 15 Lukas Krejci 2014-01-24 11:43:48 EST
Well, this is where I really start to prefer the additional flag of "enable quoting and escaping" (defaulting to false), because in order to retain full back-compat, we'd have to switch off quoting by default, too.

IMHO, it'd be more confusing to realize that by defining an escape character one also enables quoting of the parameters, than it would be to explicitly check one more checkbox.

As for quoting behavior - your suggestion to stick with POSIX escaping (i.e. escaping in non-quoted and double-quoted strings, no escaping in single-quoted strings) makes most sense. As far as I read, there is no such standard behavior on Windows, but because we need to provide some behavior, why not stick with the proven and easily understandable standard.

Btw. I assume you did a mistake in comment #11 where you suggested escaping is not done in double-quoted strings. Am I right in assuming that? As far as I am aware these are the rules for escaping:
1) In unquoted: \ preserves the value of any following character
2) In double-quoted: \ escapes $ `  " and \ otherwise is left as is
3) In single-quoted: no escaping can occur

Wrt POSIX-compliant escaping: Currently I don't do it that way, I just replace every \x with x (where x stands for any character). POSIX requires more elaborate rules (above). Considering the fact that we a) do not provide variable expansion nor back-ticks and b) the special chars are different on windows anyway, do you consider this a drawback or rather something that would simplify life for users if documented properly? We can change this to be more in line with POSIX to only escape " and escape char in double-quoted but I'd not touch $ or ` because that could look like we should be supporting them, which I think is out of scope.

Alternatively, we can change the boolean "enable quoting and escaping" to "quoting and escaping mode" with choices "Simple", "POSIX", "None". 

"None" would mean backwards compatible and would be default. 

"Simple" would be what we have now (escape anything in unquoted, something in double-quoted, nothing in single-quoted).

For the "POSIX" option, implementing the parameter and arithmetic expansion and command substitution possible under POSIX after the $ sign is a bit involved (http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html). So I'm not so much in favor of implementing "POSIX". But I am easily convinced otherwise ;)
Comment 16 Larry O'Leary 2014-01-24 14:25:30 EST
(In reply to Lukas Krejci from comment #15)
> Well, this is where I really start to prefer the additional flag of "enable
> quoting and escaping" (defaulting to false), because in order to retain full
> back-compat, we'd have to switch off quoting by default, too.

My preference would be to default to false for existing and new resources for 3.2. However, for 3.3 default to false for existing resources (which do not define a value) and true for newly created resource -- Considering this is something we should have been doing the hole time. Which brings me back to my point of not even needing the boolean in some future release. 

> IMHO, it'd be more confusing to realize that by defining an escape character
> one also enables quoting of the parameters, than it would be to explicitly
> check one more checkbox.

My point is that quoting is an expected behavior in all operating systems. Command arguments are delimited by a space unless they are enclosed in quotes. The fact that we are would separate escaping from quoting just doesn't make sense to me. Quoting is a fact of many standards. Even SQL-92 defines quoting as a means of using a space for a command argument.

As it stands currently, quoting is completely broken and can't have any real meaning or value. Once we introduce quoting as a means to define single arguments that contain spaces we introduce the need for escaping to support parameters that may actually contain a single quote or double quote as part of the value (i.e. surname=O'Leary).

> As for quoting behavior - your suggestion to stick with POSIX escaping (i.e.
> escaping in non-quoted and double-quoted strings, no escaping in
> single-quoted strings) makes most sense. As far as I read, there is no such
> standard behavior on Windows, but because we need to provide some behavior,
> why not stick with the proven and easily understandable standard.

Right. Supporting two different escape characters doesn't make sense to me. We don't even have to use POSIX -- we could come up with our own escaping style/syntax. However, POSIX seems to make the most sense as most operating systems support it and it seems to be well defined.

> Btw. I assume you did a mistake in comment #11 where you suggested escaping
> is not done in double-quoted strings. Am I right in assuming that? As far as
> I am aware these are the rules for escaping:
> 1) In unquoted: \ preserves the value of any following character
> 2) In double-quoted: \ escapes $ `  " and \ otherwise is left as is
> 3) In single-quoted: no escaping can occur

Correct. My point was that in non-quoted use, \ acts as an escape for the character that follows. However, if double quotes are used, it is restricted to a list of characters as you describe. 

> Wrt POSIX-compliant escaping: Currently I don't do it that way, I just
> replace every \x with x (where x stands for any character). POSIX requires
> more elaborate rules (above). Considering the fact that we a) do not provide
> variable expansion nor back-ticks and b) the special chars are different on
> windows anyway, do you consider this a drawback or rather something that
> would simplify life for users if documented properly? We can change this to
> be more in line with POSIX to only escape " and escape char in double-quoted
> but I'd not touch $ or ` because that could look like we should be
> supporting them, which I think is out of scope.

My only concern with escaping in double-quotes is with Windows file system paths. My goal was to prevent "C:\TEMP\My Config.ini" from becoming "C:TEMPMy Config.ini". 

I agree that we don't have to worry about $ or ` as they have no meaning to us at this time. If that changes, we can control how variable substitution is handled.

> Alternatively, we can change the boolean "enable quoting and escaping" to
> "quoting and escaping mode" with choices "Simple", "POSIX", "None". 
> 
> "None" would mean backwards compatible and would be default. 
> 
> "Simple" would be what we have now (escape anything in unquoted, something
> in double-quoted, nothing in single-quoted).
> 
> For the "POSIX" option, implementing the parameter and arithmetic expansion
> and command substitution possible under POSIX after the $ sign is a bit
> involved
> (http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html).
> So I'm not so much in favor of implementing "POSIX". But I am easily
> convinced otherwise ;)

At first, I like this idea. However, it seems like it is unnecessary. We can define escaping however we want. I only suggest that we be POSIX like. However, we should not mention POSIX at all. Instead, we should just define and document what is supported. So perhaps our implementation is:

 1) In unquoted: \ preserves the value of any following character
 2) In double-quoted: \ escapes " and \ otherwise is left as is
 3) In single-quoted: no escaping can occur

And most importantly, an argument enclosed in " or ' is treated as a single argument. And due to the escaping rule, any " or ' proceeded by a \ which is not already enclosed in " or ', will become a literal " or ' and not be treated as a argument boundary.
Comment 17 Jirka Kremser 2014-01-31 06:06:44 EST
commit in the release branch: 8058e6ba2
Comment 18 Simeon Pinder 2014-02-18 10:08:40 EST
Moving to ON_QA as available for testing in the following brew build:
https://brewweb.devel.redhat.com//buildinfo?buildID=336752

Note: the installed version is still JON 3.2.0.GA by design and this represents part of the payload for JON 3.2.1 also known as cumulative patch 1 for 3.2.0.GA.  How this will be delivered to customers is still being discussed.
Comment 19 Jan Bednarik 2014-02-24 06:39:10 EST
Verified on JON 3.2.1 DR1 (Build Number: c758688:4c03150)

I experimented on both quoted (" and ') and unquoted arguments with escape character set to \ " or ' -> all worked.
Comment 20 Mike Foley 2014-05-08 13:44:17 EDT
JON 3.2.1 released week of 5/5/2014

Note You need to log in before you can comment on or make changes to this bug.