Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1139667

Summary: Special characters in remote operation parameters causes various issues
Product: [Retired] JBoss BPMS Platform 6
Reporter: Ivo Bek <ibek>
Component: Business Central
Assignee: Shelly McGowan <smcgowan>
Status: CLOSED EOL
QA Contact: Ivo Bek <ibek>
Severity: medium
Docs Contact:
Priority: high
Version: 6.1.0
CC: ibek, kverlaen
Target Milestone: ER6
Target Release: 6.1.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-03-27 19:43:48 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments: server log (flags: none)

Description Ivo Bek 2014-09-09 11:53:52 UTC
Description of problem:

The form parameters of the REST API and of commands allow special characters in the parameters of some operations, such as starting a process.

Below I describe two issues I ran into when I tried some of the special characters. In each example I start a process with the tested parameter.

1) Is it intended that the REST API removes quotes from the passed strings when they are at the beginning or end?

I mention the beginning and ending because the following example works:

 > This is \"quoted string\" that works.

However, after I send the parameter below, I don't get the same text back when I retrieve the ProcessInstanceVariableLog:

 > "quoted string"

expected:<"["quoted string"]"> but was:<"[quoted string]"> -- this is a case for REST API

The JMS API, which uses StartProcessCommand, works well with "quoted string".

2) A string containing an ampersand cannot be passed via StartProcessCommand (REST or JMS). However, the REST API start process operation works well with an ampersand inside a string:

 > String with ampersand &.

I get Exception - Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 390; The entity name must immediately follow the '&' in the entity reference.
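
The parse failure can be reproduced outside of BPMS with the JDK's own XML parser. The following is a minimal sketch, not the actual serialization path of StartProcessCommand: the `<value>` element and the class name `AmpersandParseDemo` are hypothetical and only stand in for whatever XML payload carries the parameter.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class; not part of BPMS.
class AmpersandParseDemo {

    /** Returns true if the given text parses as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // SAXParseException is a subclass of SAXException.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A bare '&' starts an entity reference, so the parser rejects it.
        System.out.println(isWellFormed("<value>String with ampersand &.</value>"));
        // The same text with the ampersand written as an entity is accepted.
        System.out.println(isWellFormed("<value>String with ampersand &amp;.</value>"));
    }
}
```

The first call reports the document as not well-formed, the second as well-formed, which matches the behaviour described in issue 2.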

Comment 1 Ivo Bek 2014-09-09 11:54:21 UTC
Created attachment 935651 [details]
server log

Comment 4 Marco Rietveld 2015-02-12 12:48:46 UTC
This note is here for others to read:

This is a relatively expensive feature to support. In short, this is because we cannot predict what inputs a user will use.

For example, we could try to encode *only* the characters that cause problems, such as '&' for JAXB. But how would we encode them? We cannot simply substitute "XAMPX" for '&' in the input, because "XAMPX" might also be in the input! When we decode the string on the server side, how do we know which "XAMPX" should be changed back into '&' and which should stay "XAMPX"?
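
The ambiguity can be shown in a few lines. This is only an illustrative sketch: the class `NaiveSubstitutionDemo` and its methods are hypothetical and are not code from the product.

```java
// Hypothetical demo of the naive "XAMPX" substitution; not BPMS code.
class NaiveSubstitutionDemo {

    /** Client side: replace every '&' with the marker "XAMPX". */
    static String encode(String s) {
        return s.replace("&", "XAMPX");
    }

    /** Server side: replace every marker "XAMPX" with '&'. */
    static String decode(String s) {
        return s.replace("XAMPX", "&");
    }

    public static void main(String[] args) {
        // Input without the marker survives the round trip.
        System.out.println(decode(encode("Tom & Jerry")));   // Tom & Jerry

        // Input that already contains the marker does not:
        // the literal "XAMPX" is indistinguishable from an encoded '&'.
        System.out.println(decode(encode("XAMPX & co")));    // & & co
    }
}
```

The second call returns a string that differs from its input, so the scheme cannot be used for arbitrary user data.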

The only solution here is thus to uniformly encode all input, regardless of whether or not it contains special characters: however, encoded strings are at least 2 times and sometimes up to 4 times as long, not to mention that there's a performance penalty for encoding and decoding the string. 

As a result, the fix here is optional and turned off by default. See other comments below for more information.

Comment 7 Ivo Bek 2015-02-12 14:59:34 UTC
Lowering severity, since there is a workaround: pass already encoded strings, such as:

"String with ampersand &amp;."
"&quot;quoted string&quot;"

from the examples in the BZ description.
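
As a rough illustration of that workaround, a client could escape the five predefined XML entities before sending a parameter. The sketch below is hand-written for this purpose; the class `XmlEscapeSketch` is hypothetical, it is not the fix that went into the product, and it does not address characters that are not legal in XML at all.

```java
// Hypothetical client-side helper; not part of BPMS.
class XmlEscapeSketch {

    /** Replaces the five predefined XML special characters with entities. */
    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("String with ampersand &."));
        // prints: String with ampersand &amp;.
        System.out.println(escape("\"quoted string\""));
        // prints: &quot;quoted string&quot;
    }
}
```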

Comment 8 Marco Rietveld 2015-02-12 17:03:35 UTC
Ivo, 

In addition to &, >, <, ' and ", there are also all of the characters outside of the set that the XML standard mandates:

http://www.w3.org/TR/xml/#charsets

My main concern is that I want a solution that I don't ever have to worry about again, as opposed to a whitelist solution. Adding a "whitelist" of characters that should be encoded is, in my opinion, a piece of code that will come back to poke me a couple of times. Maybe there is (or will be) an additional character out there that a client will use -- or maybe there are or will be characters in languages like Chinese or Arabic that fall outside of the XML specification: if that happens, I don't want to have to reengineer this code.

In other words, this is a small problem, I don't want to write a solution for it that I or other engineers might have to care for later, especially if there's a solution that takes care of it that requires less maintenance. 


Does that make sense to you? Are there points that I've gotten wrong or have otherwise missed?

Comment 9 Ivo Bek 2015-02-13 12:48:20 UTC
Sure, there might also be other characters, and I did not mean creating a whitelist and replacing the characters manually. I tried the Arabic and Chinese alphabets and these work well.

I propose to use https://commons.apache.org/proper/commons-lang/javadocs/api-3.1/org/apache/commons/lang3/StringEscapeUtils.html#escapeXml%28java.lang.String%29 for escaping the XML content. I believe that it should cover all the most common characters. I share your fear, but I would say that the Apache Commons library will solve it well enough. WDYT?

Comment 12 Ivo Bek 2015-03-09 16:32:04 UTC
Verified in BPMS 6.1.0.ER6 though I mentioned some concerns in Comment 11.

Comment 13 Marco Rietveld 2015-03-09 17:45:08 UTC
Ivo, 

Sorry for the late answer. 

No problem and good point: my original thought was that using the library would be too much of a performance cost. Also, the changes I added already fit into the existing code framework, whereas using StringEscapeUtils would mean that I would have to refactor some code.

Later, I realized that the network would always cost more than this. 

If you want, feel free to open a bug concerning this for 6.2. What do you think?