Created attachment 814848
This test application shows the bug

Description of problem:
To handle international characters such as Korean, Japanese, and Chinese, a JSP file should contain this directive so that text is encoded and decoded properly:

    <%@ page language="java" contentType="text/html; charset=UTF-8" pageEncoding="utf8"%>

But it does not work properly. For example, take two files, A.jsp and B.jsp. A.jsp passes the string "가나다" to B.jsp as a GET parameter. In the URL I can see localhost:8080/KoreanChar/A.jsp?msg=%EA%B0%80%EB%82%98%EB%8B%A4, so the parameter "가나다" is encoded as UTF-8. However, in B.jsp the parameter is automatically decoded as ISO-8859-1, so it shows the broken characters ê°ëë¤. If I encode the broken string as ISO-8859-1 and decode it as UTF-8, I get the original "가나다" back.

Even with the following system properties set, it does not work properly:

    <system-properties>
        <property name="org.apache.catalina.connector.URI_ENCODING" value="UTF-8"/>
        <property name="org.apache.catalina.connector.USE_BODY_ENCODING_FOR_QUERY_STRING" value="true"/>
    </system-properties>

Version-Release number of selected component (if applicable):

How reproducible:
I attached KoreanChar.zip.

Steps to Reproduce:
1. Import and deploy it using JBDS or Eclipse.
2. Open http://localhost:8080/KoreanChar/
   ==> console:
       Original Msg : 가나다
       Encode OriginalMsg by ISO: %3F%3F%3F
       Encode OriginalMsg by UTF: %EA%B0%80%EB%82%98%EB%8B%A4
       Decode unicodeString by ISO : ì´ì£¼í¸
       Decode unicodeString by UTF : 가나다
       getCharacterEncoding :null
       getContentType :null
3. Input 가나다 and submit.
4. See the broken characters.
   ===> Page shows:
       getCharacterEncoding :null
       getContentType :null
       ------------------------------------------------
       #### Original Msg : ê°ëë¤######
       #### Encode OriginalMsg by ISO : %EA%B0%80%EB%82%98%EB%8B%A4######
       #### Decode EncodedMsgbyISO by UTF가나다######

Add the system properties to standalone.xml:

    <system-properties>
        <property name="org.apache.catalina.connector.URI_ENCODING" value="UTF-8"/>
        <property name="org.apache.catalina.connector.USE_BODY_ENCODING_FOR_QUERY_STRING" value="true"/>
    </system-properties>

Then test again. => Same result.

Actual results:
getCharacterEncoding :null
getContentType :null
------------------------------------------------
#### Original Parameter Msg : ê°ëë¤######

Expected results:
getCharacterEncoding :null
getContentType :null
------------------------------------------------
#### Original Parameter Msg : 가나다

Additional info:
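The workaround mentioned in the description (encode the broken string as ISO-8859-1, then decode the bytes as UTF-8) can be sketched in plain Java as follows. This is only an illustration under stated assumptions: the class name `MojibakeRepair` and the method `repair` are hypothetical and are not taken from the attached KoreanChar.zip.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the attached test application.
class MojibakeRepair {

    // Recovers text whose UTF-8 bytes were wrongly decoded as ISO-8859-1.
    // ISO-8859-1 maps every byte value to exactly one character, so the
    // original bytes can be recovered without loss and decoded again as UTF-8.
    static String repair(String misdecoded) {
        byte[] rawBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "가나다" written with Unicode escapes so the source file encoding does not matter.
        String original = "\uAC00\uB098\uB2E4";

        // Simulate the reported behaviour: UTF-8 bytes read as ISO-8859-1.
        String broken = new String(original.getBytes(StandardCharsets.UTF_8),
                                   StandardCharsets.ISO_8859_1);

        System.out.println(broken.length());                   // 9 characters, one per UTF-8 byte
        System.out.println(repair(broken).equals(original));   // true
    }
}
```

A repair step like this only hides the symptom inside the application; it assumes the container really did decode the value as ISO-8859-1, and it would corrupt values that were already decoded correctly.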
This is only about a GET request that would have its URI badly decoded. This works. It seems you tried to set all possible configuration options, and UTF-8 is mentioned just about everywhere, but USE_BODY_ENCODING_FOR_QUERY_STRING likely overrides everything (your GET has no charset to specify for its nonexistent body, and the HTTP default is not UTF-8). Encodings in URIs are not a very good idea unless you like problems ...
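To illustrate the effect described above, here is a small sketch that percent-decodes the query value from the report once as UTF-8 and once as ISO-8859-1. It uses `java.net.URLDecoder` as a stand-in; the connector's internal decoding code is not shown in this report, so treat this as an approximation of the behaviour rather than the actual implementation. The class name `QueryDecodeDemo` and the method `decode` are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the container's own decoder is not reproduced here.
class QueryDecodeDemo {

    // Percent-decodes a raw query value using the given charset.
    static String decode(String rawValue, Charset charset) {
        try {
            return URLDecoder.decode(rawValue, charset.name());
        } catch (UnsupportedEncodingException e) {
            // Cannot happen for a charset the JVM already loaded; rethrow unchecked.
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // The msg value as it appears in the URL from the report.
        String raw = "%EA%B0%80%EB%82%98%EB%8B%A4";

        String asUtf8 = decode(raw, StandardCharsets.UTF_8);
        String asLatin1 = decode(raw, StandardCharsets.ISO_8859_1);

        System.out.println(asUtf8.equals("\uAC00\uB098\uB2E4")); // true: the original three characters
        System.out.println(asLatin1.length());                   // 9: each byte became its own character
    }
}
```

The same nine bytes yield three Korean characters under UTF-8 and nine unrelated characters under ISO-8859-1, which matches the broken output in the report.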
Hi Remy,

Basically, when working with international languages, I think setting the encoding is a must to avoid character problems. As you mentioned, the HTTP default is not UTF-8, so it sometimes causes problems with languages such as Chinese, Korean and so on. Hence, options like these, which forcibly override the charset, are normally used. However, even when I tested with a JSP that contains charset=utf-8 and encoding=utf8, the problem occurred. Although I set the system properties URI_ENCODING and USE_BODY_ENCODING_FOR_QUERY_STRING to override the charset once again, the result was the same.

Actually, I didn't test POST, but I am not sure why you think encoding in the URI is not a very good idea. As I mentioned above, it is usual to encode the URI for international words. Moreover, I think it is definitely a bug that a parameter coming from the previous page is forcibly decoded as ISO-8859-1 even though the charset is defined as utf-8 at the top of the file.
Yes, you set everything you can, but that's counterproductive. So drop USE_BODY_ENCODING_FOR_QUERY_STRING.
Using the configuration proposed by Rémy (setting only the org.apache.catalina.connector.URI_ENCODING property to UTF-8), I obtained the expected, correctly encoded output:

Results:
#### Original Parameter Msg : 가나다######

Issue was verified against EAP 6.3.0.ER10.