Bug 467735

Summary: Multi-byte characters in Routing key or Queue name are encoded wrong in Java via JMS
Product: Red Hat Enterprise MRG
Reporter: David Sommerseth <davids>
Component: qpid-java
Assignee: Arnaud Simon <asimon>
Status: CLOSED ERRATA
QA Contact: Kim van der Riet <kim.vdriet>
Severity: high
Priority: medium
Version: 1.0
CC: freznice, gsim, ovasik
Target Milestone: 1.1
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-02-04 15:37:17 UTC

Description David Sommerseth 2008-10-20 16:14:57 UTC
When creating routing keys or queue names containing multi-byte characters, they are encoded incorrectly.  Exchange names are encoded correctly.

Creating new exchange as 'Testdata 3 - æ Æ'
Creating new queue as 'Testdata 2 - å Å'
Creating new routing key as 'Testdata 1 - ø Ø'

ø = 0xC3 0xB8
Ø = 0xC3 0x98

å = 0xC3 0xA5
Å = 0xC3 0x85

æ = 0xC3 0xA6
Æ = 0xC3 0x86
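
The byte values listed above are the standard UTF-8 encodings of these characters. A minimal sketch for checking them with the JDK alone follows; the class name Utf8Bytes and the helper utf8Hex are made up for illustration and are not part of Qpid.

```java
import java.nio.charset.StandardCharsets;

class Utf8Bytes {
    // Returns the UTF-8 encoding of s as space-separated 0xNN values.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so negative bytes print as two hex digits.
            sb.append(String.format("0x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Escaped code points keep the example independent of the source file encoding.
        String[] chars = {"\u00F8", "\u00D8", "\u00E5", "\u00C5", "\u00E6", "\u00C6"};
        for (String c : chars) {
            System.out.println(c + " = " + utf8Hex(c));
        }
    }
}
```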


The low-level AMQP API in Java seems to handle this correctly:
----------------------------------------------------------
2008-oct-20 18:01:00 trace RECV [10.34.32.136:56082]: Frame[BEbe; channel=1; {ExchangeDeclareBody: exchange=Testdata 3 - \C3\A6 \C3\86; type=direct; alternate-exchange=; arguments={}; }]
2008-oct-20 18:01:00 debug ca0af174-312b-4dbe-bb81-9ce53da63cb5@guest@QPID: recv cmd 0: {ExchangeDeclareBody: exchange=Testdata 3 - \C3\A6 \C3\86; type=direct; alternate-exchange=; arguments={}; }

2008-oct-20 18:01:00 trace RECV [10.34.32.136:56082]: Frame[BEbe; channel=1; {QueueDeclareBody: queue=Testdata 2 - \C3\A5 \C3\85; alternate-exchange=; arguments={}; }]
2008-oct-20 18:01:00 debug ca0af174-312b-4dbe-bb81-9ce53da63cb5@guest@QPID: recv cmd 1: {QueueDeclareBody: queue=Testdata 2 - \C3\A5 \C3\85; alternate-exchange=; arguments={}; }

2008-oct-20 18:01:00 trace RECV [10.34.32.136:56082]: Frame[BEbe; channel=1; {ExchangeBindBody: queue=Testdata 2 - \C3\A5 \C3\85; exchange=Testdata 3 - \C3\A6 \C3\86; binding-key=Testdata 2 - \C3\A5 \C3\85; arguments={}; }]
2008-oct-20 18:01:00 debug ca0af174-312b-4dbe-bb81-9ce53da63cb5@guest@QPID: recv cmd 2: {ExchangeBindBody: queue=Testdata 2 - \C3\A5 \C3\85; exchange=Testdata 3 - \C3\A6 \C3\86; binding-key=Testdata 2 - \C3\A5 \C3\85; arguments={}; }
----------------------------------------------------------


But when the same names are processed through the JMS API, the encoding is corrupted.

Sending data via JMS API:
----------------------------------------------------------
2008-oct-20 18:01:01 trace RECV [10.34.32.136:56083]: Frame[BEbe; channel=1; {ExchangeDeclareBody: exchange=Testdata 3 - \C3\A6 \C3\86; type=direct; alternate-exchange=; arguments={}; }]
2008-oct-20 18:01:01 debug 1596a536-ab0c-4f85-9158-1139d831e349@guest@QPID: recv cmd 0: {ExchangeDeclareBody: exchange=Testdata 3 - \C3\A6 \C3\86; type=direct; alternate-exchange=; arguments={}; }

2008-oct-20 18:01:01 trace RECV [10.34.32.136:56083]: Frame[be; channel=1; header (123 bytes); properties={{MessageProperties: content-length=961; message-id=812064e0-613f-376a-9ac4-9fec216bc8b2; content-type=application/octet-stream; }{DeliveryProperties: priority=4; delivery-mode=1; timestamp=1224518461017; expiration=0; exchange=Testdata 3 - \C3\A6 \C3\86; routing-key=Testdata 2 - \EF\BF\A5 \EF\BF\85; }}]

2008-oct-20 18:01:01 warning DirectExchange Testdata 3 - \C3\A6 \C3\86 could not route message with key Testdata 2 - \EF\BF\A5 \EF\BF\85
----------------------------------------------------------
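
The "could not route" warning is what an exact-match key lookup produces once the key has been altered. The bytes \EF\BF\A5 and \EF\BF\85 in the routing key are the UTF-8 encodings of U+FFE5 and U+FFC5, not of å (U+00E5) and Å (U+00C5). The sketch below only illustrates that mismatch; DirectRoutingDemo is a hypothetical stand-in built on a plain map, not the broker's DirectExchange implementation, and the queue name "some-queue" is invented.

```java
import java.util.HashMap;
import java.util.Map;

class DirectRoutingDemo {
    // Hypothetical stand-in for a direct exchange: binding key -> queue name.
    private final Map<String, String> bindings = new HashMap<>();

    void bind(String bindingKey, String queue) {
        bindings.put(bindingKey, queue);
    }

    // Returns the bound queue, or null when no binding key matches exactly.
    String route(String routingKey) {
        return bindings.get(routingKey);
    }

    public static void main(String[] args) {
        DirectRoutingDemo exchange = new DirectRoutingDemo();
        // Binding key with the intended characters (U+00E5, U+00C5).
        exchange.bind("Testdata 2 - \u00E5 \u00C5", "some-queue");

        // Same characters: the lookup finds the queue.
        System.out.println(exchange.route("Testdata 2 - \u00E5 \u00C5"));
        // Characters as seen in the JMS trace (U+FFE5, U+FFC5): no match, prints null.
        System.out.println(exchange.route("Testdata 2 - \uFFE5 \uFFC5"));
    }
}
```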


Receiving data via JMS API:
----------------------------------------------------------
2008-oct-20 18:01:01 trace RECV [10.34.32.136:56084]: Frame[BEbe; channel=1; {ExchangeDeclareBody: exchange=Testdata 3 - \C3\A6 \C3\86; type=direct; alternate-exchange=; arguments={}; }]
2008-oct-20 18:01:01 debug 9b0da973-c587-4b94-867c-38e9a4dadac0@guest@QPID: recv cmd 0: {ExchangeDeclareBody: exchange=Testdata 3 - \C3\A6 \C3\86; type=direct; alternate-exchange=; arguments={}; }

2008-oct-20 18:01:01 trace RECV [10.34.32.136:56084]: Frame[BEbe; channel=1; {QueueDeclareBody: queue=Testdata 2 - \EF\BF\A5 \EF\BF\85; alternate-exchange=; arguments={}; }]
2008-oct-20 18:01:01 debug 9b0da973-c587-4b94-867c-38e9a4dadac0@guest@QPID: recv cmd 2: {QueueDeclareBody: queue=Testdata 2 - \EF\BF\A5 \EF\BF\85; alternate-exchange=; arguments={}; }

2008-oct-20 18:01:01 debug Configured queue with no-local=0
2008-oct-20 18:01:01 debug Configured queue Testdata 2 - \EF\BF\A5 \EF\BF\85 with qpid.trace.id='' and qpid.trace.exclude='' i.e. 0 elements

2008-oct-20 18:01:01 debug 9b0da973-c587-4b94-867c-38e9a4dadac0@guest@QPID: receiver marked completed: 2 incomplete: { } unknown-completed: { [1,2] }
2008-oct-20 18:01:01 trace RECV [10.34.32.136:56084]: Frame[BEbe; channel=1; {ExecutionSyncBody: }]
2008-oct-20 18:01:01 debug 9b0da973-c587-4b94-867c-38e9a4dadac0@guest@QPID: recv cmd 3: {ExecutionSyncBody: }
2008-oct-20 18:01:01 debug 9b0da973-c587-4b94-867c-38e9a4dadac0@guest@QPID: receiver marked completed: 3 incomplete: { } unknown-completed: { [1,3] }
2008-oct-20 18:01:01 trace SENT [10.34.32.136:56084]: Frame[BEbe; channel=1; {SessionCompletedBody: commands={ [1,3] }; }]
2008-oct-20 18:01:01 trace RECV [10.34.32.136:56084]: Frame[BEbe; channel=1; {ExchangeBindBody: queue=Testdata 2 - \EF\BF\A5 \EF\BF\85; exchange=Testdata 3 - \C3\A6 \C3\86; binding-key=Testdata 2 - \C3\A5 \C3\85; arguments={x-match:(0x95)V2:3:}; }]
2008-oct-20 18:01:01 debug 9b0da973-c587-4b94-867c-38e9a4dadac0@guest@QPID: recv cmd 4: {ExchangeBindBody: queue=Testdata 2 - \EF\BF\A5 \EF\BF\85; exchange=Testdata 3 - \C3\A6 \C3\86; binding-key=Testdata 2 - \C3\A5 \C3\85; arguments={x-match:(0x95)V2:3:}; }

2008-oct-20 18:01:01 trace RECV [10.34.32.136:56084]: Frame[BEbe; channel=1; {MessageSubscribeBody: queue=Testdata 2 - \EF\BF\A5 \EF\BF\85; destination=1; accept-mode=0; acquire-mode=0; resume-id=; resume-ttl=0; arguments={}; }]
2008-oct-20 18:01:01 debug 9b0da973-c587-4b94-867c-38e9a4dadac0@guest@QPID: recv cmd 6: {MessageSubscribeBody: queue=Testdata 2 - \EF\BF\A5 \EF\BF\85; destination=1; accept-mode=0; acquire-mode=0; resume-id=; resume-ttl=0; arguments={}; }

2008-oct-20 18:01:01 debug No messages to dispatch on queue 'Testdata 2 - \EF\BF\A5 \EF\BF\85'
----------------------------------------------------------

According to Arnaud, the exchange name is encoded correctly while the routing key and queue name are not because of the use of AMQShortString.
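
This report does not show the faulty code, so the following is only an assumption about the mechanism. The corrupted bytes in the trace (\EF\BF\A5 where \C3\A5 was expected) are consistent with a char being narrowed to a byte and then widened back with sign extension, which turns U+00E5 into U+FFE5 before UTF-8 encoding. The sketch reproduces that pattern with the JDK alone; SignExtensionDemo and lossyRoundTrip are hypothetical names and do not claim to reflect how AMQShortString actually works.

```java
import java.nio.charset.StandardCharsets;

class SignExtensionDemo {
    // Formats bytes in the same backslash-hex style the broker log uses.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("\\%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Hypothetical lossy round trip: each char is narrowed to a byte,
    // then widened back to a char, which sign-extends values above 0x7F.
    static String lossyRoundTrip(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            byte narrowed = (byte) c;   // 0x00E5 becomes the byte 0xE5 (-27)
            sb.append((char) narrowed); // -27 widens to the char 0xFFE5
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String good = "\u00E5";
        String bad = lossyRoundTrip(good);
        System.out.println(hex(good.getBytes(StandardCharsets.UTF_8))); // \C3\A5
        System.out.println(hex(bad.getBytes(StandardCharsets.UTF_8)));  // \EF\BF\A5
    }
}
```

ASCII characters survive such a round trip unchanged, which would explain why only the multi-byte characters in the names are affected.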

Comment 1 Gordon Sim 2008-10-24 12:26:30 UTC
David reports that Arnaud has fixed this on trunk; moving to modified.

Comment 2 Arnaud Simon 2008-10-24 12:41:47 UTC
See QPID-1384
Two tests have been added in systests, see class org.apache.qpid.test.unit.message.UTF8Test

Comment 3 Frantisek Reznicek 2008-11-13 14:26:27 UTC
RHTS test qpid_i18n_multibyte_tests verifies that the Java client is now UTF-8 compatible.
Another RHTS test, 'qpid_compilation_unit_tests', also runs the Java client test org.apache.qpid.test.unit.message.UTF8Test.

Validated on RHEL 4.7 / 5.2, i386 / x86_64, with packages:
rhm-0.3.2783-1.el5, qpidd-0.3.713378-1.el5, qpidd-xml-0.3.713378-1.el5 [...]
->VERIFIED

Comment 5 errata-xmlrpc 2009-02-04 15:37:17 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-0035.html