Bug 397021 - Problems converting to iso-2022-jp//translit
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: glibc
Hardware: All
OS: Linux
Priority: low
Severity: medium
Assigned To: Jakub Jelinek
QA Contact: Brian Brock
Reported: 2007-11-23 11:03 EST by John Haxby
Modified: 2008-05-21 12:52 EDT
Fixed In Version: RHBA-2008-0083
Doc Type: Bug Fix
Last Closed: 2008-05-21 12:52:50 EDT

Attachments
UTF8 character sequence used in describing the problem. (73 bytes, text/plain)
2007-11-23 11:03 EST, John Haxby

Description John Haxby 2007-11-23 11:03:46 EST
Description of problem:

While I'm logging this problem against RHEL5, it's actually present in all the
versions of glibc that I've been able to track down: everything from 2.3.2 (on
RHEL3) to 2.7 (on Fedora 8).

How reproducible:


Steps to Reproduce:
1. With a UTF-8 locale, e.g. en_GB.UTF-8:

    echo £€ | iconv -t iso2022jp//translit | iconv -f iso2022jp
    iconv -t iso2022jp//translit < attachment
    echo -e '\xe3\x88\xb1' | iconv -t iso2022jp//translit

The attachment is a series of UTF-8 characters, some of which can be translated
to iso-2022-jp and some of which (the numbered bullets, for example) cannot.

Actual results:

   First command:
     £鍍iconv: illegal input sequence at position 7

   Second command:
     ^[$BF|K\8l^[(B ^[$B5!<o0MB8J8;z(1)(2)(3)iiiiiiIIIIII(^[$B3t^[(B)

   and "(^[$B3t^[(B)" is repeated forever -- iconv never completes.

   Third command:
     no output, iconv just consumes 100% CPU until you get bored :-)
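
Because the second and third commands never finish on an affected glibc, it can
help to run them under a time limit when checking whether a given system shows
the problem.  The following is a minimal Python sketch and not part of the
original report; the helper name run_bounded and its default limits are
assumptions chosen for illustration.

```python
import subprocess
import tempfile


def run_bounded(cmd, stdin_bytes, seconds=5.0, keep=4096):
    """Run cmd with stdin_bytes on stdin and give up after `seconds`.

    Returns (status, output_prefix): status is the exit code, or the string
    "timeout" if the process had to be killed.  Output goes to a temporary
    file rather than a pipe so that a runaway process cannot fill this
    process's memory; only the first `keep` bytes are read back.  The
    temporary file itself can still grow for up to `seconds`.
    """
    with tempfile.TemporaryFile() as out:
        proc = subprocess.Popen(
            cmd,
            stdin=subprocess.PIPE,
            stdout=out,
            stderr=subprocess.DEVNULL,
        )
        try:
            proc.communicate(stdin_bytes, timeout=seconds)
            status = proc.returncode
        except subprocess.TimeoutExpired:
            proc.kill()          # stop the spinning process
            proc.communicate()   # reap it
            status = "timeout"
        out.seek(0)
        return status, out.read(keep)
```

Assuming iconv is on the PATH, a call such as
run_bounded(["iconv", "-t", "iso2022jp//translit"], b"\xe3\x88\xb1\n") would be
expected to report "timeout" on an affected system and an exit code otherwise.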

Expected results:

   The first command should produce "£EUR" because, while there is a sterling
symbol in iso-2022-jp, there is no Euro symbol.  The illegal input sequence is
the result of not shifting back to ASCII after emitting the sequence that
represents the sterling symbol.  You can see what happens by comparing the
output from converting just a £ to iso-2022-jp with the combined output.
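
To make the "shifting back to ASCII" point concrete: ISO-2022-JP switches
character sets with three-byte escape sequences (ESC ( B for ASCII, ESC $ B for
JIS X 0208-1983, and so on), and any ASCII text that follows a double-byte run
is only valid after an ESC ( B.  The Python sketch below splits a byte string
into runs by escape sequence.  The function name split_segments is hypothetical,
and the example uses Python's own iso2022_jp codec rather than glibc's iconv.

```python
# Escape sequences defined for ISO-2022-JP (RFC 1468).
ESCAPES = {
    b"\x1b(B": "ASCII",
    b"\x1b(J": "JIS X 0201 Roman",
    b"\x1b$@": "JIS X 0208-1978",
    b"\x1b$B": "JIS X 0208-1983",
}


def split_segments(data):
    """Split ISO-2022-JP bytes into (charset, payload) runs.

    Returns (segments, final_charset).  A well-formed stream ends with
    final_charset == "ASCII".
    """
    segments = []
    charset = "ASCII"  # the encoding always starts in ASCII
    start = i = 0
    while i < len(data):
        esc = data[i:i + 3]
        if esc in ESCAPES:
            if i > start:
                segments.append((charset, data[start:i]))
            charset = ESCAPES[esc]
            i += 3
            start = i
        else:
            i += 1
    if start < len(data):
        segments.append((charset, data[start:]))
    return segments, charset


# Well-formed: the encoder shifts back to ASCII before " abc".
good = "日本語 abc".encode("iso2022_jp")
good_segments, good_final = split_segments(good)

# Hand-made broken stream with the ESC ( B missing: " abc" is now part of the
# double-byte run, so a decoder would try to read it as JIS X 0208 pairs.
bad = b"\x1b$BF|K\\8l abc"
bad_segments, bad_final = split_segments(bad)
```

In the well-formed case the text after the kanji lands in its own ASCII run; in
the broken case it is swallowed by the JIS X 0208 run, which is the kind of
stream that the second iconv in the first command rejects.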

   The second command is seriously problematic.  In a program that is converting
a fairly short string in a buffer to another in a buffer that grows as needed,
the target buffer will grow arbitrarily large, or it would if the OOM killer
didn't step in.
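
A caller that grows its output buffer on demand can protect itself from this
kind of runaway by capping the output size relative to the input.  The Python
sketch below shows the idea only; it is not the glibc fix.  The names
collect_bounded and OutputLimitExceeded, the ratio of 8 and the slack of 64
bytes are all assumptions made for illustration, and `chunks` stands in for
whatever produces converted output piece by piece.

```python
class OutputLimitExceeded(RuntimeError):
    """Raised when converted output grows implausibly large."""


def collect_bounded(chunks, input_len, max_ratio=8, slack=64):
    """Concatenate converter output, refusing to grow past a cap.

    The cap is input_len * max_ratio + slack bytes.  Both numbers are
    illustrative guesses, not values taken from any standard.
    """
    limit = input_len * max_ratio + slack
    buf = bytearray()
    for chunk in chunks:
        buf += chunk
        if len(buf) > limit:
            raise OutputLimitExceeded(
                "output exceeded %d bytes for %d bytes of input"
                % (limit, input_len)
            )
    return bytes(buf)
```

With a guard like this, a converter that keeps emitting the same fragment
forever produces an error after a bounded amount of work instead of exhausting
memory.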

   The third command extracts just one character from the UTF-8 sequence
(represented as three bytes), and iconv spins on it.
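
For reference, the three bytes in the third command are the UTF-8 encoding of
U+3231 (PARENTHESIZED IDEOGRAPH STOCK), whose Unicode compatibility
decomposition is an ideograph between ASCII parentheses.  That matches the
"(^[$B3t^[(B)" fragment repeated in the second command's output.  The Python
sketch below uses NFKC normalisation as a stand-in for //translit.  This is an
assumption: glibc has its own transliteration tables, which may differ, and
approximate_translit is a hypothetical helper name.

```python
import unicodedata

raw = b"\xe3\x88\xb1"            # bytes from the third command
char = raw.decode("utf-8")       # a single code point
name = unicodedata.name(char)    # its Unicode character name


def approximate_translit(text, encoding="iso2022_jp"):
    """Encode text, replacing unencodable characters by their NFKC form.

    Each character is examined exactly once, so the loop always terminates.
    If the NFKC form is itself unencodable, the final encode() raises
    UnicodeEncodeError instead of looping.
    """
    out = []
    for ch in text:
        try:
            ch.encode(encoding)
            out.append(ch)
        except UnicodeEncodeError:
            out.append(unicodedata.normalize("NFKC", ch))
    return "".join(out).encode(encoding)


encoded = approximate_translit(char)
```

The point of the sketch is that the replacement text is encodable in
ISO-2022-JP, so a single pass over the input is enough and there is no reason
for a conversion of this character to run indefinitely.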

Additional info:

I strongly suspect all three problems are different aspects of the same bug.
This bug has been around for quite a while, and it did not come to light until a
colleague in Japan was testing support for some of the more unusual characters
used in Japanese text (characters that aren't actually in ISO-2022-JP but are in
a common extension, CP50221 aka ISO-2022-JP-MS).  This rather unfortunate
behaviour has been causing chaos!
Comment 1 John Haxby 2007-11-23 11:03:46 EST
Created attachment 267661
UTF8 character sequence used in describing the problem.
Comment 3 RHEL Product and Program Management 2008-01-08 09:04:49 EST
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 7 errata-xmlrpc 2008-05-21 12:52:50 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

