Bug 65251 - Bash and readline don't support multibyte locales
Summary: Bash and readline don't support multibyte locales
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: distribution
Version: 7.3
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: wdovlrrw
QA Contact: Ben Levenson
URL:
Whiteboard:
Depends On:
Blocks: 65252 79579
Reported: 2002-05-20 22:27 UTC by Owen Taylor
Modified: 2007-04-18 16:42 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2003-01-15 13:48:01 UTC
Embargoed:


Description Owen Taylor 2002-05-20 22:27:54 UTC
Readline doesn't have any idea of anything but single-byte encodings.
This means that if you are running in a locale such as en_US.UTF-8
and have multibyte characters on the command line, then editing
will edit partial bits of characters, resulting in incorrect
behavior and invalid text.
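
To illustrate the failure mode described above, here is a minimal sketch in C. It is not taken from readline or from any of the patches mentioned in this bug; the helper names byte_prev and utf8_prev_char are hypothetical. It contrasts moving the cursor back by one byte (what a single-byte editor does) with moving back to the start of the previous UTF-8 character.

```c
#include <stddef.h>

/* Hypothetical helper: byte-oriented cursor movement, as a single-byte
   editor would do it. Always steps back exactly one byte. */
static size_t byte_prev(size_t cursor)
{
    return cursor ? cursor - 1 : 0;
}

/* Hypothetical helper: UTF-8-aware cursor movement. Steps back over any
   continuation bytes (bit pattern 10xxxxxx) so the result lands on the
   first byte of the previous character. Assumes buf holds valid UTF-8. */
static size_t utf8_prev_char(const char *buf, size_t cursor)
{
    size_t i;

    if (cursor == 0)
        return 0;
    i = cursor - 1;
    while (i > 0 && ((unsigned char)buf[i] & 0xC0) == 0x80)
        i--;
    return i;
}
```

With the five-byte string "caf\xc3\xa9" ("café" in UTF-8) and the cursor at offset 5, byte_prev gives offset 4, which is in the middle of the two-byte "é". Deleting from there leaves a stray 0xC3 byte, which is the kind of invalid text this report is about. utf8_prev_char gives offset 3, the start of the character.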

A partial patch adding UTF-8 support to bash can be found at:

 http://www.tldp.org/HOWTO/Unicode-HOWTO-4.html#ss4.1

I haven't tried it out or studied it in detail. 

Explicit UTF-8 has:

 - The advantage of knowing the encoding, so it's easier to deal
   with things like invalid sequences in a robust manner.

 - The disadvantage, compared to using generic functions
   like mblen, mbtowc, of working only in UTF-8 locales, 
   and not in other multibyte locales like ja_JP.eucJP.

(I'm filing this against bash because it has its own copy of readline
and bash is the most important place this will be noticed. The
same thing seems to apply to readline as well. I don't think
any support for multibyte encodings is really needed in bash
other than in readline.)

Comment 1 Owen Taylor 2002-05-23 19:14:46 UTC
There is a big, ugly, but plausible-looking multibyte-support-for-bash patch at:

http://oss.software.ibm.com/developer/opensource/linux/patches/?patch_id=34

Comment 2 Karsten Hopp 2002-07-13 00:02:56 UTC
Phil has looked into this and hopefully fixed it. CC:'d him for confirmation.

Comment 3 Jay Turner 2003-01-10 13:41:33 UTC
What's the status of this?  Can we close it out?

Comment 4 Miloslav Trmac 2003-01-10 13:44:58 UTC
I'm running RHL 8.0 with bash-2.05b-7 (IIRC) from rawhide without problems
with UTF-8.

Comment 5 Jay Turner 2003-01-15 13:48:01 UTC
Closing out based on feedback from reporter.
