Bug 142265 - readline can not edit utf-8 multi-byte characters (backspace)
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: ftp
Version: 3
Hardware: i686 Linux
Priority: medium, Severity: medium
Assigned To: Tim Waugh
QA Contact: Ben Levenson
Reported: 2004-12-08 12:23 EST by Doncho N. Gunchev
Modified: 2007-11-30 17:10 EST (History)
Doc Type: Bug Fix
Last Closed: 2005-01-12 10:30:10 EST

Attachments:
ftp-locale.patch (459 bytes, patch)
2004-12-15 12:13 EST, Tim Waugh

Description Doncho N. Gunchev 2004-12-08 12:23:42 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5)
Gecko/20041111 Firefox/1.0

Description of problem:
    With readline-4.3-13 (the current FC3 package), erasing UTF-8 multi-byte
characters with backspace fails.

Version-Release number of selected component (if applicable):
readline-4.3-13

How reproducible: Always

Steps to Reproduce:
1. install FC3 and update
2. start python/use bash to 'read' variables
3. type some utf-8 multi-byte characters and use backspace

Actual Results:    backspace erases byte by byte

Expected Results:  backspace erasing character by character

Additional info:
The following example can reproduce it (tested with konsole and
gnome-terminal, xterm shows nothing, the word 'Проба' is in Bulgarian
and means 'Test'):
--- cut ---
$ echo -n "Test: "; read a; echo "[$a]"
Test: Проба
<You can paste this, then hit backspace 4 times to get the next line>
Test: П
<Press Enter here and you get>
[Про]
$ _
--- cut ---
    If I erase just one character I get '[Проб�'. Pádraig Brady noted that
readline 5 probably fixes this:
http://www.redhat.com/archives/fedora-devel-list/2004-December/msg00259.html
(the message has charset="utf-8", but the Cyrillic letters are
scrambled). See http://cnswww.cns.cwru.edu/php/chet/readline/CHANGES for
readline 5 info. This affects bash (the read builtin only; command-line
editing works), python, and probably all other console applications
that use readline.
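
The transcript above can be modelled roughly as follows. This is only an illustrative sketch, not code from bash, readline, or the kernel: the helper name bytewise_backspace is hypothetical, and it assumes that each backspace drops exactly one byte from the stored line while blanking one single-width cell on screen.

```python
def bytewise_backspace(text, presses):
    """Model an editor that erases one byte (and one screen cell) per press."""
    raw = text.encode("utf-8")
    # What ends up in the variable: the byte string minus one byte per press.
    stored = raw[: max(len(raw) - presses, 0)]
    # What stays visible: the terminal drew one cell per character, so the
    # screen loses one character per press (assumes single-width characters).
    shown = text[: max(len(text) - presses, 0)]
    # A dangling lead byte is rendered as U+FFFD, like a terminal would.
    return shown, stored.decode("utf-8", errors="replace")


shown, stored = bytewise_backspace("Проба", 4)
print(f"screen: {shown!r}, variable: {stored!r}")

_, stored_one = bytewise_backspace("Проба", 1)
print(f"after one press: {stored_one!r}")
```

Under those assumptions, four presses leave one letter on screen but three letters in the variable, and a single press leaves a dangling lead byte that shows up as a replacement character.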
Comment 1 Tim Waugh 2004-12-08 12:49:46 EST
Two points to make to clarify this issue:

1. You are not using readline at all when you use bash's read builtin like that.
 You must pass the -e option for that: "read -e a"

2. The bash package (only) in FC3 is already using readline 5.

So, combining the fact that "read -e" with bash *does* handle UTF-8 correctly
(try it) and it fails as you say in Python, it looks like readline 5 does fix this.
Comment 2 Tim Waugh 2004-12-08 12:51:37 EST
Actually, hang on -- does python use readline?  It doesn't link to it.
Comment 3 Tim Waugh 2004-12-08 12:52:23 EST
Oh, never mind, it dlopens it or something similar.
Comment 4 Doncho N. Gunchev 2004-12-08 22:11:49 EST
Yes, I confirm comment # 1: 'read -e -p "Test: " a; echo "[$a]"' works fine.
What do bash's read (without -e), pdksh, and zsh (which even prints every
UTF-8 multi-byte character as two) use, for example? Their own code? Does this
mean every single console application has to be checked? vi works, jed fails...
Comment 5 Tim Waugh 2004-12-09 04:56:57 EST
Can't speak for other applications, but "read" without -e on bash just invokes
the read() system call, so you just get tty line discipline.
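
The difference between the two behaviours comes down to what a single erase removes from the pending line. The sketch below is illustrative only and is not taken from the kernel's line discipline or from readline; erase_last_byte and erase_last_char are hypothetical names. It relies on the fact that UTF-8 continuation bytes always have the bit pattern 10xxxxxx.

```python
def erase_last_byte(buf: bytes) -> bytes:
    """Byte-oriented erase: always drops exactly one byte."""
    return buf[:-1]


def erase_last_char(buf: bytes) -> bytes:
    """UTF-8-aware erase: drops one whole encoded character."""
    end = len(buf)
    if end == 0:
        return buf
    end -= 1
    # Continuation bytes look like 0b10xxxxxx; keep stepping back until
    # the lead byte (or the start of the buffer) is reached.
    while end > 0 and (buf[end] & 0xC0) == 0x80:
        end -= 1
    return buf[:end]


line = "Тест".encode("utf-8")
print(erase_last_byte(line))                  # ends in a dangling lead byte
print(erase_last_char(line).decode("utf-8"))  # one whole letter removed
```

The byte-oriented version leaves an incomplete sequence behind, which matches the symptom reported for plain "read".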
Comment 6 Pádraig Brady 2004-12-09 06:40:24 EST
Note "read -e" works for bash-2.05b-20.1 (Red Hat 9) also.
I don't think that was using readline 5?

What exactly did you do to get python to fail?
python on RH9 seems to work for me, at least.
Comment 7 Doncho N. Gunchev 2004-12-09 07:46:33 EST
    Reply to comment # 6:
1. start python on the command line in say konsole.
2. Copy and paste this at the '>>> ' prompt (you should see "Tect",
but it consists of Cyrillic letters): "Тест"
3. Start erasing with backspace - you should be able to erase '>>> '
too. If you paste 2 letters - you should be able to erase it to '>>'.
    I found this problem playing with trac-admin.
Comment 8 Pádraig Brady 2004-12-14 08:39:55 EST
Can't reproduce this on RH9:
python-2.2.2-26
readline-4.3-5
Comment 9 Tim Waugh 2004-12-15 12:12:45 EST
Hang on a minute, I think this is an optical illusion.

Python doesn't (of its own accord) call setlocale(LC_CTYPE,""), so of _course_
it won't handle multibyte issues when you just invoke it and start typing.

However, if you try this:

python
>>> import locale
>>> locale.setlocale (locale.LC_CTYPE, "")
>>> Тест

then backspace until the prompt, you'll find that you cannot delete it. 
Multibyte is now being handled correctly.

So python is a bad choice for demonstrating readline bugs.

Similarly, /usr/bin/ftp seems to neglect to call setlocale(LC_CTYPE,""), so
that's an ftp bug.

However, lftp *does* do the right thing, and shows that readline is not the culprit.

The only bug here is in ftp.
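
For reference, the call in question can be wrapped like this in a Python script. This is a minimal sketch, not the contents of any patch on this bug; adopt_environment_ctype is a hypothetical helper name. Note that the comment above describes the interpreter of the time, and a newer interpreter may already have adopted the environment's LC_CTYPE at startup, in which case "before" and "after" will match.

```python
import locale


def adopt_environment_ctype():
    """Switch LC_CTYPE to whatever the environment (LANG, LC_*) requests."""
    # Calling setlocale() with no locale argument only queries the setting.
    before = locale.setlocale(locale.LC_CTYPE)
    try:
        # The empty string means "use the user's environment".
        after = locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # The environment names a locale that is not installed; keep the old one.
        after = before
    return before, after


before, after = adopt_environment_ctype()
print("LC_CTYPE before:", before)
print("LC_CTYPE after: ", after)
```

A program that never makes this call stays in the "C" locale, where libraries such as readline have no reason to treat input as multi-byte.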
Comment 10 Tim Waugh 2004-12-15 12:13:53 EST
Created attachment 108633 [details]
ftp-locale.patch

..and here's the fix.
Comment 11 Tim Waugh 2004-12-15 12:21:50 EST
Fixed in CVS.
Comment 13 Doncho N. Gunchev 2004-12-16 05:32:16 EST
(In reply to comment #9)
> Hang on a minute, I think this is an optical illusion.
> 
> Python doesn't (of its own accord) call setlocale(LC_CTYPE,""), so of _course_
> it won't handle multibyte issues when you just invoke it and start typing.
> 
> However, if you try this:
> 
> python
> >>> import locale
> >>> locale.setlocale (locale.LC_CTYPE, "")
> >>> Тест
> 
> then backspace until the prompt, you'll find that you cannot delete it. 
> Multibyte is now being handled correctly.
> 
    A complicated story. I was told on the -devel list that it could be
readline, AFAIR; that's why I filed it against readline...
    I don't agree it's only an ftp bug. I think if Fedora's default encoding is
UTF-8, then all programs should use/default to it. If I'm right, isn't this a
'distribution' bug? Or do I/we have to test every single program (this part
cannot be avoided) and file separate bugs for each? Why does python "work" on
RH9? I'm a bit confused... what now?
Comment 14 Tim Waugh 2004-12-16 05:54:31 EST
Python cannot call setlocale() on its own -- it isn't allowed to.  That's up to
the Python *script* that it runs.  Python is just the interpreter.

Feel free to file bugs against any other readline-using applications that
exhibit this bad behaviour, but bear in mind that program interpreters are
sometimes a special case.
Comment 15 Doncho N. Gunchev 2004-12-22 01:33:37 EST
    About Comment #9: it's not /usr/bin/ftp, which works fine for me
(ftp-0.17-22); /usr/kerberos/bin/ftp does not. This means the component is
krb5-workstation, not ftp, right (or maybe both)?
    What about bash, does the direct read() system call make it buggy?
Comment 16 Tim Waugh 2004-12-22 04:18:16 EST
Doncho: ftp-0.17-22 fails for me, so I think you are testing in a different way.
 A fix has been committed to CVS.

I think that the krb5 ftp intentionally avoids linking to readline (but I may
be wrong).

No, bash is not buggy: see the documentation and understand the -e parameter to
the read builtin.
