Bug 144707
Summary: non-CJK text broken by default for Western locale
Product: [Fedora] Fedora
Component: emacs
Version: rawhide
Hardware: All
OS: Linux
Status: CLOSED NEXTRELEASE
Severity: medium
Priority: low
Reporter: James Ralston <ralston>
Assignee: Chip Coldwell <coldwell>
CC: petersen
Keywords: FutureFeature, Reopened
Doc Type: Enhancement
Last Closed: 2006-11-06 16:32:00 UTC

[Bah, I didn't realize the Bugzilla beta doesn't wrap lines anymore.
Here's a problem report without the lines wrapped.]
I have two machines running FC2. On those machines, I have no
problems with emacs and Japanese text (hiragana, katakana, kanji). I
can visit files containing Japanese text encoded in UTF-8; emacs
autodetects UTF-8 encoding and displays the text properly. I can save
buffers containing Japanese text using the utf-8 coding system.
In short, on the FC2 machines, Japanese text in emacs just works.
I just loaded FC3 on a third machine. On the FC3 machine, I can't get
emacs Japanese text support working. If I visit a buffer containing
Japanese text encoded in UTF-8, I just get a bunch of gibberish
(backslash octal characters and empty boxes). I can paste Japanese
text (copied from another application) into Emacs buffers, and it
displays properly, but then if I attempt to save the buffer, I receive
this message:
> These default coding systems were tried:
> utf-8
> However, none of them safely encodes the target text.
>
> Select one of the following safe coding systems:
> euc-jp shift_jis iso-2022-jp iso-2022-jp-2 x-ctext
> japanese-iso-7bit-1978-irv iso-2022-7bit raw-text emacs-mule
> no-conversion iso-2022-7bit-lock-ss2 ctext-no-compositions
> iso-2022-8bit-ss2 iso-2022-7bit-lock iso-2022-7bit-ss2
> tibetan-iso-8bit-with-esc thai-tis620-with-esc lao-with-esc
> korean-iso-8bit-with-esc hebrew-iso-8bit-with-esc
> greek-iso-8bit-with-esc iso-latin-9-with-esc iso-latin-8-with-esc
> iso-latin-5-with-esc iso-latin-4-with-esc iso-latin-3-with-esc
> iso-latin-2-with-esc iso-latin-1-with-esc
> in-is13194-devanagari-with-esc cyrillic-iso-8bit-with-esc
> chinese-iso-8bit-with-esc japanese-iso-8bit-with-esc
I cannot see how this message can be correct, because UTF-8 encodes
*everything*.
Emacs Japanese text support worked just fine on FC2, but now it
appears to be broken on FC3. All other FC3 applications I've used
have worked just fine; it only seems to be emacs that is broken. Does
anyone know what's going on and how to fix it?
(I did an "Everything" install when I loaded my FC3 machine, so I am
hoping that the problem is something simple, like I accidentally
included some bogus "backwards-compatibility" package designed for
Japanese text support in the days before UTF-8.)

It seems to work fine for me. Have you tried with "emacs -q"?

Running "emacs -q" yields the same results. I will go post on gnu.emacs.help and see if anyone there has any ideas.

Created attachment 110618
screenshot of Emacs failing to save a buffer with Japanese text

I posted to both gnu.emacs.help and gnu.emacs.bug:
http://groups-beta.google.com/group/gnu.emacs.help/browse_thread/thread/2bc4eb72af963da3/c85c2cf0be051ef6
http://groups-beta.google.com/group/gnu.emacs.bug/browse_thread/thread/1f5edc323f8c05d9/035de9a8da1d3782
The single reply that I received to my gnu.emacs.help post doesn't seem to be relevant.
I am open to the possibility that I am doing something wrong. However, all available evidence indicates that this is a bug with Emacs, especially since everything worked under FC2. I've attached a self-explanatory screenshot that describes the problem.
If you could hit some of the Emacs developers over the head with this, I'd appreciate it.

Well this is the first report of this problem I've heard so I suspect something is wrong on your side. Are there any local modifications to the site config files for emacs? Does "rpm -V emacs emacs-common" output anything?

Peter Salvi finally clued me in to the solution back in August 2005.
The problem is that Unicode support isn't loaded by default. You need to specifically load Unicode support by adding the following line to your .emacs file:

    (require 'un-define)

The above line is all that's necessary, but from Googling, I also found this suggestion:

    ;;; Load Unicode support.
    (when (locate-library "un-define")
      (require 'un-define)
      (require 'unicode)
      ;; requiring unidata is optional:
      (require 'unidata))

In either case, this is *completely* unintuitive. I spent weeks researching this problem, and I never once encountered *any* mention that Unicode support had to be specifically enabled.
It's 2006. The fact that a program that is Unicode-capable requires specific contortions to enable Unicode support is mind-boggling.
I'm guessing the only reason Emacs doesn't load Unicode support by default is because it dramatically increases Emacs' startup time (try launching Emacs with and without having the un-define line in your .emacs file and you'll see what I mean). But that's no excuse--Unicode support should be enabled by default, and if people complain about poor startup times, then the solution is to fix the startup time with Unicode support enabled, not break Unicode support.

Well that is not strictly correct: Emacs supports utf-8 for non-Asian characters by default, but not Asian characters. So yes, if you're in a European/Western locale then you need to load un-define yourself since it slows down Emacs startup significantly for users that don't need Asian language support.
un-define is set to load at startup when Emacs is started in an Asian locale. So if you want to make use of that you can set LC_CTYPE=ja_JP.UTF-8 for example for Emacs, or borrow the setup code in lang-coding-systems-init.el for your .emacs if you prefer not to do that.
(BTW in Emacs 22 un-define will no longer be necessary so that will be a big win.)

I suggest adding this to dotemacs.el

    ;;; uncomment for CJK utf-8 support for non-Asian users
    ;; (require 'un-define)

However the problem here really is that emacs22 has not been released yet.

devel: fixed in 21.4-18
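
For anyone landing here with the same symptom, the workaround from the comments above can be put into ~/.emacs roughly like this. It is only a sketch, assuming Emacs 21 with the un-define library on the load path. The version check is an assumption based on the remark that Emacs 22 will no longer need un-define; it is not taken from the Fedora dotemacs.el.

```elisp
;; Sketch for ~/.emacs: CJK UTF-8 support when Emacs 21 runs in a Western locale.
;; Assumption: releases from Emacs 22 on do not need un-define, so skip it there.
(when (and (< emacs-major-version 22)
           ;; Only load the library if it is actually installed.
           (locate-library "un-define"))
  (require 'un-define))
```

Guarding with locate-library keeps startup from failing on machines where the library is absent. The startup delay mentioned above still applies whenever the library does get loaded.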

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040922

Description of problem:
Identical to the wrapped problem report above.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Run emacs and try to use Japanese text.

Actual Results: See above.

Expected Results: See above.

Additional info: