Red Hat Bugzilla – Bug 139496
Emacs doesn't handle utf-8 well
Last modified: 2007-11-30 17:10:54 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (compatible; Konqueror/3.3; Linux; X11; i686;
fr, en_US, de, ja) (KHTML, like Gecko)
Description of problem:
Whenever I try to mix CJK ideograms and Latin-1 in a file, emacs
If I try to input something like "fenêtre sur 庭 " (the last character
is &#24237;, i.e. U+5EAD) via emacs, it complains utf-8 is not appropriate.
kwrite and yudit handle a file containing this string just fine.
And, if I try to reopen this file in emacs, I have:
"Fenêtre sur \345\272\255 " (which is the correct UTF-8 encoding of
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Open a file encoded in UTF-8 in emacs
2. Try to save a file in UTF-8 in emacs
Actual Results: As far as I see, only the Latin-1 characters are
Expected Results: All characters should be correctly handled in
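The octal escapes quoted in the description can be checked independently of emacs. The following is only a minimal Python sketch, not part of the original report: it assumes the last character is the one given by the decimal reference 24237, and the variable names are illustrative.

```python
# Encode the character from the report (decimal code point 24237) as UTF-8
# and render the bytes the way emacs displayed them (backslash + octal).
code_point = 24237                 # taken from the numeric reference in the report
char = chr(code_point)             # the CJK ideograph U+5EAD
utf8 = char.encode("utf-8")        # three bytes for this code point

# Format each byte as a three-digit octal escape such as \345.
octal_escapes = "".join(f"\\{b:03o}" for b in utf8)

print(f"U+{code_point:04X}", utf8.hex(), octal_escapes)
```

This prints `U+5EAD e5baad \345\272\255`, which matches the escapes shown when the file is reopened: the bytes on disk are valid UTF-8, and emacs is simply showing them undecoded.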
It sounds like you're not using MuleUCS: emacs-21.3 can't handle
CJK utf-8 without it.
If you're in a non-CJK locale, then you need to add something like
(require 'un-define) to your ".emacs" file, say.
>add (require 'un-define) to .emacs
Thanks, it just works now.
Sorry for filing a bug against this, but this behaviour was
surprising; I expected it to work "out of the box" and not have to
tweak .emacs.
Perhaps it would be nice to document this feature somewhat more.
Perhaps it should have gone into the release notes at some point.
The reason un-define is not loaded by default for all locales
is that it takes quite some time to load and is normally only
needed by CJK users. (For a CJK utf-8 locale it is loaded by
default; see "lang-coding-systems-init.el".)
The next version of Emacs should have much better builtin utf-8
support so MuleUCS should no longer be needed then.