From Bugzilla Helper:
User-Agent: Mozilla/5.0 (compatible; Konqueror/3.3; Linux; X11; i686; fr, en_US, de, ja) (KHTML, like Gecko)

Description of problem:
Whenever I try to mix CJK ideograms and Latin-1 characters in a file, emacs fails. If I try to input something like "fenêtre sur 庭 " (the last character is &#24237;) via emacs, it complains that utf-8 is not appropriate. kwrite and yudit handle a file containing this string just fine. And if I try to reopen this file in emacs, I get: "Fenêtre sur \345\272\255 " (which is the correct UTF-8 encoding of the ideogram).

Version-Release number of selected component (if applicable):
emacs-21.3-12

How reproducible:
Always

Steps to Reproduce:
1. Open a file encoded in UTF-8 in emacs
or
2. Try to save a file in UTF-8 in emacs

Actual Results:
As far as I can see, only the Latin-1 characters are properly interpreted.

Expected Results:
All characters should be correctly handled in UTF-8.
It sounds like you're not using MuleUCS: emacs-21.3 can't handle CJK utf-8 without it. If you're in a non-CJK locale, you need to add something like (require 'un-define) to your ".emacs" file.
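A minimal sketch of that ".emacs" addition, assuming the MuleUCS package is installed and "un-define" can be found on the load-path. The condition-case wrapper is an optional addition, not part of the suggestion above; it only keeps Emacs starting cleanly if the library is missing.

```elisp
;; ~/.emacs
;; Load MuleUCS's Unicode definitions so that emacs-21.3 can read and
;; write CJK characters in UTF-8 outside a CJK locale.
;; Assumption: MuleUCS is installed and un-define is on the load-path.
(condition-case err
    (require 'un-define)
  ;; If the library cannot be loaded, report it and carry on with startup.
  (error (message "MuleUCS (un-define) not loaded: %s"
                  (error-message-string err))))
```

A bare (require 'un-define) line is enough if you would rather get an error at startup when MuleUCS is missing.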
> add (require 'un-define) to .emacs

Done that, thanks. It just works now. Sorry for filing a bug about this, but the behaviour was surprising: I expected it to work "out of the box" without having to tweak .emacs. Perhaps it would be nice to document this somewhat better.
Perhaps it should have gone into the release notes at some point. The reason un-define is not loaded by default for all locales is that it takes quite some time to load and is normally only needed by CJK users. (For a CJK utf-8 locale it is loaded by default; see "lang-coding-systems-init.el".) The next version of Emacs should have much better built-in utf-8 support, so MuleUCS should no longer be needed then.