139496 – Emacs doesn't handle utf-8 well

Summary: Emacs doesn't handle utf-8 well

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	emacs
Sub Component:
Version:	2
Hardware:	i386
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Jens Petersen
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2004-11-16 11:51 UTC by Johan Buret
Modified:	2007-11-30 22:10 UTC (History)
CC List:	0 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2004-11-16 14:57:26 UTC
Type:	---
Embargoed:
Dependent Products:

Description Johan Buret 2004-11-16 11:51:24 UTC

From Bugzilla Helper: 
User-Agent: Mozilla/5.0 (compatible; Konqueror/3.3; Linux; X11; i686; 
fr, en_US, de, ja) (KHTML, like Gecko) 
 
Description of problem: 
Whenever I try to mix CJK ideograms and Latin-1 in a file, emacs 
fails. 
 
If I try to input something like "fenêtre sur 庭" (the last character 
is &#24237;) via emacs, it complains that utf-8 is not appropriate. 
 
kwrite and yudit handle a file containing this string just fine. 
 
And, if I try to reopen this file in emacs, I have: 
"Fenêtre sur \345\272\255 " (which is the correct UTF-8 encoding of 
the ideogram) 
 
Version-Release number of selected component (if applicable): 
emacs-21.3-12 
 
How reproducible: 
Always 
 
Steps to Reproduce: 
1. Open a file encoded in UTF-8 in emacs 
Or 
2. Try to save a file in UTF-8 in emacs 
     
 
Actual Results:  As far as I see, only the Latin-1 characters are 
properly interpreted. 
 
Expected Results:  All characters should be  correctly handled in 
UTF-8

Comment 1 Jens Petersen 2004-11-16 13:49:14 UTC

It sounds like you're not using MuleUCS: emacs-21.3 can't handle
CJK utf-8 without it.

If you're in a non-CJK locale, then you need to add something like
(require 'un-define) to your ".emacs" file.
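
For reference, a minimal sketch of what that could look like in the init file. It assumes the Mule-UCS package is installed and that the per-user init file is ~/.emacs; the comments are illustrative and not taken from any packaged configuration.

```elisp
;; ~/.emacs (per-user init file; the path is an assumption)

;; Load the Unicode definitions from Mule-UCS so that emacs-21.3 can
;; read and write CJK characters in UTF-8 files.  This signals an
;; error at startup if Mule-UCS is not installed.
(require 'un-define)
```

The change only takes effect in a newly started Emacs, or after evaluating the form by hand, for example with M-x eval-buffer while visiting the init file.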

Comment 2 Johan Buret 2004-11-16 14:57:26 UTC

>add (require 'un-define) to .emacs 
Done that. 
Thanks. It just works now. 
 
Sorry for filing a bug against this, but this behaviour was 
surprising; I expected it to work "out of the box" and not have to 
tweak .emacs . 
 
Perhaps it would be nice to document this function somewhat more.

Comment 3 Jens Petersen 2004-11-17 00:27:35 UTC

Perhaps it should have gone into the release notes at some point.

The reason un-define is not loaded by default for all locales
is that it takes quite some time to load and is normally only
needed by CJK users.  (For a CJK utf-8 locale it is loaded by
default; see "lang-coding-systems-init.el".)

The next version of Emacs should have much better builtin utf-8
support so MuleUCS should no longer be needed then.
