Bug 447960 - Mutt converts mail to legacy character set
Summary: Mutt converts mail to legacy character set
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: mutt
Version: 9
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Miroslav Lichvar
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-05-22 16:43 UTC by David Woodhouse
Modified: 2008-05-27 10:18 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2008-05-27 10:18:49 UTC
Type: ---
Embargoed:



Description David Woodhouse 2008-05-22 16:43:23 UTC
It's the 21st Century. We should be using UTF-8.

But when I compose a mail in mutt using non-ascii characters, it'll send it as
iso8859-1 (unless I force the issue by using characters not in iso8859-1 either).

Comment 1 Miroslav Lichvar 2008-05-22 16:49:32 UTC
How is the send_charset variable set?

Comment 2 David Woodhouse 2008-05-22 17:14:09 UTC
That would be in .muttrc? I don't have a .muttrc -- I only fired up mutt to
reproduce this bug when it was reported to me.

Comment 3 David Woodhouse 2008-05-22 17:17:20 UTC
Setting it to 'us-ascii:utf-8' in /etc/Muttrc fixes the problem. Please consider
doing this in our package (and likewise for any other cases where we might use
legacy 8-bit charsets instead of utf-8).

Thanks.

Comment 4 Miroslav Lichvar 2008-05-23 09:58:41 UTC
What exactly is the problem caused by using iso-8859-1?

"us-ascii:iso-8859-1:utf-8" is the upstream default, and there may still be
clients that can handle iso-8859-1 but not utf-8.
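
For readers unfamiliar with send_charset: it is a colon-separated list, and mutt
uses the first charset into which the message converts exactly. The sketch below
only imitates that selection rule with Python's own codecs; the function name
pick_charset is hypothetical and this is not mutt's code.

```python
def pick_charset(text: str, send_charset: str) -> str:
    """Return the first charset in the colon-separated list that can encode
    the text exactly (an imitation of the send_charset rule, not mutt's code)."""
    for name in send_charset.split(":"):
        try:
            text.encode(name)  # strict by default: raises if a character does not fit
        except UnicodeEncodeError:
            continue
        return name
    raise ValueError("no listed charset can represent the text")


upstream_default = "us-ascii:iso-8859-1:utf-8"
proposed = "us-ascii:utf-8"

# Pure ASCII, Latin-1 only, and a body with a character outside Latin-1.
for body in ("naive", "naïve", "naïve €"):
    print(repr(body), pick_charset(body, upstream_default), pick_charset(body, proposed))
```

With the upstream default, a body containing only Latin-1 characters stops at
iso-8859-1; dropping that entry from the list makes the same body go out as utf-8.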

Comment 5 David Woodhouse 2008-05-23 10:34:33 UTC
iso8859-1 is a legacy character set.

One of the problems with charset labelling is that the labels tend to "fall
off". If we were good at labelling everything and keeping the labels, things
would have been a lot easier and a lot of the motivation for UTF-8 wouldn't have
been there.

One of the main advantages of our "UTF-8 everywhere" policy is that the
labelling increasingly doesn't matter -- it's more and more valid just to assume
UTF-8. Anything generated within your own system (be that a single computer, a
network of computers, whatever) should be in UTF-8.

In this particular case, for example, I should be able to grep my local maildir
of outgoing mail for 'naïve' and get the expected result. I wouldn't expect to
find my own outgoing mail encoded in iso8859-1, euc-jp, or anything but UTF-8
(or ascii, as a subset of UTF-8).
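
A rough illustration of the grep case, assuming the message body is stored as
raw 8-bit text (no quoted-printable or base64) and the search runs from a UTF-8
terminal. The helper grep_bytes is hypothetical and only mimics a byte-wise
search.

```python
def grep_bytes(stored: bytes, word: str) -> bool:
    """Byte-wise search, as from a UTF-8 terminal: the pattern is the
    UTF-8 encoding of the word, whatever the stored file's charset is."""
    return word.encode("utf-8") in stored


# The same sentence as it would sit on disk under each charset.
sent_as_latin1 = "a naïve example".encode("iso-8859-1")
sent_as_utf8 = "a naïve example".encode("utf-8")

print(grep_bytes(sent_as_latin1, "naïve"))  # the Latin-1 copy
print(grep_bytes(sent_as_utf8, "naïve"))    # the UTF-8 copy
```

The search only matches the copy whose bytes agree with the terminal's charset,
which is why a mixed-charset maildir gives incomplete results.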

Another case where it causes problems is sending patches for the Linux kernel.
The kernel is in UTF-8, and patches which are charset-converted will either not
apply, or will add invalid text to the kernel (git-am will do charset conversion
from legacy charsets on the cosmetic parts, including the commit comment, but it
leaves the body of the patch exactly as it was).
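
To make "invalid text" concrete: a patch line converted to iso8859-1 is no
longer well-formed UTF-8. This is only a sketch; the patch line and the helper
is_valid_utf8 are made up for illustration and are unrelated to git's own
checks.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# A made-up patch line containing one non-ASCII character.
patch_line = "+\t/* a naïve comment */\n"

as_written = patch_line.encode("utf-8")         # what the author intended
as_converted = patch_line.encode("iso-8859-1")  # what a charset-converting MUA sends

print(is_valid_utf8(as_written))
print(is_valid_utf8(as_converted))
```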


Comment 6 Miroslav Lichvar 2008-05-23 11:16:10 UTC
Ok, I'll change it in the next release. Thanks.

Comment 7 Patrice Dumas 2008-05-23 12:15:52 UTC
I think that using iso-latin1 when there are only iso-latin1 characters
is safer. The receiving system may be much older than Fedora and may not
know about utf8, and though some systems may not know about utf8, all
should know about latin1.

Comment 8 Miroslav Lichvar 2008-05-27 10:18:49 UTC
It seems that thunderbird is also using iso8859-1 by default, so mutt isn't the
only MUA in Fedora behaving like this.

I'll leave it as it is for now. Users that need utf8 to be used can always
override the setting in .muttrc.

