75681 – UTF-8 vs ISO-8859-1, norwegian characters garbled

Summary: UTF-8 vs ISO-8859-1, norwegian characters garbled

Keywords:
Status:	CLOSED DUPLICATE of bug 75280
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	xchat
Sub Component:
Version:	8.0
Hardware:	i386
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Mike A. Harris
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2002-10-10 23:46 UTC by Andreas-Johann Ulvestad
Modified:	2008-05-01 15:38 UTC
CC List:	2 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2002-11-01 06:26:20 UTC
Embargoed:

Description Andreas-Johann Ulvestad 2002-10-10 23:46:30 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 Galeon/1.2.6 (X11; Linux i686; U;) Gecko/20020827

Description of problem:
When writing Norwegian characters in X-Chat, the other side sees them as garbled
text when they are using ISO-8859-1 (in all honesty, 99.9% do!). Shouldn't the
program be running in non-UTF-8 mode?

Comment 1 Thomas Schuett 2002-10-12 18:47:33 UTC

Possible workaround:

1. move xchat to xchat.bin
2. create a shell-script xchat:

---snip

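#!/bin/sh
# Drop the UTF-8 locale setting so xchat starts in the default "C" locale.
unset LANG
# Run the renamed binary, passing along any command-line arguments.
exec xchat.bin "$@"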

----snip end

Not nice, but it works for me...

Comment 2 Andreas Rogge 2002-10-17 21:20:12 UTC

The problem is true for German umlauts, too.

The fix also works for me.

Comment 3 Mike A. Harris 2002-10-23 21:48:36 UTC

No, this is not a bug.  When people communicate over IRC, and are
using non-ASCII text, then everyone who is involved in the communication
must use the same 8-bit encoding.  The IRC protocol does not specify
any particular encoding, other than that it must be 8-bit.

As such, if you are using UTF-8, and someone else is not, they will
rightfully see garbled characters.  The solution is either for them
to configure their client to use UTF-8 as well, or for you to configure
your client to use the encoding that they are using.

This is something that can _only_ be done by the persons involved
in communication, and is not something that can be autodetected or
otherwise occur without user intervention.  The same thing occurs
if one user is using ISO-8859-1, and another user is using
ISO-8859-9 or some other encoding.  The only difference with UTF-8
is that non-ASCII characters are encoded as a multiple byte stream
of up to 6 bytes depending on where the characters fall into Unicode
space.  So a single foreign non-ASCII character encoded as UTF-8,
might appear on another computer as 2-6 bytes of random garbage
if the other computer is using ISO-8859-x or some other 8-bit encoding.
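
The effect described above can be reproduced outside of any IRC client.
What follows is only a minimal sketch, assuming a POSIX shell and an iconv
that knows the ISO-8859-1 and UTF-8 charsets (glibc's does); the variable
names are made up for illustration.  It takes the UTF-8 bytes for the
Norwegian letters "æøå" and reads them the way an ISO-8859-1 client would:

```shell
#!/bin/sh
# The three Norwegian letters "æøå" as UTF-8 bytes, written as octal
# escapes so that the script itself stays pure ASCII.
utf8_text=$(printf '\303\246\303\270\303\245')

# What the sender puts on the wire: 6 bytes for 3 characters.
sent_bytes=$(printf '%s' "$utf8_text" | wc -c | tr -d ' ')

# What an ISO-8859-1 receiver makes of it: every byte is taken as one
# character.  Converting that reading back to UTF-8 lets a UTF-8
# terminal display what the other side would see.
garbled=$(printf '%s' "$utf8_text" | iconv -f ISO-8859-1 -t UTF-8)

echo "bytes sent: $sent_bytes"
echo "receiver sees: $garbled"
```

On a UTF-8 terminal this should report 6 bytes sent and show two junk
characters for every letter typed (Ã¦Ã¸Ã¥), since characters in the
ISO-8859-1 range take exactly two bytes each in UTF-8.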

Again, this isn't a bug - things are working very much the way they
should be.  Configure your xchat client to use the same encoding that
others you communicate with are using, or convince everyone else to
use UTF-8, or just deal with the inconvenience until everyone is using
UTF-8.  Either way, there isn't anything that software can do about
it.

Closing as NOTABUG

Comment 4 Mike A. Harris 2002-10-23 21:57:25 UTC

This "issue" is being reported a LOT lately, and it's consuming a rather
large amount of time merely to just respond to people and explain the
reason why it is not a bug, and why it is expected behaviour that two
people communicating through networking, or two applications exchanging
text must agree upon an encoding to use that they both understand.

Since this isn't a bug, there is absolutely nothing that I or anyone
else can do about the 'problem'.  I'm making this report the master bug
against which future reports on the same topic will be closed as
duplicates.

In short, the end resolution of all such problems boils down to:

         "Unicode Growing Pains(TM)"

Once all systems in common usage are using Unicode by default (like
most other operating systems have been for about 10 years), then
issues like this will become non-issues, and internationalized text
in Linux won't suck anymore.

What would be really really nice, is if a new RFC came out for IRC
which stated that any compliant IRC client and server for the new
protocol *must* use the UTF-8 Unicode encoding for text transmission.
That would solve the problem quickly.  But I'm dreaming....  that
won't happen.

Oh well...

Comment 5 Mike A. Harris 2002-11-01 06:26:51 UTC

Reclosing bug report as duplicate...

*** This bug has been marked as a duplicate of 75280 ***

Comment 6 Petri T. Koistinen 2002-11-02 22:42:44 UTC

I added LC_CTYPE="fi_FI@euro" to /etc/sysconfig/i18n so that X-Chat will work OK.
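
A single LC_CTYPE line can be enough because of the POSIX locale lookup
order: LC_ALL is checked first, then the category variable (here
LC_CTYPE), and only then LANG.  So LC_CTYPE changes the character set
while LANG keeps controlling everything else.  A rough sketch of that
lookup order follows; the function effective_ctype is made up for
illustration and is not part of any tool, and the final fallback to "C"
is what typical systems do rather than a guarantee:

```shell
#!/bin/sh
# Print the locale name that would govern character handling.
# Arguments: $1 = LC_ALL, $2 = LC_CTYPE, $3 = LANG ("" means unset).
# The first non-empty value wins; with none set, assume the "C" locale.
effective_ctype() {
    for candidate in "$1" "$2" "$3"; do
        if [ -n "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    printf 'C\n'
}

# Only LANG set, as on a default UTF-8 install.
effective_ctype "" "" "en_US.UTF-8"

# With the extra LC_CTYPE line from /etc/sysconfig/i18n in effect.
effective_ctype "" "fi_FI@euro" "en_US.UTF-8"
```

The first call prints en_US.UTF-8 and the second prints fi_FI@euro,
which is why the added line takes X-Chat out of UTF-8 mode without
touching LANG.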
