Bug 128615

Summary: Character set should be ISO-2022-JP (Japanese) for ja_JP locale by default
Product: [Fedora] Fedora
Reporter: Nakai <ynakai>
Component: xchat
Assignee: Daniel Reed <djr>
Status: CLOSED NOTABUG
Severity: medium
Priority: medium
Version: rawhide
CC: eng-i18n-bugs, fedora-ja-list
Target Milestone: ---
Keywords: i18n
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2004-07-27 18:10:42 UTC

Description Nakai 2004-07-27 08:41:21 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; ja-JP; rv:1.6) Gecko/20040510

Description of problem:
The IRC charset must be ISO-2022-JP by default for Japanese locales
(ja_JP.UTF-8, ja_JP.eucJP).
So when a user runs xchat in a Japanese locale, it should set the
character set to ISO-2022-JP (Japanese) by default.
Users don't want to spend even a picosecond of their time selecting
the appropriate character set.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Run xchat in Japanese locale.
2. Join Japanese IRC channel.
3. Text in the channel is scrambled.
    

Actual Results:  Scrambled by default.

Expected Results:  ISO-2022-JP by default.

Additional info:

Red Hat/Fedora must always take responsibility for which Japanese
codeset its users post to the Internet with.

Comment 2 Daniel Reed 2004-07-27 18:10:42 UTC
Unfortunately, the IRC protocol has no standard support for
internationalization. The protocol was not designed to support
explicit negotiation of character sets between users, by channel, or
by server/network, and it was not designed with a way to add such
support as an authoritative extension (however, see below).

Traditionally, all messages sent through IRC are assumed to use a
modified form of US-ASCII (sometimes referred to as the "ascii character
set" with "rfc1459 case mappings"), and it may be considered a feature
that all messages received by IRC clients are displayed in that
character set rather than in their host session's character set (unless
explicitly reconfigured by the user).

An alternative, where the IRC client presumes that all incoming
messages are in the character set of its host session, does not sound
more reasonable than the traditional behaviour.
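To illustrate why neither default avoids the scrambling in the report,
here is a minimal Python sketch. The sample text and the variable names
are assumptions made for illustration; nothing here is taken from xchat
itself. It encodes a Japanese string as ISO-2022-JP and then decodes the
same bytes the way a client assuming US-ASCII, UTF-8 or EUC-JP would.

```python
# Hypothetical message text; any Japanese string would behave the same way.
text = "こんにちは"

# Bytes as a sender configured for ISO-2022-JP would put them on the wire.
wire = text.encode("iso2022_jp")

# ISO-2022-JP is a 7-bit encoding, so every byte is also valid US-ASCII.
assert all(b < 0x80 for b in wire)

# A client assuming US-ASCII, or the session charset of a ja_JP.UTF-8 or
# ja_JP.eucJP locale, decodes without error but shows escape sequences and
# punctuation instead of Japanese text.
as_ascii = wire.decode("ascii")
as_utf8 = wire.decode("utf-8")
as_eucjp = wire.decode("euc_jp")

# Only a client told to use ISO-2022-JP recovers the original text.
as_iso2022jp = wire.decode("iso2022_jp")

print(repr(as_ascii))
print(as_iso2022jp)
```

Because the bytes are all 7-bit, the three wrong guesses produce the
same garbage, which is consistent with the report listing both
ja_JP.UTF-8 and ja_JP.eucJP as affected.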

Another alternative may be available on servers that use the
non-standard 005 "extended version" numeric sent during connect. This
numeric exists on many major IRC servers and includes both "CHARSET"
and "CASEMAPPING" fields, which could be used to tell the client which
character set and case map to use by default (instead of US-ASCII and
the pseudo-Scandinavian "rfc1459" case map). This would be
per-network, however, as it would not be reasonable to have some
servers on a network advertise one character set and other servers
advertise a different one, so it would only make sense for
pure-Japanese IRC networks (it could not be used by a Japanese user
using the Red Hat IRC network, the EFnet IRC network, etc.).
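As a rough sketch of how a client might read those fields, the Python
below splits a 005 line into its tokens and picks a default charset.
The function names, the sample server line, and the "iso-2022-jp" value
are all hypothetical; the numeric is non-standard, so real servers may
format it differently or omit CHARSET entirely.

```python
from typing import Dict, Optional


def parse_005(line: str) -> Dict[str, Optional[str]]:
    """Split a 005 numeric into a {TOKEN: value} mapping (illustrative only)."""
    # Drop the optional ":servername" prefix.
    if line.startswith(":"):
        _, _, line = line.partition(" ")
    # Discard the trailing free-text parameter, if present.
    head, _, _trailing = line.partition(" :")
    parts = head.split()
    if len(parts) < 2 or parts[0] != "005":
        raise ValueError("not a 005 numeric")
    tokens: Dict[str, Optional[str]] = {}
    # parts[1] is the client's own nickname; tokens start after it.
    for tok in parts[2:]:
        key, sep, value = tok.partition("=")
        tokens[key.upper()] = value if sep else None
    return tokens


def default_charset(tokens: Dict[str, Optional[str]], fallback: str = "ascii") -> str:
    """Use the advertised CHARSET if there is one, else the traditional default."""
    return tokens.get("CHARSET") or fallback


# Hypothetical line from a pure-Japanese network.
sample = (":irc.example.jp 005 nakai CHARSET=iso-2022-jp "
          "CASEMAPPING=rfc1459 NICKLEN=9 :are supported by this server")
tokens = parse_005(sample)
print(default_charset(tokens), tokens["CASEMAPPING"])
```

A network that does not send CHARSET simply falls through to the
fallback, so the traditional behaviour is preserved everywhere else.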

The relevant documentation is RFC 1459 "Internet Relay Chat Protocol"
section 2.2 "Character codes", which states that no specific character
set is specified by the protocol.
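The same section is the source of the "rfc1459" case map mentioned
above: it treats {, } and | as the lower-case forms of [, ] and \.
A minimal Python sketch of that folding follows; the function names are
made up for illustration, and only the pairs named in section 2.2 are
covered (some servers are reported to fold additional characters).

```python
import string

# A-Z fold to a-z as in ASCII; additionally [ ] \ fold to { } |.
_RFC1459_FOLD = str.maketrans(
    string.ascii_uppercase + "[]\\",
    string.ascii_lowercase + "{}|",
)


def rfc1459_lower(name: str) -> str:
    """Lower-case a nickname or channel name under the rfc1459 case map."""
    return name.translate(_RFC1459_FOLD)


def same_name(a: str, b: str) -> bool:
    """Compare two names the way a server using this case map would."""
    return rfc1459_lower(a) == rfc1459_lower(b)


print(rfc1459_lower("Nick[Away]"))
print(same_name("#Chan\\1", "#chan|1"))
```

Characters outside this table, including any non-ASCII bytes, pass
through unchanged, which is part of why the case map alone says nothing
about which character set a message is in.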