Bug 136290 - Add Japanese SJIS locale
Status: CLOSED DUPLICATE of bug 91894
Product: Fedora
Classification: Fedora
Component: glibc
Version: rawhide
Hardware: All Linux
Priority: medium, Severity: high
Assigned To: Jakub Jelinek
: i18n
Reported: 2004-10-19 03:24 EDT by Nakai
Modified: 2016-11-24 11:13 EST
Doc Type: Bug Fix
Last Closed: 2004-10-19 03:44:56 EDT
Description Nakai 2004-10-19 03:24:40 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; ja-JP; rv:1.6) Gecko/20040510

Description of problem:
Add Japanese SJIS locale.

Red Hat doesn't support it, but the community does.


Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. SJIS!
2. SJIS!
3. SJIS!
    

Actual Results:  No SJIS locale

Expected Results:  SJIS locale

Additional info:

The severity should be High for Japanese support.
Comment 1 Jakub Jelinek 2004-10-19 03:37:09 EDT
There is a big problem with SJIS, particularly that it isn't ASCII
compatible.  Therefore many POSIX programs will misbehave when in
SJIS locale, which is not something that we should IMHO support.
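
To make the ASCII-compatibility concern concrete, here is a minimal Python
sketch. It is not part of the original report, and the helper name
`ascii_range_bytes` is hypothetical. Assuming Python's built-in `shift_jis`
codec, it lists the bytes below 0x80 that occur inside the encoding of
non-ASCII characters. A byte-oriented program can mistake such a byte for an
ASCII character, for example a backslash (0x5C).

```python
def ascii_range_bytes(text, encoding):
    """Return the bytes < 0x80 found inside encodings of non-ASCII characters."""
    hits = []
    for ch in text:
        if ord(ch) < 0x80:
            continue  # genuine ASCII characters are not the problem
        hits.extend(b for b in ch.encode(encoding) if b < 0x80)
    return hits


sample = "表ソ"  # two characters whose Shift_JIS trail byte is 0x5C
print([hex(b) for b in ascii_range_bytes(sample, "shift_jis")])  # ['0x5c', '0x5c']
print(ascii_range_bytes(sample, "utf-8"))                        # []
```

In UTF-8 every byte of a multi-byte character is 0x80 or above, so the same
check comes back empty.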
Comment 2 Ulrich Drepper 2004-10-19 03:44:56 EDT
Using a SJIS-based locale means violating standards.  And besides, we
only support UTF-8 locales from now on.
Comment 3 Nakai 2004-10-19 08:04:22 EDT
This sounds like very dictatorial logic from Europeans and Americans,
as has always happened on codeset issues.

Other commercial UNIXes have an SJIS locale, troublesome as it is.
Users know the problems but still want an SJIS locale.
Comment 4 Jakub Jelinek 2004-10-19 08:19:45 EDT
Users that want to shoot themselves in the foot can do so themselves.
localedef -i ja_JP -c -f SHIFT_JIS /usr/lib/locale/ja_JP.sjis
will create that for them, but I don't think that is something
we should promote and support.

iconv(1) and iconv(3) certainly support SJIS, so if users have documents
in SJIS charset, they can convert to UTF-8 (and back).
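
As an illustration of the convert-and-convert-back idea, here is a small
Python sketch. It uses Python's built-in codecs as a stand-in for iconv, so it
shows the principle rather than iconv's own behaviour; the function name
`convert` and the sample text are assumptions made for the example.

```python
def convert(data, src, dst):
    """Re-encode bytes from charset src to charset dst, failing loudly on bad input."""
    return data.decode(src, errors="strict").encode(dst, errors="strict")


# Pretend these bytes were read from a Shift_JIS document.
original = "日本語のテキスト".encode("shift_jis")

as_utf8 = convert(original, "shift_jis", "utf-8")
back = convert(as_utf8, "utf-8", "shift_jis")

print(as_utf8.decode("utf-8"))
print(back == original)  # True: the round trip is lossless for this text
```

The round trip is only lossless for characters that both charsets can
represent. With strict error handling, text containing a character that
Shift_JIS lacks raises UnicodeEncodeError on the way back instead of being
silently altered.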
Comment 5 Ulrich Drepper 2004-10-19 11:03:53 EDT
> This sounds like very dictatorial logic from Europeans and Americans,
> as has always happened on codeset issues.

This is an issue of maintenance for the distribution.  Using a
non-ASCII clean locale means that various programs will just cease to
work.  Who will pay for all the wasted hours of our developers who get
reports that a program doesn't work, when it turns out to be the fault
of this locale?

Arguing about other Unixes having support is irrelevant.  There is no
Unix in existence even today which has internationalization as
complete as Linux.  They simply don't have these programs.  And in
case of, say, Solaris there is an additional problem that they have a
non-fixed wchar_t representation (in old code, they switch for new
code as well) which makes it all but impossible to write correct
i18n'ed software.

I would appreciate if you don't use words like the above anymore since
we actually know what the implications are since we actually wrote the
code.
Comment 6 Shun Fukuzawa 2004-10-19 18:35:33 EDT
> iconv(1) and iconv(3) certainly support SJIS, so if users have documents
> in SJIS charset, they can convert to UTF-8 (and back).

I cannot find a useful conversion tool in FC3 to convert from SJIS to
UTF-8...
Comment 7 Jakub Jelinek 2004-10-19 18:37:53 EDT
Read the man page for the utility you were quoting.
iconv -f SHIFT-JIS -t UTF-8
Comment 8 Shun Fukuzawa 2004-10-19 19:06:15 EDT
I cannot find iconv in the GNOME or KDE menus. When using iconv, the
file name must be changed.
Comment 9 Takanori MATSUURA 2004-10-19 22:51:39 EDT
Did you check the man page for iconv(1) carefully, especially the AUTHOR
section? It says "iconv is written by Ulrich Drepper as part of the GNU C
Library.".

iconv is included in the glibc-common package.
Comment 10 Nakai 2004-10-20 13:13:44 EDT
>This is an issue of maintenance for the distribution.  Using a
>non-ASCII clean locale means that various programs will just cease to
>work.  Who will pay for all the wasted hours of our developers who get
>reports that a program doesn't work, when it turns out to be the fault
>of this locale?

Adding an SJIS locale doesn't directly mean that you need to accept and
fix every SJIS issue across applications. There should be a line between
what our developers should handle and what they should not. Simply
rejecting SJIS locales removes any chance of that. Don't skew open source
freedom just because of your own unwillingness.

>Arguing about other Unixes having support is irrelevant.  There is no
>Unix in existence even today which has internationalization as
>complete as Linux.  They simply don't have these programs.  And in
>case of, say, Solaris there is an additional problem that they have a
>non-fixed wchar_t representation (in old code, they switch for new
>code as well) which makes it all but impossible to write correct
>i18n'ed software.

Stop messing around.
Linux is merely the best implementation of incomplete international
standards, ignoring many important domestic issues.
There are many UNIXes with much better Japanese support than Linux.
There were even versions made in Japanese, by Japanese, for Japanese.
How can you say glibc is already sufficient and ready for real Japanese use?

When migrating from other unices to Linux, SJIS is the biggest problem.
When migrating from Windows to Linux, SJIS is the biggest problem.

Your opinion is like that of people who enjoy watching other people's
fatal suffering on TV while sitting in comfortable, safe rooms
far away.

>I would appreciate if you don't use words like the above anymore since
>we actually know what the implications are since we actually wrote the
>code.
Sorry, I just learned them from your earlier mails.
I don't bow to the dictator.
Comment 11 GOTO Masanori 2004-10-21 14:40:33 EDT
I can't understand why you guys insist on supporting the ja_JP.SJIS locale.
I guess you want to use SJIS for text conversion (use iconv, nkf),
or use SJIS on the filesystem (use a mount option).  But why do you want
to use a ja_JP.SJIS _environment_?

The ja_JP.SJIS locale is broken for a unix environment.
Most Japanese locale developers do not consider the ja_JP.SJIS locale
for daily processing because SJIS is not suitable for a unix environment.
We have tested ja_JP.ujis, ja_JP.eucJP, ja_JP.EUC-JP, and
ja_JP.UTF-8.  However, ja_JP.SJIS (and the ja_JP.ISO-2022-JP variants) are
out of scope.

The only use of ja_JP.SJIS I have heard about is handling SJIS in
xmms.  But I think it's just a local hack for that user, and I doubt it's
usable for everyone (and read the comments; it's not).

I think it would be useful for you to check whether ja_JP.SJIS is usable,
and which parts violate POSIX behavior, before complaining, because
no one has tested and collected that, IIRC.
Comment 12 atsuya takagi 2004-10-21 14:54:51 EDT
i agree with GOTO Masanori.
as long as sjis breaks many POSIX programs and redhat/fedora have a
lot of POSIX programs installed, how can redhat/fedora support such an
encoding type? it's not about a dictatorship, but about the
implementation of sjis.

however, i understand that there are people who want to use sjis, and
as stated in comment #4, you can still set your locale to whatever
you want. even sjis.

compatibility is important, but moving forward to utf-8 is really
important as well.
Comment 13 Nakai 2004-10-22 12:18:22 EDT
You might disagree, and anybody else might disagree too, but for me,
your opinions are the typical empty Japanese follow-ups that
just repeat the previous majority (and winner's) comments.

Here are some interesting numbers from Google:
ja_JP.eucJP: 28600
ja_JP.UTF-8: 6100
ja_JP.sjis: 7270

Those numbers are not entirely fair, but they can serve as objective
evidence that the sjis locale is not just for crazies.

But in spite of my expectations, the SJIS locale issue is already
over unless somebody else insists as a member of the community.
For Red Hat Enterprise Linux, we should discuss this somewhere
else even if the conclusion is the same, because bugzilla is not
a sales tool...
Comment 14 atsuya takagi 2004-10-23 16:51:38 EDT
many people using sjis doesn't necessarily mean they are not crazy.

However, i agree with Yukihiro Nakai that he wants to support sjis in
order to migrate people to linux, but if you want to talk about that
kind of stuff, that is not a community problem anymore, but a business
strategy of redhat in japan, which i have nothing to do with.
Comment 15 Joel Rees 2004-10-25 02:00:06 EDT
I know this is officially CLOSED, NOTFIX, but this needs to be said by
a foreigner who understands the problem.

I have no fond feelings for shift-JIS. I have written code to parse
it, so I know how broken the encoding scheme is.

However, the numbers Nakai quotes don't tell even one percent of the
story. 

You simply aren't going to find business documents encoded in
ja_JP.eucJP. Business documents encoded in ja_JP.UTF-8 exist, but not
where even your ordinary geek is going to find them. I personally
don't see non-shifted JIS business documents in a typical day in the
almost typical Japanese IT business office where I work, and those are
at least an order of magnitude more common than documents in ja_JP.UTF-8.

Japanese implies shift-JIS in the desktop world.

There is no way to shoehorn shift-JIS support into the suit-and-tie
RedHat Linux distributions without pumping it through Fedora. That's
why it is not just a business-side issue. If you want Japanese people
to be able to use Linux on the desktop, you have to understand that
Linux shall support shift-JIS before it succeeds on the Japanese desktop.

Frankly, any application that can only deal with UTF-8 has a high
probability of not really dealing properly with UTF-8. (In other
words, those are broken in the Japanese context any way you look at
it.) I know that the only way to do this is to go beyond the standards
the Unicode Consortium has given us, and to look hard where Microsoft
is saying, "nothing here, move on". It will require a parallel
standard of suggested internal encodings, I suppose, so that we can
mathematically convert between the local version locales using local
encodings and the international version locales using Unicode. But it
has to be done sometime.

Sorry to be so pointy headed in a bug report, but there is no way to
understand the bug without the context. 
Comment 19 Norman Diamond 2004-11-24 02:31:11 EST
Another foreigner who understands the problem agrees with Mr. Rees.

I also hate Shift-JIS, just as much as I hate paying sales tax and 
doing laundry.  But guess what, hate doesn't matter.  If you don't 
support Shift-JIS then you are locking yourself out of 99% of the 
Japanese market.

You know what has happened in the past to other US companies who 
targeted limited markets while Japanese companies targeted worldwide 
markets?  Of course next will be Chinese companies.  Anyway, in the 
long run you will regret locking yourself out of markets.

By the way I think Nakai-san's statistics are skewed by counting only 
some web pages that are designed for ordinary computers.  If you 
count web sites that are designed for display on cell phones then I 
think they're 100% Shift-JIS.  When editing those pages it is 
enormously helpful to run an editor directly in the despicable hated 
Shift-JIS environment.
Comment 20 Ryan Thompson 2005-03-29 20:55:41 EST
I respectfully but strongly disagree with the decision not to support Shift-JIS.

I am the CTO for a Japanese media company, and we use Redhat Linux for all of
our web, file and print servers.  Our desktops are, as is common, Macintosh and
Windows, all of which utilize the Shift-JIS locale or Microsoft's variation
thereof.

1. In the case of our file servers, the problem of name conversion is
insurmountable.  Files written by Windows machines into the shared folders have
Shift-JIS-encoded filenames, and those written by Redhat use EUC; for files
which must be exchanged between both systems, we are forced to use English
filenames.  This is confusing for a large number of our staff, and we
continually have problems with filename encoding.

2. From a Windows client, it is impossible to use an SSH session to our Redhat
server unless we force the English locale.  Therefore, engineers must
read English command help and man pages, unless they travel to the server room
and log in from the console.

3. All of our web and document content is provided in Shift-JIS, so we are
unable to perform maintenance on or even view the contents of those documents
from the Redhat console, unless we use a character set converter to make a copy
in EUC-JP.  But iconv is not a substitute for being able to edit the same
document on both types of system.

4. Databases are all in Shift-JIS encoding, so again maintenance of our data 
resources cannot be performed from the Redhat console.  It is required that we 
use remote tools for database maintenance.

5. Our XML data feed system utilizes UTF-8, but all other Japanese systems are 
Shift-JIS, with the exception of the EUC-JP-only Redhat servers.  
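
For the filename mismatch described in item 1, a rough Python sketch of the
re-encoding step might look like the following. It is an illustration only:
the function name `reencode_name` and the sample filename are hypothetical, it
assumes the Windows clients write names in Python's `cp932` codec (Microsoft's
Shift-JIS variant), and it works on raw name bytes without touching any
filesystem.

```python
def reencode_name(raw, src="cp932", dst="euc_jp"):
    """Re-encode raw filename bytes from charset src to charset dst.

    Raises UnicodeDecodeError or UnicodeEncodeError instead of guessing
    when the name cannot be represented.
    """
    return raw.decode(src).encode(dst)


# Hypothetical name as a Windows client might have written it.
windows_name = "報告書.txt".encode("cp932")

euc_name = reencode_name(windows_name)                 # for an EUC-JP server
utf8_name = reencode_name(windows_name, dst="utf-8")   # or for a UTF-8 locale

print(euc_name.decode("euc_jp"))
print(utf8_name.decode("utf-8"))
```

Names containing vendor-specific characters may have no EUC-JP equivalent, in
which case the strict encode step raises an error rather than producing a
damaged name.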

In summary, we have no need for EUC support, and every day require Shift-JIS.

We had at one point considered using OpenOffice and Redhat on our desktops, but 
it is impossible for us to maintain external compatibility for outside 
companies were we to take this step.  Instead, we stay with Windows.

Does anyone know of a third-party add-on which can provide this support?
Comment 21 Paul Gampe 2005-04-14 00:37:37 EDT

*** This bug has been marked as a duplicate of 91894 ***
