Bug 136290
Summary: | Add Japanese SJIS locale | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Nakai <ynakai> |
Component: | glibc | Assignee: | Jakub Jelinek <jakub> |
Status: | CLOSED DUPLICATE | QA Contact: | |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | rawhide | CC: | drepper, eng-i18n-bugs, fedora-ja-list, fweimer, htl10, mfuruta, thompson_r, wtogami |
Target Milestone: | --- | Keywords: | i18n |
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2004-10-19 07:44:56 UTC | Type: | --- |
Description
Nakai
2004-10-19 07:24:40 UTC
There is a big problem with SJIS, particularly that it isn't ASCII compatible. Therefore many POSIX programs will misbehave when in an SJIS locale, which is not something that we should IMHO support. Using an SJIS-based locale means violating standards. And besides, we only support UTF-8 locales from now on.

Sounds like a very dictator logic by European and American as always happened on codeset issues. Other commercial UNIXes have that SJIS locale, troublesome as it is. Users know the problems but still want the SJIS locale.

Users that want to shoot themselves in the foot can do so themselves.
localedef -i ja_JP -c -f SHIFT_JIS /usr/lib/locale/ja_JP.sjis
will create that for them, but I don't think that is something we should promote and support.

iconv(1) and iconv(3) certainly support SJIS, so if users have documents in SJIS charset, they can convert to UTF-8 (and back).

> Sounds like a very dictator logic by European and American
> as always happened on codeset issues.
This is an issue of maintenance for the distribution. Using a
non-ASCII clean locale means that various programs will just cease to
work. Who will pay for all the wasted hours of our developers who get
reports that a program doesn't work, when it turns out to be the fault of
this locale?
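To illustrate the "non-ASCII clean" point: in Shift_JIS the second byte of a double-byte character can fall in the ASCII range, so byte-oriented code can mistake half of a kanji for a backslash. The following is a minimal Python sketch, assuming CPython's built-in `shift_jis`, `euc_jp` and `utf-8` codecs; the helper name `ascii_bytes_inside` is made up for this example.

```python
def ascii_bytes_inside(text: str, encoding: str) -> list:
    """Return the ASCII characters whose byte values show up
    inside the encoded form of (non-ASCII) text."""
    encoded = text.encode(encoding)
    return [chr(b) for b in encoded if b < 0x80]


kanji = "表"  # a common character; Shift_JIS encodes it as 0x95 0x5C

for enc in ("shift_jis", "euc_jp", "utf-8"):
    raw = kanji.encode(enc)
    print(enc, raw.hex(), ascii_bytes_inside(kanji, enc))

# A byte-oriented search for a backslash finds one inside the kanji
# only in the Shift_JIS form.
found_in_sjis = b"\\" in kanji.encode("shift_jis")
found_in_euc = b"\\" in kanji.encode("euc_jp")
```

EUC-JP and UTF-8 keep every byte of a multi-byte character at 0x80 or above, which is why programs that scan for ASCII delimiters keep working under those encodings.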
Arguing about other Unixes having support is irrelevant. There is no
Unix in existence even today which has internationalization as
complete as Linux. They simply don't have these programs. And in
case of, say, Solaris there is an additional problem that they have a
non-fixed wchar_t representation (in old code, they switch for new
code as well) which makes it all but impossible to write correct
i18n'ed software.
I would appreciate if you don't use words like the above anymore since
we actually know what the implications are since we actually wrote the
code.
> iconv(1) and iconv(3) certainly support SJIS, so if users have documents
> in SJIS charset, they can convert to UTF-8 (and back).
I cannot find a useful conversion tool in FC3 to convert from SJIS to
UTF-8...
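The conversion under discussion is, at bottom, a decode followed by an encode. Below is a minimal Python sketch of an SJIS to UTF-8 round trip, assuming Python's built-in codecs (which may differ from glibc's iconv tables for a few edge-case characters); the helper name `convert` and the sample text are illustrative only.

```python
def convert(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode data from the src charset to the dst charset."""
    return data.decode(src).encode(dst)


sample = "日本語のテキスト"
sjis_bytes = sample.encode("shift_jis")

# SJIS -> UTF-8, then back again.
utf8_bytes = convert(sjis_bytes, "shift_jis", "utf-8")
round_trip = convert(utf8_bytes, "utf-8", "shift_jis")

print(len(sjis_bytes), len(utf8_bytes), round_trip == sjis_bytes)
```

For text like this sample the round trip is lossless, which is what "convert to UTF-8 (and back)" relies on.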
Read the man page for the utility you were quoting.
iconv -f SHIFT-JIS -t UTF-8

I cannot find iconv in GNOME's or KDE's menus. When using iconv, the file name must be changed.

Did you check the man page of iconv(1) carefully, especially the AUTHOR section? It says "iconv is written by Ulrich Drepper as part of the GNU C Library." iconv is included in the glibc-common package.

> This is an issue of maintenance for the distribution. Using a
> non-ASCII clean locale means that various programs will just cease to
> work. Who will pay for all the wasted hours of our developers who get
> reports that a program doesn't work, when it turns out to be the fault of
> this locale?

Adding an SJIS locale doesn't directly mean that you need to accept and fix all SJIS problems among applications. There should be things our developers should do and things they should not. Just rejecting SJIS locales will get rid of any chance. Don't skew open source freedom by just your unwillingness.

> Arguing about other Unixes having support is irrelevant. There is no
> Unix in existence even today which has internationalization as
> complete as Linux. They simply don't have these programs. And in
> case of, say, Solaris there is an additional problem that they have a
> non-fixed wchar_t representation (in old code, they switch for new
> code as well) which makes it all but impossible to write correct
> i18n'ed software.

Stop messing around. Linux just has the best implementation of the incomplete international standards, ignoring many important domestic issues. There are many UNIXes with much better Japanese support than Linux. There were even versions made by Japanese for Japanese. Why can you say glibc is enough and ready for real Japanese already?

When migrating from other unices to Linux, SJIS is the biggest problem. When migrating from Windows to Linux, SJIS is the biggest problem. Your opinion is like that of those who enjoy watching other people's fatal suffering on TV, sitting in comfortable, safe rooms far away.
> I would appreciate if you don't use words like the above anymore since
> we actually know what the implications are since we actually wrote the
> code.

Sorry, I just learned them from your mails before. I don't bow to the dictator.

I couldn't understand why you guys insist to support ja_JP.SJIS locale. I guess you want to use SJIS for text conversion (use iconv, nkf), or use SJIS on filesystem (use mount option). But why do you want to use ja_JP.SJIS _environment_?

The ja_JP.SJIS locale is broken for a unix environment. Most Japanese locale developers do not consider the ja_JP.SJIS locale for daily processing because SJIS is not suitable for a unix environment. We have tested ja_JP.ujis, ja_JP.eucJP, ja_JP.EUC-JP, and ja_JP.UTF-8. However, ja_JP.SJIS (and the ja_JP.ISO-2022-JP variants) are out of scope. The only thing I have heard about using ja_JP.SJIS is to handle SJIS in xmms. But I think it's just a local hack for that user, and I doubt it's usable for everyone (and read the comments; it's not).

I think it would be useful for you to check whether ja_JP.SJIS is usable or not, and which parts violate POSIX behavior, before complaining, because no one has tested and collected that, IIRC.

i agree with GOTO Masanori. as long as sjis breaks many POSIX programs and redhat/fedora have a lot of POSIX programs installed, how can redhat/fedora support such an encoding type? it's not about a dictatorship, but an implementation of sjis. however, i understand that there are people who want to use sjis, but as stated on comment #4, you still can set your locale to whatever you want. even sjis. compatibility is important, but moving forward to utf-8 is really important as well.

You might disagree, and anybody else might disagree too, but for me, your opinions are the typical Japanese empty follow-ups which just repeat the previous major (and winner's) comments.
I put an interesting number here from Google:
ja_JP.eucJP: 28600
ja_JP.UTF-8: 6100
ja_JP.sjis: 7270
Those numbers are not fair enough, but can be objective evidence that the sjis locale is not for crazies.

But in spite of my expectations, the SJIS locale issue is already over unless somebody else insists as a member of the community. For Red Hat Enterprise Linux, we should discuss it in another place even if the conclusion is the same, because bugzilla is not a sales tool...

many people using sjis doesn't necessarily mean they are not crazy. However, i agree with Yukihiro Nakai that he wants to support sjis in order to migrate people to linux, but if you want to talk about that kind of stuff, that is not a community problem anymore, but a business strategy of redhat in japan, which i have nothing to do with.

I know this is officially CLOSED, NOTFIX, but this needs to be said by a foreigner who understands the problem.

I have no fond feelings for shift-JIS. I have written code to parse it, so I know how broken the encoding scheme is. However, the numbers Nakai quotes don't tell even one percent of the story. You simply aren't going to find business documents encoded in ja_JP.eucJP. Business documents encoded in ja_JP.UTF-8 exist, but not where even your ordinary geek is going to find them. I personally don't see non-shifted JIS business documents in a typical day in the almost typical Japanese IT business office where I work, and those are at least an order of magnitude more common than documents in ja_JP.UTF-8. Japanese implies shift-JIS in the desktop world.

There is no way to shoehorn shift-JIS support into the suit-and-tie RedHat Linux distributions without pumping it through Fedora. That's why it is not just a business-side issue. If you want Japanese people to be able to use Linux on the desktop, you have to understand that Linux shall support shift-JIS before it succeeds on the Japanese desktop.
Frankly, any application that can only deal with UTF-8 has a high probability of not really dealing properly with UTF-8. (In other words, those are broken in the Japanese context any way you look at it.)

I know that the only way to do this is to go beyond the standards the Unicode Consortium has given us, and to look hard where Microsoft is saying, "nothing here, move on". It will require a parallel standard of suggested internal encodings, I suppose, so that we can mathematically convert between the local version locales using local encodings and the international version locales using Unicode. But it has to be done sometime.

Sorry to be so pointy headed in a bug report, but there is no way to understand the bug without the context.

Another foreigner who understands the problem agrees with Mr. Rees. I also hate Shift-JIS, just as much as I hate paying sales tax and doing laundry. But guess what, hate doesn't matter. If you don't support Shift-JIS then you are locking yourself out of 99% of the Japanese market. You know what has happened in the past to other US companies who targeted limited markets while Japanese companies targeted worldwide markets? Of course next will be Chinese companies. Anyway, in the long run you will regret locking yourself out of markets.

By the way I think Nakai-san's statistics are skewed by counting only some web pages that are designed for ordinary computers. If you count web sites that are designed for display on cell phones then I think they're 100% Shift-JIS. When editing those pages it is enormously helpful to run an editor directly in the despicable hated Shift-JIS environment.

I respectfully but strongly disagree with the decision not to support Shift-JIS. I am the CTO for a Japanese media company, and we use Redhat Linux for all of our web, file and print servers. Our desktops are, as is common, Macintosh and Windows, all of which utilize the Shift-JIS locale or Microsoft's variation thereof.

1. In the case of our file servers, the problem of name conversion is insurmountable. Files written by Windows machines into the shared folders have Shift-JIS-encoded filenames, and those written by Redhat use EUC; for files which must be exchanged between both systems, we are forced to use English filenames. This is confusing for a large number of our staff, and we continually have problems with filename encoding.
2. From a Windows client, it is impossible to use an SSH session to our Redhat server unless we force the use of the English locale. Therefore, engineers must read English command help and man pages, unless they travel to the server room and log in from the console.
3. All of our web and document content is provided in Shift-JIS, so we are unable to perform maintenance on or even view the contents of those documents from the Redhat console, unless we use a character set converter to make a copy in EUC-JP. But iconv is not a substitute for being able to edit the same document on both types of system.
4. Databases are all in Shift-JIS encoding, so again maintenance of our data resources cannot be performed from the Redhat console. We are required to use remote tools for database maintenance.
5. Our XML data feed system utilizes UTF-8, but all other Japanese systems are Shift-JIS, with the exception of the EUC-JP-only Redhat servers.

In summary, we have no need for EUC support, and every day require Shift-JIS. We had at one point considered using OpenOffice and Redhat on our desktops, but it is impossible for us to maintain external compatibility for outside companies were we to take this step. Instead, we stay with Windows. Does anyone know of a third-party add-on which can provide this support?
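The filename problem in item 1 comes from the same name having a different byte sequence in each encoding, so a name written by one system cannot be read by another that assumes a different charset. Below is a minimal Python sketch, assuming Python's built-in codecs; the file name and the helper `try_decode` are hypothetical, and real Windows clients typically use Microsoft's variant (exposed in Python as `cp932`) rather than plain `shift_jis`.

```python
def try_decode(raw: bytes, encoding: str):
    """Decode raw bytes, returning None if they are invalid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


name = "報告書.txt"  # hypothetical file name ("report.txt")

as_sjis = name.encode("shift_jis")  # roughly what a Windows client would write
as_euc = name.encode("euc_jp")      # what an EUC-JP server would write
as_utf8 = name.encode("utf-8")

for label, raw in (("shift_jis", as_sjis), ("euc_jp", as_euc), ("utf-8", as_utf8)):
    print(label, raw.hex())

# Reading the Shift_JIS bytes under an EUC-JP assumption does not give the name back.
misread = try_decode(as_sjis, "euc_jp")
print(repr(misread))
```

Only the ASCII part of the name (`.txt`) is byte-identical in all three forms, which matches the workaround of falling back to English filenames.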
*** This bug has been marked as a duplicate of 91894 ***

I arrived at this bug because I was able to test/troubleshoot a Japanese localization problem of a Windows application under wine in Ubuntu WSL (yes, this is strange: running a Windows application, inside wine, inside Ubuntu Windows Subsystem for Linux, inside Windows 10), but was not able to do so with native Fedora. I tried native Fedora because I needed biarch/32-bit/64-bit wine also, but WSL can only do 64-bit wine. Note that ja_JP.sjis works on Ubuntu WSL, just not on Fedora.

For the record, on Fedora, I needed to do:
dnf install -y glibc-locale-source
before
localedef -i ja_JP -f SHIFT_JIS /usr/lib/locale/ja_JP.sjis
Hope it helps others. After that, I could then do:
LANG=ja_JP.sjis WINEPREFIX={/tmp/32,/tmp/64} WINEARCH={win32,win64} wine \
test_app.exe

Note again that I did not need to do anything on Ubuntu WSL, but needed to jump through quite a few hoops for Fedora.

(In reply to Jakub Jelinek from comment #4)
> Users that want to shoot themselves in the foot can do so themselves.
> localedef -i ja_JP -c -f SHIFT_JIS /usr/lib/locale/ja_JP.sjis
> will create that for them, but I don't think that is something
> we should promote and support.

See addendum above about needing "glibc-locale-source".

(In reply to Ulrich Drepper from comment #5)
...
> Arguing about other Unixes having support is irrelevant. There is no
> Unix in existence even today which has internationalization as
> complete as Linux.
...

I see your point about other unices. But as stated, Ubuntu supports this; just not Fedora.

(In reply to GOTO Masanori from comment #11)
> I couldn't understand why you guys insist to support ja_JP.SJIS locale.
> I guess you want to use SJIS for text conversion (use iconv, nkf),
> or use SJIS on filesystem (use mount option). But why do you want
> to use ja_JP.SJIS _environment_?
...

Yes, it is useful sometimes, as stated above in my use case.
The application was written for Japanese Windows, hence it needs a ja_JP.SJIS environment encapsulating wine. I don't think I am asking for a full ja_JP.SJIS desktop environment in which everything works. But I think it is useful to ___have the capability____ of running specific applications with LANG=ja_JP.SJIS. And setting LANG=ja_JP.SJIS should just work (within limits), without having to jump through the hoop of installing locale-source and running localedef. My two cents.
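Whether a given locale has actually been generated on a system can be checked before launching an application under it. Below is a minimal Python sketch using the standard `locale` module; the helper name `locale_available` is made up for this example, and the result for `ja_JP.sjis` depends entirely on whether the locale was built on that machine.

```python
import locale


def locale_available(name: str) -> bool:
    """Return True if the C library accepts `name` for LC_CTYPE."""
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        return False
    finally:
        # Put the process back the way it was.
        locale.setlocale(locale.LC_CTYPE, saved)


for candidate in ("C", "ja_JP.UTF-8", "ja_JP.eucJP", "ja_JP.sjis"):
    print(candidate, locale_available(candidate))
```

On a system where the localedef step has not been run, the last line would be expected to report False; after it, True.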