From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2b) Gecko/20021008

Description of problem:
It's very unfortunate that RH 8 doesn't have UTF-8 locales for CJK by default. Even if there are some problems with CJK UTF-8 locale, it'd be nice to give users a choice by adding them to /etc/X11/gdm/locale.aliases. As Solaris/AIX do, RH8 can have multiple locales with different codesets for CJK. For instance, locale.aliases file for gdm can have two entries for Korean:

  Korean(EUC)    ko_KR
  Korean(UTF-8)  ko_KR.UTF-8

There are a couple of things to change if CJK UTF-8 locale is added. Firstly, XLC_LOCALE file for CJK UTF-8 locale has to be customized per locale instead of using XLC_LOCALE for en_US.UTF-8. I'll attach XLC_LOCALE files for ja_JP.UTF-8, ko_KR.UTF-8 and zh_CN.UTF-8.

With this change, KDE3 works very well out of the box under ko_KR.UTF-8. I'm less sure of Gnome2 because nautilus in Gnome2 is really sluggish on my machine (taking up 400M memory). However, nautilus is as sluggish under ko_KR.EUC-KR as under ko_KR.UTF-8 so that UTF-8 locale is not a culprit for this problem.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. At gdm login screen, click on Language
2. Choose either Japanese or Korean locale
3. After log-in, check out the locale (e.g. echo $LANG)

Actual Results:
The locale codeset is a legacy encoding instead of UTF-8

Expected Results:
gdm should offer Korean(UTF-8) and Japanese(UTF-8)

Additional info:
Korean(UTF-8) was the first language (along with English) for which UTF-8 locale was supported on both Solaris 7/8 and AIX 4.x. That's not without a reason. EUC-KR codeset for Korean has a very critical problem and virtually every Korean wants to move onto UTF-8 locale as soon as possible to solve that problem.
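Step 3 of the reproduction (checking $LANG after login) can also be done programmatically. The following is only a rough sketch, assuming the usual language[_TERRITORY][.codeset][@modifier] form of locale names; the helper names parse_locale and uses_utf8 are made up for this illustration and are not part of gdm or glibc.

```python
import os
import re

# Assumed shape of a locale name: language[_TERRITORY][.codeset][@modifier]
_LOCALE_RE = re.compile(
    r"^(?P<language>[A-Za-z]+)"
    r"(?:_(?P<territory>[A-Za-z0-9]+))?"
    r"(?:\.(?P<codeset>[A-Za-z0-9_-]+))?"
    r"(?:@(?P<modifier>[A-Za-z0-9_-]+))?$"
)


def parse_locale(name):
    """Split a locale name such as 'ko_KR.UTF-8' into its components."""
    match = _LOCALE_RE.match(name)
    if match is None:
        raise ValueError("not a recognizable locale name: %r" % name)
    return match.groupdict()


def uses_utf8(name):
    """Return True if the locale name carries an explicit UTF-8 codeset."""
    codeset = parse_locale(name)["codeset"]
    if codeset is None:
        # No codeset given: the locale falls back to its default encoding,
        # which for the CJK locales discussed here is a legacy one.
        return False
    return codeset.replace("-", "").replace("_", "").lower() == "utf8"


if __name__ == "__main__":
    lang = os.environ.get("LANG", "")
    if lang:
        try:
            print(lang, "-> UTF-8" if uses_utf8(lang) else "-> not UTF-8")
        except ValueError as exc:
            print(exc)
    else:
        print("LANG is not set")
```

Run after logging in through gdm, this reports whether the session locale names UTF-8 explicitly; a bare ko_KR is treated as non-UTF-8 here, matching the "Actual Results" above.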
Created attachment 80224 XLC_LOCALE for ko_KR.UTF-8
Created attachment 80225 XLC_LOCALE for ja_JP.UTF-8
Created attachment 80226 locale.dir (X11) and locale.alias(gdm) : diff
Created attachment 80227 a sample gtkrc file for ko_KR.UTF-8 (perhaps not necessary for gtk2-based RH8)
It might be argued that UTF-8 locales for CJK are not yet "mature" enough for prime time in some respects. Then I can't help wondering why RH decided to include SC (GB18030) while not including Korean(UTF-8) and Japanese(UTF-8). Well, the PRC requires that all OSes shipped in the PRC support GB 18030, and RedHat probably didn't have any other choice. Whatever the rationale for including SC(GB18030) in RH was, I think it is a strong indication that Korean(UTF-8) and Japanese(UTF-8) could well be included as alternatives to Korean(EUC) and Japanese(EUC), because GB 18030 is nothing other than another UTF (Unicode Transformation Format), just like UTF-8.

It has to be noted that I'm NOT saying that UTF-8 has to be the *only* codeset for Japanese and Korean. Rather, I think UTF-8 should be offered to those who want to use it *in addition to* the Korean and Japanese locales based on legacy encodings (EUC-KR and EUC-JP). If RedHat is so concerned about potential problems with CJK UTF-8 locales, it can warn its users, in a prominent place in the release notes, that CJK UTF-8 locale support is experimental and that the legacy-encoding-based CJK locales are recommended.

For the record, I have been using ko_KR.UTF-8 for about half a year and have yet to find a single aspect in which the ko_KR.UTF-8 locale is worse than ko_KR.EUC-KR. My comments in bug 75832 give a couple of compelling reasons why Koreans need a UTF-8 locale right now.
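The claim that GB 18030 and UTF-8 both encode the whole of Unicode can be tried out with a few lines of Python. This is just an illustrative sketch using Python's built-in codecs; the sample string and the round_trips helper are assumptions made for the example, not anything shipped with RH 8.

```python
def round_trips(text, encoding):
    """Return True if `text` survives an encode/decode cycle in `encoding`."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeError:
        # The codec has no representation for some character in `text`.
        return False


# Arbitrary sample mixing Korean, Japanese and Thai text.
sample = "한글 日本語 ไทย"

for encoding in ("utf-8", "gb18030", "euc_jp"):
    print(encoding, round_trips(sample, encoding))
```

Both Unicode-wide encodings carry the mixed-script sample, while a legacy codeset such as EUC-JP cannot represent the Hangul in it.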
We don't (in general) think it is appropriate to offer users a choice of encodings when logging in; encodings are something that should "just work"; the user shouldn't have to think about them.

We didn't actually offer a choice between GB18030 and GB2312 in the user interface in 8.0 - if you picked Simplified Chinese, you got GB18030.

We are hoping that soon we can just use UTF-8 locales for CJK as we do for all other languages; at that point, we won't offer a choice between UTF-8 and the traditional encoding; when you select the language in gdm, redhat-config-languages, or the installer, you'll just get the UTF-8 locale.

To pick alternate encodings, you can put "LANG=ko_KR.UTF-8" in your ~/.i18n file.

I'm assigning this bug to XFree86 since the XLC_LOCALE patches are the substantive part of the attachments. It would be good to submit them upstream as well, however.

[Note that locale_config is an obsolete configuration tool, replaced by redhat-config-languages, not Red Hat's locale configuration :-)]
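For anyone who wants to script the ~/.i18n workaround, a minimal sketch follows. It assumes the file consists of plain KEY=value lines, one per line; the set_lang helper is hypothetical, and lines written in other forms (for example with a leading "export") are left untouched.

```python
def set_lang(i18n_text, locale_name):
    """Return `i18n_text` with its LANG assignment set to `locale_name`.

    Assumes simple KEY=value lines. An existing LANG= line is replaced;
    if there is none, a new one is appended.
    """
    new_line = "LANG=" + locale_name
    lines = i18n_text.splitlines()
    replaced = False
    for index, line in enumerate(lines):
        if line.strip().startswith("LANG="):
            lines[index] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Operates on a string only; writing the result back to ~/.i18n is
    # left to the caller.
    print(set_lang("LANG=ko_KR.eucKR\n", "ko_KR.UTF-8"), end="")
```

The function works on text rather than on the file itself, so it can be tried without touching a real login configuration.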
> We don't (in general) think it is appropriate to offer users a choice
> of encodings when logging in; encodings are something that should
> "just work"; the user shouldn't have to think about them

I agree that in general that's a sound policy. The keyword here is 'in general', though, as I expect you to agree :-) In this particular case of CJK UTF-8 locale, it might be necessary to do things a bit differently as an interim measure.

> We didn't actually offer a choice between GB18030 and GB2312 in the
> user interface in 8.0 - if you picked simplified chinese, you got
> GB18030.

Sure you don't. I guess you can't even if you want to because PRC government wouldn't let you sell RH 8.0 in China otherwise :-) My point of mentioning SC (GB18030) was, if it's not clear, that if SC(GB18030) is supported, there should be little reason Korean(UTF-8) and Japanese(UTF-8) cannot be offered because GB18030 and UTF-8 are just two different UTF's of a single coded character set ISO10646/Unicode.

> We are hoping that soon we can just use UTF-8 locales for CJK as we
> do for all other languages; at that point, we won't offer a choice
> between UTF-8 and the traditional encoding; when you select the
> language in gdm, redhat-config-languages, or the installer, you'll
> just get The UTF-8 locale.

Yup, that's definitely the way to go. Good bye to old encodings forever !!! And, Linux will be on par with MS Windows 2k/XP in Korean support.

> I'm assigning this bug to XFree86

Thanks for assigning it to a more appropriate component. I had a hard time picking a component for this bug. There's a locale-ja, but there's no component 'locale' and ended up with 'locale-config'.

> since the XLC_LOCALE patches are
> the substantive part of the attachments.. it would be good to submit
> them upstream as well, however.

I'll do. It's very frustrating to wait several months for XF86 team to accept my patches. ;-).
I hope this patch will get accepted in a timely manner, but in the meantime it'd be nice if RedHat can just go ahead with it.
>> since the XLC_LOCALE patches are
>> the substantive part of the attachments.. it would be good to submit
>> them upstream as well, however.

> I'll do.

I've submitted my patch and was given seq. 5421.
In addition to Ami (patched for UTF-8), there's a gtk input module for Korean, 'imhangul' by CHOI Hwan-jin (http://imhangul.kldp.net/). It fully supports UTF-8 input for Korean and works really well! Ami (with the UTF-8 patch) is still necessary for non-gtk applications, though. Anyway, the existence of 'imhangul' makes my case for a Korean UTF-8 locale even stronger.
429. Add ko_KR.UTF-8 and ja_JP.UTF-8 XLC_LOCALE files (#5421, Jungshik Shin).

Your patches were accepted into XFree86 CVS a while back. I've got CVS builds going into rawhide as RPMs soon, so I'm closing this as resolved in RAWHIDE. Thanks.
When I submitted my patch to XF86, I forgot to include the XI18NOBJS files for the ko_KR.UTF-8 and ja_JP.UTF-8 locales (I had them on my machine). They're identical to the one for en_US.UTF-8. However, without them in the ko_KR.UTF-8 and ja_JP.UTF-8 directories, the X11 library complains that the two locales are not supported by Xlib. Perhaps you've already taken care of it. Anyway, I'm going to send a new patch to XFree86 to include them.
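A quick way to catch this kind of omission is to check each locale directory for the files Xlib expects. The sketch below is only an assumption-laden illustration: the on-disk spelling XI18N_OBJS, the default file list, and the locale root passed in by the caller are all guesses to be adjusted for the actual XFree86 tree, and missing_locale_files is a hypothetical helper.

```python
from pathlib import Path

# Assumed file names; the comment above refers to the second one as
# "XI18NOBJS", so adjust the spelling to whatever the tree really uses.
REQUIRED_FILES = ("XLC_LOCALE", "XI18N_OBJS")


def missing_locale_files(locale_root, locale_name, required=REQUIRED_FILES):
    """List the required files absent from <locale_root>/<locale_name>."""
    locale_dir = Path(locale_root) / locale_name
    return [name for name in required if not (locale_dir / name).is_file()]


def report(locale_root, locale_names):
    """Print one line per locale directory saying what, if anything, is missing."""
    for locale_name in locale_names:
        missing = missing_locale_files(locale_root, locale_name)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(locale_name + ": " + status)
```

Calling report() with the X11 locale directory of the installation and the names ko_KR.UTF-8 and ja_JP.UTF-8 would flag a directory that has XLC_LOCALE but lacks the second file.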
As for the imhangul module that you mentioned, it works well in GNOME2; however, it doesn't work with GNOME1 applications such as XCHAT 1.8 (1.9 is the development version), XMMS and so on. That is why it hasn't been implemented.