The glibc developers decided not to display accented (national) characters, and to display question marks instead, if the application does not initialize its locale. While this is basically correct and I can understand it from the glibc point of view, there are a lot of applications that do not bother with a setlocale call, and as the glibc messages are localised, they are subsequently mangled. Fixing all of the apps will take a long time. This change has already generated a bunch of questions on local mailing lists.

The workaround is simple: set OUTPUT_CHARSET to the desired character set. Maybe the installer should set this in /etc/sysconfig/i18n along with the other variables (at least optionally, after confirmation; maybe there are cases where this is not the best idea).

Reproducing: unsetenv all LC_*, LANG and OUTPUT_CHARSET environment variables, if present, then:

% gcc /nonexistent
gcc: /nonexistent: No such file or directory
gcc: No input files
% setenv LANG sk_SK
% gcc /nonexistent
gcc: /nonexistent: Adres?r alebo s?bor neexistuje
gcc: No input files
% setenv OUTPUT_CHARSET sk_SK
% gcc /nonexistent
gcc: /nonexistent: Adresar alebo szbor neexistuje
gcc: No input files
Eh, the Bugzilla form mangles the accented characters too. In the next-to-last line the 'a' and 'u' carry an acute accent: .. Adresa'r alebo su'bor ...
Assigning to a developer.
Setting OUTPUT_CHARSET globally will break things. If you run something like "LANG=ja_JP.eucJP gnome-terminal" while OUTPUT_CHARSET=ISO-8859-2 is set globally, you'll get '?' everywhere in the menus. That is, OUTPUT_CHARSET overrides the default charset even when programs properly call setlocale().
Hmm, this is what I was afraid of, and I don't see any easy way to fix this in a way that pleases everyone :-( Is an _optional_ OUTPUT_CHARSET setting in the installer a possible alternative? Most users of iso8859-1,2 (dunno about Cyrillic environments) won't hit the scenario you have described, and if they set LANG manually before calling some program, they can also unset OUTPUT_CHARSET. Anyway, it would be good to mention this behaviour somewhere a user selecting a locale other than the US one will see it.
The only real fix is to have any program that uses i18n (even strerror et al) call setlocale.
Is this issue considered one that we won't fix?
I guess so.