Bug 23479 - question marks instead of accented characters - OUTPUT_CHARSET
Summary: question marks instead of accented characters - OUTPUT_CHARSET
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: installer
Version: 7.0
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Matt Wilson
QA Contact: Brock Organ
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2001-01-06 10:45 UTC by stano
Modified: 2007-04-18 16:30 UTC (History)
0 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2001-05-09 16:18:43 UTC
Embargoed:



Description stano 2001-01-06 10:45:30 UTC
The glibc developers decided not to display accented
(national) characters, and to display question marks
instead, if the application does not initialize its locale.

While this is basically correct, and I can understand it
from the glibc point of view, there are a lot of applications
that do not bother with a setlocale call, and as the glibc
messages are localised, they are subsequently mangled.
Fixing all of the apps will take a long time.

This change has already generated a bunch of questions
on local mailing lists.

The workaround is simple: set OUTPUT_CHARSET to the desired
character set. Maybe the installer should set this in
/etc/sysconfig/i18n along with the other variables (at least
optionally, after confirmation; maybe there are cases where
this is not the best idea).

Reproducing:
Unset all LC_*, LANG and OUTPUT_CHARSET environment variables
(unsetenv in csh), if present, then run:

% gcc /nonexistent
gcc: /nonexistent: No such file or directory
gcc: No input files
% setenv LANG sk_SK
% gcc /nonexistent
gcc: /nonexistent: Adres?r alebo s?bor neexistuje
gcc: No input files
% setenv OUTPUT_CHARSET sk_SK
% gcc /nonexistent
gcc: /nonexistent: Adresar alebo szbor neexistuje
gcc: No input files
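
The first mangled line of the transcript can be mimicked outside glibc. The sketch below is an analogy, not glibc's actual code path: it assumes that a program which never calls setlocale stays in the "C" locale, whose charset is plain ASCII, and that characters with no equivalent in the target charset come out as '?'. The helper name `to_charset` is made up for illustration, and Python codecs stand in for the real conversion.

```python
# Analogy only: Python codecs stand in for the charset conversion
# applied to a translated message before it is printed.

def to_charset(message: str, charset: str) -> bytes:
    """Encode message, replacing unrepresentable characters with '?'."""
    return message.encode(charset, errors="replace")

# The Slovak message from the transcript, with the accents that
# Comment 1 says were lost restored.
message = "Adresár alebo súbor neexistuje"

# No setlocale(): assume the program is still in the "C" locale (ASCII).
print(to_charset(message, "ascii"))      # b'Adres?r alebo s?bor neexistuje'

# A Latin-2 charset can hold both accented letters, so nothing is lost.
print(to_charset(message, "iso8859_2").decode("iso8859_2") == message)  # True
```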

Comment 1 stano 2001-01-06 10:49:14 UTC
eh - the bugzilla form messes up the accented characters too -
the 'a' and 'u' have an acute accent in the next-to-last line of the transcript:
.. Adresa'r alebo su'bor ...

Comment 2 Michael Fulbright 2001-01-09 21:32:01 UTC
Assigning to a developer.

Comment 3 Matt Wilson 2001-01-09 21:49:53 UTC
Setting OUTPUT_CHARSET globally will break things.  If you do something like
"LANG=ja_JP.eucJP gnome-terminal" when OUTPUT_CHARSET=ISO-8859-2 is set
globally, you'll get '?' everywhere for the menus.

That is, OUTPUT_CHARSET overrides the default charset even when programs properly
call setlocale().
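
The conflict described here can be sketched as a small model. This is a hypothetical simplification, not how glibc is implemented: `effective_charset` and `render` are invented names, the environment is passed in as a plain dict, the Japanese label is an invented example, and Python's codecs stand in for the real conversions.

```python
def effective_charset(env: dict) -> str:
    """Pick the output charset: OUTPUT_CHARSET wins over LANG's codeset."""
    override = env.get("OUTPUT_CHARSET")
    if override:
        return override
    lang = env.get("LANG", "C")
    # "ja_JP.eucJP" -> "eucJP". A LANG without a codeset falls back to
    # ASCII here, which is a simplification of real locale defaults.
    return lang.split(".", 1)[1] if "." in lang else "ascii"

def render(text: str, env: dict) -> bytes:
    """Encode text for output, substituting '?' where the charset cannot."""
    return text.encode(effective_charset(env), errors="replace")

menu = "ファイル"  # "File", as a Japanese menu label might read

per_command = {"LANG": "ja_JP.eucJP"}
with_global = {"LANG": "ja_JP.eucJP", "OUTPUT_CHARSET": "ISO-8859-2"}

print(render(menu, per_command).decode("euc_jp") == menu)  # True
print(render(menu, with_global))                           # b'????'
```

In this model the per-command LANG is honoured only while no global OUTPUT_CHARSET is present, which is the breakage the comment warns about.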


Comment 4 stano 2001-01-10 08:56:21 UTC
Hmm, this is what I was afraid of, and I don't see any easy fix that
pleases everyone :-(

Is an _optional_ OUTPUT_CHARSET setting in the installer a possible
alternative? Most users of iso8859-1,2 (dunno about Cyrillic environments)
won't run into the scenario you have described, and if they set LANG
manually before calling some program, they can also unset OUTPUT_CHARSET.

Anyway, it would be good to mention this behaviour somewhere that a user
selecting a locale other than the US one will see it.

Comment 5 Matt Wilson 2001-01-10 20:04:23 UTC
The only real fix is for every program that uses i18n (even just strerror et al.)
to call setlocale().
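
For illustration, this is roughly what that call looks like through Python's `locale` module, which wraps the C setlocale(). It is a sketch under assumptions: `localized_strerror` is a made-up helper, the text returned depends on which locales and translations are installed, the fallback branch is just one way a program might handle an unsupported locale, and the Python interpreter does some locale setup of its own at startup, so this is not an exact stand-in for a C program.

```python
import errno
import locale
import os

def localized_strerror(code: int) -> str:
    """Return the error text for code after adopting the user's locale."""
    try:
        # An empty string means "take the locale from LC_* / LANG".
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # The locale named in the environment is not installed;
        # keep whatever locale is currently active.
        pass
    return os.strerror(code)

# Output depends on the environment, e.g. English text under LANG=C.
print(localized_strerror(errno.ENOENT))
```

Note that setlocale() changes the locale for the whole process, so a real program would normally call it once at startup rather than per message.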


Comment 6 Brent Fox 2001-04-25 22:12:45 UTC
Is this issue considered one that we won't fix?

Comment 7 Brent Fox 2001-05-09 16:18:36 UTC
I guess so.

