Bug 23479 - question marks instead of accented characters - OUTPUT_CHARSET
Status: CLOSED WONTFIX
Product: Red Hat Linux
Classification: Retired
Component: installer
Version: 7.0
Hardware: i386 Linux
Priority: medium, Severity: medium
Assigned To: Matt Wilson
QA Contact: Brock Organ
Reported: 2001-01-06 05:45 EST by stano
Modified: 2007-04-18 12:30 EDT
Doc Type: Bug Fix
Last Closed: 2001-05-09 12:18:43 EDT


Description stano 2001-01-06 05:45:30 EST
The glibc developers made the decision not to display
accented (national) characters and display question marks
instead, if the application does not initialize its locale.

While this is basically correct, and I can understand it
from the glibc point of view, there are a lot of applications
that do not bother to call setlocale(), and because the glibc
messages are localised, they end up mangled.
Fixing all of those apps will take a looong time.
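
To illustrate the condition described above: a process that never calls setlocale() stays in the default "C" locale, whatever LANG says. The following is only a minimal sketch in C; the helper name in_default_c_locale is made up for this example and is not part of glibc.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper (not part of glibc): returns 1 while the process is
   still in the default "C" locale, which is the state of any program that
   has not called setlocale(), whatever LANG is set to. */
int in_default_c_locale(void)
{
    /* Passing NULL only queries the current locale; it changes nothing. */
    const char *name = setlocale(LC_ALL, NULL);
    assert(name != NULL);
    return strcmp(name, "C") == 0;
}
```

An application could call such a check at startup to notice that it has not initialized its locale yet.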

This change has already generated a bunch of questions
on local mailing lists.

The workaround is simple: set OUTPUT_CHARSET to the desired
character set. Maybe the installer should set this in
/etc/sysconfig/i18n along with the other variables (at least
optionally, after confirmation; there may be cases where
this is not the best idea).

Reproducing:
unsetenv all LC_*, LANG and OUTPUT_CHARSET environment variables,
if present

% gcc /nonexistent
gcc: /nonexistent: No such file or directory
gcc: No input files
% setenv LANG sk_SK
% gcc /nonexistent
gcc: /nonexistent: Adres?r alebo s?bor neexistuje
gcc: No input files
% setenv OUTPUT_CHARSET sk_SK
% gcc /nonexistent
gcc: /nonexistent: Adresar alebo szbor neexistuje
gcc: No input files
Comment 1 stano 2001-01-06 05:49:14 EST
eh - the bugzilla form messes up the accented characters too -
the 'a' and 'u' have an acute accent in the next-to-last line:
.. Adresa'r alebo su'bor ...
Comment 2 Michael Fulbright 2001-01-09 16:32:01 EST
Assigning to a developer.
Comment 3 Matt Wilson 2001-01-09 16:49:53 EST
Setting OUTPUT_CHARSET globally will break things.  If you run something like
"LANG=ja_JP.eucJP gnome-terminal" while OUTPUT_CHARSET=ISO-8859-2 is set
globally, you'll get '?' everywhere in the menus.

That is, OUTPUT_CHARSET overrides the default charset when programs properly
call setlocale().
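
The override described above can be modelled roughly as follows. This is a simplified sketch in C, not glibc's actual code: the function names effective_output_charset and current_output_charset are invented for illustration, and the model assumes that nothing else (such as bind_textdomain_codeset()) has pinned the output charset.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Simplified model (an assumption, not glibc source): the charset that
   translated messages would be converted to. `override` stands for the
   value of OUTPUT_CHARSET, or NULL when the variable is unset. */
const char *effective_output_charset(const char *override)
{
    /* A non-empty override wins over whatever the locale says. */
    if (override != NULL && override[0] != '\0')
        return override;

    /* Otherwise fall back to the charset of the current LC_CTYPE locale. */
    const char *cs = nl_langinfo(CODESET);
    assert(cs != NULL);
    return cs;
}

/* Convenience wrapper that reads the real environment variable. */
const char *current_output_charset(void)
{
    return effective_output_charset(getenv("OUTPUT_CHARSET"));
}
```

In this model, a program started with LANG=ja_JP.eucJP that calls setlocale() would still see ISO-8859-2 as its output charset while the global override is in place, which matches the '?' symptom.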
Comment 4 stano 2001-01-10 03:56:21 EST
Hmm, this is what I was afraid of, and I don't see any easy way to fix this in a
way that pleases everyone :-(

Would an _optional_ OUTPUT_CHARSET setting in the installer be a possible
alternative? Most users of iso8859-1,2 (dunno about Cyrillic environments)
won't run into the scenario you have described, and if they set LANG
manually before calling some program, they can also unset OUTPUT_CHARSET.

Anyway, it would be good to mention this behaviour somewhere the user
selecting a locale other than the US one will see it.
Comment 5 Matt Wilson 2001-01-10 15:04:23 EST
The only real fix is to have any program that uses i18n (even strerror et al)
call setlocale.
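
A minimal sketch of that fix in C, assuming a program that only needs strerror() text. The helper names init_locale_from_env and format_error are hypothetical, and the message layout merely imitates the gcc output quoted in the description.

```c
#include <assert.h>
#include <errno.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: adopt the locale named by the environment
   (LC_ALL, the individual LC_ variables, LANG). Returns 1 on success and
   0 if the environment names a locale that cannot be loaded, in which
   case the process stays in the default "C" locale. */
int init_locale_from_env(void)
{
    return setlocale(LC_ALL, "") != NULL;
}

/* Hypothetical helper: build "prog: path: message" using strerror(),
   whose text follows the locale selected above. Returns the snprintf()
   result. */
int format_error(char *buf, size_t len, const char *prog,
                 const char *path, int errnum)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s: %s: %s", prog, path, strerror(errnum));
}
```

The idea is to call init_locale_from_env() once, first thing in main(), before any message is produced; after that the message charset and the locale's charset agree, so no OUTPUT_CHARSET override should be needed.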
Comment 6 Brent Fox 2001-04-25 18:12:45 EDT
Is this issue considered one that we won't fix?
Comment 7 Brent Fox 2001-05-09 12:18:36 EDT
I guess so.
