Red Hat Bugzilla – Bug 19495
Translated messages sometimes presented with wrong character set
Last modified: 2016-11-24 10:25:57 EST
I'm using the Swedish locale; LANG=sv_SE. With the recent glibc, I've
started getting error messages like
zic: Can't open #: AAtkomst nekas
That is not correct Swedish, it should be spelled with Å (A-ring) as the
first character. This bug doesn't afflict all commands. For example:
iconv: kan inte öppna utfil: Åtkomst nekas
is quite correct.
Warning: what follows is my analysis, it might be completely wrong.
The message catalog appears correct. The problem seems to be commands
that set LC_MESSAGES without setting LC_CTYPE. zic from glibc-2.1.94-3
is one example; less from less-346-2 is another. The problem seems to be
quite common.
I understand why this happens. Since the command doesn't set the LC_CTYPE
category, gettext() will convert the string from the character set in the
message catalog (ISO-8859-1) to the character set of the LC_CTYPE locale,
which is still the default "C" locale.
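To illustrate the mechanism, here is a minimal sketch of the two initialisation patterns. It is not taken from zic or less; the helper names (report_ctype, messages_only_leaves_ctype_in_c, set_messages_and_ctype) are made up for this example. It only uses setlocale() and nl_langinfo(CODESET) to show what the LC_CTYPE category looks like after each pattern.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print the current LC_CTYPE locale name and the
   codeset associated with it. */
static void report_ctype(const char *label)
{
    const char *name = setlocale(LC_CTYPE, NULL);   /* query only */
    assert(name != NULL);
    printf("%s: LC_CTYPE=%s codeset=%s\n", label, name,
           nl_langinfo(CODESET));
}

/* The pattern described in this report: only LC_MESSAGES is taken from
   the environment. LC_CTYPE is not touched, so if nothing else has set
   it, it is still the startup default "C". Returns 1 in that case. */
static int messages_only_leaves_ctype_in_c(void)
{
    setlocale(LC_MESSAGES, "");
    return strcmp(setlocale(LC_CTYPE, NULL), "C") == 0;
}

/* One way to keep the two categories consistent: take both from the
   environment. Returns 1 if both calls succeeded, 0 otherwise (for
   example when the requested locale is not installed). */
static int set_messages_and_ctype(void)
{
    const char *ctype = setlocale(LC_CTYPE, "");
    const char *msgs = setlocale(LC_MESSAGES, "");
    return ctype != NULL && msgs != NULL;
}
```

Calling messages_only_leaves_ctype_in_c() first in a fresh process and then report_ctype() should show LC_CTYPE still at "C", whatever LANG says. After set_messages_and_ctype(), the reported name and codeset depend on the environment and on which locales are installed.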
But I'm not quite sure which part is in error here. Are all programs
setting LC_MESSAGES without setting LC_CTYPE incorrect? Or should
setlocale(LC_MESSAGES, "") imply setlocale(LC_CTYPE, "")? The former seems
more regular, and has some support in the specification. ("If different
character sets are used by the locale categories, the results achieved by
an application utilising these categories are undefined." in
example.) On the other hand, the former would mean defining a lot of
programs as incorrect; and one could argue that using LC_MESSAGES or most
other categories means one will use characters, so LC_CTYPE should also be
set.
(In either case there is a bug in the glibc PACKAGE. Either in the libc
LIBRARY or in the zic PROGRAM, but they both belong in glibc-2.1.94-3.)
I've fixed this in zic and zdump, the fix is in current CVS glibc and
will appear in the next glibc errata.
As for less, less needs to be fixed as well.