Bug 767016

Summary: locales missing from glibc-common
Product: Red Hat Enterprise Linux 6
Reporter: Catalyst Repository Systems <vendors>
Component: glibc
Assignee: Jeff Law <law>
Status: CLOSED NOTABUG
QA Contact: qe-baseos-tools-bugs
Severity: high
Docs Contact:
Priority: unspecified
Version: 6.1
CC: aoliva, fweimer, mfranc
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-02-28 05:26:55 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments:
Description: example file
Flags: none

Description Catalyst Repository Systems 2011-12-13 01:13:39 UTC
Created attachment 546010
example file

Description of problem:
We've been running into an issue trying to decompress SJIS files on our RHEL 6 systems.  On our RHEL 5 boxes they decompress without issue and are automatically converted to UTF-8; on RHEL 6, however, we get gibberish.  This appears to stem from missing locale information in glibc-common.  The version of rar (rar_static) is identical on both machines.

Version-Release number of selected component (if applicable):
glibc-common-2.12-1.25.el6_1.3.x86_64 (current version)

How reproducible:
Extract the example file on RHEL 5 and you should see Japanese characters.  Extract the example file on RHEL 6 and you should get gibberish.

Steps to Reproduce:
1.  Extract files.
  
Actual results:
Document_?????? 645.txt

Expected results:
Document_調達センター 645.txt

Additional info:
This appears to be caused by missing locale information in glibc-common.  Observe this strace from the unrar command:
open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=99158752, ...}) = 0
mmap(NULL, 99158752, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fcd3b568000
close(3)                                = 0
open("/usr/share/locale/locale.alias", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=2512, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fcd3b567000
read(3, "# Locale name alias data base.\n#"..., 4096) = 2512
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0x7fcd3b567000, 4096)            = 0
open("/usr/lib/locale/en_US.UTF-8/LC_CTYPE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en_US.utf8/LC_CTYPE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en_US/LC_CTYPE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en.UTF-8/LC_CTYPE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en.utf8/LC_CTYPE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en/LC_CTYPE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale-langpack/en_US.UTF-8/LC_CTYPE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale-langpack/en_US.utf8/LC_CTYPE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale-langpack/en_US/LC_CTYPE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale-langpack/en.UTF-8/LC_CTYPE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale-langpack/en.utf8/LC_CTYPE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale-langpack/en/LC_CTYPE", O_RDONLY) = -1 ENOENT (No such file or directory)

If I copy the /usr/lib/locale/en_US.utf8 files from a RHEL 5 box and extract them into that directory, the strace output changes to the following:
open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=99158752, ...}) = 0
mmap(NULL, 99158752, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fb34cf9b000
close(3)                                = 0
open("/usr/share/locale/locale.alias", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=2512, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb34cf9a000
read(3, "# Locale name alias data base.\n#"..., 4096) = 2512
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0x7fb34cf9a000, 4096)            = 0
open("/usr/lib/locale/en_US.UTF-8/LC_CTYPE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en_US.utf8/LC_CTYPE", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=238548, ...}) = 0
mmap(NULL, 238548, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fb34cf60000
close(3)                                = 0
open("/usr/lib/gconv/gconv-modules.cache", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/gconv/gconv-modules", O_RDONLY) = -1 ENOENT (No such file or directory)

Once the en_US.utf8 files are in position, everything works correctly.
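
The two traces above differ only in which open() calls succeed, so they can be compared mechanically. The following is a minimal sketch, not part of the original report: the function name classify_opens and the regular expression are illustrative assumptions, and it only understands the plain open("...") form seen in these traces (newer strace versions typically print openat() instead).

```python
import re

# Matches strace lines such as:
#   open("/path", O_RDONLY) = 3
#   open("/path", O_RDONLY) = -1 ENOENT (No such file or directory)
OPEN_RE = re.compile(r'open\("([^"]+)",[^)]*\)\s*=\s*(-?\d+)(?:\s+([A-Z]+))?')


def classify_opens(trace: str):
    """Return (opened, missing) lists of paths found in strace text.

    'missing' only collects calls that failed with ENOENT; other
    failures are ignored in this sketch.
    """
    opened, missing = [], []
    for line in trace.splitlines():
        match = OPEN_RE.search(line)
        if not match:
            continue
        path = match.group(1)
        ret = int(match.group(2))
        errno_name = match.group(3)
        if ret >= 0:
            opened.append(path)
        elif errno_name == "ENOENT":
            missing.append(path)
    return opened, missing


# Three lines copied from the traces in this report.
sample = '''open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
open("/usr/lib/locale/en_US.UTF-8/LC_CTYPE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en_US.utf8/LC_CTYPE", O_RDONLY) = 3
'''
opened, missing = classify_opens(sample)
print("opened:", opened)
print("missing:", missing)
```

Running it over both full traces would show that locale-archive is opened successfully in each case, and that the only difference is whether a per-locale LC_CTYPE file is found afterwards.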

Comment 2 Jeff Law 2011-12-13 07:12:22 UTC
Could you please provide me with the output from running the following command on both your RHEL 5 and RHEL 6 machines?

rpm -q --whatprovides /usr/bin/unrar

If you could also send the output of the "printenv" command, it would be helpful.


Thanks,

Comment 3 Catalyst Repository Systems 2011-12-13 19:24:19 UTC
The RHEL 6 and RHEL 5 machines respond with "unrar-3.9.10-1.el6.rf.x86_64" and "unrar-3.9.10-1.el5.rf.x86_64", respectively.  I have also copied the version from the RHEL 5 machine to the RHEL 6 machine to verify that the same version is in use.  The executable is also identical to rar_static from the rar website.

Comment 4 Catalyst Repository Systems 2011-12-14 22:32:03 UTC
RHEL5:
HOSTNAME=fs09.caseshare.com
TERM=xterm
SHELL=/bin/bash
HISTSIZE=1000
SSH_CLIENT=192.168.12.91 2816 22
SSH_TTY=/dev/pts/0
USER=root
MAIL=/var/spool/mail/root
PATH=/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
INPUTRC=/etc/inputrc
PWD=/root
LANG=en_US.UTF-8
SHLVL=1
HOME=/root
LOGNAME=root
SSH_CONNECTION=192.168.12.91 2816 172.16.3.100 22
LESSOPEN=|/usr/bin/lesspipe.sh %s
DISPLAY=localhost:10.0
G_BROKEN_FILENAMES=1
HISTTIMEFORMAT=%m/%d-%H:%M:%S
_=/usr/bin/printenv

RHEL6:
HOSTNAME=p2fsa.crs-tokyo.co.jp
TERM=xterm
SHELL=/bin/bash
HISTSIZE=1000
SSH_CLIENT=192.168.64.250 34539 22
SSH_TTY=/dev/pts/0
USER=root
MAIL=/var/spool/mail/root
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
PWD=/root
LANG=en_US.UTF-8
HISTCONTROL=ignoredups
SHLVL=1
HOME=/root
LOGNAME=root
SSH_CONNECTION=192.168.64.250 34539 192.168.68.50 22
LESSOPEN=|/usr/bin/lesspipe.sh %s
G_BROKEN_FILENAMES=1
_=/usr/bin/printenv

Comment 5 Alexandre Oliva 2012-02-03 20:26:52 UTC
There's something odd here.  AFAICT, RPMFusion's unrar does not ship rar_static, it builds a dynamic binary from source.  Could it be that someone replaced unrar's /usr/bin/unrar with rar_static?

Can you please run the following commands and post their output here?

rpm --verify unrar
sha1sum /usr/bin/unrar
file /usr/bin/unrar
/usr/bin/unrar 2>&1 | head

Comment 6 Catalyst Repository Systems 2012-02-03 21:03:01 UTC
We're not using the version from RPMFusion, we're using the official binary from the rarsoft website.

Comment 7 Alexandre Oliva 2012-02-11 05:25:55 UTC
I don't see where these RPMs can be downloaded on rarsoft's website.  Can you please supply the output of the commands I posted so that I can check whether they match the binaries I found?  Thanks in advance.

Comment 8 Catalyst Repository Systems 2012-02-13 17:40:14 UTC
[root@fsa ~]# rpm --verify unrar
package unrar is not installed
[root@fsa ~]# rpm --verify rar
[root@fsa ~]# sha1sum /usr/bin/unrar
465b7d98e1a00dd6788364e3993455ede39c760b  /usr/bin/unrar
[root@fsa ~]# file /usr/bin/unrar
/usr/bin/unrar: symbolic link to `/usr/bin/rar'
[root@fsa ~]# /usr/bin/unrar 2>&1 | head

RAR 3.90   Copyright (c) 1993-2009 Alexander Roshal   16 Aug 2009
Shareware version         Type RAR -? for help

Usage:     rar <command> -<switch 1> -<switch N> <archive> <files...>
               <@listfiles...> <path_to_extract\>

<Commands>
  a             Add files to archive
  c             Add archive comment
[root@fsa ~]#

Comment 9 Alexandre Oliva 2012-02-28 03:12:59 UTC
This sha1sum does not match any of the executables in rarlinux-3.9.0.tar.gz or rarlinux-x64-3.9.0.tar.gz.  I tried other versions as well, to no avail.  Unless you got these binaries from a different source than http://www.rarlab.com/rar/rarlinux-{,x64-}3.9.{0,1,2,3}.tar.gz, this tells me a few things:

- comment 3 says the binary we're talking about is rar_static, but this seems unlikely, for the sha1sum doesn't match, and it is unlikely that the binary got corrupted on both machines

- if it's rar rather than rar_static, it may have been modified by prelinking.  Please run:

prelink -y --sha /usr/bin/rar

so that it prints the original sha1sum of the rar binary, and I can make sure I'm testing the same binary you are.
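
The digest comparison described above can be sketched as follows. This is only an illustration: the helper names sha1_of and matches_any are assumptions, and the list of known-good digests would have to be computed from the tarballs actually downloaded. Note that it hashes the file as it is on disk, so a prelinked binary will not match its upstream digest; that is what the prelink command above is meant to work around.

```python
import hashlib

# Digest reported for /usr/bin/unrar in comment 8.
REPORTED = "465b7d98e1a00dd6788364e3993455ede39c760b"


def sha1_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-1 of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_any(path: str, known_digests) -> bool:
    """True if the file's SHA-1 equals one of the known digests."""
    return sha1_of(path) in {d.lower() for d in known_digests}
```

For example, matches_any("/usr/bin/rar", digests_from_tarballs) would return False when the installed binary is not byte-identical to any extracted upstream binary (digests_from_tarballs being a hypothetical list built with sha1_of).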

Comment 10 Alexandre Oliva 2012-02-28 05:05:45 UTC
The reason I insisted on locating the exact binary is that I've been unable to duplicate the problem using rarlinux-x64-3.9.0.tar.gz's rar, on a 6.1 box with glibc updated to the indicated version.

I know it's an x86_64 binary we're talking about because the straces have 64-bit addresses.  I concluded it was a dynamically linked binary because it didn't match the sha1sums of the static binaries downloaded from the site, and the difference could be explained by prelinking.

The reason the locale files are not necessary is that they're all in /usr/lib/locale/locale-archive.  What puzzles me is that at least the file size (as indicated by fstat in the strace output) is correct: 99158752.  Since the locale information for en_US.UTF-8 is present in the archive, my runs don't even look for locale.alias or LC_CTYPE; they proceed to looking for /usr/lib64/gconv/gconv-modules.cache.
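
Whether a given locale is present in the archive can be checked from the output of glibc's `localedef --list-archive`. The sketch below is an illustration under stated assumptions: the helper names are made up, and normalize_locale is a simplified version of the codeset normalization visible in the strace output (en_US.UTF-8 is retried as en_US.utf8), not glibc's exact algorithm.

```python
import subprocess


def normalize_locale(name: str) -> str:
    """Lowercase the codeset and drop punctuation: en_US.UTF-8 -> en_US.utf8.

    Simplified on purpose; a trailing @modifier is kept unchanged.
    """
    rest, at, modifier = name.partition("@")
    base, dot, codeset = rest.partition(".")
    if dot:
        codeset = "".join(ch for ch in codeset if ch.isalnum()).lower()
    return base + (dot + codeset if dot else "") + at + modifier


def archive_has_locale(listing: str, name: str) -> bool:
    """True if `name` appears in a `localedef --list-archive` listing."""
    wanted = normalize_locale(name)
    return any(
        normalize_locale(line.strip()) == wanted
        for line in listing.splitlines()
        if line.strip()
    )


def list_archive() -> str:
    """Run `localedef --list-archive`; needs glibc's localedef on PATH."""
    result = subprocess.run(
        ["localedef", "--list-archive"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On an affected machine, archive_has_locale(list_archive(), "en_US.UTF-8") returning True would confirm that the locale data is installed and that the lookup failure lies in the program reading the archive.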

Now, if I *do* use the just-downloaded x86_64 3.9.0 static binary, then I do observe the problem: AFAICT, the binary is statically linked to libc locale-handling code that is incompatible with the locale archive format.  This is not guaranteed to work in general; that's why static programs are strongly discouraged by glibc.  In this case, the program happens to work, but it might as well not have worked at all, if it is indeed the static program that you're using.

I strongly recommend switching to the dynamically-linked version of the program.  This should fix the problem without the need for copying files from earlier versions of the operating system that by chance happen to be compatible with the unrelated glibc code that ships included in the rar_static binary.

In theory, it should be possible to combine code specific to rar_static with alternate versions of GNU libc code and have such a static binary work, but it doesn't look like rarlab.com publishes object files that could be linked with whatever versions of GNU libc you like.
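
Since the diagnosis hinges on whether the binary is statically or dynamically linked, a direct check may be more reliable than comparing checksums. The following is a rough sketch, not something used in this investigation: it treats the presence of a PT_INTERP program header (a request for the dynamic loader) as the sign of a dynamically linked executable. The function names are illustrative, and the heuristic is an assumption that does not cover every case.

```python
import struct

PT_INTERP = 3  # program header type that names the dynamic loader


def has_interpreter(data: bytes) -> bool:
    """Return True if the ELF image contains a PT_INTERP program header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"   # EI_DATA: 1 = little-endian
    if is_64:
        (e_phoff,) = struct.unpack_from(endian + "Q", data, 0x20)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x36)
    else:
        (e_phoff,) = struct.unpack_from(endian + "I", data, 0x1C)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x2A)
    for index in range(e_phnum):
        # p_type is the first 32-bit field of every program header.
        (p_type,) = struct.unpack_from(
            endian + "I", data, e_phoff + index * e_phentsize
        )
        if p_type == PT_INTERP:
            return True
    return False


def is_dynamically_linked(path: str) -> bool:
    """Heuristic: a dynamically linked executable requests an interpreter."""
    with open(path, "rb") as fh:
        return has_interpreter(fh.read())
```

If is_dynamically_linked("/usr/bin/rar") returns False, the binary is most likely rar_static and carries its own copy of the libc locale code.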

Comment 13 Jeff Law 2012-02-28 05:26:55 UTC
Given the comments in c#10, this is NOTABUG.

The static unrar binary (by virtue of linking libc statically) is carrying around a copy of glibc's old locale support code, which is incompatible with the locale archive database supplied in RHEL 6.

As Alex mentioned, static linking is strongly discouraged for this and a variety of other reasons.