Bug 19942

Summary: LANG=en_US (the default) appears to fold lowercase to uppercase before sorting
Product: [Retired] Red Hat Linux
Reporter: Ronald Cole <ronald>
Component: glibc
Assignee: Jakub Jelinek <jakub>
Status: CLOSED NOTABUG
QA Contact: Aaron Brown <abrown>
Severity: high
Docs Contact:
Priority: medium
Version: 7.0
CC: dr, fweimer
Target Milestone: ---
Target Release: ---
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Last Closed: 2000-10-27 22:56:02 UTC

Description Ronald Cole 2000-10-27 21:02:51 UTC
$ LANG= bash -c 'echo -e "a\nB" | sort'
B
a
$ LANG=en bash -c 'echo -e "a\nB" | sort'
B
a
$ LANG=en_US bash -c 'echo -e "a\nB" | sort'
a
B

!!!oops, that can't be right!!!  The same commands run on AIX and HPUX all produce
identical output (as I would expect no difference between "en" and "en_US" in this
regard).
The GNU C Library Reference Manual says that "[d]efining and installing named locales
[other than "C" or "POSIX"] is normally a responsibility of ... the person who
installed the GNU C library".  I guess that would fall on Red Hat for installing a
broken locale and making it the default.

Comment 1 Jakub Jelinek 2000-10-28 06:17:00 UTC
Actually, it is right. Open any printed dictionary (be it English,
German, Norwegian or Czech) and see how the entries are sorted.
The fact that sorting has been broken on most OSes
does not change that. There is no such locale as en, so
it falls back to C; that's why the output of the first two commands is identical.
If you rely on ASCII sorting, use the C locale; if you want native-language
collation, use your own locale.
If AIX and HPUX don't fold case, they are broken.
E.g. Solaris with the en_US locale sorts the same way as RHL 7.0.
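
For scripts that depend on ASCII ordering, the advice above can be applied per command rather than by changing the login default. This is a minimal sketch, not part of the original comment; the variable name c_sorted is only illustrative. LC_ALL is used because it takes precedence over both LC_COLLATE and LANG.

```shell
# Force byte-order (ASCII) collation for this one pipeline only,
# whatever LANG, LC_COLLATE or LC_ALL are set to in the calling shell.
# "c_sorted" is a hypothetical variable name used for illustration.
c_sorted=$(printf 'a\nB\n' | LC_ALL=C sort)

# In the C locale uppercase letters sort before lowercase ones,
# so "B" is printed before "a".
printf '%s\n' "$c_sorted"
```

To keep en_US for messages and formats while getting C ordering everywhere, setting only LC_COLLATE=C in the shell startup file should have a similar effect, provided LC_ALL is not set.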

Comment 2 Ronald Cole 2000-10-28 22:13:16 UTC
Well, then the bug is Red Hat defaulting to LANG=en_US.  It should probably
default to either "C" or "POSIX", and the user should change it to "en_US" if
that's what they want.  According to the GNU C Library Reference Manual, "C"
and "POSIX" are the only ones that can be considered "portable", as all others
are obviously vendor-supplied and therefore extensions.

Comment 3 Ronald Cole 2000-10-28 23:15:59 UTC
I have entered bug #19973 against package "initscripts".