Red Hat Bugzilla – Bug 144541
/bin/sort does not sort correctly in "en_US.UTF-8" locale
Last modified: 2007-11-30 17:10:57 EST
Description of problem:
It appears that /bin/sort makes a bizarre assumption that input lines
are filenames, and ignores leading dots. As a result, the output
strings are NOT sorted!
Version-Release number of selected component (if applicable):
sort (coreutils) 4.5.3
Written by Mike Haertel and Paul Eggert.
Copyright (C) 2002 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Steps to Reproduce:
Execute the following command:
c' | sort
I tried the same test case on Solaris, HP-UX, AIX, and FreeBSD, and it
worked correctly everywhere except Linux. WTF?
Found out something interesting. The behavior described above occurs in
the "en_US.UTF-8" locale (set through the LANG environment variable).
The following command:
c' | LANG="C" sort
Yields correct results:
So is string comparison in the "en_US.UTF-8" locale messed up?
Updated the Summary field to mention the locale.
ISO 14651, which is the sorting standard, specifies this behaviour. You can
also find some information in the strcoll documentation. Punctuation is ignored
in your example.
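
The difference between the two collations can be checked side by side. This is only a sketch: the original test case is truncated in this report, so the three input lines below are a hypothetical stand-in, and the helper name `sample` is made up for illustration. The locale-aware ordering is not shown as expected output because it depends on the locale data shipped with the installed C library.

```shell
# Hypothetical input (not the reporter's original test case):
# one line starts with a dot, the others with letters.
sample() {
    printf '%s\n' b .c a
}

# Byte-order comparison: "." (0x2E) sorts before every ASCII letter,
# so ".c" comes first.
sample | LC_ALL=C sort

# Locale-aware comparison. Run only if the locale is installed;
# where ".c" ends up depends on the locale's collation rules.
if locale -a 2>/dev/null | grep -qiE '^en_US\.utf-?8$'; then
    sample | LC_ALL=en_US.UTF-8 sort
fi
```

If the second command places ".c" somewhere other than first, the locale's collation rules, not a bug in sort, are responsible for the difference.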
In IEEE Std 1003.1, 2003 Edition, it says that the sorting "shall be performed
using the collating sequence of the current locale".
If you want ASCII sorting, set LC_COLLATE=C (or LC_ALL=C, which overrides it).
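
Collation order is governed by the LC_COLLATE locale category, and a non-empty LC_ALL overrides every individual category. A minimal sketch of forcing byte-order sorting regardless of the caller's environment follows; the wrapper name `sort_bytes` and the sample input are assumptions made for illustration, not part of coreutils or of this report.

```shell
# Hypothetical wrapper: always sort in byte order, whatever
# LANG, LC_COLLATE, or LC_ALL the caller has exported.
sort_bytes() {
    LC_ALL=C sort "$@"
}

# Prints ".c", "a", "b" (one per line).
printf '%s\n' b .c a | sort_bytes

# LC_COLLATE=C alone is enough only when LC_ALL is unset or empty,
# so LC_ALL is cleared here for this one command.
printf '%s\n' b .c a | LC_ALL= LC_COLLATE=C sort
```

Setting LC_ALL=C inside the wrapper is the blunt option: it also switches character classification and messages to the C locale. Setting only LC_COLLATE keeps the rest of the locale intact but is silently ignored if LC_ALL is set.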
*** Bug 206456 has been marked as a duplicate of this bug. ***