Bug 144541 - /bin/sort does not sort correctly in "en_US.UTF-8" locale
Status: CLOSED NOTABUG
Product: Fedora
Classification: Fedora
Component: coreutils
Version: 3
Hardware: i386 Linux
Priority: medium
Severity: high
Assigned To: Tim Waugh
Duplicates: 206456
Reported: 2005-01-07 18:37 EST by Valeriy Ovechkin
Modified: 2007-11-30 17:10 EST
Last Closed: 2005-01-10 03:57:43 EST
Description Valeriy Ovechkin 2005-01-07 18:37:03 EST
Description of problem:
It appears that /bin/sort makes a bizarre assumption that input lines 
are filenames, and ignores leading dots. As a result, the output 
strings are NOT sorted!

Version-Release number of selected component (if applicable):
sort (coreutils) 4.5.3
Written by Mike Haertel and Paul Eggert.
Copyright (C) 2002 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

How reproducible:
reliably

Steps to Reproduce:
Execute the following command:
echo 'a
.b
c' | sort

Actual results:
a
.b
c

Expected results:
.b
a
c

Additional info:
I tried same test-case on Solaris, HP-UX, AIX, FreeBSD -- and it 
worked correctly everywhere (but Linux)! WTF?
Comment 1 Valeriy Ovechkin 2005-01-07 19:06:19 EST
Found out something interesting. The bug description applies 
to "en_US.UTF-8" locale (LANG env variable).

The following command:
echo 'a
.b
c' | LANG="C" sort

Yields correct results:
.b
a
c

Then, is string comparison in "en_US.UTF-8" locale messed up?
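
One way to check which collation locale sort is picking up is the POSIX locale utility. A minimal sketch, where collation_locale is a hypothetical helper name introduced only for illustration:

```shell
# Hypothetical helper: print the effective LC_COLLATE value, which sort uses for ordering.
# `locale` prints inherited values in double quotes, so strip them if present.
collation_locale() {
    locale | sed -n 's/^LC_COLLATE=//p' | tr -d '"'
}

collation_locale
```

If this prints en_US.UTF-8, sort compares lines using that locale's collation rules rather than byte values.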
Comment 2 Valeriy Ovechkin 2005-01-09 12:10:10 EST
Updated the Summary field to mention the locale.
Comment 3 Tim Waugh 2005-01-10 03:57:43 EST
ISO 14651, which is the sorting standard, specifies this behaviour.  You can
also find some information in the strcoll documentation.  Punctuation is ignored
in your example.

In IEEE Std 1003.1, 2003 Edition, it says that the sorting "shall be performed
using the collating sequence of the current locale".

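If you want ASCII sorting, set LC_COLLATE=C (or LC_ALL=C, which overrides every
locale category). Collation is controlled by LC_COLLATE, not LC_CTYPE.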
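
A minimal sketch of such an override, assuming a POSIX shell. The name sort_bytewise is a hypothetical helper, not part of coreutils. It uses LC_ALL=C, which sets every locale category (collation included) and takes precedence over LANG, so the result does not depend on the caller's environment:

```shell
# Hypothetical helper: sort stdin in byte order regardless of the caller's locale.
# LC_ALL=C applies to this one sort invocation only.
sort_bytewise() {
    LC_ALL=C sort "$@"
}

# The test case from the description; '.' (0x2E) sorts before 'a' (0x61).
printf 'a\n.b\nc\n' | sort_bytewise
```

Because the variable is set only on the sort command line, the rest of the session keeps its en_US.UTF-8 settings.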
Comment 4 Tim Waugh 2006-09-26 12:07:44 EDT
*** Bug 206456 has been marked as a duplicate of this bug. ***
