Bug 103539 - (sort) sort doesn't work w/ en_US locales
Status: CLOSED NOTABUG
Product: Red Hat Linux
Classification: Retired
Component: coreutils
Version: 9
Hardware: i686 Linux
Priority: medium
Severity: medium
Assigned To: Tim Waugh
Duplicates: 126131
Reported: 2003-09-01 21:42 EDT by sidlon
Modified: 2007-04-18 12:57 EDT
Doc Type: Bug Fix
Last Closed: 2003-09-02 04:53:43 EDT
Description sidlon 2003-09-01 21:42:27 EDT
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows ME) Opera 7.20  [en]

Description of problem:
  I submitted bug #10117 in March 2000 concerning Red Hat 6.1, and it seems 
nothing has changed with 9.0 (w/ default locale: en_US.utf8, or en_US.iso885915, 
or just en_US).  I understand that things sort properly w/ LC_ALL set to POSIX.  
But after searching extensively in Bugzilla and on Usenet, I still haven't seen a 
good explanation of why any locale at all (let alone en_US) would sort this 
input the way it does:

3.456
34.500
345.600

Can someone use this space to explain to me and others w/ the same question why 
this sort order makes sense for any en_US locale, and why Red Hat has allowed 
values to sort this badly for at least 3 years?  I suspect a good number of 
professionals blindly assume (as I did) that sort is dependable on a standard 
Red Hat install.  Should this bug be filed somewhere else?


Version-Release number of selected component (if applicable):
coreutils-4.5.3-19

How reproducible:
Always

Steps to Reproduce:
1. echo 3.456 > /tmp/foo; echo 34.500 >> /tmp/foo;
2. echo 345.600 >> /tmp/foo
3. LC_ALL=en_US.utf8 sort /tmp/foo
    

Actual Results:
34.500
3.456
345.600

Expected Results:
3.456
34.500
345.600

Additional info:

$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
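
For reference, the expected order can be obtained on the same input either by forcing the POSIX (C) locale, as mentioned in the description, or by asking sort for a numeric comparison. This is a minimal sketch, assuming a POSIX shell and a sort that supports -n (GNU coreutils does); the temporary file and the variable names are only illustrative.

```shell
# Recreate the three-line input from the steps to reproduce.
f=$(mktemp)
printf '%s\n' 3.456 34.500 345.600 > "$f"

# Byte-wise comparison in the C/POSIX locale: '.' (0x2E) sorts before
# the digits (0x30-0x39), so "3.456" comes before "34.500".
posix_order=$(LC_ALL=C sort "$f")

# Numeric comparison: each line is compared by value, not as a string.
# The decimal point comes from the locale; it is '.' in both C and en_US.
numeric_order=$(sort -n "$f")

printf '%s\n' "$posix_order"
rm -f "$f"
```

Both commands give 3.456, 34.500, 345.600 for this input. Setting LC_ALL=C on the single command leaves the rest of the session's locale untouched.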
Comment 1 Tim Waugh 2003-09-02 04:53:43 EDT
ISO 14651, which is the sorting standard, specifies this behaviour.  You can
also find some information in the strcoll documentation.  Punctuation is ignored
in your example.

IEEE Std 1003.1, 2003 Edition, says that the sorting "shall be performed
using the collating sequence of the current locale".
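
The effect of ignoring punctuation can be reproduced without an en_US locale installed. The sketch below is only an approximation of the behaviour described above, not how strcoll or the locale tables are implemented: it removes the '.' from each line to build a comparison key, sorts byte-wise on that key, and prints the original lines. With the periods gone the keys are 3456, 34500 and 345600, and as strings 34500 sorts before 3456 because '0' precedes '6' at the fourth character. The variable name is hypothetical.

```shell
# Approximate "punctuation is ignored": sort on a key with '.' removed.
# Illustration only; real locales apply multi-level collation rules.
ignored_order=$(
  printf '%s\n' 3.456 34.500 345.600 |
  awk '{ key = $0; gsub(/\./, "", key); print key "\t" $0 }' |
  LC_ALL=C sort -t "$(printf '\t')" -k1,1 |
  cut -f2
)
printf '%s\n' "$ignored_order"
```

This prints 34.500, 3.456, 345.600, the same order as the actual results in the report.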
Comment 2 Tim Waugh 2004-06-25 07:02:07 EDT
*** Bug 126131 has been marked as a duplicate of this bug. ***
