Bug 43564

Summary: /bin/sort is sorting by case-folded alphabetic order!
Product: [Retired] Red Hat Linux
Reporter: Dan Reish <dreish>
Component: textutils
Assignee: Bernhard Rosenkraenzer <bero>
Status: CLOSED NOTABUG
QA Contact: David Lawrence <dkl>
Severity: high
Priority: high
Version: 7.1
CC: stano
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2001-06-26 14:49:09 UTC

Description Dan Reish 2001-06-05 15:40:44 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.76 [en] (X11; U; Linux 2.4.2-2 i686)

Description of problem:
When run without any options, /bin/sort from the textutils-2.0.11-7 RPM
orders lines by their alphabetic characters only, ignoring case and
punctuation.  This is terrible!  /bin/sort is a basic and essential
piece of Unix plumbing.  I don't feel safe using any Unix which doesn't
have a properly working /bin/sort installed.

How reproducible:
Always

Steps to Reproduce:
1. Pipe the following text into /bin/sort (with no options)

/foo/Baz
/foo/bar
/foobaa


Actual Results:

/foobaa
/foo/bar
/foo/Baz


Expected Results:

/foo/Baz
/foo/bar
/foobaa


Additional info:

I'm putting this down as "high" severity, because somebody could lose their
job for recommending Red Hat Linux with a basic utility bug like this. 
Shell scripts which use sort are silently screwing up data all around
the world as we sit here.

Comment 1 Dan Reish 2001-06-05 15:46:19 UTC
Okay, so this is caused by LC_ALL not being set to POSIX in .bashrc, but that's
still a severe bug.
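
For anyone reproducing this, a minimal sketch of the byte-order behaviour the report expects, assuming a POSIX shell and a sort that honours LC_ALL. The function name sort_bytes is made up for illustration and is not part of textutils:

```shell
# Sort stdin by raw byte value, regardless of the caller's locale.
# sort_bytes is a hypothetical helper name used only for this example.
sort_bytes() {
    # LC_ALL overrides LC_COLLATE and LANG for this one command only.
    LC_ALL=C sort
}

# Feed in the three lines from the report, in the "actual results" order.
printf '%s\n' /foobaa /foo/bar /foo/Baz | sort_bytes
```

In the C locale '/' (0x2F) sorts before 'B' (0x42), which sorts before 'b' (0x62), so this prints the lines in the order given under "Expected Results". Plain `sort` in a non-C locale may print them differently, depending on which locale is active.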



Comment 2 Dan Reish 2001-06-05 16:15:41 UTC
Read through http://mail.gnu.org/pipermail/bug-textutils/ to see the bullshit
that various GNU volunteers have had to deal with because of this bug (going
back to at least October 1999).  I'm sure it's just an honest mistake, but it's
hard not to get angry at Red Hat about something like this, especially given
that the bug has existed for such a long time.


Comment 3 stano 2001-06-26 14:49:06 UTC
There is no point in getting angry. This is not a mistake or a bug; sort 
works as advertised - see the (texinfo) docs:
  Unless otherwise specified, all comparisons use the character
  collating sequence specified by the `LC_COLLATE' locale.

The way the sorting works is defined by the locale. Being the author of the 
Slovak locale in glibc (with the collating part copied from the Czech one), I 
know these issues quite well. We use one of the fancier locales, and our 
sorting standard is actually unimplementable (it even requires knowledge of 
history :-)).

Sort is a _text_ sorting utility and it should sort exactly how the locale 
prescribes. Shell scripts that are screwing up data because of this are broken, 
and the collating order is the lesser problem - e.g. not resetting LC_NUMERIC or 
grepping for some strings in output can be even worse. There are many hidden 
gotchas like this - e.g. [A-Z]* will match foo in some locales.
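
A rough sketch of how a script might guard against these gotchas, assuming it only handles ASCII data and does not need localized output. The function name starts_with_upper is invented for the example:

```shell
#!/bin/sh
# Pin every locale category (LC_COLLATE, LC_NUMERIC, LC_MESSAGES, ...)
# to the C locale for this script and all commands it runs.
LC_ALL=C
export LC_ALL

# Hypothetical helper: succeed only if the argument starts with A-Z.
# In the C locale, [A-Z] is the byte range of the ASCII uppercase letters.
starts_with_upper() {
    case "$1" in
        [A-Z]*) return 0 ;;
        *)      return 1 ;;
    esac
}

if starts_with_upper foo; then
    echo "foo matched [A-Z]*"
else
    echo "foo did not match [A-Z]*"
fi
```

With the locale pinned this way the pattern should not match foo. The trade-off is that the script gives up locale-aware collation and messages entirely, which is the wrong choice for anything that presents sorted text to users.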

Comment 4 Bernhard Rosenkraenzer 2002-01-22 14:56:11 UTC
If you don't like it,

echo "LC_COLLATE=C" >>/etc/sysconfig/i18n