Bug 43564 - /bin/sort is sorting by case-folded alphabetic order!
Summary: /bin/sort is sorting by case-folded alphabetic order!
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: textutils
Version: 7.1
Hardware: i686
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Bernhard Rosenkraenzer
QA Contact: David Lawrence
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2001-06-05 15:40 UTC by Dan Reish
Modified: 2005-10-31 22:00 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2001-06-26 14:49:09 UTC
Embargoed:



Description Dan Reish 2001-06-05 15:40:44 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.76 [en] (X11; U; Linux 2.4.2-2 i686)

Description of problem:
/bin/sort from the textutils-2.0.11-7 RPM, run without any options, sorts
lines by the alphabetic order of their alphabetic characters alone, ignoring
case and punctuation.  This is terrible!  /bin/sort is a basic and essential
piece of Unix plumbing.  I don't feel safe using any Unix which doesn't
have a properly working /bin/sort installed.

How reproducible:
Always

Steps to Reproduce:
1. Pipe the following text into /bin/sort (with no options)

/foo/Baz
/foo/bar
/foobaa


Actual Results:
/foobaa
/foo/bar
/foo/Baz


Expected Results:
/foo/Baz
/foo/bar
/foobaa


Additional info:

I'm putting this down as "high" severity, because somebody could lose their
job for recommending Red Hat Linux with a basic utility bug like this. 
Shell scripts which use sort are silently screwing up data all around
the world as we sit here.

Comment 1 Dan Reish 2001-06-05 15:46:19 UTC
Okay, so this is caused by LC_ALL not being set to POSIX in .bashrc, but that's
still a severe bug.
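
For reference, a minimal sketch of the per-command workaround, using the
three lines from the report.  It assumes nothing beyond a POSIX shell and
sort; the variable name "sorted" is only for illustration.  Setting LC_ALL=C
for the one invocation makes sort compare raw byte values:

```shell
# Force byte-order (C/POSIX) collation for this one sort invocation only.
sorted=$(printf '/foo/Baz\n/foo/bar\n/foobaa\n' | LC_ALL=C sort)
printf '%s\n' "$sorted"
# Byte values decide the order: '/' (0x2F) sorts before 'b' (0x62),
# and 'B' (0x42) sorts before 'b' (0x62).
```

With the C locale this reproduces the order listed under Expected Results.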



Comment 2 Dan Reish 2001-06-05 16:15:41 UTC
Read through http://mail.gnu.org/pipermail/bug-textutils/ to see the bullshit
that various GNU volunteers have had to deal with because of this bug (going
back to at least October 1999).  I'm sure it's just an honest mistake, but it's
hard not to get angry at Red Hat about something like this, especially given
that the bug has existed for such a long time.


Comment 3 stano 2001-06-26 14:49:06 UTC
There is no point in getting angry. This is not a mistake or a bug; sort 
works as advertised. See the (texinfo) docs:
  Unless otherwise specified, all comparisons use the character
  collating sequence specified by the `LC_COLLATE' locale.

The way sorting works is defined by the locale. Being the author of the Slovak 
locale in glibc (with the collating part copied from the Czech one), I know 
these issues quite well. We use one of the fancier locales, and our sorting 
standard is actually unimplementable (it even requires knowledge of 
history :-)).

Sort is a _text_ sorting utility and it should sort exactly as the locale 
prescribes. Shell scripts that corrupt data because of this are broken, and 
the collating order is the lesser problem: e.g. not resetting LC_NUMERIC or 
grepping for some strings in output can be even worse. There are many hidden 
gotchas like this, e.g. [A-Z]* will match foo in some locales.
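
One way a script can guard against these gotchas, as a rough sketch rather 
than a complete recipe: pin the locale once at the top, and use a POSIX 
character class instead of a range. The helper name starts_upper is made up 
for this example.

```shell
#!/bin/sh
# Pin the locale once so every later command in the script collates,
# matches ranges, and formats numbers the C/POSIX way.
LC_ALL=C
export LC_ALL

# starts_upper is a hypothetical helper name used only for this sketch.
# [[:upper:]] is a POSIX character class; unlike the range [A-Z], it does
# not depend on where lowercase letters fall in the collation order.
starts_upper() {
    case $1 in
        [[:upper:]]*) return 0 ;;
        *)            return 1 ;;
    esac
}

starts_upper Foo && echo "Foo: starts with an uppercase letter"
starts_upper foo || echo "foo: does not"
```

Exporting LC_ALL matters here: without the export, child processes such as 
sort and grep would still see the locale inherited from the caller.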

Comment 4 Bernhard Rosenkraenzer 2002-01-22 14:56:11 UTC
If you don't like it,

echo "LC_COLLATE=C" >>/etc/sysconfig/i18n
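
A caveat worth checking, sketched here with the input from the original
report: LC_COLLATE is only consulted when LC_ALL is unset or empty, so the
line above has no effect in an environment that also sets LC_ALL.  The
variable names below are only for illustration.

```shell
# Input from the report, already in C (byte) order.
input=$(printf '/foo/Baz\n/foo/bar\n/foobaa')

# With LC_ALL empty, LC_COLLATE=C decides the collation order.
by_collate=$(printf '%s\n' "$input" | LC_ALL= LC_COLLATE=C sort)
printf '%s\n' "$by_collate"

# Show the locale settings the current shell actually has in effect.
locale
```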


