From Bugzilla Helper:
User-Agent: Mozilla/4.76 [en] (X11; U; Linux 2.4.2-2 i686)

Description of problem:
/bin/sort from the textutils-2.0.11-7 RPM without any options sorts according to the alphabetic order of only the alphabetic characters in the lines given to it. This is terrible! /bin/sort is a basic and essential piece of Unix plumbing. I don't feel safe using any Unix which doesn't have a properly working /bin/sort installed.

How reproducible:
Always

Steps to Reproduce:
1. Pipe the following text into /bin/sort (with no options)

/foo/Baz
/foo/bar
/foobaa

Actual Results:
/foobaa
/foo/bar
/foo/Baz

Expected Results:
/foo/Baz
/foo/bar
/foobaa

Additional info:
I'm putting this down as "high" severity, because somebody could lose their job for recommending Red Hat Linux with a basic utility bug like this. Shell scripts which use sort are silently screwing up data all around the world as we sit here.
Okay, so this is caused by LC_ALL not being set to POSIX in .bashrc, but that's still a severe bug.
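For anyone who wants to check this without editing .bashrc, the locale can be overridden for a single command. A minimal sketch, assuming a POSIX shell; the function names sample_input and sort_bytewise are made up for illustration and are not part of textutils:

```shell
#!/bin/sh
# Hypothetical helper: emit the three lines from the report, one per line.
sample_input() {
    printf '%s\n' /foo/Baz /foo/bar /foobaa
}

# Hypothetical helper: sort stdin with the locale forced to C (equivalent
# to POSIX) for this one command only, so comparison is byte by byte.
sort_bytewise() {
    LC_ALL=C sort
}

# '/' (0x2F) sorts before letters and 'B' (0x42) before 'b' (0x62),
# so this prints /foo/Baz, /foo/bar, /foobaa in that order.
sample_input | sort_bytewise
```

The override applies only to that one sort invocation; the rest of the session keeps whatever locale it had.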
Read through http://mail.gnu.org/pipermail/bug-textutils/ to see the bullshit that various GNU volunteers have had to deal with because of this bug (going back to at least October 1999). I'm sure it's just an honest mistake, but it's hard not to get angry at Red Hat about something like this, especially given that the bug has existed for such a long time.
There is no point in getting angry. This is not a mistake or a bug; sort works as advertised. See the (texinfo) docs:

    Unless otherwise specified, all comparisons use the character collating sequence specified by the `LC_COLLATE' locale.

The way the sorting works is defined by the locale. Being the author of the Slovak locale in glibc (with the collating part copied from the Czech one), I know these issues quite well. We use one of the fancier locales, and our sorting standard is actually unimplementable (it even requires knowledge of history :-)). Sort is a _text_ sorting utility and it should sort exactly as the locale prescribes.

Shell scripts that corrupt data because of this are broken, and the collating order is the lesser problem: e.g. not resetting LC_NUMERIC, or grepping for particular strings in the output, can be even worse. There are many hidden gotchas like this, e.g. [A-Z]* will match foo in some locales.
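One way a script can protect itself from the gotchas listed above is to pin its own locale at the top and avoid locale-dependent bracket ranges. This is only a sketch under the assumption of a POSIX shell; starts_with_upper is a hypothetical helper invented for the example:

```shell
#!/bin/sh
# Pin the locale for the whole script so collation, number formatting and
# the text of tool messages do not depend on the caller's environment.
LC_ALL=C
export LC_ALL

# Hypothetical helper: succeeds if the argument begins with an uppercase
# letter. [[:upper:]] is a POSIX character class; unlike the range [A-Z],
# its members do not depend on the collating order.
starts_with_upper() {
    case $1 in
        [[:upper:]]*) return 0 ;;
        *) return 1 ;;
    esac
}

if starts_with_upper foo; then echo "foo: match"; else echo "foo: no match"; fi
if starts_with_upper Foo; then echo "Foo: match"; else echo "Foo: no match"; fi
```

Setting LC_ALL overrides LC_COLLATE, LC_NUMERIC and the other categories at once, which is why the sketch uses it rather than resetting each category separately.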
If you don't like it, echo "LC_COLLATE=C" >>/etc/sysconfig/i18n