Bug 485715

Summary: sort -b --key 1.2,1.4M fails; sort --key 1.2b,1.4M works.
Product: [Fedora] Fedora
Reporter: archimerged Ark submedes <archimerged>
Component: coreutils
Assignee: Ondrej Vasik <ovasik>
Status: CLOSED NEXTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low
Priority: low
Version: 9
CC: kdudka, ovasik, redhat, twaugh
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Fixed In Version: 6.12-20.fc10
Doc Type: Bug Fix
Last Closed: 2009-02-28 03:22:03 UTC

Description archimerged Ark submedes 2009-02-16 15:16:27 UTC
Description of problem:

In sort(1), when the M flag is used on a key, the global -b flag is ignored; b must instead be specified on the first POS of the key.

Version-Release number of selected component (if applicable):

coreutils-6.10-33.fc9.i386

How reproducible:

every time

Steps to Reproduce:

$ printf -- '-Jan-\n -Mar-\n  -Feb-\n' | sort -b --key 1.2,1.4M
  -Feb-
 -Mar-
-Jan-
$ printf -- '-Jan-\n -Mar-\n  -Feb-\n' | sort --key 1.2b,1.4M
-Jan-
  -Feb-
 -Mar-
$ printf -- '-Jan-\n -Mar-\n  -Feb-\n' | sort --key 1.2,1.4bM
  -Feb-
 -Mar-
-Jan-
$ printf -- 'a -Jan-\na -Mar-\na -Feb-\n' | sort -b --key 2.2,2.4M
a -Feb-
a -Jan-
a -Mar-
$ printf -- 'a -Jan-\na -Mar-\na -Feb-\n' | sort --key 2.2b,2.4M
a -Jan-
a -Feb-
a -Mar-
$ printf -- 'a -Jan-\na -Mar-\na -Feb-\n' | sort --key 2.3,2.5M
a -Jan-
a -Feb-
a -Mar-
$ 

Actual results: as shown

Expected results: global -b should work the same as key suffix b.
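
The expectation above can be checked mechanically. The following is a minimal sketch, assuming GNU sort (for the --key long option) and a POSIX shell; the variable names are illustrative, and LC_ALL=C is added here only to pin the month names to Jan/Feb/Mar. Whether the two orderings agree depends on the installed coreutils version, so no particular result is claimed.

```shell
# Input lines from the report; the leading blanks are significant.
input=$(printf -- '-Jan-\n -Mar-\n  -Feb-\n')

# Form 1: global -b. Form 2: b attached to the key's start position.
global_b=$(printf '%s\n' "$input" | LC_ALL=C sort -b --key 1.2,1.4M)
key_b=$(printf '%s\n' "$input" | LC_ALL=C sort --key 1.2b,1.4M)

# Report whether the two invocations produced the same ordering.
if [ "$global_b" = "$key_b" ]; then
  echo "same ordering"
else
  echo "orderings differ"
fi
```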

Additional info:

Comment 1 Ondrej Vasik 2009-02-25 15:39:43 UTC
Thanks for the report. The problem was actually fixed upstream this week (the patch is here: http://lists.gnu.org/archive/html/bug-coreutils/2009-02/msg00258.html). It is fixed in rawhide, built as coreutils-7.1-3.fc11, and I will include the fix in the next coreutils update for already released Fedoras.

Comment 2 Fedora Update System 2009-02-26 16:53:04 UTC
coreutils-6.12-19.fc10 has been submitted as an update for Fedora 10.
http://admin.fedoraproject.org/updates/coreutils-6.12-19.fc10

Comment 3 Fedora Update System 2009-02-27 11:06:34 UTC
coreutils-6.10-34.fc9 has been submitted as an update for Fedora 9.
http://admin.fedoraproject.org/updates/coreutils-6.10-34.fc9

Comment 4 Fedora Update System 2009-02-28 03:21:53 UTC
coreutils-6.12-19.fc10 has been pushed to the Fedora 10 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 5 David 2009-03-01 09:46:29 UTC
Unless I am understanding the man page wrong, this update has caused a regression.

Consider the file:

1 2 3
2 3 4
1 50 60
100 200 300
1033 1 2
102 200 300

If I run "sort -n -k 1,1" on it, before the update (and on an OpenSUSE machine right now) I got this:

1 2 3
1 50 60
2 3 4
100 200 300
102 200 300
1033 1 2

Now I get this, which does not really conform to any logical ordering:

100 200 300
102 200 300
1033 1 2
1 2 3
1 50 60
2 3 4

Also, now there are files that are really messed up when sorted using the above command, where the keys aren't even grouped together (e.g. successive lines might start with 994, 9, 995, 995, 9, 996, ...). I can post some test files if necessary.

Comment 6 Ondrej Vasik 2009-03-02 11:08:26 UTC
It works correctly in the C locale, which the info/man pages recommend for reliable sorting. With `LC_ALL=C sort -n -k1,1 test` I get correct results even with the updated sort. The problem seems to be in the i18n patch (which never got upstream and is very unlikely to be accepted there unless someone rewrites it from scratch), and therefore depends on the locale. Could you please tell me which locale you use and confirm that sort works correctly in the C locale? If so, I will try to fix the i18n patch so that it works again, as sort should (mostly) work for non-C locales as well.
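
A minimal sketch of the C-locale check, using the sample data from comment 5 fed through printf instead of a file (the variable name is illustrative). In the C locale, lines with equal numeric keys fall back to a byte-wise comparison of the whole line, so the result should be the ordering comment 5 reports from before the update.

```shell
# Sample data from comment 5, sorted numerically on field 1 in the C locale.
sorted=$(printf '%s\n' '1 2 3' '2 3 4' '1 50 60' '100 200 300' '1033 1 2' '102 200 300' \
  | LC_ALL=C sort -n -k 1,1)

printf '%s\n' "$sorted"
```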

Comment 7 David 2009-03-02 11:53:29 UTC
Is LANG what you're asking for? If so, it's en_US.UTF-8 on my machine.

When I set LC_ALL=C, sort seems to be doing what I expect. However, I can't imagine it not being a bug for sort -n -k 1,1 to produce output like this:

...
 999,  65, 6, 60, 3/9/2009
 99,  97, 9, 13, 1/23/2009
 999,  98, 2, 87, 11/1/2008
 999,  98, 2, 87, 1/16/2009
...

It looks like it's kind-of-sort-of sorting by the second column as well as the first.

Comment 8 Ondrej Vasik 2009-03-02 12:50:00 UTC
Thanks for the quick response, LANG is enough. It is obviously a bug in the multibyte (i18n) patch: the upstream fix for the buggy sort behaviour only touched the unibyte section, and I forgot to adjust the multibyte section provided by i18n. The `LC_ALL=C sort` workaround should work in every case for now. The multibyte (i18n) patch is fixed in rawhide and is now building in koji as coreutils-7.1-6.fc11.

Comment 9 Fedora Update System 2009-03-02 14:25:57 UTC
coreutils-6.12-20.fc10 has been submitted as an update for Fedora 10.
http://admin.fedoraproject.org/updates/coreutils-6.12-20.fc10

Comment 10 Fedora Update System 2009-03-16 19:42:13 UTC
coreutils-6.10-35.fc9 has been pushed to the Fedora 9 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 11 Fedora Update System 2009-03-16 19:47:08 UTC
coreutils-6.12-20.fc10 has been pushed to the Fedora 10 stable repository.  If problems still persist, please make note of it in this bug report.