Bug 1055597 - sort produces incorrectly ordered results
Keywords:
Status: CLOSED DUPLICATE of bug 1003544
Alias: None
Product: Fedora
Classification: Fedora
Component: coreutils
Version: 20
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Ondrej Oprala
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-01-20 15:26 UTC by Tom Hughes
Modified: 2015-01-16 03:36 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-01-16 03:36:51 UTC
Type: Bug
Embargoed:



Description Tom Hughes 2014-01-20 15:26:52 UTC
The sort from coreutils-8.21-20.fc20.x86_64 seems to (sometimes) produce incorrectly ordered results. Using this input:

x 1 dsfdfdsf
x2 1 dsfdfdsf
x2 2 dsfdfdsf
x 2 dsfdfdsf

the command "sort -k 1,1 -k 2n,2" does not seem to order correctly on the primary key, giving:

x 1 dsfdfdsf
x2 1 dsfdfdsf
x2 2 dsfdfdsf
x 2 dsfdfdsf

Removing the trailing data from each line makes the results come out as expected, as does adding "i" to the first key, so that "sort -k 1i,1 -k 2n,2" gives the expected result:

x 1 dsfdfdsf
x 2 dsfdfdsf
x2 1 dsfdfdsf
x2 2 dsfdfdsf

The sort from coreutils-8.21-11.fc19.x86_64 in F19 does not seem to have this problem.

Comment 1 Ondrej Vasik 2014-01-20 15:44:46 UTC
Seems to be related to the i18n patch. With LC_ALL=C I'm getting the same output as you expect.
(The locale in use and the --debug output are usually useful in sort reports.)
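
A minimal sketch of the LC_ALL=C check, assuming a POSIX shell and the four-line input from the description. The variable names input, expected and actual are illustrative and not part of the report. It pins the locale to C and compares the result against the order the reporter expects:

```shell
# Input lines from the description, in the reported order.
input='x 1 dsfdfdsf
x2 1 dsfdfdsf
x2 2 dsfdfdsf
x 2 dsfdfdsf'

# Order the reporter expects: field 1 as text, then field 2 numerically.
expected='x 1 dsfdfdsf
x 2 dsfdfdsf
x2 1 dsfdfdsf
x2 2 dsfdfdsf'

# LC_ALL=C forces plain byte comparison, bypassing locale collation.
actual=$(printf '%s\n' "$input" | LC_ALL=C sort -k 1,1 -k 2n,2)

if [ "$actual" = "$expected" ]; then
    echo "C locale: order matches"
else
    echo "C locale: order differs"
fi
```

Dropping LC_ALL=C from the sort call runs the same comparison under the current locale, which is where the report saw the difference.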

Comment 2 Tom Hughes 2014-01-20 16:00:53 UTC
Locale is en_GB.utf8 if that helps.

Comment 3 Adri Verhoef 2014-03-28 10:24:58 UTC
Another example.

The input file is /tmp/a-a, containing four lines of three tab-separated columns; the middle column has these values:

AA
AAA
A
A A

$ cat /tmp/a-a 
2	AA	E
3	AAA	E
1	A	E
0	A A	E
$ cut -f2 /tmp/a-a | sort
A
AA
A A
AAA
$ cut -f2 /tmp/a-a | sort --debug
sort: using ‘en_US.UTF-8’ sorting rules
A
_
AA
__
A A
___
AAA
___
$ cut -f2 /tmp/a-a | sort --debug -r
sort: using ‘en_US.UTF-8’ sorting rules
AAA
___
A A
___
AA
__
A
_

So far everything looks normal and is properly sorted.

Now sort the three-column file, using the middle column as the sort key.
'$TAB' holds a real Tab character (^I).
$  TAB="	";echo x"$TAB"x
x	x
$ < /tmp/a-a sort -t "$TAB" -k 2,2 
3	AAA	E
0	A A	E
2	AA	E
1	A	E
$ < /tmp/a-a sort -t "$TAB" -k 2,2 -r
1	A	E
2	AA	E
0	A A	E
3	AAA	E
$ < /tmp/a-a sort -t "$TAB" -k 2,2 -i
1	A	E
2	AA	E
0	A A	E
3	AAA	E
$ < /tmp/a-a sort -t "$TAB" -k 2,2 --debug
sort: using ‘en_US.UTF-8’ sorting rules
3>AAA>E
  ___
_______
0>A A>E
  ___
_______
2>AA>E
  __
______
1>A>E
  _
_____

More info:
$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
$ type sort
sort is hashed (/usr/bin/sort)
$ rpm -qf /usr/bin/sort /usr/share/i18n/locales/en_US
coreutils-8.21-21.fc20.x86_64
glibc-common-2.18-12.fc20.x86_64
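
The comparison this comment walks through can be sketched as a small consistency check, assuming a POSIX shell with sort and cut. The variable names data, keyed and alone are illustrative. Since no two lines share a key, sorting on column 2 and then extracting it should yield the same sequence as extracting column 2 first and sorting it on its own:

```shell
# A real Tab character, used as the field separator.
TAB=$(printf '\t')

# Same four lines as /tmp/a-a in the transcript above.
data=$(printf '2\tAA\tE\n3\tAAA\tE\n1\tA\tE\n0\tA A\tE\n')

# Sort whole lines on column 2, then keep only that column.
keyed=$(printf '%s\n' "$data" | sort -t "$TAB" -k 2,2 | cut -f2)

# Extract column 2 first, then sort it by itself.
alone=$(printf '%s\n' "$data" | cut -f2 | sort)

if [ "$keyed" = "$alone" ]; then
    echo "consistent"
else
    echo "inconsistent"
fi
```

Both pipelines run under the current locale, so the check says nothing about which order is right, only whether the two invocations agree with each other.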

Comment 4 Adri Verhoef 2015-01-15 19:41:54 UTC
The problem has been resolved for me in Fedora 21 with
$ rpm -qf /usr/bin/sort /usr/share/i18n/locales/en_US
coreutils-8.22-19.fc21.x86_64
glibc-common-2.20-7.fc21.x86_64

Comment 5 Tom Hughes 2015-01-16 00:06:28 UTC
Agreed that this seems to be correct in F21.

Comment 6 Pádraig Brady 2015-01-16 03:36:51 UTC

*** This bug has been marked as a duplicate of bug 1003544 ***

